Ray tracer language comparison

Ease of use

We can also make some statements about the ease of use of the languages and compilers.

When developing, the time taken to recompile is important. The compile times were, in increasing order:

Language: Compiler  | Compile time
SML: SML/NJ         | 0.23s
OCaml: ocamlopt     | 0.41s
C++: g++            | 0.71s
Java: JDK 1.5       | 0.86s
Lisp: SBCL          | 1.1s
SML: MLton          | 3.1s
Scheme: Stalin      | 166s

Note that the OCaml and SML/NJ compilers are faster than all of the others, and that Stalin compiles the Scheme implementation nearly three orders of magnitude (roughly 700 times) slower than SML/NJ, taking minutes to compile this tiny program. In practice, compile times under 2s are not significant. MLton's compile time is noticeable and Stalin's compile time is a serious impediment to development. However, computer performance continues to improve and MLton's compile times are almost fast enough to be usable for development.
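The ratios behind these remarks can be checked with a few lines of OCaml. This is only a sketch: the times are copied from the table above, and the names compile_times and ratios are illustrative rather than part of the benchmark code.

```ocaml
(* Compile times in seconds, copied from the table above. *)
let compile_times = [
  ("SML: SML/NJ", 0.23);
  ("OCaml: ocamlopt", 0.41);
  ("C++: g++", 0.71);
  ("Java: JDK 1.5", 0.86);
  ("Lisp: SBCL", 1.1);
  ("SML: MLton", 3.1);
  ("Scheme: Stalin", 166.0);
]

(* Ratio of each compile time to the fastest one in the list. *)
let ratios =
  let fastest =
    List.fold_left (fun acc (_, t) -> min acc t) infinity compile_times in
  List.map (fun (name, t) -> (name, t /. fastest)) compile_times

(* Print one line per compiler with its slowdown relative to the fastest. *)
let () =
  List.iter (fun (name, r) -> Printf.printf "%-16s %8.1fx\n" name r) ratios
```

The Stalin entry comes out at a little over 700 times the SML/NJ figure, which is slightly under three full orders of magnitude.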

OCaml provides two compilers, one byte-code (ocamlc) and one native-code (ocamlopt). The ocamlc compiler is 5x faster at compiling these ray tracers than the ocamlopt compiler, but the native-code executables produced by ocamlopt run much faster. The ocamlopt compiler combines one of the fastest compilation times with one of the fastest running times. OCaml also provides an interactive top-level (ocaml) that is useful for testing program fragments.

SML can be compiled by a variety of compilers. We have chosen to use the two most popular SML compilers, MLton and SML/NJ. These two compilers fill roles somewhat similar to those of ocamlc and ocamlopt for OCaml, in that SML/NJ is fast at compiling and MLton produces fast executables. However, ocamlopt compiles almost as quickly as SML/NJ whilst simultaneously producing executables that are roughly as fast as MLton's. SML/NJ also provides an interactive top-level, offering the same functionality as OCaml's top-level but with better performance, as SML/NJ compiles to native code whereas OCaml's top-level compiles to interpreted byte-code.

Unfortunately, despite the existence of a "standard", the MLton and SML/NJ compilers cannot compile the same source code for this benchmark. Specifically, we found:

  • SML/NJ does not implement the SML library function required to compute the machine epsilon.
  • SML/NJ interprets programs as a sequence of definitions as if they were entered into a top-level whereas MLton analyses the whole program at once. At one point, this led to the two compilers inferring different types for one of the functions, which would not then compile under SML/NJ.
  • As the Standard ML language definition omits discussion of building executables, SML/NJ has a comparatively bizarre and convoluted method of invocation.

Thus, we resorted to combining code fragments using a bash script in order to factor out the SML code common to the SML/NJ and MLton implementations. Although this was a significant source of irritation when developing the SML implementations, build tools are available to smooth out the differences between compilers for more significant projects and we believe that our tiny program happened to hit upon unusual discrepancies.
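For readers unfamiliar with the quantity involved in the first discrepancy, the machine epsilon can also be computed without any library support by repeated halving. The following is a minimal sketch in OCaml rather than SML, and it is not the code used in the benchmark: the function name machine_epsilon is hypothetical, and the result assumes IEEE 754 double arithmetic without extended intermediate precision.

```ocaml
(* Halve eps until adding half of it to 1.0 no longer changes the result.
   The value returned is the gap between 1.0 and the next larger float. *)
let machine_epsilon () =
  let rec loop eps =
    if 1.0 +. eps /. 2.0 > 1.0 then loop (eps /. 2.0) else eps
  in
  loop 1.0

(* Compare the computed value against OCaml's built-in constant. *)
let () =
  Printf.printf "computed: %g  built-in: %g\n"
    (machine_epsilon ()) epsilon_float
```

In OCaml the same value is available directly as the standard constant epsilon_float, so a loop of this kind is only needed where no equivalent library function is provided.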

Both OCaml and Standard ML go to great lengths to verify programs during compilation. This design is intended to let them catch many more bugs at compile time than the other languages and compilers tested here. We have indeed found this to be the case. The OCaml and SML implementations were about 10x faster to develop than those in the other languages because many errors were caught by the compiler.

The g++ compiler, used to compile the C++ implementations, is the next fastest at compiling. However, C++ is the least safe language of those tested here. Minor mistakes when writing C++ programs can lead to segmentation faults or, worse, errors silently creeping into the results of computations. This is true even when using the STL and avoiding all pointer arithmetic. As program robustness is always more important than performance, we do not hesitate to recommend any of the other languages over C++.

The Java compiler (Sun's JDK 1.5) is unique among these languages because it defers some compilation to run time. We have been careful to test that Java's unusually slow startup time (of ~1s) does not significantly affect the results. Potentially, Java's just-in-time (JIT) compiler could use information that can be obtained only at run time to further optimise the running program. However, Java's poor performance clearly indicates that Sun's current compiler is not succeeding in this respect on this particular benchmark.

SBCL Lisp also provides an interactive top-level, like OCaml and SML/NJ. However, the SBCL compiler does virtually nothing to check the Lisp code that it is given at compile time. Instead, testing is deferred until run time. This is a considerable hindrance to development as the program will run for an arbitrarily long time and the program state must then be examined in a debugger just to find simple type errors. During the development of this ray tracer, we often encountered type errors due to simple typographical errors in the source code (such as missing or extra pairs of parentheses) only after several minutes of execution. This hindrance is typical of dynamically typed languages.

As with Lisp, the generality of the Scheme language puts it at a great disadvantage relative to the other languages in terms of performance. In order to achieve competitive run-time performance, the Stalin Scheme compiler clearly does an incredible job of optimising unspecialised code and can even beat all of the other implementations in this benchmark with only a few low-level optimisations. However, Stalin's unparalleled ability to optimise comes at a grave cost in terms of the time taken to compile. The compile time is nearly three orders of magnitude slower than that of the fastest compiler (SML/NJ). By requesting fewer optimisations on the command line, the compile time can be greatly reduced to around 30s. However, that is still an irritatingly long time and the resulting executables are considerably slower.

Conclusions

Fast development time, succinct code, fast compile time and fast run time make both OCaml and Standard ML very desirable languages for serious software development. Efficiency makes these languages suitable for performance-critical work. Rapid development and brevity make them ideal for prototyping. Static type checking makes them ideal for intrinsically complicated programs.

In order to choose between OCaml and Standard ML, it is instructive to study their differences in more detail:

OCaml | MLton/SML
Two almost-entirely compatible compilers (ocamlc and ocamlopt) | Many somewhat-incompatible compilers (MLton, SML/NJ, ML-kit, PolyML, MoscowML, ...)
Recent innovations, e.g. polymorphic variants and objects | Few additions to the 1997 standard
Fast 0.36s compile time | Slow 9.4s compile time with MLton or slow run time with any other compiler
Supports partial recompilation | MLton must recompile the whole program
Concise grammar | Slightly more verbose grammar ("end", "val", curried anonymous functions)
x86, AMD64, IA64, PowerPC, Alpha, MIPS, StrongARM, Sparc and HPPA | Only x86, PowerPC and Sparc
For and while loops | Only while loops
Labelled and optional function arguments | Records can be (ab)used to get the same effect
Functional record update ("with") | No functional record update
Optional guards on pattern matches ("when") | No guards on pattern matches
Polymorphic variants | No polymorphic variants
Object orientation | Records can be (ab)used to get some of the effects
Dynamic loading of byte-code | No dynamic loading of code with MLton
Stacks, queues, sets, maps, hash tables, 1D, 2D and 3D numerical arrays (big arrays) and many other data structures | Vectors, slices, 2D polymorphic arrays and many other data structures
Maximum array length can be as small as 2,097,151 elements | Maximum array length of at least 1,073,741,823 elements
32-bit floats only in "big arrays" | 32-bit floats anywhere
Files are implicitly modules | Files are independent of the module system
No immutable arrays or strings | Both mutable and immutable arrays and strings
Polymorphic comparison and hashing can produce run-time type errors | Well-defined and total polymorphic equality that is statically type checked but no polymorphic inequalities or hashing
Define symbol infix operators with implicit precedences and associativities anywhere | Define any infix operators with any given precedences and associativities but only for the local scope
Type unsafe marshalling | No marshalling
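
Several of the OCaml entries in this comparison are easier to judge from a concrete fragment. The following is a small illustrative sketch, not code from the ray tracers: every name in it (scaled_length, sphere, grow, classify, describe, compare_raises) is hypothetical and chosen only to show labelled and optional arguments, functional record update, pattern guards, polymorphic variants and the run-time failure of polymorphic comparison.

```ocaml
(* Labelled and optional arguments: ?scale defaults to 1.0 when omitted.
   The trailing unit argument marks the end of the argument list. *)
let scaled_length ?(scale = 1.0) ~x ~y () = scale *. sqrt (x *. x +. y *. y)

(* Functional record update with "with": a copy of s with one field replaced. *)
type sphere = { centre : float; radius : float }
let grow s factor = { s with radius = s.radius *. factor }

(* Optional guards on pattern matches with "when". *)
let classify n =
  match n with
  | 0 -> "zero"
  | n when n < 0 -> "negative"
  | _ -> "positive"

(* Polymorphic variants can be used without a prior type declaration. *)
let describe = function
  | `Sphere r -> Printf.sprintf "sphere of radius %g" r
  | `Group n -> Printf.sprintf "group of %d objects" n

(* Polymorphic comparison type checks on any values but is not total:
   comparing two functions raises Invalid_argument at run time. *)
let compare_raises =
  try ignore (compare (fun x -> x + 1) (fun x -> x + 2)); false
  with Invalid_argument _ -> true
```

A call such as scaled_length ~x:3.0 ~y:4.0 () uses the default scale, whereas scaled_length ~scale:2.0 ~x:3.0 ~y:4.0 () overrides it. The last definition is the kind of run-time type error that the comparison refers to under polymorphic comparison.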

Unlike Standard ML, the OCaml language is not standardized and continues to evolve. The evolution of OCaml has allowed it to adopt many new programming constructs not found in Standard ML, including polymorphic variants and objects. However, the evolution of the OCaml language sometimes renders old code uncompilable or, worse, different in terms of performance or even correctness.

We felt that the OCaml compilers were more helpful than MLton and SML/NJ when reporting errors. This is not a technical discrepancy but, rather, is due to the fact that the OCaml compilers report errors in plain English with minimal use of symbols, minimal assumed knowledge and no references to the internals of the compilers.

The evolution of functional programming languages is easily seen by comparing the functionality provided by Common Lisp (1956-1984) with Standard ML (1990-1997) and OCaml (1996-present). In the context of this benchmark, both OCaml and Standard ML clearly improve upon SBCL-compiled Lisp in terms of code size, compile-time performance and run-time performance.

However, the designers of the ML family of languages (including OCaml and Standard ML) deliberately avoided some of the functionality provided by Lisp in order to facilitate static type checking and improve performance. Specifically, Lisp provides macros to customise syntax and allows them to be entwined with ordinary code, and provides first-class code (run-time code generation and compilation). Standard ML provides neither macros nor run-time code generation. OCaml provides camlp4 macros, a limited form of macros that are separate from the language, and the derived language MetaOCaml also provides run-time code generation.

We found Stalin-compiled Scheme much easier to develop than SBCL-compiled Lisp thanks to Stalin's additional compile-time checking. Thus, we recommend Stalin and Scheme over SBCL and Lisp to people interested in learning about this approach to syntax extension and run-time code generation. Interestingly, languages based upon term rewriting (e.g. Mathematica) provide both macros and run-time code generation. However, they are typically much slower for general-purpose computation.

Many programmers are currently migrating from C++ to Java because of the increase in program stability offered by Java (e.g. no pointer arithmetic, garbage collection). For precisely those reasons, we certainly concur that Java is preferable to C++ for serious programming. However, given our results, we believe that these programmers would do well to learn either Standard ML or OCaml as well. These languages are smaller, simpler and more expressive; they are faster and easier to develop in; and they produce faster executables. Above all, they're more fun!
