Ray tracer language comparison

Verbosity

Overview

In the face of objections from Lisp and Scheme advocates, we have measured verbosity not only in terms of the number of lines of code but also in terms of bytes with whitespace collapsed. Interestingly, the two different measures of verbosity provide qualitatively similar results. OCaml remains the most concise language by a considerable margin.

The OCaml implementations of the ray tracer are all remarkably concise. The shortest OCaml implementation is only 43 lines of code or 1,723 bytes, which is far more concise than any of the other languages. There are several reasons for this incredible brevity:

  • Type inference: there are no type declarations.
  • Recursive types: even the recursive type of the scene graph is completely inferred.
  • Pattern matching: values are structurally deconstructed with almost no overhead.
  • List literals: lists can be specified directly as values.

Standard ML is the next most verbose language, with a minimal version weighing in at 72 lines of code or 2,569 bytes. Although this is significantly longer than the shortest OCaml implementation, it is worth noting that the shortest SML is 35% faster.

The remaining languages (C++, Java, Lisp and Scheme) are all similarly verbose on most of the implementations. The only notable exception is the most optimised Lisp implementation which, at 198 lines or 5,148 bytes, is considerably more verbose that all of the other languages. This Lisp implementation has 3× as lines and 2× as many bytes as the OCaml implementation that is 4% faster!

Code Comparisons

A lot can be learned from side-by-side comparisons of the code required to achieve the same thing in different languages.

Scene graph

The most elaborate type in this ray tracer is the scene graph. This recursive data structure represents the positions and sizes of the spheres in a scene. In particular, the scene graph is nested into groups of scene graphs with each group having a bounding volume that is itself a sphere. These recursive bounds underpin the efficiency of the ray tracer, allowing it to render tens of thousands of spheres with shadows in only a few seconds.

The type representing a scene graph is described by an inheritance hierachy of classes (Sphere and Group) derived from a base class (Scene) in both the C++ and the Java. The large amount of code required to describe a simple tree in these languages is in stark contrast to the OCaml (which requires no code to define the scene tree) and the Standard ML (which requires a simple sum type):

C++:
struct Scene {
  virtual ~Scene() {};
  ...
};

struct Sphere : public Scene {
  Vec center;
  double radius;
  Sphere(Vec c, double r) : center(c), radius(r) {}
  ~Sphere() {}
  ...
};

typedef list<Scene *> Scenes;

struct Group : public Scene {
  Sphere bound;
  Scenes child;
  Group(Sphere b, Scenes c) : bound(b), child(c) {}
  ~Group() {
    for (Scenes::const_iterator it=child.begin();
      it != child.end(); ++it)
    delete *it;
  }
  ...
};
SML:
datatype scene =
    Sphere
  | Group of (scene * vec * real) list
OCaml:

No code required!

In particular, both the C++ and the Java contain many redundant definitions for functions that do little or nothing. Even worse, the C++ defines destructors to deallocate memory used by the scene graph. These destructors and explicit deallocation is unnecessary in the presence of garbage collection (provided by all of the other languages) but, more importantly, the naive approach to deallocation implemented by the C++ prohibits the reuse of scenes, i.e. a scene cannot contain itself. Although this is not a serious problem in the context of this ray tracer, many applications can benefit from the reuse of recursive data structures (e.g. the representation of symbolic expressions with common subexpressions) and this problem is not easily solved in the absence of a garbage collector.

Both the C++ and the Java waste a large amount of code describing functions to construct scene graphs. Look at the amount of code required to create the simple scene graph used in this ray tracer:

C++:
Scene *s = new Sphere(c, r);
if (level == 1) return s;
Scenes child;
child.push_back(s);
double rn = 3*r/sqrt(12.);
for (int dz=-1; dz<=1; dz+=2)
  for (int dx=-1; dx<=1; dx+=2)
    child.push_back(create(level-1,
                           c + rn*Vec(dx, 1, dz),
                           r/2));
return new Group(Sphere(c, 3*r), child);
OCaml:
let obj = c, r, [] in
if level = 1 then obj else
  let a = 3. *. r /. sqrt 12. in
  let aux x' z' =
    create (level - 1) (c +| (x', a, z')) (0.5 *. r) in
  c, 3.*.r, [obj; aux (-.a) (-.a); aux a (-.a);
             aux (-.a) a; aux a a]

In this case, list literals allow small lists to be specified easily in OCaml, Standard ML, Lisp and Scheme. Both C++ and Java lack list literals and the programmer must resort to allocation and repeated appending.

The disadvantage of OCaml's strong static typing is the use of different operators for int and float arithmetic, leading to some expressions being slightly more verbose in OCaml than in C++:

C++:
double rn = 3*r/sqrt(12.);
OCaml:
let a = 3. *. r /. sqrt 12. in

However, this is a small price to pay for the substantial brevity achieved in the rest of the OCaml program.

Intersection

The intersection routines for the core of the ray tracer. Again, the amount of code required to perform this simple tree-traversal in C++ and Java is considerable compared to the equivalent OCaml:

C++:
struct Sphere : public Scene {
  double ray_sphere(const Ray &ray) const {
    Vec v = center - ray.orig;
    double b = dot(v, ray.dir),
      disc = b*b - dot(v, v) + radius * radius;
    if (disc < 0) return infinity;
    double d = sqrt(disc), t2 = b + d;
    if (t2 < 0) return infinity;
    double t1 = b - d;
    return (t1 > 0 ? t1 : t2);
  }

  Hit intersect(const Hit &hit, const Ray &ray) const {
    double lambda = ray_sphere(ray);
    if (lambda >= hit.first) return hit;
    return Hit(lambda,
               unitise(ray.orig + lambda*ray.dir - center));
  }
};

struct Group : public Scene {
  Hit intersect(const Hit &hit, const Ray &ray) const {
    Hit hit2=hit;
    double l = bound.ray_sphere(ray);
    if (l >= hit.first) return hit;
    for (Scenes::const_iterator it=child.begin();
         it!=child.end(); ++it)
      hit2 = (*it)->intersect(hit2, ray);
    return hit2;
  }
};
OCaml:
let rec intersect o d (l, _ as hit) (c, r, s) =
  let v = c -| o in
  let b = dot v d in
  let disc = sqrt(b *. b -. dot v v +. r *. r) in
  let t1 = b -. disc and t2 = b +. disc in
  let l' =
    if t2<0. then infinity else
      if t1>0. then t1 else t2 in
  if l' >= l then hit else match s with
    [] -> l', unitise (o +| l' *| d -| c)
  | ss -> List.fold_left (intersect o d) hit ss

In the shortest OCaml implementation, a node in the scene graph is represented by the center and radius of a sphere and a list of child nodes. A solid sphere in the scene is represented by a node with no children, e.g. c, r, []. A spherical bound containing child nodes is represented by a node with at least one child, e.g. c, r, [a; b; c].

In the other OCaml and in all Standard ML implementations a sum type is used to represent a scene. A solid sphere in the scene is represented by the Sphere constructor, e.g. Sphere(c, r). A spherical bound containing child nodes is represented by the Group constructor, e.g. Group(c, r, [a; b; c]).

OCaml's remarkable brevity is largely down to its ability to dissect data structures using pattern matching and traverse them using higher-order functions. Some brevity is also gained through the use of currying.

In particular, note that pattern matching is used not only after match and function constructs but also in let definitions, e.g. to dissect the current intersection point hit, extracting the parameter l. The points that leverage pattern matching are highlighted in the code below:

OCaml:
let rec intersect o d (l, _ as hit) (c, r, s) =
  let v = c -| o in
  let b = dot v d in
  let disc = sqrt(b *. b -. dot v v +. r *. r) in
  let t1 = b -. disc and t2 = b +. disc in
  let l' =
    if t2<0. then infinity else
      if t1>0. then t1 else t2 in
  if l' >= l then hit else match s with
    [] -> l', unitise (o +| l' *| d -| c)
  | ss -> List.fold_left (intersect o d) hit ss

The higher-order function List.fold_left is used to traverse the child scenes in a group of scenes:

OCaml:
let rec intersect o d (l, _ as hit) (c, r, s) =
  let v = c -| o in
  let b = dot v d in
  let disc = sqrt(b *. b -. dot v v +. r *. r) in
  let t1 = b -. disc and t2 = b +. disc in
  let l' =
    if t2<0. then infinity else
      if t1>0. then t1 else t2 in
  if l' >= l then hit else match s with
    [] -> l', unitise (o +| l' *| d -| c)
  | ss -> List.fold_left (intersect o d) hit ss

This higher-order function factors out the common operation of looping over the elements of a list and accumulating a result. In this case, the result is the closest intersection point found so far.

The curried form of the intersect function is leveraged when this function passes itself as the argument to List.fold_left:

OCaml:
let rec intersect o d (l, _ as hit) (c, r, s) =
  let v = c -| o in
  let b = dot v d in
  let disc = sqrt(b *. b -. dot v v +. r *. r) in
  let t1 = b -. disc and t2 = b +. disc in
  let l' =
    if t2<0. then infinity else
      if t1>0. then t1 else t2 in
  if l' >= l then hit else match s with
    [] -> l', unitise (o +| l' *| d -| c)
  | ss -> List.fold_left (intersect o d) hit ss

Specifically, the order of the arguments to the intersect function is chosen such that the more-constant arguments appear first (the ray origin o and ray direction d), followed by the accumulator hit and then the scene. The built-in List.fold_left function used to accumulate an intersection over a list of child scenes expects a function argument that requires an accumulator and then a list of scenes. The function argument intersect o d partially applies the ray origin and direction to the intersect function to obtain a closure that expects the appropriate arguments. This use of currying is commonplace in OCaml.

In a functional programming language without using currying (the default style in Standard ML, Lisp and Scheme), an anonymous function can be used to achieve the same effect:

OCaml with currying:
List.fold_left (intersect o d) hit ss
OCaml without currying:
List.fold_left (fun hit scene -> intersect(o, d, hit, scene)) hit ss

Given the expressiveness of pattern matching as used in the OCaml and Standard ML, it is surprising that pattern matching is rare in the functional programming languages Lisp and Scheme. As a consequence, the Lisp and Scheme implementations of the intersection routine are very verbose:

Lisp:
(defun ray-sphere (orig dir center radius)
  (let* ((v (-v center orig))
         (b (dot v dir))
         (disc (+ (- (* b b) (dot v v))
                  (* radius radius))))
    (if (< disc 0d0)
        infinity
        (let* ((disc (sqrt disc))
               (t2 (+ b disc))
               (t1 (- b disc)))
          (cond ((< t2 0d0) infinity)
                ((> t1 0d0) t1)
                (t t2))))))

(defun intersect (orig dir scene)
  (labels ((aux (lam normal scene)
             (let* ((center (sphere-center scene))
                    (lamt
                       (ray-sphere orig
                                   dir
                                   center
                                   (sphere-radius scene))))
               (if (>= lamt lam)
                   (values lam normal)
                   (etypecase scene
                     (group
                      (dolist (kid (group-children scene))
                        (setf (values lam normal)
                              (aux lam normal kid)))
                      (values lam normal))
                     (sphere
                      (values lamt
                         (unitise
                            (-v (+v orig (*v lamt dir))
                                center)))))))))
    (aux infinity zero scene)))
Scheme:
(define (ray-sphere orig dir center radius)
 (let* ((v (v-v center orig))
        (b (dot v dir))
        (disc (+ (- (* b b) (dot v v))
                 (* radius radius))))
  (if (negative? disc)
      infinity
      (let* ((disc (sqrt disc)) (t2 (+ b disc)))
       (if (negative? t2)
           infinity
           (let ((t1 (- b disc)))
              (if (positive? t1) t1 t2)))))))

(define (intersect orig dir scene)
 (let aux ((scene scene) (hit (make-hit infinity zero)))
  (let ((l (hit-l hit)))
   (if (null? (scene-scenes scene))
       (let ((l-prime (ray-sphere
                         orig dir
                         (scene-c scene)
                         (scene-r scene))))
        (if (>= l-prime l)
            hit
            (make-hit
             l-prime
             (unitise (v+v orig (v-v (s*v l-prime dir)
                                     (scene-c scene)))))))
       (if (>= (ray-sphere
                     orig dir
                     (scene-c scene) (scene-r scene)) l)
           hit
           (fold aux hit (scene-scenes scene)))))))

The Lisp and Scheme implementations are also bloated by the number of parentheses (the Lisp has 9 consecutive close parentheses near the end!) and the use of prefix notation. For example, the expression b*b - v.v + r*r is written (+ (- (* b b) (dot v v)) (* r r)) in Lisp, and the expression t<0 is written (negative? t2) in Scheme.

Verbosity of Optimizations

Although it is interesting to note that the shortest OCaml implementation is substantially shorter than all other implementations, this neglects the fact that this implementation is also the second slowest. Therefore, the verbosity of the more optimised implementations is also a point of interest.

In the faster versions of the ray tracers, the OCaml implementations remain the most concise. However, although the most concise C++ is 2.4× longer than the most concise OCaml, the fastest C++ is only 1.65× longer than the fastest OCaml. This reflects that fact that both implementations are converging on the same low-level representation of the program in order to express more optimizations, i.e. OCaml's expressiveness must be sacrificed in order to reach competitive performance.

For example, the two fastest implementations are written in C++ and OCaml. The ray-sphere intersection functions in these two programs share many low-level optimizations and the OCaml is similarly verbose as a consequence:

C++:
double ray_sphere(const Vec &dir) const {
  double b = center.x*dir.x + center.y*dir.y + center.z*dir.z;
  double disc = b*b - dot(center, center) + radius * radius;
  if (disc > 0) {
    double d = sqrt(disc), t2 = b + d;
    if (t2 > 0) {
      double t1 = b - d;
      return (t1 > 0 ? t1 : t2);
    }
  }
  return infinity;
}
OCaml:
let ray_sphere {x=dx; y=dy; z=dz} {x=vx; y=vy; z=vz} r =
  let disc = vx *. vx +. vy *. vy +. vz *. vz -. r *. r in
  if disc < 0. then infinity else
    let b = vx *. dx +. vy *. dy +. vz *. dz in
    let b2 = b *. b in
    if b2 < disc then infinity else
      let disc = sqrt(b2 -. disc) in
      let t1 = b -. disc in
      if t1 > 0. then t1 else b +. disc

Summary

OCaml is the most concise language, followed by Standard ML and then C++, Java and Scheme. Finally, Lisp is the most verbose language.

Previous: Results Next: Performance