Being a multi-paradigm programming language, Nim lets you code in many different styles. In this post I show how I went from procedural code to functional(ish) code, gaining readability and expressiveness without any performance penalty.
The following is a subset of the Nim implementation of my raybench project.
Original code:
proc main() =
  var data = newSeq[seq[V3]]()
  let world = world_new()
  let vdu = (world.camera.rt - world.camera.lt) / float32(WIDTH)
  let vdv = (world.camera.lb - world.camera.lt) / float32(HEIGHT)
  randomize()
  for y in 0..(HEIGHT-1):
    var row = newSeq[V3]()
    for x in 0..(WIDTH-1):
      var color = zero
      var ray:Ray
      ray.origin = world.camera.eye
      for i in 1..SAMPLES:
        ray.direction = ((world.camera.lt + (vdu * (float32(x) + float32(random(1'f32))) +
                          vdv * (float32(y) + float32(random(1'f32))))) -
                          world.camera.eye).unit
        color = color + trace(world, ray, 0)
      color = color / float32(SAMPLES)
      row.add(color)
    data.add(row)
  writeppm(data)
Improved code:
proc main() =
  let world = world_new()
  let vdu = (world.camera.rt - world.camera.lt) / float32(WIDTH)
  let vdv = (world.camera.lb - world.camera.lt) / float32(HEIGHT)
  let ss = toSeq(1..SAMPLES)
  let hs = toSeq(0..<HEIGHT)
  let ws = toSeq(0..<WIDTH)
  randomize()
  hs.map(proc (y:int): auto =
    ws.map(proc (x:int): auto =
      foldl(ss.mapIt(
        trace(world, (
          world.camera.eye,
          ((world.camera.lt + (vdu * (float32(x) + random(1.0)) +
                               vdv * (float32(y) + random(1.0)))) -
            world.camera.eye).unit
        ), 0)), a + b) / float32(SAMPLES)
    )
  ).writeppm
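The shape of the improved version (an outer map over rows, an inner map over columns, and a mapIt plus foldl to average the samples) can be tried in isolation. What follows is only a minimal sketch, assuming a reasonably recent Nim compiler and nothing beyond std/sequtils. The names shade, render, Width, Height and Samples are hypothetical stand-ins for trace and the raybench constants; they are not part of the project.

```nim
import std/sequtils

const
  Width = 4     # hypothetical stand-in for WIDTH
  Height = 3    # hypothetical stand-in for HEIGHT
  Samples = 2   # hypothetical stand-in for SAMPLES

# Hypothetical stand-in for trace(): a deterministic value per sample,
# so the result is easy to check by hand.
proc shade(x, y, s: int): float =
  float(x + y * Width + s)

proc render(): seq[seq[float]] =
  let ss = toSeq(1..Samples)
  let hs = toSeq(0..<Height)
  let ws = toSeq(0..<Width)
  # Outer map builds the rows, inner map builds the pixels of one row,
  # and foldl sums the per-sample values before dividing by the sample count.
  hs.map(proc (y: int): seq[float] =
    ws.map(proc (x: int): float =
      foldl(ss.mapIt(shade(x, y, it)), a + b) / float(Samples)
    )
  )

let image = render()
echo image
```

Because every step returns a value instead of mutating a buffer, the nested loops and the intermediate row and data variables of the procedural version disappear.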
Stats
         | Lines | Characters |
Original | 26    | 731        |
Improved | 21    | 628        |
The comparison shows a decrease in both the number of lines (26 to 21) and the number of characters (731 to 628, roughly 14% fewer) used to describe the same computation.
Performance-wise, the results are as follows (best of three runs on a Core i7-6700):
Original:
$ time ./nimrb
real 0m25.924s
user 0m25.781s
sys 0m0.125s
Improved:
$ time ./nimrb_map
real 0m25.016s
user 0m24.828s
sys 0m0.156s
There seems to be a very small improvement in speed, but I wouldn't make much of it.
Besides improving readability, the changes also made another small improvement possible. Using the parallel map function (pMap) described here: http://blog.ubergarm.com/10-nim-one-liners-to-impress-your-friends/ I only had to change a single line from this:
hs.map(proc (y:int): auto =
to this:
hs.pMap(proc (y:int): auto =
With that change, the code takes advantage of Nim's threadpool library and, with it, of all the resources available on the current CPU, which improves the running time:
$ time ./nimrb_pmap
real 0m8.884s
user 1m4.156s
sys 0m0.250s
This makes the code run almost 3 times as fast as the single-threaded version (8.9 s versus 25.0 s of wall-clock time), with no significant changes to the code.
I find this very interesting, and I look forward to playing around with Nim some more.
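The idea behind a parallel map over rows can be sketched with spawn from Nim's threadpool module. This is not the pMap from the linked post, only a minimal illustration under a few assumptions: rowSum and sumRowsParallel are hypothetical names standing in for the per-row rendering work, the program is compiled with --threads:on (required on older compilers), and the threadpool module is available in the installed Nim version (newer releases may flag it as deprecated).

```nim
import std/threadpool

const Width = 4   # hypothetical row length

# Hypothetical stand-in for the work done on one row. It is a plain
# top-level proc with no captured state, so it can run on another thread.
proc rowSum(y: int): int =
  for x in 0..<Width:
    result += y * Width + x

proc sumRowsParallel(height: int): seq[int] =
  # Start one task per row; each spawn returns a FlowVar that will
  # eventually hold that row's result.
  var pending = newSeq[FlowVar[int]](height)
  for y in 0..<height:
    pending[y] = spawn rowSum(y)
  # Reading a FlowVar with ^ blocks until its task has finished.
  # Results are collected by index, so row order is preserved.
  result = newSeq[int](height)
  for i, fv in pending:
    result[i] = ^fv

let rows = sumRowsParallel(3)
echo rows
```

Splitting the work by row keeps each task independent of the others, which is why swapping the outermost map for a parallel one is enough in the improved version.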