Functional Programming is Slow – revisited

I have written before about Functional Program with a rather negative stand point (it sucks, it is slow). Those posts have some readers, but they are a few years old, and I wanted to do some new benchmarks.

Please note:

  • This is written with JavaScript (and Node.js) in mind. I don’t know if these findings apply to other programming languages.
  • Performance is important, but it is far from everything. Apart from performance, there are both good and bad aspects of Functional Programming.

Basic Chaining

One of the most common ways to use functional programming (style) in JavaScript is chaining. It can look like this:

v = a.map(map_func).filter(filter_func).reduce(reduce_func)

In this case a is an array, and three functions are sequentially called to each element (except reduce is not called on those that filter gets rid of). The return value of reduce (typically a single value) is stored in v.

  • What is the cost of this?
  • What are the alternatives?

I decided to calculate the value of pi by

  1. evenly distribute points in the[0,0][1,1] rectangle.
  2. for each point calculate the distance to origo (a simple map)
  3. get rid of each point beyond distance 1.0 (a simple filter)
  4. count the number of remaing points (a simple reduce – although in this simple case it would be enough to check the length of the array)

The map, filter and reduce functions looks like:

const pi_map_f = (xy) => {
  return xy.x * xy.x + xy.y * xy.y;
};
const pi_filter_f = (xxyy) => {
  return xxyy <= 1.0;
};
const pi_reduce_f = (acc /* ,xxyy */) => {
  return 1 + acc;
};

In chained functional code this looks like:

const pi_higherorder = (pts) => {
  return 4.0
       * pts.map(pi_map_f)
            .filter(pi_filter_f)
            .reduce(pi_reduce_f,0)
       / pts.length;
};

I could use the same three functions in a regular loop:

const pi_funcs = (pts) => {
  let i,v;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    v = pi_map_f(pts[i]);
    if ( pi_filter_f(v) ) inside = pi_reduce_f(inside,v);
  }
  return 4.0 * inside / pts.length;
};

I could also write everything in a single loop and function:

const pi_iterate = (pts) => {
  let i,p;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    p = pts[i];
    if ( p.x * p.x + p.y*p.y <= 1.0 ) inside++;
  }
  return 4.0 * inside / pts.length;
};

What about performance? Here are some results from a Celeron J1900 CPU and Node.js 14.15.0:

Iterate (ms)Funcs (ms)Higher Order (ms)Pi
10k88103.1428
40k34193.1419
160k33473.141575
360k661963.141711
640k11114043.141625
1000k17175593.141676
1440k252511603.14160278

There are some obvious observations to make:

  • Adding more points does not necessarily give a better result (160k seems to be best, so far)
  • All these are run in a single program, waiting 250ms between each test (to let GC and optimizer run). Obvously it took until after 10k for the Node.js optimizer to get things quite optimal (40k is faster than 10k).
  • The cost of writing and calling named functions is zero. Iterate and Funcs are practially identical.
  • The cost of chaining (making arrays to use once only) is significant.

Obviously, if this has any practical significance depends on how large arrays you are looping over, and how often you do it. But lets assume 100k is a practical size for your program (that is for example 100 events per day for three years). We are then talking about wasting 20-30ms every time we do a common map-filter-reduce-style loop. Is that much?

  • If it happens server side or client side, in a way that it affects user latency or UI refresh time, it is significant (especially since this loop is perhaps not the only thing you do)
  • If it happens server side, and often, this chaining choice will start eating up significant part of your server side CPU time

You may have a faster CPU or a smaller problem. But the key point here is that you choose to waste significant amount of CPU cycles because you choose to write pi_higherorder rather than pi_funcs.

Different Node Versions

Here is the same thing, executed with different versions of node.

1000kIterate (ms)Funcs (ms)Higher Order (ms)
8.17.01111635
10.23.01111612
12.19.01111805
14.15.01819583
15.1.01719556

A few findings and comments on this:

  • Different node version show rather different performance
  • Although these results are stable on my machine, what you see here may not be valid for a different CPU or a different problem size (for 1440k points, node version 8 is the fastest).
  • I have noted before, that functional code gets faster, iterative code slower, with newer versions of node.

Conclusion

My conclusions are quite consistent with what I have found before.

  • Writing small, NAMED, testable, reusable, pure functions is good programming, and good functional programming. As you can see above, the overhead of using a function in Node.js is practially zero.
  • Chaining – or other functional programming practices, that are heavy on the memory/garbage collection – is expensive
  • Higher order functions (map, filter, reduce, and so on) are great when
    1. you have a named, testable, reusable function
    2. you actually need the result, just not for using once and throwing away
  • Anonymous functions fed directly into higher order functions have no advantages whatsoever (read here)
  • The code using higher order functions is often harder to
    1. debug, becuase you can’t just put debug outputs in the middle of it
    2. refactor, because you cant just insert code in the middle
    3. use for more complex agorithms, becuase you are stuck with the primitive higher order functions, and sometimes they don’t easily allow you to do what you need

Feature Wish

JavaScript is hardly an optimal programming language for functional programming. One thing I miss is truly pure functions (functions with no side effects – especially no mutation of input data).

I have often seen people change input data in a map.

I believe (not being a JavaScript engine expert) that if Node.js knew that the functions passed to map, filter and reduce above were truly pure, it would allow for crazy optimizations, and the Higher Order scenario could be made as fast as the other ones. However, as it is now, Node.js can not get rid of the temporary arrays (created by map and filter), because of possible side effects (not present in my code).

I tried to write what Node.js could make of the code, if it knew it was pure:

const pi_allinone_f = (acc,xy) => {
  return acc + ( ( xy.x * xy.x + xy.y * xy.y <= 1.0 ) ? 1 : 0);
};

const pi_allinone = (pts) => {
  return 4.0
       * pts.reduce(pi_allinone_f,0)
       / pts.length;
};

However, this code is still 4-5 times slower than the regular loop.

All the code

Here is all the code, if you want to run it yourself.

const points = (n) => {
  const ret = [];
  const start = 0.5 / n;
  const step = 1.0 / n;
  let x, y;
  for ( x=start ; x<1.0 ; x+=step ) {
    for ( y=start ; y<1.0 ; y+=step ) {
      ret.push({ x:x, y:y });
    }
  }
  return ret;
};

const pi_map_f = (xy) => {
  return xy.x * xy.x + xy.y * xy.y;
};
const pi_filter_f = (xxyy) => {
  return xxyy <= 1.0;
};
const pi_reduce_f = (acc /* ,xxyy */) => {
  return 1 + acc;
};
const pi_allinone_f = (acc,xy) => {
  return acc + ( ( xy.x * xy.x + xy.y * xy.y <= 1.0 ) ? 1 : 0);
};

const pi_iterate = (pts) => {
  let i,p;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    p = pts[i];
    if ( p.x * p.x + p.y*p.y <= 1.0 ) inside++;
  }
  return 4.0 * inside / pts.length;
};

const pi_funcs = (pts) => {
  let i,v;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    v = pi_map_f(pts[i]);
    if ( pi_filter_f(v) ) inside = pi_reduce_f(inside,v);
  }
  return 4.0 * inside / pts.length;
};

const pi_allinone = (pts) => {
  return 4.0
       * pts.reduce(pi_allinone_f,0)
       / pts.length;
};

const pi_higherorder = (pts) => {
  return 4.0
       * pts.map(pi_map_f).filter(pi_filter_f).reduce(pi_reduce_f,0)
       / pts.length;
};

const pad = (s) => {
  let r = '' + s;
  while ( r.length < 14 ) r = ' ' + r;
  return r;
}

const funcs = {
  higherorder : pi_higherorder,
  allinone : pi_allinone,
  functions : pi_funcs,
  iterate : pi_iterate
};

const test = (pts,func) => {
  const start = Date.now();
  const pi = funcsfunc;
  const ms = Date.now() - start;
  console.log(pad(func) + pad(pts.length) + pad(ms) + 'ms ' + pi);
};

const test_r = (pts,fs,done) => {
  if ( 0 === fs.length ) return done();
  setTimeout(() => 
    test(pts,fs.shift());
    test_r(pts,fs,done);
  }, 1000);
};

const tests = (ns,done) => {
  if ( 0 === ns.length ) return done();
  const fs = Object.keys(funcs);
  const pts = points(ns.shift());
  test_r(pts,fs,() => {
    tests(ns,done);
  });
};

const main = (args) => {
  tests(args,() => {
    console.log('done');
  });
};

main([10,100,200,400,600,800,1000,1200]);

  1. Functional Programming Sucks* | TechFindings - pingback on 2020/11/07 at 19:16

Leave a Comment


NOTE - You can use these HTML tags and attributes:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

Time limit is exhausted. Please reload CAPTCHA.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trackbacks and Pingbacks: