Decades ago, engineers wrote computer programs in ways that modern programmers scoff at. We learn that functions were long, global variables were used frequently and changed everywhere, variable naming was poor, and gotos jumped across the program in ways that were impossible to follow. It was all harmful.
Elsewhere, mathematicians were improving on Lisp, and functional programming was developed: pure, stateless, provable code focusing on what to do rather than how to do it. Functions became first-class citizens, and they could even be anonymous lambda functions.
Despite the apparent conflict between object-oriented, functional and imperative programming, some things are universally good:
- Functions that are not too long
- Functions that do one thing well
- Functions that have no side effects
- Functions that can be tested, and that also are tested
- Functions that can be reused, perhaps even being general
- Functions and variables that are clearly named
So, how are we doing?
Comparing different styles
I read code, and I talk to people who have different opinions about what makes code good or bad. I decided to implement the same thing following different principles and discuss the different options. In particular, I want to explore different ways to do functional programming.
My language of choice is JavaScript because it allows different styles, it requires relatively little code, and many people should be able to read it.
My artificial problem is this: I have two arrays of N numbers each. One number from each array can be added in N×N different ways. How many of these sums are prime? That is, for N=2, if I have [10,15] and [2,5] I get the sums [12,15,17,20], of which one number (17) is prime. In all code below I decide whether a number is prime in the same simple way.
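The N=2 example above can be sketched as a brute-force check, using the same naive trial-division primality test that runs through all the examples in this article:

```javascript
// Naive trial division: returns 1 if n is prime, 0 otherwise
// (the same simple test used throughout this article).
const is_prime = (n) => {
  for (let d = 2; d * d <= n; d++) if (n % d === 0) return 0;
  return 1;
};

// All N*N pairwise sums of [10,15] and [2,5].
const sums = [];
for (const a of [10, 15]) for (const b of [2, 5]) sums.push(a + b);

// sums is [12, 15, 17, 20]; only 17 is prime.
const count = sums.filter((n) => is_prime(n) === 1).length;
console.log(sums, count); // [ 12, 15, 17, 20 ] 1
```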
Old imperative style (imperative)
The old imperative style would use variables and loops. If JavaScript had goto, I would use it instead of setting a variable (p) before breaking out of the inner loop. This code allows nothing to be tested or reused separately, although the function itself is testable, reusable and pure (for practical purposes and correct input, just as all the other examples).
const primecount = (a1,a2) => {
    let i, j;
    let d, n, p;
    let retval = 0;
    for ( i=0 ; i<a1.length ; i++ ) {
        for ( j=0 ; j<a2.length ; j++ ) {
            n = a1[i] + a2[j];
            p = 1;
            for ( d=2 ; d*d<=n ; d++ ) {
                if ( 0 === n % d ) {
                    p = 0;
                    break;
                }
            }
            retval += p;
        }
    }
    return retval;
};
Functional style with lambda-functions (lambda)
The functional programming equivalent would look like the code below. I have focused on avoiding variable declarations (which would lead to mutable state), instead using the higher-order function reduce to iterate over the two lists. This code also allows no parts to be tested or reused. In a few lines of code there are three unnamed functions, none of them trivial.
const primecount = (a1,a2) => {
    return a1.reduce((sum1,a1val) => {
        return sum1 + a2.reduce((sum2,a2val) => {
            return sum2 + ((n) => {
                for ( let d=2 ; d*d<=n ; d++ ) if ( 0 === n % d ) return 0;
                return 1;
            })(a1val+a2val);
        }, 0);
    }, 0);
};
Imperative style with separate test function (imperative_alt)
The imperative code can be improved by breaking out the prime test into its own function. The advantage is clearly that the prime function can be modified more cleanly, and it can be tested and reused. Also note that the usefulness of goto disappears, because return fulfills the same task.
const is_prime = (n) => {
    for ( let d=2 ; d*d<=n ; d++ ) if ( 0 === n % d ) return 0;
    return 1;
};
const primecount = (a1,a2) => {
    let retval = 0;
    for ( let i=0 ; i<a1.length ; i++ )
        for ( let j=0 ; j<a2.length ; j++ )
            retval += is_prime(a1[i] + a2[j]);
    return retval;
};
const test = () => {
    if ( 1 !== is_prime(19) ) throw new Error('is_prime(19) failed');
};
Functional style with lambda and separate test function (lambda_alt)
In the same way, the reduce+lambda code can be improved by breaking out the prime test function. That function, but nothing else, is now testable and reusable.
const is_prime = (n) => {
    for ( let d=2 ; d*d<=n ; d++ ) if ( 0 === n % d ) return 0;
    return 1;
};
const primecount = (a1,a2) => {
    return a1.reduce((sum1,a1val) => {
        return sum1 + a2.reduce((sum2,a2val) => {
            return sum2 + is_prime(a1val+a2val);
        }, 0);
    }, 0);
};
const test = () => {
    if ( 1 !== is_prime(19) ) throw new Error('is_prime(19) failed');
};
I think I can do better than any of the four examples above.
Functional style with reduce and named functions (reducer)
I don't need to feed anonymous functions to reduce: I can give it named, testable and reusable functions instead. A challenge with reduce is that it is not very intuitive. filter can be used with any has* or is* function that you may already have. map can be used with any x_to_y function, or some get_x_from_y getter or reader function that is also often useful. sort requires a cmpAB function. But reduce? I decided to name the functions below that are used with reduce reducer_*. It works quite nicely.

The first one, reducer_count_primes, simply counts the primes in a list. That is (re)useful and testable in itself. The next function, reducer_count_primes_for_offset, is less likely to be generally reused (with offset=1 it considers 12+1 to be prime, but not 17+1), but it makes sense and it can be tested. Doing the same trick one more time with reducer_count_primes_for_offset_array, and we are done.

These functions may never be reused. But they can be tested, and that is often a great advantage during development. You can build up your program part by part, and every step is a little more potent but still completely pure and testable (I remember this from my Haskell course long ago). This is how to solve hard problems using test-driven development, and to have all tests in place when you are done.
const is_prime = (n) => {
    for ( let d=2 ; d*d<=n ; d++ ) if ( 0 === n % d ) return 0;
    return 1;
};
const reducer_count_primes = (s,n) => {
    return s + is_prime(n);
};
const reducer_count_primes_for_offset = (o) => {
    return (s,n) => { return reducer_count_primes(s,o+n); };
};
const reducer_count_primes_for_offset_array = (a) => {
    return (s,b) => { return s + a.reduce(reducer_count_primes_for_offset(b), 0); };
};
const primecount = (a1,a2) => {
    return a1.reduce(reducer_count_primes_for_offset_array(a2), 0);
};
const test = () => {
    if ( 1 !== [12,13,14].reduce(reducer_count_primes, 0) )
        throw new Error('reducer_count_primes failed');
    if ( 1 !== [9,10,11].reduce(reducer_count_primes_for_offset(3), 0) )
        throw new Error('reducer_count_primes_for_offset failed');
    if ( 2 !== [2,5].reduce(reducer_count_primes_for_offset_array([8,15]),0) )
        throw new Error('reducer_count_primes_for_offset_array failed');
};
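As noted above, filter pairs naturally with an is* predicate. For comparison, here is a hypothetical alternative to the reducer_* chain that builds all the pairwise sums with map and flat and then counts the primes with filter. It materializes an intermediate N×N array, so it trades memory for readability; this is a sketch, not code from my benchmark script.

```javascript
// The same naive primality test used throughout the article.
const is_prime = (n) => {
  for (let d = 2; d * d <= n; d++) if (n % d === 0) return 0;
  return 1;
};

// Hypothetical map/flat/filter variant: build every pairwise sum,
// then count the primes. Allocates an N*N intermediate array.
const primecount = (a1, a2) =>
  a1.map((a) => a2.map((b) => a + b)) // N arrays of N sums each
    .flat()                           // one flat array of N*N sums
    .filter((n) => is_prime(n) === 1) // keep only the primes
    .length;

console.log(primecount([10, 15], [2, 5])); // 1
```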
Using recursion (recursive)
Personally, I like recursion. I think it is easier to use than reduce, and it is great for async code. The bad thing with recursion is that your stack will eventually overflow (if you don't know what I mean, try my code, available below) at recursion depths that are far from unrealistic. My problem can be solved in the same step-by-step, test-driven way using recursion.
const is_prime = (n) => {
    for ( let d=2 ; d*d<=n ; d++ ) if ( 0 === n % d ) return 0;
    return 1;
};
const primes_for_offset = (a,o,i=0) => {
    if ( i === a.length )
        return 0;
    else
        return is_prime(a[i]+o) + primes_for_offset(a,o,i+1);
};
const primes_for_offsets = (a,oa,i=0) => {
    if ( i === oa.length )
        return 0;
    else
        return primes_for_offset(a,oa[i]) + primes_for_offsets(a,oa,i+1);
};
const primecount = (a1,a2) => {
    return primes_for_offsets(a1,a2);
};
const test = () => {
    if ( 2 !== primes_for_offset([15,16,17],2) )
        throw new Error('primes_for_offset failed');
};
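One standard way around the stack limit (a sketch of the general trampoline technique, not code from this article) is to rewrite the recursion in accumulator style so that each step returns a thunk instead of making a nested call. A small loop then unwinds the thunks iteratively, using constant stack space regardless of array length:

```javascript
// Same naive primality test as in the article.
const is_prime = (n) => {
  for (let d = 2; d * d <= n; d++) if (n % d === 0) return 0;
  return 1;
};

// Accumulator-style step: returns either a final number (base case)
// or a zero-argument thunk representing the next recursive step.
const primes_for_offset_step = (a, o, i = 0, acc = 0) =>
  i === a.length
    ? acc
    : () => primes_for_offset_step(a, o, i + 1, acc + is_prime(a[i] + o));

// The trampoline: keep calling thunks until a non-function comes back.
const trampoline = (result) => {
  while (typeof result === 'function') result = result();
  return result;
};

const primes_for_offset = (a, o) => trampoline(primes_for_offset_step(a, o));

console.log(primes_for_offset([15, 16, 17], 2)); // 2
```

This works even for arrays far deeper than the call-stack limit, at the cost of allocating one closure per element.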
Custom Higher Order Function (custom_higher_order)
Clearly reduce is not a perfect fit for my problem, since I need to nest it. What if I had a reduce-like function that produced the sum over all N×N possible pairs from two arrays, given a custom value function? That would be quite great, and it is not particularly hard either. In my opinion this is a very functional approach (despite being implemented with for-loops). All the functions written are independently reusable in a way not seen in the other examples. The problem with higher-order functions is that they are pretty abstract, so they are hard to name, and they need to be general enough to ever be reused for practical purposes. Nevertheless, if I see the abstraction right away, I write it. But I don't spend time inventing generic stuff instead of solving the actual problem at hand.
const is_prime = (n) => {
    for ( let d=2 ; d*d<=n ; d++ ) if ( 0 === n % d ) return 0;
    return 1;
};
const combination_is_prime = (a,b) => {
    return is_prime(a+b);
};
const sum_of_combinations = (a1,a2,f) => {
    let retval = 0;
    for ( let i=0 ; i<a1.length ; i++ )
        for ( let j=0 ; j<a2.length ; j++ )
            retval += f(a1[i],a2[j]);
    return retval;
};
const primecount = (a1,a2) => {
    return sum_of_combinations(a1,a2,combination_is_prime);
};
const test = () => {
    if ( 1 !== is_prime(19) )
        throw new Error('is_prime(19) failed');
    if ( 0 !== combination_is_prime(5,7) )
        throw new Error('combination_is_prime(5,7) failed');
    if ( 1 !== sum_of_combinations([5,7],[7,9],(a,b)=> { return a===b; }) )
        throw new Error('sum_of_combinations failed');
};
Lambda Functions considered harmful?
Just as there are many bad and some good applications for goto, there are both good and bad uses for lambdas.
I actually don't know if you, the reader, agree with me that the second example (lambda) offers no real improvement over the first example (imperative). On the contrary, it is arguably more complex conceptually to nest anonymous functions than to nest for-loops. I may have done the lambda example wrong, but there is much code out there written in that style.
I think the beauty of functional programming lies, among other things, in its testable and reusable aspects. Long, or even nested, lambda functions offer no improvement over old spaghetti code in those respects.
All the code and performance
You can download my code and run it using any recent version of Node.js:
$ node functional-styles-1.js 1000
The argument (1000) is N; if you double N, execution time should quadruple. I did some benchmarks, and your results may vary depending on plenty of things. The figures below are from a single run with N=3000, but nesting reduce clearly comes at a cost. As always, if what you do inside reduce is expensive, the overhead is negligible. But using reduce (or any of the built-in higher-order functions) for the innermost, tightest loop is wasteful.
834 ms : imperative
874 ms : custom_higher_order
890 ms : recursive
896 ms : imperative_alt
1015 ms : reducer
1018 ms : lambda_alt
1109 ms : lambda
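A run like the ones above can be reproduced with a minimal timing harness along these lines (a sketch; the actual functional-styles-1.js script may differ in input generation and output format):

```javascript
// Minimal benchmark harness sketch: time primecount over two arrays
// of N pseudo-random numbers, with N taken from the command line.
const is_prime = (n) => {
  for (let d = 2; d * d <= n; d++) if (n % d === 0) return 0;
  return 1;
};

const primecount = (a1, a2) => {
  let retval = 0;
  for (let i = 0; i < a1.length; i++)
    for (let j = 0; j < a2.length; j++)
      retval += is_prime(a1[i] + a2[j]);
  return retval;
};

// N from the command line, defaulting to 1000.
const N = Number(process.argv[2]) || 1000;
const rand = () => 1 + Math.floor(Math.random() * 1000000);
const a1 = Array.from({ length: N }, rand);
const a2 = Array.from({ length: N }, rand);

const t0 = process.hrtime.bigint();
const count = primecount(a1, a2);
const t1 = process.hrtime.bigint();
console.log(`${(t1 - t0) / 1000000n} ms : imperative (${count} primes)`);
```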
Other findings on this topic
Functional Programming Sucks