Category Archives: Programming

My new basic .vimrc

I decided to improve my Vim situation a bit from disabling most everything to a basic .vimrc I stole from someone online and modified slightly.

set nocompatible
syntax on
set modelines=0
set ruler
set encoding=utf-8
set wrap

set tabstop=2
set shiftwidth=2
set softtabstop=2
set autoindent
set copyindent
set expandtab
set noshiftround

set hlsearch
set incsearch
set showmatch
set smartcase

set hidden
set ttyfast
set laststatus=2

set showcmd
set background=dark

" from ThePrimeagen
nnoremap <C-d> <C-d>zz
nnoremap <C-u> <C-u>zz

set colorcolumn=80
set relativenumber

With that done, I had a few more questions.

Q: How do I stop search highlight when I am done searching?
A: :nohls

Q: How (outside vim) do I check number of columns of my terminal?
A: $tput cols

Practical Fullstack approach to Vue3

I am a fullstack developer. I do user support, requirement analysis, nginx configuration, backup handling, integrations and css. Much of my work is centered around Node.js and Vue.

I do not love Vue and I am not fascinated with it, nor interested in its internal working. It is a tool to write reactive (2-way-binding) web applications and to modularize web code (write small reusable components).

I started using AngularJS (v1) years ago. That changed the way I thought of web development and I became more productive making better web applications. However AngularJS is kind of abandoned now, and it is a much heavier framework than need. So I started transitioning to Vue2 some years ago thinking of it as AngularJS-light. In most ways Vue2 is nicer than AngularJS, except sometimes it (the reactivity system) does not work (predictably). Well, Vue2 probably works exactly as it is written to work, but for me as a developer I think I do what works, and then it fails for some delicate detail that I can figure out if I have time, maybe. That is waste of time and it adds uncertainty to my job.

I have this feeling with Vue2 that I need a plan (a design pattern) and I still have not found one that works well. Now I started migrating Vue2 to Vue3 and things change a bit (and things break a bit). So now I really want to find simple design principles for my Vue3 applications, that I can follow so everything works without me thinking about it.

State

The real challenge when writing an (single page web) application is managing the state:

  • Information about session
  • I/O and error handling
  • Incoming data, updating state and UI, avoiding conflicts
  • Application settings, like filters and user choices
  • Data validation and invalid-data-feedback to user
  • Data modelling

This is harder than writing a good user interface. So the client-side-design needs to be state-first, not UI-first. Also

  1. I am migrating things from Vue2 to Vue3 that were originally AngularJS. If the state, business logic, I/O, were dependent on AngularJS-stuff I would be stuck in AngularJS.
  2. I can write this state in generic JS (not specific for web/UI) and thus make it possible to also run it on a server, exposing it as APIs and do automated testing

So I am not interested in Veux. Perhaps it can run backend, I do not care. This is how I think of my architecture:

  1. GUI (as thin layer as possible)
  2. Web component business logic (plain JS, no Vue, can run in Node.js)
  3. State (plain JS, no Vue, can run in Node.js)
  4. Server Side (Node.js)

Most code can (ideally) run in 2-4, and can thus be tested in Node.js. Vue is only about (1).

State – Vue – Interface

Some applications are large and state consists of many modules. Some are small. For a simple example I think of state as (this is, admittedly based on Vue2-experience, where you need to consume state data in a way that makes it reactive in Vue).

STATE = {
  ro : { ... things that are readonly to Vue (like server data) ... }
  rw : { ... things that Vue can change (like filter choices or user input) ... }
  api: { ... functions that can do stuff with state ...}
}

Typically I would get this from a plain JS factory function where I can supply dependencies that are possibly implemented (for I/O) differently on Browser and Node.js:

function MyStateFactory(libIO, lib1, lib2) {
const state = { ro:{} , rw:{} , api:{} };
... set up state ...
return state;
}

How do I make this state available (and reactive) in my Vue application and Vue components? I can admit that in the past I have been lazy and done things like (pseudo code):

Vue.Component({
  data : function() {
    return { state : STATE };
  },
  ... more stuff
});

That kind of works in many cases! But I have also ran into problems. In a real application you have 3 types of components:

  1. Those that depend only on supplied props
  2. Those that depend only on state
  3. Those that depend on a mix of props and state

In theory, all components should be (1) and all other design is rotten (you may think). In practice 2 and 3 are useful too.

So the real questions are:

  1. How do you expose state, as data and functions, so changes can be picked up by the right components?
  2. How do you design components so they update when they should?

AngularJS – Vue2 – Vue3

AngularJS has a function that recalculates everything, over and over again until it gets the same result twice. Then it updates the DOM accordingly. AngularJS does not detect changes to data automatically, however when you use its api to manipulate data, you are also telling AngularJS that you are changing data, and the function mentioned before can run. If you change data from outside angular you may need to use $scope.$apply() to tell Angular to refresh.

In Vue2, when you make an object part of “data” in a component, it becomes “reactive”. This is very implicit and it has some limitations (with Arrays, for example). What if a prop changes? What if you have a method that uses external data (as a state outside Vue), does that trigger refresh of the component? If a parent refreshes does that refresh the children? What if a child element deep down in data/prop changes? What if properties in data are replaced/added rather than modified? All I say is that to someone so stupid and ignorant as me it is not obvous how to make Vue2 just work.

Vue3 is more explicit (I interpret that as Vue2 was unpredictable not only for me, and that the Vue2 abstraction was leaky). While AngularJS and Vue2 hide the mechanism that drives reactivity Vue3 has the very explicit functions:

  • ref
  • reactive
  • readonly
  • shallowRef
  • shallowReactive
  • shallowReadonly

These functions are what connects the Vue controllers to the outside-of-Vue-world, and makes it possible for Vue to detect changes.

Vue3 – a first simple practical design

So, now we can think that we have something like this:

var state = { a:1 };
state = Vue.reactive(state);

Vue.component({
  props : { /* parameters to component here */ },
  data : { /* component local data here */ },
  methods : { /* access state via functions here */ }
});

Lets explore what happens and what works!

Vue.reactive()

We need to understand – exactly – the difference between the argument and the result of Vue.reactive():

var state_reactive = Vue.Reactive(state_original)

I have experimented (test0) and found that:

  • They share the same data, that is, the data lives in state_original
  • If you add/delete/modify data in one of them, it is immediately available in the other
  • state_orginal, and none of its children, will ever be reactive
  • state_reactive, and all of its children, will always (more on this later) be reactive
  • state_orginal and state_reactive (and all of its corresponding children) are never the same object (everything in state_reactive is Proxy to state_orginal)

The consequence of this is that your state-library must modify data in state_reactive, if you want Vue to be notified of the change (when Vue anyway refreshes for any reason, it will get the changes).

So the problem becomes:

var state = StateFactory();

// will not work because the internal state of StateFactory is not affected
state = Vue.reactive(state)

// this has severe implications:
//  1) Vue is modifying something inside the state library
//     (that was explicitely supposed to be read-only)
//  2) How does the state library protect itself?
//     possibly keeping an internal ro, and replacing the state.ro regularly
state.ro = Vue.Reactive(state.ro)

So I would say that you will need to design your state library with this in mind.

  1. Be clear that state.ro, state.ro can be made reactive this way, or
  2. Expose something else that can be made reactive (state.updated), or
  3. If the “Vue” object is present, the library could use it and take care of making the right things reactive itself, or
  4. The library has some way to notify of updates (an update callback function), and in the Vue-world, you let that callback be a function that updates something that IS reactive already.

Regardless of this little problem it is a much better situation than the Vue2-way of adding external objects in data and hoping for the best.

What Vue3 Components are updated?

Now that we know what data outside of the component that is reactive, the questions are:

  • what changes trigger the component to be updated?
  • when do child components update?
  • can any number of components, independently, be triggered by the same reactive variable?
  • if a reactive variable is used in a method, does that trigger an update
    • even if the method is not always called
    • even if the variable is not always used in the method (only some branches)
    • even if the result of the method does not change
    • even if the result of the method is not rendered
    • when the method called on any other occation (like polling)
  • is there any risk that some components are updated multiple times (in one cycle)?
  • in what order are the components updated, and can it matter?
  • what is enough to trick/force a component to update?
  • can a component update only partly?

I do not really want to know. I just want a way of designing my components so they work predictably. What I have found in the past is that when I write one big Vue-component (or application) I have no problems. But when I start modularizing it, with components that are sometimes but not always nested, that is when problems start. I have asked myself questions (with Vue2) like:

  • can one component override/replace what triggers another component, making the earlier “trigger-subscriber” dead?
  • can I end up with deadlocks, infinite recursion or mutual dependencies?
    (it has happened)

We kind of can not keep all these details in mind when building an application. We need to have a simple design that just works every time.

Single Application

I have written a simple test (test1) application with no child components. I want to explore how little reactivity works.

// Start with a STATE:
  STATE = { lines: [] };

  setInterval(() => {
    updateLines(STATE);  // implemented elsewhere somehow
  }, 2000);

// Make reactive
  STATE.lines = Vue.reactive(STATE.lines);

// Make a simple app (not showing everything)
  Vue.createApp({
    data: {
      function() {
        return { lines : STATE.lines }
      }
    }
  });

// In HTML Template, do something like
  <tr v-for="l in lines">...</tr>

This works (as expected) IF updateLines does not replace the lines array. So I have experimented and found:

Vue.reactive(STATE.lines)updateLines keeps linesworks
Vue.reactive(STATE.lines)updateLines replaces linesweird behaviour
Vue.reactive(STATE)updateLines keeps linesworks
Vue.reactive(STATE)updateLines replaces linesweird behaviour
Vue.shallowReactive(STATE)updateLines keeps linesnot at all
Vue.shallowReactive(STATE)updateLines replaces linesnot at all

The problem here is that when STATE.lines becomes a new array, the data:function in createApp does not re-run, so we keep track of and old array that STATE no longer cares about (the weird behaviour is that updateLines kept part of the old lines-structure and that “garbage” is still reactive).

It is clearly a sub-optimal situation that the implementation of state, and not just what STATE looks like, matters. 4 alternatives do not work, 2 work but are bad design. What about:

  STATE = Vue.shallowReactive(STATE);

// Make a simple app (not showing everything)
  Vue.createApp({
    data: {
      function() {
        return { state : STATE }
      }
    }
  });

// In HTML Template, do something like
  <tr v-for="l in state.lines">...</tr>

This works also sometimes:

Vue.shallowReactive(state)updateLines keeps linesnot at all
Vue.shallowReactive(state)updateLines replaces linesworks
Vue.reactive(state)updateLines keeps linesworks
Vue.reactive(state)updateLines replaces linesworks

The only thing that works regardless how updateLines is implemented, is to make all of STATE recursively reactive and make all of it data in every component. Exactly what I admitted above that I had been doing with Vue2.

shallowReactive is appealing, but it depends on the inner implementation of state, and that kind of design will quite possibly give you nasty bugs later when something changes.

So, making your data embrace STATE, or part of states works only if you know how state updates itself, unless you make exactly all of STATE recursively reactive. I think more people than I find that sledgehammer approach unsatisfying.

How about state signalling that something is updated, by updating a primitive variable (value, to avoid using ref and then ending upp with value anyway)? Note that updated.value is truthy from the beginning, and it will alwasy be truthy (++ only operation on it ever), but Vue does not know that so it needs to read and check.

  STATE = { updated: {value:1} , lines: [] };

// Make just updated reactive
  STATE.updated = Vue.reactive(STATE.updated);

  setInterval(() => {
    updateLines(STATE);
    STATE.updated.value++;
  }, 2000);

// And in Vue.createApp
  data : function() {
           return {
             lines : STATE.lines,
             updated : STATE.updated
           }
         }

Now, we can not rely on

  • lines, because it is not reactive at all
  • updated in data because it is not used in the template

However, there are some things that work. First it seems nice to replace lines in data with a method:

  methods : {
    getLines : () => { return STATE.lines; }
  }

  <tr v-for="l in getLines()">...<tr>

However, that is not enough because lines is still not reactive. But here are 3 simple little hacks that do work (and of course there are or more):

  <-- output updated in the template - not typically wanted -->
  Updated: {{ updated.value }}

  <-- display table (parent of lines) "conditionally" (always truthy)-->
  <table v-if="updated.value">

  // use updated.value in getLines()-function
  getLines() => { return STATE.updated.value ? STATE.lines : []; }
  getLines() => { console.log(STATE.updated.value); return STATE.lines; }
  // assuming devnull exists and does nothing
  getLines() => { devnull(STATE.updated.value); return STATE.lines; }

You can use this.updated.value if you use function() instead of ()=>{}. I find it quite a positive surprise that the above works even if neither lines nor updated is in data:

  STATE.updated = Vue.reactive(STATE.updated);

  data: {}
  methods: {
    getLines : () => { return STATE.lines; }
  }

  <table v-if="updated()">
    <tr v-for="l in getLines()">...</tr>
  </table>

This is beginning to look like something that appeals to my sense of simplicity. The conclusion for simple application, external state, and no components is that you have two (simple to explain) options:

  1. Make all of STATE (the exposed data, not functions) recursively reactive. Use it directly from everywhere.
  2. Make only one variable, STATE.updated, reactive. Make sure to poke (++ is probably ok) that variable whenever you want anything else to update.

Beware of doing things like value = STATE.some.child.value for anything that is expected to work beyond the next “tick”. Please note that I have not build a big single-application-zero-components application this way. So for now, this is just a qualified hypothesys.

You can check out the final result as test1.

Application with components

I split my application into three components. The application itself has no data or methods.

  • Application (test2)
    • Market
    • Portfolio
    • Copyrigh notice (nothing reactive, should fail to update)

This worked fine with no changes to reactivity compared to test1. So I got confident and made components for the lines:

  • Application (test3)
    • Market
      • Market-quote
    • Portfolio
      • Portfolio-stock

This did not immediately work. The template in Market had this content:

  <test-market-quote v-for="q in getQuotes()" :q="q">

getQuotes is still depending on update.value and is called every time, but it “happens” to return the same, modified, quote-lines every time. So the test-market-quote does not realise anything changed:

  APP.component('test-market-quote',{
    props : {
      q : Object
    },
    data : function() { return {}; },
    template: '#test-market-quote',
    ...

So I needed to replace (in a method in the test-market-quote component):

  return API.valueOf(stock) < STATE.trader.cash;
//with
  return STATE.updated.value && API.valueOf(stock) < STATE.trader.cash;

in order to make sure some method in the child component also is dependent on updated.value. This was not needed when there were no child components, becuase the parent component had another dependency on updated.value and that cause the update of the entire component (but obviously not forcing its children to update). That worked in one component, but the other component had no methods to add a dependency to, so I successfully replaced

  <td>{{ s.name }}</td>
<!-- with -->
  <td>{{ upd(s.name) }}</td>

//and added
  methods : {
    upd : (v) => { return STATE.updated.value ? v : 0; }
  }

Reality check!

This is beginning to be ridiculous. How many obscure hacks are we going to use to not have to rely on the reactivity system that is at the heart of Vue? These kind of hacks are also a source of (subtle) bugs and can make refactoring harder. The problem I have experienced with Vue is that I need to understand how things work under the hood. But with these different updated.value hacks the cure is getting as bad as the disease (that said, these hacks are probably anyway things you need to do, if components do not update when you want them to).

So I was thinking about a universal fix for updated.value (test4):

// first a reusable function
  API.yes = () => { return !!updated.value; }

// second every (child) component makes it part of its methods
  methods: {
    yes : API.yes
  }

// third, every component template has a root element, add a v-if
  <div v-if="yes()">
    .. component template body ..
  </div>

This works! It is rather simple.

Two working practial design choices

So, we have arrived at two rather simple (conceptually) ways to build predictable Vue3 applications that rely on external (Vue-independent) state code:

  1. Make all state recursively reactive, from the root
  2. Only make an update-variable reactive, and make all components depend on it

Obviously there does not have to be exactly ONE state. There can be some log-state, some io-state, some user-setting-state, some data-state and they could possibly work differently.

Regardless, it is important to understand that if there is an exposed state, the code responsible for the state may update objects, replace object, reuse objects and move objects around in the state (tree), unless the state is very clear and strict about this.

How is the recursive reactivity-tree updated?

We have a plain object:

  x1.a.b.c.d = 5;

// make it recursively reactive

  x2 = Vue.reactive(x1); 

// we add something in x1

  x1.a.b.c2: {d : 3.14} };

// NOW c2 and c2.d can not possibly be reactive, because if
// Vue could know that we did that to a child of x1, we would
// not need to make a reactive copy (proxy) of x1 in the first
// place. The above operation does not trigger anything in Vue.

// How does it happen that c2 and c2.d *eventually* end up
// begin reactive?, that is
//   isReactive(x2.a.b.c2.d) => true

// Well I think when you do that (isReactive) you are accessing
// x2, x2.a, x2.a.b and eventually x2.a.b.c2. That means, you
// are asking the Proxy b to give you its child c2 - which it was
// not aware of. But now it is!
// So it can find c2, make it recursively reactive before giving it
// to you so you can resolve x2.a.b.c2.d and pass it to isReactive().

That is how I think it works.

Vue2 vs Vue3

I really have not code very much in Vue3 yet. But I have done enough research to write this post. I think Vue3 seems to be MUCH better.

In Vue2 I ended up doing:

STATE = { ro:{}, rw:{}, api:{} };

Component({
  data : function() {
    state_ro : STATE.ro
  },
  methods : {
    stuff : function(
      state_ro.foo ...     // is there any difference?
      ... STATE.ro.bar     // does it matter if I use
    )                      // state_ro or STATE.ro?
  }

This mess is GONE with Vue3. Making things reactive is 100% explicit. data can be used exclusively for local variables to that component instance. methods can be used to get external state data, and if it is reactive that methods will trigger and the component will update.

I can imagine that this can essentially be done with Vue2 as well. But with the implicit reactivity system based on putting external state in data I never managed to figure it out.

When my code is migrated to Vue3 I will never look back at Vue2. I think.

Performance

I get a bad feeling when I think of thousands of lines, where each line is a complex object (like an invoice), and everything is made reactive, recursively. And that there are thousands of components on my web page, that each depend on several objects/properties in every invoice. And that these thousands of lines may just be discarded when something new arrives over the network, garbage collector trying to make sense of it, and new complex lines comes in to be made reactive. Perhaps it takes 0.05s and it is no problem. Perhaps it always works.

To just render everything again is usually no problem, very predictable, and rather easy to optimize if needed.

But I think like I did with AngularJS. Use the reactivity-system inside the components. But do not let the components, or the Vue Application for that matter, be made aware of everything. Give to it what it needs to do, to render everything and to allow for user input/edit where required.

A second thought…

I wrote a minesweeper game (test5) with the intention of abusing the reactivity system and see if performance got bad. It turned out I wrote a too good implementation, and the reactivity system worked perfectly. So I failed to prove my idea, but I learnt something on the way: that the reactivity system is better than I first thought – not the implementation in Vue but the idea, fundamentally.

My fear was that I essentially have two trees of objects

  • Data, nested data, deeply nested data, complicated data and separated data
  • UI, nested components, deeply nested components and separate components

My fear was that data consists of tens of thousands of reactive nodes, and each of them will trigger an update to hundreds or thousands of UI-components. However…

…in reality the data-tree will much resemble the ui-component-tree. Often there will be 1-to-1 relationships. This means that nothing will be updated in vain, and everything (and only that) which should be updated will be updated, typically each triggered by one or a few reactive “events”.

What would be wasteful?

  • If a single small component depends on a lot of spread out data – but that is already wasteful design because everything needs to be calculated every time – which is probably a bigger problem than the reactivity system
  • If the data tree is much larger than needed (for the current presentation). Lets assume we have a game with a few AI players. Each AI player is just presented as name, score and a few more fields. But the AI implementation may be several megabytes of complex data in memory. There is NO reason to make that data reactive out of laziness.

My basic design in AngularJS was:

  STATE ==> Presentation Data ==> Apply Filters ==> Apply Paging ==> Give to Angular

This means that the Presentation data is already structured quite much as the UI and its components, and it does not contain more data than necessary (or it does not have to, if it gives problems).

Disclaimer

I will be doing work based on this “reasearch” in the near future. I will update this post if I discover something relevant.

Qnap, SonarQube and Elastic Search

Update 2021-10-20: Solution in the end of the post

I use SonarQube (and SonarScanner) to analyze my JavaScript source code in a project that has almost 200k lines of code.

SonarQube is a non-trivial install and it should be on a server so different developers can access it. Thus, SonarQube is an application that it would make sense to put in a packaged container, easy to set up.

I have a QNAP, with Container Station installed. That allows me to run Docker images and Linux (LXC/LXD) image. To me, this sounds like a perfect match: Just install a SonarQube Docker container on the QNAP and be happy!

Well, that was not how they thought it.

Last time I checked the SonarQube docker image did not come with a database. That would have been the entire point! Most work related to setting up SonarQube is related to the database. Docker support data folders, so it would be easy to configure the docker container with a single datafolder for the database and everything. No. You need two docker images.

The next problem is that SonarQube comes bundled with ElasticSearch which has some remarkable system requirements. Your operating system needs to be configured to support

  • 65535 open file descriptors (link)
  • 262144 vm.max_map_count (link)

Now the first problem is that Docker in QNAP does not support this. However it works with LXC.
The second problem is that QNAP is getting rid of LXC in favour of LXD, but you cant have 65535 open file descriptors with LXD (on QNAP – hopefully they fix it). So I am stuck with unsupported LXC.

But the real problem is – who the f**k at Elastic Search thought these were reasonable requirements?

I understand that if you have hundreds of programmers working on tens of millions of code you need a powerful server. And perhaps at some point the values above make sense. But that these are minimum requirements to just START ElasticSearch? How f***ing arrogant can you be to expect people to modify /etc/security and kernel control parameters to run an in-memory database as a priviliged user?

The above numbers seem absolutely arbitrary (I see that it is 2^16-1 of course). How can 65535 file descriptors be kind of fine, if 32000 are not? Or 1000? I understand if you need to scale enormously. But before you need to scale to enormous amounts of data, it would be absolutely wasteful, stupid and complicated to open 50k+ files at the same time. And if 32000 file descriptors are not enough for your clients big data, how long are you going to be fine with 65535? For a few more weeks?

This is arrogant, rotten, low-quality engineering (and I will change my mind and apologize if anyone can provide a reasonable answer).

All the data SonarQube of goes to a regular database. ElasticSearch is just some kind of report-processing thing serving the frontend. I did a backup of mine today, a simple pg_dump that produces an INSERT line in a text file for every database entry. Not very optimized. My database was 36Mb. So if Elastic Search would use just 36000 file descriptors, each file correspond to 1k of actual data.

I don’t know if I am the most disappointed with the idiots at ElasticSearch, or the idiots of SonarQube who made their quite ordinary looking GUI dependent on this tyrannosaurus of a dependence.

Hopefully the QNAP people can raise the limits to ridiculous values, so nobody at ElasticSearch needs to write sensible code.

And if anyone knows a hack so you can make ElasticSearch start with lower values (at my own risk), please let me know!

Solution!

QNAP support helped me with the perhaps obvious solution. Log in as admin with ssh to the QNAP and run:

[~] # sysctl -w vm.max_map_count=262144
[~] # lx config set deb11-sonarqube limits.kernel.nofile 65535

The first command I already knew. You have to run it whenever the QNAP has restarted.

The second command is for setting the file limit in the particular container (deb11-sonarqube is the name of my container). I guess you just need to do it once (and then restart the container), and that the setting remains.

Simple Loops in JavaScript

I let SonarQube inspect my JavaScript code and it had opinions about my loops. I learnt about the for-of-loop. Let us see what we have.

Below four loop-constructions are for most practical purposes the same.

  // good old for-loop
  for ( let i=0 ; i<array.length ; i++ ) {
    const x = array[i];
    ...
  }

  // for-in-loop
  for ( const i in array ) {
    const x = array[i];
    ...
  }

  // for-of-loop
  for ( const x of array ) {
    ...
  }

  // forEach
  array.forEach((x) => {
     ...
  });

Well, if they are all practially the same, why bother? Why not pick one for all cases? Well, in the details, they are different when it comes to

  • simplicity to write / verboseness
  • performance
  • flexibility and explicitness

Lets discuss the loops.

The good old for-loop

The good old for-loop requires you to write the name of the array twice, and you need to explicitely increment the loop variable and compare it the length of the array. This is very easy, but it is possible to make silly mistakes.

In many/most cases it is unnecessarily explicit and verbose. However, as soon as you want to do things like:

  • skip first, or any other element
  • access several items in the array each (most commonly adjacent items: 01, 12, 23, 34, 45)
  • break / continue
  • modify the array – even the length of it – during the loop
  • sparse arrays, with undefined, it is obvious what you get

this becomes very natural with the good old loop. Doing it with the others will make it appear a bit contrived or the result may not be so obviously correct.

There is also something very explicit about the order. It may be true (or not?) that every implementation of JavaScript will always execute the other three loops in order. But you need to know that, to be absolutely sure, when reading the code. Not so with the good old for-loop. If order is a critical part of the algorithm and you may want to be explicit about it.

This is also the fastest loop.

The for-in-loop

for-in enumerates properties and loops over them. Do not use it for arrays:

  • it makes more sense to use for-in for Object, so the reader of the code may think your array is an object
  • are you 100% sure your array has no other enumerable properties, ever?
  • performance – this is by far the slowest loop
  • it is quite verbose

The for-of-loop

The for-of-loop is a bit “newer” and may not work in old browsers or JavaScript engines. That can be a reason to avoid it, but even more a reason why you do not see it in code you read.

I would argue this is the most practical, clean and simple loop, that should be used in most cases.

It is slightly slower than the good old for-loop, but faster than the other alternatives.

Array.forEach

I have been ranting about functional style code elsewhere. forEach is kind of an antipattern, because it is a functional construction that does nothing without a side-effect. A functional way to do something non-functional.

The callback function gets not ONE argument (as shown above), but actually 4 arguments. If you pass some standard function into forEach that can give you very strange results if the standard function happens to accept more than one argument and you did not know or think about it.

You get both index and array, so you can do horrible things like:

  array.forEach((current,i,array) => {
    const last = array[i-1];
    ..
  });

I have seen worse. Don’t do it. Functional programming is about being clear about your intentions. Use a good old for loop, or write your own higher-order-loop-function if you do the above thing often.

According to the documentation forEach loops in order. JavaScript is singlethreaded. But other languages may parallellize things like forEach, so I think the right way to think about forEach is that order should not matter. Best use for forEach are things like:

  ['gif','jpg','png'].forEach(registerImageFormat);
  players.forEach(updatePosition);

forEach is slower than the good old for-loop and for-of.

Sparse Arrays

I made an experment with a sparse (and worse) array:

  const array = ['first'];
  array[2] = 'last';
  array.x = 'off-side';
 
  let r = 'for';
  for ( let i=0 ; i<array.length ; i++ ) {
    r += ':' + array[i];
  }
  console.log(r);
 
  r = 'for-in';
  for ( const i in array ) {
    r += ':' + array[i];
  }
  console.log(r);

  r = 'for-of';
  for ( const x of array ) {
    r += ':' + x;
  }
  console.log(r);
 
  r = 'forEach';
  array.forEach((x) => {
    r += ':' + x;
  });
  console.log(r);

The output of this program is:

  for:first:undefined:last
  for-in:first:last:off-side
  for-of:first:undefined:last
  forEach:first:last

If this surprises you, think about how you code and what loops you use.

Performance

For a rather simple loop body here are some benchmarks

~160 M loopsMacBook Air 2014
node 14.16.0
RPI v2 900MHz
node 14.15.3
i7-8809G
node 12.18.3
i7-8809G
node 14.16.0
for ( i=0 ; i<array.length ; i++ )280ms3300ms200ms180ms
for ( i of array )440ms6500ms470ms340ms
for ( i in array )6100ms74000ms5900ms4100ms
array.forEach560ms10400ms480ms470ms

On one hand, a few milliseconds for a millions loops may not mean anything. On the other hand that could be a few milliseconds more latency or UI refresh delay.

Optimizing Objects in JavaScript

I have web applications with JavaScript business objects coming from constructors, like:

function Animal() {
  this.type      = null;   // 'Cat'
  this.color     = null;   // 'Black'
  this.legs      = 0;      // 4
  this.carnivore = false   // true
}

These objects may be created on the web, but quite quickly this happens:

  1. Client
    1. creates an Object
    2. serializes Object with JSON.stringify()
  2. The JSON text is sent over the network from client to the server
  3. Server
    1. uses JSON.parse() to get the object back
    2. validates the object
    3. stores it (perhaps by doing JSON.stringify())
  4. Client (perhaps the same) asks the server which sends it over the network
  5. Client
    1. uses JSON.parse to get the object back
    2. keeps the object in memory for application logic use

This works fine! If you do not add prototypes/functions to your constructors (like Animal.prototype.eat()) the client can use the object it got from JSON.

Since I send millions of such objects in this way over the network every day, I can’t help asking myself if what I do is reasonably efficient, or rather wasteful?

One idea I have had is that for this purpose I could turn Objects into Arrays (when shipping over the network, or perhaps storing them), like:

// As Object                          // As Array
{                                     [
  type      : 'Cat',                    'Cat',
  color     : 'Black',                  'Black',
  legs      : 4,                         4,
  carnivore : true                       true
}                                     ]

Another idea has been to create an object with “new Animal()” rather than using the raw Object I get from JSON.parse().

Possible benefits could be

  1. Arrays are smaller to store on disk
  2. Arrays are smaller to send over the network
  3. A real Animal Object may be stored more efficiently in client RAM than the raw Object
  4. A real Animal Object may be faster to operate on, in the client, than the raw Object

So rather than just sending, receiving and processing raw JSON, I could be sending and receiving Arrays, and create objects using their constructors.

Test Results

I implemented some Node.js code to test my ideas. I was using objects like:

// As Object                                 // As Array
{                                            [
  "theWorst":0.1560387568813406,               0.1560387568813406,
  "lowerQuartile":0.2984895507275531,          0.2984895507275531,
  "median":0.47865973555734964,                0.47865973555734964,
  "higherQuartile":0.7832137265963346,         0.7832137265963346,
  "theBest":0.8893834668143412                 0.8893834668143412
}                                            ]

When there is a memory range (below), the low value is after the GC has run, and the high value is the peak value. JSON means an object received from JSON.parse. Object means an Object created with a constructor.

Intel i7-8809GRAM/DiskCPU
1M Arrays94MB
-> gzip43MB7.7s
-> gunzip1.1s
1M Objects154MB
-> gzip48MB5.6s
-> gunzip1.3s
Receive & Convert data
Arrays->Arrays100-240MB0ms
Arrays->Objects76-240MB334ms
JSONs->JSONs123-310MB0ms
JSONs->Objects76-382MB280ms
Access & Use data
Arrays21ms
JSONs25ms
Objects9-11ms

Well, I find that:

  1. It is surprising that GZIP is more expensive on the smaller array than the larger object file.
  2. Costs (CPU) to compress/decompress much higher (~10x) than the cost of “packing/unpacking” JSON-data in JavaScript code.
  3. If we are using gzip for network traffic the benefit of sending the more compact arrays rather than the more wordy objects, is questionanable (higher CPU cost, 10% smaller end result).
  4. Arrays like this require basically the same amount of RAM in Node.js as disk space.
  5. Objects like this require less RAM in Node.js than the corresponding JSON file.
  6. Both when it comes to RAM usage and performance on the client side, Arrays are better than raw JSON objects, but worse than real objects from a Constructor.
  7. Unless an object is used many times on the client (10+) it is not worth it from a strict performance perspective to make it with its constructor, instead of the raw JSON.

When it comes to the different strategies I thus find:

IO/Stored formatJavaScript formatConclusion
ArrayArrayanimal[1] or animal[COLOR] (where COLOR is a global constant) is generally not acceptable compared to animal.color. And it is not justified from performance perspective either.
ArrayJSONThis would not happen
ArrayObjectGiven the extra cost of gzip, and the significant complexity of serializing/deserializing, this is hardly a good general strategy. It requires the least disk space, the least network traffic, and the least RAM on the client though.
JSONArrayThis would not happen
JSONJSONThis is the most simple, readable way of doing things, at a surprisingly low overhead. You can not use prototype though.
JSONObjectIf you find a simple and reliable way to create objects with a construtor and populate it from a JSON object, this method is probably best for performance and efficiency.

Conclusion

JSON is very simple and using it thoughout your full stack is very productive. Unless you really need to optimize things, write your code for pure raw JSON objects.

Great JavaScript Stuff 2020

I never write about great new JavaScript features on this blog. That is because there is really nothing you can not do without great new features, and I prefer writing compatible code (that runs in Internet Explorer).

However some code is backend-only, and if Node.js supports a feature I can use it.

So, here are two great new things in JavaScript that I will start using in Node.js.

Optional Chaining

It is very common in JavaScript with code like:

price = inventory.articles.apple.price;

If inventory, articles or apple is null or undefined this will throw an error. One common way to do this is:

price = inventory &&
        inventory.articles &&
        inventory.articles.apple &&
        inventory.articles.apple.price;

That is obviously not optimal. I myself have implemented a little function, so I do:

price = safeget(inventory,'articles','apple','price');

The elegant 2020 solution is:

price = inventory?.articles?.apple?.price;

Who wouldn’t want to do that? Well, you need to be aware that it will “fail silently” if something is missing, so you should probably not use it when you dont expect anything to be null or undefined, and not without handling the “error” properly. If you replace . with ?. everywhere in your code, you will just detect your errors later when your code gives the wrong result rather than crashing.

Nullish Coalescing

It is very common to write code like:

connection.port = port || 80;
connection.host = server || 'www.bing.com';
array = new Array(size || default_size);

The problem is that sometimes a “falsy value” (0, false, empty string) is a valid value, but it will be replaced by the default value. The port above can not be set to 0, and the host can not be set to ”;

I often do things like:

connection.port = 'number' === typeof port ? port : 80;
connection.host = null == server ? 'www.bing.com' : server;

However, there is now a better way in JavaScript:

connection.port = port ?? 80;
connection.host = server ?? 'www.bing.com';
array = new Array(size ?? default_size);

This will only fall back to the default value if port/server/size is null/undefined. Much of the time, this is what you want.

However, you still may need to do proper validation, so if you used to do:

connection.port = validatePort(port) ? port : 80;

you shall probably keep doing it.

Conclusion

If your target environment supports Optional chaining and Nullish coalescing, take advantage of it. Node.js 14 supports it.

Functional Programming is Slow – revisited

I have written before about Functional Program with a rather negative stand point (it sucks, it is slow). Those posts have some readers, but they are a few years old, and I wanted to do some new benchmarks.

Please note:

  • This is written with JavaScript (and Node.js) in mind. I don’t know if these findings apply to other programming languages.
  • Performance is important, but it is far from everything. Apart from performance, there are both good and bad aspects of Functional Programming.

Basic Chaining

One of the most common ways to use functional programming (style) in JavaScript is chaining. It can look like this:

v = a.map(map_func).filter(filter_func).reduce(reduce_func)

In this case a is an array, and three functions are sequentially called to each element (except reduce is not called on those that filter gets rid of). The return value of reduce (typically a single value) is stored in v.

  • What is the cost of this?
  • What are the alternatives?

I decided to calculate the value of pi by

  1. evenly distribute points in the[0,0][1,1] rectangle.
  2. for each point calculate the (squared) distance to origo (a simple map)
  3. get rid of each point beyond distance 1.0 (a simple filter)
  4. count the number of remaing points (a simple reduce – although in this simple case it would be enough to check the length of the array)

The map, filter and reduce functions looks like:

const pi_map_f = (xy) => {
  return xy.x * xy.x + xy.y * xy.y;
};
const pi_filter_f = (xxyy) => {
  return xxyy <= 1.0;
};
const pi_reduce_f = (acc /* ,xxyy */) => {
  return 1 + acc;
};

In chained functional code this looks like:

const pi_higherorder = (pts) => {
  return 4.0
       * pts.map(pi_map_f)
            .filter(pi_filter_f)
            .reduce(pi_reduce_f,0)
       / pts.length;
};

I could use the same three functions in a regular loop:

const pi_funcs = (pts) => {
  let i,v;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    v = pi_map_f(pts[i]);
    if ( pi_filter_f(v) ) inside = pi_reduce_f(inside,v);
  }
  return 4.0 * inside / pts.length;
};

I could also write everything in a single loop and function:

const pi_iterate = (pts) => {
  let i,p;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    p = pts[i];
    if ( p.x * p.x + p.y*p.y <= 1.0 ) inside++;
  }
  return 4.0 * inside / pts.length;
};

What about performance? Here are some results from a Celeron J1900 CPU and Node.js 14.15.0:

Iterate (ms)Funcs (ms)Higher Order (ms)Pi
10k88103.1428
40k34193.1419
160k33473.141575
360k661963.141711
640k11114043.141625
1000k17175593.141676
1440k252511603.14160278

There are some obvious observations to make:

  • Adding more points does not necessarily give a better result (160k seems to be best, so far)
  • All these are run in a single program, waiting 250ms between each test (to let GC and optimizer run). Obvously it took until after 10k for the Node.js optimizer to get things quite optimal (40k is faster than 10k).
  • The cost of writing and calling named functions is zero. Iterate and Funcs are practially identical.
  • The cost of chaining (making arrays to use once only) is significant.

Obviously, if this has any practical significance depends on how large arrays you are looping over, and how often you do it. But lets assume 100k is a practical size for your program (that is for example 100 events per day for three years). We are then talking about wasting 20-30ms every time we do a common map-filter-reduce-style loop. Is that much?

  • If it happens server side or client side, in a way that it affects user latency or UI refresh time, it is significant (especially since this loop is perhaps not the only thing you do)
  • If it happens server side, and often, this chaining choice will start eating up significant part of your server side CPU time

You may have a faster CPU or a smaller problem. But the key point here is that you choose to waste significant amount of CPU cycles because you choose to write pi_higherorder rather than pi_funcs.

Different Node Versions

Here is the same thing, executed with different versions of node.

1000kIterate (ms)Funcs (ms)Higher Order (ms)
8.17.01111635
10.23.01111612
12.19.01111805
14.15.01819583
15.1.01719556

A few findings and comments on this:

  • Different node version show rather different performance
  • Although these results are stable on my machine, what you see here may not be valid for a different CPU or a different problem size (for 1440k points, node version 8 is the fastest).
  • I have noted before, that functional code gets faster, iterative code slower, with newer versions of node.

Conclusion

My conclusions are quite consistent with what I have found before.

  • Writing small, NAMED, testable, reusable, pure functions is good programming, and good functional programming. As you can see above, the overhead of using a function in Node.js is practially zero.
  • Chaining – or other functional programming practices, that are heavy on the memory/garbage collection – is expensive
  • Higher order functions (map, filter, reduce, and so on) are great when
    1. you have a named, testable, reusable function
    2. you actually need the result, just not for using once and throwing away
  • Anonymous functions fed directly into higher order functions have no advantages whatsoever (read here)
  • The code using higher order functions is often harder to
    1. debug, becuase you can’t just put debug outputs in the middle of it
    2. refactor, because you cant just insert code in the middle
    3. use for more complex agorithms, becuase you are stuck with the primitive higher order functions, and sometimes they don’t easily allow you to do what you need

Feature Wish

JavaScript is hardly an optimal programming language for functional programming. One thing I miss is truly pure functions (functions with no side effects – especially no mutation of input data).

I have often seen people change input data in a map.

I believe (not being a JavaScript engine expert) that if Node.js knew that the functions passed to map, filter and reduce above were truly pure, it would allow for crazy optimizations, and the Higher Order scenario could be made as fast as the other ones. However, as it is now, Node.js can not get rid of the temporary arrays (created by map and filter), because of possible side effects (not present in my code).

I tried to write what Node.js could make of the code, if it knew it was pure:

const pi_allinone_f = (acc,xy) => {
  return acc + ( ( xy.x * xy.x + xy.y * xy.y <= 1.0 ) ? 1 : 0);
};

const pi_allinone = (pts) => {
  return 4.0
       * pts.reduce(pi_allinone_f,0)
       / pts.length;
};

However, this code is still 4-5 times slower than the regular loop.

All the code

Here is all the code, if you want to run it yourself.

const points = (n) => {
  const ret = [];
  const start = 0.5 / n;
  const step = 1.0 / n;
  let x, y;
  for ( x=start ; x<1.0 ; x+=step ) {
    for ( y=start ; y<1.0 ; y+=step ) {
      ret.push({ x:x, y:y });
    }
  }
  return ret;
};

const pi_map_f = (xy) => {
  return xy.x * xy.x + xy.y * xy.y;
};
const pi_filter_f = (xxyy) => {
  return xxyy <= 1.0;
};
const pi_reduce_f = (acc /* ,xxyy */) => {
  return 1 + acc;
};
const pi_allinone_f = (acc,xy) => {
  return acc + ( ( xy.x * xy.x + xy.y * xy.y <= 1.0 ) ? 1 : 0);
};

const pi_iterate = (pts) => {
  let i,p;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    p = pts[i];
    if ( p.x * p.x + p.y*p.y <= 1.0 ) inside++;
  }
  return 4.0 * inside / pts.length;
};

const pi_funcs = (pts) => {
  let i,v;
  let inside = 0;
  for ( i=0 ; i<pts.length ; i++ ) {
    v = pi_map_f(pts[i]);
    if ( pi_filter_f(v) ) inside = pi_reduce_f(inside,v);
  }
  return 4.0 * inside / pts.length;
};

const pi_allinone = (pts) => {
  return 4.0
       * pts.reduce(pi_allinone_f,0)
       / pts.length;
};

const pi_higherorder = (pts) => {
  return 4.0
       * pts.map(pi_map_f).filter(pi_filter_f).reduce(pi_reduce_f,0)
       / pts.length;
};

const pad = (s) => {
  let r = '' + s;
  while ( r.length < 14 ) r = ' ' + r;
  return r;
}

const funcs = {
  higherorder : pi_higherorder,
  allinone : pi_allinone,
  functions : pi_funcs,
  iterate : pi_iterate
};

const test = (pts,func) => {
  const start = Date.now();
  const pi = funcsfunc;
  const ms = Date.now() - start;
  console.log(pad(func) + pad(pts.length) + pad(ms) + 'ms ' + pi);
};

const test_r = (pts,fs,done) => {
  if ( 0 === fs.length ) return done();
  setTimeout(() => 
    test(pts,fs.shift());
    test_r(pts,fs,done);
  }, 1000);
};

const tests = (ns,done) => {
  if ( 0 === ns.length ) return done();
  const fs = Object.keys(funcs);
  const pts = points(ns.shift());
  test_r(pts,fs,() => {
    tests(ns,done);
  });
};

const main = (args) => {
  tests(args,() => {
    console.log('done');
  });
};

main([10,100,200,400,600,800,1000,1200]);

Testing Business Central with Docker

There are many articles and sources on the internet about Business Central in Docker. Most of them are very specific about some detail. With this post I hope to share some ideas about why we run BC in docker, and the challenges from a top-down perspective.

When you set up any non-trivial system, automated testing is helpful. It can go something like:

  1. Write new tests
  2. Update system (code or configuration)
  3. Start system (initiate from scratch)
  4. Run tests
  5. Stop system (discard everything)
  6. Repeat

The key here is repeatability. So you want to know that the system starts to identical state every time, so the tests work every time and you know exactly what you are testing.

This used to be very hard with complex systems like Business Central (NAV). It is still not very easy, but with Business Central being available as Docker images, automated tests are viable.

Assets

I think it is important to understand exactly what defines the running system. In my Business Central tests, those essential assets are:

  • A docker image (mcr.microsoft.com/dynamicsnav:10.0.19041.508-generic)
  • An artifact (https://bcartifacts.azureedge.net/sandbox/16.5.15897.16650/se)
  • Parameters for docker run (to create container from image)
  • A Business Central license file
  • A custom script (AdditionalSetup.ps1)
  • Several Business Central Extensions
  • A rapid start configuration package

Other non-BC assets could be

  • Version of Windows and Docker
  • Code for automating 3-5 (start-test-stop) above
  • Test code

Sharing those assets with my colleagues, we shall be able to set up identical Business Central systems and run the same tests with the same result. Any upgrade of an asset may break something or everything and that can be reproduced. Also the fix can be reproduced.

Business Central in Docker

Business Central is a rather large and complex beast to run in Docker. It is not just start and stop. And you will run into complications. Primary resources are:

  • Freddys blog (you will end up there when using Google anyway)
  • NAV Container Helper (a set of PS-scripts, even reading the source code has helped me)
  • Official Documentation: APIs, Automation APIs, Powershell Tools

This is still far from easy. You need to design how you automate everything. My entire start-to-stop-cycle looks something like:

  1. I download image
  2. I run image (with parameters) to create container
  3. I start container (happens automatically after 2)
  4. Artifact is being downloaded (unless cached from before)
  5. Initial container setup is being done (user+pass created)
  6. Business Central is starting up
  7. AdditionalSetup.ps1 is run (my opportunity to run custom PS code in container)
  8. I install extensions
  9. I add (Damage Inc) and delete (CRONUS) companies
  10. I install rapid start package
  11. I run read-only-tests
  12. I run read-write-tests
  13. I stop container
  14. I remove container

There are a few things to note.

  • 1 and 3 only if not already downloaded
  • 4,5,6,7 is happening automatically inside BC docker, all I can do is observe the result (like keep user+pass)
  • It is possible to run 3-13 (when using same image and artifact, and as long as container works and gives expected results) only
  • It is possible to run 8-12 (on already running container)
  • It is possible to run 11 only (on already running container)
  • 8/9 should probably switch order in the future

Tooling

In order to automate, and automate tests, you need some tool. It can be just a scripting language or something more complicated. You need to pick tools for:

  • Starting, testing, stopping the whole thing
  • Step 8-10 can be done using Powershell (invoked step 7) or using Microsoft Automation API (so you need a tool to make http-requests)
  • Step 11-12 is about testing the BC APIs, using http-requests, so you need a tool that can handle that

I already have other systems that are being tested in a similar way, so for me Business Central is just a part of a bigger integration test process. I am using Node.js and Mocha since before, so I use it for most everything above. However, some things need to be done in Powershell (AdditionalSetup.ps1) as well, more on that later.

System Requirements

You need a reasonably good Windows 10 computer. 16GB or RAM is acceptable so far, but if you have other heavy things running or perhaps later when you get more data in BC you will find that 16GB is too little. I am doing quite fine with my 8th gen i7 CPU.

The number 19041.508 in the docker image name corresponds to my Windows version. You may not find images for some older version of Windows 10.

You are probably fine with a recent Windows Server. I have not tried.

Bascially, Windows docker images can only run on Windows computers, so Linux and Mac OS will not just work (there may be ways with virtualization, Wine or something, I dont know).

Performance

Ideally, when you do automated testing you want to be able to iterate fast. I have found that two steps take particularly long time (~10 min each).

  1. Downloading image and artifact
  2. Importing the Rapid Start Configuration Package (your company data)

Fortunately, #1 is only done the first time (or when you upgrade version).

Unfortunately, #2 is something I would like to do every time (so my tests can update data, but always run on the same data set).

Given the unfortunate #2 it does not make so much sense to put much effort into reusing the container (docker start container, instead of docker run image). I think eventually I will attempt to write read-write-tests that clean up after themselves, or perhaps divide the rapid package into several packages so I can only a last small one every test run. This is not optimal, but it is about optimization.

Nav Container Helper

Freddy (and friends) have written Nav Container Helper. You should probably use it. Since I am a bit backwards, Nav Container Helper is not part of my automated test run. But I use it to learn.

I can invoke Nav Container Helper with version and country arguments to learn what image and artifact to use.

Unfortunately documentation of BC in docker itself is quite thin. I have needed to read the source of Nav Container Helper, and run Nav Container Helper, to understand what options are available when creating a container.

Nav Container Helper annoys me. It kind of prefers to be installed and run as administrator. It can update the hosts-file when it creates a container, but that is optional. However, when removing a container it is not optional to check the hosts file, so I need to remove containers as administrator. I am also not very used to PowerShell, admittedly.

Nav Container Helper will eventually be replaced by the newer BC Container Helper.

Image and Artifact

The images are managed by docker. The artifacts are downloaded the first time you need them and stored in c:\bcartifacts.cache. You can change that folder to anything you like (see below). The image is capable of downloading the artifacts itself (to the cache folder you assign), so you don’t need NavContainerHelper for this.

To find the best generic image for you computer:

Get-BestGenericImageName

To find artifact URLs for BC, run in powershell (you need to install Nav-ContainerHelper first):

Get-BCArtifactUrl -version 16.4

Docker option and environment

When you run a docker image, which creates and starts a container, you can give options and parameters. When you later start an already existing container, it will use the same options as when created.

Since I don’t use NavContainerHelper to run image, here are options (arguments to docker run) I have found useful.

  -e accept_eula=Y
  -e accept_outdated=Y
  -e usessl=N
  -e enableApiServices=Y
  -e multitenant=Y
  -m 6G
  -e artifactUrl=https://bcartifacts.azureedge.net/sandbox/16.5.15897.16650/se
  -e licenseFile=c:\run\my\license.flf
  --volume myData:c:\Run\my
  --volume cacheData:c:\dl
  -p 8882:80
  -p 7048:7048

I will not get into too many details but:

  • You just need to accept EULA
  • The image may be old (whatever that is), use it anyway
  • I don’t care about SSL when testing things locally
  • You need to enable API to use it (port 7048)
  • Since 16.4 multitentant is required to be able to create or remove companies (you usually need to add ?tenant=default to all URLs)
  • 4GB is kind of recommended, I use 6GB now when importing rapid start packages of significant size
  • For doing anything real you most likely will need a valid license file. The path given is in the container (not on your host)
  • I have a folder (replace myData with your absolute path) on the host computer with a license file, my AdditionalSetup.ps1, and possibly more data. –volume makes that folder available (rw) as c:\run\my inside the docker container.
  • I have a folder (replace cacheData with your absolute path) where artifacts are downloaded. This way they are saved for next container.
  • Business Central UI listens to http:80. I expose that on my host on 8882.
  • Business Central API services are available on http:7048. I expose that on my host on 7048.

NavContainerHelper will do some of these things automatically and allow you to control other things with parameters. You can run docker inspect on a container to see how it was actually created.

Username & Password

First time you run a container (that is, when you create it using docker run) it will output username, password and other connection information to stdout. You may want to collect and save this information so you can connect to BC. There are ways to set a password too – I am fine with a generated one.

AdditionalSetup.ps1

If there is a file c:\run\my\AdditionalSetup.ps1 in the container, it will be run (last). You can do nothing, or a lot with this. It turned out that installing extensions via the API requires something to be installed first. So right now I have this in my AdditionalSetup.ps1:

Write-Host 'Starting AdditionalSetup.ps1'
if ( -not (Get-Module -ListAvailable -Name ALOps.ExternalDeployer) ) {
  Write-Host 'Starting ALOps installation'
  Write-Host 'ALOps 1/5: Set Provider'
  Install-PackageProvider -Name NuGet -Force
  Write-Host 'ALOps 2/5: Install from Internet'
  install-module ALOps.ExternalDeployer -Force
  Write-Host 'ALOps 3/5 import'
  import-module -Name ALOps.ExternalDeployer
  Write-Host 'ALOps 4/5 install'
  Install-ALOpsExternalDeployer
  Write-Host 'ALOps 5/5 create deployer'
  New-ALOpsExternalDeployer -ServerInstance BC
  Write-Host 'ALOps Complete'
} else {
  Write-Host 'ALOps Already installed'
}

This is horrible, because it downloads something from the internet every time I create a new container, and it occationally fails. I tried to download this module in advance and just install/import it. That did not work (there is something about this NuGet-provider that requires extra magic offline). The Microsoft ecosystem is still painfully imature.

To try things out in the container, you can get a powershell shell inside the container:

docker exec -ti <containername> powershell

Install Extensions

A usually install extensions with the cmdlets:

  1. Publish-NAVApp
  2. Sync-NAVApp
  3. Install-NAVApp

in AdditionalSetup.ps1 (before setting up companies – that seems to not matter so much). You need to “import” those cmdlets before using them:

import-module 'c:\Program Files\Microsoft Dynamics NAV\160\Service\Microsoft.Dynamics.Nav.Apps.Management.psd1'

I can also use the automation API, if I first install ALOps.ExternalDeployer as above (but that is a download, which I don’t like)

Set Up Companies

Depending on your artifact you may get different companies from the beginning. It seems you always get “My Company”. And then there is a localized CRONUS company (except for the w1 artifact), that can be named “CRONUS USA” or “CRONUS International Inc” or something.

I work for Damage Inc, so that is the only company I want. However, it seems not to be possible to delete the last company. This is what I have automated:

  1. If “Company Zero” does not exist, create it
  2. Delete all companies, except “Company Zero”
  3. Create “Damage Inc”
  4. Delete “Company Zero” (optional – if it distrubs you)

This works the first time (regardless of CRONUS presence), when creating the container. This also works if I run it over and over again (for example when restarting an already created container, or just running some tests on an already started container): I get the same result, just a new “Damage Inc” every time, just as the first time.

Install Rapid Start Package

I install a rapid start package using the automation API. It should be possible to do it from AdditionalSetup.ps1 as well. This takes long time. I see some advantage using the API because I can monitor and control the status/progress in my integration-test-scripts (I could output things from AdditionalSetup.ps1 and monitor that, too).

Rapidstart packages are tricky – by far the most difficult step of all:

  1. Exporting a correct Rapidstart package is not trivial
  2. Importing takes long time
  3. The GUI inside business central (Rapidstart packates are called Configuration Packages) gives more control than the API – and you can see the detailed errors in the GUI (only).
  4. I have found that I get errors when importing using the API, but not in the GUI. In fact, just logging in to the GUI, doing nothing, and logging out again, before using the API makes the API import successful. Perhaps there are triggers being run when the GUI is activated, setting up data?

Run tests!

Finally, you can run your tests and profit!

Use the UI

You can log in to BC with the username and password that you collected above. I am not telling you to do manual testing, but the opportunities are endless.

Stopping

When done, you can stop the container. If run was invoked with “–rm” the container will be automatically removed.

Depending on your architecture and strategy, you may be able to (re)use this container for later use.

Webhooks & Subscriptions

Business Central has a feature called webhooks (in the api it is subscriptions). It is a feature that makes BC call you (your service) when something has updated, so you don’t need to poll regularly.

This is good, but beware, it is a bit tricky.

First M$ has decided BC will only call an HTTPS service. When I run everything on localhost and BC in a container, i am fine with HTTP, actually. Worse, even if I run HTTPS, BC is not accepting my self signed certificate. This sucks! Perhaps there is a way to allow BC to call an HTTP service. I couldn’t find out so now I let my BC Container call a proxy on the internet. That is crap.

Also, note that the webhooks trigger after about 30s. That is probably fine, for production. For automated testing it sucks. Perhaps there is a way to speed this up on a local docker container, please let me know.

Finally, the documentation for deleting a webhook is wrong. In short, what you need to (also) do is:

  1. add ‘ around the id, as in v1.0/subscriptions(‘asdfasdfasdfasdfasf’)
  2. set header if-match to * (or something more sofisticated)

I found it in this article.

Docker Issues

Occationally, not so rarely unfortunately, when I try to start a container I get something like:

Error response from daemon: hcsshim::CreateComputeSystem 3239b7231b2e3d1
b5aa46aa484e526e454fdd8ca230b324a34cfa91f5625583b: The requested resource is in
use.

It is hard to predict and it is hard to solve. Sometimes a restart of the computer works. Sometimes reinstalling Docker (or making som reset) works. Mostly the problem can not be solved for the moment.

Sometimes changing isolation from hyperv to process helps.

It seems the problem is that the current Windows version you are on (every cumulative update matters) is not working with the BC Image. But it is not like there is always a more current BC image that works. And on rare occations one BC image works and the other does not.

When this happens, I simply accept Business Central in Docker does not work on this computer, and then I try another day when I have applied a new cumulative update. So if you really NEED Business Central in Docker to be working, you need two working computers, and update one at a time. If one breaks do not update the other.

To be completely clear, this is the kind of bullshit that makes Microsoft technology not mature and stable enough for real applications. If what you are doing is important – don’t run it using Microsoft technology.

Conclusion

This – running NAV/BC inside Docker and automate test cases – is somewhat new technology. There have been recent changes and sources of confusion:

  • NAV rebranding to Business Central
  • Replacing images with Artifacts
  • Multitenant needed from 16.4
  • The onprem vs SAS thing

To do this right requires effort and patience. But to me, not doing this at all (or doing it wrong) is not an option.

PHP validation of UTF-8 input

Last weeks I have done some PHP programming (my web hotel where I run wordpress supports PHP, and it is trickier to run Node.js on a simple web hotel). I like to do input validation:

function err($status,$msg) {
  http_response_code($status);
  echo $msg;
}

if ( 1 !== preg_match('/^[a-z_]+$/',$_REQUEST['configval']) ) {
  return err(400,'invalid param value: configval=' . $_REQUEST['configval']);
}

Well, that is good until I wanted a name of something (like Düsseldorf, that becomes D%C3%BCsseldorf when sent from the browser to PHP). It turned out such international characters encoded as Unicode/UTF-8 can not be matched/tested in a nice way with PHP regular expressions.

PHP does not support UTF-8. So ü in this case becomes two characters, neither of them matches [A-Za-z] or [[:alpha:]]. However, PHP can process it as text, use it in array keys, and output valid JSON without corrupting it, so not all is lost. Just validation is hard.

I needed to come up with something good enough for my purposes.

  • I can consider ALL such unicode characters (first byte 128+) valid (even though there may be strange characters, like extra long spaces and stuff, I don’t expect them to cause me problems if anyone bothers to enter them)
  • I don’t need to consider case of Ü/ü and Å/å
  • I don’t need full regexp support
  • It is nice to be able to check length correctly, and international characters like ü and å counts as two bytes in PHP.
  • I don’t need to match specific characters in the ranges A-Z, a-z or 0-9, but when it comes to special characters: .,:,#”!@$, I want to be able to include them explictly

So I wrote a simple (well) validation function in PHP that accepts arguments for

  • minimum length
  • maximum length
  • valid characters for first position (optional)
  • valid characters
  • valid characters for last position (optional)

When it comes to valid characters it is simply a string where characters mean:

  • u: any unicode character
  • 0: any digit 0-9
  • A: any capital A-Z
  • a: any a-z
  • anything else matches only itself

So to match all letters, & and space: “Aau &”.

Some full examples:

utf8validate(2,10,’Aau’,’Aau 0′,”,$str)

This would match $str starting with any letter, containing letters, spaces and digits, and with a length of 2-10. It allows $str to end with space. If you dont like that, you can do.

utf8validate(2,10,’Aau’,’Aau -&0′,’Aau0′,$str)

Now the last character can not be a space anymore, but we have also allowed – and & inside $str.

utf8validate_error

The utf8validate function returns true on success and false on failure. Sometimes you want to know why it failed to match. That is when utf8validate_error can be used instead, returning a string on error, and false on success.

Code

I am not an experienced PHP programmer, but here we go.

function utf8validate($minlen, $maxlen, $first, $middle, $last, $lbl) {
  return false === utf8validate_error($minlen, $maxlen,   
                                      $first, $middle, $last, $lbl);
}

function utf8validate_error($minlen, $maxlen, $first, $middle, $last, $lbl) {
  $lbl_array = unpack('C*', $lbl);
  return utf8validate_a(1, 0, $minlen, $maxlen,
                        $first, $middle, $last, $lbl_array);
}

function utf8validate_utfwidth($pos,$lbl) {
  $w = 0;
  $c = $lbl[$pos];
  if ( 240 <= $c ) $w++;
  if ( 224 <= $c ) $w++;
  if ( 192 <= $c ) $w++;
  if ( count($lbl) < $pos + $w ) return -1;
  for ( $i=1 ;$i<=$w ; $i++ ) {
    $c = $lbl[$pos+$i];
    if ( $c < 128 || 191 < $c ) return -2;
  }
  return $w;
}

function utf8validate_a($pos,$len,$minlen,$maxlen,$first,$middle,$last,$lbl) {
  $rem = 1 + count($lbl) - $pos;
  if ( $rem + $len < $minlen )
    return 'Too short';
  if ( $rem < 0 )
    return 'Rem negative - internal error';
  if ( $rem === 0 )
    return false;
  if ( $maxlen <= $len )
    return 'Too long';

  $type = NULL;
  $utfwidth = utf8validate_utfwidth($pos,$lbl);
  if ( $utfwidth < 0 ) {
    return 'UTF-8 error: ' . $utfwidth;
  } else if ( 0 < $utfwidth ) {
    $type = 'u';
  } else {
    $cv = $lbl[$pos];
    if ( 48 <= $cv && $cv <= 57 ) $type = '0';
    else if ( 65 <= $cv && $cv <= 90 ) $type = 'A';
    else if ( 97 <= $cv && $cv <= 122 ) $type = 'a';
    else $type = pack('C',$cv);
  }

// type is u=unicode, 0=number, a=small, A=capital, or another character

  $validstr = NULL;
  if ( 1 === $pos && '' !== $first ) {
    $validstr = $first;
  } else if ( '' === $last || $pos+$utfwidth < count($lbl) ) {
    $validstr = $middle;
  } else {
    $validstr = $last;
  }

  if ( false === strpos($validstr,$type) ) {
    return 'Pos ' . $pos . ' ('
         . ( 'u'===$type ? 'utf8-char' : pack('C',$lbl[$pos]) )
         . ') not found in [' . $validstr . ']';
  }
  return utf8validate_a(1+$pos+$utfwidth,1+$len,$minlen,$maxlen,
                        $first,$middle,$last,$lbl);
}

That is all.

Tests

I wrote some tests as well.

$err = false;
if (false!==($err=utf8validate_error(1,1,'','a','','g')))
  throw new Exception('g failed: ' . $err);
if (false===($err=utf8validate_error(1,1,'','a','','H'))) 
  throw new Exception('H should have failed');
if (false!==($err=utf8validate_error(3,20,'Aau','Aau -','Aau','Edmund')))
  throw new Exception('Edmund failed: ' . $err);
if (false!==($err=utf8validate_error(3,20,'Aau','Aau -','Aau','Kött')))
  throw new Exception('Kött failed: ' . $err);
if (false!==($err=utf8validate_error(3,20,'Aau','Aau -','Aau','Kött-Jan')))
  throw new Exception('Kött-Jan failed: ' . $err);
if (false!==($err=utf8validate_error(3,3,'A','a0','0','X10')))
  throw new Exception('X10 failed: ' . $err);
if (false!==($err=utf8validate_error(3,3,'A','a0','0','Yx1')))
  throw new Exception('Yx1 failed: ' . $err);
if (false===($err=utf8validate_error(3,3,'A','a0','0','a10')))
  throw new Exception('a10 should have failed');
if (false===($err=utf8validate_error(3,3,'A','a0','0','Aaa')))
  throw new Exception('Aaa should have failed');
if (false===($err=utf8validate_error(3,3,'A','a0','0','Ax10')))
  throw new Exception('Ax10 should have failed');
if (false===($err=utf8validate_error(3,3,'A','a0','0','B0')))
  throw new Exception('B0 should have failed');
if (false!==($err=utf8validate_error(3,3,'u','u','u','äää')))
  throw new Exception('äää failed: ' . $err);
if (false===($err=utf8validate_error(3,3,'','u','','abc'))) 
  throw new Exception('abc should have failed');
if (false!==($err=utf8validate_error(2,5,'Aau','u','Aau','XY')))
  throw new Exception('XY failed: ' . $err);
if (false===($err=utf8validate_error(2,5,'Aau','u','Aau','XxY')))
  throw new Exception('XxY should have failed');
if (false!==($err=utf8validate_error(0,5,'','0','',''))) 
  throw new Exception('"" failed: ' . $err);
if (false!==($err=utf8validate_error(0,5,'','0','','123'))) 
  throw new Exception('123 failed: ' . $err);
if (false===($err=utf8validate_error(0,5,'','0','','123456')))
  throw new Exception('123456 should have failed');
if (false===($err=utf8validate_error(2,3,'','0','','1'))) 
  throw new Exception('1 should have failed');
if (false===($err=utf8validate_error(2,3,'','0','','1234'))) 
  throw new Exception('1234 should have failed');

Conclusions

I think input validation should be taken seriously, also in PHP. And I think limiting input to ASCII is not quite enough 2020.

There are obviously ways to work with regular expressions and UTF8 too, but I do not find it pretty.

My code/strategy above should obviously only be used for labels and names where international characters make sense and where the form of the input is relatively free. For other parameters, use a more accurate validation method.

Simple Password Hashing with Node & Argon2

When you build a service backend you should keep your users’ passwords safe. That is not so easy anymore. You should

  1. hash and salt (md5)
  2. but rather use strong hash (sha)
  3. but rather use a very expensive hash (pbkdf2, bcrypt)
  4. but rather use a hash that is very expensive on GPUs and cryptominers (argon2)

Argon2 seems to be the best choice (read elsewhere about it)!

node-argon2

Argon2 is very easy to use on Node.js. You basically just:

$ npm install argon2

Then your code is:

/* To hash a password */
hash = await argon2.hash('password');

/* To test a password */
if ( await argon2.verify(hash,'password') )
  console.log('OK');
else
  console.log('Not OK');

Great! What is not to like about that?

$ du -sh node_modules/*
 20K  node_modules/abbrev
 20K  node_modules/ansi-regex
 20K  node_modules/aproba
 44K  node_modules/are-we-there-yet
348K  node_modules/argon2
 24K  node_modules/balanced-match
 24K  node_modules/brace-expansion
 24K  node_modules/chownr
 20K  node_modules/code-point-at
 40K  node_modules/concat-map
 32K  node_modules/console-control-strings
 44K  node_modules/core-util-is
120K  node_modules/debug
 36K  node_modules/deep-extend
 40K  node_modules/delegates
 44K  node_modules/detect-libc
 28K  node_modules/fs-minipass
 32K  node_modules/fs.realpath
104K  node_modules/gauge
 72K  node_modules/glob
 20K  node_modules/has-unicode
412K  node_modules/iconv-lite
 24K  node_modules/ignore-walk
 20K  node_modules/inflight
 24K  node_modules/inherits
 24K  node_modules/ini
 36K  node_modules/isarray
 20K  node_modules/is-fullwidth-code-point
 48K  node_modules/minimatch
108K  node_modules/minimist
 52K  node_modules/minipass
 32K  node_modules/minizlib
 32K  node_modules/mkdirp
 20K  node_modules/ms
332K  node_modules/needle
956K  node_modules/node-addon-api
240K  node_modules/node-pre-gyp
 48K  node_modules/nopt
 24K  node_modules/npm-bundled
 36K  node_modules/npmlog
172K  node_modules/npm-normalize-package-bin
 28K  node_modules/npm-packlist
 20K  node_modules/number-is-nan
 20K  node_modules/object-assign
 20K  node_modules/once
 20K  node_modules/osenv
 20K  node_modules/os-homedir
 20K  node_modules/os-tmpdir
 20K  node_modules/path-is-absolute
 32K  node_modules/@phc
 20K  node_modules/process-nextick-args
 64K  node_modules/rc
224K  node_modules/readable-stream
 32K  node_modules/rimraf
 48K  node_modules/safe-buffer
 64K  node_modules/safer-buffer
 72K  node_modules/sax
 88K  node_modules/semver
 24K  node_modules/set-blocking
 32K  node_modules/signal-exit
 88K  node_modules/string_decoder
 20K  node_modules/string-width
 20K  node_modules/strip-ansi
 20K  node_modules/strip-json-comments
196K  node_modules/tar
 28K  node_modules/util-deprecate
 20K  node_modules/wide-align
 20K  node_modules/wrappy
 36K  node_modules/yallist

That is 69 node modules of 5.1MB. If you think that is cool for your backend password encyption code (in order to provide two functions: encrypt and verify) you can stop reading here.

I am NOT fine with it, because:

  • it will cause me trouble, one day, when I run npm install, and something is not exactly as I expected, perhaps in production
  • how safe is this? it is the password encryption code we are talking about – what if any of these libraries are compromised?
  • it is outright ugly and wasteful

Well, argon2 has a reference implementation written in C (link). If you download it you can compile, run test and try it like:

$ make
$ make test
$ ./argon2 -h
Usage: ./argon2-linux-x64 [-h] salt [-i|-d|-id] [-t iterations] [-m log2(memory in KiB) | -k memory in KiB] [-p parallelism] [-l hash length] [-e|-r] [-v (10|13)]
Password is read from stdin
Parameters:
 salt The salt to use, at least 8 characters
 -i   Use Argon2i (this is the default)
 -d   Use Argon2d instead of Argon2i
 -id  Use Argon2id instead of Argon2i
 -t N Sets the number of iterations to N (default = 3)
 -m N Sets the memory usage of 2^N KiB (default 12)
 -k N Sets the memory usage of N KiB (default 4096)
 -p N Sets parallelism to N threads (default 1)
 -l N Sets hash output length to N bytes (default 32)
 -e   Output only encoded hash
- r   Output only the raw bytes of the hash
 -v (10|13) Argon2 version (defaults to the most recent version, currently 13)
 -h   Print ./argon2-linux-x64 usage

It builds to a single binary (mine is 280kb on linux-x64). It does most everything you need. How many lines of code do you think you need to write for node.js to use that binary instead of the 69 npm packages? The answer is less than 69. Here comes some notes and all the code (implementing argon2.hash and argon2.verify as used above):

  1. you can make binaries for different platforms and name them accordingly (argon2-linux-x64, argon2-darwin-x64 and so on), so you can move your code (and binaries) between different computers with no hazzle (as JavaScript should be)
  2. if you want to change argon2-parameters you can do it here, and if you want to pass an option-object to the hash function that is an easy fix
  3. options are parsed from hash (just as the node-argon2 package) when verifying, so you don’t need to “remember” what parameters you used when hashing to be able to verify
/* argon2-wrapper.js */

const nodeCrypto = require('crypto');
const nodeOs    = require('os');
const nodeSpawn = require('child_process').spawn;
/* NOTE 1 */
const binary    = './argon2';
// st binary    = './argon2-' + nodeOs.platform() + '-' + nodeOs.arch();

const run = (args,pass,callback) => {
  const proc = nodeSpawn(binary,args);
  let hash = '';
  let err = '';
  let inerr = false;
  proc.stdout.on('data', (data) => { hash += data; });
  proc.stderr.on('data', (data) => { err += data; });
  proc.stdin.on('error', () => { inerr = true; });
  proc.on('exit', (code) => {
    if ( err ) callback(err);
    else if ( inerr ) callback('I/O error');
    else if ( 0 !== code ) callback('Nonzero exit code ' + code);
    else if ( !hash ) callback('No hash');
    else callback(null,hash.trim());
  });
  proc.stdin.end(pass);
};

exports.hash = (pass) => {
  return new Promise((resolve,reject) => {
    nodeCrypto.randomBytes(12,(e,b) => {
      if ( e ) return reject(e);
      const salt = b.toString('base64');
      const args = [salt,'-id','-e'];
/* NOTE 2 */
//    const args = [salt,'-d','-v','13','-m','12','-t','3','-p','1','-e'];
      run(args,pass,(e,h) => {
        if ( e ) reject(e);
        else resolve(h);
      });
    });
  });
};

exports.verify = (hash,pass) => {
  return new Promise((resolve,reject) => {
    const hashfields = hash.split('$');
    const perffields = hashfields[3].split(',');
/* NOTE 3 */
    const args = [
        Buffer.from(hashfields[4],'base64').toString()
      , '-' + hashfields[1].substring(6) // -i, -d, -id
      , '-v', (+hashfields[2].split('=')[1]).toString(16)
      , '-k', perffields[0].split('=')[1]
      , '-t', perffields[1].split('=')[1]
      , '-p', perffields[2].split('=')[1]
      , '-e'
    ];
    run(args,pass,(e,h) => {
      if ( e ) reject(e);
      else resolve(h===hash);
    });
  });
};

And for those of you who want to test it, here is a little test program that you can run. It requires

  • npm install argon2
  • argon2 reference implementation binary
const argon2package = require('argon2');
const argon2wrapper = require('./argon2-wrapper.js');

const bench = async (n,argon2) => {
  const passwords = [];
  const hashes = [];
  const start = Date.now();
  let errors = 0;

  for ( let i=0 ; i<n ; i++ ) {
    let pw = 'password-' + i;
    passwords.push(pw);
    hashes.push(await argon2.hash(pw));
  }
  const half = Date.now();
  console.log('Hashed ' + n + ' passwords in ' + (half-start) + ' ms');

  for ( let i=0 ; i<n ; i++ ) {
    // first try wrong password
    if ( await argon2.verify(hashes[i],'password-ill-typed') ) {
      console.log('ERROR: wrong password was verified as correct');
      errors++;
    }
    if ( !(await argon2.verify(hashes[i],passwords[i]) ) ) {
      console.log('ERROR: correct password failed to verify');
      errors++;
    }
  }
  const end = Date.now();
  console.log('Verified 2x' + n + ' passwords in ' + (end-half) + ' ms');
  console.log('Error count: ' + errors);
  console.log('Hash example:\n' + hashes[0]);
};

const main = async (n) => {
  console.log('Testing with package');
  await bench(n,argon2package);
  console.log('\n\n');

  console.log('Testing with binary wrapper');
  await bench(n,argon2wrapper);
  console.log('\n\n');
}
main(100);

Give it a try!

Performance

I find that in Linux x64, wrapping the binary is slightly faster than using the node-package. That is weird. But perhaps those 69 dependencies don’t come for free after all.

Problems?

I see one problem. The node-argon2 package generates binary hashes random salts and sends to the hash algorithm. Those binary salts come out base64-encoded in the hash. However, a binary value (a byte array using 0-255) is not very easy to pass on the command line to the reference implementation (as first parameter). My wrapper-implementation also generate a random salt, but it base64-encodes it before it passes it to argon2 as salt (and argon2 then base64-encodes it again in the hash string).

So if you already use the node-package the reference c-implementation is not immediately compatible with the hashes you already have produced. The other way around is fine: “my” hashes are easily consumed by the node package.

If this is a real problem for you that you want to solve I see two solutions:

  1. make a minor modification to the C-program so it expects a salt in hex format (it will be twice as long on the command line)
  2. start supplying your own compatible hashes using the option-object now, and don’t switch to the wrapper+c until the passwords have been updated

Conclusion

There are bindings between languages and node-packages for stuff. But unix already comes with an API for programs written in different languages to use: process forking and pipes.

In Linux it is extremely cheap. It is quite easy to use and test, since you easily have access to the command line. And the spawn in node is easy to use.

Argon2 is nice and easy to use! Use it! Forget about bcrypt.

The best thing you can do without any dependencies is pbkdf2 which comes with node.js and is accessible in its crypto module. It is standardized/certified, that is why it is included.