Update 2014-05-04: Added -N1/-N2 flags for sorting rows/columns numerically, and uploaded a windows binary.
I found myself having data in CSV-files with three columns; two dimensions and a value. It could look like:
20080102,AAPL,194.84 20090102,AAPL,90.75 20100104,AAPL,214.04 20110103,AAPL,329.57 20120103,AAPL,411.23 20080102,MSFT,35.22 20090102,MSFT,20.33 20100104,MSFT,30.95 20110103,MSFT,27.98 20120103,MSFT,26.765
Data typically looks like this because it is very easy to output transactions on this format. That is very nice if you want to load it into a database. But for other purposes (like plotting a graph using LibreOffice Calc or even Excel) it would be much nicer with a table/matrix-layout:
,AAPL,MSFT 20080102,194.84,35.22 20090102,90.75,20.33 20100104,214.04,30.95 20110103,329.57,27.98 20120103,411.23,26.765
I could not find a standard tool for this. I thought about different options, and finally decided it was quite easy to just write a little program. So I did. You use it like this:
$ ./csv-list2table -t < list.csv > table.csv
There are a few things to think about:
- The switches -t or -T decides if column 1 or 2 will be rows
- -N1 and -N2 can be used to treat/sort column 1 and 2 numerically
- Rows and columns are outputed sorted
- Holes/missing values outputed as ,,
- Comma is the only accepted delimiter
- Input must have exactly 3 columns
- Pre/post-process data with sed and cut
As the last item mentions, sed can fix a file with other delimiters than comma, and cut can pick the columns you need from a list with more data than you need.
Finally, the code written in standard C: csv-list2table-1.1.c.
Old version: csv-list2table-1.0.c
Windows binary: csv-list2table-1.1.exe (should be no tricky dependencies)
Test file: csv-list2table-test.txt
Building should be trivial:
gcc -O2 -o csv-list2table csv-list2table.c
I dont think the code contains anything that should confuse any c-compiler on any reasonable platform.