Converting a CSV to SQL using 1 line in bash

 - The command line is very powerful and can do amazing stuff in one single line by pipelining a series of commands. This post is inspired after creating a line that mixed sed and awk, but with just only awk I'll show you an example on how to convert a CSV file to an SQL insert

Let's take an input CSV named events-2013-06-06.csv with 16 columns per line. It looks like this:

k51b04876036e2,192.168.54.67,3a8d6196,2013-06-06,03:29:42,started,,,,no active campaign,Spain,ca-es,,v1.0,1370482182
k51b04876036e2,192.168.54.67,3a8d6196,2013-06-06,03:29:43,first-run,,,,no active campaign,Spain,ca-es,,v1.0,1370482183
k51b04876036e2,192.168.54.67,3a8d6196,2013-06-06,03:30:17,close,34,,,no active campaign,Spain,ca-es,34,v1.0,1370482217
k51b0494a76071,192.168.54.67,febd870c,2013-06-06,03:33:14,started,,,,no active campaign,Spain,ca-es,,v1.0,1370482394
...

I took 16 because is more or less a real-life example. I added some empty columns too to make it more real.

And this is the SQL output we want:

INSERT INTO tracking VALUES ('k51b04876036e2',INET_ATON('192.168.54.67'),'3a8d6196','2013-06-06','03:29:42','started','','','','no active campaign','Spain','ca-es','','v1.0','1370482182','');
INSERT INTO tracking VALUES ('k51b04876036e2',INET_ATON('192.168.54.67'),'3a8d6196','2013-06-06','03:29:43','first-run','','','','no active campaign','Spain','ca-es','','v1.0','1370482183','');
INSERT INTO tracking VALUES ('k51b04876036e2',INET_ATON('192.168.54.67'),'3a8d6196','2013-06-06','03:30:17','close','34','','','no active campaign','Spain','ca-es','34','v1.0','1370482217','');
INSERT INTO tracking VALUES ('k51b0494a76071',INET_ATON('192.168.54.67'),'febd870c','2013-06-06','03:33:14','started','','','','no active campaign','Spain','ca-es','','v1.0','1370482394','');
...

The one-liner that produced the output above was:

cat events-2013-06-06.csv | awk -F',' '{ printf "INSERT INTO tracking VALUES (\x27%s\x27,INET_ATON(\x27%s\x27),\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27);",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16;print ""}'

Don't get scared yet... more generically:

cat your_file.csv | awk -F',' '{ printf "INSERT INTO table VALUES (\x27%s\x27,\x27%s\x27);",$1,$2;print ""}'

The big line decomposed:

As a final note, do notice that I used printf in the command instead of print. The only difference is that using printf you put the %s placeholder in the text, while if you use print you write the number right away. So, if your CSV has a few fields maybe is more visual for you the print command, like this:

cat yourfile | awk -F',' '{ print "INSERT INTO tracking values(" $1 ")" }'

The only precaution here is that the $1 must be out of the quotes.

Let me know if any of these worked for you! Works like a charm!

And if you want to know more, there is a very cheap reference book for awk and sed: