László Böszörményi (GCS) wrote: > On Mon, May 28, 2018 at 1:33 PM Trent W. Buck <trentb...@gmail.com> wrote: >> Per /usr/share/doc/sqlite3-doc/csv.html & https://sqlite.org/csv.html >> >> sqlite> CREATE VIRTUAL TABLE temp.t1 USING csv(filename='thefile.csv'); >> Error: no such module: csv >> >> ⋮ >> >> If it's easy to do, can you please build the CSV module >> (and any other similar first-party extensions)? > > Again, don't need to load any module. What's wrong with the following? > > $ cat test.csv > a,b > 1,2 > 3,4 > 5,6 > 7,8 > > $ sqlite3 > SQLite version 3.23.1 > sqlite> .mode csv > sqlite> .import test.csv test_table > sqlite> .schema test_table > CREATE TABLE test_table( > "a" TEXT, > "b" TEXT > ); > sqlite> select * from test_table; > 1,2 > 3,4 > 5,6 > 7,8
That reads the test.csv into a native sqlite table. The csv module leaves it as an external table. Reading into a native table (as you suggest) is definitely safer, and probably queries are much faster, so I agree it's USUALLY the right answer. My actual use case is to migrate a mess of bash and CSV to python and sqlite, but I want to do it slowly, which means keeping the old CSV files working. The workload is read-mostly and a small percentage of appends, so using CSV as an external table looks/looked like a good bet. A quick proof-of-concept test using the .import method you suggested shows about 1m to round-trip the database from CSV to sqlite3 (45s) and back to CSV (22s). That's the main thing I'm hoping to avoid. Anyway, I did say "if it's easy to do", and it looks like the upstream configure.ac does *not* make this easy. For some reason upstream provides --enable options for only SOME modules, e.g. json1 and rtree, but not interesting-looking things like csv # read-only CSV virtual tables zip # read-write ZIP virtual tables (each row is one file in the .zip) icu # Unicode-aware LIKE, lower(), upper(), &c lsm1 # key/value storage backend (a la BDB) eval # SQL function eval() percentile # SQL aggregate function for https://en.wikipedia.org/wiki/Percentile regexp # SQL function regexp(re, str), for e.g. SELECT WHERE scrub # Like VACUUM but stop-and-copy(?); optionally zeroes unused blocks, like ext4's zerofree. series # SQL function equivalent of GNU coreutils's seq(1) shathree # SQL function for SHA-3 Keccak spellfix # http://www.sqlite.org/spellfix1.html totype # Coercion a la postgres ::INT and ::FLOAT unionvtab # virtual table merging e.g. students.db and teachers.db tables vfsstat # monitoring for questions like "why is my database so slow?" For someone just using regular Debian /usr/bin/sqlite3 (or libsqlite3 via python3), it's not at all obvious how to actually get any of these features. Following http://sqlite.org/loadext.html, this doesn't error, but does seem to hang: (BUILDROOT:BUSTER)root@zygon:/tmp/sqlite3# gcc -g -fPIC -shared ext/misc/csv.c -o ext/misc/csv.so (BUILDROOT:BUSTER)root@zygon:/tmp/sqlite3# seq 100 >test.csv (BUILDROOT:BUSTER)root@zygon:/tmp/sqlite3# sqlite3 SQLite version 3.23.1 2018-04-10 17:39:29 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> CREATE VIRTUAL TABLE temp.t1 USING csv(filename='test.csv'); Error: no such module: csv sqlite> .load ext/misc/csv.so sqlite> CREATE VIRTUAL TABLE temp.t1 USING csv(filename='test.csv'); sqlite> .schema CREATE VIRTUAL TABLE temp.t1 USING csv(filename='test.csv') /* temp.t1(c0) */; sqlite> select count(*) from temp.t1; [hangs] Probably because RFC-compliant CSV files need CRLF not LF, so… (BUILDROOT:BUSTER)root@zygon:/tmp/sqlite3# unix2dos test.csv unix2dos: converting file test.csv to DOS format... [same behaviour from sqlite3] The sha1 module *DOES* work when tested in this way: (BUILDROOT:BUSTER)root@zygon:/tmp/sqlite3# gcc -g -fPIC -shared ext/misc/sha1.c -o ext/misc/sha1.so (BUILDROOT:BUSTER)root@zygon:/tmp/sqlite3# sqlite3 <<< $'.load ext/misc/sha1.so\n select sha1("hello world");' 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed (BUILDROOT:BUSTER)root@zygon:/tmp/sqlite3# printf 'hello world' | sha1sum 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed - Of course, I don't want to have to carry a csv.so around with me — I want Debian to provide it in a standard place, so that when I change architectures, or there's a security update to the sqlite3 package, I get the newer csv.so automatically. Oh, also, that way I get debhelper's standard hardening flags ☺