On Thu, Dec 16, 2010 at 3:06 PM, Dennis Gearon <gear...@sbcglobal.net> wrote:
> That easy, huh? Heck, this gets better and better.
>
> BTW, how about escaping?
The CSV escaping? It's configurable to allow for loading different CSV dialects.

http://wiki.apache.org/solr/UpdateCSV

By default it uses double quote encapsulation, like Excel would.
The bottom of the wiki page shows how to configure tab separators and
backslash escaping like MySQL produces by default.

-Yonik
http://www.lucidimagination.com

> Dennis Gearon
>
> Signature Warning
> ----------------
> It is always a good idea to learn from your own mistakes. It is usually a better
> idea to learn from others’ mistakes, so you do not have to make them yourself.
> from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
>
> EARTH has a Right To Life,
> otherwise we all die.
>
> ----- Original Message ----
> From: Adam Estrada <estrada.adam.gro...@gmail.com>
> To: Dennis Gearon <gear...@sbcglobal.net>; solr-user@lucene.apache.org
> Sent: Thu, December 16, 2010 10:58:47 AM
> Subject: Re: bulk commits
>
> This is how I import a lot of data from a CSV file. There are close to 100k
> records in there. Note that you can either pre-define the column names using
> the fieldnames param like I did here *or* include header=true which will
> automatically pick up the column header if your file has it.
>
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,lat,lng,countrycode,population,elevation,gtopo30,timezone,modificationdate,cat&stream.file=C:\tmp\cities1000.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
>
> This seems to load everything into some kind of temporary location before
> it's actually committed. If something goes wrong there is a rollback feature
> that will undo anything that happened before the commit.
>
> As far as batching a bunch of files, I copied and pasted the following into
> Cygwin and it worked just fine.
>
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,lat,lng,countrycode,population,elevation,gtopo30,timezone,modificationdate,cat&stream.file=C:\tmp\cities1000.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xab.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xac.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xad.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xae.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xaf.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xag.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xah.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xai.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xaj.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xak.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xal.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xam.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xan.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xao.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xap.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> curl http://localhost:8983/solr/update -H "Content-Type: text/xml" --data-binary '<optimize/>'
>
> Adam
>
> On Thu, Dec 16, 2010 at 1:44 PM, Dennis Gearon <gear...@sbcglobal.net> wrote:
>
>> Might be CSV or tab delimited text.
>>
>> Sent from Yahoo! Mail on Android
>>
>> ------------------------------
>> * From: * Adam Estrada <estrada.adam.gro...@gmail.com>;
>> * To: * <solr-user@lucene.apache.org>;
>> * Subject: * Re: bulk commits
>> * Sent: * Thu, Dec 16, 2010 6:35:17 PM
>>
>> what is it that you are trying to commit?
>>
>> a
>>
>> On Thu, Dec 16, 2010 at 1:03 PM, Dennis Gearon <gear...@sbcglobal.net> wrote:
>>
>> > What have people found as the best way to do bulk commits either from the
>> > web or from a file on the system?
>> >
>> > Dennis Gearon
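
For anyone reading this thread later: the batch of /update/csv requests quoted above can be generated instead of pasted by hand. What follows is only a minimal Python sketch under stated assumptions, not something posted in the thread. The function name build_csv_update_urls is hypothetical, and asking for a commit only on the last file is an assumption about what a bulk load might want (the quoted commands commit after every file). The separator, escape, fieldnames, stream.file, stream.contentType, overwrite and commit parameters are the ones described on the UpdateCSV wiki page; passing separator as a tab and escape as a backslash corresponds to the MySQL-style dialect mentioned in the reply.

```python
from urllib.parse import urlencode


def build_csv_update_urls(base_url, files, fieldnames, separator=",", escape=None):
    """Return one /update/csv URL per file (hypothetical helper).

    Assumption: only the last URL requests a commit, so the whole
    batch becomes visible at once instead of after every file.
    """
    urls = []
    for index, path in enumerate(files):
        is_last = index == len(files) - 1
        params = {
            "separator": separator,
            "fieldnames": ",".join(fieldnames),
            "stream.file": path,
            "stream.contentType": "text/plain;charset=utf-8",
            "overwrite": "true",
            "commit": "true" if is_last else "false",
        }
        if escape is not None:
            # Backslash escaping, as MySQL dumps use by default.
            params["escape"] = escape
        # urlencode percent-encodes the values (a comma becomes %2C, a tab %09).
        urls.append(base_url + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    # File names and fields are taken from the commands quoted in the thread.
    for url in build_csv_update_urls(
        "http://localhost:8983/solr/update/csv",
        [r"C:\tmp\xab.csv", r"C:\tmp\xac.csv"],
        ["id", "name", "asciiname"],
    ):
        print(url)
```

Each printed URL can be handed to curl exactly as in the quoted commands. If any request fails before the final one, nothing has been committed yet, which is the situation where the rollback mentioned above would apply.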