Hi all, I have a TSV file that contains 1.2 million rows. I want to bulk import this file into Solr, where each row becomes a Solr document. The TSV has 24 columns. I am using remote streaming (stream.file) like so:
curl -v 'http://localhost:8983/solr/example/update?stream.file=/opt/solr/results.tsv&separator=%09&escape=%5c&stream.contentType=text/csv;charset=utf-8&commit=true'

The ingestion rate is about 167,000 rows a minute, so the import takes about 7.5 minutes to complete. I have a few questions:

- Is there a way to increase the ingestion rate? I am open to doing something other than a bulk import of a TSV, up to and including writing a small program; I am just not sure what that would look like at a high level (see the sketch after this list).

- If the file is a TSV, I noticed that Solr never closes the HTTP connection with a 200 OK after all the documents are uploaded; the connection seems to be held open indefinitely. If, however, I upload the same file as a CSV, then Solr does close the HTTP connection. Is this a bug?
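For context, here is roughly what I had in mind for the small-program route: a sketch using SolrJ's ConcurrentUpdateSolrClient, which buffers documents and indexes them on background threads so parsing and indexing overlap. It assumes the first row of the TSV is a header naming the fields (as the CSV handler assumes by default), the queue size and thread count are placeholder values I have not tuned, and the naive split does not honor the backslash escaping (escape=%5c) that my curl command declares.

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class TsvLoader {
    public static void main(String[] args) throws Exception {
        String solrUrl = "http://localhost:8983/solr/example";
        String tsvPath = "/opt/solr/results.tsv";

        // ConcurrentUpdateSolrClient queues added documents and streams
        // them to Solr on several background threads.
        try (ConcurrentUpdateSolrClient client =
                 new ConcurrentUpdateSolrClient.Builder(solrUrl)
                     .withQueueSize(10_000) // placeholder, untuned
                     .withThreadCount(4)    // placeholder, untuned
                     .build();
             BufferedReader in = Files.newBufferedReader(Paths.get(tsvPath))) {

            // Assumes row 1 is a header naming the 24 fields.
            String[] fields = in.readLine().split("\t", -1);

            String line;
            while ((line = in.readLine()) != null) {
                // Naive split: ignores backslash-escaped tabs.
                String[] cols = line.split("\t", -1);
                SolrInputDocument doc = new SolrInputDocument();
                for (int i = 0; i < fields.length && i < cols.length; i++) {
                    doc.addField(fields[i], cols[i]);
                }
                client.add(doc); // queued; sent asynchronously
            }
            client.commit(); // single commit at the end
        }
    }
}

Is that the right general shape, or would splitting the file into chunks and POSTing them with parallel curl processes be just as effective?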