Hi all,

I have a TSV file that contains 1.2 million rows. I want to bulk import this
file into Solr so that each row becomes a Solr document. The TSV has 24
columns. I am using the streaming API like so:

curl -v 'http://localhost:8983/solr/example/update?stream.file=/opt/solr/results.tsv&separator=%09&escape=%5c&stream.contentType=text/csv;charset=utf-8&commit=true'

The ingestion rate is about 167,000 rows per minute, so the import takes
roughly 7.5 minutes to complete. I have a few questions.

- Is there a way to increase the ingestion rate? I am open to approaches
other than bulk importing a TSV, up to and including writing a small
program; I am just not sure what that would look like at a high level (I
sketched my best guess after this list).
- When the file is a TSV, I noticed that Solr never closes the HTTP
connection with a 200 OK after all the documents are uploaded; the
connection seems to be held open indefinitely. If I upload the same file as
a CSV, however, Solr does close the connection. Is this a bug?
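
For the first question, here is roughly what I imagine a small program
would look like, using SolrJ. This is only an untested sketch: I am
assuming the SolrJ 7.x-style builder API, and the COLUMNS array is a
placeholder for my 24 real column names. As I understand it,
ConcurrentUpdateSolrClient queues documents client-side and indexes them
over several parallel connections, instead of one streaming POST:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class TsvLoader {
    // Placeholder column names; I would replace these with my 24 real headers.
    private static final String[] COLUMNS = {"id", "col2", /* ... */ "col24"};

    public static void main(String[] args) throws Exception {
        try (SolrClient client = new ConcurrentUpdateSolrClient.Builder(
                    "http://localhost:8983/solr/example")
                .withQueueSize(10000)  // documents buffered client-side
                .withThreadCount(4)    // parallel indexing connections
                .build();
             BufferedReader in = Files.newBufferedReader(
                    Paths.get("/opt/solr/results.tsv"))) {
            String line;
            while ((line = in.readLine()) != null) {
                // -1 keeps trailing empty fields from being dropped
                String[] fields = line.split("\t", -1);
                SolrInputDocument doc = new SolrInputDocument();
                for (int i = 0; i < COLUMNS.length; i++) {
                    doc.addField(COLUMNS[i], fields[i]);
                }
                client.add(doc); // queued; flushed in batches by the client
            }
            client.commit(); // single commit at the end, not per batch
        } // close() blocks until the queue drains
    }
}

My thinking is that throughput could then be tuned via the queue size and
thread count (or by splitting the TSV and running several loaders), and
committing once at the end avoids the cost of intermediate commits. Does
that look like the right shape, or is there a better approach?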
