I don’t quite know how TolerantUpdateProcessor works with importing CSV
files; see: https://issues.apache.org/jira/browse/SOLR-445. That issue is
about sending batches of docs to Solr, and frankly I don’t know what path
your process will take. It’s worth a try, though.
Otherwise, I typically go with SolrJ
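For reference, a tolerant update chain is declared in solrconfig.xml. A minimal sketch follows; the chain name and the maxErrors value are illustrative assumptions, not something from this thread:

```xml
<!-- Sketch: an update chain that skips bad documents instead of aborting
     the whole batch. "tolerant-chain" and maxErrors=100 are made-up values. -->
<updateRequestProcessorChain name="tolerant-chain">
  <processor class="solr.TolerantUpdateProcessorFactory">
    <int name="maxErrors">100</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

It can then be selected per request by adding update.chain=tolerant-chain to the /update URL.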
Hi Shawn/Erick,
This information has been very helpful. Thank you.
So I did some more investigation into our ETL process and verified that,
with the exception of the text I sent above, they are all obviously invalid
dates. For example, one field value had 00 for the day, so I would guess
that field ha
On 2/2/2020 8:47 AM, Joseph Lorenzini wrote:
<autoSoftCommit>
  <maxTime>1000</maxTime>
  <maxDocs>1</maxDocs>
</autoSoftCommit>
That autoSoftCommit setting is far too aggressive, especially for bulk
indexing. I don't know whether it's causing the specific problem you're
asking about here, but it's still a setting tha
You’re opening new searchers very often, every second at least.
I do not recommend this except under very unusual circumstances.
This shouldn’t be the root of your problem, but it’s not helping
either. But I’d bump that up to 60 seconds or so.
I usually just specify maxTime and not maxDocs, I t
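The suggestion above would look roughly like this in solrconfig.xml. This is a sketch, not the poster's actual config; the 60-second values are the straw-man numbers from the thread, and only maxTime is set, per the advice:

```xml
<!-- Sketch: soft commits every 60 s, hard commits on the same schedule
     without opening a new searcher. Values are illustrative. -->
<autoCommit>
  <maxTime>60000</maxTime>  <!-- ms -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>60000</maxTime>  <!-- ms; 60 s as suggested -->
</autoSoftCommit>
```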
Hi Erick,
Thanks for the help.
For commit settings, you are referring to
https://lucene.apache.org/solr/guide/8_3/updatehandlers-in-solrconfig.html.
If so, yes, I have soft commits on. According to the docs, openSearcher is
on by default. Here are the settings.
60
What are your commit settings? Solr keeps certain in-memory structures
between commits, so it’s important to commit periodically. Say every 60
seconds as a straw-man proposal (and openSearcher should be set to
true or soft commits should be enabled).
When firing a zillion docs at Solr, it’s also b
Hi all,
I have a three-node SolrCloud cluster. The collection has a single shard. I
am importing a 140 GB CSV file into Solr using curl, with a URL that looks
roughly like this. I am streaming the file from disk for performance
reasons.
http://localhost:8983/solr/example/update?separator=%09&stream.f
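The URL above is cut off in the archive. A typical full form of such a streaming request might look like the sketch below; the file path and collection name are hypothetical, while separator, stream.file, stream.contentType, and commit are documented Solr update parameters (separator=%09 is a tab, and stream.file makes Solr read the file from its own local disk):

```shell
# Sketch of a streaming TSV import request; path and collection are made up.
SOLR="http://localhost:8983/solr/example/update"
ARGS="separator=%09&stream.file=/tmp/data.tsv&stream.contentType=text/csv;charset=utf-8&commit=true"
echo "curl \"$SOLR?$ARGS\""
```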