This seems extremely inefficient. Let’s say you have a document without a category. How do you determine what the category should be? Query the index? Reach out to another database and create a category? Add a default value? Depending on the source of the category, there are several options.

My first approach would be to ensure the doc had a category when it was being indexed. Either write a SolrJ client that parses your very large data file and adds the category when you index the doc, or, if you can’t parse the file on your client, add a ScriptUpdateProcessor on the Solr end that adds the category.

Well, actually the very easiest thing to do would be to define a default value for the field, along the lines of:

<field name="timestamp" type="date" indexed="true" stored="true" default="NOW" />

but that may not serve well.

Best,
Erick

> On May 16, 2019, at 6:10 AM, Derrick Cui <derrick...@gmail.com> wrote:
>
> Hi,
> I have a use case, but don’t know how to implement, please help.
>
> I have one large data file, let’s say 500b data, which doesn’t have
> category in the source. What I want to do is that execute a query on
> indexing documents, if the query hits great than 0, add category field and
> save to solr.
>
> Currently I do two steps , indexing, then query on collection and add field
> of hits great than 0, but it takes several days to complete.
>
> Any idea or solution please .
>
> Thanks
> --
> Regards,
>
> Derrick Cui
> Email: derrick...@gmail.com
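
For illustration only, here is a rough Python sketch of the "add the category on the client before indexing" idea from the reply above. It is not SolrJ and not anything from the thread: the keyword rules in CATEGORY_RULES, the "text" field name, and the "uncategorized" fallback are all hypothetical stand-ins for whatever query currently decides the category. The only Solr-specific part is that the JSON update handler accepts a JSON array of documents.

```python
import json
import urllib.request

# Hypothetical rules: category name -> keywords that trigger it.
# These stand in for the real query that decides the category.
CATEGORY_RULES = {
    "sports": ("football", "tennis"),
    "finance": ("stock", "bond"),
}


def assign_category(doc, rules=CATEGORY_RULES, default="uncategorized"):
    """Return a copy of doc with a 'category' field chosen by keyword match."""
    # Assumes the searchable content lives in a field called "text".
    text = str(doc.get("text", "")).lower()
    for category, keywords in rules.items():
        if any(word in text for word in keywords):
            return {**doc, "category": category}
    return {**doc, "category": default}


def batches(docs, size=1000):
    """Yield lists of at most `size` documents, each already categorized."""
    batch = []
    for doc in docs:
        batch.append(assign_category(doc))
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_batch(batch, update_url):
    """POST one batch of documents to a Solr JSON update handler URL."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In use, you would stream parsed records from the data file through batches() and call post_batch() on each one with your collection's update URL (something like http://localhost:8983/solr/<collection>/update, which is an assumption about your setup), then commit once at the end. Each document is then written a single time with its category already set, rather than indexed first and updated in a second pass.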