I know how to index content on a file system, such as a single file or
folder, but how do I do that in parallel? The data I need to index is huge
and can't be put on HDFS.
Thank you
*Sincerely yours,*
*Raymond*
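(One possible approach, sketched here as an illustration rather than a recommendation from the thread: split the directory tree among worker processes on the client side and have each worker send batched updates to Solr. The collection name, root path, and field names below are hypothetical, and this assumes the pysolr client.)

from concurrent.futures import ProcessPoolExecutor
import os
import pysolr

SOLR_URL = "http://localhost:8983/solr/mycollection"  # hypothetical collection

def index_folder(folder):
    # Each worker process gets its own client and indexes one subtree.
    solr = pysolr.Solr(SOLR_URL, timeout=60)
    docs = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            with open(path, errors="ignore") as f:
                docs.append({"id": path, "filename": name, "content": f.read()})
            if len(docs) >= 500:      # send in batches, not one doc at a time
                solr.add(docs)
                docs = []
    if docs:
        solr.add(docs)

if __name__ == "__main__":
    # Split the work by top-level folder; one process per folder.
    top = "/data/to/index"            # hypothetical root directory
    folders = [os.path.join(top, d) for d in os.listdir(top)]
    with ProcessPoolExecutor(max_workers=8) as pool:
        list(pool.map(index_folder, folders))
    pysolr.Solr(SOLR_URL).commit()    # single commit once all workers finish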
The simplest approach would be to host multiple shards on the same machine.
Use ADDREPLICA/DELETEREPLICA (Collections API calls) to move the replicas
hosted on the nodes you want to use for another purpose, and when all
replicas have been moved you can repurpose those machines.
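(As an illustration of those two Collections API calls, here is roughly what moving a replica off a node looks like; the collection, shard, node, and replica names are placeholders, not values from this thread.)

import requests

SOLR = "http://localhost:8983/solr"    # any node in the cluster

# First add a replica of the shard on a node we intend to keep...
requests.get(f"{SOLR}/admin/collections", params={
    "action": "ADDREPLICA",
    "collection": "mycollection",      # placeholder names
    "shard": "shard1",
    "node": "keeper:8983_solr",
}).raise_for_status()

# ...then delete the replica on the node being repurposed.
requests.get(f"{SOLR}/admin/collections", params={
    "action": "DELETEREPLICA",
    "collection": "mycollection",
    "shard": "shard1",
    "replica": "core_node2",           # look up the name via CLUSTERSTATUS
}).raise_for_status()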
Another option would be to create a _n
Would you consider including the filename as another metadata field to be
indexed? I think your downstream Python can do that easily.
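(For example, a minimal sketch assuming pysolr and hypothetical field names; the filename is simply added as one more field on each document at index time, and it comes back as a stored field on every hit with no extra lookup.)

import os
import pysolr

solr = pysolr.Solr("http://localhost:8983/solr/mycollection")  # hypothetical

def index_file(path):
    with open(path, errors="ignore") as f:
        solr.add([{
            "id": path,
            "filename": os.path.basename(path),  # the extra metadata field
            "content": f.read(),
        }])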
*Sincerely yours,*
*Raymond*
On Fri, May 18, 2018 at 3:47 PM, S.Ashwath wrote:
> Hello,
>
> I have 2
Hi,
I don't know much about the grouping component, but I think it would be very
hard, because the query result cache has a {query and conditions} ->
{DocList} structure.
(https://github.com/apache/lucene-solr/blob/e30264b31400a147507aabd121b1152020b8aa6d/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#
Hi Yasufumi,
Thanks for the reply. Yes, you are correct; I also checked the code and it
seems to be the same.
We are facing performance issues due to grouping, so we wanted to be sure
that we are not overlooking any way of caching grouped results in the query
result cache. I was just exploring field collapsing.
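(For reference, field collapsing goes through the CollapsingQParserPlugin as a filter query. If I understand correctly, its result is an ordinary DocList, so unlike grouped results it should be eligible for the query result cache; worth verifying on your Solr version. A sketch of such a query, with a hypothetical collection and collapse field:)

import requests

params = {
    "q": "some query",
    # Collapse to one top document per group; the response is a flat
    # DocList rather than a grouped structure.
    "fq": "{!collapse field=group_field}",
    "rows": 10,
}
resp = requests.get("http://localhost:8983/solr/mycollection/select",
                    params=params)
print(resp.json()["response"]["numFound"])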