Re: does copyFields increase index size?
> So what will be added is just another set of pointers to each relevant
> term. That's not going to be very large. Probably only a few bytes for
> each term.

Hi Shawn.

This explains a lot! Thanks.

For text fields, highlighting is done on the source fields, and the _text_ field is only used for lookup. This behavior is perfect for my needs.

On Fri, Dec 27, 2019 at 05:28:25PM -0700, Shawn Heisey wrote:
> On 12/26/2019 1:21 PM, Nicolas Paris wrote:
> > Below is part of the managed-schema. There are 1k section* fields. For the
> > second experiment, I removed the copyField, dropped the collection and
> > re-indexed everything. To measure the index size, I went to solr-cloud and
> > looked in the cloud part: 40 GB per shard. I also looked at the folder
> > size. I made some tests and the _text_ field is indexed.
>
> Your schema says that the destination field is not stored and doesn't have
> docValues. So the only thing it has is indexed.
>
> All of the terms generated by index analysis will already be in the index
> from the source fields. So what will be added is just another set of
> pointers to each relevant term. That's not going to be very large. Probably
> only a few bytes for each term.
>
> So with this copyField, the index will get larger, but probably not
> significantly.
>
> Thanks,
> Shawn

--
nicolas
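For readers following the thread, here is a minimal sketch of the kind of schema setup being discussed: a destination field that is indexed but neither stored nor docValues-enabled, fed by a copyField from the section* fields. It only builds a Schema API request body (the `add-field` and `add-copy-field` commands) and sends nothing. The field type name `text_general` and the function name are assumptions, not taken from Nicolas's schema, and if `_text_` already exists in the schema, `replace-field` would be needed instead of `add-field`.

```python
import json


def copyfield_schema_commands(source_glob="section*", dest="_text_",
                              dest_type="text_general"):
    """Build a Schema API body for an indexed-only catch-all field.

    dest_type is an assumed field type name; use whatever text type
    the actual schema defines.
    """
    return {
        "add-field": {
            "name": dest,
            "type": dest_type,
            "indexed": True,     # searchable
            "stored": False,     # original text is not kept in this field
            "docValues": False,  # no column-oriented copy either
            "multiValued": True, # many source fields are copied into one
        },
        # Every field matching the glob is copied into dest at index time.
        "add-copy-field": {"source": source_glob, "dest": dest},
    }


# The JSON below is what would be POSTed to /solr/<collection>/schema.
print(json.dumps(copyfield_schema_commands(), indent=2))
```

Because the destination is not stored, highlighting has to run against the stored source fields, which matches the behavior Nicolas describes.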
RE: Exceptions in solr log
Hi,

I'm facing the same problem with SolrCloud 7.x - 8.x. I have TLOG replicas, and when I delete the leader, the log always fills up with this:

2019-12-28 14:46:56.239 ERROR (indexFetcher-45942-thread-1) [ ] o.a.s.h.IndexFetcher No files to download for index generation: 7166
2019-12-28 14:48:03.157 ERROR (indexFetcher-45881-thread-1) [ ] o.a.s.h.IndexFetcher No files to download for index generation: 10588

Unfortunately, this error does not even say which replica, shard and collection is in trouble.

Sometimes indexing helps: my guess is that after a commit the slave replicas somehow work out which index generation should be retrieved from the new leader. Sometimes I have to restart the node.

--
Vadim

> -----Original Message-----
> From: Akreeti Agarwal [mailto:akree...@hcl.com]
> Sent: Friday, December 27, 2019 8:20 AM
> To: solr-user@lucene.apache.org
> Subject: Exceptions in solr log
>
> Hi All,
>
> Please help me with these exceptions and their workarounds:
>
> 1. org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError:
> Cannot parse
> 2. o.a.s.h.IndexFetcher No files to download for index generation: 1394327
> 3. o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_b] (this
> one is a warning, as discussed)
>
> I am always getting these errors in my Solr logs. What can be the reason
> behind them, and how should I resolve them?
>
> Thanks & Regards,
> Akreeti Agarwal
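Since the IndexFetcher message names no replica, one way to narrow things down is to list the non-leader TLOG replicas from a Collections API `CLUSTERSTATUS` response. The sketch below assumes the response has already been fetched and parsed into a dict, and that it has the usual `cluster.collections.<name>.shards.<shard>.replicas` layout with `type`, `state`, `core`, `node_name` and `leader` properties. The sample data and the function names are made up for illustration, and this only narrows the candidates; it does not identify the failing replica.

```python
def is_leader(replica):
    # Cluster state stores the flag as the string "true"; accept a bool too.
    return str(replica.get("leader", "false")).lower() == "true"


def tlog_followers(cluster_status):
    """List non-leader TLOG replicas with their shard's current leader core."""
    rows = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            replicas = shard.get("replicas", {})
            leader_core = next(
                (r.get("core") for r in replicas.values() if is_leader(r)),
                None,  # None means the shard currently has no leader
            )
            for replica_name, r in replicas.items():
                if r.get("type") == "TLOG" and not is_leader(r):
                    rows.append({
                        "collection": coll_name,
                        "shard": shard_name,
                        "replica": replica_name,
                        "core": r.get("core"),
                        "node": r.get("node_name"),
                        "state": r.get("state"),
                        "leader_core": leader_core,
                    })
    return rows


# Hypothetical, hand-written sample in the CLUSTERSTATUS layout.
sample = {"cluster": {"collections": {"logs": {"shards": {"shard1": {"replicas": {
    "core_node1": {"core": "logs_shard1_replica_t1", "type": "TLOG",
                   "state": "active", "node_name": "host1:8983_solr",
                   "leader": "true"},
    "core_node2": {"core": "logs_shard1_replica_t2", "type": "TLOG",
                   "state": "active", "node_name": "host2:8983_solr"},
}}}}}}}

for row in tlog_followers(sample):
    print(row)
```

Matching the `node` values against the host whose log shows the error gives a short list of cores to inspect.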
Re: Solr 7.3 cluster issue
Wonder what clusterstate actually says. I can think of a few things that could possibly heal the cluster:

A rolling restart of all nodes may make Solr heal itself, but the risk is that some shards may not have a replica, and if you get stuck in recovery during the restart you will have downtime.

Another way could be to use the admin UI to remove all replicas from the defunct node. Then reboot/reinstall that node, add back the missing replicas and let Solr replicate the shards to the new node.

A third, more defensive way is to add a fourth node, add replicas to it to make all collections redundant, then remove the replicas from the defunct node and finally decommission it.

Jan Høydahl

> On 28 Dec 2019 at 02:17, David Barnett wrote:
>
> Happy holidays folks, we have a production deployment using Solr 7.3 in a
> three-node cluster. We have a number of collections set up, each with three
> shards and a replication factor of 2. The system has been fine, but we
> experienced issues with disk space on one of the nodes.
>
> Node 0 starts but does not show any cores / replicas; the solr.log is full of
> these: "o.a.s.c.ZkController org.apache.solr.common.SolrException: Replica
> core_node7 is not present in cluster state: null"
>
> Node 1 and Node 2 are OK; all data from all collections is accessible.
>
> Can I recreate node 0 as though it had failed completely? Is it OK to
> remove the references to the (missing) replicas and recreate them? Would you
> be able to give me some guidance on the safest way to reintroduce node 0,
> given our situation?
>
> Many thanks
>
> Dave
Solr 7.5 speed up, accuracy details
Hi all,

Is there any way I can get speed-up and accuracy details, i.e. the performance improvements of Solr 7.5 compared with Solr 4.6? We are currently using Solr 4.6 and are in the process of upgrading to Solr 7.5, so we need these details.

Thanks in advance
Re: Solr 7.5 speed up, accuracy details
This depends heavily on how you designed your collections etc. - there is no general answer. You have to run a performance test based on your configuration and documents.

I also recommend checking the Solr documentation on how to design a collection for 7.x, and maybe even starting from scratch with a fresh schema (using the Schema API instead of schema.xml and solrconfig.xml etc.). You will have to reindex everything anyway, so it is also a good opportunity to look at your existing processes and optimize them.

> On 28.12.2019 at 15:19, Rajdeep Sahoo wrote:
>
> Hi all,
> Is there any way I can get speed-up and accuracy details, i.e. the performance
> improvements of Solr 7.5 compared with Solr 4.6?
> We are currently using Solr 4.6 and are in the process of upgrading to
> Solr 7.5, so we need these details.
>
> Thanks in advance
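As a rough illustration of what such a performance test could report, the sketch below summarizes query times collected from the same set of queries against each installation. It assumes the Solr JSON responses have already been fetched and parsed, and reads the `QTime` value (milliseconds) from each `responseHeader`. The helper names and the nearest-rank percentile choice are assumptions for this example, not an established benchmark method.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("no measurements")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def qtimes(responses):
    """Extract QTime (ms) from parsed Solr JSON responses."""
    return [r["responseHeader"]["QTime"] for r in responses]


def summarize(values):
    """Small latency summary for one test run."""
    return {
        "count": len(values),
        "median": percentile(values, 50),
        "p95": percentile(values, 95),
        "max": max(values),
    }
```

Running `summarize(qtimes(...))` once per version, with the same queries and the same documents indexed, gives numbers that can be compared side by side. QTime excludes network and response-writing time, so client-side timings may be worth recording as well.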
Re: Solr 7.3 cluster issue
+1 to Jan's comments, especially the idea of adding a 4th node and doing your ADDREPLICAs to that before doing the DELETEREPLICAs for the replicas on the sick node. I've used this to bring clusters back to health. This assumes you have at least one active leader for every shard.

That ZK error is weird; what's the full stack trace?

Best,
Erick

> On Dec 28, 2019, at 9:10 AM, Jan Høydahl wrote:
>
> Wonder what clusterstate actually says. I can think of a few things that could
> possibly heal the cluster:
>
> A rolling restart of all nodes may make Solr heal itself, but the risk is
> that some shards may not have a replica, and if you get stuck in recovery
> during the restart you will have downtime.
>
> Another way could be to use the admin UI to remove all replicas from the defunct
> node. Then reboot/reinstall that node, add back the missing replicas and
> let Solr replicate the shards to the new node.
>
> A third, more defensive way is to add a fourth node, add replicas to it to
> make all collections redundant, then remove the replicas from the defunct node
> and finally decommission it.
>
> Jan Høydahl
>
>> On 28 Dec 2019 at 02:17, David Barnett wrote:
>>
>> Happy holidays folks, we have a production deployment using Solr 7.3 in a
>> three-node cluster. We have a number of collections set up, each with three
>> shards and a replication factor of 2. The system has been fine, but we
>> experienced issues with disk space on one of the nodes.
>>
>> Node 0 starts but does not show any cores / replicas; the solr.log is full
>> of these: "o.a.s.c.ZkController org.apache.solr.common.SolrException: Replica
>> core_node7 is not present in cluster state: null"
>>
>> Node 1 and Node 2 are OK; all data from all collections is accessible.
>>
>> Can I recreate node 0 as though it had failed completely? Is it OK to
>> remove the references to the (missing) replicas and recreate them? Would you
>> be able to give me some guidance on the safest way to reintroduce node 0,
>> given our situation?
>>
>> Many thanks
>>
>> Dave
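The add-then-delete ordering described in this thread can be sketched as a list of Collections API URLs: all ADDREPLICA calls targeting the new node come first, and the DELETEREPLICA calls for the sick node only afterwards. The script only builds the URLs and does not call Solr. The base URL, collection, shard, replica and node names are placeholders, and `migration_plan` is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL such as .../admin/collections?action=..."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


def migration_plan(base_url, replicas_on_sick_node, new_node):
    """Return URLs in a safe order: every ADDREPLICA before any DELETEREPLICA.

    replicas_on_sick_node: iterable of (collection, shard, replica_name).
    new_node: node name as listed in live_nodes, e.g. "host:8983_solr".
    """
    replicas = list(replicas_on_sick_node)
    adds = [
        collections_api_url(base_url, "ADDREPLICA",
                            collection=c, shard=s, node=new_node)
        for c, s, _ in replicas
    ]
    deletes = [
        collections_api_url(base_url, "DELETEREPLICA",
                            collection=c, shard=s, replica=r)
        for c, s, r in replicas
    ]
    return adds + deletes


# Placeholder names for illustration only.
for url in migration_plan("http://localhost:8983/solr",
                          [("products", "shard1", "core_node7")],
                          "node3:8983_solr"):
    print(url)
```

In practice each new replica should be confirmed active before the matching delete is issued. If the sick node's replicas are already absent from the cluster state, the delete step may have nothing to remove.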
Re: Solr 7.5 speed up, accuracy details
Thank you for the information.
Why do you recommend using the Schema API instead of schema.xml?

On Sat, 28 Dec, 2019, 8:01 PM Jörn Franke wrote:

> This depends heavily on how you designed your collections etc. - there is
> no general answer. You have to run a performance test based on your
> configuration and documents.
>
> I also recommend checking the Solr documentation on how to design a
> collection for 7.x, and maybe even starting from scratch with a
> fresh schema (using the Schema API instead of schema.xml and solrconfig.xml
> etc.). You will have to reindex everything anyway, so it is also a good
> opportunity to look at your existing processes and optimize them.
>
> > On 28.12.2019 at 15:19, Rajdeep Sahoo wrote:
> >
> > Hi all,
> > Is there any way I can get speed-up and accuracy details, i.e. the performance
> > improvements of Solr 7.5 compared with Solr 4.6?
> > We are currently using Solr 4.6 and are in the process of upgrading to
> > Solr 7.5, so we need these details.
> >
> > Thanks in advance
Re: Solr 7.3 cluster issue
Hi Jan et al.,

clusterstate shows all cores and replicas on nodes 1 and 2, but node 0 is empty. On all three nodes, live_nodes shows the correct three node addresses.

Thanks for the advice, we will use a 4th node.

On 28 Dec 2019, 14:10, Jan Høydahl wrote:
> Wonder what clusterstate actually says. I can think of a few things that could
> possibly heal the cluster:
>
> A rolling restart of all nodes may make Solr heal itself, but the risk is
> that some shards may not have a replica, and if you get stuck in recovery
> during the restart you will have downtime.
>
> Another way could be to use the admin UI to remove all replicas from the defunct
> node. Then reboot/reinstall that node, add back the missing replicas and
> let Solr replicate the shards to the new node.
>
> A third, more defensive way is to add a fourth node, add replicas to it to
> make all collections redundant, then remove the replicas from the defunct node
> and finally decommission it.
>
> Jan Høydahl
>
> > On 28 Dec 2019 at 02:17, David Barnett wrote:
> >
> > Happy holidays folks, we have a production deployment using Solr 7.3 in a
> > three-node cluster. We have a number of collections set up, each with three
> > shards and a replication factor of 2. The system has been fine, but we
> > experienced issues with disk space on one of the nodes.
> >
> > Node 0 starts but does not show any cores / replicas; the solr.log is full
> > of these: "o.a.s.c.ZkController org.apache.solr.common.SolrException:
> > Replica core_node7 is not present in cluster state: null"
> >
> > Node 1 and Node 2 are OK; all data from all collections is accessible.
> >
> > Can I recreate node 0 as though it had failed completely? Is it OK to
> > remove the references to the (missing) replicas and recreate them? Would you
> > be able to give me some guidance on the safest way to reintroduce node 0,
> > given our situation?
> >
> > Many thanks
> >
> > Dave
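The check David describes (a live node that hosts no replicas) and the precondition Erick mentions (an active leader for every shard) can both be read off a `CLUSTERSTATUS` response. This is a sketch under the assumption that the response is already parsed into a dict with `cluster.live_nodes` and the usual collections/shards/replicas nesting. The sample data is invented, and the function names are not from any Solr library.

```python
from collections import Counter


def replicas_per_node(cluster_status):
    """Count replicas per node; live nodes with no replicas show up as 0."""
    cluster = cluster_status.get("cluster", {})
    counts = Counter({node: 0 for node in cluster.get("live_nodes", [])})
    for coll in cluster.get("collections", {}).values():
        for shard in coll.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                counts[replica.get("node_name")] += 1
    return dict(counts)


def shards_without_active_leader(cluster_status):
    """Return (collection, shard) pairs that have no active leader replica."""
    missing = []
    cluster = cluster_status.get("cluster", {})
    for coll_name, coll in cluster.get("collections", {}).items():
        for shard_name, shard in coll.get("shards", {}).items():
            has_leader = any(
                str(r.get("leader", "false")).lower() == "true"
                and r.get("state") == "active"
                for r in shard.get("replicas", {}).values()
            )
            if not has_leader:
                missing.append((coll_name, shard_name))
    return missing


# Invented sample: node0 is live but empty, shard2 has lost its leader.
sample = {"cluster": {
    "live_nodes": ["node0:8983_solr", "node1:8983_solr", "node2:8983_solr"],
    "collections": {"products": {"shards": {
        "shard1": {"replicas": {
            "core_node1": {"node_name": "node1:8983_solr", "state": "active",
                           "leader": "true"},
            "core_node2": {"node_name": "node2:8983_solr", "state": "active"},
        }},
        "shard2": {"replicas": {
            "core_node3": {"node_name": "node2:8983_solr", "state": "down"},
        }},
    }}},
}}

print(replicas_per_node(sample))
print(shards_without_active_leader(sample))
```

An empty result from the second function is the condition under which the add-a-fourth-node approach is expected to work.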
Re: Solr 7.5 speed up, accuracy details
Hi all,

How can I find out about the performance improvements for indexing and search in Solr 7.5?

On Sat, 28 Dec, 2019, 9:18 PM Rajdeep Sahoo wrote:

> Thank you for the information.
> Why do you recommend using the Schema API instead of schema.xml?
>
> On Sat, 28 Dec, 2019, 8:01 PM Jörn Franke wrote:
>
>> This depends heavily on how you designed your collections etc. - there is
>> no general answer. You have to run a performance test based on your
>> configuration and documents.
>>
>> I also recommend checking the Solr documentation on how to design a
>> collection for 7.x, and maybe even starting from scratch with a
>> fresh schema (using the Schema API instead of schema.xml and solrconfig.xml
>> etc.). You will have to reindex everything anyway, so it is also a good
>> opportunity to look at your existing processes and optimize them.
>>
>> > On 28.12.2019 at 15:19, Rajdeep Sahoo <rajdeepsahoo2...@gmail.com> wrote:
>> >
>> > Hi all,
>> > Is there any way I can get speed-up and accuracy details, i.e. the
>> > performance improvements of Solr 7.5 compared with Solr 4.6?
>> > We are currently using Solr 4.6 and are in the process of upgrading to
>> > Solr 7.5, so we need these details.
>> >
>> > Thanks in advance
Re: Solr 7.5 speed up, accuracy details
There is no general increase in speed, only new features. DocValues add some speed, but it's hard to quantify. Some people think SolrCloud brings speed increases, but I don't think they exist when hardware cost is not a factor, and it adds too much complexity to something that should be simple.

> On Dec 28, 2019, at 12:52 PM, Rajdeep Sahoo wrote:
>
> Hi all,
> How can I find out about the performance improvements for indexing and search
> in Solr 7.5?
>
>> On Sat, 28 Dec, 2019, 9:18 PM Rajdeep Sahoo wrote:
>>
>> Thank you for the information.
>> Why do you recommend using the Schema API instead of schema.xml?
>>
>>> On Sat, 28 Dec, 2019, 8:01 PM Jörn Franke wrote:
>>>
>>> This depends heavily on how you designed your collections etc. - there is
>>> no general answer. You have to run a performance test based on your
>>> configuration and documents.
>>>
>>> I also recommend checking the Solr documentation on how to design a
>>> collection for 7.x, and maybe even starting from scratch with a
>>> fresh schema (using the Schema API instead of schema.xml and solrconfig.xml
>>> etc.). You will have to reindex everything anyway, so it is also a good
>>> opportunity to look at your existing processes and optimize them.
>>>
>>>> On 28.12.2019 at 15:19, Rajdeep Sahoo <rajdeepsahoo2...@gmail.com> wrote:
>>>>
>>>> Hi all,
>>>> Is there any way I can get speed-up and accuracy details, i.e. the
>>>> performance improvements of Solr 7.5 compared with Solr 4.6?
>>>> We are currently using Solr 4.6 and are in the process of upgrading to
>>>> Solr 7.5, so we need these details.
>>>>
>>>> Thanks in advance
Re: Boosting only top n results that match a criteria
You could try and see if field collapsing can help you. That could let you return the top 5 from each class, if that is acceptable. Otherwise, you'll have to go with two queries.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 27 Dec 2019, at 19:08, Nitin Arora wrote:
>
> Simply boosting on class A1 won't work, since there may be many documents
> from that class, all getting an equal boost. I want only the top 5 docs of that
> class to get the boost.
>
> On Fri, 27 Dec 2019 at 22:42, Erick Erickson wrote:
>
>> Yes. Rerank essentially takes the top N results of one query and re-scores
>> them through another query. So just boost the secondary query.
>>
>> But you may not even have to do that. Just add a boost clause to a single
>> query and boost your class A1 quite high. See "boost" and/or "bq".
>>
>> Best,
>> Erick
>>
>>> On Dec 27, 2019, at 10:57 AM, Nitin Arora wrote:
>>>
>>> Hi Erick, I was not able to figure out how exactly I would use
>>> RerankQParserPlugin to achieve the desired reranking. I see that I can
>>> rerank all the top RERANK_DOCS results - it is possible that they contain a
>>> hundred results of class A1 or none. But the behaviour I want is to
>>> pick (only) the top 5 results of class A1 from my potentially hundreds of
>>> results, then boost them to the first page.
>>> Do you think this (or something close to it) is possible
>>> using RerankQParserPlugin? Please shed more light on how.
>>>
>>> On Fri, 27 Dec 2019 at 19:48, Erick Erickson wrote:
>>>
>>>> Have you seen RerankQParserPlugin?
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>>> On Dec 27, 2019, at 8:49 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
>>>>>
>>>>> Hi Nitin,
>>>>> Can you simply filter and return top 5:
>>>>>
>>>>> ….&fq=class:A1&rows=5
>>>>>
>>>>> Emir
>>>>> --
>>>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>>>
>>>>>> On 27 Dec 2019, at 13:55, Nitin Arora wrote:
>>>>>>
>>>>>> Hello, I have a complex Solr query with various boosts applied that
>>>>>> returns, say, a few hundred results. Out of these hundreds of results I want
>>>>>> to further boost, say, the top 5 results that satisfy a particular criterion,
>>>>>> e.g. class=A1. So I want the top 5 results from class A1 in my existing
>>>>>> result set to come even higher, so that I can show them on the first
>>>>>> page of my final results. How do I achieve this?
>>>>>> I am new to Solr and this community, so apologies if this is trivial or a repeat.
>>>>>>
>>>>>> Thanks,
>>>>>> Nitin
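The two-query fallback mentioned in this thread can be sketched on the client side: run the main query as usual, run it again with `fq=class:A1&rows=5`, and place those five documents ahead of the rest. The code below only builds the two parameter strings and merges already-parsed document lists, so it makes no Solr calls. The field name `id`, the `rows` values, and the function names are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_queries(q, boosted_fq="class:A1", top_n=5, rows=50):
    """Return (main, boosted) query strings for the two requests."""
    main = urlencode({"q": q, "rows": rows, "fl": "id,score"})
    boosted = urlencode({"q": q, "fq": boosted_fq, "rows": top_n,
                         "fl": "id,score"})
    return main, boosted


def promote(main_docs, boosted_docs, key="id"):
    """Put boosted_docs first, then the main results without duplicates."""
    promoted_ids = {doc[key] for doc in boosted_docs}
    rest = [doc for doc in main_docs if doc[key] not in promoted_ids]
    return list(boosted_docs) + rest


# Hand-written stand-ins for the "docs" lists of the two responses.
main_docs = [{"id": "1"}, {"id": "2"}, {"id": "3"}]
top_a1 = [{"id": "3"}]
print([doc["id"] for doc in promote(main_docs, top_a1)])
```

Because both requests use the same `q` and boosts, the promoted documents keep their relative order from the main ranking. The trade-off is a second round trip to Solr per search.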