Reverse-engineering existing installation
The documentation for Solr is good. However, it is oriented toward setting up a new installation, with the data model known.

I have inherited an existing installation. I know some aspects of the data model, but there are many ways things could have been configured in Solr, and in some cases I don't know what Solr was supposed to do.

Can you recommend any documentation on working out the configuration of an existing installation?
Re: Reverse-engineering existing installation
Thanks! Alexandre's presentation is helpful in understanding what's not essential. David's suggestion of comparing config files is good. I'll have to see if I can dig up the config files for version 4.2, which we're currently running.

I'll also look into updating to a supported version. I guess I'll be reading https://lucene.apache.org/solr/guide/6_6/upgrading-solr.html and the similar guides for later versions. Is an upgrade guide for version 4 to 5 still around somewhere?

On Fri, May 3, 2019 at 12:21 AM David Smiley wrote:
> Consider trying to diff configs from a default at the version it was copied from, if possible. Even better, the configs should be in source control and then you can browse history with commentary and sometimes links to issue trackers and code reviews.
>
> Also a big part that you can’t see by staring at configs is what the queries look like. You should examine the system interacting with Solr to observe embedded comments/docs for insights.
Re: Reverse-engineering existing installation
Thanks! Diffs for solr.xml and zoo.cfg were easy, but it looks like we'll need to strip the comments before we can get a useful diff of solrconfig.xml or schema.xml. Can you recommend tools to normalize XML files? XMLStarlet is hosted on SourceForge, which I no longer trust, and hasn't been updated in years.

On Fri, May 3, 2019 at 4:24 PM Shawn Heisey wrote:
> On 5/3/2019 1:44 PM, Erick Erickson wrote:
> > Then git will let you check out any previous branch. 4.2 is from before we switched to Git, so I’m not sure you can go that far back, but 4x is probably close enough for comparing configs.
>
> Git has all of Lucene's history, and most of Solr's history, back to when Lucene and Solr were merged before the 3.1.0 release. So the 4.x releases are there:
>
> elyograg@smeagol:~/asf/lucene-solr$ git checkout releases/lucene-solr/4.2.1
> Checking out files: 100% (13209/13209), done.
> Note: checking out 'releases/lucene-solr/4.2.1'.
>
> You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout.
>
> If you want to create a new branch to retain commits you create, you may do so (now or later) by using -b with the checkout command again. Example:
>
>   git checkout -b
>
> HEAD is now at 50c41a3e5c Lucene Java 4.2.1 release.
>
> Thanks,
> Shawn
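If installing XMLStarlet is not an option, the comment-stripping step can probably be approximated with the Python standard library alone. The following is a minimal sketch, assuming Python 3.8 or later (for `xml.etree.ElementTree.canonicalize`); the helper names `normalized_lines` and `config_diff` and the two tiny sample configs are made up for illustration and are not taken from any real Solr release.

```python
import difflib
from xml.etree.ElementTree import canonicalize  # available in Python 3.8+


def normalized_lines(xml_text):
    """Canonicalize XML (comments dropped, attributes sorted, whitespace
    around text stripped) and put one tag per line so a line diff is useful."""
    canonical = canonicalize(xml_text, with_comments=False, strip_text=True)
    # Canonical XML escapes "<" and ">" inside text, so "><" only occurs
    # between two adjacent tags and is a safe place to break the line.
    return canonical.replace("><", ">\n<").splitlines()


def config_diff(stock_xml, local_xml):
    """Return a unified diff between two normalized XML documents."""
    return list(difflib.unified_diff(
        normalized_lines(stock_xml), normalized_lines(local_xml),
        fromfile="stock", tofile="local", lineterm=""))


# Hypothetical miniature configs, only to show the effect.
stock_cfg = """<?xml version="1.0"?>
<config>
  <!-- comment shipped with the example config -->
  <luceneMatchVersion>LUCENE_42</luceneMatchVersion>
  <requestHandler name="/select" class="solr.SearchHandler"/>
</config>"""

local_cfg = """<config>
  <luceneMatchVersion>LUCENE_42</luceneMatchVersion>
  <!-- local note -->
  <requestHandler class="solr.SearchHandler" name="/select">
    <str name="df">text</str>
  </requestHandler>
</config>"""

for line in config_diff(stock_cfg, local_cfg):
    print(line)
```

For real files, `canonicalize(from_file=path, with_comments=False, strip_text=True)` reads directly from disk. Differences in comments, attribute order, and indentation disappear, so what remains in the diff should be actual configuration changes.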
Re: Reverse-engineering existing installation
Thanks, XMLStarlet makes it straightforward to get the canonical XML.

It looks like our schema.xml files are rather different from files like solr/example/solr/collection1/conf/schema.xml. Any suggestions for which sections I should focus on?

On Sat, May 4, 2019 at 8:11 AM Alexandre Rafalovitch wrote:
> XMLStarlet still works just fine. So if you want the fast way, that is the one.
>
> Otherwise, some XML editors can do it (not sure which ones) or you can look for XSLT or XQuery examples on the web.
>
> XMLStarlet actually just spits out XSLT internally, or even externally if you ask.
>
> Regards,
> Alex
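The thread does not record an answer to the question of which schema sections to focus on. One way to narrow it down, offered only as a sketch, is to compare the field and field type definitions by name instead of diffing the files line by line. The helper names and the two sample schemas below are hypothetical; the element names `field`, `dynamicField`, and `fieldType` are the ones used in schema.xml.

```python
import xml.etree.ElementTree as ET


def schema_summary(schema_xml):
    """Map (tag, name) to the attribute dict of every field-like element."""
    root = ET.fromstring(schema_xml)
    summary = {}
    for tag in ("field", "dynamicField", "fieldType"):
        for elem in root.iter(tag):
            summary[(tag, elem.get("name", ""))] = dict(elem.attrib)
    return summary


def compare_schemas(stock_xml, local_xml):
    """Report definitions that were added, removed, or changed locally."""
    stock, local = schema_summary(stock_xml), schema_summary(local_xml)
    return {
        "only_in_local": sorted(local.keys() - stock.keys()),
        "only_in_stock": sorted(stock.keys() - local.keys()),
        "changed": sorted(k for k in local.keys() & stock.keys()
                          if local[k] != stock[k]),
    }


# Hypothetical miniature schemas, only to show the shape of the report.
stock_schema = """<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="price" type="float" indexed="true" stored="true"/>
  </fields>
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="float" class="solr.TrieFloatField"/>
  </types>
</schema>"""

local_schema = """<schema name="products" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="price" type="float" indexed="true" stored="false"/>
    <field name="merchantId" type="string" indexed="true" stored="true"/>
  </fields>
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="float" class="solr.TrieFloatField"/>
  </types>
</schema>"""

report = compare_schemas(stock_schema, local_schema)
print(report)
```

Entries under `only_in_local` and `changed` are likely the parts someone customized deliberately, so they are a reasonable place to start reading. This sketch only compares attributes; analyzer chains nested inside a `fieldType` would need a deeper comparison.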
Softer version of grouping and/or filter query
We have a query to return products related to a given product. To give some variety to the results, we group by vendor:

group=true&group.main=true&group.field=merchantId

We need at least four results to display. Unfortunately, some categories don't have a lot of products, and grouping takes us (say) from five results to three.

Can I "soften" the grouping, so other products by the same vendor will appear in the results, but with a much lower score?

Similarly, we have a filter query that only returns products priced at $150 or more:

fq=price:[150+TO+*]

Can this be changed to a q or qf parameter where products less than $150 have a score lower than any product priced $150 or more? (A price higher than $150 should not increase the score.)
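One possible client-side approximation of the "softer" grouping, which is not a Solr feature and is not proposed anywhere in this thread: request the results without group=true, then reorder them so the best hit from each vendor comes first and repeat-vendor hits follow. A minimal sketch, assuming each result is a dict carrying the `merchantId` field and that the list arrives sorted by descending score; the function name and sample data are hypothetical.

```python
def soften_grouping(docs, group_field="merchantId"):
    """Stable reorder: the first hit per vendor keeps its relative position,
    later hits from an already-seen vendor are moved to the end."""
    seen = set()
    first_hits, repeats = [], []
    for doc in docs:  # docs are assumed to be sorted by descending score
        key = doc.get(group_field)
        if key in seen:
            repeats.append(doc)
        else:
            seen.add(key)
            first_hits.append(doc)
    return first_hits + repeats


# Hypothetical results: five products from three vendors.
docs = [
    {"id": "p1", "merchantId": "A"},
    {"id": "p2", "merchantId": "A"},
    {"id": "p3", "merchantId": "B"},
    {"id": "p4", "merchantId": "C"},
    {"id": "p5", "merchantId": "B"},
]

ordered = [d["id"] for d in soften_grouping(docs)]
print(ordered)  # ['p1', 'p3', 'p4', 'p2', 'p5']
```

With this approach all five products remain available, so the minimum of four results can still be displayed, while vendor variety is preserved at the top of the list.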
Re: Softer version of grouping and/or filter query
Thanks much! I dropped price from the fq term, changed to the edismax parser, and boosted with bq=price:[150+TO+*]^100

On Thu, May 9, 2019 at 7:21 AM Edward Ribeiro wrote:
> On Wed, May 8, 2019 at 18:56, Doug Reeder wrote:
> >
> > Similarly, we have a filter query that only returns products over $150:
> > fq=price:[150+TO+*]
> >
> > Can this be changed to a q or qf parameter where products less than $150 have score less than any product priced $150 or more? (A price higher than $150 should not increase the score.)
>
> If you are using edismax then you could use a boost function. Maybe something along those lines: bf=if(lt(price, 150), 0.5, 100)
>
> Your fq already filters out documents with prices less than 150. Using a boost (function/query) will retrieve back docs with prices less than 150, but probably with smaller scores.
>
> Edward
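The change described above (boost instead of filter) could be assembled roughly as follows. This is a sketch that only builds the query string; the function name, the sample user query "camera", and the default values are assumptions for illustration. Note that bq is an additive boost in edismax, so a large boost makes it very likely, but does not strictly guarantee, that every product priced $150 or more outranks every cheaper one.

```python
from urllib.parse import urlencode, parse_qs


def related_products_params(user_query, min_price=150, boost=100):
    """Build Solr request parameters that boost by price instead of filtering."""
    return {
        "defType": "edismax",
        "q": user_query,
        # Range query used as a boost query: documents priced at or above
        # min_price get an additive score bump, cheaper ones are still returned.
        "bq": f"price:[{min_price} TO *]^{boost}",
        # Vendor grouping parameters carried over from the original query.
        "group": "true",
        "group.main": "true",
        "group.field": "merchantId",
    }


params = related_products_params("camera")
query_string = urlencode(params)  # spaces become "+", as in price:[150+TO+*]
print(query_string)
```

Because the range query matches every price at or above the threshold equally, a price well above $150 should not add anything beyond the same fixed boost, which matches the requirement in the original question.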