David,

Thank you for your insights and thank you for your work on Apache Solr.

You are correct, this is for end users. Many typically build up their Solr
query using StringBuilder or String concatenation, then instantiate a SolrJ
SolrQuery object and pass in the query string.

I would like to get these classes named correctly for the Solr domain.
QueryTerm is actually legacy from my work, where we use an in-house indexing
and query engine and have over the years moved some things to Solr and some
to Elasticsearch.

I base the naming of classes somewhat on:
https://solr.apache.org/guide/solr/latest/query-guide/standard-query-parser.html

Term - single word
Phrase - group of words surrounded by quotes
Field - item defined by the schema
Grouping is used to form "sub-queries"

What is the proper domain name for:
title:"pink panther"
title -> field
"pink panther" -> clause? Terms?

I have been making SolrJ queries for many years now and I am comfortable
doing so. Other developers look at the interface of Solr, compare it to the
query builder of Elasticsearch, and decide it is easier to work in
Elasticsearch.

I am building a new system that has a small amount of data and decided to
look at Elasticsearch as a possible solution. The first review I read said
that building queries is easier in Elasticsearch. Since I have been using my
own query building tools for years, I hadn't considered that it was a
"selling point".

- Geoffrey

> On Aug 16, 2024, at 3:50 PM, David Smiley <dsmi...@apache.org> wrote:
>
> Hello Geoffrey,
>
> Thanks for your message and offer.
>
> I think the overall idea is nice but if we got more serious, we'd want
> to bike-shed on a number of details. Like naming ("grouper" is
> dubious to me) and explore further simplifications. Like why
> "addTerm(new QueryTerm("cd", "back in black"))" when you could do
> "addTerm("cd", "back in black")" ? And I suspect you are confusing a
> "query term" (ultimately a TermQuery in Lucene) with a query/clause
> generally.
> So much bikeshedding here that we'd probably start from
> scratch to be honest.
>
> If we hypothetically incorporated this immediately, where in the
> codebase would it be used (give a specific example)? If it's nowhere
> at all, it might be an awkward thing to include. Maybe there's 100%
> test coverage but I suspect no tests for if the string is actually
> parseable and parsed as-intended (i.e. has the Query structure). If
> it's for users of Solr (which I believe is your intention), it should
> live in SolrJ but you placed it in solr-core.
>
> ~ David
>
> On Fri, Aug 16, 2024 at 3:48 PM Geoffrey Slinker
> <geoffrey_slin...@yahoo.com.invalid> wrote:
>>
>> I have been using Apache Solr for many years in a live environment that
>> services queries at 3K rpm (unless there is a campaign in progress) and
>> updates from 3K to 10K rpm. The schema is quite robust with each record
>> potentially having 90 fields populated. The system stores 900 million
>> records and is hosted on several powerful server instances.
>>
>> Back in 2015 I attended a session at Lucid Works called “Solr Unleashed”.
>> When I described the system that I was building I recall the presenter
>> saying, “Good luck with that.” We have had very good luck.
>>
>> When I first started generating Solr query strings I did it with
>> StringBuilder. That became problematic when I wanted to change a boost or a
>> constant score for a query term group that had been generated previously.
>> So, I eventually wrote some Java classes to provide an object structure that
>> I could manipulate and navigate. It has been very helpful, and recently I
>> revamped my query generation and was glad I had objects to work with instead
>> of strings.
>>
>> My employer has often encouraged the development staff to participate in the
>> Open Source community and they are supportive of sharing this query
>> generation functionality.
>>
>> I will attach a link to the fork of Apache Solr that I am using below.
>>
>> I have some questions.
>>
>> 1) Do these Java classes provide functionality that the community would like
>> to have? Maybe there is functionality already available or similar.
>> 2) I just made a guess in the project structure on where to put the
>> functionality. Maybe it should be in SolrJ, or maybe in Lucene, or somewhere
>> else.
>>
>> The main or working Java class is called QueryTermGrouper.
>>
>> QueryTermGrouper aggregates QueryTerms and other QueryTermGroupers to form
>> complex queries that can be used in a Standard Solr Query.
>>
>> Example:
>> QueryTermGrouper grouper = new
>> QueryTermGrouper().with(BooleanClause.Occur.MUST).withBoost(1.4f);
>> grouper.addTerm(new QueryTerm("foo", "bar").withProximity(1));
>>
>> String query = grouper.toString();
>>
>> Output: +( foo:bar~1 )^1.4
>>
>> Example:
>> QueryTermGrouper grouper = new
>> QueryTermGrouper().withConstantScore(5.0f);
>> grouper.addTerm(new QueryTerm("foo", "bar").withProximity(1));
>>
>> String query = grouper.toString();
>>
>> Output: ( foo:bar~1 )^=5
>>
>> Instead of using string manipulation to create complex query strings, the
>> QueryTermGrouper allows complex queries to be built inside an object model
>> that can be more easily changed.
>>
>> If you need to generate a query like this:
>> (
>>   (
>>     cd:"back in black"
>>     cd:"point of no return"
>>     cd:"night at the opera"
>>   )^0.3
>>
>>   (
>>     record:destroyer
>>     record:"the grand illusion"
>>   )^0.5
>>
>> )
>>
>> The code to do so is as simple as this:
>> QueryTermGrouper grouper = new QueryTermGrouper();
>> QueryTermGrouper cdGrouper = grouper.addGroup();
>> QueryTermGrouper recordsGroup = grouper.addGroup();
>>
>> cdGrouper.addTerm(new QueryTerm("cd", "back in black"));
>> cdGrouper.addTerm(new QueryTerm("cd", "point of no return"));
>> cdGrouper.addTerm(new QueryTerm("cd", "night at the opera"));
>> cdGrouper.setBoost(0.3f);
>>
>> recordsGroup.addTerm(new QueryTerm("record", "destroyer"));
>> recordsGroup.addTerm(new QueryTerm("record", "the grand illusion"));
>> recordsGroup.setBoost(0.5f);
>>
>>
>> The code can be found here:
>>
>> https://github.com/gslinker/solr/tree/QUERY_TERM_GROUPER
>>
>> Unit tests provide 100% coverage on all lines of code and on all branches in
>> the code.
>>
>> Please share your thoughts.
>>
>> Sincerely
>> Geoffrey Slinker
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> For additional commands, e-mail: dev-h...@solr.apache.org
>>
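
For illustration, the simplification suggested in the thread, an
addTerm(field, value) overload in place of addTerm(new QueryTerm(...)), might
look roughly like the sketch below. The class name QueryGroupSketch and every
method on it are hypothetical; this is not the code in the linked fork, and
its output format only loosely follows the examples quoted above. It assumes
that a value containing whitespace should be emitted as a quoted phrase, and
it does not escape query-syntax special characters in single-word values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not the QueryTermGrouper class from the fork. */
final class QueryGroupSketch {
    // Each entry is either a pre-rendered "field:value" string or a nested group.
    private final List<Object> clauses = new ArrayList<>();
    // null means "no boost suffix".
    private Float boost;

    /** Adds field:value, quoting the value as a phrase if it contains whitespace. */
    QueryGroupSketch addTerm(String field, String value) {
        clauses.add(field + ":" + formatValue(value));
        return this;
    }

    /** Creates a nested group, attaches it to this group, and returns it. */
    QueryGroupSketch addGroup() {
        QueryGroupSketch child = new QueryGroupSketch();
        clauses.add(child);
        return child;
    }

    /** Sets the boost rendered as a ^N suffix; it can be changed at any time. */
    QueryGroupSketch setBoost(float boost) {
        this.boost = boost;
        return this;
    }

    private static String formatValue(String value) {
        boolean isPhrase = value.chars().anyMatch(Character::isWhitespace);
        if (!isPhrase) {
            return value; // assumption: single words are emitted unescaped
        }
        // Escape backslashes first, then embedded double quotes.
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("(");
        for (Object clause : clauses) {
            sb.append(' ').append(clause); // nested groups render recursively
        }
        sb.append(" )");
        if (boost != null) {
            sb.append('^').append(boost);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        QueryGroupSketch root = new QueryGroupSketch();
        root.addGroup()
            .addTerm("cd", "back in black")
            .addTerm("cd", "point of no return")
            .setBoost(0.3f);
        root.addGroup()
            .addTerm("record", "destroyer")
            .setBoost(0.5f);
        System.out.println(root);
    }
}
```

Because the boost lives on the object and is only rendered in toString(), it
can be changed after the group is built, which is the problem with
StringBuilder described in the original message. Whether the resulting string
parses into the intended Query structure would still need a separate test
against the query parser.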