Advanced search with results matrix
Hi,

First off, we're happy users of Apache Solr v3.1 Enterprise search server, which is integrated and running successfully on our live production server.

We're now enhancing the existing search feature in our web application, as explained below, to help application users make an informed decision before getting their search results.

There will be 3 textboxes, and users can enter keyword phrases with OR/AND combinations within each textbox, for example:

Textbox 1: "SQL Server" OR SQL
Textbox 2: "Visual Basic" OR VB.NET
Textbox 3: Java AND JavaScript

When the user clicks the "Search" button, we want to present an intermediate "results matrix" page that lists all possible combinations of the 3 textboxes, with the number of records found for each combination, as shown below (the textboxes within a combination are ANDed together):

+---------+---------------------+--------------------------+---------------------+
| Matches | Textbox 1           | Textbox 2                | Textbox 3           |
+---------+---------------------+--------------------------+---------------------+
| 200     | "SQL Server" OR SQL |                          |                     |
| 300     |                     | "Visual Basic" OR VB.NET |                     |
| 400     |                     |                          | Java AND JavaScript |
| 250     | "SQL Server" OR SQL | "Visual Basic" OR VB.NET |                     |
| 350     |                     | "Visual Basic" OR VB.NET | Java AND JavaScript |
| 300     | "SQL Server" OR SQL |                          | Java AND JavaScript |
| 100     | "SQL Server" OR SQL | "Visual Basic" OR VB.NET | Java AND JavaScript |
+---------+---------------------+--------------------------+---------------------+

Only clicking one of the "Matches" counts will display the actual results of that particular search.

My questions are:

1) Do I need to run a separate search for each combination, or is it possible to obtain the "results matrix" page by making only one single call to Apache Solr? Or are there any plug-ins available that provide functionality close to my use case?
2) How do I instruct Solr to return only the count (not the results) for the search performed?
3) Any ideas/suggestions/approaches/resources are really appreciated and welcome.

Regards,
Gnanam
RE: Advanced search with results matrix
> 1. If I understand correctly you just need to perform one query. Like so
> (translated to proper syntax of course):
> ("SQL Server" OR SQL) OR ("Visual Basic" OR VB.NET) OR (Java AND
> JavaScript)

No, it's not just one single query. Rather, as I mentioned before, it's a combination of searches with a result count for each combination. Explained in detail below:

1) ("SQL Server" OR SQL)
2) ("Visual Basic" OR VB.NET)
3) (Java AND JavaScript)
4) ("SQL Server" OR SQL) AND ("Visual Basic" OR VB.NET)
5) ("Visual Basic" OR VB.NET) AND (Java AND JavaScript)
6) ("SQL Server" OR SQL) AND (Java AND JavaScript)
7) ("SQL Server" OR SQL) AND ("Visual Basic" OR VB.NET) AND (Java AND JavaScript)

Hope I made it clear.
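The seven searches listed above are every non-empty subset of the three textbox clauses, ANDed together. A minimal sketch of generating them on the client side, assuming Python purely for illustration (the helper name build_combinations is made up and is not part of Solr):

```python
from itertools import combinations


def build_combinations(clauses):
    """Return every non-empty AND-combination of the given query clauses."""
    queries = []
    for size in range(1, len(clauses) + 1):
        for subset in combinations(clauses, size):
            # Parenthesise each clause so its inner OR/AND stays grouped.
            queries.append(" AND ".join(f"({clause})" for clause in subset))
    return queries


# The three textbox values from the example in this thread.
textboxes = ['"SQL Server" OR SQL', '"Visual Basic" OR VB.NET', "Java AND JavaScript"]
matrix_queries = build_combinations(textboxes)

for query in matrix_queries:
    print(query)
```

With n textboxes this yields 2^n - 1 queries, so the matrix grows quickly if more textboxes are added later.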
RE: Advanced search with results matrix
Hi Mikhail,

> have you considered to junk your subqueries into disjunction
> (BooleanQuery.Occurs.SHOULD) and request
> http://wiki.apache.org/solr/SimpleFacetParameters#facet.query_:_Arbitrary_Query_Faceting?

Thanks for pointing me in the right direction, at the right time, towards Solr "Facet Queries" (facet.query). I feel that I'm now very close to what I'm expecting.

I just have one more question. "facet.query" is a type of faceting that is evaluated over the search results. Is it possible to run multiple combinations of search queries and get only the result counts "in a single trip" without using "facet.query"?
RE: Advanced search with results matrix
>> My question is, is it possible to run
>> multiple combination of search queries to just get only result count "in a
>> single trip" without using "facet.query"?
>
> No. AFAIK.

Yes, you're right. I searched for this and found that a requirement similar to mine has been filed as a "New Feature" targeted at v4.0 in JIRA:

A RequestHandler to run multiple queries in a batch
https://issues.apache.org/jira/browse/SOLR-1093

Alternatively, I also found that "Field Collapsing/Result Grouping" is another Solr feature that is analogous to facet querying, but for grouping. In my case, though, I think "Arbitrary Query Faceting" is the right choice to go with.
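As a rough sketch of the "Arbitrary Query Faceting" approach chosen here: one request can carry several facet.query parameters, and setting rows=0 asks Solr for counts without returning any documents. The endpoint URL, the match-all base query and the function name below are assumptions for illustration only, and Python is used just to show how the parameters fit together:

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust host, port and core to your own installation.
SOLR_SELECT_URL = "http://localhost/solr/select"


def build_matrix_request(base_query, facet_queries):
    """Build one select URL that returns a count per facet query and no documents."""
    params = [
        ("q", base_query),   # the result set the facet queries are evaluated over
        ("rows", "0"),       # counts only, no documents in the response
        ("facet", "true"),
        ("wt", "json"),
    ]
    # One facet.query parameter per cell of the results matrix.
    params += [("facet.query", fq) for fq in facet_queries]
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_matrix_request(
    "*:*",
    ['("SQL Server" OR SQL)', "(Java AND JavaScript)"],
)
print(url)
```

Passing the parameters as a list of pairs, rather than a dict, is what allows the facet.query key to repeat.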
How does Solr's MoreLikeThis component internally work to get results?
Hi,

I'm new to Apache Solr and am currently trying to make use of MoreLikeThis as a search component (instead of a dedicated request handler). I'm finding it difficult to understand how this works internally to produce more-like-this results.

For example, I'm searching for the word java in a document field named mytextcontentfield:

http://localhost/solr/core0/select/?q=mytextcontentfield:java&version=2.2&start=0&rows=10&indent=on&debugQuery=on&mlt=true&mlt.fl=mytextcontentfield

and I can see moreLikeThis in the XML response, with the unique keys of the documents in the name attribute.

My question is: how does Solr internally find more-like-this documents based on the search keyword java? Any explanation with a good example is appreciated.

Regards,
Gnanam
How to index and query "C#" as whole term?
Hi, I'm using Apache Solr v3.1. How do I configure/allow Solr to both index and query the term "c#" as a whole word/term? From "Analysis" page, I could see that the term "c#" is being reduced/converted into just "c" by solr.WordDelimiterFilterFactory. Regards, Gnanam
RE: How to index and query "C#" as whole term?
Thank you all for your valuable suggestions/approaches. I'll set it up in synonyms.txt using solr.SynonymFilterFactory. Hope this fits the bill.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, May 17, 2011 2:12 AM
To: solr-user@lucene.apache.org
Subject: Re: How to index and query "C#" as whole term?

The other advantage to the synonyms approach is it will be much less of a headache down the road. For instance, imagine you've defined "whitespacetokenizer" and "lowercasefilter". That'll fix your example just fine. It'll also cause all punctuation to be included in the tokens, so if you indexed "try to find me." (note the period) and searched for "me" (without the period) you'd not get a hit.

Then, let's say you get clever and do a regex manipulation via PatternReplaceCharFilterFactory to leave in '#' but remove other punctuation. Then any miscellaneous stream that contains a # will give surprising results. Consider 15# (for 15 pounds). Won't match 15 in a search now.

So whatever solution you choose, think about it pretty carefully before you jump ..

Best
Erick

On Mon, May 16, 2011 at 2:10 PM, Robert Petersen wrote:
> Sorry I am also using a synonyms.txt for this in the analysis stack. I
> was not clear, sorry for any confusion. I am not doing it outside of
> Solr but on the way into the index it is converted... :)
>
> -----Original Message-----
> From: Markus Jelsma [mailto:markus.jel...@openindex.io]
> Sent: Monday, May 16, 2011 8:51 AM
> To: solr-user@lucene.apache.org
> Subject: Re: How to index and query "C#" as whole term?
>
> Before indexing so outside Solr? Using the SynonymFilter would be easier
> i guess.
>
> On Monday 16 May 2011 17:44:24 Robert Petersen wrote:
>> I have always just converted terms like 'C#' or 'C++' into 'csharp' and
>> 'cplusplus' before indexing them and similarly converted those terms if
>> someone searched on them. That always has worked just fine for me...
>>
>> :)
>>
>> -----Original Message-----
>> From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
>> Sent: Monday, May 16, 2011 8:28 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: How to index and query "C#" as whole term?
>>
>> I don't think you'd want to use the string type here. String type is
>> almost never appropriate for a field you want to actually search on (it
>> is appropriate for fields to facet on).
>>
>> But you may want to use Text type with different analyzers selected.
>> You probably want Text type so the value is still split into different
>> tokens on word boundaries; you just don't want an analyzer set that
>> removes punctuation.
>>
>> On 5/16/2011 10:46 AM, Gora Mohanty wrote:
>> > On Mon, May 16, 2011 at 7:05 PM, Gnanakumar wrote:
>> >> Hi,
>> >>
>> >> I'm using Apache Solr v3.1.
>> >>
>> >> How do I configure/allow Solr to both index and query the term "c#"
>> >> as a whole word/term? From "Analysis" page, I could see that the term
>> >> "c#" is being reduced/converted into just "c" by
>> >> solr.WordDelimiterFilterFactory.
>> >
>> > [...]
>> >
>> > Yes, as you have discovered the analyzers for the field type in
>> > question will affect the values indexed.
>> >
>> > To index "c#" exactly as is, you can use the "string" type, instead
>> > of the "text" type. However, what you probably want some filters
>> > to be applied, e.g., LowerCaseFilterFactory. Take a look at the
>> > definition of the fieldType "text" in schema.xml, define a new field
>> > type that has only the tokenizers and analyzers that you need, and
>> > use that type for your field. This Wiki page should be helpful:
>> > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
>> >
>> > Regards,
>> > Gora
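The idea mentioned in this thread of converting terms like 'C#' and 'C++' into 'csharp' and 'cplusplus', both for indexed text and for incoming queries, can be sketched outside Solr as below. This is only an illustration in Python under stated assumptions: the alias table and the function name normalize_terms are hypothetical, and the thread itself settles on doing the equivalent inside Solr with synonyms.txt.

```python
import re

# Hypothetical alias table following the 'csharp' / 'cplusplus' convention
# described in the thread; extend it with whatever terms your data needs.
TERM_ALIASES = {"c#": "csharp", "c++": "cplusplus"}

# Match an alias only as a whole term: not preceded by a word character and
# not followed by a word character or a further '#' / '+'.
_ALIAS_PATTERN = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(term) for term in TERM_ALIASES) + r")(?![\w#+])",
    re.IGNORECASE,
)


def normalize_terms(text):
    """Replace punctuation-bearing terms with plain-word aliases."""
    return _ALIAS_PATTERN.sub(lambda m: TERM_ALIASES[m.group(1).lower()], text)


# Apply the same function to documents before indexing and to user queries.
print(normalize_terms("Senior C# and C++ developer"))
print(normalize_terms("weighs 15# net"))
```

Because only the listed terms are rewritten, other punctuation is left to the normal analysis chain, which sidesteps the "me." and "15#" surprises described above.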
How to list/see all the indexed terms of a particular field in a document?
Hi, I'm using Apache Solr v3.1. How do I list/get to see all the indexed terms of a particular field in a document (by passing Unique Key ID of the document)? For example, I've the following "field" definition in schema.xml: In this case, I expect/want to list/see all the indexed terms of a particular document ("mydocumentid:x") for the document field "mytextcontent". Regards, Gnanam
RE: How to list/see all the indexed terms of a particular field in a document?
So this cannot be queried/listed using Apache Solr?

-----Original Message-----
From: Gabriele Kahlout [mailto:gabri...@mysimpatico.com]
Sent: Wednesday, May 18, 2011 3:36 PM
To: solr-user@lucene.apache.org; gna...@zoniac.com
Subject: Re: How to list/see all the indexed terms of a particular field in a document?

ant luke?

On Wed, May 18, 2011 at 11:47 AM, Gnanakumar wrote:
> Hi,
>
> I'm using Apache Solr v3.1.
>
> How do I list/get to see all the indexed terms of a particular field in a
> document (by passing Unique Key ID of the document)?
>
> For example, I've the following "field" definition in schema.xml:
>
> required="true" />
> required="true" />
>
> In this case, I expect/want to list/see all the indexed terms of a
> particular document ("mydocumentid:x") for the document field
> "mytextcontent".
>
> Regards,
> Gnanam

--
Regards,
K. Gabriele
How do I write/build query using "qf" parameter of dismax handler for my use case?
Hi,

How do I write/build a Solr query using the dismax handler for my application-specific use case, explained below?

Snippet of field definitions from schema.xml:

documentid textfield1

Now, I want to search for documents containing "solr" and "struts" in all 3 text fields (textfield1, textfield2, textfield3), but only within companyid = 100. In other words, companyid=100 is a common restriction, while the search keywords should be searched only in the 3 text fields (textfield1, textfield2, textfield3).

I also understand that this can be written as shown below by qualifying all 3 text fields explicitly:

http://localhost/solr/select?q=companyid:100&textfield1:solr AND struts&textfield2:solr AND struts&textfield3:solr AND struts

But how do I write/build a query using the "qf" parameter of the dismax query handler, so that I don't need to specify all 3 fields explicitly? The Wiki says:

"For each "word" in the query string, dismax builds a DisjunctionMaxQuery object for that word across all of the fields in the qf param"

NOTE: I'm using "edismax" as the default query type in my Search Handler.

Regards,
Gnanam
RE: How do I write/build query using "qf" parameter of dismax handler for my use case?
> edismax supports full query format of lucene parser.But you can search using
> filter queries eg.
> qf=textfield1, textfield2, textfield3&fq=&textfield1:solr AND
> struts&fq=textfield2:solr AND struts&fq=textfield3:solr AND struts
> &fq=companyid:100

Is it not possible to build the query without the filter queries ("fq")? For example, something like this (I believe this is not syntactically correct, but something equivalent to it):

q=companyid:100 AND solr AND struts&qf= textfield1,textfield2,textfield3

Basically, I'm just trying to simplify the query syntax.
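One possible shape for such a request, offered only as a sketch and not as a confirmed answer from this thread: put the keywords in q, list the text fields in qf, and keep the company restriction in a single fq. The field names and companyid come from the question; the endpoint and the function name are assumptions. Note that the dismax documentation describes qf as a whitespace-separated field list rather than a comma-separated one, and that, per the Wiki quote in the question, each word is matched across the qf fields (in any of them) rather than required in every field.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost/solr/select"


def build_edismax_request(keywords, text_fields, company_id):
    """Build a select URL that searches the keywords across the qf fields."""
    params = [
        ("defType", "edismax"),
        ("q", " AND ".join(keywords)),          # e.g. "solr AND struts"
        ("qf", " ".join(text_fields)),          # whitespace-separated field list
        ("fq", f"companyid:{company_id}"),      # restriction applied as a filter
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_edismax_request(
    ["solr", "struts"],
    ["textfield1", "textfield2", "textfield3"],
    100,
)
print(url)
```

Compared with the quoted suggestion, this keeps only one fq, for the part of the query that is not a keyword search.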
Query facet count and its matching documents
Hi, We're running Apache Solr v3.1 and SolrJ is our client. We're passing multiple Arbitrary Faceting Query (facet.query) to get the number of matching documents (the facet count) evaluated over the search results in a *single* Solr query. My use case demands the actual matching facet results/documents/fields also along with facet count. My question is, is it possible to get facet query matching results along with facet count in a single Solr query call? Regards, Gnanam
RE: Query facet count and its matching documents
Any ideas on this? > We're running Apache Solr v3.1 and SolrJ is our client. > > We're passing multiple Arbitrary Faceting Query (facet.query) to get the > number of matching documents (the facet count) evaluated over the search > results in a *single* Solr query. My use case demands the actual matching > facet results/documents/fields also along with facet count. > > My question is, is it possible to get facet query matching results along > with facet count in a single Solr query call?
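For context, a facet.query entry in the response carries only a count, so one way to obtain the matching documents is a follow-up request that reuses the chosen facet query as a filter. The sketch below assumes Python and the JSON response writer (wt=json), where the counts appear under facet_counts / facet_queries; the sample response fragment is made up for illustration, and both function names are hypothetical.

```python
# Made-up fragment of a wt=json response, for illustration only.
sample_response = {
    "facet_counts": {
        "facet_queries": {
            '("SQL Server" OR SQL)': 200,
            "(Java AND JavaScript)": 400,
        }
    }
}


def facet_query_counts(response):
    """Return (facet query, count) pairs, highest count first."""
    counts = response["facet_counts"]["facet_queries"]
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


def drilldown_params(base_query, facet_query, rows=10):
    """Parameters for a follow-up request that fetches the matching documents."""
    return [("q", base_query), ("fq", facet_query), ("rows", str(rows))]


rows = facet_query_counts(sample_response)
for facet_query, count in rows:
    print(count, facet_query)

print(drilldown_params("*:*", rows[0][0]))
```

This costs a second round trip, but only for the one facet query whose documents are actually needed.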