Hierarchical faceting in UI
I have some hierarchical data that I want to represent in the Solr UI (/browse). I've read through many discussions on this topic, including http://wiki.apache.org/solr/HierarchicalFaceting and http://packtlib.packtpub.com/library/9781849516068/ch06lvl1sec09 . However, I didn't see a solution that solves my case.

For each facet field in my data, the depth varies depending on the facet term. For example, at the root level, facet term 1 may have 3 levels below it, but facet term 2 may have 8. In the UI, I want to show only the facet terms at the currently selected level. Using the example that comes with Solr, the facet field "cat" has the term "electronics", which then has several children. When my user initially enters the UI, he should only see "electronics"; he should not see any of its children until he clicks on "electronics".

Programmatically, something like this might work: for each facet field, add another hidden field that identifies its parent. Then, program additional logic in the UI to show only the facet terms at the currently selected level. For example, if one filters on "cat:electronics", the new UI logic would apply the additional filter "cat_parent:electronics". Can this be done? Would it be a lot of work? Is there a better way?

By the way, Flamenco (another faceted browser) has built-in support for hierarchies, and it has worked well for my data in this aspect (but less well than Solr in others). I'm looking for the same kind of hierarchical UI feature in Solr.
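The parent-field idea described above could be sketched on the client side roughly as follows. This is only an illustration of the proposal, not a tested Solr setup: the `ROOT` marker for top-level terms, the `_parent` field suffix, and the helper names are assumptions made for the example.

```python
from urllib.parse import urlencode

ROOT = "ROOT"  # hypothetical value stored in the parent field of top-level terms


def quote_term(term):
    """Wrap a term in double quotes, escaping backslashes and embedded quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def drill_down_params(field, selected=None):
    """Build request parameters that facet on `field` one level below `selected`.

    With no selection, only terms whose parent is the ROOT marker are requested.
    """
    parent = ROOT if selected is None else selected
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", field),
        # The extra filter from the proposal: restrict to children of `parent`.
        ("fq", f"{field}_parent:{quote_term(parent)}"),
    ]
    if selected is not None:
        # The filter the user already applied by clicking the term.
        params.append(("fq", f"{field}:{quote_term(selected)}"))
    return params


if __name__ == "__main__":
    print(urlencode(drill_down_params("cat", "electronics")))
```

One caveat with this sketch: if the facet field is multi-valued, the facet counts returned for the filtered documents can still include terms from other levels or branches, so the UI may need to filter the displayed terms as well.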
Re: Hierarchical faceting in UI
Darren,

One challenge for me is that a term can appear in multiple places of the hierarchy. So it's not safe to simply use the term as it appears to get its children; I probably need to include the entire tree path up to this term. For example, if the hierarchy is "Cardiovascular Diseases > Arteriosclerosis > Coronary Artery Disease", and I'm getting the children of the middle term Arteriosclerosis, I need to filter on something like "parent:Cardiovascular Diseases/Arteriosclerosis".

I'm having trouble figuring out how I can get the complete path per above to add to the URL of each facet term. I know "velocity/facet_field.vm" is where I build the URL. I know how to simply add a "parent:" filter to the URL. But I don't know how to access a document field, like the complete parent path, in "facet_field.vm". Any help would be great.

Yuhao

From: "dar...@ontrenet.com"
To: Yuhao
Cc: solr-user@lucene.apache.org
Sent: Monday, January 23, 2012 7:16 PM
Subject: Re: Hierarchical faceting in UI

On Mon, 23 Jan 2012 14:33:00 -0800 (PST), Yuhao wrote:
> Programmatically, something like this might work: for each facet field,
> add another hidden field that identifies its parent. Then, program
> additional logic in the UI to show only the facet terms at the currently
> selected level. For example, if one filters on "cat:electronics", the new
> UI logic would apply the additional filter "cat_parent:electronics". Can
> this be done?

Yes. This is how I do it.

> Would it be a lot of work?

No. It's not a lot of work: simply represent your hierarchy as parent/child relations in the document fields, and in your UI drill down by issuing new faceted searches. Use the current facet (tree level) as the parent: in the next query. It's much easier than other suggestions for this.

> Is there a better way?

Not in my opinion, there isn't. This is the simplest to implement and understand.

> By the way, Flamenco (another faceted browser) has built-in support for
> hierarchies, and it has worked well for my data in this aspect (but less
> well than Solr in others). I'm looking for the same kind of hierarchical
> UI feature in Solr.
Re: Hierarchical faceting in UI
Hi Darren. You said:

"Your UI will associate the correct parent id to build the facet query"

This is the part I'm having trouble figuring out how to accomplish, and some guidance would help. How would I get the value of the parent to build the facet query in the UI, if the value is in another document field? I was imagining that I would add the additional filter of "parent:" to the "fq" URL parameter. But I don't have a way to do it yet.

Perhaps seeing some data would help. Here is a record in old (flattened) and new (parent-enabled) versions, both in JSON format:

OLD:

    {
      "ID" : "3816",
      "Gene Symbol" : "KLK1",
      "Alternate Names" : "hCG_22931;Klk6;hK1;KLKR",
      "Description" : "Kallikrein 1, a peptidase that cleaves kininogen, functions in glucose homeostasis, heart contraction, semen liquefaction, and vasoconstriction, aberrantly expressed in pancreatitis and endometrial cancer; gene polymorphism correlates with kidney failure (BKL)",
      "GAD_Positive_Disease_Associations" : ["Mental Disorders(MESH:D001523) >> Dementia, Vascular(MESH:D015140)", "Cardiovascular Diseases(MESH:D002318) >> Coronary Artery Disease(MESH:D003324)"],
      "HuGENet_GeneProspector_Associations" : ["atherosclerosis", "HDL"]
    }

NEW:

    {
      "ID" : "3816",
      "Gene Symbol" : "KLK1",
      "Alternate Names" : "hCG_22931;Klk6;hK1;KLKR",
      "Description" : "Kallikrein 1, a peptidase that cleaves kininogen, functions in glucose homeostasis, heart contraction, semen liquefaction, and vasoconstriction, aberrantly expressed in pancreatitis and endometrial cancer; gene polymorphism correlates with kidney failure (BKL)",
      "GAD_Positive_Disease_Associations" : ["Dementia, Vascular(MESH:D015140)", "Coronary Artery Disease(MESH:D003324)"],
      "GAD_Positive_Disease_Associations_parent" : ["Mental Disorders(MESH:D001523)", "Cardiovascular Diseases(MESH:D002318)"],
      "HuGENet_GeneProspector_Associations" : ["atherosclerosis", "HDL"]
    }

In the old version, the field "GAD_Positive_Disease_Associations" had 2 levels of hierarchy that were flattened. It had the full path of the hierarchy leading to the current term. In the new version, the field only has the current term. A separate field called "GAD_Positive_Disease_Associations_parent" has the full path preceding the current term.

So, let's say in the UI, I click on the term "Dementia, Vascular(MESH:D015140)" to get its child terms and data. My filters in the URL querystring would be exactly:

    fq=GAD_Positive_Disease_Associations:"Dementia, Vascular(MESH:D015140)"&fq=GAD_Positive_Disease_Associations_parent:"Mental Disorders(MESH:D001523)"

My question is, how do I get the parent value of "Mental Disorders(MESH:D001523)" to build that querystring?

Thanks!
Yuhao

From: Darren Govoni
To: solr-user@lucene.apache.org
Sent: Tuesday, January 24, 2012 1:23 PM
Subject: Re: Hierarchical faceting in UI

Yuhao,

Ok, let me think about this. A term can have multiple parents. Each of those parents would be 'different', yes? In this case, use a multivalued field for the parent and add all the parent names or id's to it. The relations should be unique.

Your UI will associate the correct parent id to build the facet query from and return the correct children because the user is descending down a specific path in the UI and the parent node unique id's are returned along the way.

Now, if you are having parent names/id's that themselves can appear in multiple locations (vs. just terms 'the leafs'), then perhaps your hierarchy needs refactoring for redundancy?

Happy to help with more details.

Darren

On 01/24/2012 11:22 AM, Yuhao wrote:
> Darren,
>
> One challenge for me is that a term can appear in multiple places of the
> hierarchy. So it's not safe to simply use the term as it appears to get its
> children; I probably need to include the entire tree path up to this term.
> For example, if the hierarchy is "Cardiovascular Diseases> > Arteriosclerosis> Coronary Artery Disease", and I'm getting the children of > the middle term Arteriosclerosi, I need to filter on something like > "parent:Cardiovascular Diseases/Arteriosclerosis". > > I'm having trouble figuring out how I can get the complete path per above to > add to the URL of each facet term. I know "velocity/facet_field.vm" is whe
Re: SOLVED: Strange things happen when I query with many facet.prefixes and fq filters
Good question. I checked the output sent to Jetty. In the case where it returns a blank page, nothing at all is sent to Jetty. This raised my suspicion that Solr never got a chance to process the query. Sure enough, it led me to the finding that Jetty by default cannot take more than 4 KB of header. After I increased that limit, everything works. Problem solved.

From: Erick Erickson
To: solr-user@lucene.apache.org; Yuhao
Sent: Sunday, January 29, 2012 1:05 PM
Subject: Re: Strange things happen when I query with many facet.prefixes and fq filters

The very first question I have is "what do your Solr logs show"? I suspect you'll see something interesting there. Otherwise, there's no way really to say what's going on here without reproducing your setup...

Best
Erick

On Fri, Jan 27, 2012 at 6:48 PM, Yuhao wrote:
> Hi,
>
> I'm having issues when running the following query, which is produced by
> expanding several hierarchical facets (implemented the facet.prefix way). I
> realize it's pretty massive, but I'd like to figure out what exactly is
> causing the problem. Is it too many facet.prefix clauses, too many fq
> filters, the combo of both, or what.
Anyway, here is the URL I start out > with: > > http://40.163.5.153:920/solr/browse?&fq=Gene_Ontology_Associations%3A%220%2Fbiological_process%28GO%3A0008150%29%22&fq=Gene_Ontology_Associations%3A%221%2Fbiological_process%28GO%3A0008150%29%3Bmetabolic+process%28GO%3A0008152%29%22&fq=Gene_Ontology_Associations%3A%222%2Fbiological_process%28GO%3A0008150%29%3Bmetabolic+process%28GO%3A0008152%29%3Bsteroid+metabolic+process%28GO%3A0008202%29%22&fq=Gene_Ontology_Associations%3A%223%2Fbiological_process%28GO%3A0008150%29%3Bmetabolic+process%28GO%3A0008152%29%3Bsteroid+metabolic+process%28GO%3A0008202%29%3Bcholesterol+metabolic+process%28GO%3A0008203%29%22&fq=Mouse_Phenotype_Associations%3A%220%2Fmammalian+phenotype%28MP%3A001%29%22&fq=Mouse_Phenotype_Associations%3A%221%2Fmammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fmetabolism+phenotype%28MP%3A0005376%29%22&fq=Mouse_Phenotype_Associations%3A%222%2Fmammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fme > tabolism+phenotype%28MP%3A0005376%29%3Babnormal+homeostasis%28MP%3A0001764%29%22&fq=Mouse_Phenotype_Associations%3A%223%2Fmammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fmetabolism+phenotype%28MP%3A0005376%29%3Babnormal+homeostasis%28MP%3A0001764%29%3Babnormal+lipid+homeostasis%28MP%3A0002118%29%22&fq=Mouse_Phenotype_Associations%3A%224%2Fmammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fmetabolism+phenotype%28MP%3A0005376%29%3Babnormal+homeostasis%28MP%3A0001764%29%3Babnormal+lipid+homeostasis%28MP%3A0002118%29%3Babnormal+cholesterol+homeostasis%28MP%3A0005278%29%22&fq=Mouse_Phenotype_Associations%3A%225%2Fmammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fmetabolism+phenotype%28MP%3A0005376%29%3Babnormal+homeostasis%28MP%3A0001764%29%3Babnormal+lipid+homeostasis%28MP%3A0002118%29%3Babnormal+cholesterol+homeostasis%28MP%3A0005278%29%3Babnormal+cholesterol+level%28MP%3A0003947%29%22&fq=Mouse_Phenotype_Associations%3A%226%2Fm > 
ammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fmetabolism+phenotype%28MP%3A0005376%29%3Babnormal+homeostasis%28MP%3A0001764%29%3Babnormal+lipid+homeostasis%28MP%3A0002118%29%3Babnormal+cholesterol+homeostasis%28MP%3A0005278%29%3Babnormal+cholesterol+level%28MP%3A0003947%29%3Bdecreased+cholesterol+level%28MP%3A0003983%29%22&fq=Mouse_Phenotype_Associations%3A%227%2Fmammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fmetabolism+phenotype%28MP%3A0005376%29%3Babnormal+homeostasis%28MP%3A0001764%29%3Babnormal+lipid+homeostasis%28MP%3A0002118%29%3Babnormal+cholesterol+homeostasis%28MP%3A0005278%29%3Babnormal+cholesterol+level%28MP%3A0003947%29%3Bdecreased+cholesterol+level%28MP%3A0003983%29%3Bdecreased+liver+cholesterol+level%28MP%3A0010026%29%22&fq=BKL_Diagnostic_Marker_Associations%3A%220%2FCardiovascular+Diseases%28MESH%3AD002318%29%22&fq=BKL_Molecular_Mechanism_Associations%3A%220%2FCardiovascular+Diseases%28MESH%3AD002318%29%22&fq= > BKL_Diagnostic_Marker_Associations%3A%221%2FCardiovascular+Diseases%28MESH%3AD002318%29%3BArteriosclerosis%28MESH%3AD001161%29%22&q=&fq=BKL_Diagnostic_Marker_Associations:%222%2FCardiovascular+Diseases%28MESH%3AD002318%29%3BArteriosclerosis%28MESH%3AD001161%29%3BAtherosclerosis%28MESH%3AD050197%29%22&f.Gene_Ontology_Associations.facet.prefix=4%2Fbiological_process%28GO%3A0008150%29%3Bmetabolic+process%28GO%3A0008152%29%3Bsteroid+metabolic+process%28GO%3A0008202%29%3Bcholesterol+metabolic+process%28GO%3A0008203%29&f.Mouse_Phenotype_Associations.facet.prefix=8%2Fmammalian+phenotype%28MP%3A001%29%3Bhomeostasis%2Fmetabolism+phenotype%28MP%3A0005376%29%3Babnormal+homeostasis%28MP%3A0001764%29%3Babnormal+lipid+homeostas
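Since the cause reported above was a roughly 4 KB request header limit rather than anything in Solr itself, a client could estimate the size of a GET request before sending it. The sketch below is a rough check only: the base URL and helper name are hypothetical, and the real limit also counts the other request headers, so the threshold should be treated as approximate.

```python
from urllib.parse import urlencode

HEADER_LIMIT = 4 * 1024  # the 4 KB default reported in this thread


def needs_post(base_url, params, limit=HEADER_LIMIT):
    """Return True if the encoded GET URL alone would exceed `limit` bytes.

    `params` is a list of (name, value) pairs, so repeated names such as
    several fq filters are preserved.
    """
    url = base_url + "?" + urlencode(params)
    return len(url.encode("ascii")) > limit


if __name__ == "__main__":
    # Hypothetical endpoint and filters, for illustration only.
    filters = [("fq", 'Some_Field:"' + "x" * 200 + '"')] * 30
    print(needs_post("http://localhost:8983/solr/browse", filters))
```

When the check returns True, one option (besides raising the container's limit, as was done here) is to send the same parameters in a form-encoded POST body, which Solr's search handlers generally accept.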
$doc.getFieldNames() - what determines the order of fields?
$doc.getFieldNames() will give you a list of field names as defined in your schema.xml file. However, the order in which it returns the field names is not the same order that I defined them in schema.xml. What determines the order returned by $doc.getFieldNames() ?
Help: Creating another handler and template to display document attributes
Like the title says, I want to create a "page" to display a bunch of document attributes. I accomplished this by creating a new handler and a template for it. However, I'm having trouble pulling up the details of the document in the new handler. Here's my code. Is this a good way to do it? I first pass the doc ID to the handler, which I then use to pull up the details.

*** BEGIN ***

    #set($id = $params.get("id"))
    ## Note: id is the same thing as "Entrez ID"

    #foreach ($doc in $response.results)
      #if ($doc.getFieldValue('Entrez ID') == $id)  ## Only show attrs for the current doc
        #foreach ($field_name in $doc.getFieldNames())
          $field_name: $doc.getFieldValue($field_name)
        #end
      #end
    #end

*** END ***

This approach requires the document to be in the search results. The way I pass the ID to the handler right now is to simply add "id=$id" to the URL, without the rest of the querystring that was used to produce the current query. I need to get the rest of the querystring and pass it along to the handler to ensure the document is in the search result. However, I don't know of a good way. I tried the following code:

    #set($querystring = "")
    #foreach ($param in $request.params.getParameterNamesIterator())
      #set($querystring = $querystring + "&$param=$esc.url($request.params.get($param))" )
    #end

Unfortunately, the above code returns a lot of unnecessary params that are not part of the querystring, and it fails to put the document in the search result. Is there a better way to get the URL querystring (and just the querystring, not other environment parameters)?
Help: nothing is searchable in Solr
After modifying the schema, I've somehow managed to break the text search functionality, because the search can't find anything any more. For example, I defined a field called "Entrez ID" in my schema.xml file:

    <field name="Entrez ID" type="string" index="true" stored="true" required="true" />

Here's one of the indexed documents:

    { "Entrez ID" : "335", }

The document comes up in the Solritas interface by just browsing, and it also shows up by drilling down the facets. However, if I search for the term "335", nothing is found. Any idea why? Do I need to configure more settings to make a field searchable?
SOLVED Re: $doc.getFieldNames() - what determines the order of fields?
I found the answer to my question. The order is determined by the order in which the fields were defined in the input XML or JSON record for this document. From: Yuhao To: "solr-user@lucene.apache.org" Sent: Wednesday, February 1, 2012 3:27 PM Subject: $doc.getFieldNames() - what determines the order of fields? $doc.getFieldNames() will give you a list of field names as defined in your schema.xml file. However, the order in which it returns the field names is not the same order that I defined them in schema.xml. What determines the order returned by $doc.getFieldNames() ?
Re: Help: nothing is searchable in Solr
Oops, you're right about the typo! However, after I changed it to:

    <field name="Entrez ID" type="string" indexed="true" stored="true" required="true" />

, searching for "335" still returns no result. I did delete the index and re-index the documents after the change. Interestingly, adding * to the search does produce results, and it seems to be the only way to find anything.

* by itself finds 549/757 results.
*:* finds all 757 results.
*[a-zA-Z]* finds nothing.
*[0-9]* finds some results. For example, *33* finds 7 results, but it does NOT find the doc with id=335.

The results are interesting because I definitely have many indexed fields with [a-zA-Z] characters, but nothing at all is found.

From: Ahmet Arslan
To: solr-user@lucene.apache.org; Yuhao
Sent: Wednesday, February 1, 2012 5:59 PM
Subject: Re: Help: nothing is searchable in Solr

> For example, I defined a field called "Entrez ID" in my
> schema.xml file:
>
> <field name="Entrez ID"
> type="string" index="true" stored="true" required="true"
> />

It could be the typo: index="true" should be indexed="true"
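A typo like index="true" is easy to miss by eye, so a small check over schema.xml can help. The following is a minimal sketch, not an official Solr tool: the attribute allowlist is a partial, illustrative set (an assumption, not the complete list Solr accepts), and the name pattern simply flags names containing anything other than letters, digits, and underscores.

```python
import re
import xml.etree.ElementTree as ET

# Partial, illustrative allowlist of <field> attributes (assumption: not exhaustive).
KNOWN_ATTRS = {
    "name", "type", "indexed", "stored", "multiValued", "required",
    "default", "omitNorms", "termVectors", "termPositions", "termOffsets",
}
SAFE_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def lint_fields(schema_xml):
    """Return human-readable warnings for suspicious <field> definitions."""
    problems = []
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        name = field.get("name", "")
        for attr in field.attrib:
            if attr not in KNOWN_ATTRS:
                problems.append(f"{name}: unrecognized attribute '{attr}'")
        if not SAFE_NAME.match(name):
            problems.append(f"{name}: name contains characters outside [A-Za-z0-9_]")
    return problems


if __name__ == "__main__":
    sample = (
        "<schema><fields>"
        '<field name="Entrez ID" type="string" index="true" stored="true" required="true"/>'
        "</fields></schema>"
    )
    for warning in lint_fields(sample):
        print(warning)
```

Run against the field definition from this thread, the sketch flags both the `index` attribute and the space in the field name.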
Re: Help: Creating another handler and template to display document attributes
Erik, Thanks for the slides. I followed the example on pages 24-25 (maybe too rigidly). The first line is giving me trouble: #set($doc= $response.results.get(0)) This will always get the first document in the search results, which happens to be the first document I indexed. So, no matter which record I click on, I'm always taken to the details of that first record. I can't get to any other record even though I pass their ID to the new handler. So, my question is, is the line above supposed to give me the record I passed the ID for, or is it just an example to get "some" record? Here are the details of my implementation: (I realize I've got a space in the ID field "Entrez ID". However, that doesn't seem to be the problem since it was able to get the first record) I defined a SearchHandler named "/details". Example url: http://localhost:920/solr/details?id=7391 schema.xml: velocity details __this is configured in solrconfig.xml, str name="v.template.header"__ layout Details {!raw f="Entrez ID" v=$id} on 10 count Gene_Ontology_Associations Mouse_Phenotype_Associations BKL_Diagnostic_Marker_Associations BKL_Molecular_Mechanism_Associations BKL_Therapeutic_Target_Associations BKL_Negative_Correlation_Associations GAD_Positive_Disease_Associations GAD_Negative_Disease_Associations HuGENet_GeneProspector_Associations Expression_Specificity_according_to_GNF OMIM_Clinical_Synopses_Matches OMIM_Full_Text_Matches 1 on text features name 0 name spellcheck details.vm = #set($doc= $response.results.get(0)) $doc.getFieldValue('Entrez ID') #foreach($fieldname in $doc.fieldNames) $fieldname: #foreach($value in $doc.getFieldValues($fieldname)) $esc.html($value) #end #end From: Erik Hatcher To: solr-user@lucene.apache.org Sent: Wednesday, February 1, 2012 8:26 PM Subject: Re: Help: Creating another handler and template to display document attributes I'm not following exactly what you're after here in detail, but I think this will help: 
<http://www.slideshare.net/erikhatcher/rapid-prototyping-with-solr-5675936> See slides 24 and 25. Note the use of $id in the /document request handler definition using parameter substitution, a really cool technique. Erik On Feb 1, 2012, at 17:17 , Yuhao wrote: > Like the title says, I want to create a "page" to display a bunch of document > attributes. I accomplished this by creating a new handler and a template for > it. However, I'm having trouble pulling up the details of the document in > the new handler. Here's my code. Is this a good way to do it? I first pass > the doc ID to the handler, which I then use to pull up the details. > > > * BEGIN *** > #set($id = $params.get("id")) > ## Note: id is the same thing as "Entrez ID" > > > #foreach ($doc in $response.results) > #if ($doc.getFieldValue('Entrez ID') == $id) ## Only show attrs for the >current doc > #foreach ($field_name in $doc.getFieldNames()) > $field_name: $doc.getFieldValue($field_name) > #end > #end > #end > * END *** > > > This approach requires the document to be in the search results. The way I > pass the ID to the handler right now, is to simply add "id=$id" to the URL, > without the rest of the querystring that was used to conceive the current > query. I need to get the rest of the querystring and pass them along to the > handler to ensure the document is in the search result. However, I don't > know of a good way. I tried the following code: > > #set($querystring = "") > #foreach ($param in $request.params.getParameterNamesIterator()) > #set($querystring = $querystring + >"&$param=$esc.url($request.params.get($param))" ) > #end > > > Unfortunately, the above code returns a lot of unnecessary params that are > not part of the querystring, and it fails to put the document in the search > result. Is there a better way to get the URL querystring (and just the > querystring, not other environment parameters)?
Re: Help: Creating another handler and template to display document attributes
Erik,

You were right! The space in "Entrez ID" was the problem. It works fine after I got rid of all spaces and capital letters. Now I just have to come up with a way to display the original field names in the UI, which the users would prefer. Is there a way I can stick the display value (with spaces and capital letters) in the schema (or somewhere else) and pull it out at showtime?

Thanks.

From: Erik Hatcher
To: solr-user@lucene.apache.org
Sent: Thursday, February 2, 2012 10:32 AM
Subject: Re: Help: Creating another handler and template to display document attributes

There should only be one document matching that query (provided "Entrez ID" is your unique key field name). Using a space in a field name is perhaps the problem. It's very much best practice that field names contain only [a-zA-Z0-9_]. Maybe that space isn't the issue though, but try &debugQuery=true and see how the query parsed to see for sure.

Erik

On Feb 2, 2012, at 10:22 , Yuhao wrote:
> Erik,
>
> Thanks for the slides. I followed the example on pages 24-25 (maybe too
> rigidly). The first line is giving me trouble:
>
> #set($doc= $response.results.get(0))
>
> This will always get the first document in the search results, which happens
> to be the first document I indexed. So, no matter which record I click on,
> I'm always taken to the details of that first record. I can't get to any
> other record even though I pass their ID to the new handler. So, my question
> is, is the line above supposed to give me the record I passed the ID for, or
> is it just an example to get "some" record?
>
> Here are the details of my implementation:
> (I realize I've got a space in the ID field "Entrez ID". However, that
> doesn't seem to be the problem since it was able to get the first record)
>
> I defined a SearchHandler named "/details".
Example url: > http://localhost:920/solr/details?id=7391 > > > schema.xml: > > > > > velocity > > details > __this is configured in solrconfig.xml, >str name="v.template.header"__ > layout > Details > > {!raw f="Entrez ID" v=$id} > > on > 10 > count > Gene_Ontology_Associations > Mouse_Phenotype_Associations > BKL_Diagnostic_Marker_Associations > BKL_Molecular_Mechanism_Associations > BKL_Therapeutic_Target_Associations > BKL_Negative_Correlation_Associations > GAD_Positive_Disease_Associations > GAD_Negative_Disease_Associations > HuGENet_GeneProspector_Associations > Expression_Specificity_according_to_GNF > OMIM_Clinical_Synopses_Matches > OMIM_Full_Text_Matches > 1 > > > on > text features name > 0 > name > > > spellcheck > > > > > > > details.vm > = > #set($doc= $response.results.get(0)) > $doc.getFieldValue('Entrez > ID') > > #foreach($fieldname in $doc.fieldNames) > > $fieldname: > > #foreach($value in $doc.getFieldValues($fieldname)) > $esc.html($value) > #end > > > #end > > > > > > > From: Erik Hatcher > To: solr-user@lucene.apache.org > Sent: Wednesday, February 1, 2012 8:26 PM > Subject: Re: Help: Creating another handler and template to display document > attributes > > I'm not following exactly what you're after here in detail, but I think this > will help: > > <http://www.slideshare.net/erikhatcher/rapid-prototyping-with-solr-5675936> > > See slides 24 and 25. Note the use of $id in the /document request handler > definition using parameter substitution, a really cool technique. > > Erik > > On Feb 1, 2012, at 17:17 , Yuhao wrote: > >> Like the title says, I want to create a "page" to display a bunch of >> document attributes. I accomplished this by creating a new handler and a >> template for it. However, I'm having trouble pulling up the details of the >> document in the new handler. Here's my code. Is this a good way to do it? >> I first pass the doc ID to the handler, which I then use to pull up the >> details. 
>> >> >> * BEGIN *** >> #set($id = $params.get("id")) >> ## Note: id is the same thing as "Entrez
Re: Help: nothing is searchable in Solr
Erik,

Thanks for your suggestions. After I made all field names [a-zA-Z0-9_] and turned on debugQuery=true, I saw that the query was using something like "text^0.5", which is beyond my current comprehension. I commented out those "^0.5" type settings in solrconfig.xml. Now the search works a little better, but still far from perfect. Now if I search for a term, say "335", the query is:

    +DisjunctionMaxQuery((gene_symbol:335))

The field referenced above, "gene_symbol", is the default search field I set in schema.xml. Searching against this field alone is not the default search behavior I would like. What I'd like is that when I search for a term, Solr should search it against every indexed field. What's the best way to make that happen? I know one way is to set the default search field to the catch-all field "text", which gets populated by calling <copyField> for each document field. However I'm not sure if this is the best way.

From: Erick Erickson
To: solr-user@lucene.apache.org; Yuhao
Sent: Wednesday, February 1, 2012 7:57 PM
Subject: Re: Help: nothing is searchable in Solr

I really, really, really don't like the fact that you have a space in your field name. Adding &debugQuery=on to your query should show you the results of parsing the query. What I *expect*, but haven't tested, is one of two things:

1> the query parser interprets Entrez ID:335 as something like defaultsearchfield:Entrez ID:335; in fact, I'm surprised it doesn't throw an error unless you have a field named ID.
2> you aren't specifying the field in the first place and you're getting defaultsearchfield:335

Please do yourself a favor and use lowercase and underscores for your field names. Historically, there have been some corner cases where capitals produce surprising results, in some of the contribs as I remember.

If none of this is the problem, can you post the results of adding &debugQuery=on to your URL?

Best
Erick

On Wed, Feb 1, 2012 at 6:17 PM, Yuhao wrote:
> Oops, you're right about the typo!
However, after I changed it to: > > stored="true" required="true" /> > > > , searching for "335" still returns no result. I did delete the index and > re-index the documents after the change. Interestingly, adding * to the > search does produce results, and it seems to be the only way to find anything. > > * by itself finds 549/757 results. > *:* finds all 757 results. > *[a-zA-Z]* finds nothing. > *[0-9]* finds some results. For example, *33* finds 7 results, but it does > NOT find the doc with id=335. > > > The results are interesting because I definitely have many indexed fields > with [a-zA-Z] characters, but nothing at all is found. > > > > > From: Ahmet Arslan > To: solr-user@lucene.apache.org; Yuhao > Sent: Wednesday, February 1, 2012 5:59 PM > Subject: Re: Help: nothing is searchable in Solr > >> For example, I defined a field called "Entrez ID" in my >> schema.xml file: >> >> > type="string" index="true" stored="true" required="true" >> /> > > It could be the typo: index="true" should be indexed="true"
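For the question above about searching a term against several fields at once, one option (the one the follow-up in this thread settles on) is the edismax query parser with a qf list. The sketch below only builds the request parameters; `defType`, `q`, and `qf` are standard Solr parameters, while the helper name and the field names other than gene_symbol are hypothetical.

```python
from urllib.parse import urlencode


def edismax_params(user_query, fields, boosts=None):
    """Build edismax parameters that search `user_query` across `fields`.

    `boosts` optionally maps a field name to a boost, rendered as name^boost.
    """
    boosts = boosts or {}
    qf = " ".join(
        f"{name}^{boosts[name]}" if name in boosts else name for name in fields
    )
    return [("defType", "edismax"), ("q", user_query), ("qf", qf)]


if __name__ == "__main__":
    # entrez_id and description are assumed names for illustration.
    params = edismax_params(
        "335", ["entrez_id", "gene_symbol", "description"], {"gene_symbol": 2}
    )
    print(urlencode(params))
```

Compared with a single catch-all field, listing fields in qf keeps the option of weighting some fields more heavily than others later on.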
Re: Help: nothing is searchable in Solr
Erick (sorry for missing the "c" previously :D),

After playing around with the edismax query parser, I'm starting to like it. Originally I just wanted the simplest search feature to get started, but I can see that I might take advantage of edismax's field boosting feature later.

Turns out the trick to get my search working was to add all the document fields to the qf parameter. Previously I had commented this parameter out, so the edismax parser had no field to operate on, which explains why it grabbed the default search field. After I added the fields, the query does search against all of them now.

I'm learning a lot as I go :)

From: Erick Erickson
To: solr-user@lucene.apache.org; Yuhao
Sent: Thursday, February 2, 2012 1:56 PM
Subject: Re: Help: nothing is searchable in Solr

You're getting confused between default search fields and the dismax query parser. Look in your solrconfig.xml file and you'll see a request handler I think.

Take a look at: http://wiki.apache.org/solr/DisMaxQParserPlugin

I think this will do what you want. The catch-all field you mentioned is also possible, but the edismax style requests allow you to boost various fields separately, giving you finer control over the results.

Best
Erick

On Thu, Feb 2, 2012 at 1:49 PM, Yuhao wrote:
> Erik,
>
> Thanks for your suggestions. After I made all field names [a-zA-Z0-9_] and
> turned on debugQuery=true, I saw that the query was using something like
> "text^0.5", which is beyond my current comprehension. I commented out those
> "^0.5" type settings in solrconfig.xml. Now the search works a little
> better, but still far from perfect. Now if I search for a term, say "335",
> the query is:
>
>
> +DisjunctionMaxQuery((gene_symbol:335))
>
> The field referenced above, "gene_symbol", is the default search field I set
> in schema.xml. Searching against this field alone is not the default search
> behavior I would like. What I'd like is that when I search for a term, Solr
> should search it against every indexed field.
What's the best way to make > that happen? I know one way is to set the default search field to the > catch-all field "text", which gets populated by calling for each > document field. However I'm not sure if this is the best way. > > > > > > From: Erick Erickson > To: solr-user@lucene.apache.org; Yuhao > Sent: Wednesday, February 1, 2012 7:57 PM > Subject: Re: Help: nothing is searchable in Solr > > I really, really, really don't like the fact that you have a space in your > field name. Adding &debugQuery=on to your query should show > you the results of parsing the query. What I *expect*, but haven't > tested, is one of two things: > 1> the query parser interprets Entrez ID:335 as something like > defaultsearchfield:Entrez ID:335 > in fact, I'm surprised it doesn't throw an error unless you have > a field named ID.. > 2> you aren't specifying the field in the first place and you're > getting defaultsearchfield:335 > > > Please do yourself a favor and use lowercase and underscores for > your field names, historically, there have been some corner cases > where capitals produce surprising results, in some of the contribs as > I remember > > If none of this is the problem, can you post the results of adding > &debugQuery=on to your URL? > > Best > Erick > > On Wed, Feb 1, 2012 at 6:17 PM, Yuhao wrote: >> Oops, you're right about the typo! However, after I changed it to: >> >> > stored="true" required="true" /> >> >> >> , searching for "335" still returns no result. I did delete the index and >> re-index the documents after the change. Interestingly, adding * to the >> search does produce results, and it seems to be the only way to find >> anything. >> >> * by itself finds 549/757 results. >> *:* finds all 757 results. >> *[a-zA-Z]* finds nothing. >> *[0-9]* finds some results. For example, *33* finds 7 results, but it does >> NOT find the doc with id=335. 
>> >> >> The results are interesting because I definitely have many indexed fields >> with [a-zA-Z] characters, but nothing at all is found. >> >> >> >> >> From: Ahmet Arslan >> To: solr-user@lucene.apache.org; Yuhao >> Sent: Wednesday, February 1, 2012 5:59 PM >> Subject: Re: Help: nothing is searchable in Solr >> >>> For example, I defined a field called "Entrez ID" in my >>> schema.xml file: >>> >>> >> type="string" index="true" stored="true" required="true" >>> /> >> >> It could be the typo: index="true" should be indexed="true"
I want to specify multiple facet prefixes per field
I simulated a hierarchical faceted browsing scheme using facet.prefix. However, it seems there can only be one facet.prefix per field. For OR queries, the browsing scheme requires multiple facet prefixes. For example:

    fq=facet1:term1 OR facet1:term2 OR facet1:term3

Something like the above is very powerful. For the hierarchical browsing, at this point what I want is to show the child terms (one level down) of term1, term2 and term3 (but not term4, term5 or term6). Now, if I add a facet.prefix, say "f.facet1.facet.prefix=term1", it would give me all the child terms of term1, but I also want the children of term2 and term3. So what I want is to be able to do something like this: "f.facet1.facet.prefix=term1 OR term2 OR term3". Is there a way to accomplish what I'm looking for?
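One client-side workaround for the single-prefix limitation described above is to issue one facet request per prefix and merge the returned counts. The sketch below only builds the parameter lists and merges already-parsed counts; the helper names are hypothetical, and it assumes each prefix selects a disjoint set of child terms so that merging cannot overwrite anything.

```python
def prefix_requests(field, prefixes, base_params=()):
    """Return one parameter list per prefix, each faceting `field` with that prefix.

    `base_params` carries the shared parts of the query, e.g. q and the OR'ed fq.
    """
    return [
        list(base_params)
        + [
            ("facet", "true"),
            ("facet.field", field),
            (f"f.{field}.facet.prefix", prefix),
        ]
        for prefix in prefixes
    ]


def merge_counts(per_prefix_counts):
    """Merge term -> count mappings returned by the individual requests."""
    merged = {}
    for counts in per_prefix_counts:
        merged.update(counts)
    return merged


if __name__ == "__main__":
    base = [("q", "*:*"), ("fq", "facet1:term1 OR facet1:term2 OR facet1:term3")]
    for params in prefix_requests("facet1", ["term1", "term2", "term3"], base):
        print(params)
```

The cost is one round trip per expanded term, which may be acceptable when only a handful of terms are expanded at once.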
Range facet - Count in facet menu != Count in search results
I've changed the "facet.range.include" option to every possible value (lower, upper, edge, outer, all)**. It only changes the count shown in the "Ranges" facet menu on the left. It has no effect on the count and results shown in the search results, which are ALWAYS inclusive of both the lower AND upper bounds (equivalent to "include = all"). Is this by design? I would like to make the search results include the lower bound, but not the upper bound. Can I do that?

My range field is multi-valued, but I don't think that should be the problem.

** Actually, it doesn't like "outer" for some reason, which leaves the facet completely empty.
Re: Range facet - Count in facet menu != Count in search results
Jan,

Was the curly closing bracket "}" intentional? I'm using 3.4, which also supports "fq=price:[10 TO 20]". The problem is the results are not working properly.

From: Jan Høydahl
To: solr-user@lucene.apache.org; Yuhao
Sent: Thursday, February 9, 2012 7:45 PM
Subject: Re: Range facet - Count in facet menu != Count in search results

Hi,

If you use trunk (4.0) version, you can say fq=price:[10 TO 20} and have the upper bound be exclusive.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 10. feb. 2012, at 00:58, Yuhao wrote:
> I've changed the "facet.range.include" option to every possible value (lower,
> upper, edge, outer, all)**. It only changes the count shown in the "Ranges"
> facet menu on the left. It has no effect on the count and results shown in
> search results, which ALWAYS is inclusive of both the lower AND upper bounds
> (which is equivalent to "include = all"). Is this by design? I would like
> to make the search results include the lower bound, but not the upper bound.
> Can I do that?
>
> My range field is multi-valued, but I don't think that should be the problem.
>
> ** Actually, it doesn't like "outer" for some reason, which leaves the facet
> completely empty.
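The two ways of expressing a lower-inclusive, upper-exclusive filter that come up in this exchange can be sketched as a small helper. This is an illustration under assumptions, not a statement of how /browse builds its filters: the function name and the `mixed_brackets` flag are hypothetical, and the fallback form simply subtracts documents matching the upper bound exactly.

```python
def range_fq(field, lower, upper, mixed_brackets=False):
    """Build a filter query that includes `lower` and excludes `upper`.

    mixed_brackets=True uses the [a TO b} syntax mentioned for trunk (4.0).
    Otherwise an inclusive range is combined with a negative clause.
    """
    if mixed_brackets:
        return f"{field}:[{lower} TO {upper}}}"
    # Caveat: on a multi-valued field this also drops documents that contain
    # `upper` alongside another value that falls inside the range.
    return f"{field}:[{lower} TO {upper}] -{field}:{upper}"


if __name__ == "__main__":
    print(range_fq("price", 10, 20, mixed_brackets=True))
    print(range_fq("price", 10, 20))
```

Because the range field in question is multi-valued, the fallback form may exclude more documents than intended, so it is worth comparing its counts against the facet menu before relying on it.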