RE: Solr sort preferences number vs space vs character
No experience with this personally, but it seems like you are describing
https://cwiki.apache.org/confluence/display/solr/Language+Analysis#LanguageAnalysis-UnicodeCollation

- Andy -

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Monday, March 14, 2016 10:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr sort preferences number vs space vs character

On 3/14/2016 12:05 AM, vkrishna wrote:
> Hey Shawn,
>
> Is there any way to use ASCII? so I can get the result I want.

I do not know whether Solr has any config facility to incorporate a custom Lucene sorting class. I tried to look at the Lucene code to see if I could figure out how/where the sorting happens, but I couldn't decipher it.

ASCII wouldn't give you the result you want, though -- it sorts numbers before letters, and you want them after. You would likely need some VERY custom Lucene sort code ... but like I said above, I do not know if Solr has a way to plug that in.

Thanks,
Shawn
RE: Solr sort preferences number vs space vs character
Are you sorting against an untokenized field (either defined using the 'string' fieldType or a fieldType that is configured with KeywordTokenizerFactory)?

Solr will let you sort against a tokenized field. Not sure what happens internally when you do this, but the results will not be what you expect.

- Andy -

-----Original Message-----
From: vkrishna [mailto:vamsikrishna_t...@yahoo.com]
Sent: Monday, March 14, 2016 1:14 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr sort preferences number vs space vs character

Shawn,

I think you did see my required result order in my previous update (which is different from what I asked first): space > number > character. Sorry for the confusion.

Thanks,
Krishna.

On Mon, 3/14/16, Shawn Heisey-2 [via Lucene] wrote:

Subject: Re: Solr sort preferences number vs space vs character
To: "vkrishna"
Date: Monday, March 14, 2016, 9:58 AM

On 3/14/2016 10:28 AM, vkrishna wrote:
> I completely forgot to mention that this kind of sorting is working fine in
> version 1.4; now we are upgrading to 5.4. I know Solr made many changes
> in between, because it's been years. Do you know when and in which version
> they made changes to sorting?

Absolutely no idea. I would be *very* surprised to learn that numbers were sorted after letters in ANY version of Solr/Lucene. If I did see that, I think I would be looking to file a bug.
Thanks,
Shawn
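To make the ordering discussed in this thread concrete, here is a small client-side sketch in plain Python. It is not Solr or Lucene code and says nothing about how Solr sorts internally; it only shows that a plain code point (ASCII) comparison puts space before digits before letters, and what a hand-written key that puts letters before digits could look like. The sample values and the function name `letters_first_key` are made up for illustration.

```python
# Illustration only: plain Python sorting, not Solr/Lucene behaviour.
values = ["apple", "1 item", " leading space", "Zebra", "42"]

# Default string comparison uses code points: ' ' (32) < '0'-'9' < 'A'-'Z' < 'a'-'z'.
ascii_sorted = sorted(values)


def letters_first_key(text):
    """Hypothetical sort key: space first, then letters (case-insensitive), then digits."""
    def rank(ch):
        if ch == " ":
            return 0
        if ch.isalpha():
            return 1
        if ch.isdigit():
            return 2
        return 3  # everything else sorts last
    return [(rank(ch), ch.lower()) for ch in text]


letters_first = sorted(values, key=letters_first_key)

print(ascii_sorted)
print(letters_first)
```

Any ordering other than the code point one needs an explicit rule like the key function above, which is roughly why a collation-based field type or custom sort code comes up in the replies.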
RE: Multilevel nested level support using Solr
Don't know if this is what you are looking for, but we had a similar requirement. In our case each folder had a unique identifier associated with it.

When generating the Solr input document our code populated 2 fields, parent_folder, and folder_hierarchy (multi-valued), and for a document in the root->foo->bar folder added:

parent_folder:
folder_hierarchy:
folder_hierarchy:
folder_hierarchy:

At search time, if you wanted to restrict your search within the folder 'bar' we generated a filter query for either 'parent_folder:' or 'folder_hierarchy:' depending on whether you wanted only documents directly under the 'bar' folder (your case 3), or at any level underneath 'bar' (your case 1).

If your folders don't have unique identifiers then you could achieve something similar by indexing the folder paths in string fields:

parent_folder:root|foo|bar
folder_hierarchy:root|foo|bar
folder_hierarchy:root|foo
folder_hierarchy:root

and generating a fq for either 'parent_folder:root|foo|bar' or 'folder_hierarchy:root|foo|bar'

If you didn't want to have to generate all the permutations for the folder_hierarchy field before sending the document to Solr for indexing you should be able to do something like:

In which case you could just send in the 'parent_folder' field and Solr would generate the folder_hierarchy field.

For cases 2 and 4 you could do something similar by adding 2 additional fields that just index the folder names instead of the paths.

- Andy -

-----Original Message-----
From: Steven White [mailto:swhite4...@gmail.com]
Sent: Monday, April 20, 2015 9:49 AM
To: solr-user@lucene.apache.org
Subject: Re: Multilevel nested level support using Solr

Resending to see if anyone can help.

Thanks

Steve

On Fri, Apr 17, 2015 at 12:14 PM, Steven White wrote:
> Hi folks,
>
> In my DB, my records are nested in a folder-based hierarchy:
>
>
>
> record_1
> record_2
>
> record_3
> record_4
>
> record_5
>
>
>
> record_6
> record_7
> record_8
>
> You got the idea.
>
> Is there anything in Solr that will let me preserve this structure and
> thus, when I'm searching, tell it at which level to narrow down the
> search? I have four search level needs:
>
> 1) Be able to search inside only level: ...*
> (and everything under Level_2 from this path).
>
> 2) Be able to search inside a level regardless of its path: .*
> (no matter where is, I want to search on all records under
> Level_2 and everything under its path).
>
> 3) Same as #1 but limit the search to within that level (nothing below
> its level is searched).
>
> 4) Same as #3 but limit the search to within that level (nothing below
> its level is searched).
>
> I found this:
> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-NestedChildDocuments
> but it looks like it supports one level only and requires the whole
> two levels be updated even if 1 of the docs in the nest is updated.
>
> Thanks
>
> Steve
>
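As a rough sketch of the path-based variant described in the reply above, the following Python builds the parent_folder and folder_hierarchy values for a document and the two kinds of filter query. The helper names (`folder_fields`, `folder_filter`) are hypothetical, the '|' separator is taken from the example paths in the reply, and wrapping the value in double quotes in the fq is an assumption made here so the path is treated as one term; it is client-side code, not anything Solr provides.

```python
def folder_fields(path, sep="|"):
    """Return the two field values for a document stored in the folder `path`.

    folder_hierarchy lists the folder itself plus every ancestor path,
    so a match on any of them means "somewhere underneath that folder".
    """
    parts = path.split(sep)
    hierarchy = [sep.join(parts[:i]) for i in range(len(parts), 0, -1)]
    return {"parent_folder": path, "folder_hierarchy": hierarchy}


def folder_filter(path, include_subfolders):
    """Build a filter query (fq) string for one folder.

    include_subfolders=False -> only documents directly in the folder (case 3)
    include_subfolders=True  -> documents at any level underneath it (case 1)
    """
    field = "folder_hierarchy" if include_subfolders else "parent_folder"
    return '%s:"%s"' % (field, path)


doc_fields = folder_fields("root|foo|bar")
direct_fq = folder_filter("root|foo|bar", include_subfolders=False)
subtree_fq = folder_filter("root|foo", include_subfolders=True)

print(doc_fields)
print(direct_fq)
print(subtree_fq)
```

Generating the ancestor paths on the client keeps the schema simple (two plain string fields) at the cost of sending a few extra values per document.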
RE: Solr query which return only those docs whose all tokens are from given list
Based on his example, it sounds like Naresh not only wants the tags field to contain at least one of the values [T1, T2, T3] but also wants to exclude documents that contain a tag other than T1, T2, or T3 (Doc3 should not be retrieved).

If the set of possible values in the tags field is limited and known, you could use a NOT (or '-') clause to accomplish this. If there were 5 possible tag values:

tags:((T1 OR T2 OR T3) NOT (T4 OR T5))

However this doesn't seem practical if the number of possible values is large or unlimited. Perhaps something could be done with range queries:

tags:((T1 OR T2 OR T3) NOT ([* TO T1} OR {T1 TO T2} OR {T2 TO T3} OR {T3 TO *]))

however this would require whatever is constructing the query to be aware of the lexical ordering of the terms in the index.

Maybe there are more elegant solutions, but I am not aware of them.

- Andy -

-----Original Message-----
From: sujitatgt...@gmail.com [mailto:sujitatgt...@gmail.com] On Behalf Of Sujit Pal
Sent: Monday, May 11, 2015 10:40 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr query which return only those docs whose all tokens are from given list

Hi Naresh,

Couldn't you just model this as an OR query since your requirement is at least one (but can be more than one), i.e.:

tags:T1 tags:T2 tags:T3

-sujit

On Mon, May 11, 2015 at 4:14 AM, Naresh Yadav wrote:
> Hi all,
>
> Also asked this here : http://stackoverflow.com/questions/30166116
>
> For example I have Solr docs in which the tags field is indexed :
>
> Doc1 -> tags:T1 T2
>
> Doc2 -> tags:T1 T3
>
> Doc3 -> tags:T1 T4
>
> Doc4 -> tags:T1 T2 T3
>
> Query1 : get all docs with "tags:T1 AND tags:T3" then it works and
> will give Doc2 and Doc4
>
> Query2 : get all docs whose tags must be one of these [T1, T2, T3]
> Expected is : Doc1, Doc2, Doc4
>
> How to model Query2 in Solr ?? Please help me on this ?
>
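The NOT-clause approach from the reply above can be sketched in Python as follows, assuming the full set of possible tag values is known to the client. The function name `only_these_tags_query` is made up, the tag values are assumed to need no query-syntax escaping, and the subset check at the end is just a local simulation of the expected result for the four example documents, not a call to Solr.

```python
def only_these_tags_query(field, allowed, all_tags):
    """Build a query matching docs that have at least one allowed tag and no other tag.

    Assumes `all_tags` is the complete, known set of values in the field.
    """
    allowed = sorted(set(allowed))
    excluded = sorted(set(all_tags) - set(allowed))
    clause = "(" + " OR ".join(allowed) + ")"
    if excluded:
        clause += " NOT (" + " OR ".join(excluded) + ")"
    return "%s:(%s)" % (field, clause)


query = only_these_tags_query("tags", ["T1", "T2", "T3"], ["T1", "T2", "T3", "T4", "T5"])

# Local check of the intended semantics against the example docs from the thread.
docs = {
    "Doc1": {"T1", "T2"},
    "Doc2": {"T1", "T3"},
    "Doc3": {"T1", "T4"},
    "Doc4": {"T1", "T2", "T3"},
}
allowed_tags = {"T1", "T2", "T3"}
expected_hits = [name for name, tags in docs.items() if tags and tags <= allowed_tags]

print(query)
print(expected_hits)
```

As noted in the reply, this only stays practical while the excluded list is short.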
RE: Solr Exact match boost Reduce the results
If I understand you correctly you want to boost the score of documents where the contents of the product_name field match exactly (other than case) the query string.

I think what you need is for the dummy_name field to be non-tokenized (indexed as a single string rather than parsed into individual words). The name of the field type you have configured the dummy_name field to use (string_ci) would seem to indicate this is your intent.

However the definition of string_ci doesn't match the name. It is configured to use the WhitespaceTokenizerFactory tokenizer, which will break the contents of the field up into multiple tokens wherever white space occurs.

Try defining string_ci using the (somewhat cryptically named) KeywordTokenizerFactory, which will index the entire contents of the field as a single token. Something like:

- Andy -

-----Original Message-----
From: JACK [mailto:mfal...@gmail.com]
Sent: Friday, June 12, 2015 12:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Exact match boost Reduce the results

As explained above, I actually have around 10 lakh records, not 5 rows. It's not about synonyms. When I checked the FAQ page of the Solr wiki, I found that if we need to get exact match results first, we should use a copy field with a different configuration. That's why I followed this way.
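The difference between the two tokenizers can be imitated in a few lines of plain Python. This is only a rough simulation of what each analysis chain would emit, not Solr's implementation; the lowercasing step is an assumption suggested by the "_ci" in the field type name, and the function names and sample product names are invented for the example.

```python
def whitespace_lowercase_tokens(text):
    """Roughly what a whitespace tokenizer plus lowercasing would index: one token per word."""
    return text.lower().split()


def keyword_lowercase_tokens(text):
    """Roughly what a keyword tokenizer plus lowercasing would index: the whole value as one token."""
    return [text.lower()]


def all_query_tokens_present(tokenize, field_value, query):
    """True if every token of the analyzed query also appears in the analyzed field value."""
    indexed = set(tokenize(field_value))
    return all(token in indexed for token in tokenize(query))


# With word-level tokens, a longer product name still "matches" the shorter query.
loose_match = all_query_tokens_present(whitespace_lowercase_tokens, "Red Shoes Sale", "red shoes")

# With a single token per value, only the full name (ignoring case) matches.
strict_miss = all_query_tokens_present(keyword_lowercase_tokens, "Red Shoes Sale", "red shoes")
strict_hit = all_query_tokens_present(keyword_lowercase_tokens, "Red Shoes", "red shoes")

print(loose_match, strict_miss, strict_hit)
```

This is why a copy field meant for exact-match boosting needs the single-token behaviour: otherwise every document containing the query words gets the boost.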
RE: [E] Re: Faceting Question(s)
It is possible to get the original facet counts for the field you are filtering on (we have been using this since Solr 3.6). Don't know if this can be extended to get the original counts for all fields however.

This syntax is described here: https://cwiki.apache.org/confluence/display/solr/Faceting

Tagging and Excluding Filters

You can tag specific filters and exclude those filters when faceting. This is useful when doing multi-select faceting.

Consider the following example query with faceting:

q=mainquery&fq=status:public&fq=doctype:pdf&facet=true&facet.field=doctype

Because everything is already constrained by the filter doctype:pdf, the facet.field=doctype facet command is currently redundant and will return 0 counts for everything except doctype:pdf.

To implement a multi-select facet for doctype, a GUI may want to still display the other doctype values and their associated counts, as if the doctype:pdf constraint had not yet been applied. For example:

=== Document Type ===
  [ ] Word (42)
  [x] PDF (96)
  [ ] Excel(11)
  [ ] HTML (63)

To return counts for doctype values that are currently not selected, tag filters that directly constrain doctype, and exclude those filters when faceting on doctype.

q=mainquery&fq=status:public&fq={!tag=dt}doctype:pdf&facet=true&facet.field={!ex=dt}doctype

Filter exclusion is supported for all types of facets. Both the tag and ex local parameters may specify multiple values by separating them with commas.

- Andy -

-----Original Message-----
From: Robert Brown [mailto:r...@intelcompute.com]
Sent: Thursday, June 02, 2016 2:12 PM
To: solr-user@lucene.apache.org
Subject: Re: [E] Re: Faceting Question(s)

MaryJo, I think you've misunderstood. The counts are different simply because the 2nd query contains a filter on a facet value from the 1st query - that's completely expected.

The issue is how to get the original facet counts (with no filters but same q) in the same call as also filtering by one of those facet values.
Personally I don't think it's possible, but will be interested to hear others' input, since it's a very common situation for me - I cache the first result in memcached and tag future queries as related to the first.

Or you could always make 2 calls back to Solr (one original (again), and one with the filters); the caches should help massively.

On 02/06/16 19:07, MaryJo Sminkey wrote:
> And you're saying the count for the second query is different than what was
> returned in the facet? You may need to check for any defaults you have set
> up in the solrconfig for the select parser, if for instance you have any
> grouping going on, but aren't doing grouping in your facet, that could
> result in the counts being off.
>
> MJ
>
> On Thu, Jun 2, 2016 at 2:01 PM, Jamal, Sarfaraz <
> sarfaraz.ja...@verizonwireless.com.invalid> wrote:
>
>> Absolutely,
>>
>> Here is what it looks like:
>>
>> This brings the right counts as it should
>> http://
>> **select?q=video&hl=true&hl.fl=*&hl.snippets=20&facet=true&facet.field=team
>>
>> Then when I specify which team
>> http://
>> **select?q=video&hl=true&hl.fl=*&hl.snippets=20&facet=true&facet.field=team&fq=team:rollback
>>
>> The counts are obviously different now, as the result set is limited to
>> one team.
>>
>> Sas
>>
>> -----Original Message-----
>> From: MaryJo Sminkey [mailto:mjsmin...@gmail.com]
>> Sent: Thursday, June 2, 2016 1:56 PM
>> To: solr-user@lucene.apache.org
>> Subject: [E] Re: Faceting Question(s)
>>
>> Jamal - what is your q= set to? And do you have a fq for the original
>> query? I have found that if you do a wildcard search (*:*) you have to be
>> careful about other parameters you set as that can often result in the
>> numbers returned being off. In my case, my defaults had things like edismax
>> settings for phrase boosting, etc. that don't apply if there isn't a search
>> term, and once I removed those for a wildcard search I got the correct
>> numbers.
>> So possibly your facet query itself may be set up correctly but
>> something else in the parameters and/or filters with the two queries may be
>> the cause of the difference.
>>
>> Mary Jo
>>
>> On Thu, Jun 2, 2016 at 1:47 PM, Jamal, Sarfaraz <
>> sarfaraz.ja...@verizonwireless.com.invalid> wrote:
>>
>>> Hello Everyone,
>>>
>>> I am working on implementing some basic faceting into my project.
>>>
>>> I have it working the way I want to, but I feel like there is probably
>>> a better way than the way I went about it.
>>>
>>> * I want to show a category and its count.
>>> * when someone clicks a category, it sets a FQ= to that category.
>>>
>>> But now that the results are being filtered, the category counts from
>>> the original query without the filters are off.
>>>
>>> So, I have a single api call that I make with rows set to 0 and the
>>> base query without any filters, and use that to display my categories.
>>>
>>> And then I call the api again, this time to get the results. And the
>>> cate
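Applied to the team facet from this thread, the tag/exclude syntax quoted from the reference guide could be assembled on the client roughly as below. This is a sketch only: the function name `multiselect_facet_params` and the tag name "sel" are arbitrary choices made here, and the host and core part of the URL is left out because it does not appear in the thread.

```python
from urllib.parse import urlencode, parse_qsl


def multiselect_facet_params(query, field, selected=None, tag="sel"):
    """Build request parameters that filter on `field` but still facet on all its values.

    The filter is tagged with {!tag=...} and the facet excludes that tag with
    {!ex=...}, so the counts come back as if the filter had not been applied.
    """
    params = [("q", query), ("facet", "true")]
    if selected is None:
        # Nothing selected yet: a plain facet is enough.
        params.append(("facet.field", field))
    else:
        params.append(("fq", "{!tag=%s}%s:%s" % (tag, field, selected)))
        params.append(("facet.field", "{!ex=%s}%s" % (tag, field)))
    return params


params = multiselect_facet_params("video", "team", selected="rollback")
query_string = urlencode(params)  # percent-encodes the braces, '!' and ':'

print(query_string)
```

With this approach a single request returns both the filtered results and the unfiltered counts for the tagged field, so the separate rows=0 call is not needed for that field.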
RE: Are there issues with the use of SolrCloud / embedded Zookeeper in non-HA deployments?
Thanks Markus, Scott, and Erick,

I appreciate the input.

Scott, I am not clear what you meant by "One reason is that zkServer.cmd tells the process that run Zookeeper by judging the DOS window title. However, according to what version of Windows you use and how you start DOS window, it could be wrong.". Can you explain further?

I looked for your post on modifying the script but was unable to find it. Can you provide a link?

Thanks again,
Andy

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, July 28, 2016 1:01 PM
To: solr-user
Subject: Re: Are there issues with the use of SolrCloud / embedded Zookeeper in non-HA deployments?

I can certainly say that external Zookeepers get _waaay_ more testing/use than embedded. While I don't know of any _specific_ issues, embedded is largely intended for ease of first use.

I think my argument would be that in the case where you have a customer migrating from one node to many, if you _already_ have an external ZK then that transition would be much easier. Even if the external ZK is just another Java program running on the same physical node...

FWIW,
Erick

On Thu, Jul 28, 2016 at 8:44 AM, Markus Jelsma wrote:
> Hello - all our production environments are deployed as a cloud, even when
> just a single Solr instance is used. We did this for the purpose of having a
> single method of deployment / provisioning and just because we have the
> option to add replicas with ease if we need to.
>
> We never use embedded Zookeeper.
>
> Markus
>
> -----Original message-----
>> From: Andy C
>> Sent: Thursday 28th July 2016 17:38
>> To: solr-user@lucene.apache.org
>> Subject: Are there issues with the use of SolrCloud / embedded Zookeeper in
>> non-HA deployments?
>>
>> We have integrated Solr 5.3.1 into our product. During installation
>> customers have the option of setting up a single Solr instance, or
>> for high availability deployments, multiple Solr instances in a
>> master/slave configuration.
>>
>> We are looking at migrating to SolrCloud for HA deployments, but are
>> wondering if it makes sense to also use SolrCloud in non-HA deployments?
>>
>> Our thought is that this would simplify things. We could use the same
>> approach for deploying our schema.xml and other configuration files
>> on all systems, we could always use the SolrJ CloudSolrClient class
>> to communicate with Solr, etc.
>>
>> Would it make sense to use the embedded Zookeeper instance in this
>> situation? I have seen warnings that the embedded Zookeeper should not
>> be used in production deployments, but the reason generally given is
>> that if Solr goes down Zookeeper will also go down, which doesn't
>> seem relevant here. Are there other reasons not to use the embedded
>> Zookeeper?
>>
>> More generally, are there downsides to using SolrCloud with a single
>> Zookeeper node and single Solr node?
>>
>> Would appreciate any feedback.
>>
>> Thanks,
>> Andy
>>