How to achieve join like behavior on solr-cloud
Hello, I am aware that Solr (I am using 5.2) does not support joins in distributed search when the documents to be joined reside on different shards or collections. My use case: I want to fetch the uuid of every document that matches a search, and also of those documents that fall outside the search but have the same "related" field value as one of the search results. This is a typical join scenario. Is there some way to achieve this with the streaming API, or some other approach? Thanks. Alok -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-achieve-join-like-behavior-on-solr-cloud-tp4247703.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to achieve join like behavior on solr-cloud
Hi Dennis, thanks for your reply. As I need this for a production system, I may not be able to upgrade to an under-development branch of Solr, but thanks a lot for pointing me to this possible approach. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-achieve-join-like-behavior-on-solr-cloud-tp4247703p4247896.html
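One approach that should work on Solr 5.2 without streaming is a client-side two-step lookup: run the search, collect the distinct "related" values from the results, then issue a second query for every document sharing one of those values. The sketch below only builds that second query string. The field name related comes from the question; the helper name and the sample documents are hypothetical.

```python
def build_related_query(docs, field="related"):
    """Build a query matching any document whose `field` equals a value
    seen in the first-pass results. Returns None if there is nothing to join on."""
    values = []
    for doc in docs:
        value = doc.get(field)
        if value is not None and value not in values:
            values.append(value)  # keep first-seen order, drop duplicates
    if not values:
        return None
    quoted = []
    for value in values:
        # Inside a quoted phrase only backslash and double quote need escaping.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return field + ":(" + " OR ".join(quoted) + ")"


# Hypothetical first-pass results; only uuid and related are fetched.
first_pass = [
    {"uuid": "u1", "related": "r-100"},
    {"uuid": "u2", "related": "r-200"},
    {"uuid": "u3", "related": "r-100"},
]
print(build_related_query(first_pass))
```

This is only practical while the number of distinct values stays modest, since a long OR list can run into the maxBooleanClauses limit.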
How to use DocValues with TextField
Hello, I have a field defined as a TextField with a PatternTokenizer that splits on ";". For one of my use cases I need to export this field with the /export handler. The /export handler requires the field to support docValues, but when I mark the field as docValues="true", Solr reports that TextField does not support docValues. A string field, on the other hand, does not support tokenizers. I am thinking of adding a copyField target of type string, so that I query on the original field but return the string copy, which can be marked docValues="true". This looks like it solves my issue; is any better approach known? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-use-DocValues-with-TextField-tp4248647.html
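A minimal sketch of how the request side of the copyField idea could look: search on the tokenized field, export the string copy. All names here (host, collection, tags_txt, tags_str, id) are hypothetical, and the sketch assumes both the exported field and the sort field have docValues enabled.

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, query, export_field, sort_field):
    """Build an /export request that searches one field and returns another."""
    params = [
        ("q", query),                   # searched against the tokenized TextField
        ("fl", export_field),           # returned from the docValues string copy
        ("sort", sort_field + " asc"),  # /export needs an explicit sort
    ]
    return base_url + "/" + collection + "/export?" + urlencode(params)


url = build_export_url(
    "http://localhost:8983/solr",  # hypothetical host
    "mycollection",                # hypothetical collection
    "tags_txt:foo",
    "tags_str",
    "id",
)
print(url)
```

One thing to keep in mind: copyField copies the raw input, not the analyzed tokens, so the string copy would hold the whole ";"-separated value rather than the individual parts.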
Re: How to use DocValues with TextField
Thanks Erick. Yes, my question was not clear: I want the TextField to stay searchable. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-use-DocValues-with-TextField-tp4248647p4248796.html
RE: How to use DocValues with TextField
Thanks Markus. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-use-DocValues-with-TextField-tp4248647p4248797.html
Solr /export handler is exporting only unique values from multivalued field?
Hello, I am using the Solr /export handler to export search results, and it is performing well. Today I ran into an issue. I am fetching two multivalued fields: one holds a list of items and the other holds a list of sellers. The data is stored positionally, so the seller for the 1st item is the 1st entry in the seller list: item1--1, item2--1, item3--2. I expect the two lists to be the same size, but when I export I get 3 entries in the item list and only 2 entries in the seller list. The entries are also sorted, so I cannot find the seller for item1 and item2. This is a priority for me and I am stuck; can someone please help? I am using Solr 5.2. Thanks, Alok -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-export-handler-is-exporting-only-unique-values-from-multivalued-field-tp4249986.html
Re: Solr /export handler is exporting only unique values from multivalued field?
Thanks Joel. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-export-handler-is-exporting-only-unique-values-from-multivalued-field-tp4249986p4250067.html
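The behaviour described in this thread is consistent with how multivalued docValues are held, as far as I understand it: as a sorted set, so duplicates collapse and insertion order is lost. The sketch below mimics that in plain Python and shows one possible workaround, storing each item/seller pair as a single value. The separator and helper names are hypothetical, and the workaround assumes the separator never occurs in an item name.

```python
def as_sorted_set(values):
    """Mimic a sorted-set view of a multivalued field: unique values, sorted."""
    return sorted(set(values))


def pair_values(items, sellers, sep="|"):
    """Encode positional pairs as single values so the pairing survives."""
    if len(items) != len(sellers):
        raise ValueError("items and sellers must have the same length")
    return [item + sep + seller for item, seller in zip(items, sellers)]


def unpair_values(pairs, sep="|"):
    """Decode the paired values back into an item -> seller mapping."""
    return dict(pair.split(sep, 1) for pair in pairs)


items = ["item1", "item2", "item3"]
sellers = ["1", "1", "2"]

print(as_sorted_set(sellers))                      # the duplicate seller collapses
exported = as_sorted_set(pair_values(items, sellers))
print(unpair_values(exported))                     # pairing is still recoverable
```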
How to achieve exact string match query which includes spaces and quotes
Hello, I am using Solr 5.2. I have a field defined with the "string" field type. It holds values such as DOC-1 => abc ".. I am " not ? test DOC-2 => abc ".. Each value is a single string. I want to query for all documents that exactly match such a string, i.e. I should get only DOC-1 when I query for 'abc ".. I am " not ? test' and only DOC-2 when I query for 'abc"...'. Please let me know how I can achieve this and which defType I should use. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-achieve-exact-string-match-query-which-includes-spaces-and-quotes-tp4250402.html
Re: How to achieve exact string match query which includes spaces and quotes
Hi Binoy, thanks. But does it matter which query parser I use? Should I use the "lucene" parser or the "edismax" parser? -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-achieve-exact-string-match-query-which-includes-spaces-and-quotes-tp4250402p4250405.html
Re: How to achieve exact string match query which includes spaces and quotes
Thanks Erick for your reply. I was out of office for a week for medical reasons. Should I use the ClientUtils.escapeQueryChars method from the SolrJ client, or do you think it is better to escape only the quote (") character? -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-achieve-exact-string-match-query-which-includes-spaces-and-quotes-tp4250402p4252217.html
Re: How to achieve exact string match query which includes spaces and quotes
Hello Binoy, I found that when I index a StringField value through Java code or the Solr admin UI, a \ is added before each ", i.e. if the string is test " then it gets indexed as test \". Nothing is done for other special characters. So the trick that worked for me is to replace " with \" at search time, using text.replace("\"", "\\\""). This makes sure the query matches the intended string. If I used ClientUtils instead, I would also need to change the indexing code to escape special characters for exact search to work. So simply replacing " with \" does the trick for me. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-achieve-exact-string-match-query-which-includes-spaces-and-quotes-tp4250402p4252522.html
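A minimal sketch of the quoting trick from this thread, written in Python rather than Java for brevity. It wraps the value in double quotes and escapes only the two characters that are special inside a quoted phrase, the backslash and the double quote. The field name title_s and the helper name are hypothetical.

```python
def exact_match_query(field, value):
    """Build field:"value" with backslash and double quote escaped."""
    # Escape backslashes first so the ones added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return field + ':"' + escaped + '"'


print(exact_match_query("title_s", 'abc ".. I am " not ? test'))
```

Because the whole value sits inside a phrase, characters such as ? and spaces should not need separate escaping here, which is why escaping every special character is not required for this case.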
Re: How to use DocValues with TextField
Hello Harry, sorry for the delayed reply. As I did not have a solution for this, I took another approach and gave the user a different way of working. But your option looks great; I will try it out. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-use-DocValues-with-TextField-tp4248647p4256316.html
Multivalued fields order of storing is guaranteed ?
Hello, I am using Solr 5.10 and have a use case to fit in. Say I define 2 fields, group-name and group-id, both multivalued and stored. 1) I add the values group-name {a,b,c} and group-id {1,2,3}. 2) I then want to add a new value to each field, {d} and {4}. My requirement is that when I query these 2 fields they return {a,b,c,d} and {1,2,3,4} in this order, i.e. a<=>1, d<=>4. Is it guaranteed that stored multivalued fields maintain insertion order, or do I need to handle this scenario explicitly? Any help is appreciated. Thanks, Alok -- View this message in context: http://lucene.472066.n3.nabble.com/Multivalued-fields-order-of-storing-is-guaranteed-tp4212383.html
Re: Multivalued fields order of storing is guaranteed ?
Thanks Yonik. -- View this message in context: http://lucene.472066.n3.nabble.com/Multivalued-fields-order-of-storing-is-guaranteed-tp4212383p4212428.html
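For step 2 of the question, one way to append a value to both fields in a single request is an atomic update with the "add" modifier. The sketch below only builds the JSON payload; the unique key name id and the document id are assumptions, and atomic updates generally need the other fields of the document to be stored.

```python
import json


def build_append_update(doc_id, additions):
    """Build an atomic-update payload that appends one value per field."""
    doc = {"id": doc_id}  # "id" as the unique key is an assumption
    for field, value in additions.items():
        doc[field] = {"add": value}
    return json.dumps([doc])


payload = build_append_update("doc-1", {"group-name": "d", "group-id": "4"})
print(payload)
```

Sending both additions in the same update keeps the two lists the same length, so position n in group-name still lines up with position n in group-id.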
how to get unique latest results from solr
Hello all, I am using Solr 4.0. In my index, each review of a document adds a new entry for that document. Each entry has an employee_id field and a timestamp recording when the entry was added. I want to query this index by specifying a time range t1-t2 and get the latest entry for an employee_id within that range. A document might have been updated multiple times in the given range, but I want only the latest entry. Is there any known way to achieve this? Thanks, Alok -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-get-unique-latest-results-from-solr-tp4080034.html
Re: how to get unique latest results from solr
Thanks Jack. I may not have explained the query correctly. I do not want this for a single employee; I want it for all employees updated in that time range. So if, say, the data of 10 employees is updated in the given time range, each possibly multiple times, I want the latest entry per employee within that range. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-get-unique-latest-results-from-solr-tp4080034p4080052.html
Re: how to get unique latest results from solr
Thanks Erick. The approach you suggested seems to be the one I was looking for; thanks a lot for the reply. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-get-unique-latest-results-from-solr-tp4080034p4080228.html
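Erick's reply is not reproduced in this archive, so the following is only one way to express "latest entry per employee in a time range", using result grouping. The field name employee_id comes from the question; the field name timestamp and the sample dates are assumptions.

```python
from urllib.parse import urlencode


def latest_per_employee_params(start, end):
    """Request parameters: one group per employee, newest entry first."""
    return [
        ("q", "*:*"),
        ("fq", "timestamp:[" + start + " TO " + end + "]"),  # restrict to t1-t2
        ("group", "true"),
        ("group.field", "employee_id"),      # one group per employee
        ("group.sort", "timestamp desc"),    # newest entry first inside a group
        ("group.limit", "1"),                # keep only that newest entry
    ]


params = latest_per_employee_params("2013-07-01T00:00:00Z", "2013-07-31T23:59:59Z")
print(urlencode(params))
```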
Re: Memory Cost of group.cache.percent parameter
Hello, I need to know the same thing, as I am also stuck on the decision of whether to use grouping; with a smaller data set it seems to perform well. Also, is there any way to specify that group caching should be used depending on the RAM allocated to it? -- View this message in context: http://lucene.472066.n3.nabble.com/Memory-Cost-of-group-cache-percent-parameter-tp4012967p4082472.html
Solr grouping performance
Hello, I need some functionality for which grouping looks like the best-suited feature, and I want to know about the performance issues associated with it. Some posts say that performance is a bottleneck, but if I have 3 million records with 0.5 million distinct values for group.value, can I expect results in 2-3 seconds? The grouping field is an "int", and I want only one field per document. I can afford to use up to 4GB of RAM. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-grouping-performace-tp4082480.html
Concat 2 fields in another field
Hello all, I am using Solr 4.x and need a field that holds the data of 2 other fields concatenated with _. For example, given the 2 fields firstName and lastName, I want a third field that holds firstName_lastName. Is there an existing concatenating component, or do I need to write a custom updateProcessor for this? By the way, I need this third field because I want to group on firstName and lastName, and since grouping does not support multiple fields forming a single group, I am using this trick. Hope I am clear. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Concat-2-fields-in-another-field-tp4086786.html
Re: Concat 2 fields in another field
Thanks for the reply. But I don't want to introduce any scripting in my code, so I want to know whether a Java component is available for this. -- View this message in context: http://lucene.472066.n3.nabble.com/Concat-2-fields-in-another-field-tp4086786p4086791.html
Re: Concat 2 fields in another field
Hi all, thanks for your replies. I have managed to do this by writing a custom update processor and configuring it as below: firstName lastName fullName _ . Federico Chiacchiaretta, I tried the option you mentioned, but on frequent updates of the document it keeps adding the value multiple times, which I don't want. In my custom component I check the existing value and, only if it is empty, set it to fN_lN. Thanks a lot for the quick replies. -- View this message in context: http://lucene.472066.n3.nabble.com/Concat-2-fields-in-another-field-tp4086786p4086934.html
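The custom processor itself is Java and is not shown in the message, so the following is only a Python sketch of the logic described: set the target field from the two sources, but only while it is still empty, so repeated updates do not append the value again. The function name and the dictionary stand-in for a document are hypothetical.

```python
def add_full_name(doc, sources=("firstName", "lastName"), target="fullName", sep="_"):
    """Set doc[target] to the joined source values unless it is already set."""
    if doc.get(target):
        return doc  # already populated: leave it alone on repeated updates
    parts = [doc.get(name) for name in sources]
    if all(parts):  # only concatenate when every source value is present
        doc[target] = sep.join(parts)
    return doc


doc = add_full_name({"firstName": "Ada", "lastName": "Lovelace"})
print(doc["fullName"])
print(add_full_name(doc)["fullName"])  # a second pass changes nothing
```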
Re: massive memory consumption of grouping feature
Did you find any solution to this? I am also facing the same issue. -- View this message in context: http://lucene.472066.n3.nabble.com/massive-memory-consumption-of-grouping-feature-tp4031895p4093297.html
Solr grouping performance
Hello, I am using Solr 4.0 and want to group entries on one of the int fields. I need all the groups, and group.limit is 1. Performance is very slow, and sometimes I also get an OutOfMemory error. My index has 20 million records; my search returns 1 million documents, and I group on those 1 million docs. The size of the data on disk is approx 2GB, and I have Xmx set to 2GB. Can anyone please help me with this? A query takes 10-12 seconds. Thanks, Alok -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-grouping-performace-tp4093300.html
Re: AW: Solr grouping performance
Thanks for the reply, Sandro. My requirement is that I need all groups, from which I then build compact data to send to the server. I am not sure how much RAM should be allocated to the JVM instance to make it serve requests faster; any input on that is welcome. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-grouping-performace-tp4093300p4093311.html
Can I pass some Object as request parameter to solr server
Hello, I am using Solr 4.0 and want to send a list of objects to Solr as a request parameter. Is it possible? Please let me know. -- View this message in context: http://lucene.472066.n3.nabble.com/Can-I-pass-some-Object-as-request-parameter-to-solr-server-tp4093463.html
Faceting on tree structure in SOLR4
Hello, I have a tree data structure like t1 |-t2 |-t3 t4 |-t5 and so on. There is no limit on the tree depth or on the number of children of each node. When I facet on the parent node t1, the count should also include the counts of all of its children (t2 and t3 in this case). So if the count for t1 is 5 and the counts for t2 and t3 are also 5 each, the total displayed against t1 should be 15. Please let me know how I can achieve this. I am using SOLR4, and the tree structure is dynamic and subject to additions, deletions and edits. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Faceting-on-tree-structure-in-SOLR4-tp4039650.html
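One common way to get rolled-up counts like this, offered here only as a sketch, is to index every document under each of its ancestor paths in a multivalued field, so that a plain facet on that field already includes the children. The code below reproduces the arithmetic from the question (5 + 5 + 5 = 15 for t1) in plain Python. The "/" path separator and the helper names are assumptions.

```python
def ancestor_paths(path, sep="/"):
    """Return the path itself plus every ancestor, e.g. t1/t2 -> [t1, t1/t2]."""
    parts = path.split(sep)
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


def rolled_up_counts(doc_paths):
    """Count each document once for its own node and once per ancestor."""
    counts = {}
    for path in doc_paths:
        for prefix in ancestor_paths(path):
            counts[prefix] = counts.get(prefix, 0) + 1
    return counts


# 5 docs on t1, 5 on t2, 5 on t3 (both children of t1), 2 on t5 (child of t4).
docs = ["t1"] * 5 + ["t1/t2"] * 5 + ["t1/t3"] * 5 + ["t4/t5"] * 2
counts = rolled_up_counts(docs)
print(counts["t1"], counts["t1/t2"], counts["t4"])
```

The trade-off is that moving or renaming a node means reindexing every document under it, which matters for a tree that changes often.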
StandardTokenizerFactory behaviour
Hello, I have been working with Solr for the last few months and am stuck on the following. Analyzer in Field Definition : -- In: "Please, email john@foo.com by 03-09, re: m37-xq." Expected Out: "Please", "email", "john@foo.com", "by", "03-09", "re", "m37-xq" but I am not getting this. Is something wrong with my understanding of StandardTokenizer? I am using Solr 3.6. Please let me know what is wrong here. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-behaviour-tp3990215.html
Re: StandardTokenizerFactory behaviour
Just to make sure there is no ambiguity: In: "Please, email john@foo.com by 03-09, re: m37-xq." is the input given to this field for indexing, and Expected Out: "Please", "email", "john@foo.com", "by", "03-09", "re", "m37-xq" is the expected list of output tokens. -- View this message in context: http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-behaviour-tp3990215p3990216.html
What we lose if we use ClassicTokenizer instead of StandardTokenizer
Hello, I need to know what I will lose if I use ClassicTokenizer instead of StandardTokenizer. Will ClassicTokenizer be deprecated in future Solr versions, or is development on ClassicTokenizer going to halt? Please let me know. -- View this message in context: http://lucene.472066.n3.nabble.com/What-we-loose-if-we-use-ClassicTokenizer-instead-of-StandardTokenizer-tp3990249.html
Re: What we lose if we use ClassicTokenizer instead of StandardTokenizer
Thanks for the reply. Yes, I had started using admin/analysis before you suggested it, but I just wanted to know whether anything specific is supported or not supported out of the box by these tokenizers. -- View this message in context: http://lucene.472066.n3.nabble.com/What-we-loose-if-we-use-ClassicTokenizer-instead-of-StandardTokenizer-tp3990249p3990278.html
How can I optimize Sorting on multiple text fields
Hello, my requirement is as follows: on the Solr side we have indexed the data of multiple customers, with at least a million documents per customer. After executing a search, the end user wants to sort the data grid on some fields, say subject, title, date etc. Since sorting on text fields is costly, what optimisation can I do? I am thinking of the following options: 1) Create a custom cache that holds, for each customer, the list of documents in sorted order for each sortable field, so that when a sort request comes from the user I can return the list from the cache. 2) Use the filter query cache with the customer id as a criterion, so that each time I get the docs from the filter cache. Can anybody please tell me whether this is a good approach or whether there is a better way of doing this? I am using Solr 3.6. Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/How-can-I-optimize-Sorting-on-multiple-text-fields-tp3990874.html
Re: How can I optimize Sorting on multiple text fields
Thanks for the inputs. Eric, yes, I was referring to the String data type. The reason I asked is that a single customer has multiple users, and each user may apply different search criteria before sorting on a field, so caching the sorted results may improve both the user experience and performance. -- View this message in context: http://lucene.472066.n3.nabble.com/How-can-I-optimize-Sorting-on-multiple-text-fields-tp3990874p3991129.html
Sending Object as a request parameter
Hello, is there any provision in Solr for sending an Object as a request parameter when querying the Solr server through the SolrJ API, so that my request handler on the Solr side can read that object? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Sending-Object-as-a-request-parameter-tp3991151.html
Re: Prefix query is not analysed?
Thanks for the reply. If I check the debug query through solr-admin, I can see that the lower case filter is applied: "rawquerystring":"em_to_name:Follett'.*", "querystring":"em_to_name:Follett'.*", "parsedquery":"+em_to_name:follett'.*", "parsedquery_toString":"+em_to_name:follett'.*", "explain":{}, "QParser":"ExtendedDismaxQParser". So is it the case that only tokenization is skipped for wildcard queries, while the other filters specified are applied? -- View this message in context: http://lucene.472066.n3.nabble.com/Prefix-query-is-not-analysed-tp3992435p3992450.html
Re: Prefix query is not analysed?
Yes, I am using Solr 3.6. Thanks for the link; it is very useful. From the link I could make out that if the analyzer includes any of the following, they are applied, and any other elements specified under the analyzer are not applied because they are not multi-term aware: ASCIIFoldingFilterFactory, LowerCaseFilterFactory, LowerCaseTokenizerFactory, MappingCharFilterFactory, PersianCharFilterFactory. -- View this message in context: http://lucene.472066.n3.nabble.com/Prefix-query-is-not-analysed-tp3992435p3992463.html
Search for abc AND *foo* return all docs for abc which do not have foo why?
Hello, if I search for abc AND *foo*, the results include all docs for abc, even those that do not have foo. Why? I suspect that if * is present on both sides of a word, the word is ignored. Is that the correct interpretation? I am using Solr 3.6 and the field uses StandardTokenizer. -- View this message in context: http://lucene.472066.n3.nabble.com/Search-for-abc-AND-foo-return-all-docs-for-abc-which-do-not-have-foo-why-tp3993138.html
Re: Search for abc AND *foo* return all docs for abc which do not have foo why?
It was my mistake: the field I was referring to does not exist, which caused this effect. Sorry for the stupid question :-) -- View this message in context: http://lucene.472066.n3.nabble.com/Search-for-abc-AND-foo-return-all-docs-for-abc-which-do-not-have-foo-why-tp3993138p3993147.html
PathHierarchyTokenizerFactory behavior
Hello, this is how the field is declared in schema.xml. When I query this field with the input "M:/Users/User/AppData/Local/test/abc.txt", it searches for documents containing any of the generated tokens (M, Users, User etc.), but I want to search for the exact file given as input. Please let me know how I can achieve that. I am using Solr 3.6. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/PathHierarchyTokenizerFactory-behavior-tp3993839.html
Re: PathHierarchyTokenizerFactory behavior
Hello Koji, thanks for the reply. Yes, one thing I can try is a copyField, with one copy using PathHierarchyTokenizerFactory and the other using KeywordTokenizerFactory, and switch between these 2 fields depending on whether the input is a directory path or an exact file path. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/PathHierarchyTokenizerFactory-behavior-tp3993839p3993866.html
Re: PathHierarchyTokenizerFactory behavior
Modifying the field definition solves the problem. I got it from the link http://stackoverflow.com/questions/6920506/solr-pathhierarchytokenizerfactory-facet-query -- View this message in context: http://lucene.472066.n3.nabble.com/PathHierarchyTokenizerFactory-behavior-tp3993839p3994154.html
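The modified field definition is missing from the message above. One commonly used arrangement, assumed here rather than confirmed by the thread, is to apply PathHierarchyTokenizerFactory only at index time and keep the query as a single token. The Python sketch below imitates that: the indexed path is expanded into its prefixes, and the query path must equal one of them. The helper names are hypothetical.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Imitate index-time prefix tokens: a/b/c -> [a, a/b, a/b/c]."""
    parts = path.split(delimiter)
    prefixes = (delimiter.join(parts[:i]) for i in range(1, len(parts) + 1))
    return [prefix for prefix in prefixes if prefix]  # skip the empty root prefix


def matches(indexed_path, query_path):
    """The query is kept whole and must equal one indexed prefix."""
    return query_path in path_hierarchy_tokens(indexed_path)


full = "M:/Users/User/AppData/Local/test/abc.txt"
tokens = path_hierarchy_tokens(full)
print(tokens[0], tokens[1], len(tokens))
print(matches(full, full))                  # exact file path matches
print(matches(full, "M:/Users"))            # a parent directory matches too
print(matches("M:/Users/other.txt", full))  # a different file does not
```

With this arrangement a bare component such as Users no longer matches, because the query side is never split into separate tokens.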
Solr edismax NOT operator behavior
Hello, I am using the edismax parser, and the query submitted by the application has the format price:1000 AND ( NOT ( launch_date:[2007-06-07T00:00:00.000Z TO 2009-04-07T23:59:59.999Z] AND product_type:electronic)). Solr gives an unexpected result when executing it. I suspect this is because of the AND ( NOT portion of the query. Can anyone please explain how this structure is handled? I am using Solr 3.6. Any help is appreciated. Thanks, Alok -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-edismax-NOT-operator-behavior-tp3997663.html
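A likely cause, stated here as an assumption since the thread has no reply, is that a parenthesised clause containing only a NOT is a purely negative query, which matches nothing by itself. A frequently suggested workaround is to give the negation a positive set to subtract from by prefixing it with *:*. The sketch below only rewrites the query string from the question; the helper name is hypothetical.

```python
def negate(clause):
    """Wrap a clause so the negation has all documents to subtract from."""
    return "(*:* AND NOT " + clause + ")"


date_range = "launch_date:[2007-06-07T00:00:00.000Z TO 2009-04-07T23:59:59.999Z]"
inner = "(" + date_range + " AND product_type:electronic)"
query = "price:1000 AND " + negate(inner)
print(query)
```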
Solr 3.6 observe connections in CLOSE_WAIT state
Hello, I am using Solr 3.6.0 and have observed many connections in the CLOSE_WAIT state after the Solr server has been in use for some time. After further analysis and googling I found that I need to close idle connections in the client that connects to Solr to query data. This does reduce the number of CLOSE_WAIT connections, but some connections still remain in that state. I am using 2 shards, and one observation is that if I don't use shards I get 0 CLOSE_WAIT connections. I need help with this, as we need distributed search using shards. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-3-6-observe-connections-in-CLOSE-WAIT-state-tp4009097.html
Solr cloud facet query returning incorrect results
Hello all, we are using Solr 6.2, and our schema has an integer field. For a given query we want to know how many documents have a duplicate value for that field, for example how many documents have the same doc_id=10. To find this we send a query to SolrCloud with the following parameters: "q":"organization:abc", "facet.limit":"10", "facet.field":"doc_id", "indent":"on", "fl":"archive_id", "facet.mincount":"2", "facet":"true". In the response there are no facet_counts entries, which suggests that no documents have a duplicate doc_id. But if I change the query to "q":"organization:abc AND doc_id:10", the response shows 3 docs with doc_id=10. This seems contrary to how facets should behave, so I wanted to know if there is any possible reason for this behavior.