Strange Sorting results on a Text Field
Hello,
I have a strange response to a query with sorting. I sort on a field which is: multiValued="true"/>
This field mostly holds 32-byte md5s, usually a single entry per document but sometimes up to 5. When I do a search like this:
"+testfield: (fde34c51739462d9486140601dcfb7bf 63af20144c2cbae1ec4dc0bc2e9d2c2f 3cf8e32bf2b9384447d52318a72fd4b1) ;testfield asc"
I get the following results:
c10c9bf4ef3f1bc30aedf83b96a9ce16
c10c9bf4ef3f1bc30aedf83b96a9ce16
c10c9bf4ef3f1bc30aedf83b96a9ce16
c10c9bf4ef3f1bc30aedf83b96a9ce16
c10c9bf4ef3f1bc30aedf83b96a9ce16
4302516b91b743a8972120f52d309a72
c10c9bf4ef3f1bc30aedf83b96a9ce16
c10c9bf4ef3f1bc30aedf83b96a9ce16
c10c9bf4ef3f1bc30aedf83b96a9ce16
c10c9bf4ef3f1bc30aedf83b96a9ce16
I have no idea why position 6 is in this result, because the XML entries are correct too. Any idea where I should look for the error?
Also, does somebody have a link where the benefits of "multiValued" are explained?
Thanks, Tom
Re: Strange Sorting results on a Text Field
On 9/11/06, Tom Weber <[EMAIL PROTECTED]> wrote:
> Hello, have a strange response in a query with sorting. I sort on a field which is :
I think you probably want a type="string" instead. Text fields have text analysis (stemming, lowercasing, word splitting, etc) and aren't used for exact matching or sorting.
> in this field mostly 32 byte md5's are saved, mostly only a single entry but also up to 5. when I do a search like this :
> "+testfield: (fde34c51739462d9486140601dcfb7bf 63af20144c2cbae1ec4dc0bc2e9d2c2f 3cf8e32bf2b9384447d52318a72fd4b1) ;testfield asc"
> I get the following results:
> c10c9bf4ef3f1bc30aedf83b96a9ce16 c10c9bf4ef3f1bc30aedf83b96a9ce16 c10c9bf4ef3f1bc30aedf83b96a9ce16 c10c9bf4ef3f1bc30aedf83b96a9ce16 c10c9bf4ef3f1bc30aedf83b96a9ce16 4302516b91b743a8972120f52d309a72 c10c9bf4ef3f1bc30aedf83b96a9ce16 c10c9bf4ef3f1bc30aedf83b96a9ce16 c10c9bf4ef3f1bc30aedf83b96a9ce16 c10c9bf4ef3f1bc30aedf83b96a9ce16
> I have no idea why position 6 is in this search, because the XML entries are correct too. Any Idea where I may search for the error ?
> Also, does somebody has a link where the benefits of "multiValued" are explained ?
You can have multiple values for the field in a single document if it's marked as multiValued: first val second val
-Yonik
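The markup of the example in the message above did not survive archiving. As a rough illustration only, the sketch below builds a Solr XML add message in which the same field name appears once per value, which is how a multiValued field is usually populated. The function name `build_add_message`, the document id `doc1`, and the unique-key field name `id` are assumptions made for this example and are not taken from the thread.

```python
import xml.etree.ElementTree as ET


def build_add_message(doc_id, field_name, values):
    """Build a Solr XML <add> message with one <field> element per value."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # "id" is an assumed unique-key field name, not taken from the thread.
    ET.SubElement(doc, "field", name="id").text = doc_id
    # A multiValued field is sent by repeating the same field name.
    for value in values:
        ET.SubElement(doc, "field", name=field_name).text = value
    return ET.tostring(add, encoding="unicode")


message = build_add_message("doc1", "testfield", ["first val", "second val"])
print(message)
```

The resulting string could then be posted to the update handler; the field itself would still have to be declared with multiValued="true" in the schema.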
Re: Strange Sorting results on a Text Field
Hello Yonik,
You are right about the string type. When I turned on debugging a few minutes ago, I saw that the md5 sum is split into several parts, each time a digit follows a letter or the other way round.
Thanks also for the "multiValued" explanation, this is useful for my current application. But then, if I use this field and ask for sorting, how will the sorting be done: alphanumerically on the first entry for this field? Until now, I entered more than one entry by separating them with a space in the same field, like name="test">text1 text2 text3.
Thanks, tom
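The splitting Tom describes can be imitated with a few lines of Python. This is only a rough imitation of "split wherever a letter meets a digit", not the actual analyzer chain of a Solr text field, and the function name is made up for the example. It shows that the unexpected hit at position 6 shares several short tokens with the query terms once everything is split this way, which is one plausible reason it matched.

```python
import re


def split_on_letter_digit_boundaries(value):
    """Split a string into runs of letters and runs of digits (lowercased)."""
    return re.findall(r"[a-z]+|[0-9]+", value.lower())


# Query terms and the unexpected hit, copied from the original message.
query_terms = [
    "fde34c51739462d9486140601dcfb7bf",
    "63af20144c2cbae1ec4dc0bc2e9d2c2f",
    "3cf8e32bf2b9384447d52318a72fd4b1",
]
unexpected_hit = "4302516b91b743a8972120f52d309a72"

query_tokens = set()
for term in query_terms:
    query_tokens.update(split_on_letter_digit_boundaries(term))

hit_tokens = split_on_letter_digit_boundaries(unexpected_hit)
shared = sorted(set(hit_tokens) & query_tokens)

print(hit_tokens)
print(shared)
```

With an untokenized string field, each md5 stays a single term, so overlaps between fragments like these cannot occur.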
Re: Strange Sorting results on a Text Field
On 9/11/06, Tom Weber <[EMAIL PROTECTED]> wrote:
> Thanks also for the "multiValued" explanation, this is useful for my current application. But then, if I use this field and I ask for sorting, how will the sorting be done, alphanumeric on the first entry for this field ? Until now, I entered more than one entry by separating them with a space in the same field, like text1 text2 text3.
Sorting is currently only supported when there is at most one value (or token) per document. This is a Lucene restriction.
-Yonik
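One way to live with this restriction, offered here only as a sketch and not as something proposed in the thread, is to index an additional single-valued field that carries one representative value per document and to sort on that field instead. The field name `testfield_sort` and the choice of "smallest value" as the representative are assumptions for illustration; the extra field would need to be declared as an untokenized, single-valued string field in the schema.

```python
def single_sort_key(values):
    """Pick one representative value (here: the smallest) to sort on."""
    if not values:
        return None
    return min(values)


def with_sort_field(doc, source_field, sort_field):
    """Return a copy of doc with an extra single-valued sort field."""
    enriched = dict(doc)
    key = single_sort_key(doc.get(source_field, []))
    if key is not None:
        enriched[sort_field] = key
    return enriched


doc = {"id": "doc1", "testfield": ["text3", "text1", "text2"]}
enriched = with_sort_field(doc, "testfield", "testfield_sort")
print(enriched)
```

The multi-valued field stays available for searching, while the sort runs against a field that has exactly one token per document.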
Solr in production env.
Hello,
I have almost convinced my boss to use Solr in production for a new project, and hopefully for many projects after that, but I'm a bit confused that there is no release available for download. Is Solr still in a beta state? Are there Solr servers in production? Is it advisable to use it in production? I would be glad to hear about experiences and recommendations on this topic.
best regards
Simon
Re: Solr in production env.
CNET has been using Solr in production for quite some time. There are others on this list who can elaborate far better than I can.
Re: Solr in production env.
I know of a number of production installations. I even know of a company whose mere existence is based partly on Solr. :) There is also a public list of production installations available on the homepage and/or Wiki.
Eivind
Re: Solr in production env.
Hi Simon,
> ...are there solr servers in production...
You can see a list at http://wiki.apache.org/solr/PublicServers - there's some solid stuff running on Solr already!
-Bertrand
Re: Got it working! And some questions
On 9/9/06, Michael Imbeault <[EMAIL PROTECTED]> wrote:
> The main problem was that addIndex was sending 1 doc at a time to solr; it would cause a problem after a few thousand docs because i was running out of resources.
Sending one doc at a time should be fine... you shouldn't run out of resources. There must be a bug somewhere...
-Yonik
Re: Got it working! And some questions
On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
> I'm still a little disappointed that I can't change the OR/AND parsing by just changing some parameter (like I can do for the number of results returned, for example); adding an OR between each word in the text I want to compare sounds suboptimal, but I'll probably do it that way; it's a very minor nitpick, solr is awesome, as I said before.
I'm the one that added support for controlling the default operator of Solr's query parser, and I hadn't considered the use case of controlling that setting from a request parameter. It should be easy enough to add. I'll take a look at adding that support and commit it once I have it working. What parameter name should be used for this? do=[AND|OR] (for default operator)? We have df for default field.
Erik
Re: Got it working! And some questions
Hello Erik,
Thanks for adding that feature! "do" is fine with me, if "op" is already used (not sure about this one).
--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212
Re: Got it working! And some questions
On 9/11/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
> > I'm still a little disappointed that I can't change the OR/AND
> > parsing by just changing some parameter (like I can do for the
> > number of results returned, for example); adding an OR between each
> > word in the text I want to compare sounds suboptimal, but I'll
> > probably do it that way; it's a very minor nitpick, solr is awesome,
> > as I said before.
> I'm the one that added support for controlling the default operator of Solr's query parser, and I hadn't considered the use case of controlling that setting from a request parameter. It should be easy enough to add. I'll take a look at adding that support and commit it once I have it working. What parameter name should be used for this? do=[AND|OR] (for default operator)? We have df for default field.
Maybe something like q.op or q.oper if it *only* applies to q. Which begs the question... what *does* it apply to? At first blush, it doesn't seem like it should apply to other queries like fq, facet queries, and especially queries defined in solrconfig.xml. I think that would be very surprising.
-Yonik
Re: Got it working! And some questions
: Maybe something like q.op or q.oper if it *only* applies to q. Which
: begs the question... what *does* it apply to? At first blush, it
: doesn't seem like it should apply to other queries like fq, facet
: queries, and esp queries defined in solrconfig.xml. I think that
: would be very surprising.
Agreed. Note the comment I put into SolrPluginUtils.parseFilterQueries when I added fq support to StandardRequestHandler...
/* Ignore SolrParams.DF - could have init param FQs assuming the
 * schema default with query param DF intended to only affect Q.
 * If user doesn't want schema default, they should be explicit in the FQ.
 */
...I would think a "do" or "op" or "q.op" param should *definitely* only influence the "q" param.
-Hoss
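To make the two approaches in this thread concrete, here is a small sketch of what a client would send in each case: Michael's workaround of joining the words with an explicit OR, and a request that passes the default operator as a parameter. The parameter name `q.op` is only the name floated in the message above and had not been settled at this point in the thread; the base URL and both function names are assumptions for illustration.

```python
from urllib.parse import urlencode


def or_query(text):
    """Workaround from the thread: put an explicit OR between all words."""
    return " OR ".join(text.split())


def select_url(base_url, query, default_operator=None):
    """Build a select URL; 'q.op' is the parameter name proposed in the thread."""
    params = [("q", query)]
    if default_operator is not None:
        # Only meant to influence q, not fq or queries from solrconfig.xml.
        params.append(("q.op", default_operator))
    return base_url + "?" + urlencode(params)


# Assumed local Solr endpoint, not taken from the thread.
base = "http://localhost:8983/solr/select"
explicit = select_url(base, or_query("heart muscle cell"))
via_param = select_url(base, "heart muscle cell", default_operator="OR")
print(explicit)
print(via_param)
```

The second form keeps the query text untouched and leaves the choice of operator to the request, which is the behaviour Michael was asking for.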
MoreLikeThis class in Lucene within Solr?
Ok, so hopefully I have resolved my problems posting to this mailing list, and this will show up as a new topic rather than inside some other thread!
Is it possible in any way to use the MoreLikeThis class with Solr (http://lucene.apache.org/java/docs/api/org/apache/lucene/search/similar/MoreLikeThis.html)? Right now I'm determining similar docs by just querying for the whole body with OR between words, and it's not very efficient performance-wise. I have never coded in Java, so I really don't know where I should start...
Thanks,
--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212
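Until something like MoreLikeThis is available, one client-side way to make the "whole body with OR" query cheaper is to send only a handful of the most frequent terms. The sketch below is not MoreLikeThis and does not reproduce its term selection, which as far as I understand also takes into account how rare a term is across the index; it uses plain in-document frequency as a stand-in. The stopword list, the thresholds, and the function name are all assumptions for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"that", "this", "with", "from", "were", "have", "been"}


def top_terms_query(body, max_terms=5, min_length=4):
    """Build an OR query from the most frequent longer words in a text."""
    words = re.findall(r"[a-z]+", body.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOPWORDS
    )
    terms = [term for term, _ in counts.most_common(max_terms)]
    return " OR ".join(terms)


body = "Protein kinase with protein binding; this protein kinase was found."
query = top_terms_query(body, max_terms=2)
print(query)
```

A shorter query like this trades some recall for speed, so the number of terms is worth tuning against real documents.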
Re: Solr in production env.
Hi Simon - We're running Solr in production, and it's rock solid. Of course you can't really just take an anonymous word for it, but I would honestly put this stack up against any other system you can find, open source or commercial. Run it for yourself and you'll be alarmed at how sound it is, out of the box. I'll bet Hoss & Yonik's paycheck on it. ;-)
-- j
P.S. Hoss & Yonik - just kidding, but couldn't resist. Many kudos to your efforts on this.
Re: Solr in production env.
: I'll bet Hoss & Yonik's paycheck on it. ;-)
: P.S. Hoss & Yonik - just kidding, but couldn't resist. Many kudos to your
: efforts on this.
I can't speak for Yonik, but I take no offense -- I bet my paycheck on Solr every day :)
: > and hopefully for lots of following projects but I'm a bit confused
: > that there is no release available for download. Is Solr still in a
: > beta state, are there solr servers in production. Is it recommendable
: > to use it in production? I would be glad about some experience and
: > recommendations about this topic.
With any piece of software I'd personally recommend you rev your local copy only after vetting that it behaves as you expect based on your usage of the previous version (i.e. have your own unit tests that you run against it on a dev box before deploying to production). It's also wise to keep snapshots of the source code and documentation each time you rev your local copy, in case the project takes a drastic turn in direction and you find yourself wanting to fork from the last version you were happy with.
As for Solr specifically: *I* certainly think it's suitable for production use, and I have a vested interest in making sure it doesn't change so radically that future changes aren't backwards compatible.
-Hoss
Re: Solr in production env.
: I even know of a company which mere existence is based partly on Solr. :)
Now *that* sounds like a story I'd like to hear more of.
-Hoss