Weighting the Lucene score
I want to compute a weighted average of the Lucene score and an additional score I have, i.e. (W1 * Lucene score + W2 * Other score) / (W1 + W2). What is the easiest way to do this? Also, is the Lucene score normalized? Thanks,
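A minimal Python sketch of the weighted average in the question, assuming both scores are already available on the client side. The function name `combine_scores` and the optional division by the top score are illustrative assumptions, not part of Lucene or Solr; raw Lucene scores are generally not bounded to [0, 1], so some rescaling is usually needed before mixing them with another score.

```python
def combine_scores(lucene_score, other_score, w1, w2, max_lucene_score=None):
    """Return (w1 * lucene_score + w2 * other_score) / (w1 + w2).

    If max_lucene_score is given, the Lucene score is first divided by it
    so that it falls in [0, 1] (an assumption made here for comparability).
    """
    if w1 + w2 == 0:
        raise ValueError("w1 + w2 must be non-zero")
    if max_lucene_score:
        lucene_score = lucene_score / max_lucene_score
    return (w1 * lucene_score + w2 * other_score) / (w1 + w2)


# Hypothetical values: the top hit scores 4.0, this document scores 2.0,
# and the other score is already in [0, 1].
combined = combine_scores(2.0, 0.9, w1=3, w2=1, max_lucene_score=4.0)
print(combined)
```

Dividing by the top score only makes scores comparable within one result set; it does not make them comparable across queries.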
Re: Weighting the Lucene score
But a function query doesn't give access to the SOLR score, only to fields in the index, no? Thx

On Tue, Aug 26, 2008 at 2:02 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> I think the easiest approach might be making use of Lucene's function
> query.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: s d <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, August 26, 2008 1:55:38 PM
> > Subject: Weighting the Lucene score
> >
> > I want to weighted average the Lucene score with an additional score i have,
> > i.e. (W1 * Lucene score + W2 * Other score) / (W1 + W2) .
> > What is the easiest way to do this?
> > Also, is the Lucene score normalized.
> > Thanks,
Partitioning the index
Hi, is there a recommended index size (on disk, or in number of documents) at which to start partitioning the index to ensure good response time? Thanks, S
display tokens
How can I retrieve the "analyzed tokens" (e.g. the stemmed values) of a specific field?
Is there a way to retrieve the "analyzed tokens" (e.g. the stemmed values) of a field from the SOLR index? Almost like using SOLR as a utility for generating the tokens. Thanks!
Lucene And SOLR
Is there a way to import a Lucene index (as is) into SOLR? Basically, I'm looking to enjoy the "web context" and caching provided by SOLR but keep the index under my control in Lucene.
RAMDirectory
Is there a way to use RAMDirectory with SOLR? If you can point me to documentation, that would be great. Thanks, S
Query Syntax (Standard handler) Question
Is there a simpler way to write this query (I'm using the standard handler)?

field1:t1 field1:t2 field1:"t1 t2" field2:t1 field2:t2 field2:"t1 t2"

Thanks,
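If the query has to stay in the standard handler, one option is to generate the repeated clauses on the client. The Python sketch below is only an illustration of that idea; the helper name `expand_query` is hypothetical, and it does no escaping of special query characters.

```python
def expand_query(fields, terms):
    """Build one clause per term plus one phrase clause, for every field."""
    phrase = " ".join(terms)
    clauses = []
    for field in fields:
        # Individual term clauses, e.g. field1:t1 field1:t2
        clauses.extend(f"{field}:{term}" for term in terms)
        # Phrase clause, e.g. field1:"t1 t2"
        clauses.append(f'{field}:"{phrase}"')
    return " ".join(clauses)


query = expand_query(["field1", "field2"], ["t1", "t2"])
print(query)
```

For the two fields and two terms in the question, this reproduces the query string exactly as written there.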
Re: Query Syntax (Standard handler) Question
But I want to sum the scores and not use max; can I still do it with DisMax? Am I missing anything?

On Jan 4, 2008 2:32 AM, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> On Jan 4, 2008, at 4:40 AM, s d wrote:
> > Is there a simpler way to write this query (I'm using the standard
> > handler) ?
> > field1:t1 field1:t2 field1:"t1 t2" field2:t1 field2:t2 field2:"t1 t2"
>
> Looks like you'd be better off using the DisMax handler for
> (without the brackets).
>
> Erik
queryResultCache
What is the best approach to tuning queryResultCache? For example, the default size is size="512", but since a document id is just an int (it is an int, right?), i.e. 4 bytes, why not set size to 10,000,000, for example (it's only ~38 MB)? I sense there is something I'm missing here :). Any help would be appreciated. Thanks,
Boosting a Field (Standard Handler)
How do I boost a field (not a term) using the standard handler syntax? I know I can do that with DisMax, but I'm trying to stay with the standard handler. Can this be done? Thanks,
Re: queryResultCache
Thanks. A factor of 20 or even 30 on top of my numbers still allows a much larger size than the default one, and I was wondering: is there any disadvantage in having a big number/cache? BTW, where is the TTL controlled?

On Jan 6, 2008 7:23 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Jan 6, 2008 12:59 AM, s d <[EMAIL PROTECTED]> wrote:
> > What is the best approach to tune queryResultCache ?For example the default
> > size is: size="512" but since a document id is just an int (it is an int,
> > right?) ,i.e 4 bytes why not set size to 10,000,000 for example (it's only
> > ~38Mb).
>
> This cache size refers to the number of id lists are stored.
> One query + sort that yields the top 20 results == 1 entry in the cache.
>
> -Yonik
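The arithmetic in this exchange can be sketched in Python as follows, assuming 4 bytes per document id and about 20 ids per cached entry (the "factor of 20" above). Per-entry overhead such as the cache key and object headers is ignored, so real memory use would be higher; the helper name `cache_size_mib` is made up for this illustration.

```python
BYTES_PER_DOC_ID = 4          # a document id is an int
MIB = 1024 * 1024


def cache_size_mib(entries, ids_per_entry=1):
    """Rough lower bound on the memory held by cached id lists, in MiB."""
    return entries * ids_per_entry * BYTES_PER_DOC_ID / MIB


# The original estimate: 10,000,000 entries treated as single ints.
naive = cache_size_mib(10_000_000)
# Each entry is actually a list of ids, e.g. the top 20 results of a query.
with_lists = cache_size_mib(10_000_000, ids_per_entry=20)
print(round(naive, 1), round(with_lists, 1))  # prints 38.1 762.9
```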
Re: queryResultCache
Got it. Smart. Thx

On 1/6/08, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
> : number than the default one and i was wondering is there any disadvantage in
> : having a big number/ cache?BTW, where is the TTL controlled ?
>
> no disadvantage as long as you've got the RAM ... NOTE: the magic "512"
> number you referred to isn't a "default" -- it's an "example" in the
> "example" solrconfig.xml
>
> There is no TTL for Solr caches, as noted in the wiki...
>
> http://wiki.apache.org/solr/SolrCaching
>
> Solr caches are associated with an Index Searcher -- a particular 'view'
> of the index that doesn't change. So as long as that Index Searcher is
> being used, any items in the cache will be valid and available for reuse.
> Caching in Solr is unlike ordinary caches in that Solr cached objects will
> not expire after a certain period of time; rather, cached objects will be
> valid as long as the Index Searcher is valid.
>
> -Hoss
How do I normalize different information (different types of documents) in the index?
E.g. suppose the index has field1 and field2, and documents of type (A) always have information for field1 AND for field2, while documents of type (B) always have information for field1 but NEVER for field2. The problem is that the scoring formula will sum field1 and field2, hence skewing in favour of documents of type (A). If I combine the 2 fields into 1 field (in an attempt to normalize), I will obviously skew the statistics. Please advise. Thanks,
Re: How do I normalize different information (different types of documents) in the index?
Isn't there a better way to take the information into account but still normalize? Taking the score of only one of the fields doesn't sound like the best thing to do (it's basically ignoring part of the information).

On Jan 7, 2008 9:20 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
>
> On 7-Jan-08, at 9:02 PM, s d wrote:
>
> > e.g. if the index is field1 and field2 and documents of type (A) always have
> > information for field1 AND information for field2 while document of type (B)
> > always have information for field1 but NEVER information for field2.
> > The problem is that the formula will sum field1 and field2 hence skewing in
> > favour of documents of type (A).
> > If i combine the 2 fields into 1 field (in an attempt to normalize) i will
> > obviously skew the statistics.
>
> Try the dismax handler. Its main goal is to query multiple fields
> while only counting the score of the highest-scoring one (mostly).
>
> -Mike
Performance - FunctionQuery
Adding a FunctionQuery made the query response time slower by ~300 ms, and adding a 2nd FunctionQuery added another ~300 ms, so overall the response time is over 0.5 sec (slow). Is this expected, or am I doing something wrong? Thx
Min-Score Filter
Is there a way to filter out all results below a certain score, and is there any point in doing so? E.g. exclude all results below score Y. Thanks
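One way to do this without touching the server is to drop low-scoring hits on the client. The Python sketch below assumes each result is a dict with a `score` key (an assumption about how the response has been parsed, not a Solr API), and the helper name is hypothetical. Note that Lucene scores are relative to a given query, so a fixed threshold Y may not mean the same thing from one query to the next.

```python
def filter_by_min_score(docs, min_score):
    """Keep only the documents whose score is at least min_score."""
    return [doc for doc in docs if doc["score"] >= min_score]


# Hypothetical parsed results, already sorted by descending score.
results = [
    {"id": "a", "score": 2.4},
    {"id": "b", "score": 1.1},
    {"id": "c", "score": 0.3},
]
kept = filter_by_min_score(results, min_score=1.0)
print([doc["id"] for doc in kept])
```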
Re: How do I normalize different information (different types of documents) in the index?
Got it (http://wiki.apache.org/solr/DisMaxRequestHandler#head-cfa8058622bce1baaf98607b197dc906a7f09590). Thx!

On Jan 8, 2008 12:11 AM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
> : Isn't there a better way to take the information into account but still
> : normalize? taking the score of only one of the fields doesn't sound like the
> : best thing to do (it's basically ignoring part of the information).
>
> note the word "mostly" in Mike's response about dismax ... the "tie" param
> lets you decide how much the other fields influence the score. Try it,
> it works really well ... trust me/us.
>
> For the record: i'm really not sure what your question is ... you say you
> want to normalize for the fact that some docs don't have a value in some
> fields, but you don't want to combine the fields because it will skew the
> statistics ... isn't that "skewing" exactly what you are trying to
> achieve?
>
> don't you need to introduce some "skew" in favor of the docs that don't
> have a value for field2 to compensate for the existing "counter skew"
> they already have?
>
> -Hoss
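As a rough illustration of what the "tie" parameter does, the Python sketch below combines per-field scores the way a disjunction-max query does: the best field score plus tie times the sum of the other field scores. The function name and the example scores are made up for illustration; real scores also depend on boosts and other factors not modelled here.

```python
def dismax_score(field_scores, tie=0.0):
    """Best field score plus tie times the sum of the remaining scores."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Hypothetical per-field scores for the two document types discussed above.
type_a = [1.0, 0.8]   # matches in field1 and field2
type_b = [1.0]        # matches in field1 only

for tie in (0.0, 0.1, 1.0):
    print(tie, dismax_score(type_a, tie), dismax_score(type_b, tie))
```

With tie=0.0 both document types score the same (pure max), with tie=1.0 the result is the plain sum of the field scores, and values in between control how much the extra field helps type (A) documents.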
DisMax Syntax
User query: x1 x2
Desired query (Lucene): field:x1 x2 field:"x1 x2"~a^b

In the standard handler, the only way I found to make this work was:
field:x1 field:x2 field:"x1 x2"~a^b

Now that I want to try DisMax, is there a way to implement this without having duplicate fields? I.e., since the fields and the terms are separated in DisMax, how do I achieve the same query? Thanks
Re: DisMax Syntax
I may be mistaken, but this is not equivalent to my query. In my query I have matches for x1 and matches for x2, without slop and/or boosting, and then a match on "x1 x2" (exact match) with slop (~) a and boost (^) b, so that results with an exact match score better. The total score is the sum of all the above. Your query seems different.

On Jan 8, 2008 11:56 AM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
> : User Query: x1 x2
> : Desired query (Lucene): field:x1 x2 field:"x1 x2"~a^b
> :
> : In the standard handler the only way i saw how to make this work was:
> : field:x1 field:x2 field:"x1 x2"~a^b
> :
> : Now that i want to try the DisMax is there a way to implement this without
> : having duplicate fields? i.e. since the fields and the terms are separated
> : in the DisMax how do i achieve the same query ?
>
> i'm not sure what you mean by "without duplicate fields" but assuming i
> understand your goal, this seems trivial...
>
> q = x1 x2
> qf = field
> pf = field^b
> ps = a
>
> -Hoss
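For reference, the four parameters suggested in the quoted reply can be assembled into a request query string as in the Python sketch below. The concrete slop and boost values stand in for a and b, and the helper name is hypothetical. How the DisMax handler itself is selected (for example through a `qt` or `defType` parameter) depends on the Solr version and configuration, so that part is left out.

```python
from urllib.parse import urlencode


def dismax_params(user_query, field, slop, boost):
    """Map the user query onto the q / qf / pf / ps parameters."""
    return {
        "q": user_query,            # raw user terms: x1 x2
        "qf": field,                # field(s) the individual terms are matched in
        "pf": f"{field}^{boost}",   # phrase field, boosted by b
        "ps": str(slop),            # phrase slop a
    }


# Assumed example values: a = 2, b = 5.
params = dismax_params("x1 x2", "field", slop=2, boost=5)
query_string = urlencode(params)
print(query_string)
```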
Inconsistent results
Hi, I use SOLR with the standard handler, and when I send the exact same query to SOLR I get different results every time (i.e. I refresh the page with the query and get different results). Any ideas? Thx,
Interleaved results from different sources
We have an index of documents from different sources, and we want to make sure the results we display are interleaved from the different sources and not only ranked by relevancy. Is there a way to do this? Thanks, S.
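One possible client-side approach is to take the relevancy-ranked list and interleave it round-robin by source. The Python sketch below assumes each result is a dict carrying a `source` key (an assumption about the parsed response, not something Solr provides under that name) and keeps the relevancy order within each source.

```python
from collections import defaultdict, deque


def interleave_by_source(ranked_docs, source_key="source"):
    """Round-robin over sources, preserving rank order within each source."""
    queues = defaultdict(deque)   # insertion order = first-seen source order
    for doc in ranked_docs:
        queues[doc[source_key]].append(doc)

    result = []
    while queues:
        # Take one document from every source that still has results left.
        for source in list(queues):
            result.append(queues[source].popleft())
            if not queues[source]:
                del queues[source]
    return result


# Hypothetical results, already ranked by relevancy.
ranked = [
    {"id": "a1", "source": "A"},
    {"id": "a2", "source": "A"},
    {"id": "b1", "source": "B"},
    {"id": "a3", "source": "A"},
    {"id": "c1", "source": "C"},
]
print([doc["id"] for doc in interleave_by_source(ranked)])
```

This only reorders the documents that were fetched, so enough rows have to be requested for every source to be represented.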
result limit / diversity with an OR query
Hi, I have a query similar to: x OR y OR z, and I want to know if there is a way to make sure I get one result with x, one result with y and one result with z. Alternatively, is it possible to achieve this through facets? Thanks, S.
Does SOLR support RAMDirectory?
Can I use RAMDirectory in SOLR? Thanks, S