Setting a Threshold of a sortable field to filter the result?
Hi,

How can I set a threshold value on a sortable field so that I can filter out results whose value is lower than the threshold? Should this be done in schema.xml or set by the query?

Thank you,
Vinci
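For illustration, this kind of threshold filtering is typically expressed at query time with a range filter rather than in schema.xml; assuming a numeric field named rating, a threshold of 3.0, and 5.0 as the known maximum (all hypothetical), the request would carry something like:

    q=camera&fq=rating:[3.0 TO 5.0]

schema.xml only needs to declare the field as a sortable numeric type for a range filter like this to behave as expected.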
Multiple unique field?
Hi,

I want to set two fields that are unique, for different kinds of searching. Is that possible?

Thank you,
Vinci
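For context, the uniqueness constraint being asked about is expressed in schema.xml through a single uniqueKey element (the field name below is just an example):

    <uniqueKey>id</uniqueKey>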
Re: synonyms
On Friday 28 March 2008 21:44:29 Leonardo Santagada wrote:
> Well, his examples are in Brazilian Portuguese and not Spanish, and the
> biggest problem is that a Spanish stemmer is not going to work. I
> haven't found a pt_BR stemmer; have I overlooked something?

Try the Snowball Porter filter factory. The algorithm is specified in plain text files, so adding new stemmers to the codebase is pretty easy. The hard part is finding a good specification of the algorithm for Brazilian Portuguese.

A Google search reveals some references to Brazilian Portuguese versions of the Porter algorithm. Maybe one of these is suitably unencumbered for implementation and distribution as free software.

As a last resort, there is already a Snowball Porter stemmer for Portuguese in the SOLR codebase. However, I do not know how suitable it would be for adaptation to Brazilian Portuguese, as I know zilch about the variant spoken in Portugal.

Best regards - Christian
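As a rough sketch of wiring the existing Portuguese stemmer into an analyzer chain in schema.xml (the field type name here is invented for the example):

    <fieldType name="text_pt" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/>
      </analyzer>
    </fieldType>

A Brazilian Portuguese variant would presumably slot in the same way once a suitable algorithm specification is found.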
Re: Solr commits automatically on appserver shutdown
On Fri, Mar 28, 2008 at 2:05 PM, Noble Paul നോബിള് नोब्ळ् <[EMAIL PROTECTED]> wrote:
> hi,
> I am willing to work on this if you can give me some pointers as to
> where to start?

DirectUpdateHandler2 implements its own duplicates removal, which is no longer necessary.

-Yonik
Re: hl.requireFieldMatch and idf
Mike,

Thank you for your response.

>> cause:
>> If hl.requireFieldMatch is set to true, DefaultSolrHighlight.getQueryScorer() uses the
>> QueryScorer(Query,IndexReader,String) constructor in the Lucene highlighter. Then the
>> constructor calls getIdfWeightedTerms() to get an array of WeightedTerm. In
>> getIdfWeightedTerms(), idf is calculated to get weighted terms. And the calculated idf
>> can be negative with an un-optimized index.
>
> Okay, _this_ is the true bug. I don't see how lucene can return a
> negative idf, optimized index or no.

I think that docFreq includes the deleted docs count, and that this is intended Lucene behavior. It causes a negative idf as long as the following formula is used:

// o.a.l.s.highlight.QueryTermExtractor.java
float idf=(float)(Math.log((float)totalNumDocs/(double)(docFreq+1)) + 1.0);

>> Does DefaultSolrHighlight.getQueryScorer() use QueryScorer(Query,IndexReader,String)
>> by design? If no, I'm happy to open a ticket.
>
> Indeed it is by design: this is how requireFieldMatch is implemented, as the lucene
> highlighter will require the field to match as well as the term. A consequence of this
> is that the idf's are also folded into the score, which is triggering the bug you are
> seeing.

Can we use QueryScorer(Query,String) instead of QueryScorer(Query,IndexReader,String) to implement hl.requireFieldMatch=true?

I've opened SOLR-517 to follow up on this problem.

Thank you,
Koji
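To make the failure mode concrete, here is a small sketch of the quoted formula, assuming totalNumDocs counts only live documents while docFreq still counts deleted ones; the numbers are invented for illustration:

    // Sketch of the idf formula quoted above, with hypothetical numbers for an
    // index where most documents containing the term have been deleted but not
    // yet merged away by an optimize.
    public class NegativeIdfSketch {
        public static void main(String[] args) {
            int totalNumDocs = 10;  // live documents only (assumed)
            int docFreq = 30;       // term docFreq, still including deleted docs (assumed)

            // same expression as in o.a.l.s.highlight.QueryTermExtractor
            float idf = (float)(Math.log((float)totalNumDocs/(double)(docFreq+1)) + 1.0);

            System.out.println(idf); // prints roughly -0.13, i.e. a negative term weight
        }
    }

As soon as docFreq+1 is sufficiently larger than totalNumDocs, the log term drops below -1 and the weight goes negative.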
Re: solr.search.function
: SELECT MID, AVG(Rating) as Average FROM mpr
: WHERE PID in (p1[,p2,...])
: GROUP BY MID
: ORDER BY Average DESC LIMIT 0, 10;
:
: Also I would require to boost the values based on PIDs (some products have
: more weight than others, effectively computing a weighted average).
: To handle these queries I am planning to develop a custom request handler
: plugin in the most generic form, to be useful in general.

ok .. but i'm not really sure what you're asking at this point ... as i said: the FunctionQuery code isn't really going to help you here .. the Faceting code is more akin to what you are asking about.

alternately: just because your database is structured around one record for each (MID, PID, Rating) triple doesn't mean your *documents* need to be structured that way ... instead you can have one document per product and precompute the average before indexing them (that's the theory behind building an index: you precompute/denormalize/invert information for faster lookup later)

-Hoss
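A sketch of that denormalization using Solr's XML update format; the field names and values are made up for illustration:

    <add>
      <doc>
        <!-- one document per product (MID), average precomputed before indexing -->
        <field name="mid">M123</field>
        <field name="avg_rating">4.2</field>
        <!-- contributing PIDs kept as a multivalued field for filtering -->
        <field name="pid">p1</field>
        <field name="pid">p2</field>
      </doc>
    </add>

With the average stored per document, the original SQL-style query reduces to a filter on pid plus a sort on avg_rating.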
Re: Term frequency
: is there a way to get the term frequency per found result back from Solr?

this info is in the "explain" section of the debugQuery output, see this recent post about a similar question...

http://www.nabble.com/Highlight---get-terms-used-by-lucene-to16276184.html#a16323025

-Hoss
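For example, appending debugQuery=on to a request (the query and field name here are invented) returns the score explanation for each result, and the per-document term frequency appears inside the tf(...) factors of that explanation:

    http://localhost:8983/solr/select?q=title:solr&debugQuery=on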