How does adding a phrase slop in the handler help?
I tried ps=25 along with some pf values. I assumed it means this: for
example, for the search term 'child custody battle', documents which have the
words 'child', 'custody', 'battle' within 25 words of one another will rank
high. Is that correct?
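
For reference, here is a minimal sketch of the request parameters being discussed. It assumes the dismax handler is selected with qt=dismax and reuses the field names and boosts (text^0.8, name^2.0) that appear in a parsed query elsewhere in this thread; the pf value is an assumption. As far as I understand it, ps is the slop applied to the phrase query built from pf, so it changes which documents receive the phrase boost rather than which documents match.

```python
from urllib.parse import urlencode, parse_qs


def build_dismax_params(user_query, phrase_slop):
    """Build a URL query string for a dismax request (illustrative values only)."""
    params = {
        "qt": "dismax",             # assumes the handler is registered under this name
        "q": user_query,
        "qf": "text^0.8 name^2.0",  # fields searched for individual terms
        "pf": "text^0.8 name^2.0",  # assumed: fields where the whole query is tried as a phrase
        "ps": str(phrase_slop),     # slop allowed when matching that phrase
        "debugQuery": "true",       # ask for score explanations in the response
    }
    return urlencode(params)


query_string = build_dismax_params("child custody battle", 25)
print(query_string)
```

Comparing the debug output for ps=0 and ps=25 on the same query is probably the quickest way to see what the slop actually changes.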
--
I'm not looking at the docs to double-check this, but the ps option lets you
boost exact phrase matches higher.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: anuvenk <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Saturday, January
It sounds like you simply want to drop solr.WordDelimiterFilterFactory from
your analyzer definition, no?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: anuvenk <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Saturday, January 5, 200
MLT - give it an ID of a doc and it will return similar docs.
DisMax - give it a query string and it will construct a "parametric" query with
boosts defined in solrconfig.xml
Different beasts for different uses.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original
Evgeniy,
Two simple options:
1) take your index, put it on N Solr search servers, and put them behind a load
balancer
2) take your index, split it in N (or create N smaller indices from scratch)
and put it on N Solr search servers (and see SOLR-303)
Each will help in a different way and it soun
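
A rough sketch of the difference between the two options, in Python. The host names are hypothetical and the hashing scheme is only one possible way to split documents; nothing here comes from SOLR-303 itself. Option 1 sends each query to one of N identical copies, option 2 assigns each document to one of N smaller indices.

```python
import itertools
import zlib

# Hypothetical search servers; replace with real hosts.
REPLICAS = ["solr1:8983", "solr2:8983", "solr3:8983"]


def round_robin(servers):
    """Option 1: every server holds the full index, so rotate queries across them."""
    return itertools.cycle(servers)


def shard_for(doc_id, num_shards):
    """Option 2: each server holds part of the index; pick a shard by hashing the id."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


picker = round_robin(REPLICAS)
first_four = [next(picker) for _ in range(4)]
print(first_four)                            # the fourth query wraps back to solr1
print(shard_for("doc-42", len(REPLICAS)))    # a stable shard number for this id
```

With option 2 every query has to be sent to all shards and the results merged, which is the part a load balancer alone does not give you.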
My first guess would be that this is related to Wildcard queries not being
analyzed. Check the Lucene FAQ, I believe the explanation is there. Also, go
to Solr Admin page and run your query in the Analysis section of the Admin to
see what's going on.
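
To illustrate the guess above, here is a small Python sketch. It assumes a toy analyzer that only splits on whitespace and lowercases; the real analyzer chain will differ. Indexed terms go through analysis, while the wildcard pattern is compared as typed.

```python
from fnmatch import fnmatchcase


def analyze(text):
    """Toy index-time analysis (an assumption): whitespace split plus lowercase."""
    return [token.lower() for token in text.split()]


def wildcard_matches(pattern, indexed_terms):
    """Compare the raw pattern against indexed terms, case-sensitively."""
    return [term for term in indexed_terms if fnmatchcase(term, pattern)]


indexed = analyze("Bankruptcy Chapter 7")
unanalyzed_hits = wildcard_matches("Bank*", indexed)         # pattern used as typed
lowercased_hits = wildcard_matches("Bank*".lower(), indexed)  # pattern lowercased by the client
print(unanalyzed_hits)   # []
print(lowercased_hits)   # ['bankruptcy']
```

If this is the cause, lowercasing the wildcard term on the client side before sending it is one possible workaround.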
Otis
--
Sematext -- http://sematext.com/ -
That's what I'm thinking too. If I remove the solr.WordDelimiterFilterFactory
filter from both index and query, the word h1-b will remain as is in the
index, correct? So if someone searches for h1b (without hyphens), would it
still return the h1-b doc?
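
A small Python sketch of the concern. It assumes a toy analyzer (whitespace split plus lowercase) and a hypothetical hyphen-stripping step; neither is the actual WordDelimiterFilter. With tokens left as is, h1b and h1-b are simply different terms.

```python
def tokens_as_is(text):
    """Toy analysis with no word-delimiter handling: lowercase and split on whitespace."""
    return text.lower().split()


def tokens_hyphens_stripped(text):
    """Hypothetical alternative: also drop hyphens, applied at both index and query time."""
    return [token.replace("-", "") for token in text.lower().split()]


doc = "H1-B visa requirements"
query = "h1b"

match_as_is = query in tokens_as_is(doc)
match_stripped = tokens_hyphens_stripped(query)[0] in tokens_hyphens_stripped(doc)
print(match_as_is)      # False: the index holds 'h1-b', the query asks for 'h1b'
print(match_stripped)   # True: both sides normalize to 'h1b'
```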
Otis Gospodnetic wrote:
>
> It sounds like you simply want to
I noticed that the top 10 results for a particular search term had the same
score. In such cases, how does Solr determine which should get the first
place, second and so on?
--
View this message in context:
http://www.nabble.com/How-does-solr-rank-multiple-docs-with-same-score-tp14638959p14638959
I understand tf means term frequency. For example, if the search term is 'chapter
7', does tf mean how frequently 'chapter 7' occurs in the docs? Does it take
into account the total number of words in a doc to determine frequency?
Also, what are idf, fieldNorm and queryNorm? Trying to understand how sol
Can anyone offer any advice as to whether there is a Java client that will work
on Java 1.4 against 2.0? I have seen various references to Java
clients, but there doesn't seem to be one included in the Solr 2.0 distribution.
I think there is one intended for Solr 3.0 but of course th
On Jan 5, 2008 2:28 PM, anuvenk <[EMAIL PROTECTED]> wrote:
> That's what I'm thinking too. If I remove the solr.WordDelimiterFilterFactory
> filter from both index and query, the word h1-b will remain as is in the
> index, correct? So if someone searches for h1b (without hyphens), would it
> still return the h1-b doc?
On Jan 5, 2008 3:53 PM, anuvenk <[EMAIL PROTECTED]> wrote:
> I noticed that the top 10 results for a particular search term had the same
> score. In such cases, how does Solr determine which should get the first
> place, second and so on?
Ties are the same as in Lucene... internal docid (equiv to t
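
A minimal Python sketch of that ordering, assuming ties fall back to ascending internal docid as the reply says. The (docid, score) pairs are made up for illustration.

```python
# Made-up (internal docid, score) pairs; three of them tie on score.
hits = [(7, 1.5), (2, 1.5), (9, 2.0), (4, 1.5)]

# Sort by score descending, then by internal docid ascending to break ties.
ranked = sorted(hits, key=lambda hit: (-hit[1], hit[0]))
print(ranked)   # [(9, 2.0), (2, 1.5), (4, 1.5), (7, 1.5)]
```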
The WordDelimiterFilter is set to
generateWordParts=1, generateNumberParts=1, catenateWords=1, catenateNumbers=1
both at index and query time.
Now I have this synonym mapping: k-1 => k1 visa
Here is the parsedquery_ToString
+(text:"k (1 k) 1 visa"^0.8 | name:"k (1 k) 1 visa"^2.0)~0.01 (text:"k (1 k
You should really look at Lucene first, if you want to know this type of stuff.
TF - # of occurrences of a term in a single doc
DF - # of docs in the corpus/index that contain a term (IDF is the inverse DF)
But lookgoogle...
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/
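
A toy Python sketch of TF and DF over a three-document corpus. The documents are made up, and the idf formula, 1 + ln(numDocs / (docFreq + 1)), is the classic Lucene similarity as I recall it; treat it as an assumption and check the javadoc above.

```python
import math

# Made-up corpus: doc id -> tokens.
docs = {
    1: "chapter 7 bankruptcy chapter".split(),
    2: "chapter 11 filing".split(),
    3: "child custody battle".split(),
}


def term_freq(term, tokens):
    """TF: number of occurrences of the term in a single doc."""
    return tokens.count(term)


def doc_freq(term, corpus):
    """DF: number of docs in the corpus that contain the term at least once."""
    return sum(1 for tokens in corpus.values() if term in tokens)


def classic_idf(df, num_docs):
    """Assumed classic formula: rarer terms get a larger value."""
    return 1.0 + math.log(num_docs / (df + 1))


tf = term_freq("chapter", docs[1])
df = doc_freq("chapter", docs)
idf = classic_idf(df, len(docs))
print(tf, df, idf)   # 2 2 1.0
```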
Sean,
There is no solr 2.0 nor 3.0 yet - 1.2 is the last release, while 1.3 is still
baking in the oven.
The only supported/official Solr Java client is solrj, and you can get it if
you get Solr out of svn (and maybe some other way). If solrj doesn't work for
you, I am guessing you'll have to r
What is the best approach to tune queryResultCache? For example, the default
size is size="512", but since a document id is just an int (it is an int,
right?), i.e. 4 bytes, why not set size to 10,000,000, for example (it's only
~38 MB)?
I sense there is something that I'm missing here :). Any help wou
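
The arithmetic above, sketched in Python. One caveat, as far as I understand the cache: size counts cached query results (entries), not document ids, and each entry holds a list of ids. The ids_per_entry value below is a hypothetical figure, and cache keys and object overhead are ignored.

```python
BYTES_PER_DOC_ID = 4   # a doc id is an int
MIB = 2 ** 20


def ids_only_bytes(num_ids):
    """The estimate from the question: one int per unit of 'size'."""
    return num_ids * BYTES_PER_DOC_ID


def cache_bytes(entries, ids_per_entry):
    """Rough lower bound if each entry stores a list of doc ids."""
    return entries * ids_per_entry * BYTES_PER_DOC_ID


question_estimate_mib = round(ids_only_bytes(10_000_000) / MIB, 1)
print(question_estimate_mib)   # 38.1

# Hypothetical: 50 doc ids kept per cached query result.
per_entry_estimate_mib = round(cache_bytes(10_000_000, 50) / MIB, 1)
print(per_entry_estimate_mib)
```

Under that assumption, 10,000,000 entries is far more than ~38 MB, which may be the missing piece.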
How do I boost a field (not a term) using the standard handler syntax? I
know I can do that with DisMax, but I'm trying to keep myself in the
standard one. Can this be done?
Thanks,
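
One possible approach, sketched in Python: the standard Lucene query syntax allows a boost per clause (field:term^boost), so a field-level boost can be approximated by repeating the term once per field with a different boost. The field names and boosts below are borrowed from a parsed query elsewhere in this thread and are only an example.

```python
def boosted_query(term, field_boosts):
    """Repeat a single-word term once per field, each clause with its own boost."""
    return " ".join(
        f"{field}:{term}^{boost}" for field, boost in field_boosts.items()
    )


query = boosted_query("visa", {"name": 2.0, "text": 0.8})
print(query)   # name:visa^2.0 text:visa^0.8
```

Multi-word input would need quoting or per-word clauses, which this sketch does not handle.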
The lower ps, the better, or vice versa? I'm guessing lower. I think that'll
make the search stricter. Is that correct?
Otis Gospodnetic wrote:
>
> I'm not looking at the docs to double-check this, but the ps option lets
> you boost exact phrase matches higher.
>
> Otis
> --
> Sematext -- http://
Sorry, I meant 1.2 and 1.3. Thanks.
> Date: Sat, 5 Jan 2008 18:30:54 -0800
> From: [EMAIL PROTECTED]
> Subject: Re: java client for java 1.4 solr 2.0
> To: solr-user@lucene.apache.org
>
> Sean,
> There is no solr 2.0 nor 3.0 yet - 1.2 is the last release, while 1.3 is still baking in the oven.
> The on
ja
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: anuvenk <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Sunday, January 6, 2008 1:50:42 AM
Subject: Re: phrase slop param in dismax handler
The lower ps, the better, or vice versa? I'm g
: Is the parsedquery_ToString, the one passed to solr after all the tokenizing
: and analyzing of the query?
yes.
: For the search term 'chapter 7' i have this parsedquery_ToString
...
: I have these synonyms
: chap 7 => bankruptcy
...
: But seem to have a little bit of trouble
: I've been using the solr admin form with debug=true to do some in-depth
: analysis on some results. Could someone explain how to make sense of
: this..This is the debugging info for the first result i got.
there's more to the debugging info than just what's below ... this is
known as a "score
: How does adding a phrase slop in the handler help?
: I tried ps=25 along with some pf values. I assumed it means this: for
: example, for the search term 'child custody battle', documents which have the
: words 'child', 'custody', 'battle' within 25 words of one another will rank
: high. Is that c