Thanks, that worked!
Otis Gospodnetic wrote:
>
> Hi,
>
> Probably by writing your own Similarity (Lucene codebase) and implementing
> the following method with capping:
>
> /** Implemented as sqrt(freq). */
> public float tf(float freq) {
>     return (float) Math.sqrt(freq);
> }
>
> The
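For reference, here is a minimal sketch of the capping approach described
above, written against the Lucene 2.x-era DefaultSimilarity API; the class
name and cap value are illustrative assumptions, not from the thread:

    import org.apache.lucene.search.DefaultSimilarity;

    /** Illustrative only: the default sqrt(freq) curve, saturated at a cap
     *  so very repetitive documents stop gaining score past that point. */
    public class CappedTfSimilarity extends DefaultSimilarity {
        private static final float MAX_FREQ = 10f; // hypothetical cap

        public float tf(float freq) {
            return (float) Math.sqrt(Math.min(freq, MAX_FREQ));
        }
    }

A class like this can then be plugged in via the <similarity> element in
schema.xml.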
On 14-Apr-08, at 6:14 PM, s d wrote:
We have an index of documents from different sources and we want to make
sure the results we display are interleaved from the different sources and
not only ranked based on relevancy. Is there a way to do this?
By far the easiest way is to get the top N
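The reply is cut off above, but here is a hedged sketch of the
get-top-N-per-source-and-interleave idea; all names are illustrative, and
each inner list is assumed to come from a separate per-source query (e.g.
one filter query per source):

    import java.util.ArrayList;
    import java.util.List;

    public class Interleaver {
        /** Round-robin merge of per-source top-N lists, each already
         *  sorted by relevance. */
        public static List<String> interleave(List<List<String>> perSource) {
            List<String> merged = new ArrayList<String>();
            boolean tookOne = true;
            for (int i = 0; tookOne; i++) {
                tookOne = false;
                for (List<String> source : perSource) {
                    if (i < source.size()) {
                        merged.add(source.get(i)); // one hit per source per round
                        tookOne = true;
                    }
                }
            }
            return merged;
        }
    }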
On 13-Apr-08, at 3:25 AM, khirb7 wrote:
it doesn't work; Solr still uses the default value fragsize=100. Also, I am
not able to specify the regex fragmenter, due to this problem of version, I
suppose, or to the way I am declaring it in <highlighting>, because
both of:
Hi khirb,
It might be easi
We have an index of documents from different sources and we want to make
sure the results we display are interleaved from the different sources and
not only ranked based on relevancy. Is there a way to do this?
Thanks,
S.
Matthew -
Thanks for sharing this example. The Zeta site search works well and
provided results to my test queries instantly.
cheers,
--bemansell
On Fri, Apr 11, 2008 at 10:35 AM, Matthew Runo <[EMAIL PROTECTED]> wrote:
> Hello folks!
>
> First, the link: https://zeta.zappos.com (it's a very ear
Doing this well is harder. Giving a spam score to each page and boosting
by a function on this score is probably a stronger tool. I can't remember
where I found it, but it gives a solid spam-score algorithm for several
easy-to-code text analyses and a scoring function. This assumes you
pre-process.
Detectin
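As a concrete (but hypothetical) version of the boost-function idea,
assuming a pre-computed spam_score field and the dismax handler, a request
could look like:

    q=ipod&qt=dismax&bf=recip(spam_score,5,1,1)

recip(x,m,a,b) evaluates to a/(m*x+b), so a higher spam_score shrinks the
boost toward zero; the field name and constants are illustrative.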
I've started implementing something to use fuzzy queries for selected fields
in dismax. The request handler spec looks like this:
exact~0.7^4.0 stemmed^2.0
If anyone has already done this, I'd be glad to use it.
I'm working with an older version of Solr, so I won't have a 1.2 patch
right away
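For context, the clause such a handler would build for exact~0.7^4.0 is
essentially a Lucene FuzzyQuery with minimum similarity 0.7 and boost 4.0.
A minimal sketch against the Lucene 2.x API (field and term are
placeholders):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.FuzzyQuery;
    import org.apache.lucene.search.Query;

    public class FuzzyClauseSketch {
        /** Builds the fuzzy clause implied by "exact~0.7^4.0" for one term. */
        static Query fuzzyClause(String text) {
            FuzzyQuery fq = new FuzzyQuery(new Term("exact", text), 0.7f);
            fq.setBoost(4.0f);
            return fq;
        }
    }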
It's hard to tell from the info given, though something doesn't sound ideal.
Even if Solr's caching doesn't help, with only 4M documents, your Solr search
slaves should be able to keep the whole index in RAM, assuming your index is
not huge.
How large is the index? (GB on disk)
Is it optimized
Hi,
I have some questions about performance for you guys.
So basically I have 2 slave Solr servers and 1 master Solr server, load
balanced, with around 100 requests/second, approx. 50 requests per second per
Solr server.
My index is about 4 million documents, and the average query response time is
0.6 sec
it depends on your definition of "popular" ... if you mean "occurs in a lot
of documents" then take a look at the LukeRequestHandler ... it can give you
info on terms with high frequencies (and you can use a Shingle-based
tokenizer to index "phrases" as terms)
if by popular you mean "occurs in a lo
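For the "occurs in a lot of documents" case, a request along these lines
should work, assuming the stock admin/luke handler is registered (the field
name is illustrative):

    http://localhost:8983/solr/admin/luke?fl=body&numTerms=20

This returns the top terms for the body field ranked by document frequency.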
the Lucene Scorers don't keep track of component scores as they go; the
cumulative score is calculated all at once.
For specific documents your plugin could use the SolrIndexSearcher.explain
method to execute logic that will build up a data structure showing the
intermediate calculations.
-H
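A minimal sketch of that explain approach; the method and variable names
here are illustrative, and SolrIndexSearcher inherits explain() from
Lucene's Searcher:

    import java.io.IOException;
    import org.apache.lucene.search.Explanation;
    import org.apache.lucene.search.Query;
    import org.apache.solr.search.SolrIndexSearcher;

    public class ExplainSketch {
        /** Dumps the intermediate score calculations for one document. */
        static void showScore(SolrIndexSearcher searcher, Query query, int docId)
                throws IOException {
            Explanation exp = searcher.explain(query, docId);
            System.out.println(exp.toString()); // nested tree of component scores
        }
    }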
There is an "Ant script" section on that mySolr page.
But there is no need to use any of that for your project. All you
need is Solr's WAR file and the appropriate Solr configuration files
and you're good to go.
Erik
On Apr 14, 2008, at 9:12 AM, dudes dudes wrote:
thanks Erik,
Thanks for the info. Is the whole result set read into memory, meaning that
the number of matches I can have for a query is limited by my machine's
memory?
Otis Gospodnetic wrote:
>
> Hi,
> rows=N param just tells Solr how many top N results to return. Solr (and
> Lucene, really) still n
Note, this patch has been applied to trunk.
At any rate, here's what I did for this patch:
Downloaded the patch to my Solr patches directory from the JIRA website.
My setup looks like:
./patches
./solr-clean # contains a clean copy of Solr; it never has uncommitted patches on it
On the command line
>c
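(The command sequence is cut off above; a hedged reconstruction of the
workflow just described, with the patch filename as a placeholder:)

    cd solr-clean
    svn up
    patch -p 0 --dry-run -i ../patches/SOLR-NNN.patch   # check it applies cleanly
    patch -p 0 -i ../patches/SOLR-NNN.patch             # then apply for real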
Grant Ingersoll-6 wrote:
>
> I generally do:
>
> svn up (make sure I am up to date)
> patch -p 0 -i <path to patch> [--dry-run]
>
> I usually do the --dry-run first to see if it applies cleanly, then
> drop the --dry-run and apply it if it does.
>
> HTH,
> Grant
>
> On Apr 13, 2008, at 10:37 AM, khirb7 wrote:
>
>>
>> hello
Tricia Williams is working on this problem in
https://issues.apache.org/jira/browse/SOLR-380, and there is a patch you
can try (instructions at
https://issues.apache.org/jira/browse/SOLR-380?focusedCommentId=12541699#action_12541699).
It uses Lucene payloads to carry the page information, and requ
Hello,
how can I get the count of distinct facet_fields?
like numFacetFound in this example:
http://localhost:8983/solr/select?q=xxx&rows=0&facet=true&facet.limit=10&facet.field=county
[example response elided; the archive stripped the XML tags, leaving only the facet counts: 2, 3, 1, 1, 5]
Thanks
On Apr 10, 2008, at 3:48 PM, kirk beers wrote:
Hi Ryan,
I still can't seem to get my Solr cores, core0 and core1, to accept new
documents. I changed the appropriate code in the Perl client to accommodate
the core as you mentioned in the previous email. I am able to delete
docs. Is there
I have extracted text from .pdf files, and I also
inserted the page numbers of the .pdf file into the text. My
document looks something like:
..Some Text..
..Some Text..
..
...
I indexed my data using Solr and I am making
highli
Hi,
rows=N param just tells Solr how many top N results to return. Solr (and
Lucene, really) still needs to find all documents that match the query and then
score them (and optionally sort them). The more documents and matches you
have, the more time the query will take.
Otis
--
Sematext --
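To make the cost concrete: even when only the top 10 hits are requested,
Lucene scores every matching document; an internal priority queue just keeps
the 10 best. A sketch against the Lucene 2.x API (names illustrative):

    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TopDocs;

    public class TopNSketch {
        /** Every match is scored; only the 10 best survive the queue. */
        static TopDocs topTen(IndexSearcher searcher, Query query)
                throws IOException {
            return searcher.search(query, null, 10);
        }
    }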
It seems that response time to a query is linear with the size of the result
set, even if I only ever want the first 10 hits back.
Testing I did -
1 million documents that have "feature1", all with the same score
- query time = 3 seconds to get the first 10 hits
10 million documents t
thanks Erik,
Basically I have used the build file from Solr, not from that page... I have
had a look and couldn't really find their build.xml file!
thanks
ak
> From: [EMAIL PROTECTED]
> Subject: Re: issues with solr
> Date: Mon, 14 Apr 2008 08:54:39 -
The mysolr.dist target is defined in the Ant file on that page. My
guess is that you were not using the Ant build file bits there.
My take is that the mySolr page is not quite what folks should be
cloning for incorporation of Solr into their application. Maybe that
page should be removed
Hello there
I'm new to Solr
I'm trying to deploy the example under http://wiki.apache.org/solr/mySolr.
However, every time I issue "ant mysolr.dist" it generates:
Buildfile: build.xml
BUILD FAILED
Target "mysolr.dist" does not exist in the project "solr".
I'm running Ubuntu Gutsy an