Shalin Shekhar Mangar wrote:
The implementation is a bit more complicated.
1. Read all tokens from the specified field in the solr index.
2. Create n-grams of the terms read in #1 and index them into a separate
Lucene index (spellcheck index).
3. When asked for suggestions, create n-grams of the
Shalin Shekhar Mangar wrote:
If onlyMorePopular=true, then the algorithm finds tokens which have greater
frequency than the searched term. Among these terms, the one which is
closest (by edit distance) is returned.
Okay, this is a bit weird, but I think I got it now. Let me try to
explain it u
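The selection rule quoted above (keep only more frequent tokens, then take the closest by edit distance) can be sketched as follows. This is a minimal illustration, not Solr's or Lucene's actual code: the names `edit_distance` and `suggest` and the frequency table are hypothetical, and the n-gram candidate lookup is skipped.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete, substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # delete from a
                            curr[j - 1] + 1,     # insert into a
                            prev[j - 1] + cost)) # substitute or match
        prev = curr
    return prev[-1]


def suggest(term, freqs, only_more_popular):
    """Return the closest candidate by edit distance, or None.

    freqs maps token -> document frequency (hypothetical data).
    With only_more_popular=True, candidates must be strictly more
    frequent than the searched term.
    """
    candidates = [w for w in freqs if w != term]
    if only_more_popular:
        base = freqs.get(term, 0)
        candidates = [w for w in candidates if freqs[w] > base]
    if not candidates:
        return None
    # Ties on distance are broken in favour of the more frequent token.
    return min(candidates, key=lambda w: (edit_distance(term, w), -freqs[w]))


# Made-up frequencies, chosen so the two modes disagree.
freqs = {"gran": 13, "grand": 5, "grande": 32}
more_popular = suggest("gran", freqs, only_more_popular=True)
closest = suggest("gran", freqs, only_more_popular=False)
```

With these made-up numbers, the more-popular mode skips the closer but rarer token, which is the behaviour discussed in this thread.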
Shalin Shekhar Mangar wrote:
And to come back to my last question: There seems to be no case in which
"onlyMorePopular=false" makes sense (provided Grant's assumption is
correct). Do you see one?
Here's a use-case -- you provide a mis-spelled word and you want the closest
suggestion by edit dis
Shalin Shekhar Mangar wrote:
The end goal is to give spelling suggestions. Even if it gave less
frequently occurring spelling suggestions, what would you do with it?
To give you an example:
We have an index for computer games. One title is "gran turismo". The
word "gran" is less frequent in the
Grant Ingersoll wrote:
I believe the reason is b/c when onlyMP is false, if the word itself is
already in the index, it short circuits out. When onlyMP is true, it
checks to see if there are more frequently occurring variations.
This would mean that onlyMorePopular=false isn't useful at all. If
Hi Grant,
thanks for your help.
I have just one more question:
BTW, one workaround is to simply create an index from your file and then
use the IndexBasedSpellChecker. Each line equals one document. You
could even assign weights that way.
In the solrconfig.xml there is a line
field
Can I u
Hello,
I have another question concerning the spell checking mechanism.
Setting onlyMorePopular=true and using the parameters
spellcheck=true&spellcheck.q=gran&q=gran&spellcheck.onlyMorePopular=true
I get the result
1
0
4
13
32
grand
true
which is oka
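For comparing both settings side by side, the query string above can be built programmatically. This is a sketch only: the helper name `spellcheck_query` is hypothetical, and the host, port and handler path in the usage note are assumptions about a local test setup.

```python
from urllib.parse import urlencode


def spellcheck_query(term, only_more_popular, extended_results=False):
    """Build the query string for a spell-check request.

    Parameter names are the ones used in this thread; the function
    itself is a hypothetical helper.
    """
    params = [
        ("q", term),
        ("spellcheck", "true"),
        ("spellcheck.q", term),
        ("spellcheck.onlyMorePopular", str(only_more_popular).lower()),
    ]
    if extended_results:
        # Frequency and word are only printed with extended results.
        params.append(("spellcheck.extendedResults", "true"))
    return urlencode(params)


query_true = spellcheck_query("gran", True)
query_false = spellcheck_query("gran", False)
```

The result would be appended to the request handler URL, for example `http://localhost:8983/solr/select?` on a default local install (an assumption, adjust to your setup).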
Hello,
Are you sending in the same query to both? Frequency and word only get
printed when extendedResults == true. correctlySpelled only gets
printed when there is Index frequency information. For the
FileBasedSpellChecker, there is no Frequency information, so it isn't
returned.
Yes, I
Hello,
I'm trying to learn how to use the spell checkers of solr (1.3). I found
out that FileBasedSpellChecker and IndexBasedSpellChecker produce
different outputs.
IndexBasedSpellChecker says
1
0
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png
https://issues.apache.org/jira/secure/attachment/12393936/logo_remake.jpg
https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_
Justin Knoll wrote:
We plan to attempt to rewrite the snappuller (and possibly other
distribution scripts, as required) to eliminate this dependency on SSH.
I thought I'd ask the list in case anyone has experience with this same
situation or any insights into the reasoning behind requiring SSH ac
If you could live with a cap of 2B on message id, switching to type
"int" would decrease the memory usage to 4 bytes per doc (presumably
you don't need range queries?)
I haven't found exact definitions of the fieldTypes anywhere. Does
"integer" span the common range from -2^31 to 2^31-1?
And t
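As a rough check of the numbers in this exchange: assuming the field type is backed by a signed 32-bit integer (as a Java int is), the range and the per-document sort cost work out as below. The document count is only an example figure, and the helper name is hypothetical.

```python
import struct

INT_BYTES = 4            # "4 bytes per doc", as quoted above
INT_MIN = -(2 ** 31)     # lower bound of a signed 32-bit integer
INT_MAX = 2 ** 31 - 1    # upper bound, roughly the "cap of 2B"


def sort_cache_bytes(num_docs, bytes_per_doc=INT_BYTES):
    """Estimated memory for one sort-field entry per document."""
    return num_docs * bytes_per_doc


# Both bounds fit into exactly four bytes.
packed_min = struct.pack(">i", INT_MIN)
packed_max = struct.pack(">i", INT_MAX)

# Example: an index of 10 million documents (illustrative figure).
estimate = sort_cache_bytes(10_000_000)
```

This only estimates the raw array size; any overhead in the cache itself is not modelled.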
Hi David,
We're running Solr 1.1 and we're seeing intermittent cases where
Solr stops responding to HTTP requests. It seems like the listener
on port 8983 just doesn't respond.
When we started using solr we encountered the same problem. We are
currently running solr 1.0 (!) with tomcat 5.5 o
Our Solr system has been up for a few days now. You can find it at
http://www.booklooker.de/
I'm sorry we only have a German user interface, but if you want to
try out our system you can just fill out some fields in our search form
and press "suchen" on the right side. We are "book brokers" an
Hi,
Chris Hostetter wrote:
This is a fairly typical Lucene issue (ie: not specific to Solr)...
Ah, I see. I should really pay more attention to Lucene. But when
working with Solr I sometimes forget about the underlying technology.
Sorting on a field requires building a FieldCache for every d
Hello,
I have a new problem with OutOfMemory errors.
As I reported before, we have an index with more than 10 million
documents and 23 fields. Recently I added a new field which we will only
use for sorting purposes (by "adding" I mean building a new index). But
it turned out that every query
Talking about configuration and system properties: is it possible to set
the log level of Solr's logger from a system property? Or is there any
other way to change this level during the start of the servlet container?
Thanks,
Marcus
Erik Hatcher wrote:
I believe that Solr indexes one document at a time; each document
requires a separate HTTP POST.
Actually adding multiple documents per POST is possible
But deleting multiple documents with just one POST is not possible,
right? Is there a special reason for that or is it be
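Adding several documents in one POST, as mentioned above, means wrapping multiple doc elements in a single add message. A minimal sketch of building such a body follows. The helper name and the field names are hypothetical, and sending the request is left out.

```python
import xml.etree.ElementTree as ET


def build_add(docs):
    """Serialize a list of {field: value} dicts into one <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field_el = ET.SubElement(doc_el, "field", name=name)
            field_el.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(add, encoding="unicode")


# Two example documents; "id" and "title" are placeholder field names.
body = build_add([
    {"id": "1", "title": "first"},
    {"id": "2", "title": "second"},
])
```

Using an XML library rather than string concatenation takes care of escaping characters such as `&` and `<` in field values.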
Chris Hostetter wrote:
correct .. we thought we could implement something that looked at the war
file name easily ... but then we were set straight -- there is no portable
way to do that, hence we came up with the current JNDI plan which isn't
quite as "out of the box" as we had hoped, but it has t
Yonik Seeley wrote:
I am hoping I can change the default location for each webapp. Thanks!
It's not yet possible, but see this thread:
http://www.mail-archive.com/solr-dev@lucene.apache.org/msg00298.html
If I see it right, if I just rename the webapp to, say, "solrfoo" then
it still uses the s
On 5/4/06, I wrote:
> From my point of view it looks like this: Revision 393957 works while
> the latest revision causes problems. I don't know what part of the
> distribution causes the problems but I will try to find out. I think a
> good start would be to find out which was the first revision no
Yonik Seeley wrote:
> If you start from a normal tomcat distribution, we will be able to
> eliminate that difference.
Yes, I finally got Solr working with Tomcat.
But there are still two minor problems.
The first appears when I try to get the statistics page.
I'm getting this error message:
org.a
Chris Hostetter wrote:
This is because building a full Solr distribution from scratch requires
that you have JUnit. But it is not required to run Solr.
Ah, I see. That was a very valuable hint for me.
I was able now to compile an older revision (393957). Testing this
revision I was able to dele
Yonik Seeley wrote:
Is your problem reproducible with a test case you can share?
Well, you can get the configuration files. If you ask for the data, this
could be a problem, since this is "real" data from our production
database. The amount of data needed could be another problem.
You could al
Hello,
deleting or updating documents is still not possible for me so now I
tried to build a completely new index. Unfortunately this didn't work
either. Now I'm getting OOM after inserting slightly more than 20,000
documents to the new index.
To me this looks as if a bug has been introduced
Chris Hostetter wrote:
> this is off the subject of the heap space issue ... but if the id changes,
> then maybe it shouldn't be the uniqueId of your index? .. your code must
> have some way of recognizing that article B with id 222 is a changed
> version of article A with id 111 (otherwise how woul
Yonik Seeley wrote:
Yes, on a delete operation. I'm not doing any commits until the end of
all delete operations.
I assume this is a delete-by-id and not a delete-by-query? They work
very differently.
Yes, all queries are delete-by-id.
If you are first deleting so you can re-add a newer ve
Chris Hostetter wrote:
> interesting .. are you getting the OutOfMemory on an actual delete
> operation or when doing a commit after executing some deletes?
Yes, on a delete operation. I'm not doing any commits until the end of
all delete operations.
After reading this I was curious if using commi
Chris Hostetter wrote:
How big is your physical index directory on disk?
It's about 2.9G now.
Is there a direct connection between size of index and usage of ram?
Your best bet is to allocate as much ram to the server as you can.
Depending on how full your caches are, and what hitratios you ar
Yonik Seeley wrote:
>I think you are probably right about Jetty timing out the request.
>Solr doesn't implement timeouts for requests, and I haven't seen this
>behavior with Solr running on Resin.
>
>You could try another app server like Tomcat, or perhaps figure out if
>the Jetty timeout is config
Hello,
when doing a commit or optimize, the operation takes quite a long time (in my
test case at least some minutes). When I submit the command via curl, I
get the response "curl: (52) Empty reply from server" though solr is
still working (as I can see from the process list and the admin
interface). I tr
Yonik Seeley wrote:
> OK, I think I fixed this bug. Haven't added a test case yet...
In our test case everything works properly now.
Thanks for the quick bugfix!
Marcus
> Yes, I believe the Wiki has an example like this (a uniqueKey field
> not named "id")
Right, I should have looked there, too.
> > But after a I found the number of documents unchanged
> > in the stats.
> What stat? maxDoc may be unchanged since it doesn't reflect deleted
> documents that haven
Hello,
I have a problem deleting documents from the index.
In the tutorial "SP2514N" is used as an
example for deleting. I was wondering if "" is some kind of
keyword or the name of a field (in the example, a unique field
named "id" is used). In my config I have the line
bookID
making bookID (ty
> Solr looks in the current working directory for the solrconf
> directory, so it depends where that ends up when tomcat is started.
Meanwhile I found out that tomcat is located in /usr/share/tomcat5 and
that there is a bin-directory in it, which I was searching for. A
handful of links are pointin
Hi,
I have tomcat5 running under Linux (Debian). I think that
my configuration may be wrong, because I can't get Solr running.
Yonik Seeley wrote:
>the layout should look something like this:
>
>tomcat/webapps/solr.war
>tomcat/solrconf/solrconfig.xml, schema.xml, etc
>tomcat/bin/startup.sh
>
>t
36 matches