Hi,
This is in continuation of my previous mail.
I am using SolrInputDocument to perform the index operation.
So, my question: if a field to be indexed contains multiple values,
does SolrInputDocument index each word of that field separately,
or does it index the set of words as a whole?
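For what it's worth, a minimal SolrJ-style sketch of the indexing side
(using a current SolrJ API; the 2007-era client class differed, but
SolrInputDocument.addField works the same way, and the core URL below is
made up). The whole string is sent as one value; whether it is split into
words is decided by the field's analyzer in schema.xml, not by
SolrInputDocument:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class IndexDemo {
        public static void main(String[] args) throws Exception {
            // Hypothetical Solr URL; adjust to your setup.
            SolrClient solr =
                new HttpSolrClient.Builder("http://localhost:8983/solr/core1").build();
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "1");
            // One value, three words; a tokenized field type turns this
            // into the terms "test", "solr", "search" at index time.
            doc.addField("field1", "Test solr search");
            solr.add(doc);
            solr.commit();
            solr.close();
        }
    }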
Hi,
Consider the scenario: I have indexed a document with a field1 having the
value "Test solr search " (multiple words).
When I perform the keyword search "Test solr search" I do get
results,
whereas when I search for "Test", I don't get any results.
Any quick inputs?
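This usually means field1 is an untokenized "string" type, which only
matches the exact full value, rather than a tokenized "text" type. A quick
way to check is to run the value through an analyzer and look at the
tokens; a minimal sketch with a current Lucene API (the 2007-era
TokenStream API differed, but the idea is identical):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class AnalyzeDemo {
        public static void main(String[] args) throws Exception {
            Analyzer analyzer = new StandardAnalyzer();
            try (TokenStream ts = analyzer.tokenStream("field1", "Test solr search")) {
                CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
                ts.reset();
                while (ts.incrementToken()) {
                    System.out.println(term); // prints: test, solr, search
                }
                ts.end();
            }
        }
    }

If the field is analyzed like this, a search for "Test" matches the token
"test"; if it is a plain string field, only the exact stored value
"Test solr search " (trailing space included) matches.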
Tomcat unpacks the war into the webapps directory based on the context name
anyway...
What was the original thinking behind not having solr/home set in the web.xml
-- seems like an easier way to deal with this.
I would imagine most people are more familiar with setting params in web.xml
than with JNDI.
: It works excellently in Tomcat 6. The toughest thing I had to deal with is
: discovering that the environment variable in web.xml for solr/home is
: essential. If you skip that step, it won't come up.
no, there's no reason why you should need to edit the web.xml file ... the
solr/home property
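for the record, the two common ways to point at it without touching the war:
a JNDI entry for java:comp/env/solr/home configured in the container (in
Tomcat, via a context fragment rather than the webapp's own web.xml), or the
solr.solr.home system property, e.g.:

    java -Dsolr.solr.home=/path/to/solrhome -jar start.jar

(path made up; the point is that the war itself stays untouched.)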
: It's a trivial fix, and it seems like it's already been done in trunk:
:
:
http://svn.apache.org/viewvc/lucene/solr/trunk/src/scripts/commit?r1=543259&r2=555612&view=patch
:
: The change has not been applied to 1.2. It might be nice if it were.
i'm not sure what you mean by "applied to 1.2"
: I recently set up Solr with distribution on a couple of servers. I just
: learned that our network policies do not permit us to use SSH with
: passphraseless keys, and the snappuller script uses SSH to examine the master
: Solr instance's state before it pulls the newest index via rsync.
you ma
: It does make some sense, but I'm not sure that it should be blindly analyzed
: without adding logic to handle certain cases (like the QueryParser does).
: What happens if the analyzer produces two tokens? The spellchecker has to
: deal with this appropriately. Spell checkers should be able to
Thanks a ton, that worked
-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 04, 2007 3:08 PM
To: solr-user@lucene.apache.org
Subject: Re: SOLR sorting - question
Kasi Sankaralingam wrote:
> Do I need to select the fields in the query that I am trying
Is anyone managing solr projects with maven? I see:
https://issues.apache.org/jira/browse/SOLR-19
but that is >1 year old
If someone has a current pom.xml, can you post it on SOLR-19?
I just started messing with maven, so I don't really know what I am
doing yet.
thanks
ryan
Kasi Sankaralingam wrote:
Do I need to select the fields in the query that I am trying to sort on? For
example, if I want to sort on update date, do I need to select that field?
I don't think so... are you getting an error?
I run queries like:
/select?q=*:*&fl=name&sort=added desc
without problems.
I don't think you have to. Just try the query on the REST interface and you
will know.
On Dec 5, 2007 9:56 AM, Kasi Sankaralingam <[EMAIL PROTECTED]> wrote:
> Do I need to select the fields in the query that I am trying to sort on?
> For example, if I want to sort on update date, do I need to se
Do I need to select the fields in the query that I am trying to sort on? For
example, if I want to sort on update date, do I need to select that field?
Thanks,
Hi,
I had to deal with this kind of side effect regarding multi-word synonyms.
We installed Solr on our project, which uses synonyms extensively: a big
list that sometimes brings out wrong matches, like the one
noticed by Anuvenk,
for instance
> dui => drunk driving defense
> or
> dui,drunk
It seems to me that another way to write the formula -- borrowing
Python syntax -- is:
4 * numDocs + 38 * len(uniqueTerms) + 2 * sum([len(t) for t in uniqueTerms])
That's 4 bytes per document, plus 38 bytes per term, plus 2 bytes *
the sum of the lengths of the terms. (Numbers taken from
http://m
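Plugging in made-up numbers to make the units concrete: 1M documents sorted
on a field with 100K unique terms averaging 10 characters each gives

    4 * 1,000,000      =  4.0 MB  (one int per doc)
    38 * 100,000       =  3.8 MB  (per-term overhead)
    2 * 100,000 * 10   =  2.0 MB  (term chars, 2 bytes each)
                 total ~= 9.8 MB  per sorted field

and per the doubling rule mentioned elsewhere in the thread, budget about
twice that while a warming searcher is open.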
Thanks for the suggestion, Dennis. I decided to implement this as you
described on my collection of about 400,000 documents, but I did not
receive the results I expected.
Prior to putting the indexes on a tmpfs, I did a bit of benchmarking and
found that it usually takes a little under two seconds
> See Lucene's FieldCache.StringIndex
To understand just what's getting stored for each string field, you
may also want to look at the createValue() method of the inner Cache
object instantiated as stringsIndexCache in FieldCacheImpl.java (line
399 in HEAD):
http://svn.apache.org/viewvc/lucene/ja
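If memory serves, the structure being populated there is roughly this
(paraphrased from Lucene, not a verbatim copy):

    // Sketch of Lucene's FieldCache.StringIndex:
    public static class StringIndex {
        public final int[] order;     // order[docId] -> index into lookup[]
        public final String[] lookup; // the unique terms, in sort order
    }

so each document costs one int and each unique term one String, which is
where the per-doc and per-term parts of the formula come from.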
On Dec 4, 2007 3:11 PM, Norskog, Lance <[EMAIL PROTECTED]> wrote:
> "String[nTerms()]": Does this mean that you compare the first term, then
> the second, etc.? Otherwise I don't understand how to compare multiple
> terms in two records.
Lucene sorting only supports a single term per document for
"String[nTerms()]": Does this mean that you compare the first term, then
the second, etc.? Otherwise I don't understand how to compare multiple
terms in two records.
Lance
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Tuesday, Decemb
int[maxDoc()] + String[nTerms()] + size_of_all_unique_terms.
Then double that to allow for a warming searcher.
This is great, but can you help me parse this? Assume 8M docs and I'm
sorting on an int field that is unix time (seconds since epoch). For
the purposes of the experiment assume eve
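Rough, back-of-the-envelope arithmetic for that case, assuming the sint sort
goes through the String FieldCache and the per-second timestamps are nearly
all unique (so nTerms ~= maxDoc = 8M):

    int[8M]           ~=  32 MB     (order array)
    String[8M]        ~=  32-64 MB  (lookup array of references)
    8M term strings   ~= 300 MB     (~38 bytes each of overhead)
    plus 2 bytes per character of each encoded term

so on the order of 400+ MB for one searcher, doubled while warming. A
per-second timestamp is a painful sort key; rounding it to a coarser
granularity (minutes, hours) shrinks nTerms, and the memory, dramatically.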
Thanks!
I've seen a few formulae like this go by over the months. Can someone
please make a wiki page for memory and processing estimation with
locality properties? Or is there a Lucene page we can use?
Lance
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behal
Oops, I should explain. *:* means all records. This trick puts a
positive query in front of your negative query, and that allows it to
work.
Lance
-Original Message-
From: Rob Casson [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 04, 2007 7:44 AM
To: solr-user@lucene.apache.org
Subj
Thanks, this is a very interesting idea. But my index folder is about 30 GB,
and the most RAM I could get is probably 16 GB. The rest could be in swap, but
I think that would kill the whole idea. Maybe it would be useful to put just
some files from the index folder into RAM, if that is possible at all?
- Origi
Any suggestions are helpful to me, even general ones. Here is the info from my
index:
How big is the index on disk (the most important files are .frq,
and .prx if you do phrase queries)?
- Total index folder size is 30.7 GB
- .frq is 12.2 GB
- .prx is 6 GB
How big and what exactly is a record in
On 4-Dec-07, at 8:43 AM, Evgeniy Strokin wrote:
Hello,...
we have a 110M-record index under Solr. Some queries take a while,
but we need sub-second results. I guess the only solution is caching
(something else?)...
We use the standard LRUCache. The docs say (as far as I understood)
that it lo
Wow. So I feel stupid. Sorry to waste your time =p
--Matthew
On Dec 4, 2007, at 10:36 AM, Ryan McKinley wrote:
did you try 'ant clean' before running 'ant dist'?
the method signature for SortSpec changed recently
Matthew Runo wrote:
Ooops, I get this error when I try to search an index
did you try 'ant clean' before running 'ant dist'?
the method signature for SortSpec changed recently
Matthew Runo wrote:
Ooops, I get this error when I try to search an index with a few
documents in it.
ie..
http://dev14.zappos.com:8080/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&
On 4-Dec-07, at 8:10 AM, Brian Carmalt wrote:
Hello,
I am also fighting with heap exhaustion, however during the
indexing step. I was able to minimize, but not fix the problem
by setting the thread stack size to 64k with "-Xss64k". The minimum
size is OS-specific, but the VM will tell
you
Ooops, I get this error when I try to search an index with a few
documents in it.
ie..
http://dev14.zappos.com:8080/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
caching : true
numDocs : 5
maxDoc : 5
readerImpl : MultiReader
readerDir : org.apache.lucene.store.FSDirectory@/opt/so
Hello!
I'm trying to make use of SOLR 1.3, svn trunk, and get the following
error.
SEVERE: java.lang.NoSuchMethodError:
org.apache.solr.search.QParser.getSort(Z)Lorg/apache/solr/search/QueryParsing$SortSpec;
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent
The first step is to look at what searches are taking too long, and
see if there is a way to structure them so they don't take as long.
The whole index doesn't have to be in memory to get good search
performance, but 100M documents on a single server is big. We are
working on distributed search (
It works excellently in Tomcat 6. The toughest thing I had to deal with is
discovering that the environment variable in web.xml for solr/home is
essential. If you skip that step, it won't come up.
<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-type>java.lang.String</env-entry-type>
  <env-entry-value>F:\Tomcat-6.0.14\webapps\solr</env-entry-value>
</env-entry>
- Origi
One way to do this if you are running on Linux is to create a tmpfs
(which is RAM) and mount the filesystem there. Then your
index acts normally to the application but is essentially served from
RAM. This is how we serve the Nutch Lucene indexes on our web search
engine (www.visvo
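For anyone who wants to try it, a sketch of the idea on Linux (mount point
and size are made up; size the tmpfs to hold the whole index):

    mkdir -p /mnt/solr-index
    mount -t tmpfs -o size=32g tmpfs /mnt/solr-index
    cp -r /var/solr/data/index /mnt/solr-index/
    # point Solr's dataDir at the tmpfs copy

The obvious caveat: tmpfs contents vanish on reboot, so the on-disk copy has
to stay authoritative.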
Hello,...
we have a 110M-record index under Solr. Some queries take a while, but we need
sub-second results. I guess the only solution is caching (something else?)...
We use the standard LRUCache. The docs say (as far as I understood) that it
loads a view of the index into memory and next time works with
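For reference, those caches are configured in solrconfig.xml; an illustrative
block (sizes invented, tune to your query mix):

    <filterCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
    <queryResultCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="1024"/>
    <documentCache class="solr.LRUCache" size="16384" initialSize="4096"/>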
On Dec 4, 2007, at 10:37 AM, Wagner,Harry wrote:
Anyone know of a problem with faceting on more than 1 field and
using a
different facet.limit for each field? I'm using a query like:
...facet=true&facet.mincount=1&facet.limit=15&facet.field=fpn&facet.limit=-1&facet.field=ln&facet.limit=
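If the goal is a different limit per field, repeated global facet.limit
params will just override one another; the per-field override form (using
the same field names as above) should do it:

    ...facet=true&facet.mincount=1&facet.field=fpn&f.fpn.facet.limit=15&facet.field=ln&f.ln.facet.limit=-1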
Hello,
I am also fighting with heap exhaustion, however during the indexing
step. I was able to minimize, but not fix the problem
by setting the thread stack size to 64k with "-Xss64k". The minimum size
is OS-specific, but the VM will tell
you if you set the size too small. You can try it, it
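A hypothetical invocation combining the smaller stack with an explicit heap
ceiling (standard Sun JVM flags):

    java -Xss64k -Xms512m -Xmx1024m -jar start.jar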
On Dec 4, 2007 10:59 AM, Brian Whitman <[EMAIL PROTECTED]> wrote:
> >
> > For faceting and sorting, yes. For normal search, no.
> >
>
> Interesting you mention that, because one of the other changes since
> last week besides the index growing is that we added a sort to an
> sint field on the queri
For faceting and sorting, yes. For normal search, no.
Interesting you mention that, because one of the other changes since
last week besides the index growing is that we added a sort to an
sint field on the queries.
Is it reasonable that a sint sort would require over 2.5GB of heap on
On Dec 4, 2007 10:46 AM, Brian Whitman <[EMAIL PROTECTED]> wrote:
> Are there 'native' memory requirements for solr as a function of
> index size?
For faceting and sorting, yes. For normal search, no.
-Yonik
This may be more of a general Java question than a Solr one, but I'm a bit
confused.
We have a largish solr index, about 8M documents, the data dir is
about 70G. We're getting about 500K new docs a week, as well as about
1 query/second.
Recently (when we crossed about the 6M threshold) resin has
i'm using this:
*:* -[* TO *]
which is what lance suggested..works just fine.
fyi: https://issues.apache.org/jira/browse/SOLR-381
On Dec 3, 2007 8:09 PM, Norskog, Lance <[EMAIL PROTECTED]> wrote:
> Wouldn't this be: *:* AND "negative query"
>
>
> -Original Message-
> From: [EMAIL PROTEC
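Spelled out with a hypothetical field name: a purely negative query such as
-price:[* TO *] matches nothing on its own in this Solr version, but
prefixing the match-all query makes it work:

    q=*:* -price:[* TO *]    (all documents with no value in price)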
On Dec 4, 2007 5:02 AM, Maciej Szczytowski
<[EMAIL PROTECTED]> wrote:
> Hi, I use a Solr 1.1 application for indexing Russian documents. Sometimes
> I get search-result docs with invalid characters.
>
> For example I've indexed "иго" but search returned "и��о". It's strange
> because something
On Dec 4, 2007 5:39 AM, Dilip.TS <[EMAIL PROTECTED]> wrote:
> When I use both the keyword search and the facet search together in the same
> search operation,
> I don't get any results, whereas if I perform them separately, I do get
> back the results.
add debugQuery=on to your requests (and chan
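For instance (field names hypothetical), compare the debug output of the
combined request against the separate ones:

    /select?q=title:testing&facet=true&facet.field=publisher&facet.field=tag&debugQuery=on

and look at the parsedquery section to see which clause is eliminating the
results.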
Hi,
The default field separator seems to be a '.' when highlighting
multi-value fields. Can this be overridden in 1.2 to another character?
Thanks!
harry
Hi,
Consider the following scenario: I need to use keyword search on the
fields title and description, with the keyword typed as "testing",
and the search on the fields price, publisher and tag, with
publisher and tag selected for facet searching.
If the constructed queryStr
I can't answer the question, but I *can* guarantee that
the people who can will give you *much* better
responses if you include some details. Like which
analyzers you use, how you submit the query,
samples of the two queries that work and the
one that doesn't.
Imagine you're on the receiving end i
Hi,
When I use both the keyword search and the facet search together in the same
search operation,
I don't get any results, whereas if I perform them separately, I do get
back the results.
Is it a constraint from the SOLR point of view?
Thanks in advance.
Regards,
Dilip TS
Hi, I use a Solr 1.1 application for indexing Russian documents. Sometimes
I get search-result docs with invalid characters.
For example I've indexed "иго" but search returned "и��о". It's strange
because something has changed 2 bytes into 6 bytes.
иго - D0 B8 D0 B3 D0 BE
и��о - D0 B8 EF