Hi All,
I want to change the duplicate content behavior in solr. What I want to
do is:
1) I don't want duplicate content.
2) I don't want to overwrite old content with new one.
Means, if I add duplicate content in solr and the content already
exists, the old content should not be overwritten.
You must do a check before adding documents.
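A minimal sketch of such a check, assuming the uniqueKey field is named `id`, Solr runs at the example URL `http://localhost:8983/solr/select`, and the JSON response writer (`wt=json`) is enabled. The helpers `exists_query_url`, `already_indexed` and `add_if_absent` are hypothetical names for illustration, not part of Solr:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed location of the select handler (the Solr example setup).
SOLR_SELECT = "http://localhost:8983/solr/select"


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def exists_query_url(doc_id, key_field="id"):
    """Build a query that only counts documents with the given key."""
    # Quote the value as a phrase and escape backslashes and quotes.
    escaped = str(doc_id).replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": '%s:"%s"' % (key_field, escaped), "rows": 0, "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)


def already_indexed(doc_id, fetch=http_get):
    """Return True if the index already holds a document with this key."""
    body = json.loads(fetch(exists_query_url(doc_id)))
    return body["response"]["numFound"] > 0


def add_if_absent(doc_id, add_document, fetch=http_get):
    """Call add_document(doc_id) only when the key is not in the index yet."""
    if already_indexed(doc_id, fetch):
        return False  # keep the old content untouched
    add_document(doc_id)
    return True
```

Note that the check and the add are two separate requests, so two clients adding the same key at the same moment could still both pass the check; documents added but not yet committed are also not visible to the query.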
On Tue, Jul 15, 2008 at 1:15 PM, Sunil <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I want to change the duplicate content behavior in solr. What I want to
> do is:
>
> 1) I don't want duplicate content.
> 2) I don't want to overwrite old content with new on
On Tue, 15 Jul 2008 13:15:41 +0530
"Sunil" <[EMAIL PROTECTED]> wrote:
> 1) I don't want duplicate content.
SOLR uses the field you define as the unique field to determine whether a
document should be replaced or added. The rest of the fields are in your hands.
You could devise a setup whereby the
Hi,
Apologies if you are receiving this a second time; I am having a tough time
with the mail server.
I take a user-entered query as is and run it with the dismax query handler.
The document fields have been filled from structured data, where different
fields have different attributes like number of beds, num
Norberto Meijome wrote:
>> 2) I don't want to overwrite old content with new one.
>>
>> Means, if I add duplicate content in solr and the content already
>> exists, the old content should not be overwritten.
>
> before inserting a new document, query the index - if you get a result back,
> then
On Tue, 15 Jul 2008 10:48:14 +0200
Jarek Zgoda <[EMAIL PROTECTED]> wrote:
> >> 2) I don't want to overwrite old content with new one.
> >>
> >> Means, if I add duplicate content in solr and the content already
> >> exists, the old content should not be overwritten.
> >
> > before inserting a n
Chris,
On Sat, Jan 26, 2008 at 2:30 AM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
> : I have the synonym filter only at query time coz i can't re-index data (or
> : portion of data) everytime i add a synonym and a couple of other reasons.
>
> Use cases like yours will *never* work as a query time
swarag wrote:
>
> Knowing the Lucene struggles with multi-word query-time synonyms, my
> question is, does this also affect index-time synonyms? What other
> alternatives do we have if we require there to be multiple word synonyms?
>
No the multiple word problem doesn't happen with index synon
Thanks! I think I fixed the issue and it's working well :)
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: RE: Solr searching issue..
> Date: Mon, 14 Jul 2008 20:12:00 +
>
> Copy field dest="text". I am not sure if u can copy int
On Jul 15, 2008, at 4:45 AM, Preetam Rao wrote:
What are your thoughts on having one more request handler like
dismax, but
which uses a sub-phrase query instead of dismax query ?
It'd be better to just implement a QParser(Plugin) such that the
StandardRequestHandler can use it (&defType=di
I'm using Solr with a Drupal site, and one of the fields in the schema is
"type".
In my example development site, searching for the word "fish" returns 2
documents, one type='story', and the other type='idea'.
If I filter by type:idea then I get 9 results, the correct first result,
followed by 8
I agree. If we do decide to implement another kind of request handler, it
should be through the StandardRequestHandler's defType parameter, which selects
the registered QParser that generates the appropriate queries for Lucene.
Preetam
On Tue, Jul 15, 2008 at 3:59 PM, Erik Hatcher <[EMAIL PR
Hi Matt,
Other than applying one more fq, does everything else remain the same between
the two queries, such as q and all other parameters?
My understanding is that fq is an intersection with the set of results
returned from q, so it should always be a subset of the results returned from q.
So if one uses ju
Yes, the same, except for the filter.
For example:
http://localhost:8983/solr/select?q=fish
returns:
etc (followed by 2
docs)
http://localhost:8983/solr/select?q=fish+type:idea
returns:
. (followed by 9
docs)
-Matt
Preetam Rao wrote:
>
> Hi Matt,
>
> Other than applying o
Hi,
Is it preferable to compress each and every field? If not, why?
How exactly does it help?
--
View this message in context:
http://www.nabble.com/which-type-of-fields-are-to-be-compressed-tp18464056p18464056.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
In databases, sorting based on text fields is faster and preferable, if I am
not wrong.
Similarly, which types of fields should be chosen for sorting in Solr? How are
ties broken?
Sorry for any mistakes.
Thank you
--
View this message in context:
http://www.nabble.com/solr%3Asorting
Hi Matt,
When I say filter, I meant q=fish&fq=type:idea
What you are trying is a boolean OR: defaultSearchField:fish OR
type:idea.
It's not a filter, it's an OR, so you will get a union of the results.
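To make the difference concrete, here is a small sketch that builds both request URLs. It assumes the example URL `http://localhost:8983/solr/select` from this thread and a default query operator of OR; `select_url` is a hypothetical helper, not a Solr API:

```python
from urllib.parse import urlencode

# Assumed select handler URL, as used earlier in this thread.
BASE = "http://localhost:8983/solr/select"


def select_url(q, fq=None):
    """Build a select URL with a main query and an optional filter query."""
    params = [("q", q)]
    if fq is not None:
        params.append(("fq", fq))
    return BASE + "?" + urlencode(params)


# Both clauses in q: with OR as the default operator this is a union.
or_url = select_url("fish type:idea")

# Main query plus a filter query: the filter restricts the results of q.
filtered_url = select_url("fish", fq="type:idea")

print(or_url)
print(filtered_url)
```

The first URL asks for documents matching either clause; the second asks for documents matching "fish" and then keeps only those whose type is "idea".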
--
Preetam
On Tue, Jul 15, 2008 at 5:37 PM, matt connolly <[EMAIL PROTECTED]>
Of course - it's so obvious now. Thanks!
--
View this message in context:
http://www.nabble.com/Filter-by-Type-increases-search-results.-tp18462188p18464457.html
Thanks guys.
-Original Message-
From: Norberto Meijome [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 15, 2008 2:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Duplicate content
On Tue, 15 Jul 2008 10:48:14 +0200
Jarek Zgoda <[EMAIL PROTECTED]> wrote:
> >> 2) I don't want to overwri
Compression is only relevant for the original text, not the indexed
part. So in terms of searching, it's irrelevant.
Where it is relevant is when you *fetch* the document (e.g.
doc = hits.doc(32)), the de-compression work is done (for
stored documents). Depending upon your app, this may or
may not
Hi,
as I understand the WordDelimiterFilter should split on case changes, word
delimiters and changes from character to digit, but it should not
differentiate between ASCII and multibyte chars. It does however. The word
"hälse" (German plural of "neck") gets split into "h", "ä" and "lse", which
un
On Jul 15, 2008, at 2:45 AM, Sunil wrote:
Hi All,
I want to change the duplicate content behavior in solr. What I want
to
do is:
1) I don't want duplicate content.
2) I don't want to overwrite old content with new one.
Means, if I add duplicate content in solr and the content already
exis
On Tue, 15 Jul 2008 18:07:43 +0530
"Preetam Rao" <[EMAIL PROTECTED]> wrote:
> When I say filter, I meant q=fish&fq=type:idea
btw, this *seems* to only work for me with the standard search handler. dismax and
fq: don't seem to get along nicely... but maybe it is just late and I'm not
testing it pro
Since we pushed Solr out to production a few weeks ago, we've seen a
few issues with Solr not responding to requests (searches or admin
pages). There doesn't seem to be any reason for it from what we can
tell. We haven't seen it in QA or development.
We're running Solr with basically the
If a sort is not specified then documents are returned in decreasing order
of their score. You can get more details on the scoring at
http://lucene.apache.org/java/docs/scoring.html
On Tue, Jul 15, 2008 at 6:03 PM, sumantht <[EMAIL PROTECTED]> wrote:
>
> hi,
> in databases, sorting based on text
Doug Steigerwald wrote:
> We're running Solr with basically the example Solr setup with Jetty
> (6.1.3). We package our Solr install by using 'ant example' and
> replacing configs/etc. Whenever Solr stops responding, there are no
> messages in the logs, nothing. Requests just time out.
>
> We
Thanks. Do we expect the same some time soon? I agree that the user community
has shed light with a lot of examples. I just want to know if there is more
that could be done. I am looking at the Javadocs of the same too, and that
helps to some extent. But I have felt the wiki was very, very useful
I constantly have the same problem; sometimes I have OutOfMemoryError
in the logs, sometimes not. It is not predictable. I minimized all caches;
it still happens even with 8192M. CPU usage is 375%-400% (two dual-core
Opterons), Sun Java 5. Moved to BEA JRockit 5 yesterday,
looks 30 times faster (25%
Thanks Ryan,
Is uniqueKey really unique if we allow duplicates? I had a similar problem...
Quoting Ryan McKinley <[EMAIL PROTECTED]>:
On Jul 15, 2008, at 2:45 AM, Sunil wrote:
Hi All,
I want to change the duplicate content behavior in solr. What I want to
do is:
1) I don't want duplicate content.
2
We haven't seen an OutOfMemoryError. The load on the server doesn't
go up either (hovers around 1-2). We're on Java 1.6.0_03-b05.
4x3.8GHz Xeons, 8GB RAM.
Doug
On Jul 15, 2008, at 11:29 AM, Fuad Efendi wrote:
I constantly have the same problem; sometimes I have
OutOfMemoryError in logs
Can we collect more information? It would be nice to know what the
threads are doing when it hangs.
If you are using *nix, issue kill -3 <pid>;
it will print out the stack traces of all the threads in the VM. That
may tell us the state of each thread, which could help us
suggest something
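For reference, kill -3 sends SIGQUIT, and a Sun/HotSpot JVM responds by writing a thread dump to its standard output (so look in the console or wherever Jetty's stdout is redirected). A minimal sketch of doing the same from a script; `request_thread_dump` is a hypothetical helper, and the process id has to be looked up separately (for example with ps):

```python
import os
import signal

# kill -3 sends signal number 3, which is SIGQUIT on Linux and most Unix systems.
THREAD_DUMP_SIGNAL = signal.SIGQUIT


def request_thread_dump(pid):
    """Ask the JVM with the given process id to print a thread dump.

    Only send this to a Java process: for most other programs the
    default action of SIGQUIT is to terminate the process.
    """
    os.kill(pid, THREAD_DUMP_SIGNAL)
```

Taking two or three dumps a few seconds apart usually makes it easier to see which threads are stuck rather than merely busy.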
On Tue
Hi all,
I've been trying to return a field of type ExternalFileField in the search
result. Upon examining XMLWriter class, it seems like Solr can't do this out
of the box. Therefore, I've tried to hack Solr to enable this behaviour.
The goal is to call to ExternalFileField.getValueSource(SchemaFie
The following lines are strange; it looks like Solr catches the OOM and
rethrows its own exception (so that in some cases the JVM simply hangs instead
of exiting):
Apr 4, 2008 1:20:53 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space
This is full Thread Dump a
Hi Stefan,
I wrote a test case for the problem you described but it is working fine. I
used the following definition:
What configuration are you using? If it is different, please share it so
that I can test with it.
On Tue, Jul 15, 2008 at 7:59 PM, Stefan Oestreicher <
[EMAIL PROTECTED]> wrote
On Tue, Jul 15, 2008 at 10:29 AM, Stefan Oestreicher
<[EMAIL PROTECTED]> wrote:
> as I understand the WordDelimiterFilter should split on case changes, word
> delimiters and changes from character to digit, but it should not
> differentiate between ASCII and multibyte chars. It does however. The wo
I suspect that SolrException is used to catch ALL exceptions in order
to show "500 OutOfMemory" in HTML/XML/JSON etc., so that JVM simply
hangs... weird HTTP understanding...
Quoting Fuad Efendi <[EMAIL PROTECTED]>:
Following lines are strange, looks like SOLR deals with OOM and
rethrows
matt connolly wrote:
>
>
> swarag wrote:
>>
>> Knowing the Lucene struggles with multi-word query-time synonyms, my
>> question is, does this also affect index-time synonyms? What other
>> alternatives do we have if we require there to be multiple word synonyms?
>>
>
> No the multiple word p
On Tue, Jul 15, 2008 at 11:10 AM, Norberto Meijome <[EMAIL PROTECTED]> wrote:
> On Tue, 15 Jul 2008 18:07:43 +0530
> "Preetam Rao" <[EMAIL PROTECTED]> wrote:
>
>> When I say filter, I meant q=fish&fq=type:idea
>
> btw, this *seems* to only work for me with standard search handler. dismax
> and fq:
You won't have the multiple word problem if you use synonyms at index time
instead of query time.
swarag wrote:
>
> Here is a basic example of some synonyms in my synonyms.txt:
> club=>club,bar,night cabaret
> bar=>bar,club
>
> As you can see, a search for 'bar' will return any documents with
On Jul 15, 2008, at 10:31 AM, Fuad Efendi wrote:
Thanks Ryan,
Is really unique if we allow duplicates? I had similar
problem...
if you allowDups, then uniqueKey may not be unique...
however, it is still used as the key for many items.
Quoting Ryan McKinley <[EMAIL PROTECTED]>:
O
Just as a sample, SolrCore contains blocks like
    } catch (Throwable e) {
      SolrException.logOnce(log,null,e);
    }
And SolrServlet:
    } catch (Throwable e) {
      SolrException.log(log,e);
      sendErr(500, SolrException.toStr(e), request, response);
    }
What will happen with OutOfMemoryError? I
Hi
For some strange reason Hotmail doesn't send any XML tags through. I have
attached a file with all the necessary XML tags, thanks :)
I have a rare situation and I'm not too sure how to resolve it.
I have defined 2 fields: one is called userID and the other one is called
companyID in
matt connolly wrote:
>
> You won't have the multiple word problem if you use synonyms at index time
> instead of query time.
>
>
> swarag wrote:
>>
>> Here is a basic example of some synonyms in my synonyms.txt:
>> club=>club,bar,night cabaret
>> bar=>bar,club
>>
>> As you can see, a search
: Thanks. Do we expect the same some time soon. I agree that the user
: community have shed light in with a lot of examples. Just wanna know if
: there was more that could be done. I am looking at the java docs of the
: same too and that helps to some extent. But have felt the wiki was very
:
Sorry for the bunch of short self-replies, just trying to analyse...
The CPU may get overloaded by a constantly running GC trying to
defragment and optimize memory in a loop (constant queue of requests);
response time will be a few minutes (in the best cases) and contain 500...
so that sometimes we can't see
On Tue, Jul 15, 2008 at 2:27 PM, swarag <[EMAIL PROTECTED]> wrote:
> To my understanding, this means I am using synonyms at index time and NOT
> query time. And yet, I am still having these problems with synonyms.
Can you give a specific example? Use debugQuery=true to see what the
resulting quer
THANKS!!!
> Date: Tue, 15 Jul 2008 11:38:06 -0700
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: RE: Wiki for 1.3
>
> : Thanks. Do we expect the same some time soon. I agree that the user
> : community have shed light in with a lot of examples. Just wanna know if
> :
Hi-
I'm messing with spellchecking and running into behavior that seems
peculiar. We have an index with many words including:
"swim" and "slim"
If I search for "slim", it returns "swim" as an option -- likewise, if
I search for "swim" it returns "slim"
why does it check words that are in
On Jul 15, 2008, at 3:49 PM, Ryan McKinley wrote:
Hi-
I'm messing with spellchecking and running into behavior that seems
peculiar. We have an index with many words including:
"swim" and "slim"
If I search for "slim", it returns "swim" as an option -- likewise,
if I search for "slim" it
On Tue, Jul 15, 2008 at 4:19 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> agreed, but there is a problem in Solr, AIUI, with regards to when the
> readers are available and when inform() gets called. The workaround is to
> have a warming query, I believe.
Right... see https://issues.apache.or
Also see https://issues.apache.org/jira/browse/SOLR-622
On Wed, Jul 16, 2008 at 2:25 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 15, 2008 at 4:19 PM, Grant Ingersoll <[EMAIL PROTECTED]>
> wrote:
> > agreed, but there is a problem in Solr, AIUI, with regards to when the
> > readers a
Multiple uniqueKeys are not supported. You must use only one field as the
uniqueKey.
On Tue, Jul 15, 2008 at 11:52 PM, dudes dudes <[EMAIL PROTECTED]> wrote:
>
> Hi
>
> With some strange reason hotmail doesn't send any XML tags through. I have
> attached a file with all the necessary xml tags the
Yonik Seeley wrote:
>
> On Tue, Jul 15, 2008 at 2:27 PM, swarag <[EMAIL PROTECTED]>
> wrote:
>> To my understanding, this means I am using synonyms at index time and NOT
>> query time. And yet, I am still having these problems with synonyms.
>
> Can you give a specific example? Use debugQuery=
Hi,
I think the reason was indeed maxPendingDeletes which was configured to
1000.
After having updated to a solr nightly build with Lucene 2.4, the issue
seems to have disappeared.
Thanks for your advice.
--
Renaud Delbru
Mike Klaas wrote:
On 1-Jul-08, at 10:44 PM, Chris Hostetter wrote:
I have a use case where I want to spellcheck the input query across
multiple fields:
Did you mean: location = washington
vs
Did you mean: person = washington
The current parameter / response structure for the spellcheck
component does not support this kind of thing. Any thoughts on how/i
One way would be to create a copyField containing both the fields and use it
as the dictionary's source.
If you do want to keep separate dictionaries for both the fields then I
guess we can introduce per-dictionary overridable parameters like the
per-field overridden facet parameters. That would b