Yeah, I meant filter-cache, thanks.
It seemed that the particular field (cityname) was using a KeywordTokenizer
(which doesn't show at the front), which is why I missed it, I guess :-S. This
means the term field is tokenized, so the termEnums approach is used. This
results in about 10.000 inserts on face
On 11-Oct-07, at 7:38 PM, BrendanD wrote:
Unfortunately pretty much ALL of our fields are multi-valued. A product can
exist in multiple categories, be sold by multiple merchants, and have
multiple attributes with multiple attribute values assigned to it. E.g. an
iPod in a special Gifts cat
On 11-Oct-07, at 8:38 PM, BrendanD wrote:
Mike Klaas wrote:
Any reason why you are faceting on a field that you are restricting?
Clearly, the answer will be '1001644' --> , (all other
categories) -> 0. Just use numFound.
Also, if there can only be one category per doc, make sure you are
using
Mike Klaas wrote:
>
> On 11-Oct-07, at 6:47 PM, BrendanD wrote:
>
>>
>> Hi,
>>
>> I have the following query that I've found in my production logs:
>>
>> INFO: /select/
>> rows=0&start=0&f.category_id.facet.limit=-1&facet=true&facet.field=category_id&fq=product_is_active:true&fq=product_st
Another route to getting the number of documents is to get it from
the LukeRequestHandler:
http://localhost:8983/solr/admin/luke?numTerms=0 (numTerms=0 to get
the fastest response possible)
Erik
On Oct 10, 2007, at 10:19 AM, Stefan Rinner wrote:
Hi
for some tests I need to know
On 11-Oct-07, at 6:47 PM, BrendanD wrote:
Hi,
I have the following query that I've found in my production logs:
INFO: /select/
rows=0&start=0&f.category_id.facet.limit=-1&facet=true&facet.field=category_id&fq=product_is_active:true&fq=product_status_code:complete&fq=category_id:"1001644"&
Hi,
I have the following query that I've found in my production logs:
INFO: /select/
rows=0&start=0&f.category_id.facet.limit=-1&facet=true&facet.field=category_id&fq=product_is_active:true&fq=product_status_code:complete&fq=category_id:"1001644"&fq=attribute_id_value_en_pair:"1005585\:Wedding"&
Hi,
Is there a difference in the performance for the following 2 variations on
query syntax? The first query was a response from Solr by using a single fq
parameter in the URL. The second query was a response from Solr by using
separate fq parameter in the URL, one for each field.
product_is_ac
: ..fq=country:france
:
: do these queries share cached items in the fieldcache? (in this example:
: country:france) or do they somehow live as separate entities in the cache?
: The latter would explain my fieldcache having evictions at the moment.
FieldCache can't have evictions. it's a reall
On 11-Oct-07, at 4:34 PM, Ravish Bhagdev wrote:
Hi Mike,
Thanks for your reply :)
I am not an expert of either! But, I understand that Nutch stores
contents albeit in a separate data structure (they call segment as
discussed in the thread), but what I meant was that this seems like
much more e
Hi Mike,
Thanks for your reply :)
I am not an expert of either! But, I understand that Nutch stores
contents albeit in a separate data structure (they call segment as
discussed in the thread), but what I meant was that this seems like
much more efficient way of presenting summaries or snippets (o
How can I add a field name to a query dynamically?
Example: if the user types "in stock", replace it with "quantity:[1 TO *]"
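One possible approach is to rewrite the query string on the client side before it is sent to Solr. The following is only a minimal sketch in Python; the names `PHRASE_REWRITES` and `rewrite_query` are hypothetical and are not part of Solr.

```python
import re

# Hypothetical mapping from user-facing phrases to Solr field queries.
PHRASE_REWRITES = {
    "in stock": "quantity:[1 TO *]",
}

def rewrite_query(user_query: str) -> str:
    """Replace each known phrase (case-insensitive, whole words only) with its field query."""
    rewritten = user_query
    for phrase, field_query in PHRASE_REWRITES.items():
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        # A function replacement inserts the field query literally,
        # so characters such as backslashes are never treated as group references.
        rewritten = pattern.sub(lambda _match: field_query, rewritten)
    return rewritten

print(rewrite_query("ipod In Stock"))
```

The word-boundary anchors keep the rule from firing inside longer words, so a query such as "instock" is left unchanged.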
--
View this message in context:
http://www.nabble.com/Add-fields-to-query-when-processing-tf4610578.html#a1317
Sent from the Solr - User mailing list archive at Nabble.com.
When searching data, I need to process field names similarly to how terms
are processed:
lower-case the field name so that it is not case sensitive (all field
names are lower case when indexed), and
have "synonyms" for field names. Example: if the user types article:abc or
types content:abc, in both cases it
First, it should be noted that I am not an expert in Nutch's
architecture. I do think I understand what is being said there, however.
Nutch is a distributed web search engine, and uses Lucene as an
indexing component. It is free to use external data structures to
store data, and can store the
Yes, we have some huge performance issues with non-cached queries. So doing a
commit is very expensive for us. We have our autowarm count for our
filterCache and queryResultCache both set to 4096. But I don't think that's
near high enough. We did have it as high as 16384 before, but it took over
a
Hey guys,
Check out this thread I opened on the Nutch mailing list. Looks like Solr
can benefit from reusing Nutch's "segment" based storage strategy for
efficiency in returning snippets, summaries etc without using Lucene
stored fields?
Was this considered before?
Ravish
-- Forwarded messa
On 11-Oct-07, at 2:47 PM, BrendanD wrote:
Hi,
Is it possible to send a command to have Solr flush the deletesPending
documents without doing a commit? I know there's a setting in
solrconfig.xml
for setting a threshold value, but I'd like to somehow kick it off on
demand. We need this to be
On 11-Oct-07, at 2:37 PM, Yonik Seeley wrote:
On 10/11/07, Mike Klaas <[EMAIL PROTECTED]> wrote:
I'm seeing some interesting behaviour when doing benchmarks of query
and facet performance. Note that the query cache is disabled, and
the index is entirely in the OS disk cache. filterCache is fu
Hi,
Is it possible to send a command to have Solr flush the deletesPending
documents without doing a commit? I know there's a setting in solrconfig.xml
for setting a threshold value, but I'd like to somehow kick it off on
demand. We need this to be able to remove merchants from our product
listin
On 10/11/07, Mike Klaas <[EMAIL PROTECTED]> wrote:
> I'm seeing some interesting behaviour when doing benchmarks of query
> and facet performance. Note that the query cache is disabled, and
> the index is entirely in the OS disk cache. filterCache is fully
> primed.
>
> Often when repeatedly meas
I'm seeing some interesting behaviour when doing benchmarks of query
and facet performance. Note that the query cache is disabled, and
the index is entirely in the OS disk cache. filterCache is fully
primed.
Often when repeatedly measuring the same query, I'll see pretty
consistent resu
say I have the following (partial)
querystring:...&facet=true&facet.field=country
field 'country' is not tokenized, not multi-valued, and not boolean, so the
field-cache approach is used.
Moreover, the following (partial) querystring is used as well:
..fq=country:france
do these queries share ca
yup that clarifies things a lot, thanks.
Mike Klaas wrote:
>
> On 10-Oct-07, at 4:16 AM, Britske wrote:
>
>>
>> However, I realized that for calculating the count for each of the
>> facetvalues, the original standardrequesthandler already loops the
>> doclist
>> to check for matches. Theref
Many thanks for your reply, Yonik.
On 10/11/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> > Caused by: org.apache.solr.common.SolrException: Internal Server Error
>
> Is there a longer stack trace somewhere concerning the internal server
> error?
I shouldn't have bothered you with this. We've
On 10-Oct-07, at 4:16 AM, Britske wrote:
However, I realized that for calculating the count for each of the
facetvalues, the original standardrequesthandler already loops the
doclist
to check for matches. Therefore my implementation actually does
double work,
since it gets doclists for eac
Where does the index come from in the first place? Do we have to
enter the words, or are they entered as documents enter the SOLR index?
I'd love to be able to use my own documents as the spell check index
of "correctly spelled words".
+--
Good to know, I think this needs to be a configurable value in the library
(overridable, at a minimum.)
What's outstanding for me on this is understanding the Solr side of the
equation, and whether culture variance comes into play. What makes this
even more interesting/confusing is how culture sc
> To achieve this I have to keep the document field to "stored" right?
Yes, the field needs to be stored to return snippets.
> When I do this my index becomes huge 10 GB index, cause I have 10K
> docs but each is very lengthy HTML. Is there any better solution?
> Why is index created by nutch s
On 10/10/07, Jason Rennie <[EMAIL PROTECTED]> wrote:
> We're using solr 1.2 and a nightly build of the solrj client code. We very
> occasionally see things like this:
>
> org.apache.solr.client.solrj.SolrServerException: Error executing query
> at org.apache.solr.client.solrj.request.Query
Martin Grotzke schrieb:
Try the SnowballPorterFilterFactory with German2 as language attribute
first and use synonyms for combined words i.e. "Herrenhose" => "Herren",
"Hose".
so you use a combined approach?
Yes, we define the relevant parts of compounded words (keywords only) as
synon
Hi Daniel,
thanx for your suggestions, being able to export a large synonyms.txt
sounds very well!
Thx && cheers,
Martin
On Wed, 2007-10-10 at 23:38 +0200, Daniel Naber wrote:
> On Wednesday 10 October 2007 12:00, Martin Grotzke wrote:
>
> > Basically I see two options: stemming and the usage
Climbingrose,
I think you make a valid point. Each person may have a different concept of
how something should work with their application.
My thought on the subject of spell checking multiple words:
- the parameter "multiWords" enables spell checking on each word in "q"
parameter instead of
Cool, thanks for the clarification, Ryan.
-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 10, 2007 5:28 PM
To: solr-user@lucene.apache.org
Subject: Re: quick allowDups questions
the default solrj implementation should do what you need.
>
> As
Hi Tom,
On Wed, 2007-10-10 at 12:28 +0200, Thomas Traeger wrote:
> in short: use stemming
ok :)
>
> Try the SnowballPorterFilterFactory with German2 as language attribute
> first and use synonyms for combined words i.e. "Herrenhose" => "Herren",
> "Hose".
so you use a combined approach?
>
>
This even works if you request 0 results. --wunder
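As a rough illustration of the zero-rows approach, here is a minimal Python sketch that builds a match-all query with rows=0 and reads numFound from the XML response. The base URL, the helper names `count_url` and `parse_num_found`, and the sample response are assumptions made for illustration; the sample was written by hand, not captured from a server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

def count_url(base_url: str) -> str:
    """Build a match-all query that requests zero rows, so only the count is returned."""
    return base_url + "?" + urlencode({"q": "*:*", "rows": 0})

def parse_num_found(xml_text: str) -> int:
    """Read the numFound attribute from the <result> element of a Solr XML response."""
    result = ET.fromstring(xml_text).find("result")
    if result is None:
        raise ValueError("no <result> element in response")
    return int(result.attrib["numFound"])

# Hand-written sample response, for illustration only.
SAMPLE = '<response><result name="response" numFound="1234" start="0"/></response>'

print(count_url("http://localhost:8983/solr/select"))
print(parse_num_found(SAMPLE))
```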
On 10/11/07 1:56 AM, "Stefan Rinner" <[EMAIL PROTECTED]> wrote:
>
> On Oct 10, 2007, at 6:49 PM, Chris Hostetter wrote:
>
>>
>> : I think search for "*:*" is the optimal code to do it. I don't think you can
>> : do anything faster.
>>
>> F
Jeff,
Thanks! Your suggestion worked: instead of invoking ToString() on
float values, I've used ToString's other signature, which takes an
IFormatProvider:
CultureInfo MyCulture = CultureInfo.InvariantCulture;
this.Add(new IndexFieldValue("weight",
weight.ToString(MyCulture.NumberFormat)));
this
Hi All,
I'm facing a similar problem. I want to index the entire document as a
field. But I also want to be able to retrieve snippets (like
Google/Nutch return in results page below the links).
To achieve this I have to keep the document field to "stored" right?
When I do this my index becomes huge 1
On Tue, 9 Oct 2007 10:12:51 -0400
"David Whalen" <[EMAIL PROTECTED]> wrote:
> So, how would you build it if you could? Here are the specs:
>
> a) the index needs to hold at least 25 million articles
> b) the index is constantly updated at a rate of 10,000 articles
> per minute
> c) we need to ha
Just to clarify this line of code:
String[] suggestions = spellChecker.suggestSimilar(termText, numSug,
req.getSearcher().getReader(), restrictToField, true);
I only return suggestions if they are more popular than termText. You
probably need to use code in Scott's patch to make this behaviour
co
Hi all,
I've been so busy the last few days that I haven't replied to this email. I
modified SpellCheckerHandler a while ago to include support for multiword
queries. To be honest, I didn't have time to write unit tests for the code.
However, I deployed it in a production environment and it has been wo
On Oct 10, 2007, at 6:49 PM, Chris Hostetter wrote:
: I think search for "*:*" is the optimal code to do it. I don't think you can
: do anything faster.
FYI: getting the data from the xml returned by stats.jsp is definitely
faster in the case where you really want all docs.
if you want th
Unable to get Solr up in Tomcat. I'm getting the following log:
INFO: Using JNDI solr.home: E:/test/workspace/reviewGist/solr/home
Oct 11, 2007 1:48:13 PM org.apache.solr.core.Config setInstanceDir
INFO: Solr home set to 'E:/test/workspace/reviewGist/solr/home/'
Oct 11, 2007 1:48:13 PM org.apache.sol
Since the title of my original post may not have been so clear, here a
repost.
//Geert-Jan
Britske wrote:
>
> First of all, I just wanted to say that I just started working with Solr
> and really like the results I'm getting from Solr (in terms of
> performance, flexibility) as well as the goo