On 11/21/2013 9:51 PM, RadhaJayalakshmi wrote:
> Thanks Shawn for your response.
> So, from your email, it seems that unique_key validation is handled
> differently from other field validation.
> But what I am not very clear on is what the unique_key has to do with
> finding the live server?
> Because
Hi Robert,
That was the idea: dynamic fields so that, as you said, it is easier to sort
and filter. Besides, with dynamic fields it would be easier to add new
stores, as I wouldn't have to modify the schema :)
Thanks for the answer!
2013/11/21 Petersen, Robert
> Hi,
>
> I'd go with (2) also but
On 11/21/2013 6:41 PM, Dave Seltzer wrote:
> In digging a little deeper and looking at the config I see that
> true is commented out. I believe this is the default
> setting. So I don't know if NRT is enabled or not. Maybe just a red herring.
I had never seen this setting before. The default is
Thanks Doug!
One thing I'm not clear on is how I can tell whether this is in fact related to
Garbage Collection. If you're right, and the cluster is only as slow as its
slowest link, how do I determine that this is GC? Do I have to run the
profiler on all eight nodes?
Or is it a matter of turning on th
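A lightweight way to check for GC involvement without attaching a profiler to
every node is to turn on GC logging on each JVM and line the pauses up against
the slow queries. A sketch of the flags (the log path and the start.jar launch
are assumptions about your setup):

  # HotSpot 6/7 GC logging; writes one log per node
  java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
       -Xloggc:/var/log/solr/gc.log \
       -jar start.jar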
Thanks Shawn for your response.
So, from your email, it seems that unique_key validation is handled
differently from other field validation.
But what I am not very clear on is what the unique_key has to do with
finding the live server?
Because if there is any mismatch in the unique_key, it is throwing
Additional info on GC selection
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#available_collectors
> If response time is more important than overall throughput and garbage
> collection pauses must be kept shorter than approximately one second, then
> select the concurrent colle
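In flag form, selecting the concurrent (CMS) collector on a Sun/Oracle JVM
looks roughly like this (heap sizes are placeholders, not a recommendation):

  java -Xms4g -Xmx4g \
       -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
       -XX:+CMSParallelRemarkEnabled \
       -jar start.jar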
Hi, guys.
I indexed 1000 documents, which have fields like title, ptime and frequency.
The title is a text field, the ptime is a date field, and the frequency is an
int field.
The frequency field goes up and down; say, sometimes its value is 0, and
sometimes its value is 999.
Now, in my app, the qu
Dave, you might want to connect JVisualVM and see if there's any pattern
with latency and garbage collection. That's a frequent culprit for
periodic hits in latency.
More info here
http://docs.oracle.com/javase/6/docs/technotes/guides/visualvm/jmx_connections.html
There's a couple GC implementatio
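To make a remote Solr JVM visible to JVisualVM over JMX, you can start it with
something like the following (the port is arbitrary, and disabling SSL and
authentication is only sensible on a trusted network):

  java -Dcom.sun.management.jmxremote \
       -Dcom.sun.management.jmxremote.port=18983 \
       -Dcom.sun.management.jmxremote.ssl=false \
       -Dcom.sun.management.jmxremote.authenticate=false \
       -jar start.jar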
Lots of questions. Okay.
In digging a little deeper and looking at the config I see that
true is commented out. I believe this is the default
setting. So I don't know if NRT is enabled or not. Maybe just a red herring.
I don't know what Garbage Collector we're using. In this test I'm running
Sol
Hi,
On Wed, Nov 20, 2013 at 12:53 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> At the Lucene level, I think it would require a directory
> implementation which writes to a remote node directly. Otherwise, on
> the solr side, we must move the leader itself to another node which
> h
It might not be a perfect solution, but you can use an EdgeNGram filter field,
copy all your field data to that field, and use it for suggestions.
http://localhost:8983/solr/core1/select?q=name:iphone
The above query will return
iphone
iphone5c
iphone4g
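A minimal sketch of that copy-field-plus-EdgeNGram setup (field and type names
are made up for illustration):

  <fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- index prefixes so "iph" matches iphone, iphone5c, iphone4g -->
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <field name="name_suggest" type="text_edge" indexed="true" stored="false"/>
  <copyField source="name" dest="name_suggest"/>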
Yes, more details…
Solr version, which garbage collector, how does heap usage look, cpu, etc.
- Mark
On Nov 21, 2013, at 6:46 PM, Erick Erickson wrote:
> How real time is NRT? In particular, what are your commit settings?
>
> And can you characterize "periodic slowness"? Queries that usually
>
How real time is NRT? In particular, what are your commit settings?
And can you characterize "periodic slowness"? Queries that usually
take 500ms now tail to 10s? Or 1s? How often? How are you measuring?
Details matter, a lot...
Best,
Erick
On Thu, Nov 21, 2013 at 6:03 PM, Dave Seltzer wrote:
I'm doing some performance testing against an 8-node Solr cloud cluster,
and I'm noticing some periodic slowness.
http://farm4.staticflickr.com/3668/10985410633_23e26c7681_o.png
I'm doing random test searches against an Alias Collection made up of four
smaller (monthly) collections. Like this:
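For context on the commit-settings question, NRT-style settings in
solrconfig.xml usually look something like this (the intervals are
placeholders, not recommendations):

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- hard commit: flush segments to disk, don't open a new searcher -->
    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <!-- soft commit: make new documents visible to searches -->
    <autoSoftCommit>
      <maxTime>1000</maxTime>
    </autoSoftCommit>
  </updateHandler>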
Hi Erick,
Thanks for the reply and sorry, my fault, wasn't clear enough. I was
wondering if there was a way to remove terms that would always be zero
(because the term came from a document that didn't match the filter query).
Here's an example. I have a bunch of documents with fields 'manufacture
Hi all:
I’m currently on a Solr 4.5.0 instance and running this tutorial,
http://lucene.apache.org/solr/4_5_0/tutorial.html
My question is specific to indexing data as proposed from this tutorial,
$ java -jar post.jar solr.xml monitor.xml
The tutorial advises to validate from your localhost,
h
That's what faceting does. The facets are only tabulated
for documents that satisfy the query, including all of
the filter queries and any other criteria.
Otherwise, facet counts would be the same no matter
what the query was.
Or I'm completely misunderstanding your question...
Best,
Erick
On
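In request terms, restricting facet counts to a subset is just a matter of
adding the filter query, for example (core and field names are illustrative):

  http://localhost:8983/solr/core1/select?q=*:*&fq=manufacturer:acme&rows=0&facet=true&facet.field=price_range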
Hi All,
Is it possible to perform a facet field query on a subset of documents (the
subset being defined via a filter query for instance)?
I understand that facet pivoting might work, but it would require that the
subset be defined by some field hierarchy, e.g. manufacturer -> price (then
only lo
I am querying "test" in solr 4.3.1 over the field below and it's not finding
all occurrences. It seems that if it is a substring of a word like
"Supertestplan" it isn't found unless I use wildcards: "*test*". This is
right because of my tokenizer, but does someone know a way around this? I
don't wan
Hello,
I'm using Solr 4.x. In my solr schema I have the following fields defined :
stored="true" multiValued="true" />
stored="false" multiValued="true" termVectors="true" />
multiValued="true" termVectors="true" />
multiValued="true" termVectors="true" />
multiVal
I have the following simplified setting:
My schema contains one text field, named "text".
When I perform a query, I need to get the scores for the same text field
but for different similarity functions (e.g. TFIDF, BM25..) and combine
them externally using different weights.
An obvious way to achie
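One way to sketch this in Solr 4.x is to copy the text into parallel fields
whose field types declare different similarities, then request both scores and
combine them client-side (all names here are assumptions):

  <!-- schema.xml: allow per-fieldType similarity -->
  <similarity class="solr.SchemaSimilarityFactory"/>

  <fieldType name="text_bm25" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <similarity class="solr.BM25SimilarityFactory"/>
  </fieldType>

  <field name="text_bm25" type="text_bm25" indexed="true" stored="false"/>
  <copyField source="text" dest="text_bm25"/>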
add Durl=http://localhost:8983/solr/collection2/update when running post.jar,
This email was sent from 189 Mail (189邮箱)
"Reyes, Mark" wrote:
>Hi all:
>
>I’m currently on a Solr 4.5.0 instance and running this tutorial,
>http://lucene.apache.org/solr/4_5_0/tutorial.html
>
>My question is specific to indexing data as proposed fr
This email was sent from 189 Mail (189邮箱)
"Reyes, Mark" wrote:
>Hi all:
>
>I’m currently on a Solr 4.5.0 instance and running this tutorial,
>http://lucene.apache.org/solr/4_5_0/tutorial.html
>
>My question is specific to indexing data as proposed from this tutorial,
>
>$ java -jar post.jar solr.xml monitor.xml
>
>The tut
The query parser does its own tokenization and parsing before your analyzer
tokenizer and filters are called, ensuring that only one
whitespace-delimited token is analyzed at a time.
You're probably best off having an application layer preprocessor for the
query that "enriches" the query in t
Solr (actually Lucene) stores the input _exactly_ as it is entered, and
returns it the same way.
What you're seeing is almost certainly your display mechanism interpreting
the results;
whitespace is notoriously variable in terms of how it's displayed by various
interpretations of the "standard". F
OK - probably I should have said "A", or "a" :) My point was just
that there is not really anything special about "special" characters.
On 11/21/2013 10:50 AM, Jack Krupansky wrote:
"Would you store "a" as "A" ?"
No, not in any case.
-- Jack Krupansky
-Original Message- From: Michael
you're leaving off the - in front of the D,
-Durl.
Try java -jar post.jar -help for a list of options available
On Thu, Nov 21, 2013 at 12:04 PM, Reyes, Mark wrote:
> So then,
> $ java -jar post.jar Durl=http://localhost:8983/solr/collection2/update
> solr.xml monitor.xml
>
>
>
>
>
> On 11/21/
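With the dash restored, the command from the thread becomes:

  java -Durl=http://localhost:8983/solr/collection2/update -jar post.jar solr.xml monitor.xml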
I know it's documented that Lucene/Solr doesn't apply filters to queries with
wildcards, but this seems to trip up a lot of users. I can also see why
wildcards break a number of filters, but a number of filters (e.g. mapping
charsets) could mostly or entirely work. The N-gram filter is another
"Would you store "a" as "A" ?"
No, not in any case.
-- Jack Krupansky
-Original Message-
From: Michael Sokolov
Sent: Thursday, November 21, 2013 8:56 AM
To: solr-user@lucene.apache.org
Subject: Re: How to index X™ as &#8482; (HTML decimal entity)
I have to agree w/Walter. Use Unicode as a
And this is the exact problem. Some characters are stored as entities, some are
not. When it is time to display, what else needs to be escaped? At a minimum,
you would have to always store & as &amp; to avoid escaping the leading
ampersand in the entities.
You could store every single character as a nume
Hi Andreas,
If you don't want to use wildcards at query time, an alternative is to use
NGrams at indexing time. This will produce a lot of tokens.
For example, the 4-grams of your example, Supertestplan => supe uper pert erte rtes
*test* estp stpl tpla plan
Is that what you want? By the way why do
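A rough schema sketch of that index-time NGram approach (the gram sizes and
names are only examples):

  <fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- index every 4-character substring, so "test" inside "Supertestplan" matches -->
      <filter class="solr.NGramFilterFactory" minGramSize="4" maxGramSize="4"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>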
I suppose I have to create another field with different tokenizers and set
the boost very low so it doesn't really mess with my ranking, because the
word is now in 2 fields. What kind of tokenizer can do the job?
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Thursday, 21 November 2013
Hi,
I'd go with (2) also but using dynamic fields so you don't have to define all
the storeX_price fields in your schema but rather just one *_price field. Then
when you filter on store:store1 you'd know to sort with store1_price and so
forth for units. That should be pretty straightforward.
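A sketch of the schema side of that approach (the trie type names are the
usual example-schema ones and are assumptions here):

  <dynamicField name="*_price" type="tfloat" indexed="true" stored="true"/>
  <dynamicField name="*_units" type="tint"   indexed="true" stored="true"/>

A request for store1 could then filter and sort with something like
fq=store:store1&sort=store1_price asc.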
I have to agree w/Walter. Use Unicode as a storage format. The entity
encodings are for transfer/interchange. Encode/decode on the way in and
out if you have to. Would you store "a" as "A"? It makes it
impossible to search for, for one thing. What if someone wants to
search for the TM ch
I confirm.
Ah... now I understand your perspective - you have taken a narrow view of
what "text" is. A broader view is that it can contain formatting and special
"entities" as well, or rich text in general. My "read" is that it all
depends on the nature of the application and its requirements, not a "one
On 11/21/2013 1:57 AM, RadhaJayalakshmi wrote:
Hi, I am using solr4.4 with zookeeper 3.3.5. While I was checking for error
conditions of my application, I came across a strange issue. Here is what I
tried: I have three fields defined in my schema:
a) UNIQUE_KEY - of type solr.TrieLong
b) empId - of t
I know all about formatted text -- I worked at MarkLogic. That is why I
mentioned the XML Infoset.
Numeric entities are part of the final presentation, really, part of the
encoding. They should never be stored. Always store the Unicode.
Numeric and named entities are a convenience for tools and
You might be able to make use of the dictionary compound word filter, but
you will have to build up a dictionary of words to use:
http://lucene.apache.org/core/4_5_1/analyzers-common/org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilterFactory.html
My e-book has some examples an
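A minimal analyzer sketch using that filter (the dictionary file and the size
settings are assumptions):

  <fieldType name="text_decompound" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- dictionary.txt lists the words to split out, e.g. super, test, plan -->
      <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
              dictionary="dictionary.txt" minWordSize="5"
              minSubwordSize="3" maxSubwordSize="15" onlyLongestMatch="false"/>
    </analyzer>
  </fieldType>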
So then,
$ java -jar post.jar Durl=http://localhost:8983/solr/collection2/update
solr.xml monitor.xml
On 11/21/13, 8:14 AM, "xiezhide" wrote:
>
>add Durl=http://localhost:8983/solr/collection2/update when running post.jar,
>This email was sent from 189 Mail (189邮箱)
>
>"Reyes, Mark" wrote:
>
>>Hi all:
>>
>>I’m currently on a
"there is not really anything special about "special" characters"
Well, the distinction was about "named entities", which are indeed special.
Besides, in general, for more sophisticated text processing, character
"types" are a valid distinction.
But all of this begs the question of the origin
What is the actual target speed you are pursuing? Is this for user
suggestions or something of that sort? Content-based suggestions with
faceting, especially on Solr 1.4, won't be lightning fast.
Have you looked at TermsComponent?
http://wiki.apache.org/solr/TermsComponent
By shingles, which in the re
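Assuming a /terms handler is configured (as in the example solrconfig), a
prefix-style suggestion request looks like this (core and field names are
illustrative):

  http://localhost:8983/solr/core1/terms?terms.fl=name&terms.prefix=iph&terms.limit=10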
Hi,
We would like to implement special handling for queries that contain
certain keywords. Our particular use case:
In the example query "Footitle season 1" we want to discover the keyword
"season", get the subsequent number, and boost (or filter for) documents
that match "1" on field name="seas
Hi,
I've recently been asked to implement an application to search products from
several stores, each store having different prices and stock for the same
product.
So I have products that have the usual fields (name, description, brand,
etc) and also number of units and price for each store. I must
Hi, I am using solr4.4 with zookeeper 3.3.5. While I was checking for error
conditions of my application, I came across a strange issue. Here is what I
tried: I have three fields defined in my schema:
a) UNIQUE_KEY - of type solr.TrieLong
b) empId - of type Solr.TrieLong
c) companyId - of type Solr.Trie
Hi,
I'd like to clarify our use case a bit more.
We want to return the exact search query as a suggestion only if it is
present in the index. So in my example we would expect to get the
suggestion "foo" for the query "foo" but no suggestion "abc" for the query
"abc" (because "abc" is not in the di