Hello.
My NRT search is not correctly configured =(
I have 2 Solr instances: one "searcher" and one "updater".
The updater starts an update of around 3000 documents every minute, and the
searcher starts a commit every minute to refresh the index and read the new
docs.
These are my cache values for an
Hi Lance,
Well, I didn't actually copy over the whole configuration files; instead I just
added the missing configuration (into a fresh copy of the example
directory).
By the directory implementation, do you mean the readers used by
SolrIndexSearcher?
These are:
reader : SolrIndexReader{this=1cb0
Hello!
Every night within my maintenance window, during high load caused by
PostgreSQL (vacuum analyze), I see a few (10-30) messages showing up in the
Solr 3.1 logfile.
SEVERE: org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
at
org.mortbay
I start a commit on the "searcher" core with:
.../core/update?commit=true&waitFlush=false
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
1 Core with 31 Million Documents, other Cores < 100,000
- Solr1 for Search
Hey folks,
The Berlin Buzzwords team recently released the schedule for
the conference on high scalability. The conference focuses on the
topics of search,
data analysis, and NoSQL. It will take place on June 6/7, 2011, in Berlin.
We are looking forward to two awesome keynote speakers who shaped the
My filterCache has a warmupTime of ~6000 ms, but my config is like this:
LRU Cache(maxSize=3000, initialSize=50, autowarmCount=50 ...)
Should I set maxSize to 50 or a similar value?
-
--- System
One Server, 12 GB RAM, 2 S
Oooh, my queryResultCache has a warmupTime of 54000 ms => ~1 minute.
Any suggestions?
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
1 Core with 31 Million Documents, other Cores < 100,000
- Solr1 for Sea
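For reference, warmup time comes mainly from re-executing autowarmCount entries against the new searcher, not from maxSize (the size attribute in solrconfig.xml is what shows up as maxSize in the cache stats). A minimal solrconfig.xml sketch, with purely illustrative values rather than recommendations from this thread:

  <filterCache      class="solr.LRUCache" size="3000" initialSize="50" autowarmCount="16"/>
  <queryResultCache class="solr.LRUCache" size="3000" initialSize="50" autowarmCount="16"/>

Lowering autowarmCount (or the cost of the autowarmed entries) is what shortens warmup; shrinking maxSize mostly just reduces the hit rate.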
I'm fighting with the same problem, but with Jetty.
Is it necessary in this case to also delete the Jetty work dir?
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
1 Core with 31 Million Documents other Cores
Hi Lance,
thanks for your reply, but I have a question:
is this patch committed to trunk?
Hi all,
I am porting a series of Solr plugins previously developed for version 1.4.1
to 3.1.0. I've written some integration tests extending the
AbstractSolrTestCase [1] utility class, but now it seems that it wasn't included
in the solr-core 3.1.0 artifact, as it's in the solr/src/test directory. Was
t
Hi everyone,
My situation is the following: I need to add the value of a field to the score of
the docs returned by the query, but not of all the docs. Example:
q=car returns 3 docs
1-
name=car ford
marketValue=1
score=1.3
2-
name=car citroen
marketValue=2
score=1.3
3-
name=car mercedes
marketValue
Hi,
I'm trying to do something like this in Solr 1.4.1:
fq=category_id:(24 79)
However, the values inside the parentheses will be fetched through another
query. So far I've tried using _query_ but it doesn't work the way I want it
to. Here is what I'm trying:
fq=category_id:(_query_:"{!lucene fl=catego
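As a syntax note only (not a fix for fetching the values automatically), a nested query inside fq needs straight quotes and is usually written on its own rather than inside a term list; a hedged sketch:

  fq=_query_:"{!lucene}category_id:(24 79)"

The category IDs still have to be supplied by the client here; as far as I know, Solr 1.4.1 has no way for the inner query to pull them from another query's results.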
Hi,
from time to time we're seeing a "ProtocolException: Unbuffered entity
enclosing request can not be repeated." in the logs when sending ~500
docs to Solr (the stack trace is at the end of the email).
I'm aware that this was discussed before (e.g. [1]) and our solution was
already to reduce th
Hello.
When I start an optimize (which takes more than 4 hours), no updates from
DIH are possible.
I thought Solr copied the whole index and then ran the optimize on the
copy, rather than locking the index and optimizing it in place ... =(
Is there any way to do both at the same time?
-
---
On Tue, Apr 12, 2011 at 6:44 AM, Tommaso Teofili
wrote:
> Hi all,
> I am porting a previously series of Solr plugins developed for 1.4.1 version
> to 3.1.0, I've written some integration tests extending the
> AbstractSolrTestCase [1] utility class but now it seems that wasn't included
> in the sol
Chris:
Here's the nabble URL:
http://lucene.472066.n3.nabble.com/Strip-spaces-and-new-line-characters-from-data-tp2795453p2795453.html
The message in the Solr list is from alexei on 8-April. "Strip spaces and
newline characters from data".
This started happening a couple (?) of weeks ago and I
FWIW, I see the XML I just sent in Gmail, so I'm guessing the problem is over on
the Nabble side, but I have very little evidence...
Erick
P.S. It's not a huge deal, getting to the correct message on nabble is just
a click away. But it is a bit annoying.
On Tue, Apr 12, 2011 at 8:38 AM, Erick Eri
Make sure streaming is on.
--> How do I check that?
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
1 Core with 31 Million Documents, other Cores < 100,000
- Solr1 for Search-Requests - commit every Minute - 5GB
Hi,
I did not want to hijack this thread (
http://www.mail-archive.com/solr-user@lucene.apache.org/msg34181.html)
but I am experiencing the exact same problem mentioned here.
To sum up the issue, I am getting an intermittent "Unavailable Service"
exception during the indexing commit phase.
I know that I
I've asked on Nabble if they know of a fix for the problem:
http://nabble-support.1.n2.nabble.com/solr-dev-mailing-list-tp6023495p6264955.html
Steve
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, April 12, 2011 8:43 AM
> To: Chris Hostetter
If your commit from the client fails, you don't really know the
state of your index anyway. All the threads you have sending
documents to Solr are adding them to a single internal buffer.
Committing flushes that buffer.
So if thread 1 gets an error on commit, it will presumably
have some documents
Sorry, fat fingers. Sent that last e-mail inadvertently.
Anyway, if I have this correct, I'd recommend going to
autocommit and NOT committing from the clients. That's
usually the recommended procedure.
This is especially true if you have a master/slave setup,
because each commit from each client
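A minimal solrconfig.xml sketch of the server-side autocommit recommended above (the thresholds are illustrative assumptions, not values from this thread):

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>10000</maxDocs>  <!-- commit after this many buffered docs -->
      <maxTime>60000</maxTime>  <!-- or after this many milliseconds -->
    </autoCommit>
  </updateHandler>

With this in place the clients only send documents and never call commit themselves.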
Hi,
I have been trying to perform a search using a CommonsHttpSolrServer when my
postCommit event listener is called.
I am not able to find the documents just committed; the "post" in postCommit
led me to assume that I would. It seems that the commit only takes effect
when all postCommit hav
Try using AND (or set q.op):
q=car+AND+_val_:marketValue
On Apr 12, 2011, at 07:11 , Marco Martinez wrote:
> Hi everyone,
>
> My situation is the next, I need to sum the value of a field to the score to
> the docs returned in the query, but not to all the docs, example:
>
> q=car returns 3
Hi
I would like to build a component that, during indexing, analyses all tokens
in a stream and adds metadata to a new field based on my analysis. I have
different tasks that I would like to perform, like basic classification and
some more advanced phrase detection. How would I do this? A normal
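One common way to do this (a sketch under assumptions, not the only option) is an update request processor, which sees the whole SolrInputDocument before it is indexed and can add fields to it; a TokenFilter, by contrast, runs inside a single field's analysis chain and cannot easily write to another field. The class name, the "text" source field, the "category" target field and the classify() logic below are all hypothetical placeholders; the API calls are those of the Solr 3.1 code base:

  import java.io.IOException;

  import org.apache.solr.common.SolrInputDocument;
  import org.apache.solr.request.SolrQueryRequest;
  import org.apache.solr.response.SolrQueryResponse;
  import org.apache.solr.update.AddUpdateCommand;
  import org.apache.solr.update.processor.UpdateRequestProcessor;
  import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

  // Hypothetical factory; reference it from an updateRequestProcessorChain in solrconfig.xml.
  public class ClassifyingProcessorFactory extends UpdateRequestProcessorFactory {

    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest req,
                                              SolrQueryResponse rsp,
                                              UpdateRequestProcessor next) {
      return new ClassifyingProcessor(next);
    }

    static class ClassifyingProcessor extends UpdateRequestProcessor {

      ClassifyingProcessor(UpdateRequestProcessor next) {
        super(next);
      }

      @Override
      public void processAdd(AddUpdateCommand cmd) throws IOException {
        SolrInputDocument doc = cmd.solrDoc;          // the incoming document
        Object body = doc.getFieldValue("text");      // assumed source field
        if (body != null) {
          // classify() stands in for your real classification / phrase detection
          doc.addField("category", classify(body.toString()));
        }
        super.processAdd(cmd);                        // pass the document down the chain
      }

      private String classify(String text) {
        return text.length() > 100 ? "long" : "short"; // placeholder logic only
      }
    }
  }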
Thanks Robert, that was very useful :)
Tommaso
2011/4/12 Robert Muir
> On Tue, Apr 12, 2011 at 6:44 AM, Tommaso Teofili
> wrote:
> > Hi all,
> > I am porting a previously series of Solr plugins developed for 1.4.1
> version
> > to 3.1.0, I've written some integration tests extending the
> > Abs
Thanks, but I tried this and saw that it works in a standard scenario; however,
in my query I use my own query parser, and it seems that it doesn't apply
the AND and returns all the docs in the index:
My query:
_query_:"{!bm25}car" AND _val_:marketValue -> 67000 docs returned
Solr query parser
car
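One possible reading, offered only as a sketch and not a confirmed fix: if the custom bm25 parser is set as the default parser (defType), the whole string is handed to it, so the AND and the _val_ hook are never interpreted. Keeping the lucene parser at the top level and nesting only the keyword clause would look roughly like:

  q=_query_:"{!bm25}car" AND _val_:"marketValue"&defType=lucene

Whether that matches how the custom parser is wired in here is an assumption.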
:
: Here's the nabble URL:
:
:
http://lucene.472066.n3.nabble.com/Strip-spaces-and-new-line-characters-from-data-tp2795453p2795453.html
:
: The message in the Solr list is from alexei on 8-April. "Strip spaces and
: newline characters from data".
And the raw message as received by apache...
h
On 4/12/2011 6:21 AM, stockii wrote:
Hello.
When I start an optimize (which takes more than 4 hours), no updates from
DIH are possible.
I thought Solr copied the whole index and then ran the optimize on the
copy, rather than locking the index and optimizing it in place ... =(
Is there any way to do both at the same ti
You can index and optimize at the same time. The current limitation
or pause is when the RAM buffer is flushing to disk; however, that's
changing with the DocumentsWriterPerThread implementation, e.g.
LUCENE-2324.
On Tue, Apr 12, 2011 at 8:34 AM, Shawn Heisey wrote:
> On 4/12/2011 6:21 AM, stockii
I'm not sure it's a 100% solution, but the new path hierarchy tokenizer
seems promising. I've only played with it a little bit, with a little too much
booze and not enough sleep (in the sky), so apologies for the
potty-mouth-ness of this blog post.
http://www.aaronland.info/weblog/2011/04/02/status/#sky
I have 1 master and 2 slaves set up with 1.3.0 collection distribution. My
frontend web application queries the master; do I need to change any
code in the web application to query the slaves, or does the master
request queries from the slaves automatically? Please help, thanks.
Erick,
My setup is not quite the way you described. I have multiple threads
indexing simultaneously, but I only have 1 thread doing the commit after all
indexing threads have finished. I have multiple instances of this running, each
in its own Java VM. I'm OK with throwing out all the docs indexed
Yes. You need to put, say, a load balancer in front of your slaves
and distribute the requests to the slaves.
Best
Erick
On Tue, Apr 12, 2011 at 2:20 PM, Li Tan wrote:
> I have 1 master, and 2 slaves setup with 1.30 collection distribution. My
> frontwed web application does query to the master,
See below:
On Tue, Apr 12, 2011 at 2:21 PM, Phong Dais wrote:
> Erick,
>
> My setup is not quite the way you described. I have multiple threads
> indexing simultaneously, but I only have 1 thread doing the commit after
> all
> indexing threads finished. I have multiple instances of this runnin
Hi,
I have been trying to get spellcheck to work for the Chinese language. So far
I have not had any luck. Can someone shed some light here as a general
guideline in terms of what needs to happen?
I am using the CJKAnalyzer in the text field type and searching works fine,
but spelling does not wor
Did this go to the list? I think I may need to resubscribe...
Sent from my iPhone
On Apr 12, 2011, at 12:55 AM, Estrada Groups
wrote:
> Has anyone tried doing this? Got any tips for someone getting started?
>
> Thanks,
> Adam
>
> Sent from my iPhone
Thanks Erick, I thought the master did that automatically when you set up collection
distribution. I wish there were more documentation for 1.3 collection distribution.
Do you know how to show the slave stats on the master admin page, on the
distribution tab? Thanks in advance, guys.
Sent from my iPhone
On Ap
It did: http://search-lucene.com/?q=panaramio
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Estrada Groups
> To: Estrada Groups
> Cc: "solr-user@lucene.apache.org"
> Sent: Tue, Apri
Hi,
Does spellchecking in Chinese actually make sense? I once asked a native
Chinese speaker about that and the person told me it didn't really make sense.
Anyhow, with n-grams, I don't think this could technically work even if it made
sense for Chinese, could it?
Otis
Sematext :: http://
If I follow things correctly, I think you should be seeing new documents only
after the commit is done and the new index searcher is open and available for
search. If you are searching before the new searcher is available, you are
probably still hitting the old searcher.
Otis
Sematext ::
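A small SolrJ sketch of the point Otis makes (the URL and query are placeholder assumptions): commit with waitSearcher=true so the call only returns once the new searcher is registered, and only then query.

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

  public class CommitThenQuery {
    public static void main(String[] args) throws Exception {
      // URL is an assumption; point it at your own core.
      CommonsHttpSolrServer server =
          new CommonsHttpSolrServer("http://localhost:8983/solr");

      // commit(waitFlush, waitSearcher): with waitSearcher=true the call blocks
      // until the new searcher is open, so the query below sees the new documents.
      server.commit(true, true);

      long found = server.query(new SolrQuery("*:*")).getResults().getNumFound();
      System.out.println("docs visible after commit: " + found);
    }
  }

This only helps for queries issued after the commit call returns. Inside a postCommit listener itself it does not help: if I recall the commit code correctly, postCommit listeners fire before the new searcher is opened, which is why a search made from the listener still hits the old searcher; a newSearcher event listener runs against the new searcher instead.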
Hi,
I indexed Flickr into Lucene about 3 years ago. There is a Flickr API,
which covers almost everything you need (as I remember, not every
Flickr feature was implemented in the API at that time; for example, the
"collection" was not searchable). You can harvest by user ID or by
searching for a topic. You can
On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez
wrote:
> Thanks but I tried this and I saw that this work in a standard scenario, but
> in my query i use a my own query parser and it seems that they dont doing
> the AND and returns all the docs in the index:
>
> My query:
> _query_:"{!bm25}car" A
It doesn't make sense to spell check individual character-sized words,
but it makes a lot of sense for phrases. Due to the pervasive use of pinyin
IMs, it's very easy to write phrases that are totally wrong
semantically but "sound" correct. n-grams should work if they don't
mangle the characters.
On T
Hi Hoss,
thanks for your response...
You are right, I had a typo in my question, but I did use maxSegments, and
here is the exact URL I used:
curl
'http://localhost:8080/solr/97/update?optimize=true&maxSegments=10&waitFlush=true'
I used jconsole and du -sk to monitor each partial optimize, and
Thanks Otis and Luke.
Yes, it does make sense to spellcheck phrases in Chinese. Looks like the
default Solr spellCheck component is already doing some kind of NGram-ing.
When examining the spellCheck index, I did see gram1, gram2, gram3, gram4...
The problem is no Chinese terms were indexed into th
: /tmp # ls /xxx/solr/data/32455077/index | wc ---> this is the
start point, 150 seg files
: 150 150 946
: /tmp # time curl
the number of files in the index directory is not the "number of
segments"
the number of segments is an internal Lucene concept that impacts
Thanks Peter! I am thinking that I may just use Nutch to do the crawl and index
off of these sites. I need to check out the APIs for each to make sure I'm not
missing anything related to the geospatial data for each image. Obviously both
do the extraction when the images are uploaded so I'm gues
I am hoping to get some feedback on the architecture I've been planning
for a medium to high volume site. This is my first time working
with Solr, so I want to be sure what I'm planning isn't totally weird,
unsupported, etc.
We've got a pair of F5 load balancers and 4 hosts. 2 of those hosts
I think the repeaters are misleading you a bit here. The purpose of a
repeater is
usually to replicate across a slow network, say in a remote data
center, so that slaves at that center can get more timely updates. I don't
think
they add anything to your disaster recovery scenario.
So I'll ignore repe
ManifoldCF sounds like it might be the right solution, so long as it's
not secretly building a filter query in the back end, otherwise it
will hit the same limits.
In the meantime, I have made a minor improvement to my filter query;
it now scans the permitted IDs and attempts to build a filter que
Hi Parker,
Lovely ASCII art. :)
Yes, I think you can simplify this by introducing shared storage (e.g., SAN)
that hosts the index to which your active/primary master writes. When your
primary master dies, you start your stand-by master that is configured to point
to the same index. If there
OK, I dug more into this and realized the file extensions can vary depending on
the schema, right?
For instance we don't have *.tvx, *.tvd, *.tvf (not using term vectors)... and
I suspect the file extensions
may change with future Lucene releases?
Now it seems we can't just count the files using any formul
For example, I am storing the email IDs of a person. If the person has 3 email
IDs, I want to store them as
email = 'x...@whatever.com'
email = 'a...@blah.com'
email = 'p...@moreblah.com'
How can we do this?
I know someone will come up with "why don't you store it like email1,
email2, email3 and
Just set up your schema with a "string" multivalued field...
On Wed, Apr 13, 2011 at 12:47 AM, shrinath.m wrote:
> For example, I am storing email ids of a person. If the person has 3 email
> ids, I want to store them as
> email = 'x...@whatever.com'
> email = 'a...@blah.com'
> email = 'p...@more
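A schema.xml sketch of what the reply above describes (the field name matches the example; the indexed/stored flags are assumptions):

  <field name="email" type="string" indexed="true" stored="true" multiValued="true"/>

Each document can then carry the field several times, e.g.:

  <add>
    <doc>
      <field name="email">x...@whatever.com</field>
      <field name="email">a...@blah.com</field>
      <field name="email">p...@moreblah.com</field>
    </doc>
  </add>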