Otis,
We are not running a master-slave configuration. We get very few
searches (admin only) in a day, so we didn't see the need for
replication/snapshots. This problem is with one Solr instance managing
4 cores (each core 200 million records). Both indexing and searching
are performed by the same Solr instance.
To me it sounds like it's not finding solr home. I have Windows Vista and
JDK 1.6.0_11 and when I run java -jar start.jar, I too get a ton of the INFO
messages and one of them should read something like: INFO: solr home
defaulted to 'solr/' (could not find system property or JNDI)
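If the default isn't being picked up, solr home can be set explicitly, either with the system property (java -Dsolr.solr.home=/path/to/solr -jar start.jar) or via JNDI. A sketch of the JNDI route, assuming the stock web.xml layout of that era; the path is a placeholder:

```xml
<!-- in the solr webapp's web.xml; /path/to/solr is a placeholder -->
<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/path/to/solr</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
```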
May 13, 2009 10:45
There's a related issue open.
https://issues.apache.org/jira/browse/SOLR-712
On Thu, May 14, 2009 at 7:50 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
>
> Bryan, maybe it's time to stick this in JIRA?
> http://wiki.apache.org/solr/HowToContribute
>
> Thanks,
> Otis
> --
> Sematext -
Ideally, we don't do that.
You can just keep the master host behind a VIP, so if you wish to
change the master, make the VIP point to the new host.
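A sketch of what that looks like on a slave, assuming Solr 1.4's Java-based replication handler (the VIP hostname here is hypothetical):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- solr-master-vip.example.com is a hypothetical VIP name;
         repoint the VIP to promote a new master, no slave changes needed -->
    <str name="masterUrl">http://solr-master-vip.example.com:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```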
On Wed, May 13, 2009 at 10:52 PM, nk 11 wrote:
> This is more interesting. Such a procedure would involve taking down and
> reconfiguring the slave?
>
>
Hi Grant,
That's not a bad idea... I could try that. I was also looking at cactus:
http://jakarta.apache.org/cactus/integration/ant/index.html
It has an ant task to merge XML. Could this be a contrib-crawl add-on?
Alternatively, do you know of any XSLT templates built for this? I could
write one,
On May 13, 2009, at 6:53 PM, vivek sar wrote:
Disabling first/new searchers did help with the initial load time, but
after 10-15 min the heap memory started climbing again and reached
max within 20 min. Now the GC is kicking in all the time, which is
slowing down the commit and search cycles.
T
Andrey,
I urge you to use JIRA for this. That's exactly what it's for and how it gets
used.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Andrey Klochkov
> To: solr-user@lucene.apache.org
> Sent: Thursday, May 7, 2009 5:14:26 AM
> Sub
Wojtek,
I believe
http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/spans/SpanFirstQuery.html
would help, though there is no support for Span queries in Solr. But there is
support for custom query parsers, and there is
http://lucene.apache.org/java/2_4_1/api/contrib-snowb
Bryan, maybe it's time to stick this in JIRA?
http://wiki.apache.org/solr/HowToContribute
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Bryan Talbot
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 10:11:21 PM
>
I think the patch I included earlier covers solr core, but it looks
like at least some other extensions (DIH) create and use their own XML
parser. So, if this functionality is to extend to all XML files,
those will need similar patches.
Here's one for DIH:
--- src/main/java/org/apache/sol
Coincidentally, from
http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/ :
"Hadoop configuration files now support XInclude elements for including
portions of another configuration file (HADOOP-4944). This mechanism allows you
to make configuration files more modular
There is constant mixing of indexing concepts and searching concepts in this
thread. Are you having problems on the master (indexing) or on the slave
(searching)?
That .tii is only 20K and you said this is a large index? That doesn't smell
right...
Otis
--
Sematext -- http://sematext.com/
Yeah, I'm not sure why this would help. There should be nothing in FieldCaches
unless you sort or use facets.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: vivek sar
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 5:5
Even a simple command like this will help:
jmap -histo:live <pid> | head -30
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: vivek sar
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 6:53:29 PM
> Subject: Re: Solr memory re
I'm having difficulty getting Solr running on Vista. I've got the 1.6
JDK installed, and I've successfully compiled and run other Java
programs.
When I run java -jar start.jar in the Apache Solr example directory, I
get a large number of INFO messages, including:
INFO: JNDI not configur
I created Ruby class SolrCellRequest and saved it to
/path/to/resume/vendor/plugins/acts_as_solr/lib directory.
Here is the original code from the tutorial:
module ActsAsSolr
  class SolrCellRequest < Solr::Request::Select
    def initialize(doc, file_name)
    .
    .
    def handler
    '
Warning: I'm way out of my competency range when I comment
on SOLR, but I've seen the statement that string fields are NOT
tokenized while text fields are, and I notice that almost all of your fields
are string type.
Would someone more knowledgeable than me care to comment on whether
this is at
I think maxBufferedDocs has been deprecated in Solr 1.4 - it's
recommended to use ramBufferSizeMB instead. My ramBufferSizeMB=64.
This shouldn't be a problem I think.
There has to be something else that Solr is holding up in memory. Anyone else?
Thanks,
-vivek
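For reference, a sketch of where that setting lives in solrconfig.xml (value from above; Solr 1.4-era layout assumed):

```xml
<indexDefaults>
  <!-- replaces the deprecated maxBufferedDocs: flush the in-memory
       buffer to disk once it reaches roughly 64 MB -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
</indexDefaults>
```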
On Wed, May 13, 2009 at 4:01 PM, Ja
Hi Erik et al.,
I am following this tutorial link
http://www.lucidimagination.com/blog/tag/acts_as_solr/
to play with acts_as_solr and see if we can invoke solr cell right
from our Rails app.
Following the tutorial I created class SolrCellRequest but don't
know where to save the solr_cell_re
Have you checked the maxBufferedDocs? I had to drop mine down to 1000 with
3 million docs.
Jack
On Wed, May 13, 2009 at 6:53 PM, vivek sar wrote:
> Disabling first/new searchers did help with the initial load time, but
> after 10-15 min the heap memory started climbing again and reached
> max w
Disabling first/new searchers did help with the initial load time, but
after 10-15 min the heap memory started climbing again and reached
max within 20 min. Now the GC is kicking in all the time, which is
slowing down the commit and search cycles.
It is still puzzling what Solr holds in the
Just an update on the memory issue - might be useful for others. I
read the following,
http://wiki.apache.org/solr/SolrCaching?highlight=(SolrCaching)
and it looks like the first and new searcher listeners would populate the
FieldCache. Commenting out these two listener entries seems to do the
tric
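For anyone following along, the listener entries in question look roughly like this in a stock solrconfig.xml (the query contents below are the shipped examples, not anything specific to this index); warming queries that sort or facet are what populate the FieldCache:

```xml
<!-- commenting these out skips searcher warming on startup and after commits -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">solr</str><str name="start">0</str><str name="rows">10</str></lst>
  </arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">fast_warm</str><str name="start">0</str><str name="rows">10</str></lst>
  </arr>
</listener>
```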
Have you done any profiling to see where the hotspots are? I realize
that may be difficult on an index of that size, but maybe you can
approximate on a smaller version. Also, do you have warming queries?
You might also look into setting the termIndexInterval at the Lucene
level. This is
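On the termIndexInterval suggestion: a sketch of the knob, assuming a solrconfig.xml that exposes it (128 is the Lucene default; a larger value keeps fewer terms from the .tii in memory at the cost of slower term lookups):

```xml
<indexDefaults>
  <!-- default 128; doubling it roughly halves the in-memory term index -->
  <termIndexInterval>256</termIndexInterval>
</indexDefaults>
```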
With Solr 1.3 I'm having a problem boosting new documents to the top. I
used the recommended BoostFunction "recip(rord(created_at),1,1000,1000)"
but older documents, sometimes 5 years old, make it to the top 3 documents.
I've started using "ord(created_at)^0.0005" and get better results, but I
d
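For context, a sketch of where such a boost function would sit in a dismax handler's defaults (the handler name and layout here are illustrative):

```xml
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- newer created_at => smaller rord => larger recip value => bigger boost -->
    <str name="bf">recip(rord(created_at),1,1000,1000)</str>
  </lst>
</requestHandler>
```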
Otis,
In that case, I'm not sure why Solr is taking up so much memory as
soon as we start it up. I checked for .tii file and there is only one,
-rw-r--r-- 1 search staff 20306 May 11 21:47 ./20090510_1/data/index/_3au.tii
I have all the cache disabled - so that shouldn't be a problem too. My
Hi,
Sorting is triggered by the sort parameter in the URL, not a characteristic of
a field. :)
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: vivek sar
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 4:42:16 PM
> Subje
Thanks Otis.
Our use case doesn't require any sorting or faceting. I'm wondering if
I've configured anything wrong.
I've got a total of 25 fields (15 are indexed and stored, the other 10 are just
stored). All my fields are basic data types - which I thought are not
sorted. My id field is unique key.
Is th
Indeed - that looks nice - having some kind of conditional includes
would make many things easier.
-Peter
On Wed, May 13, 2009 at 4:22 PM, Otis Gospodnetic
wrote:
>
> This looks nice and simple. I don't know enough about this stuff to see any
> issues. If there are no issues...?
>
> Otis
>
This looks nice and simple. I don't know enough about this stuff to see any
issues. If there are no issues...?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Bryan Talbot
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2
Hi,
Some answers:
1) The .tii files in the Lucene index. When you sort, all distinct values for the
field(s) used for sorting are loaded into memory. Similarly for facet fields. Plus Solr's caches.
2) ramBufferSizeMB dictates, more or less, how much Lucene/Solr will consume
during indexing. There is no need to commit every 5
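On the commit frequency point, an autoCommit block can stand in for frequent explicit commits; a sketch (thresholds are illustrative, not recommendations):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit after this many buffered docs, or after 5 minutes,
         whichever comes first -->
    <maxDocs>50000</maxDocs>
    <maxTime>300000</maxTime>
  </autoCommit>
</updateHandler>
```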
Hi,
I'm pretty sure this has been asked before, but I couldn't find a
complete answer in the forum archive. Here are my questions,
1) When Solr starts up, what does it load up in memory? Let's say
I've 4 cores with each core 50G in size. When Solr comes up how much
of it would be loaded in
Hi Terence,
Terence Gannon wrote:
Yes, the ownerUid will likely be assigned once and never changed. But
you still need it, in order to keep track of who has contributed which
document.
Yes, of course!
I've been going over some of the simpler query scenarios, and Solr is
capable of handlin
>Hi
>
>Is it possible, through dataimport handler to remove an existing
>document from the Solr index?
>
>I import/update from my database where the active field is true.
>However, if the client then sets active to false, the document stays
>in the Solr index and doesn't get removed.
>
>Regards
>A
Try a search for *:* and see if you get results for that. If so, you
have your documents indexed, but you need to dig into things like
query parser configuration and analysis to see why things aren't
matching. Perhaps you're not querying the field you think you are?
Erik
On May 1
Hi,
This problem is still haunting us. I've reduced the merge factor to
50, but as my index gets fat (anything over 20G), the commit starts
taking much longer. Some info,
1) Less than 20 G index size, 5000 records commit takes around 15sec
2) Over 20G the commit starts taking 50-70sec for 5K rec
This is more interesting. Such a procedure would involve taking down and
reconfiguring the slave?
On Wed, May 13, 2009 at 7:55 PM, Bryan Talbot wrote:
> Or ...
>
> 1. Promote existing slave to new master
> 2. Add new slave to cluster
>
>
>
>
> -Bryan
>
>
>
>
>
> On May 13, 2009, at 9:48 AM
I forgot to say that when I do
curl http://localhost:8983/solr/update -H "Content-Type: text/xml"
--data-binary '<commit/>'
the response shows status 0 (QTime 453),
and a search for the added keywords gives 0 results. Does status 0 mean that the addition
was successful?
Thanks.
Alex.
-Original Message-
From: Erik Hatcher
T
On Wed, May 13, 2009 at 12:29 PM, Geoffrey Young
wrote:
>> However since the indexed term is simply "leann", a
>> WordDelimiterFilter configured to split won't match (a search for
>> "LeAnn" will be translated into a search for "le" "ann".
>
> but the concatparts and/or concatall should handle spl
Or ...
1. Promote existing slave to new master
2. Add new slave to cluster
-Bryan
On May 13, 2009, at 9:48 AM, Jay Hill wrote:
- Migrate configuration files from old master (or backup) to new
master.
- Replicate from a slave to the new master.
- Resume indexing to new master.
- Migrate configuration files from old master (or backup) to new master.
- Replicate from a slave to the new master.
- Resume indexing to new master.
-Jay
On Wed, May 13, 2009 at 4:26 AM, nk 11 wrote:
> Nice.
> What if the master fails permanently (like a disk crash...) and the new
> master is
Our company has a large search deployment serving > 50M search hits per
day.
We've been leveraging Lucene for several years and have recently deployed
Solr for the distributed search feature. We were hitting scaling limits
with Lucene due to our index size.
I did an evaluation of Sphinx and f
On May 13, 2009, at 11:55 AM, wojtekpia wrote:
I came across this article praising Sphinx:
http://www.theregister.co.uk/2009/05/08/dziuba_sphinx/. The article
specifically mentions Solr as an 'aging' technology,
Solr is the same age as Sphinx (2006), so if Solr is aging, then so is
Sphinx.
On Wed, May 13, 2009 at 6:23 AM, Yonik Seeley
wrote:
> On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young
> wrote:
>> hi all :)
>>
>> I'm having trouble with camel-cased query strings and the dismax handler.
>>
>> a user query
>>
>> LeAnn Rimes
>>
>> isn't matching the indexed term
>>
>> Leann Rim
It's probably the case that every search engine out there is faster
than Solr at one thing or another, and that Solr is faster or better
at some other things.
I prefer to spend my time improving Solr rather than engage in
benchmarking wars... and Solr 1.4 will have a ton of speed
improvements over
I see that Nobel's final comment in SOLR-1154 is that config files
need to be able to include snippets from external files. In my
limited testing, a simple patch to enable XInclude support seems to
work.
--- src/java/org/apache/solr/core/Config.java (revision 774137)
+++ src/java/org/a
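With a patch like that applied (the parser has to be created with setXIncludeAware(true), which is what the patch enables), usage would presumably look like this; the included file name is hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- shared-settings.xml is a hypothetical fragment shared across cores -->
  <xi:include href="shared-settings.xml"/>
  <!-- ... rest of solrconfig.xml ... -->
</config>
```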
I came across this article praising Sphinx:
http://www.theregister.co.uk/2009/05/08/dziuba_sphinx/. The article
specifically mentions Solr as an 'aging' technology, and states that
performance on Sphinx is 2x-4x faster than Solr. Has anyone compared Sphinx
to Solr? Or used Sphinx in the past? I re
Yes, the ownerUid will likely be assigned once and never changed. But
you still need it, in order to keep track of who has contributed which
document.
I've been going over some of the simpler query scenarios, and Solr is
capable of handling them without having to resort to an external
RDBMS. In
Hi
Is it possible, through dataimport handler to remove an existing
document from the Solr index?
I import/update from my database where the active field is true.
However, if the client then sets active to false, the document stays
in the Solr index and doesn't get removed.
Regards
Andrew
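One way later DIH versions handle this is the deletedPkQuery attribute used during delta imports (SOLR-768); a sketch, with a hypothetical table and columns:

```xml
<entity name="item" pk="id"
        query="SELECT id, title FROM item WHERE active = 1"
        deltaQuery="SELECT id FROM item
                    WHERE active = 1
                      AND last_modified > '${dataimporter.last_index_time}'"
        deletedPkQuery="SELECT id FROM item
                        WHERE active = 0
                          AND last_modified > '${dataimporter.last_index_time}'"/>
```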
On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young
wrote:
> hi all :)
>
> I'm having trouble with camel-cased query strings and the dismax handler.
>
> a user query
>
> LeAnn Rimes
>
> isn't matching the indexed term
>
> Leann Rimes
This is the camel-case case that can't currently be handled by a
Hmmm, maybe we need to think about some way to hook this into the build
process or make it easier to just drop it into the conf or lib dirs.
I'm no web.xml expert, but I'm sure you're not the first one to want
to do this kind of thing.
The easiest way _might_ be to patch build.xml to take a
Terence Gannon wrote:
Paul -- thanks for the reply, I appreciate it. That's a very
practical approach, and is worth taking a closer look at. Actually,
taking your idea one step further, perhaps three fields: 1) ownerUid
(uid of the document's owner) 2) grantedUid (uid of users who have
been g
Nice.
What if the master fails permanently (like a disk crash...) and the new
master is a clean machine?
2009/5/13 Noble Paul നോബിള് नोब्ळ्
> On Wed, May 13, 2009 at 12:10 PM, nk 11 wrote:
> > Hello
> >
> > I'm kind of new to Solr and I've read about replication, and the fact
> that a
> > node
On Wed, May 13, 2009 at 12:10 PM, nk 11 wrote:
> Hello
>
> I'm kind of new to Solr and I've read about replication, and the fact that a
> node can act as both master and slave.
> If a replica fails and then comes back online, I suppose that it will resync
> with the master.
right
>
> But what happ
Hello Shalin,
Thanks for your help. Yes, it answers my question.
Much appreciated.
Shalin Shekhar Mangar wrote:
>
> On Tue, May 12, 2009 at 9:48 PM, Wayne Pope
> wrote:
>
>>
>> I have this request:
>>
>>
>> http://localhost:8983/solr/select?start=0&rows=20&qt=dismax&q=copy&hl=true&hl.snipp
That's probably JIRA SOLR-1063. We have only seen it in the spellcheck
results and only in PHPS and not in PHP ResponseWriter.
https://issues.apache.org/jira/browse/SOLR-1063
-
Markus Jelsma Buyways B.V. Tel. 050-3118123
Technisch Architect    Friesestraatweg 215c
In addition to my earlier mail, I have a particular scenario. For that I have to
explain my application level logging in detail.
I am using solr as embedded server. I am using solr with Solr-560-slf4j patch.
I need logging information for solr. Right now my application is using log4j
for logging
We are using a nightly from 13/04. I've found one issue with the PHP
ResponseWriter but apart from that it has been pretty solid.
I'm using the bundled Jetty server to run it for the moment but hope
to move to Tomcat once released and stable (and I have learned
Tomcat!).
Andrew
2009/5/12 Walte