hello all,
for business reasons, we are sourcing the spellcheck file from another
business group.
the file we receive looks like the example data below
can Solr support this type of format, or do I need to process this file
into a format that has a single word on a single line?
thanks for a
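If the file does turn out to need preprocessing, a small script is an easy fallback. A minimal sketch, assuming the file simply has several whitespace-separated words per line (the assumption and file names are hypothetical, since the example data didn't make it into the archive):

```python
def flatten_dictionary(lines):
    # Yield one word per line, deduplicated, preserving first-seen order,
    # which matches the single-word-per-line dictionary format.
    seen = set()
    for line in lines:
        for word in line.split():
            if word not in seen:
                seen.add(word)
                yield word

# Typical use (file names are hypothetical):
#   with open("spellings_raw.txt") as src, open("spellings.txt", "w") as dst:
#       for word in flatten_dictionary(src):
#           dst.write(word + "\n")
```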
On 3/23/2012 9:55 AM, stockii wrote:
what does the requestHandler of your broker look like? i'm thinking about
using your idea to do the same ;)
Here's what I have got for the default request handler in my broker
core, which is called ncmain. The "rollingStatistics" section is
applicable to the SOLR-1972 patch.
> Also, the field length is encoded in a byte (as I remember). So it's
> quite possible that, even if the lengths of these fields were 3 and 4
> instead of both being 1, the value stored for the length norms would
> be the same number.
Exactly. http://search-lucene.com/m/uGKRu1pvRjw
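For the curious, the collapse is easy to demonstrate. Below is an unofficial Python port of Lucene's SmallFloat.floatToByte315 (the lossy byte encoding used for length norms in this era, paired with DefaultSimilarity's lengthNorm = 1/sqrt(numTokens)); treat it as a sketch, not the reference implementation:

```python
import math
import struct

def float_to_byte315(f):
    # Unofficial port of Lucene's SmallFloat.floatToByte315: an 8-bit
    # lossy float with 3 mantissa bits and zero-exponent point 15.
    # Returns the byte as an int in 0..255 (Java returns a signed byte).
    bits = struct.unpack('>i', struct.pack('>f', f))[0]
    smallfloat = bits >> (24 - 3)
    if smallfloat <= (63 - 15) << 3:
        return 0 if bits <= 0 else 1
    if smallfloat >= ((63 - 15) << 3) + 0x100:
        return 0xFF  # Java returns (byte) -1 here
    return smallfloat - ((63 - 15) << 3)

def length_norm(num_tokens):
    # DefaultSimilarity's lengthNorm: 1 / sqrt(number of tokens).
    return 1.0 / math.sqrt(num_tokens)

# Lengths 3 and 4 map to the same norm byte, so they score identically,
# while lengths 1 and 2 still get different bytes.
```

Under this encoding, length_norm(3) and length_norm(4) both quantize to the same byte, which is exactly the effect described in the quoted message.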
Hey All-
we run a http://carsabi.com car search engine with Solr and did some
benchmarking recently after we switched from a hosted service to
self-hosting. In brief, we went from 800ms complex range queries on a 1.5M
document corpus to 43ms. The major shifts were switching from EC2 Large to
EC2
Erik:
The field length is, I believe, based on _tokens_, not characters.
Both of your examples are exactly one token long, so the scores are
probably identical.
Also, the field length is encoded in a byte (as I remember). So it's
quite possible that, even if the lengths of these fields were 3
Alexandre:
Have you changed anything on your slave?
And do you have more than one slave? If you do, have you considered
just blowing away the entire .../data directory on the slave and letting
it re-start from scratch? I'd take the slave out of service for the
duration of this operation, or
Hello there,
I have a quite basic question, but my Solr is behaving in a way I don't
quite understand.
The setup is simple: I have a field "suggestionText" in which single strings
are indexed. Schema:
Since I want this field to serve for a suggestion-search, the input string is
In that case, I'm kind of stuck. You've already rebuilt your index
from scratch and removed it from your slaves. That should have
cleared out most everything that could be an issue. I'd suggest
you set up a pair of machines from scratch and try to set up an
index/replication with your current schema
Suppose I have content which has a title and a description. Users can tag
content and search content based on tag, title, and description. Tags carry
more weight.
Any input on how indexing and retrieval would work in Solr given that there
is content and tags? Has anyone implemented search based on collab
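One standard way to make tags count more is the dismax/edismax qf parameter with per-field boosts, so each field gets its own weight at query time. A minimal sketch of the request parameters (field names and boost factors are made up for illustration):

```python
def build_search_params(user_query):
    # Build edismax parameters that weight tag matches highest, then
    # title, then description. Field names and boosts are illustrative.
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": "tag^3.0 title^2.0 description^1.0",
        "fl": "id,title,score",
    }
```

These parameters can be sent to /select with any HTTP client, or set as defaults on the request handler so clients only send q.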
Tomás,
The 300+GB size is only inside the index.20110926152410 dir. Inside there
are a lot of files.
I am almost convinced that something is messed up, like someone having
committed on this slave machine.
Thanks
2012/3/23 Tomás Fernández Löbbe
> Alexandre, additionally to what Erick said, you may want t
Erick,
The master /data dir contains only an index dir with a bunch of files.
In the slave, the /data dir contains an index.20110926152410 dir with a lot
more files than the master. That seems quite strange to me.
I guess that the config is right, since we have another slave that is
running fine wi
On Mar 23, 2012, at 12:49 PM, I-Chiang Chen wrote:
> Caused by: java.lang.OutOfMemoryError: Map failed
Hmm...looks like this is the key info here.
- Mark Miller
lucidimagination.com
We saw a couple of distinct errors, and all machines in a shard are identical:
-On the leader of the shard
Mar 21, 2012 1:58:34 AM org.apache.solr.common.SolrException log
SEVERE: shard update error StdNode:
http://blah.blah.net:8983/solr/master2-slave1/:org.apache.solr.common.SolrException:
Map failed
Howdy Folks,
I'm stumped and hope somebody can give me some clues on how to work around
this occasional error I'm getting.
I've got a .Net console program using SolrNet to scour certain folders at
certain times and extract text from PDF files and index them. It succeeds on
a majority of the fi
Hi Erick,
It's not possible, because both master and slaves are using the same binaries.
Thanks...
On Fri, Mar 23, 2012 at 5:30 PM, Erick Erickson wrote:
> Hmmm, looking at your stack trace in a bit more detail, this is really
> suspicious:
>
> Caused by: org.apache.lucene.index.IndexFormatTooNewExcept
@Shawn Heisey-4
what does the requestHandler of your broker look like? i'm thinking about
using your idea to do the same ;)
-
--- System
One server, 12 GB RAM, 2 Solr instances, 8 cores;
1 core with 45 million documents, the other cores < 200,000
Alexandre, in addition to what Erick said, you may want to check in the
slave whether what's 300+GB is the "data" directory or the
"index.<timestamp>" directory.
On Fri, Mar 23, 2012 at 12:25 PM, Erick Erickson wrote:
> not really, unless perhaps you're issuing commits or optimizes
> on the _slave_ (which you s
Hmmm, looking at your stack trace in a bit more detail, this is really
suspicious:
Caused by: org.apache.lucene.index.IndexFormatTooNewException: Format
version is not supported in file 'segments_1': -12 (needs to be between -9
and -11)
This *looks* like your Solr version on your slave is older t
not really, unless perhaps you're issuing commits or optimizes
on the _slave_ (which you should NOT do).
Replication happens based on the version of the index on the master.
True, it starts out as a timestamp, but then successive versions
just have that number incremented. The version number
in th
Also, what happens if, instead of adding the 40K docs you add just one and
commit?
2012/3/23 Tomás Fernández Löbbe
> Have you changed the mergeFactor or are you using 10 as in the example
> solrconfig?
>
> What do you see in the slave's log during replication? Do you see any line
> like "Skippin
Have you changed the mergeFactor or are you using 10 as in the example
solrconfig?
What do you see in the slave's log during replication? Do you see any line
like "Skipping download for..."?
On Fri, Mar 23, 2012 at 11:57 AM, Ben McCarthy <
ben.mccar...@tradermedia.co.uk> wrote:
> I just have a i
Hi Erick,
I've already tried steps 2 and 3, but they didn't help. It's almost
impossible for us to do step 1 because of the project deadline.
Do you have any other suggestions?
Thanks for your reply.
On Fri, Mar 23, 2012 at 4:56 PM, Erick Erickson wrote:
> Hmmm, that is odd. But a trunk build from that lon
Erick,
We're using Solr 3.3 on Linux (CentOS 5.6).
The /data dir on master is actually 1.2G.
I haven't tried to recreate the index yet. Since it's a production
environment,
I guess that I can stop replication and indexing and then recreate the
master index to see if it makes any difference.
Also
I just have an index directory.
I push the documents through with a change to a field; I'm using SolrJ to do
this. I'm using the guide from the wiki to set up the replication. When the
feed of updates to the master finishes I call a commit, again using SolrJ. I
then have a poll period of 5 minutes
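For anyone following along, a Solr 3.x replication setup of the kind described (slave polling every 5 minutes) looks roughly like this in solrconfig.xml; host, port, and core name are placeholders, not the poster's actual values:

```xml
<!-- master solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- slave solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/corename/replication</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```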
Hmmm, that is odd. But a trunk build from that long ago is going
to be almost impossible to debug/fix. The problem with working
from trunk is that this kind of problem won't get much attention.
I have three suggestions:
1> update to current trunk. NOTE: you'll have to completely
reindex your
What version of Solr and what operating system?
But regardless, this shouldn't be happening. Indexes can
temporarily double in size, but any extras should be
cleaned up relatively soon.
On the master, what's the total size of the /data directory?
I'm a little suspicious of the on your master, bu
Hi Ben, only new segments are replicated from master to slave. In a
situation where all the segments are new, this will cause the index to be
fully replicated, but this rarely happens with incremental updates. It can
also happen if the slave Solr assumes it has an "invalid" index.
Are you committing
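The "only new segments" behavior can be pictured as a file-list diff against the master's current commit point: the slave fetches only files it doesn't already have. A toy model of that decision (not the actual SnapPuller code; file names and sizes are invented):

```python
def files_to_download(master_files, slave_files):
    # Return the master commit-point files the slave still needs.
    # Both arguments map file name -> size; a file is fetched if it is
    # missing on the slave or its size differs. Toy model only.
    return sorted(
        name for name, size in master_files.items()
        if slave_files.get(name) != size
    )

master = {"_0.fdt": 100, "_1.fdt": 250, "segments_2": 40}
slave = {"_0.fdt": 100, "segments_1": 40}
# Only the new segment and the new segments file are fetched:
# files_to_download(master, slave) -> ['_1.fdt', 'segments_2']
```

If every segment changed (for example after an optimize rewrote the whole index), the diff is the whole index, which matches the full-copy behavior people see.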
Hello,
We have a Solr index that averages 1.19 GB in size.
After configuring replication, the index size on the slave machine has been
growing exponentially.
Currently we have a slave whose index is 323.44 GB.
Is there anything that could cause this behavior?
The current replication config is be
So do you just simply address this with big NICs and network pipes?
-Original Message-
From: Martin Koch [mailto:m...@issuu.com]
Sent: 23 March 2012 14:07
To: solr-user@lucene.apache.org
Subject: Re: Simple Slave Replication Question
I guess this would depend on network bandwidth, but we mo
I guess this would depend on network bandwidth, but we move around
150G/hour when hooking up a new slave to the master.
/Martin
On Fri, Mar 23, 2012 at 12:33 PM, Ben McCarthy <
ben.mccar...@tradermedia.co.uk> wrote:
> Hello,
>
> Im looking at the replication from a master to a number of slaves.
Hi list
In my ~6M index served from a slave that is replicating from a master, I'm
trying to do this query :
localhost:8080/solr/core0/select?q=car&qf=document%5E1&defType=edismax
Can anybody explain the below error that I get as a result? It may (or may
not) be related to another problem that w
>
> Where is Join documented? I looked at
> http://wiki.apache.org/solr/Join and see no reference to "fromIndex".
> Also does this work in a distributed environment?
>
The "fromIndex" isn't documented in the wiki. It is mentioned in the
issue, and you can find it in the Solr code:
https://issues.ap
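For reference, the join is expressed with local params, and fromIndex names the other core. A small sketch of building such a query string (core and field names are hypothetical):

```python
def join_query(from_field, to_field, from_index, inner_query):
    # Build a Solr {!join} query; fromIndex points the "from" side at
    # another core on the same node. All names here are hypothetical.
    return "{{!join from={} to={} fromIndex={}}}{}".format(
        from_field, to_field, from_index, inner_query)

# e.g. restrict documents to those whose user has the superuser role:
# join_query("user_id", "id", "users", "role:superuser")
#   -> "{!join from=user_id to=id fromIndex=users}role:superuser"
```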
On Fri, Mar 23, 2012 at 6:37 AM, Martijn v Groningen
wrote:
> On 22 March 2012 03:10, Jamie Johnson wrote:
>
>> I need to apologize: I believe that in my example I have too grossly
>> oversimplified the problem and it's not clear what I am trying to do,
>> so I'll try again.
>>
>> I have a situat
Hi all
FYI, I am working on a website for doing side-by-side comparisons of
several common enterprise search engines, including some that are based
on Solr. Currently I have Searchdaimon ES, Microsoft SSE 2010,
SearchBlox, Google Mini, Thunderstone, Constellio, mnoGoSearch and IBM
OmniFind Yahoo ru
Hello,
I'm looking at replication from a master to a number of slaves. I have
configured it and it appears to be working. When updating 40K records on the
master, is it standard to always copy over the full index, currently 5 GB in
size? If this is standard, what do people do who have massiv
We did some tests too, with many millions of documents and auto-commit
enabled. It didn't take long for the indexer to stall, and in the meantime
the number of open files exploded to over 16k, then 32k.
On Friday 23 March 2012 12:20:15 Mark Miller wrote:
> What issues? It really shouldn't be a pr
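When open files explode like that, it's worth checking the process's descriptor limit alongside the merge settings. A small Unix-only check using Python's resource module (a diagnostic sketch, not part of Solr):

```python
import resource

def describe_fd_limit():
    # Report the soft/hard limits on open file descriptors for this
    # process (Unix only). If an indexer approaches the soft limit,
    # raise it (ulimit -n) or reduce segment counts via merge settings
    # such as mergeFactor or the compound file format.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {"soft": soft, "hard": hard}
```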
What issues? It really shouldn't be a problem.
On Mar 22, 2012, at 11:44 PM, I-Chiang Chen wrote:
> At this time we are not leveraging the NRT functionality. This is the
> initial data load process where the idea is to just add all 200 millions
> records first. Than do a single commit at the e
I dug deeper into the problem and discovered that...
$math.toInteger("10.1") returns 101
$math.toInteger("10,1") returns 10
Although I'm using Strings in the previous examples, I have a Float
variable from Solr.
I'm not sure if it is just a Solr problem, just a Velocity problem, or
somewhere betw
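Those two results are exactly what locale-sensitive number parsing produces in a comma-decimal locale, where '.' is a grouping (thousands) separator and ',' is the decimal point: "10.1" parses as the integer 101, and "10,1" parses as 10.1 and truncates to 10. A hand-rolled illustration of that parsing rule (a toy model, not Velocity's actual code):

```python
def to_integer_comma_locale(s):
    # Parse a number string the way a comma-decimal locale would:
    # '.' is a grouping separator (dropped), ',' is the decimal point.
    # The result is truncated to an int, mimicking $math.toInteger.
    normalized = s.replace(".", "").replace(",", ".")
    return int(float(normalized))

# to_integer_comma_locale("10.1") -> 101  (dot dropped as grouping)
# to_integer_comma_locale("10,1") -> 10   (decimal part truncated)
```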
On 22 March 2012 03:10, Jamie Johnson wrote:
> I need to apologize: I believe that in my example I have too grossly
> oversimplified the problem and it's not clear what I am trying to do,
> so I'll try again.
>
> I have a situation where I have a set of access controls say user,
> super user and
On 23.03.2012 11:17, Michael Kuhlmann wrote:
Adding your own SearchComponent after the regular QueryComponent (or
better, as a "last-element") is goof ...
Of course, I meant "good", not "goof"! ;)
Greetings,
Kuli
On 23.03.2012 10:29, Ahmet Arslan wrote:
I'm looking at the following. I want
to (1) map some query fields to
some other query fields and add some things to FL, and then
(2)
rescore.
I can see how to do it as a RequestHandler that makes a
parser to get
the fields, or I could see making a Searc
> I'm looking at the following. I want
> to (1) map some query fields to
> some other query fields and add some things to FL, and then
> (2)
> rescore.
>
> I can see how to do it as a RequestHandler that makes a
> parser to get
> the fields, or I could see making a SearchComponent that was
> stuck
here is my method.
1. check out the latest source code from trunk, or download the tarball:
svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk lucene_trunk
2. create a dynamic web project in eclipse and close it.
for example, I create a project named lucene-solr-trunk in my
workspace.