Hello again ;-)
After a full import of 36M docs my delta-import doesn't work well.
If I start my delta import (which runs very fast on another core), the commit takes
very long.
I think Solr copies the whole index, commits the new documents into the
index, and then reduces the index size afterwards.
Why does Solr copy my complete index somewhere when I start a delta-import?
I copy one core, start a full import of 35 million docs, and then start a
delta-import for the last hour (~2000 docs).
DIH/Solr then starts to copy the whole index... why? I think it is copying the
index, because my hdd-sp
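For reference, a delta-import in DIH is driven by entity attributes like the following (a generic sketch; the table, columns, and field names are illustrative, not taken from this setup):

```xml
<!-- Hypothetical DIH entity: the delta run fetches only rows changed since
     the last index time, then re-fetches each changed row by primary key. -->
<entity name="doc" pk="id"
        query="SELECT id, title FROM docs"
        deltaQuery="SELECT id FROM docs
                    WHERE last_modified > '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, title FROM docs WHERE id='${dih.delta.id}'"/>
```

DIH records last_index_time in dataimport.properties after each run, so the delta query only sees rows modified since then.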
Hey guys,
I'm wondering how people are managing regression testing, in particular with
things like text based search.
I.e., if you change how fields are indexed or change boosts in dismax,
ensuring that doesn't cause critical queries to return bad results.
The obvious answer to me was using un
I third that request.
Would greatly appreciate taking a look at that diagram!
Regards,
Jonathan
On Wed, Apr 6, 2011 at 9:12 AM, Isan Fulia wrote:
> Hi Ephraim/Jen,
>
> Can u share that diagram with all.It may really help all of us.
> Thanks,
> Isan Fulia.
>
> On 6 April 2011 10:15, Tirthankar
Hello all, I am having an issue with Solr and the SynonymFilterFactory. I am
using a library to interface with Solr called "sunspot." I realize that is not
what this list is for, but I believe this may be an issue with Solr, not the
library (plus the lib author doesn't know the answer). I am
Hi Ephraim/Jen,
> Can you share that diagram with everyone? It may really help all of us.
Thanks,
Isan Fulia.
On 6 April 2011 10:15, Tirthankar Chatterjee wrote:
> Hi Jen,
> Can you please forward the diagram attachment too that Ephraim sent. :-)
> Thanks,
> Tirthankar
>
> -Original Message-
> Fro
Hmmm, after being stuck on this for hours, I found the answer myself
15 minutes after asking for help... as usual. :)
For anyone interested, and no doubt this will not be a revelation for some,
I need the servlet API in my app for it to work, despite being command line.
So adding this to the maven P
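The truncated line above presumably refers to a POM addition; a sketch of the kind of dependency that pulls in the servlet API (coordinates are for the pre-3.0 servlet API; the version and scope may need adjusting for your setup):

```xml
<!-- Servlet API, needed by embedded Solr even in a command-line app -->
<dependency>
  <groupId>javax.servlet</groupId>
  <artifactId>servlet-api</artifactId>
  <version>2.5</version>
</dependency>
```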
Hi All,
I'm hoping this is a reasonably trivial issue, but it's frustrating me to no
end. I'm putting together a tiny command line app to write data into an
index. It has no web based Solr running against it; the index will be moved
at a later time to have a proper server instance start for respon
Hi Jen,
Can you please forward the diagram attachment too that Ephraim sent. :-)
Thanks,
Tirthankar
-Original Message-
From: Jens Mueller [mailto:supidupi...@googlemail.com]
Sent: Tuesday, April 05, 2011 10:30 PM
To: solr-user@lucene.apache.org
Subject: Re: FW: Very very large scale Solr
Thank you for pointing out #2. The commitsToKeep is interesting, but I
thought each commit would create a segment (before optimizing) and be
self-contained in the index.* directory?
I would only run this on the slave.
Bill
On Tue, Apr 5, 2011 at 2:54 PM, Markus Jelsma
wrote:
> Hi,
>
> This seem
Hello Ephraim,
thank you so much for the great Document/Scaling-Concept!!
First, I think you really should publish this on the Solr wiki. This approach
is not documented anywhere there and is not obvious for newbies; your
document is great and explains it very well!
Please allow me to further
Hi Stefan,
Thanks for the information.
I used "Checkout Projects from SVN" inside Eclipse, which does not include the
root build.xml file.
What does this "eclipse" build actually do?
Thanks & Regards
Eric
On Tue, Apr 5, 2011 at 11:34 PM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:
> Eri
On Wed, Apr 06, 2011 at 12:05:57AM +0200, Jan Høydahl said:
> Just curious, was there any resolution to this?
Not really.
We tuned the GC pretty aggressively - we use these options
-server
-Xmx20G -Xms20G -Xss10M
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+CMSIncrementalMode
-XX:+CMSIncrem
Eric,
have a look at Line #67 in build.xml :)
Regards
Stefan
Am 06.04.2011 00:28, schrieb Eric Grobler:
Hi Robert,
Thanks for the fast response!
I used
https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/
but did not find 'ant eclipse'.
However setting my projects Resouce
Hi Robert,
Thanks for the fast response!
I used
https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/
but did not find 'ant eclipse'.
However, setting my project's Resource encoding to UTF-8 worked.
Thanks for your help and have a nice day :-)
Regards
Ericz
On Tue, Apr 5, 2011 at
in eclipse you need to set your project's character encoding to UTF-8.
if you are checking out the source code from svn, you can run 'ant eclipse'
from the top level, and then hit refresh on your project. it will set your
encoding and your classpath up.
On Tue, Apr 5, 2011 at 6:10 PM, Eric Groble
Hi Everyone,
Some language-specific classes like GermanLightStemmer have invalid-character
compiler errors for code like:
switch(s[i]) {
case 'ä':
case 'à':
case 'á':
in Eclipse with JDK 1.6
How do I get rid of these errors?
Thanks & Regards
Ericz
Hi,
Just curious, was there any resolution to this?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
On 8. feb. 2011, at 03.40, Markus Jelsma wrote:
> Do you have GC logging enabled? Tail -f the log file and you'll see what CMS
> is
> telling you. Tuning the occupati
Yes, you may be right, sorry for the confusion.
Our ultimate goal is to collect user-entered data with the least possible
interaction from them (users are lazy, you know). So basically users just
point out where they found that particular item, and the app's job is to index
it and later show it in search r
https://issues.apache.org/jira/browse/SOLR-1797
> I'm using solr 1.4.1 and just noticed a bunch of these errors in the
> solr.log file:
>
> SEVERE: java.util.concurrent.ExecutionException:
> java.lang.NoSuchMethodError:
> org.apache.solr.common.util.ConcurrentLRUCache$Stats.add(Lorg/apache/solr/c
I'm using solr 1.4.1 and just noticed a bunch of these errors in the
solr.log file:
SEVERE: java.util.concurrent.ExecutionException:
java.lang.NoSuchMethodError:
org.apache.solr.common.util.ConcurrentLRUCache$Stats.add(Lorg/apache/solr/common/util/ConcurrentLRUCache$Stats;)V
They appear to happen
Hello fellow enthusiastic solr users,
I tried to find the answer to this simple question online, but failed.
I was wondering: what happens to uncommitted docsPending if I
stop Solr and then restart it? Are they lost? Are they still there
but still uncommitted? Do they get commit
Thank you so much. I will give this a try. Thanks again everybody for
your help
Raj
-Original Message-
From: lboutros [mailto:boutr...@gmail.com]
Sent: Tuesday, April 05, 2011 2:28 PM
To: solr-user@lucene.apache.org
Subject: RE: question on solr.ASCIIFoldingFilterFactory
this analyzer
I don't completely understand. I think maybe you replaced your
domain-specific actualities with another example in an attempt to be
more general or not reveal your business, but just made your explanation
even more confusing!
But. At the point you are indexing, is it possible to know that "sh
Hi there,
I'm trying to use Solr in one of my projects and I've got a small problem
that I can't figure out.
Basically our application is collecting data submitted by users. Now the
problem is that submitted data may contain some incorrect info, like some
keywords that will mess up search results
Hi,
thank you for making the new apache-solr-3.1 available.
I have installed the version from
http://apache.tradebit.com/pub//lucene/solr/3.1.0/
and am running into very slow stats component queries (~ 1 minute)
for fetching the computed sum of the stats field
url: ?q=*:*&start=0&rows=0&stats=
Short answer: the existence is entirely historic. I added bq because i
needed it, and then i added bf because the _val_:"..." syntax was annoying.
: can't think of a useful case when I want to both *add* a component to
: the ultimate score, and for that component to be a non-function query
: (
: It wasn't just a single file, it was dozens of files all having problems
: toward the end just before I killed the process.
...
: That is by no means all the errors, that is just a sample of a few.
: You can see they all threw HTTP 500 errors. What is strange is, nearly
: every file
Hi,
This seems alright as it leaves the current index in place, doesn't mess with
the spellchecker, and leaves the properties alone. But there are two problems:
1. it doesn't take into account the commitsToKeep value set in the deletion
policy, and;
2. it will remove any directory to which a cur
There is a bug that leaves old index.* directories in the Solr data directory.
Here is a script that will clean it up. I wanted to make sure this is
okay, without doing a core reload.
Thanks.
#!/bin/bash
DIR="/mnt/servers/solr/data"
LIST=`ls $DIR`
INDEX=`cat $DIR/index.properties | grep index\=
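The script is cut off above; here is a sketch of how the cleanup might be completed (my completion, assuming index.properties names the live directory in an index= line, as the grep suggests; it shares the limitations Markus notes elsewhere in this thread, e.g. it ignores commitsToKeep):

```shell
#!/bin/bash
# Sketch: remove stale index.* directories under $1, keeping the one
# named by the index= line of index.properties.
cleanup_index_dirs() {
  local dir="$1"
  local current
  current=$(grep '^index=' "$dir/index.properties" | cut -d= -f2 | tr -d '\r')
  if [ -z "$current" ]; then
    echo "no index= entry found, refusing to delete anything" >&2
    return 1
  fi
  for d in "$dir"/index.*; do
    # skip index.properties itself (not a directory) and the live index dir
    if [ -d "$d" ] && [ "$(basename "$d")" != "$current" ]; then
      rm -rf "$d"
    fi
  done
}
```

As with the original, running this against a live core without a reload is at your own risk.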
this analyzer seems to work :
I used Spanish stemming, put the ASCIIFoldingFilterFactory before the
stemming filter and added it in the que
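The analyzer XML appears to have been stripped by the list software; a sketch of the chain as described (accent folding before Spanish stemming; the tokenizer choice and field type name are my assumptions):

```xml
<fieldType name="text_es_folded" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- fold accents BEFORE stemming, per the advice in this thread -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
  </analyzer>
</fieldType>
```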
Hi Brandon,
Sorry, I can't make out much here. The exception shows a Tika error, which
signifies a parsing issue with the PDF. That's all I can make out.
Maybe someone else on this mailing list can help.
Sorry.
- Anuj
On Tue, Apr 5, 2011 at 6:35 PM, Brandon Waterloo <
brandon.water...@matrix.msu.edu
Your analyzer contains these two filters :
before :
So two things :
The words you are testing are not English words (no?), so the stemming will
behave strangely.
If you really want to remove accents, try to put the
ASCIIFoldingFilterFactory before the two others.
Ludovic.
-
Jo
It's not the ASCII folding filter but the stemmer that's removing some trailing
characters. Something you can easily spot on the analysis page.
> Here is the field type definition for ‘text’ field which is what I am using
> for the indexed fields. Can you guys notice any obvious filter that coul
Oh I see. I unfortunately didn't see your earlier email. Thank you!
On Tue, Apr 5, 2011 at 6:41 PM, Chris Hostetter wrote:
>
> : As I had the same problem I went to the wiki looking for the page to
> solve
> : my problem again, and there under recent changes I found that you had
> : trashed it.
>
Here is the field type definition for ‘text’ field which is what I am using for
the indexed fields. Can you guys notice any obvious filter that could be the
issue?
---
Looks like you are using openjdk. Can you try using Sun jdk?
On Mon, Apr 4, 2011 at 6:53 AM, Upayavira wrote:
> This is not Solr crashing, per se, it is your JVM. I personally haven't
> generally had much success debugging these kinds of failure - see
> whether it happens again, and if it does,
Any word delimiter filter will get rid of that symbol. Use a char pattern
replace filter, that should work.
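A sketch of the char-filter approach suggested here (the field type name and the rest of the chain are mine, not from the thread):

```xml
<fieldType name="text_no_tm" class="solr.TextField">
  <analyzer>
    <!-- strip the ™ symbol before tokenizing -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="™" replacement=""/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```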
> Use admin/analysis.jsp to see which filter is removing it.
> Configure a field type appropriate to what you want to index.
>
> On Mon, Apr 4, 2011 at 9:55 AM, mechravi25 wrote:
> > Hi,
Use admin/analysis.jsp to see which filter is removing it.
Configure a field type appropriate to what you want to index.
On Mon, Apr 4, 2011 at 9:55 AM, mechravi25 wrote:
> Hi,
> Has anyone indexed the data with Trade Mark symbol??...when i tried to
> index, the data appears as below.
>
> Data:
Hello everyone!
I need your help. I have tried to add a qf that applies a boost to a field
in my queries via solrconfig.xml. I have tested the solution on a Solr server
running in standalone mode and it runs perfectly, but when I try to do it on
an embedded server, the query doesn't return me nothi
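A sketch of how such a qf default is typically declared in solrconfig.xml (the handler name, fields, and boosts are hypothetical; whether the embedded server sees it depends on which config the CoreContainer was loaded with):

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- hypothetical fields: boost title over body -->
    <str name="qf">title^2.0 body</str>
  </lst>
</requestHandler>
```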
Hi,
Has anyone indexed data containing the trademark symbol? When I tried to
index it, the data appears as below.
Data:
79797 - Siebel Research AI Fund,
79797 - Siebel Research AI Fund,l
Original Data:
79797 - Siebel Research™ AI Fund,
Please help me to resolve this
Regards,
Ravi
-
: As I had the same problem I went to the wiki looking for the page to solve
: my problem again, and there under recent changes I found that you had
: trashed it.
I'm confused -- the page did not have any troubleshooting suggestions or
advice, it was just the details of a specific -- it seemed t
I added this test method locally to TestASCIIFoldingFilter.java in the
Lucene/Solr 3.1.0 source tree, and it passed, so the filter is not the problem
(and the Solr factory certainly isn't either - it's just a wrapper) - I second
Ludovic's question - you must have other filters configured:
pub
Is there any stemming configured for this field in your schema
configuration file?
Ludovic.
2011/4/5 Nemani, Raj [via Lucene] <
ml-node+2780463-48954297-383...@n3.nabble.com>
> All,
>
> I am using solr.ASCIIFoldingFilterFactory to perform accent insensitive
> search. One of the words that g
I can't remember where I read it, but I think MappingCharFilterFactory is
preferred.
There is an example in the example schema.
From this, I get:
org.apache.solr.analysis.MappingCharFilterFactory
{mapping=mapping-ISOLatin1Accent.txt}
|text|despues|
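Wiring that into a field type might look like this (the mapping file name matches the example output above; the tokenizer and type name are assumptions):

```xml
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <!-- maps accented characters to ASCII via the bundled mapping file -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```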
On Tue, Apr 5, 2011 at 5:06 PM, Nemani, Raj
All,
I am using solr.ASCIIFoldingFilterFactory to perform accent insensitive search.
One of the words that got indexed as part my indexing process is "después".
Having used the ASCIIFoldingFilterFactory,I expected that If I searched for
word "despues" I should have the document containing the
If you check the code for TextProfileSignature [1] you'll notice the init
method reading params. You can set those params as you did. Reading Javadoc
[2] might help as well. But what's not documented in the Javadoc is how QUANT
is computed; it rounds.
[1]:
http://svn.apache.org/viewvc/lucene/
Thank you, I'll try to create a C# method to create the same signature as
Solr, and then compare both signatures before indexing the doc. This way I can
avoid indexing docs that already exist.
If anyone needs to use this parameter (as this info is not on the wiki), you
can add the option
5
On the processor t
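The option itself appears to have been stripped by the list (only the value 5 survives); a sketch of the update-processor configuration it was presumably attached to (the parameter name minTokenLen is my guess at what carried the 5; TextProfileSignature also reads a quantRate parameter):

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
    <!-- hypothetical: the stripped parameter above may have been minTokenLen -->
    <str name="minTokenLen">5</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```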
Hi,
you could try the SIREn plugin [1] which supports multi-valued fields.
[1] http://siren.sindice.com
--
Renaud Delbru
On 29/03/11 21:57, Brian Lamb wrote:
Hi all,
I have a field set up like this:
And I have some records:
RECORD1
man's best friend
pooch
RECORD2
man's worst
It wasn't just a single file, it was dozens of files all having problems toward
the end just before I killed the process.
IPADDR - - [04/04/2011:17:17:03 +] "POST
/solr/update/extract?literal.id=32-130-AFB-84&commit=false HTTP/1.1" 500 4558
IPADDR - - [04/04/2011:17:17:05 +] "POST
/
Hello every body,
I am using Solr for indexing and searching.
I am using 2 classes for searching documents: in the first one I'm
instantiating a SolrServer to search documents as follows:
server = new EmbeddedSolrServer(coreContainer, "");
server.add(doc);
query.setQuery("id:"+idDoc);
server.que
Could you try creating fields dynamically: common_names_1,
common_names_2, etc.
Keep track of the max number of fields and generate queries listing all
the fields?
Gross, but it handles all the cases mentioned in the thread (wildcards,
phrases, etc).
-Mike
On 3/29/2011 4:57 PM, Brian Lamb
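The dynamic-field workaround described above might be declared like this (the field names follow the suggestion; the type and attributes are assumptions):

```xml
<dynamicField name="common_names_*" type="text" indexed="true" stored="true"/>
```

Queries would then enumerate the fields up to the tracked maximum, e.g. q=common_names_1:pooch OR common_names_2:pooch.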
And if you have control over machine placement, split them across racks so that
a power outage on one rack does not take out your search cluster.
François
On Apr 5, 2011, at 3:19 AM, Ephraim Ofir wrote:
> I'm not sure about the scale you're aiming for, but you probably want to
> do both shardin
Hello list,
I did not find a wiki page about normalization.
All I found was:
http://search.lucidimagination.com/search/document/9d06882d97db5c59/a_question_about_solr_score
where Hoss suggests to normalize depending on the maxScore.
I am not comfortable with that since, at least, I want that a
On Tuesday 05 April 2011 12:19:33 Frederico Azeiteiro wrote:
> Sorry, the reply I made yesterday was directed to Markus and not the
> list...
>
> Here's my thoughts on this. At this point I'm a little confused if SOLR
> is a good option to find near duplicate docs.
>
> >> Yes there is, try set
Thanks Stefan and Victor! We are using GWT for the front end. We stopped
issuing multiple asynchronous queries; instead we issue a request, fetch the
results, filter them based on what has been typed subsequent to the request,
and then re-trigger the request only if we don't get the expected resu
Sorry, the reply I made yesterday was directed to Markus and not the
list...
Here are my thoughts on this. At this point I'm a little confused about
whether Solr is a good option for finding near-duplicate docs.
>> Yes there is, try set overwriteDupes to true and documents yielding
the same signature will be over
Hi,
I'm wondering how to find out which version of Solr is currently running
using the Solrj library?
Thanks,
Marc.
Hi Stefan,
Thanks for clear explanation.
I've used XPathEntityProcessor as an example because I didn't find a JSON
entity processor.
I'll write a script to generate an XML file for the data import.
Regards,
Andrew
--
View this message in context:
http://lucene.472066.n3.nabble.com/Mongo-REST-interface
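If the generated file targets Solr's standard update XML format rather than a DIH source (one possible approach, not stated in the thread), it would look like:

```xml
<add>
  <doc>
    <field name="id">1</field>
    <field name="title">example title</field>
  </doc>
</add>
```

Such a file can be posted directly to /update with curl or the example post.jar, bypassing DIH entirely.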
andrew,
you're really wondering why the XPathEntityProcessor does not work
well with a JSON structure!? The links Erick posted state
that you can push JSON-structured data to a Solr HTTP interface,
but not that the DataImportHandler will work with it. IIRC there
is no way for proc
Hello,
As I had the same problem I went to the wiki looking for the page to solve
my problem again, and there under recent changes I found that you had
trashed it.
I can still solve my problem but why don't you keep it for others to benefit
from too? As linked it's a recurring problem for several
I'm not sure about the scale you're aiming for, but you probably want to
do both sharding and replication. There's no central server which would
be the bottleneck. The guidelines should probably be something like:
1. Split your index to enough shards so it can keep up with the update
rate.
2. Have