How soon do you need to know? Couldn't you just regenerate the index using some
kind of 'nice' factor to not use too much processor/disk/etc?
Dennis Gearon
Signature Warning
EARTH has a Right To Life,
otherwise we all die.
Read 'Hot, Flat, and Crowded'
Laugh at http://www.yer
Hello!
I'm using solrj 1.4.0 with java 1.6; on two occasions when indexing
~18000 documents we got the following problem:
(trace from jconsole)
Name: pool-1-thread-1
State: WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObj...@11e464a
Total blocked: 25 Tota
Thanks for your comments, Jonathon. Here is some information that
gives a brief overview of the eGranary Platform in order to quickly
outline the need for a solution for bringing multiple indexes into one
searchable collection.
http://www.widernet.org/egranary/info/multipleIndexes
Thanks,
B
Hi there
I have a problem, the situation is when I issue a query to single instance,
Solr response XML like following
as you can see, the score is normal()
===
0
23
_l_title,score
0
_l_unique_key:12
*
true
999
1.9808292
GTest
12
===
But wh
Hi guys,
I have posted a thread "The search response time is too long".
The SOLR searcher instance is deployed with Tomcat 5.5.21.
The index file is 8.2G. The doc num is 6110745. DELL Server has Intel(R)
Xeon(TM) CPU (4 cores) 3.00GHZ and 6G RAM.
In SOLR back-end, "query=key:*" costs alm
: then in method createParser() add the following:
:
: req.getCore().getInfoRegistry().put(getName(), this);
that doesn't seem like a good idea -- createParser will be called every
time a string needs to be parsed, so you're overwriting the same entry in the
infoRegistry over and over and over ag
Have you looked at Solr's TermsComponent? Assuming you have a unique key,
I think you could use TermsComponent to walk that field for comparing
against
your database rather than getting all the documents.
HTH
Erick
On Tue, Sep 28, 2010 at 5:11 PM, dshvadskiy wrote:
>
> That will certainly work fo
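A sketch of the wiring Erick's TermsComponent suggestion would involve, following the Solr wiki example (the handler name and field are illustrative; point terms.fl at your own unique-key field):

```xml
<!-- solrconfig.xml: expose TermsComponent on its own handler -->
<searchComponent name="terms" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="terms">true</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>
```

A request such as `/terms?terms.fl=id&terms.limit=-1` then streams every term in the `id` field, which can be diffed against the database keys without fetching whole documents.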
Thx. I will let you know the latest status.
>From: Lance Norskog
>Reply-To: solr-user@lucene.apache.org
>To: solr-user@lucene.apache.org, newsam
>Subject: Re: Re: The search response time is too long
>Date: Tue, 28 Sep 2010 13:34:53 -0700
>
>Copy the index. Delete half of the documents. Optimize.
This may seem like a stupid question, but why on the info / stats pages do we
see two instances of SolrIndexSearcher?
The reason I ask is that we've implemented SOLR-465 to try and serve our
index from a RAMDirectory, but it appears that our index is being loaded
into memory twice, as our JVM hea
Correction: Java heap size should be RAM buffer size, if I'm not mistaken.
-Original message-
From: Markus Jelsma
Sent: Wed 29-09-2010 01:17
To: solr-user@lucene.apache.org;
Subject: RE: Re: Solr Deduplication and Field Collpasing
If you can set the digest field for your `non-nutc
If you can set the digest field for your `non-nutch` documents easily, that
would indeed be a quicker approach. No need to create a custom update
processor or anything like that. But to do so, you would have to reindex the
whole bunch again. There is no way to update a document without comp
I have the digest field already in the schema because the index is shared
between nutch docs and others. I do not know if the second approach is the
quickest in my case.
I can set the digest value to something unique for non-Nutch documents easily (I
have an id field that I can use to populate
Honestly, I think just putting everything in the same index is your best bet.
Are you sure your "particular needs of your project" can't be served by one
combined index? You can certainly still query on just a portion of the index
when needed using fq -- you can even create a request handler (
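One hedged sketch of the single-index-plus-fq idea (the `source` field and the handler name are made up for illustration):

```xml
<!-- solrconfig.xml: a handler that only ever searches one sub-collection -->
<requestHandler name="/projectA" class="solr.SearchHandler">
  <lst name="invariants">
    <!-- restrict every request through this handler to one slice of the index -->
    <str name="fq">source:projectA</str>
  </lst>
</requestHandler>
```

Because the filter sits in `invariants`, clients hitting `/projectA` cannot override it with their own request parameters.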
You could create a custom update processor that adds a digest field for newly
added documents that do not have the digest field themselves. This way, the
documents that are not added by Nutch get a proper non-empty digest field so
the deduplication processor won't create the same empty hash and
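A sketch of such a chain, following the Solr Deduplication wiki (the field list is illustrative, and the chain still has to be referenced from your update handler):

```xml
<!-- solrconfig.xml: compute a digest for documents that arrive without one -->
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">digest</str>
    <bool name="overwriteDupes">false</bool>
    <!-- fields hashed into the signature; adjust to your schema -->
    <str name="fields">id,content</str>
    <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```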
In our application, we need to be able to search across multiple local
indexes. We need this not so much for performance reasons, but because
of the particular needs of our project. But the indexes, while sharing
the same schema, can be very different in terms of size and distribution
of docum
Ok, I created the issues:
IF function: SOLR-2136
AND, OR, NOT: SOLR-2137
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
On 28. sep. 2010, at 19.36, Yonik Seeley wrote:
> On Tue, Sep 28, 2010 at 11:33 AM, Jan Høydahl / Cominvent
> wrote:
>> Have anyone written any co
Excellent, exactly what I needed.
Thanks,
James
On Sep 28, 2010, at 4:28 PM, Luke Crouch wrote:
> Yeah. You can specify two analyzers in the same fieldType:
>
>
>
> ...
>
>
> ...
>
>
>
> -L
>
> On Tue, Sep 28, 2010 at 2:31 PM, James Norton wrote:
>
>> Hello,
>>
>> I am migrating fro
All,
I have setup Nutch to submit the crawl results to Solr index. I have
some duplicates in the documents generated by the Nutch crawl. There is a
field 'digest' that Nutch generates that is the same for those documents
that are duplicates. While setting up the dedupe processor in the
Solr co
I notice we don't have default=true; instead we manually specify
qt=dismax in our queries. HTH.
-L
On Tue, Sep 28, 2010 at 4:24 PM, Luke Crouch wrote:
> What you have is exactly what I have on 1.4.0:
>
>
>
>
> dismax
>
> And it has worked fine. We copied our solrconfig.xml from
What you have is exactly what I have on 1.4.0:
dismax
And it has worked fine. We copied our solrconfig.xml from the examples and
changed them for our purposes. You might compare your solrconfig.xml to some
of the examples.
-L
On Tue, Sep 28, 2010 at 4:19 PM, Thumuluri, Sai <
sai.th
Can I please get some help here? I am in a tight timeline to get this
done - any ideas/suggestions would be greatly appreciated.
-Original Message-
From: Thumuluri, Sai [mailto:sai.thumul...@verizonwireless.com]
Sent: Tuesday, September 28, 2010 12:15 PM
To: solr-user@lucene.apache.org
S
No, it is not the same for EmbeddedSolrServer; we learned it the hard way. I
guess you would have also learned it by now.
At the SolrJ wiki page: http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer
"CommonsHttpSolrServer is thread-safe and if you are using the following
constructor,
you *MUST* re-use t
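The reuse requirement quoted from the wiki can be illustrated with a self-contained sketch; `FakeSolrServer` stands in for the real `CommonsHttpSolrServer` here so the example runs without SolrJ on the classpath:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedClientSketch {
    // Stand-in for CommonsHttpSolrServer so the sketch is self-contained.
    static class FakeSolrServer {}

    // Create the client ONCE and share it; per the SolrJ wiki, the real
    // CommonsHttpSolrServer is thread-safe and must be reused like this,
    // not constructed per request.
    static final FakeSolrServer CLIENT = new FakeSolrServer();

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        ConcurrentMap<FakeSolrServer, Boolean> seen = new ConcurrentHashMap<>();
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> seen.put(CLIENT, Boolean.TRUE));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        // Every worker used the same instance, so the map holds one key.
        System.out.println("distinct clients: " + seen.size());
    }
}
```

The same pattern does not transfer to EmbeddedSolrServer, which is the point of the thread above.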
That will certainly work for the most recent updates, but I need to compare the
entire index.
Dmitriy
Luke Crouch wrote:
>
> Is there a 1:1 ratio of db records to solr documents? If so, couldn't you
> simply select the most recent updated record from the db and check to make
> sure the corresponding sol
We learned it the hard way; wish I had read this
before:
http://wiki.apache.org/solr/EmbeddedSolr
It is not thread-safe. We started seeing ConcurrentModificationException
within 100 samples when loading it with
more than 1 concurrent user ( I ha
Is there a 1:1 ratio of db records to solr documents? If so, couldn't you
simply select the most recent updated record from the db and check to make
sure the corresponding solr doc has the same timestamp?
-L
On Tue, Sep 28, 2010 at 3:48 PM, Dmitriy Shvadskiy wrote:
> Hello,
> What would be the b
Hello,
What would be the best way to check Solr index against original system
(Database) to make sure index is up to date? I can use Solr fields like Id
and timestamp to check against appropriate fields in database. Our index
currently contains over 2 million documents across several cores. Pulling all
There is already a simple Velocity app. Just hit
http://localhost:8983/solr/browse.
You can configure some handy parameters to make walkable facets in
solrconfig.xml.
On Tue, Sep 28, 2010 at 5:23 AM, Antonio Calo' wrote:
> Hi
>
> You could try to use the Velocity framework to build GUIs in a qu
Copy the index. Delete half of the documents. Optimize.
Copy the index. Delete the other half of the documents. Optimize.
2010/9/28 newsam :
> I guess you are correct. We used the default SOLR cache configuration. I will
> change the cache configuration.
>
> BTW, I want to deploy several shards f
Yeah. You can specify two analyzers in the same fieldType:
...
...
-L
On Tue, Sep 28, 2010 at 2:31 PM, James Norton wrote:
> Hello,
>
> I am migrating from a pure Lucene application to using solr. For legacy
> reasons I must support a somewhat obscure query feature: lowercase words in
>
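The XML stripped out of Luke's reply above presumably looked something like the following sketch of a fieldType with separate index- and query-time analyzers (the type name and tokenizer choice are illustrative):

```xml
<fieldType name="text_custom" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- index-time filters go here -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- query-time filters go here; omitting a lowercase filter on one
         side is how asymmetric case handling is usually arranged -->
  </analyzer>
</fieldType>
```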
Hello,
I am migrating from a pure Lucene application to using solr. For legacy
reasons I must support a somewhat obscure query feature: lowercase words in the
query should match lowercase or uppercase in the index, while uppercase words
in the query should only match uppercase words in the ind
Hi,
I'm getting a rather strange exception after a long web server idle period (Tomcat
7.0.2). If I immediately re-run the same request, no errors occur. What
might the problem be? All server settings are defaults.
Exception:
...
at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
at
On Tue, Sep 28, 2010 at 11:33 AM, Jan Høydahl / Cominvent
wrote:
> Have anyone written any conditional functions yet for use in Function Queries?
Nope - but it makes sense and has been on my list of things to do for
a long time.
-Y
http://lucenerevolution.org Lucene/Solr Conference, Boston Oct
I removed default=true from standard request handler
-Original Message-
From: Luke Crouch [mailto:lcro...@geek.net]
Sent: Tuesday, September 28, 2010 12:50 PM
To: solr-user@lucene.apache.org
Subject: Re: Dismax Request handler and Solrconfig.xml
Are you removing the standard default requ
Are you removing the standard default requestHandler when you do this? Or
are you specifying two requestHandler's with default="true" ?
-L
On Tue, Sep 28, 2010 at 11:14 AM, Thumuluri, Sai <
sai.thumul...@verizonwireless.com> wrote:
> Hi,
>
> I am using Solr 1.4.1 with Nutch to index some of our
Hi,
I am using Solr 1.4.1 with Nutch to index some of our intranet content.
In Solrconfig.xml, default request handler is set to "standard". I am
planning to change that to use dismax as the request handler but when I
set "default=true" for dismax - Solr does not return any results - I get
results
Hi,
Have anyone written any conditional functions yet for use in Function Queries?
I see the use for a function which can run different sub functions depending on
the value of a field.
Say you have three documents:
A: title=Sports car, color=red
B: title=Boring car, color=green
B: title=Big car
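A hypothetical sketch of what such a conditional function could look like for Jan's example, where document B has no color field (the function names mirror what the SOLR-2136 proposal asks for; none of this existed in Solr 1.4):

```text
# boost documents that have a color value, fall back to a constant otherwise
bf=if(exists(color), 2.0, 1.0)
```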
> 1) KeywordTokenizerFactory seems to be a "tokenizer
> factory" while CJKTokenizer seems to be just a tokenizer.
> Are they the same type of things at all?
> Could I just replace
>
> with
> class="org.apache.lucene.analysis.cjk.CJKTokenizer"/>
> ??
You should use org.apache.solr.analysis.CJK
You might want to look at SOLR-2010. This patch works with the "collation"
feature, having it test the collations it returns to ensure they'll return
hits. So if a user types "san jos" it will know that the combination "san
jose" is in the index and "san ojos" is not.
James Dyer
E-Commerce Sy
Yes, in the latest released version (1.4.1), there is a shards= parameter but
the client needs to fill it, i.e. the client needs to know what servers are
indexers, searchers, shard masters and shard replicas...
The SolrCloud stuff is still not committed and only available as a patch right
now.
Maybe SOLR-80 jira issue ?
As written in the Solr 1.4 book: "a pure negative query doesn't work correctly."
You have to add 'AND *:*'.
thx
From: Patrick Sauts [mailto:patrick.via...@gmail.com]
Sent: mardi 28 septembre 2010 11:53
To: 'solr-user@lucene.apache.org'
Subject: Limitations o
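A worked example of the `AND *:*` workaround above (`inStock` is a field from the Solr example schema, used here only for illustration):

```text
q=-inStock:false           # pure negative query: returns nothing in Solr 1.4
q=*:* AND -inStock:false   # anchored on all documents: matches everything except inStock:false
```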
Please explain what you want to *do*, your message is so terse it makes it
really hard to figure out what you're asking. A couple of example queries
would help a lot.
Best
Erick
On Tue, Sep 28, 2010 at 5:53 AM, Patrick Sauts wrote:
> I can't find the answer, but is this problem solved in Solr 1.4.1?
Hi
You could try to use the Velocity framework to build GUIs in a quick
and efficient manner.
Solr comes with a Velocity handler already integrated that could be the
best solution in your case:
http://wiki.apache.org/solr/VelocityResponseWriter
Also take these hints on the same topic:
htt
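A minimal sketch of the wiring involved, assuming the Velocity contrib jars are on the classpath (the class's package has moved between Solr versions, so check the VelocityResponseWriter wiki page for yours):

```xml
<!-- solrconfig.xml: register the Velocity response writer -->
<queryResponseWriter name="velocity"
                     class="org.apache.solr.request.VelocityResponseWriter"/>
```

Requests then select it with `wt=velocity` plus a `v.template` parameter naming the template to render.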
Yes, there is a MultiSearcher in Lucene, but its idf across 2 indexes is
not global. Maybe I can modify it and also the index, like:
term1 df=5 doc1 doc3 doc5
term1 df=5 doc2 doc4
2010/9/28 Li Li :
> hi all
> I want to speed up search time for my application. In a query, the
> time is la
This is an excellent idea!
And, desperately needed.
It's high time Lucene can take advantage of concurrency when running a
single query. Machines have tons of cores these days! (My dev box
has 24!).
Note that one simple way to do this is use ParallelMultiSearcher: it
uses one thread per segmen
hi all
I want to speed up search time for my application. In a query, the
time is largely spent reading post lists (IO with .frq files) and
calculating scores and collecting results (CPU, with a priority queue). The IO
is hard to optimize, or is already partly optimized by NIO. So I want to use
multithreads to utili
I can't find the answer, but is this problem solved in Solr 1.4.1?
Thx for your answers.
Interesting. So what you are saying, though, is that at the moment it
is NOT there?
On Mon, Sep 27, 2010 at 9:06 PM, Jan Høydahl / Cominvent
wrote:
> Solr will match this in version 3.1 which is the next major release.
> Read this page: http://wiki.apache.org/solr/SolrCloud for feature descriptio
Could someone help me to understand the differences between TokenizerFactory,
Tokenizer, & Analyzer?
Specifically, I'm interested in implementing auto-complete for tags that could
contain both English & Chinese. I read this article
(http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-f
I guess you are correct. We used the default SOLR cache configuration. I will
change the cache configuration.
BTW, I want to deploy several shards from the existing 8G index file, such as
4G per shard. Is there any tool to generate two shards from one 8G index file?
>From: kenf_nc
>Reply-To: