Terms.prefix was just to compare performance.
The use case was terms.regex=.*query.* and, as Markus pointed out, this will
probably remain a bottleneck.
I looked at the Suggester. But like many others I have been struggling to
make it useful. It needs a custom queryConverter to give proper suggestions.
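(For anyone finding this in the archives: registering a custom converter is a
one-line entry in solrconfig.xml. A minimal sketch, class name made up:

<queryConverter name="queryConverter" class="com.example.MyQueryConverter"/>

The component looks the converter up by the name "queryConverter", and the jar
containing the class has to be visible to Solr's plugin classloader.)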
: Initially, I was looking at http://wiki.apache.org/solr/MoreLikeThis
:
: Then, it turned out that most implementations are based on a combination of
: Mahout, Solr and Hadoop.
I think you'll find that most "serious" (for some definition) content
recommendation engines use various ML algorithms
: The index has 41 000 000 documents and is 9 GB in size. For a query like:
: 1)
:
*q=Jarecki+Jan*&fq=sex:M&fq=confirmed:1&fq=show_search:3&fl=user_id&start=0&rows=10&wt=json&version=2.2
:
: the server averages *90 queries/s* on 4 threads, which is too low for me.
:
: For a query with a filter on the field city:
:
: Subject: Terms.regex performance issue
:
: As I want to use it for autocomplete, it has to be fast. Terms.prefix gets
: results in around 100 milliseconds, while terms.regex is 10 to 20 times
: slower.
can you elaborate on how you are using terms.regex? what does your regex
look like? .. pa
FWIW: i think this is a really cool and interesting question.
: Is there a way to specify in a query that a term must match at least X
: times in a document, where X is some value greater than 1?
at the moment, i think your "phrase query" approach is really the only
viable way (although it di
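(To spell the phrase-query trick out, with a hypothetical field "body": repeat
the term inside a sloppy phrase,

q=body:"dog dog dog"~1000000

Each repeat must match a distinct position, so this only matches documents in
which "dog" occurs at least three times.)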
TermsComponent uses java.util.regex, which is not particularly fast. If the
number of terms grows, your CPU is going to overheat. I'd prefer an analyzer
approach.
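(A sketch of what such an analyzer approach could look like for the infix use
case above; the field-type name is made up. Index-time n-grams replace
terms.regex=.*query.* with a plain term lookup, trading index size for speed:

<fieldType name="text_infix" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Autocomplete queries then hit this field directly instead of TermsComponent.)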
> As I want to use it for autocomplete, it has to be fast. Terms.prefix gets
> results in around 100 milliseconds, while terms.regex is
: I'm considering using SolrJ to run queries in an MLT fashion against my Solr
: server. I saw that there is already an open bug filed in Jira
: (https://issues.apache.org/jira/browse/SOLR-1085).
note that that issue is really just about having convenience classes for
executing MLT-style requests
Can you reproduce this error consistently?
Can you try using the CheckIndex tool on your index to verify that it
hasn't been corrupted in some way?
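(For reference, CheckIndex is in the Lucene core jar and is run roughly like
this, with paths adjusted to your install:

java -ea:org.apache.lucene... -cp lucene-core-3.3.0.jar org.apache.lucene.index.CheckIndex /path/to/solr/data/index

It prints a per-segment report; only add the -fix option as a last resort,
since that drops unreadable segments along with their documents.)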
:2011-08-15 10:31:24,968 ERROR [org.apache.solr.core.SolrCore] -
: java.lang.NullPointerException
: at sun.nio.ch.Util.free(Util.java:1
: when the response comes back the facet names are
:
: 2010-08-14T01:50:58.813Z
...
: instead of something like
:
: NOW-11MONTH
...
: whereas for facet queries, if specifying a set of facet queries like
:
: datetime:[NOW-1YEAR TO NOW]
...
: the labels come back just as specified
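(One way to control the labels, assuming Solr 1.4+ local-param support for
facet.query: attach an explicit key, e.g.

facet.query={!key=last11months}datetime:[NOW-11MONTHS TO NOW]

and the response should come back labeled "last11months" rather than echoing
the raw query or a computed date.)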
As I want to use it for autocomplete, it has to be fast. Terms.prefix gets
results in around 100 milliseconds, while terms.regex is 10 to 20 times
slower.
Not storing the field made it a bit faster, but not enough. The index is on a
separate core and only about 5 MB big. Are there some tricks to make this faster?
Hi
I have one instance of solr running on JBoss with the following schema and
partial config:
Schema: [XML field definitions stripped by the archive; the visible field
names are user_id and search_text]
Config: [XML stripped; surviving values: 10, 1024, 1000, 1, 1000, 1]
The index has 41 000 000 documents and is 9 GB in size. For a query like:
1)
*q=Jarecki+Jan*&fq=sex:M
Hi.
Either port it to Solr 3, or use Solr 4 (trunk).
I know and have used a Metacarta solution but that is also based on Solr 4 and
I don't think they've back-ported it. I have no clue what they charge for it or
where to get it; I have it as part of their larger solution.
There's also a sma
Hi all,
I was going through Solr 3.3.0 and it seems there's still no support for
performing geospatial queries on documents that have more than one
latitude-longitude pair. The multiValued flag is set to false everywhere.
We absolutely need this feature. I had a look at
https://issues.apache.org/ji
Both Nutch and Solr can be used, depending on the need.
http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/ . So the search
is implemented and I am able to search on the values. Now I need the
SpellChecker to be implemented. The changes are exactly as per the ones
listed in http://wiki.apache.org/
Please register me
On Fri, Aug 19, 2011 at 9:26 PM, anupamxyz wrote:
> I am using Nutch to crawl and Solr for searching. The search has been
> successfully implemented. Now I want a file-based suggestion or a "Do you
> mean?" feature implemented. It is more or less like a spell checker
Um, not quite. At least as pe
Is there a way to specify in a query that a term must match at least X times in
a document, where X is some value greater than 1?
For example, I want to only get documents that contain the word "dog" three
times. I've thought that using a proximity query with an arbitrarily large
distance value
I am using Nutch to crawl and Solr for searching. The search has been
successfully implemented. Now I want a file-based suggestion or a "Do you
mean?" feature implemented. It is more or less like a spell checker. For the
same I am making the requisite changes to the SolrConfig.xml and the
Schema.xml
Hi Koji, thanks, it's loading right now. I can't say it's really working
though, but I believe those are other issues with FastVectorHighlighter.
2011/8/18 Koji Sekiguchi
> (11/08/19 4:14), Alexei Martchenko wrote:
>
>> Hi Koji thanks for the reply.
>>
>> My <...> is defined directly in <...>. SOLR 3.3 warns
On 19.08.2011 16:43, Yonik Seeley wrote:
On Fri, Aug 19, 2011 at 10:36 AM, alexander sulz wrote:
using lsof I think I pinned down the problem: too many open files!
I already doubled it from 512 to 1024 once, but it seems there are many
SOCKETS involved, which are listed as "can't identify protocol".
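(The usual first step, assuming a Linux box: raise the limit in the shell that
launches the JVM, e.g.

ulimit -n 8192

and watch the live count with lsof -p <solr-pid> | wc -l. A pile of sockets in
"can't identify protocol" state tends to point at connections that are never
closed rather than at index files.)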
Hard to say, so I'll list the exact steps I took:
-Downloaded apache-solr-3.3.0 (I like to stick with releases vs. svn)
-Untar and cd
-ant
-Wrote my class below (under a peer directory in apache-solr-3.3.0)
-javac -cp
../dist/apache-solr-core-3.3.0.jar:../lucene/build/lucene-core-3.3-SNAPSHOT.jar
As far as I checked, creating a custom query converter is the only way to
make this work.
Unfortunately I have some problems with running it: after creating a JAR
with my class (I'm using your source code, obviously besides the package and
class names) and throwing it into the lib dir, I've added name
On Fri, Aug 19, 2011 at 10:36 AM, alexander sulz wrote:
> using lsof I think I pinned down the problem: too many open files!
> I already doubled from 512 to 1024 once but it seems there are many SOCKETS
> involved,
> which are listed as "can't identify protocol", instead of "real files".
> over time
On 19.08.2011 15:48, alexander sulz wrote:
On 10.08.2011 17:11, Yonik Seeley wrote:
On Wed, Aug 10, 2011 at 11:00 AM, alexander
sulz wrote:
Okay, with this command it hangs.
It doesn't look like a hang from this thread dump. It doesn't look
like any solr requests are executing at the time the dump was taken.
From my understanding, seeing the cache as a set of key-value pairs, this
cache has the query as key and the list of IDs resulting from the query as
values. When the exact same query is issued, it will be found as key in this
cache, and Solr will already have the list of IDs that match it.
If you
On 10.08.2011 17:11, Yonik Seeley wrote:
On Wed, Aug 10, 2011 at 11:00 AM, alexander sulz wrote:
Okay, with this command it hangs.
It doesn't look like a hang from this thread dump. It doesn't look
like any solr requests are executing at the time the dump was taken.
Did you do this from th
The wiki says: "size: The maximum number of entries in the cache." and, for
queryResultCache: "This cache stores ordered sets of document IDs: the top N
results of a query ordered by some criteria."
Doesn't it mean the number of document ids rather than the number of queries?
2011/8/19 Tomás Fernández Löbbe
Good luck, and let us know what the results are. About dropping the cache:
that shouldn't be a problem, it should just be computed when your
component is called
the first time, so starting the server (or opening a new searcher)
should re-compute
it. Your filters shouldn't be very big, just maxDoc
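(Rough arithmetic, assuming the filters are kept as bitsets: a bitset costs
maxDoc bits, so even on a 41,000,000-document index that is about
41,000,000 / 8 ≈ 5 MB per cached filter.)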
I was on my phone before and didn't see the whole thread. I wanted the same
thing: to have the spellchecker not tokenize. See the "Suggester Issues" thread
for my junky replacement class that doesn't tokenize (as far as I can tell
from a few minutes of testing).
will
On Aug 19, 2011, at 8:35 A
Whenever 4.0 comes out :) It's hard to put a date on that; I believe another
SolrCloud push is about to start to cover the indexing side.
On Aug 18, 2011, at 11:46 AM, Way Cool wrote:
> Hi, guys,
>
> When are you planning to release the SolrCloud feature with ZooKeeper
> currently in trunk? The ne
Hi Jame, the size for the queryResultCache is the number of queries that
will fit into this cache. autowarmCount is the number of queries that are
going to be copied from the old cache to the new cache when a commit occurs
(actually, the queries are going to be executed again against the new
IndexSearcher).
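For reference, the matching solrconfig.xml entry looks like this (the values
are only examples):

<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>

so size="512" means 512 cached queries, not bytes.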
1> I don't know; where is it coming from? It looks like you've done a stats
call on a freshly opened server.
2> 512 entries (i.e. results for 512 queries). Each entry is a list of
doc IDs.
Best
Erick
On Fri, Aug 19, 2011 at 5:33 AM, jame vaalet wrote:
> 1 .what does this specify ?
>
> size="*${queryResultCa
This might be unrelated, but I had the exact same error yesterday
trying to replace the query converter with a custom class I wrote.
It turned out I wasn't properly registering my jar. I'm still testing
with jetty, and "lib" in example is included "too late" in the startup
process. I had to re
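(If it helps anyone else: the jar can be declared explicitly in solrconfig.xml
instead of relying on the container's lib scanning; the path here is made up:

<lib dir="/path/to/custom/jars" regex=".*\.jar"/>

That puts the class on Solr's plugin classloader before components are
initialized.)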
My analyser is not empty:
[analyzer XML stripped by the archive]
and I'm sure there are words in it.
I don't know where to find this file:
"org.apache.solr.handler.component.SpellCheckComponent.getTokens"
Or is your analyzer null? Any other exceptions or warnings in your log file?
On Fri, Aug 19, 2011 at 7:37 PM, Li Li wrote:
> Line 476 of SpellCheckComponent.getTokens of mine is assert analyzer !=
> null;
> it seems our codes' versions don't match. could you decompile your
> SpellCheckComponent
hi,
I understand that the queryResultCache tag in solrconfig is the one which
determines Solr's cache size in the JVM.
Out of the different attributes, what is size? Is it the amount of memory
reserved in bytes? Or the number of doc ids cached? Or is it the number of
queries it will cache?
similarly
Line 476 of SpellCheckComponent.getTokens of mine is assert analyzer != null;
it seems our code versions don't match. could you decompile your
SpellCheckComponent.class ?
On Fri, Aug 19, 2011 at 7:23 PM, Valentin wrote:
> My beautiful NullPointer Exception :
>
>
> SEVERE: java.lang.NullPoin
My beautiful NullPointerException:
SEVERE: java.lang.NullPointerException
at
org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:476)
at
org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:131)
at
o
NullPointerException? Do you have the full exception stack trace?
On Fri, Aug 19, 2011 at 6:49 PM, Valentin wrote:
>
> Li Li wrote:
>> If you don't want to tokenize query, you should pass spellcheck.q
>> and provide your own analyzer such as keyword analyzer.
>
> That's already what I do with my
Li Li wrote:
> If you don't want to tokenize query, you should pass spellcheck.q
> and provide your own analyzer such as keyword analyzer.
That's already what I do with my suggestTextFull fieldType, added to my
searchComponent, no? I've copied my fieldType and my searchComponent in my
first post
Thanks Omri, that looks interesting.
What I'm looking for is something for movies, close to jinni.com.
They seem to be using JEE, but I'm not sure about Solr/Lucene.
Thanks.
Arcadius.
On Thu, Aug 18, 2011 at 3:25 PM, Omri Cohen wrote:
> check out OutBrain
>
I haven't used the Suggester yet. But in spell check, if you don't
provide spellcheck.q, it will
analyze the q parameter with a converter which "tokenizes" your query;
otherwise it will use the analyzer of the field to process the parameter.
If you don't want to tokenize the query, you should pass spellcheck.q
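A hypothetical request along those lines, passing the raw input through
spellcheck.q so the field's own analyzer handles it (host and parameters are
just examples):

http://localhost:8983/solr/select?q=*:*&spellcheck=true&spellcheck.q=how%20to%20reset%20password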
currently our full index takes around half an hour - it's a big dataset, ~
several million records of detailed product information - this is actually
very quick compared to another one of my installations. I would be
interested to know which of these methods would reduce indexing time the
most.
N
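(For what it's worth, the knobs usually tried first live in solrconfig.xml;
example values only:

<indexDefaults>
  <ramBufferSizeMB>128</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</indexDefaults>

A bigger RAM buffer means fewer flushes during indexing; a higher mergeFactor
defers merge cost, at some cost to search speed.)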
I don't think it will help me, sorry. I just want my query to not be
tokenised; I want it to be considered as a full sentence to correct.
But thanks for your answers, I keep searching.
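(A field type along the lines discussed in this thread, an untested sketch
with made-up names, that keeps the whole input as one token:

<fieldType name="text_suggest_full" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Point the spellchecker's field at this type and pass the sentence via
spellcheck.q.)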
You have different options here. You can give more boost at indexing time to
the documents that have the fields you want set. For this to take effect you
will have to reindex and set omitNorms="false" on the fields you are going
to search. This same concept can be applied to boost single fields ins
could you please show me your configuration in solrconfig.xml?
On Fri, Aug 19, 2011 at 5:31 PM, olivier sallou
wrote:
> Hi,
> I do not use spell but I use distributed search, using qt=spell is correct,
> you should not use qt=/spell.
> For "shards", I specify it in solrconfig directly, not in url, bu
1. What does this specify?
2.
When I say *queryResultCacheSize: 512*, does it mean 512 queries can be
cached, or that 512 bytes are reserved for caching?
Can someone please give me an answer?
On 14 August 2011 21:41, Erick Erickson wrote:
> Yep.
>
> ResultWindowSize in
> >> solrconfig.xml
> >
Hi,
I do not use spell but I do use distributed search; using qt=spell is correct,
you should not use qt=/spell.
For "shards", I specify it in solrconfig directly, not in the url, but it
should work the same.
Maybe it's an issue in your spell request handler.
2011/8/19 Li Li
> hi all,
> I follow the wiki http
this may need something like language models to suggest.
I found an issue: https://issues.apache.org/jira/browse/SOLR-2585
What's going on with it?
On Thu, Aug 18, 2011 at 11:31 PM, Valentin wrote:
> I'm trying to configure a spellchecker to autocomplete full sentences from my
> query.
>
> I've
hi all,
I follow the wiki http://wiki.apache.org/solr/SpellCheckComponent
but there is something wrong.
the url given by the wiki is
http://solr:8983/solr/select?q=*:*&spellcheck=true&spellcheck.build=true&spellcheck.q=toyata&qt=spell&shards.qt=spell&shards=solr-shard1:8983/solr,solr-shar
On Fri, Aug 19, 2011 at 5:32 AM, Alexandre Sompheng wrote:
> Hi guys, I tried the delta import; I got logs saying that it found delta
> data to update, but it seems that the index is not updated. Any guess
> why this happens? Did I miss something? I'm on solr 3.3 with no
> patch.
[...]
Please show
Actually, that's not my problem, I do specify "q".
Any other idea? It's really driving me crazy...