Re: Terms.regex performance issue

2011-08-19 Thread O. Klein
Terms.prefix was just to compare performance. The use case was terms.regex=.*query.*, and as Markus pointed out, this will probably remain a bottleneck. I looked at the Suggester, but like many others I have been struggling to make it useful. It needs a custom queryConverter to give proper suggestio

Re: Content recommendation using solr?

2011-08-19 Thread Chris Hostetter
: Initially, I was looking at http://wiki.apache.org/solr/MoreLikeThis : : Then, it turned out that most implementations are based on a combination of : Mahout, Solr and Hadoop. I think you'll find that most "serious" (for some definition) content recommendation engines use various ML algorithm

Re: Solr performance for query without filter

2011-08-19 Thread Chris Hostetter
: Index has 41 000 000 documents and 9 GB size. For query like: : 1) : *q=Jarecki+Jan*&fq=sex:M&fq=confirmed:1&fq=show_search:3&fl=user_id&start=0&rows=10&wt=json&version=2.2 : : server reaches an average of *90 queries/s* on 4 threads, which is very low for me. : : For query with filter on field city: :

Re: Terms.regex performance issue

2011-08-19 Thread Chris Hostetter
: Subject: Terms.regex performance issue : : As I want to use it in an Autocomplete it has to be fast. Terms.prefix gets : results in around 100 milliseconds, while terms.regex is 10 to 20 times : slower. can you elaborate on how you are using terms.regex? what does your regex look like? .. pa

Re: Requiring multiple matches of a term

2011-08-19 Thread Chris Hostetter
FWIW: i think this is a really cool and interesting question. : Is there a way to specify in a query that a term must match at least X : times in a document, where X is some value greater than 1? at the moment, i think your "phrase query" approach is really the only viable way (although it di

Re: Terms.regex performance issue

2011-08-19 Thread Markus Jelsma
TermsComponent uses java.util.regex which is not particularly fast. If the number of terms grows your CPU is going to overheat. I'd prefer an analyzer approach. > As I want to use it in an Autocomplete it has to be fast. Terms.prefix gets > results in around 100 milliseconds, while terms.regex is
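The gap between terms.prefix and terms.regex=.*query.* discussed in this thread can be illustrated with a toy model. This is a minimal sketch in Python, not TermsComponent's actual code: it assumes the terms are held in a sorted list, and the function names `prefix_lookup` and `infix_regex_lookup` are made up for illustration. A prefix lookup can jump to the right place in the sorted terms, while an unanchored pattern has to be tested against every term.

```python
import bisect
import re


def prefix_lookup(sorted_terms, prefix, limit=10):
    """Binary-search to the first candidate, then walk while terms still match."""
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for i in range(start, len(sorted_terms)):
        term = sorted_terms[i]
        if not term.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(term)
    return matches


def infix_regex_lookup(sorted_terms, fragment, limit=10):
    """Equivalent of .*fragment.*: no ordering helps, so every term is tested."""
    pattern = re.compile(".*" + re.escape(fragment) + ".*")
    matches = []
    for term in sorted_terms:
        if pattern.fullmatch(term):
            matches.append(term)
            if len(matches) >= limit:
                break
    return matches


# Hypothetical term dictionary, kept sorted as the sketch assumes.
terms = sorted(["solar", "solid", "solr", "solrcloud", "resolr", "zookeeper"])
print(prefix_lookup(terms, "solr"))
print(infix_regex_lookup(terms, "solr"))
```

The prefix version touches only the matching run of terms plus one, so its cost barely changes as the dictionary grows; the regex version grows linearly with the number of terms, which is consistent with the slowdown reported in the thread.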

Re: SolrJ and ContentStreams

2011-08-19 Thread Chris Hostetter
: I'm considering to use SolrJ to run queries in a MLT fashion against my Solr : server. I saw that there is already an open bug filed in Jira : (https://issues.apache.org/jira/browse/SOLR-1085). note that that issue is really just about having convenience classes for executing MLT style request

Re: A strange Exception in Solr 1.4

2011-08-19 Thread Chris Hostetter
Can you reproduce this error consistently? Can you try using the CheckIndex tool on your index to verify that it hasn't been corrupted in some way? :2011-08-15 10:31:24,968 ERROR [org.apache.solr.core.SolrCore] - : java.lang.NullPointerException : at sun.nio.ch.Util.free(Util.java:1

Re: Date Facet Question

2011-08-19 Thread Chris Hostetter
: when the response comes back the facet names are : : 2010-08-14T01:50:58.813Z ... : instead of something like : : NOW-11MONTH ... : whereas facet queries if specifying a set of facet queries like : : datetime:[NOW-1YEAR TO NOW] ... : the labels come back just as speci

Terms.regex performance issue

2011-08-19 Thread O. Klein
As I want to use it in an Autocomplete it has to be fast. Terms.prefix gets results in around 100 milliseconds, while terms.regex is 10 to 20 times slower. Not storing the field made it a bit faster but not enough. The index is on a separate core and only about 5Mb big. Are there some tricks to ma

Solr performance for query without filter

2011-08-19 Thread mikopacz
Hi I have one instance of solr running on JBoss with the following schema and partial config: Schema: − − − − − user_id search_text Config: 10 1024 1000 1 1000 1 Index has 41 000 000 documents and 9 GB size. For query like: 1) *q=Jarecki+Jan*&fq=sex:M

Solr performance

2011-08-19 Thread Michał Kopacz
Hi I have one instance of solr running on JBoss with the following schema and partial config: Schema: - - - - - user_id search_text Config: 10 1024 1000 1 1000 1 Index has 41 000 000 documents and 9 GB size. For query like: 1)* q=Jarecki+Jan* &fq=sex:

Re: Solr support for multiple points (latitude-longitude) for a document

2011-08-19 Thread Smiley, David W.
Hi. Either port it to Solr 3, or use Solr 4 (trunk). I know and have used a Metacarta solution but that is also based on Solr 4 and I don't think they've back-ported it. I have no clue what they charge for it or where to get it; I have it as part of their larger solution. There's also a sma

Solr support for multiple points (latitude-longitude) for a document

2011-08-19 Thread Jean Croteau
Hi all, I was going through Solr 3.3.0 and it seems there's still no support for performing GeoSpatial queries on documents that have more than one latitude-longitude. The multi field value is set to false everywhere. We absolutely need this feature. I had a look at https://issues.apache.org/ji

Re: How to implement Spell Checker using Solr?

2011-08-19 Thread anupamxyz
Both Nutch and Solr can be used as per the need. http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/ . So the search is implemented and I am able to search on the values. Now I need the SpellChecker to be implemented. The changes are exactly as per the ones listed in http://wiki.apache.org/

Please register me

2011-08-19 Thread Anupam
Please register me

Re: How to implement Spell Checker using Solr?

2011-08-19 Thread Gora Mohanty
On Fri, Aug 19, 2011 at 9:26 PM, anupamxyz wrote: > I am using Nutch to crawl and Solr for searching. The search has been > successfully implemented. Now I want a file based Suggestion or a "Do you > mean Feature?" implemented. It is more or less like a Spell checker Um, not quite. At least as pe

Requiring multiple matches of a term

2011-08-19 Thread Michael Ryan
Is there a way to specify in a query that a term must match at least X times in a document, where X is some value greater than 1? For example, I want to only get documents that contain the word "dog" three times. I've thought that using a proximity query with an arbitrarily large distance value
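One reading of the proximity-query idea in this message is a sloppy phrase that repeats the term X times with a very large slop. The sketch below only builds such a query string; the helper name `repeated_term_phrase` is hypothetical, the exact query the poster had in mind is cut off in the snippet, and whether the query parser treats repeated terms as intended would need to be verified against a real index.

```python
def repeated_term_phrase(term, times, slop):
    """Build a sloppy phrase query string that repeats one term `times` times.

    Illustrative only: assumes a single plain term with no whitespace or quotes.
    """
    if times < 2:
        raise ValueError("times must be at least 2")
    if not term or any(ch.isspace() or ch == '"' for ch in term):
        raise ValueError("term must be a single word without quotes")
    return '"{}"~{}'.format(" ".join([term] * times), slop)


print(repeated_term_phrase("dog", 3, 10000))
```

The slop value of 10000 is an arbitrary stand-in for "larger than any document"; it is not taken from the thread.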

How to implement Spell Checker using Solr?

2011-08-19 Thread anupamxyz
I am using Nutch to crawl and Solr for searching. The search has been successfully implemented. Now I want a file based Suggestion or a "Do you mean Feature?" implemented. It is more or less like a Spell checker. For the same I am making the requisite changes to the SolrConfig.xml and the Schema.xm

Re: File based index doesn't work in spellcheck component

2011-08-19 Thread anupamxyz
I am using Nutch to crawl and Solr for searching. The search has been successfully implemented. Now I want a file based Suggestion or a "Do you mean Feature?" implemented. It is more or less like a Spell checker. For the same I am making the requisite changes to the SolrConfig.xml and the Schema.xm

Re: hl.useFastVectorHighlighter, fragmentsBuilder and HighlightingParameters

2011-08-19 Thread Alexei Martchenko
Hi Koji, thanks, it's loading right now. Can't say it's really working though, but I believe those are other issues with FastVectorHighlighter 2011/8/18 Koji Sekiguchi > (11/08/19 4:14), Alexei Martchenko wrote: > >> Hi Koji thanks for the reply. >> >> My is defined directly in. SOLR 3.3 warns

Re: Solr 3.3 crashes after ~18 hours?

2011-08-19 Thread alexander sulz
On 19.08.2011 16:43, Yonik Seeley wrote: On Fri, Aug 19, 2011 at 10:36 AM, alexander sulz wrote: using lsof I think I pinned down the problem: too many open files! I already doubled from 512 to 1024 once but it seems there are many SOCKETS involved, which are listed as "can't identify protoco

Re: suggester issues

2011-08-19 Thread William Oberman
Hard to say, so I'll list the exact steps I took: -Downloaded apache-solr-3.3.0 (I like to stick with releases vs. svn) -Untar and cd -ant -Wrote my class below (under a peer directory in apache-solr-3.3.0) -javac -cp ../dist/apache-solr-core-3.3.0.jar:../lucene/build/lucene-core-3.3-SNAPSHOT.jar

Re: suggester issues

2011-08-19 Thread Kuba Krzemien
As far as I checked creating a custom query converter is the only way to make this work. Unfortunately I have some problems with running it - after creating a JAR with my class (I'm using your source code, obviously besides package and class names) and throwing it into the lib dir I've added name

Re: Solr 3.3 crashes after ~18 hours?

2011-08-19 Thread Yonik Seeley
On Fri, Aug 19, 2011 at 10:36 AM, alexander sulz wrote: > using lsof I think I pinned down the problem: too many open files! > I already doubled from 512 to 1024 once but it seems there are many SOCKETS > involved, > which are listed as "can't identify protocol", instead of "real files". > over ti

Re: Solr 3.3 crashes after ~18 hours?

2011-08-19 Thread alexander sulz
On 19.08.2011 15:48, alexander sulz wrote: On 10.08.2011 17:11, Yonik Seeley wrote: On Wed, Aug 10, 2011 at 11:00 AM, alexander sulz wrote: Okay, with this command it hangs. It doesn't look like a hang from this thread dump. It doesn't look like any solr requests are executing at the tim

Re: query cache result

2011-08-19 Thread Tomás Fernández Löbbe
From my understanding, seeing the cache as a set of key-value pairs, this cache has the query as key and the list of IDs resulting from the query as values. When the exact same query is issued, it will be found as key in this cache, and Solr will already have the list of IDs that match it. If you

Re: Solr 3.3 crashes after ~18 hours?

2011-08-19 Thread alexander sulz
On 10.08.2011 17:11, Yonik Seeley wrote: On Wed, Aug 10, 2011 at 11:00 AM, alexander sulz wrote: Okay, with this command it hangs. It doesn't look like a hang from this thread dump. It doesn't look like any solr requests are executing at the time the dump was taken. Did you do this from th

Re: query cache result

2011-08-19 Thread jame vaalet
wiki says *"size The maximum number of entries in the cache." and queryResultCache This cache stores ordered sets of document IDs — the top N results of a query ordered by some criteria. * doesn't it mean number of document ids rather than number of queries? 2011/8/19 Tomás Fernández Löbbe

Re: Filtering results based on a set of values for a field

2011-08-19 Thread Erick Erickson
Good luck, and let us know what the results are. About dropping the cache.. That shouldn't be a problem, it should just be computed when your component is called the first time, so starting the server (or opening a new searcher) should re-compute it. Your filters shouldn't be very big, just maxDoc

Re: Full sentence spellcheck

2011-08-19 Thread William Oberman
I was on my phone before, and didn't see the whole thread. I wanted the same thing, to have spellchecker not tokenize. See the "Suggester Issues" thread for my junky replacement class that doesn't tokenize (as far as I can tell from a few minutes of testing). will On Aug 19, 2011, at 8:35 A

Re: When are you planning to release SolrCloud feature with ZooKeeper?

2011-08-19 Thread Mark Miller
Whenever 4.0 comes out :) Hard to put a date on that - I believe another SolrCloud push is about to start to cover the indexing side. On Aug 18, 2011, at 11:46 AM, Way Cool wrote: > Hi, guys, > > When are you planning to release the SolrCloud feature with ZooKeeper > currently in trunk? The ne

Re: query cache result

2011-08-19 Thread Tomás Fernández Löbbe
Hi Jame, the size for the queryResultCache is the number of queries that will fit into this cache. AutowarmCount is the number of queries that are going to be copied from the old cache to the new cache when a commit occurs (actually, the queries are going to be executed again against the new IndexS
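The explanation in this thread (entries are keyed by query, `size` counts queries rather than bytes or document IDs, and autowarming re-executes cached queries against the new searcher) can be modelled with a small toy cache. This is a sketch under stated assumptions, not Solr's implementation: the class name `ToyQueryResultCache`, the least-recently-used eviction, and the `search` callback standing in for the new searcher are all illustrative choices.

```python
from collections import OrderedDict


class ToyQueryResultCache:
    """Toy model: one entry per query, each entry holding a list of doc IDs."""

    def __init__(self, size):
        self.size = size              # maximum number of queries, not bytes
        self.entries = OrderedDict()  # query string -> list of doc IDs

    def get(self, query):
        if query not in self.entries:
            return None
        self.entries.move_to_end(query)  # mark as most recently used
        return self.entries[query]

    def put(self, query, doc_ids):
        self.entries[query] = list(doc_ids)
        self.entries.move_to_end(query)
        while len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict least recently used

    def autowarm(self, autowarm_count, search):
        """Build a new cache by re-running the most recent queries via `search`."""
        new_cache = ToyQueryResultCache(self.size)
        if autowarm_count <= 0:
            return new_cache
        for query in list(self.entries)[-autowarm_count:]:
            new_cache.put(query, search(query))
        return new_cache


old = ToyQueryResultCache(size=2)
old.put("q=a", [1, 2, 3])
old.put("q=b", [4])
old.put("q=c", [5, 6])  # third query pushes out the oldest entry
new = old.autowarm(1, lambda query: [99])  # stand-in for the new searcher
print(list(old.entries), list(new.entries))
```

In this model a long result list and a short one each occupy exactly one slot, which matches the answer given in the thread that `size` is a count of cached queries.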

Re: paging size in SOLR

2011-08-19 Thread Erick Erickson
1> I don't know, where is it coming from? Looks like you've done stats call on a freshly opened server. 2> 512 entries (i.e. results for 512 queries). Each entry is doc IDs. Best Erick On Fri, Aug 19, 2011 at 5:33 AM, jame vaalet wrote: > 1 .what does this specify ? > > size="*${queryResultCa

Re: Full sentence spellcheck

2011-08-19 Thread Will Oberman
This might be unrelated, but I had the exact same error yesterday trying to replace the query converter with a custom class I wrote. Ended up, I wasn't properly registering my jar. I'm still testing with jetty, and "lib" in example is included "too late" in the startup process. I had to re

Re: Full sentence spellcheck

2011-08-19 Thread Valentin
My analyser is not empty : / / and i'm sure there is words in it I don't know where to find this file "org.apache.solr.handler.component.SpellCheckComponent.getTokens" -- View this message in context: http://lucene.472066.n3.nabble.com/Full-se

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
or your analyzer is null? any other exception or warning in your log file? On Fri, Aug 19, 2011 at 7:37 PM, Li Li wrote: > Line 476 of  SpellCheckComponent.getTokens of mine  is  assert analyzer != > null; > it seems our codes' versions don't match. could you decompile your > SpellCheckComponent

query cache result

2011-08-19 Thread jame vaalet
hi, i understand that queryResultCache tag in solrconfig is the one which determines the cache size of SOLR in jvm. out of the different attributes what is size? Is it the amount of memory reserved in bytes ? or number of doc ids cached ? or is it the number of queries it will cache? similarly

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
Line 476 of SpellCheckComponent.getTokens of mine is assert analyzer != null; it seems our codes' versions don't match. could you decompile your SpellCheckComponent.class ? On Fri, Aug 19, 2011 at 7:23 PM, Valentin wrote: > My beautiful NullPointer Exception : > > > SEVERE: java.lang.NullPoin

Re: Full sentence spellcheck

2011-08-19 Thread Valentin
My beautiful NullPointer Exception : SEVERE: java.lang.NullPointerException at org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:476) at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:131) at o

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
NullPointerException? do you have the full exception print stack? On Fri, Aug 19, 2011 at 6:49 PM, Valentin wrote: > > Li Li wrote: >> If you don't want to tokenize  query, you should pass spellcheck.q >> and provide your own analyzer such as keyword analyzer. > > That's already what I do with my

Re: Full sentence spellcheck

2011-08-19 Thread Valentin
Li Li wrote: > If you don't want to tokenize query, you should pass spellcheck.q > and provide your own analyzer such as keyword analyzer. That's already what I do with my suggestTextFull fieldType, added to my searchComponent, no ? I've copied my fieldType and my searchComponent on my first pos

Re: Content recommendation using solr?

2011-08-19 Thread Arcadius Ahouansou
Thanks Omri, that looks interesting. What I'm looking for is for movies and close to jinni.com. They seem to be using JEE, but not sure about Solr/Lucene though. Thanks. Arcadius. On Thu, Aug 18, 2011 at 3:25 PM, Omri Cohen wrote: > check out OutBrain >

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
I haven't used suggest yet. But in spell check, if you don't provide spellcheck.q, it will analyze the q parameter with a converter which "tokenizes" your query. Else it will use the analyzer of the field to process parameter q. If you don't want to tokenize the query, you should pass spellcheck.q
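A request that passes the user's text in spellcheck.q, as suggested here, might be assembled as follows. This is a hedged sketch: the base URL and the parameter names are copied from the wiki URL quoted elsewhere in this listing, the helper name `spellcheck_url` is made up, and it assumes a request handler named "spell" is configured on the server.

```python
from urllib.parse import urlencode


def spellcheck_url(base_url, user_text):
    """Build a select URL that sends the raw user text as spellcheck.q."""
    params = [
        ("q", "*:*"),
        ("spellcheck", "true"),
        ("spellcheck.q", user_text),  # the text to correct, kept separate from q
        ("qt", "spell"),              # assumes a handler registered as "spell"
    ]
    return base_url + "?" + urlencode(params)


url = spellcheck_url("http://solr:8983/solr/select", "toyata corola")
print(url)
```

Whether the text is then tokenized depends on the analyzer configured for the spellchecker's field type, which is the point under discussion in the thread.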

Re: Solr Copyfields

2011-08-19 Thread Nicholas Fellows
currently our full index takes around half an hour - it's a big dataset ~ several million records of detailed product information - this is actually very quick compared to another one of my installations. I would be interested to know which of these methods would reduce indexing time the most. N

Re: Full sentence spellcheck

2011-08-19 Thread Valentin
I don't think it will help me, sorry. I just want my query to not be tokenised, I want it to be considered as a full sentence to correct. But thanks for your answers, I keep searching. -- View this message in context: http://lucene.472066.n3.nabble.com/Full-sentence-spellcheck-tp3265257p3267629.

Re: Boost documents based on the number of their fields

2011-08-19 Thread Marc Sturlese
You have different options here. You can give more boost at indexing time to the documents that have set the fields you want. For this to take effect you will have to reindex and set omitNorms="false" to the fields you are going to search. This same concept can be applied to boost single fields ins

Re: solr distributed search don't work

2011-08-19 Thread Li Li
could you please show me your configuration in solrconfig.xml? On Fri, Aug 19, 2011 at 5:31 PM, olivier sallou wrote: > Hi, > I do not use spell but I use distributed search, using qt=spell is correct, > should not use qt=\spell. > For "shards", I specify it in solrconfig directly, not in url, bu

Re: paging size in SOLR

2011-08-19 Thread jame vaalet
1. what does this specify? 2. when i say *queryResultCacheSize : 512 *, does it mean 512 queries can be cached or 512 bytes are reserved for caching? can someone please give me an answer? On 14 August 2011 21:41, Erick Erickson wrote: > Yep. > > ResultWindowSize in > >> solrconfig.xml > >

Re: solr distributed search don't work

2011-08-19 Thread olivier sallou
Hi, I do not use spell but I use distributed search, using qt=spell is correct, should not use qt=\spell. For "shards", I specify it in solrconfig directly, not in url, but should work the same. Maybe an issue in your spell request handler. 2011/8/19 Li Li > hi all, > I follow the wiki http

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
this may need something like language models to suggest. I found an issue https://issues.apache.org/jira/browse/SOLR-2585 what's going on with it? On Thu, Aug 18, 2011 at 11:31 PM, Valentin wrote: > I'm trying to configure a spellchecker to autocomplete full sentences from my > query. > > I've

solr distributed search don't work

2011-08-19 Thread Li Li
hi all, I follow the wiki http://wiki.apache.org/solr/SpellCheckComponent but there is something wrong. the url given by the wiki is http://solr:8983/solr/select?q=*:*&spellcheck=true&spellcheck.build=true&spellcheck.q=toyata&qt=spell&shards.qt=spell&shards=solr-shard1:8983/solr,solr-shar

can't use distributed spell check

2011-08-19 Thread Li Li
hi all, I tested it following the instructions in http://wiki.apache.org/solr/SpellCheckComponent. but something seems wrong. the sample url in the wiki is http://solr:8983/solr/select?q=*:*&spellcheck=true&spellcheck.build=true&spellcheck.q=toyata&qt=spell&shards.qt=spell&shards=solr-

Re: get update record from database using DIH

2011-08-19 Thread Gora Mohanty
On Fri, Aug 19, 2011 at 5:32 AM, Alexandre Sompheng wrote: > Hi guys, i try the delta import, i got logs saying that it found delta > data to update. But it seems that the index is not updated. Any guess > why this happens ? Did i miss something? I'm on solr 3.3 with no > patch. [...] Please show

RE: Full sentence spellcheck

2011-08-19 Thread Valentin
Actually, that's not my problem, I do specify "q". Another idea ? It really makes me crazy... -- View this message in context: http://lucene.472066.n3.nabble.com/Full-sentence-spellcheck-tp3265257p3267394.html Sent from the Solr - User mailing list archive at Nabble.com.