Using recency rord on /distrib

2009-09-23 Thread Pooja Verlani
Hi, I have to add recency boosting using the recip and rord functions on an app using the /distrib request handler. Can I put the bf param in /distrib and directly call the URL like: http://localhost:8983/solr/distrib/?q=cable where in the /distrib request handler bf is defined as: recip(rord(last_sold_date),1,1000,100
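
For reference, Solr's function query documentation describes recip(x,m,a,b) as a/(m*x+b), and rord as the reverse ordinal of a field value, so the most recent date gets the smallest ordinal and the largest boost. Below is a minimal Python sketch of that arithmetic only. The constants M, A and B are illustrative assumptions, since the bf value in the message above is cut off, and `recency_boost` is a hypothetical helper name.

```python
def recip(x: float, m: float, a: float, b: float) -> float:
    """Reciprocal function as described for Solr function queries: a / (m*x + b)."""
    return a / (m * x + b)


# Hypothetical constants for illustration; the original bf value is truncated.
M, A, B = 1, 1000, 1000


def recency_boost(reverse_ordinal: int) -> float:
    """Boost for a document whose date has the given reverse ordinal (1 = newest)."""
    return recip(reverse_ordinal, M, A, B)


# Newer documents (small reverse ordinal) get a boost close to A/B;
# older documents decay smoothly towards zero.
for ordinal in (1, 1000, 100000):
    print(ordinal, round(recency_boost(ordinal), 4))
```

With these assumed constants the boost halves once the reverse ordinal reaches B/M, which is one way to reason about how quickly recency should fade.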

Re: java doc error local params syntax for dismax

2009-09-23 Thread Naomi Dushay
Okay, but {!dismax qf="myfield mytitle^2"}foo works {!dismax qf=myfield mytitle^2}foo does NOT work - Naomi On Sep 23, 2009, at 5:52 PM, Yonik Seeley wrote: On Wed, Sep 23, 2009 at 8:24 PM, Naomi Dushay wrote: It's not just the spaces - it's that the quotes (single or double flav

Re: Can solr build on top of HBase

2009-09-23 Thread Amit Nithian
Would FUSE (http://wiki.apache.org/hadoop/MountableHDFS) be of use? I wonder if you could take the data from HBase and index it into a Lucene index stored on HDFS. 2009/9/23 Noble Paul നോബിള്‍ नोब्ळ् > can hbase be mounted on the filesystem? Solr can only read data from a > filesystem > > On Th

Can we point a Solr server to index directory dynamically at runtime..

2009-09-23 Thread Silent Surfer
Hi, Is there any way to dynamically point the Solr servers to index/data directories at run time? We are generating 200 GB worth of index per day and we want to retain the index for approximately 1 month. So our idea is to keep the first 1 week of index available at any time for the users i.

Re: Can solr build on top of HBase

2009-09-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
can hbase be mounted on the filesystem? Solr can only read data from a filesystem On Thu, Sep 24, 2009 at 7:27 AM, 梁景明 wrote: > hi,  i use hbase and solr ,now i have a large data need to index ,it means > solr-index  will be large, > as the data increases,it will be more larger than now. > so  s

Re: solr caching problem

2009-09-23 Thread satya
Is there any way to analyze or see which documents are getting cached by documentCache? On Wed, Sep 23, 2009 at 8:10 AM, satya wrote: > First of all , thanks a lot for the clarification.Is there any way to see, > how this cache is working internally and what are the objects being sto

Can solr build on top of HBase

2009-09-23 Thread 梁景明
Hi, I use HBase and Solr. I now have a large amount of data to index, which means the Solr index will be large, and it will grow further as the data increases. So, for the /solrhome/data directory set in solrconfig.xml, can I set it through the API and point it to my distributed HBase data storage? And if the index is too large, w

Re: java doc error local params syntax for dismax

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 8:24 PM, Naomi Dushay wrote: > It's not just the spaces - it's that the quotes (single or double flavor) is > required as well. LocalParams are space delimited, so the original example would have worked if the dismax parser accepted comma delimited fields. -Yonik http://w

Re: java doc error local params syntax for dismax

2009-09-23 Thread Naomi Dushay
It's not just the spaces - it's that the quotes (single or double flavor) are required as well. On Sep 23, 2009, at 3:10 PM, Yonik Seeley wrote: On Wed, Sep 23, 2009 at 5:59 PM, Naomi Dushay wrote: The javadoc for DisMaxQParserPlugin states: {!dismax qf=myfield,mytitle^2}foo creates a di

Re: Solrj possible deadlock

2009-09-23 Thread Ryan McKinley
do you have anything custom going on? The fact that the lock is in java2d seems suspicious... On Sep 23, 2009, at 7:01 PM, pof wrote: I had the same problem again yesterday except the process halted after about 20mins this time. pof wrote: Hello, I was running a batch index the other

Re: Solrj possible deadlock

2009-09-23 Thread pof
I had the same problem again yesterday except the process halted after about 20 mins this time. pof wrote: > > Hello, I was running a batch index the other day using the Solrj > EmbeddedSolrServer when the process abruptly froze in its tracks after > running for about 4-5 hours and indexing ~4

Re: java doc error local params syntax for dismax

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 5:59 PM, Naomi Dushay wrote: > The javadoc for  DisMaxQParserPlugin states: > > {!dismax qf=myfield,mytitle^2}foo creates a dismax query > > but actually, that gives an error. > > The correct syntax is > > {!dismax qf="myfield mytitle^2"}foo > > (could use single quote inst

java doc error local params syntax for dismax

2009-09-23 Thread Naomi Dushay
The javadoc for DisMaxQParserPlugin states: {!dismax qf=myfield,mytitle^2}foo creates a dismax query but actually, that gives an error. The correct syntax is {!dismax qf="myfield mytitle^2"}foo (could use single quote instead of double quote). - Naomi
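
The quoting rule discussed in this thread (local params are space delimited, so a value containing spaces has to be quoted) can be sketched in client code. This is a minimal Python illustration under stated assumptions, not part of Solr or SolrJ: `local_params_query` is a hypothetical helper, and it simply refuses values that already contain a double quote rather than guessing at escaping.

```python
from urllib.parse import urlencode


def local_params_query(parser: str, params: dict, query: str) -> str:
    """Build a '{!parser key=value ...}query' string, quoting values with whitespace."""
    parts = [f"!{parser}"]
    for key, value in params.items():
        if '"' in value:
            # Escaping rules are out of scope for this sketch.
            raise ValueError(f"value for {key!r} must not contain double quotes")
        if any(ch.isspace() for ch in value):
            # Space-delimited syntax: a multi-field qf must be quoted.
            value = f'"{value}"'
        parts.append(f"{key}={value}")
    return "{" + " ".join(parts) + "}" + query


q = local_params_query("dismax", {"qf": "myfield mytitle^2"}, "foo")
print(q)
# URL-encode before placing it in a request, since the braces, quotes
# and caret are not safe in a raw query string.
print(urlencode({"q": q}))
```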

Re: Solr http post performance seems slow - help?

2009-09-23 Thread Dan A. Dickey
On Friday 11 September 2009 11:06:20 am Dan A. Dickey wrote: ... > Our JBoss expert and I will be looking into why this might be occurring. > Does anyone know of any JBoss related slowness with Solr? > And does anyone have any other sort of suggestions to speed indexing > performance? Thanks for

Mixed field types and boolean searching

2009-09-23 Thread Ensdorf Ken
Hi- let's say you have two indexed fields, "F1" and "F2". F1 uses the StandardAnalyzer, while F2 doesn't. Now imagine you index a document where you have F1="A & B" F2="C + D" Now imagine you run a query: (F1:A OR F2:A) AND (F1:B OR F2:B) in other words, both "A" and "B" must exist in at

Very big numbers

2009-09-23 Thread Jonathan Ariel
Hi! I need to index very big numbers in Solr, something like 99,999,999,999,999.99. Right now I'm using an sdouble field type because I need to make range queries on this field. The problem is that the field value is being returned in scientific notation. Is there any way to avoid that? Thanks! Jon
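
One client-side option, independent of how the field is configured, is to reformat the returned string before display. The Python sketch below assumes the value arrives as text in scientific notation (the sample string is an assumption, not taken from the thread); `to_plain` and `to_grouped` are hypothetical helper names. Note that a double near 1e14 cannot hold two decimal places exactly, so the last digit may already differ from what was indexed.

```python
from decimal import Decimal


def to_plain(value: str) -> str:
    """Render a numeric string without an exponent, keeping all its digits."""
    return format(Decimal(value), "f")


def to_grouped(value: str, places: int = 2) -> str:
    """Render with thousands separators and a fixed number of decimals."""
    return format(Decimal(value), f",.{places}f")


sample = "9.999999999999998E13"  # assumed shape of the returned value
print(to_plain(sample))
print(to_grouped(sample))
```

Parsing through Decimal rather than float avoids introducing a second rounding step on the client.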

ReversedWildcardFilterFactory (SOLR-1321) and KeywordTokenizerFactory

2009-09-23 Thread Ravi Kiran
Hello, can ReversedWildcardFilterFactory be used with KeywordTokenizerFactory? I get the following error; it looks like Solr expects WhitespaceTokenizerFactory. Can anybody suggest how to rectify it? My schema snippet is also given below. Data is extracted via OpenNLP and indexed into S

Multiple DisMax Queries spanning across multiple fields

2009-09-23 Thread Kay Kay
For a particular requirement we have - we need to do a query that is a combination of multiple dismax queries behind the scenes. (Using solr 1.4 nightly ). The DisMaxQParser org.apache.solr.search.DisMaxQParser ( details at - http://wiki.apache.org/solr/DisMaxRequestHandler ) takes in the /qf

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Jason Rutherglen
I don't think this handles near duplicates, which would require some of the methods mentioned recently on the Mahout list. On Wed, Sep 23, 2009 at 2:59 AM, Shalin Shekhar Mangar wrote: > On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote: > >> Hi, >> When we have news content crawled we face a probl

Multivalue Field Cache

2009-09-23 Thread Amit Nithian
Are there any good implementations of a field cache that will return all values of a multivalued field? I am in the process of writing one for my immediate needs but I was wondering if there is a complete implementation of one that handles the different field types. If not, then I can continue o

Re: Ranking of search results

2009-09-23 Thread Amit Nithian
It depends on several things: 1) The query handler that you are using 2) The fields that you are searching on and the default fields specified. For the default handler, it will issue a query for the default field and return results accordingly. To see what is going on pass the &debugQuery=true to the e
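
As a small illustration of the debugQuery suggestion above, the following Python sketch builds a request URL with that parameter appended. The base URL and the /select path are assumptions for a local default setup, and `debug_url` is a hypothetical helper; the response would then include the scoring explanation alongside the results.

```python
from urllib.parse import urlencode


def debug_url(base_url: str, query: str) -> str:
    """Return a search URL that asks Solr to include debug/explain output."""
    return f"{base_url}?{urlencode({'q': query, 'debugQuery': 'true'})}"


# Assumed local endpoint; adjust host, port and handler path as needed.
print(debug_url("http://localhost:8983/solr/select", "cable"))
```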

Ranking of search results

2009-09-23 Thread bhaskar chandrasekar
Hi, when I give an input string for search in Solr, it displays the corresponding results for the given input string. How are the results ranked and displayed? On what basis are the search results displayed? Is there any algorithm followed for displaying the results with first result and s

Re: Solr via ruby

2009-09-23 Thread rajan chandi
Thanks Ian for sharing your knowledge on this. We've been going through the recently published "Solr 1.4 Enterprise Search Server" book and came across some stuff that means - Acts_as_Solr schema could be less flexible when it comes to complex indexing and faceting of the fields. We are still exp

Re: Parallel requests to Tomcat

2009-09-23 Thread Grant Ingersoll
On Sep 23, 2009, at 12:09 PM, Michael wrote: On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley wrote: On Wed, Sep 23, 2009 at 11:47 AM, Michael wrote: If this were IO bound, wouldn't I see the same results when sending my 8 requests to 8 Tomcats? There's only one "disk" (well, RAM) whethe

Re: Solr via ruby

2009-09-23 Thread Ian Connor
Hi, Thanks for the discussion. We use the distributed option so I am not sure embedded is possible. As you also guessed, we use haproxy for load balancing and failover between replicas of the shards so giving this up for a minor performance boost is probably not wise. So essentially we have: Use

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
> I'm not sure whether you'd call my operations IO heavy -- each query has so > many terms (~50) that even against a 45K document index a query takes 130ms, > but the entire index is in a ramfs. The more terms, the more it takes to find docset intersections (belonging to each term); something in

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
> 8 threads sharing something may have *some* overhead versus 8 processes, but > as you say, 410ms overhead points to a different problem. - You have baseline (single-threaded load-stress script sending requests to SOLR) (1-request-in-parallel, 8 requests to 8 Tomcats); 200ms looks extremely high

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley wrote: > On Wed, Sep 23, 2009 at 11:47 AM, Michael wrote: > > If this were IO bound, wouldn't I see the same results when sending my 8 > > requests to 8 Tomcats? There's only one "disk" (well, RAM) whether I'm > > querying 8 processes or 8 threads i

Re: Parallel requests to Tomcat

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 11:47 AM, Michael wrote: > Hi Yonik, > > On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley > wrote: >> >> This could well be IO bound - lots of seeks and reads. > > If this were IO bound, wouldn't I see the same results when sending my 8 > requests to 8 Tomcats?  There's only

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Thanks for the suggestion, Walter! I've been using Gaze 1.0 for a while now, but when I moved to a multicore approach (which was the impetus behind all of this testing) Gaze failed to start and I had to comment it out of solrconfig.xml to get Solr to start. Are you aware whether Gaze is able to w

Re: Parallel requests to Tomcat

2009-09-23 Thread Walter Underwood
This sure seems like a good time to try LucidGaze for Solr. That would give some Solr-specific profiling data. http://www.lucidimagination.com/Downloads/LucidGaze-for-Solr wunder On Sep 23, 2009, at 8:47 AM, Michael wrote: Hi Yonik, On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley wrote:

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Hi Yonik, On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley wrote: > > This could well be IO bound - lots of seeks and reads. > If this were IO bound, wouldn't I see the same results when sending my 8 requests to 8 Tomcats? There's only one "disk" (well, RAM) whether I'm querying 8 processes or 8

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Hi Fuad, On Wed, Sep 23, 2009 at 11:37 AM, Fuad Efendi wrote: > > 8 queries against 1 Tomcat average 600ms per query, while 8 queries > against > > 8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G > RAM). > > I don't see how to interpret these numbers except that Tomcat is

Re: Parallel requests to Tomcat

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 11:17 AM, Michael wrote: > I'm using a Solr 1.4 nightly from around July.  Is that recent enough to > have the improved reader implementation? > I'm not sure whether you'd call my operations IO heavy -- each query has so > many terms (~50) that even against a 45K document i

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
> 8 queries against 1 Tomcat average 600ms per query, while 8 queries against > 8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G RAM). > I don't see how to interpret these numbers except that Tomcat is not > multithreading as well as it should :) Hi Michael, I think it is ve

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
On Wed, Sep 23, 2009 at 11:26 AM, Fuad Efendi wrote: > > - something obviously wrong in your case, 130ms is too high. Is it > dedicated > server? Disk swapping? Etc. > It's that my queries are ridiculously complex. My users are very familiar with boolean searching, and I'm doing a lot of process

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
Correction: 0 - 150ms (depends on size of query results; 150ms for non-cached (new) queries returning more than 50K docs). > -Original Message- > From: Fuad Efendi [mailto:f...@efendi.ca] > Sent: September-23-09 11:26 AM > To: solr-user@lucene.apache.org > Subject: RE: Parallel requests

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Hi Fuad, thanks for the reply. My queries are heavy enough that the difference in performance is obvious. I am using a home-grown load testing script that sends 1000 realistic queries to the server and takes the average response time. My index is on a ramfs which I've shown makes the QR and doc c

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
I have 0-15ms for 50M (million) docs, Tomcat, 8-CPU: http://www.tokenizer.org - something is obviously wrong in your case, 130ms is too high. Is it a dedicated server? Disk swapping? Etc. > -Original Message- > From: Michael [mailto:solrco...@gmail.com] >

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
I'm using a Solr 1.4 nightly from around July. Is that recent enough to have the improved reader implementation? I'm not sure whether you'd call my operations IO heavy -- each query has so many terms (~50) that even against a 45K document index a query takes 130ms, but the entire index is in a ram

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
For 8-CPU load-stress testing of Tomcat you are probably making a mistake: you should run the load-stress software and wait 5-30 minutes (depending on index size) BEFORE taking measurements. 1. JVM HotSpot needs to compile everything into native code 2. Tomcat thread pool needs to warm up 3. SOLR caches
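
The warm-up advice above can be sketched as a measurement loop that discards its first passes. This is a minimal Python illustration, assuming a caller-supplied `send_query` function that issues one request; the name `measure` and the round-based warm-up (rather than the 5-30 minutes of wall-clock time suggested in the message) are choices made for the sketch.

```python
import time
from statistics import mean
from typing import Callable, Iterable


def measure(send_query: Callable[[str], object],
            queries: Iterable[str],
            warmup_rounds: int = 1) -> float:
    """Return the mean latency in milliseconds, measured after warm-up passes."""
    queries = list(queries)
    # Warm-up: run the full query set and throw the timings away so that
    # JIT compilation, thread pools and caches settle first.
    for _ in range(warmup_rounds):
        for q in queries:
            send_query(q)
    timings = []
    for q in queries:
        start = time.perf_counter()
        send_query(q)
        timings.append(time.perf_counter() - start)
    return mean(timings) * 1000.0


# Stand-in for a real HTTP call, used only to make the sketch runnable.
def fake_send(query: str) -> None:
    time.sleep(0.001)


print(f"{measure(fake_send, ['q1', 'q2', 'q3'], warmup_rounds=2):.2f} ms")
```

For a real test the warm-up would more likely be time-based and the sender would issue HTTP requests, but the separation between discarded and recorded passes stays the same.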

Re: Phrase stopwords

2009-09-23 Thread AHMET ARSLAN
> From: Pooja Verlani > Subject: Phrase stopwords > To: solr-user@lucene.apache.org > Date: Wednesday, September 23, 2009, 1:15 PM > Hi, > Is it possible to have a phrase as a stopword in solr? In > case, please share > how to do so? > > regards, > Pooja > I think that can be implemented castin

about url field error

2009-09-23 Thread net_nav
Hello guys, I am a newbie to Solr. I have Solr running on Tomcat 6 and all is OK, but when I add data to the Solr server via HTTP POST it causes an error. Below is the code: SolrInputDocument solrdoc = new SolrInputDocument(); solrdoc.addField("url", request.getParameter(URL)); 2009-9-23 21:18:03 org

Re: Exact match

2009-09-23 Thread AHMET ARSLAN
> Hi, >   > I am doing exact search in Solr .In Solr admin page I  am > giving the search input string for search. > For ex: I am giving “channeL12” as search input string > in solr home page it displays search results as >   > >   http://rediff >   first >   channeL12 > >   > As there is a matc

Exact match

2009-09-23 Thread bhaskar chandrasekar
Hi,   I am doing exact search in Solr .In Solr admin page I  am giving the search input string for search. For ex: I am giving “channeL12” as search input string in solr home page it displays search results as     http://rediff   first   channeL12   As there is a matching input for “channeL12”

Re: Oracle incomplete DataImport results

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:53 PM, Daniel Bradley wrote: > After investigating the log files, the DataImporter was throwing an error > from the Oracle DB driver: > > java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB > to RAW conversion (actual: 2890, maximum: 2000) > > Aka

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:50 PM, Ninad Raut wrote: > Is this feature included in SOLR 1.4?? > Yep. -- Regards, Shalin Shekhar Mangar.

RE: Oracle incomplete DataImport results

2009-09-23 Thread Daniel Bradley
After investigating the log files, the DataImporter was throwing an error from the Oracle DB driver: java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 2890, maximum: 2000) Aka. There was a problem with the 551st item where a related item had

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Ninad Raut
Is this feature included in SOLR 1.4?? On Wed, Sep 23, 2009 at 3:29 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut >wrote: > > > Hi, > > When we have news content crawled we face a problem of same content being > > repeated in many docume

Phrase stopwords

2009-09-23 Thread Pooja Verlani
Hi, Is it possible to have a phrase as a stopword in Solr? If so, please share how to do it. regards, Pooja

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote: > Hi, > When we have news content crawled we face a problem of same content being > repeated in many documents. We want to add a near duplicate document > filter > to detect such documents. Is there a way to do that in SOLR? > Look at http://wik

Finding near duplicates which searching Documents

2009-09-23 Thread Ninad Raut
Hi, When we have news content crawled we face a problem of the same content being repeated in many documents. We want to add a near duplicate document filter to detect such documents. Is there a way to do that in SOLR? Regards, Ninad Raut.

Re: Highlighting not working on a prefix_token field

2009-09-23 Thread Avlesh Singh
> > I'm sorry I don't understand the question. Do you mean to say that > highlighting works with one but not with another? > Yes. Cheers Avlesh On Wed, Sep 23, 2009 at 12:59 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Wed, Sep 23, 2009 at 12:31 PM, Avlesh Singh wrote: > > >

Re: Highlighting not working on a prefix_token field

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 12:31 PM, Avlesh Singh wrote: > Hmmm .. But ngrams with KeywordTokenizerFactory instead of the > WhitespaceTokenizerFactory work just as fine. Related issues? > > I'm sorry I don't understand the question. Do you mean to say that highlighting works with one but not with an

Re: Solr with Auto-suggest

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 11:30 AM, dharhsana wrote: > > Hi Ryan, > > I gone through your post > https://issues.apache.org/jira/browse/SOLR-357 > > where you mention about prefix filter,can you tell me how to use that > patch,and you mentioned to use the code as bellow, > > positionIncrementGap="1"

Re: Highlighting not working on a prefix_token field

2009-09-23 Thread Avlesh Singh
Hmmm .. But ngrams with KeywordTokenizerFactory instead of the WhitespaceTokenizerFactory work just as fine. Related issues? Cheers Avlesh On Wed, Sep 23, 2009 at 12:27 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Wed, Sep 23, 2009 at 12:23 PM, Avlesh Singh wrote: > > > I hav