Re: how large can the index be?

2009-01-08 Thread Rayudu
Hi, with 20 fields for 3 million docs, how much time will indexing and searching take (searching all indices, whether distributed or on a single node)? Assume one has to fire 3 million queries in this case, which return all the docs. I just want to know the metrics. I think if we have

Re: Clustering Carrot2 + Solr

2009-01-08 Thread Jean-Philip EIMECKE
Thanks for the answers. So, I downloaded Solr from SVN -> https://svn.apache.org/repos/asf/lucene/solr/trunk/ and applied the patch SOLR-769, and I get these error messages: debian:/home/jeimecke/Desktop/solr# patch -p 0 -i SOLR-769.patch patching file NOTICE.txt Hunk #5 FAILED at 106. 1 out of 5 hu

Does search query return specific result.?

2009-01-08 Thread Kalidoss MM
Hi, We are trying to implement an auto-suggest feature in our application that uses Solr as the core engine for search. The XML is structured as follows: QLrKnirLDEo9DThnL2h Description Cat1 Cat2 Kalidoss Kaling Soundoss We transform the same in solr understandable fo

Re: Solr query for date

2009-01-08 Thread prerna07
My requirement is to fetch records within a range of 45 days. 1) ?q=date_field:[NOW TO NOW-45DAYS] is not returning any results 2) ?q=date_field:[NOW TO NOW+45DAYS] is throwing an exception. However, I get correct results when I run the following query: ?q=date_field:[* TO NOW] Please suggest the c

Query regarding Spelling Suggestions

2009-01-08 Thread Deshpande, Mukta
Hi, I am using Wordnet dictionary for spelling suggestions. The dictionary is converted to Solr index with only one field "word" and stored in location /data/syn_index, using syns2Index.java program available at http://www.tropo.com/techno/java/lucene/wordnet.html I have added the "word" fie

Re: Solr query for date

2009-01-08 Thread Akshay
On Thu, Jan 8, 2009 at 3:38 PM, prerna07 wrote: > > > > My requirement is to fetch records within a range of 45 days. > > 1) ?q=date_field:[NOW TO NOW-45DAYS] is not returning any results This works for me if you interchange the range limits, viz. [NOW-45DAYS TO NOW] > > 2) ?q=date_field:[NOW TO
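
The point about interchanging the range limits can be sketched in a few lines of Python. This is only an illustration: the helper name `date_range_query` is hypothetical, and `date_field` is the field name used in the thread.

```python
def date_range_query(field, lower, upper):
    """Build a range clause with the lower (earlier) bound on the left."""
    return f"{field}:[{lower} TO {upper}]"


# Records from the last 45 days: NOW-45DAYS is the earlier date,
# so it has to come first in the range.
last_45_days = date_range_query("date_field", "NOW-45DAYS", "NOW")
print(last_45_days)
```

The resulting string still has to be URL-encoded before it is placed in the `q` parameter of a request.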

Querying back with top few results in the same XMLWriter!

2009-01-08 Thread Pooja Verlani
Hi, I am using a ranking algorithm, implemented by modifying the XMLWriter, that takes the top 3 results, queries with those 3 results, and then presents the final result as a function of the results from these 3 queries. Can anyone tell me if I can take the top 3 results and query with them in the

Re: Solr query for date

2009-01-08 Thread prerna07
1) [NOW-45 TO NOW] works for me now. 2) [NOW TO NOW+45DAYS] is still throwing the following exception: -- message org.apache.lucene.queryParser.ParseException: Cannot parse 'dateToTest_product_s:[NOW TO NOW 45DAYS]': Encountered "45DAYS" at line 1, column 33. Was ex

Re: Solr query for date

2009-01-08 Thread Akshay
On Thu, Jan 8, 2009 at 4:46 PM, prerna07 wrote: > > > 1) [NOW-45 TO NOW] works for me now. > 2) [NOW TO NOW+45DAYS] is still throwing the following exception: > > -- > message org.apache.lucene.queryParser.ParseException: Cannot parse > 'dateToTest_product_s:[NOW TO N
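
The exception text quoted above shows `NOW 45DAYS` where the request had `NOW+45DAYS`, which is what happens when a literal `+` in a URL query string is decoded as a space. A minimal sketch using only Python's standard library (the field name `date_field` comes from the earlier message; nothing here is Solr-specific) shows the effect and one way to avoid it:

```python
from urllib.parse import parse_qs, urlencode

query = "date_field:[NOW TO NOW+45DAYS]"

# Sent raw, the "+" is decoded as a space by the receiving side.
raw = parse_qs("q=" + query)["q"][0]
print(raw)

# urlencode() percent-encodes "+" as %2B, so it survives decoding.
encoded = urlencode({"q": query})
print(encoded)

roundtrip = parse_qs(encoded)["q"][0]
print(roundtrip == query)
```

Typing `%2B` in place of `+` directly in the browser address bar should have the same effect.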

Problem in Out Put of Search

2009-01-08 Thread rohit arora
Hi, I have installed Solr Lucene 1.3. I am facing a problem while searching: it does not return multiple records. Instead of returning multiple records, it returns a single record multiple times. With regards, Rohit Arora

Re: Problem in Out Put of Search

2009-01-08 Thread Erik Hatcher
Please provide an example of what you mean. What and how did you index? What was the query? Erik On Jan 8, 2009, at 8:34 AM, rohit arora wrote: Hi, I have installed Solr Lucene 1.3. I am facing a problem while searching: it does not return multiple records. Instead of providi

Re: Does search query return specific result.?

2009-01-08 Thread Kalidoss MM
Any update on this?? Please let me know? On Thu, Jan 8, 2009 at 3:34 PM, Kalidoss MM wrote: > Hi, > > We are trying to implement an auto-suggest feature in our application that > uses Solr as the core engine for search. > > The XML is structured as follows: > > > QLrKnirLDEo9DThnL2h > > Descri

Re: Clustering Carrot2 + Solr

2009-01-08 Thread Grant Ingersoll
Hmm, OK. This is due, I bet, to some source being moved around in trunk and being in a different location in the build area. The trick would be to change the classpath as appropriate in the clustering contrib build. I will try to put up a new patch this weekend. On Jan 8, 2009, at 4:51

Re: Query regarding Spelling Suggestions

2009-01-08 Thread Grant Ingersoll
Did you send in the build command? See http://wiki.apache.org/solr/SpellCheckComponent On Jan 8, 2009, at 5:14 AM, Deshpande, Mukta wrote: Hi, I am using Wordnet dictionary for spelling suggestions. The dictionary is converted to Solr index with only one field "word" and stored in location

Re: Amount range and facet fields returns [facet_fields]

2009-01-08 Thread Yevgeniy Belman
Ah OK, the response I get when executing only the following produces no facet counts. It could be a bug. facet.query=[price:[* TO 500], price:[500 TO *] However, when I add an unrelated facet field, I do get the desired count: query=[price:[* TO 500], price:[500 TO *]],q=*:*,facet.field=cat

Re: Clustering Carrot2 + Solr

2009-01-08 Thread Jean-Philip EIMECKE
Thanks for considering my problem. Cheers, Jean-Philip Eimecke

Re: Date Range query in Solr

2009-01-08 Thread Rayudu
Hi, I too have a similar question on getting query results based on a date range. I have both startDate and endDate fields in my schema, and if I want to get the query results that fall between two date values, e.g. get all the docs whose date is between startDate and endDate, then how can

Querying based on term position possible?

2009-01-08 Thread Mark Tovey
I'm a relative newbie at Solr/Lucene so apologies if this question is overly simplistic. I have an index built and functioning as expected, but I am trying to build a query that can sort/score results based on the search term's position in the document, with a document appearing higher in the result

Re: Querying based on term position possible?

2009-01-08 Thread Otis Gospodnetic
Hello Mark, You could have position information play a role in scoring if you use Span* family of queries. I believe they are currently not supported by Solr, but I believe you could use QSolr + https://issues.apache.org/jira/browse/SOLR-896 to get what you need. As for assigning different we

Solr Replication Performance

2009-01-08 Thread David Giffin
Hi There, I have been building a Solr environment that indexes roughly 3 million products. The current index is roughly 9 GB in size. We have bumped into some performance issues with Solr's Replication. During the Solr slave snapshot installation, query times take longer and may in some cas

2 questions about solr spellcheck

2009-01-08 Thread Qingdi
Hi, I use solr 1.3 and I have two questions about spellcheck. 1) if my index docs are like: university1 UNIVERSITY street1, city1 LOCATION is it possible to build the spell check dictionary using field "NAME" but with filter "TYPE"="UNIVERSITY"? That is, I only want to include the university

Missing high-scoring results in 1.3

2009-01-08 Thread Walter Underwood
I'm seeing a really weird problem with Solr 1.3. The best match for a query will not show up with 10 rows, but will show up if I request more, sometimes 200, sometimes it takes 1000 rows. I tried increasing the row size by 10 and with some of those increments, the first hit would change to a more

Re: Flipping data dirs for an (/multiple) SolrCore without affecting search / IndexReaders

2009-01-08 Thread Kay Kay
Chris Hostetter wrote: : We have an architecture where we want to flip the solr data.dir (massive : dataset) while running and serving search requests with minimal downtime. ... : 1) What is the fastest / best possible way to get step 1 done ,through a : pluggable architecture. : : Curr

Solr on a multiprocessor machine

2009-01-08 Thread smock
Hi All, I'm very new to Solr, and also fairly new to Java and servlet containers, etc. I'm trying to set up Solr on a single machine with a distributed index. My current implementation uses Tomcat as a servlet container with multiple instances of Solr being served. Each instance of Solr is a shar

Re: Solr on a multiprocessor machine

2009-01-08 Thread Yonik Seeley
Distributed search requires more work (more than one pass.) If you weren't CPU bound to begin with, it's definitely going to make things worse by splitting up the index on the same box. -Yonik On Thu, Jan 8, 2009 at 3:53 PM, smock wrote: > > Hi All, > > I'm very new to Solr, and also fairly new

Re: Solr on a multiprocessor machine

2009-01-08 Thread smock
Hi Yonik, Thanks for the reply - could you please give me some more details on what you mean? I was able to obtain a performance boost by distributing Sphinx on the same box, with multiple processors. Each instance of sphinx ran on a different processor, and given that there was a performance b

Beginner: importing own data

2009-01-08 Thread phil cryer
So I have Solr running, I've run through the tutorials online, can import data from the example xml and see the results, so it works! Now, I take some xml data I have, convert it over to the add / doc type that the demo ones are, run it and find out which fields aren't defined in schema.xml, I add

Re: Missing high-scoring results in 1.3

2009-01-08 Thread Walter Underwood
Hmm, this was fixed by restarting Solr. When does Solr/Lucene check for index file formats? We switched from a Lucene 1.9 index to a Lucene 2.4 index without a restart. Could that cause this? wunder On 1/8/09 11:40 AM, "Walter Underwood" wrote: > I'm seeing a really weird problem with Solr 1.3

Re: Beginner: importing own data

2009-01-08 Thread Otis Gospodnetic
Phil, The easiest thing to do at this stage in the Solr learning experience is to restart Solr (servlet container) and redo the search. Results should start showing up then because this will effectively reopen the index. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - O

Re: Solr Replication Performance

2009-01-08 Thread Otis Gospodnetic
Hm, this is becoming a FAQ :) Have you checked recent discussions about this via markmail.org? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: David Giffin > To: solr-user@lucene.apache.org > Sent: Thursday, January 8, 2009 2:04:39 PM > S

Re: Solr on a multiprocessor machine

2009-01-08 Thread Yonik Seeley
On Thu, Jan 8, 2009 at 4:51 PM, smock wrote: > Thanks for the reply - could you please give me some more details on what > you mean? If there isn't enough memory to cache the index in RAM, then your bottleneck could be from retrieving stored fields from disk. Distributed search will make this muc

Overlapping Replication Scripts

2009-01-08 Thread wojtekpia
I have set up cron jobs that update my index every 15 minutes. I have a distributed setup, so the steps are: 1. Update index on indexer machine (and possibly optimize) 2. Invoke snapshooter on indexer 3. Invoke snappuller on searcher 4. Invoke snapinstaller on searcher. These updates are small, d

Re: Solr on a multiprocessor machine

2009-01-08 Thread smock
Assuming I have enough RAM then, should I be able to get a performance boost with my current setup? Basically, the question I am trying to answer is - will the Tomcat+Solr setup I have above utilize multiple processors or do I need to do something else (like having a different tomcat instance for

Re: Solr on a multiprocessor machine

2009-01-08 Thread Walter Underwood
Solr will use multiple processors. Most of your speed will come from cached responses. Use a single instance, test with real query logs, and tune the cache sizes by looking at the cache hit statistics in the statistics page of the Solr admin UI. wunder On 1/8/09 3:37 PM, "smock" wrote: > > Ass
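
As a rough illustration of the tuning advice above, the hit ratio of a cache is simply hits divided by lookups. The sketch below assumes you copy the `lookups` and `hits` counters by hand from the statistics page; the numbers used here are made-up placeholders, not measurements, and the helper name `hit_ratio` is hypothetical.

```python
def hit_ratio(hits, lookups):
    """Fraction of cache lookups that were answered from the cache."""
    return hits / lookups if lookups else 0.0


# Placeholder counters (hits, lookups), for illustration only.
caches = {
    "filterCache": (9200, 10000),
    "queryResultCache": (310, 5000),
}

for name, (hits, lookups) in caches.items():
    print(f"{name}: {hit_ratio(hits, lookups):.2%}")
```

A persistently low ratio on a cache that also reports many evictions is one hint that the cache may be too small for the query mix.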

Re: Solr on a multiprocessor machine

2009-01-08 Thread Mike Klaas
On 8-Jan-09, at 3:37 PM, smock wrote: Assuming I have enough RAM then, should I be able to get a performance boost with my current setup? Basically, the question I am trying to answer is - will the Tomcat+Solr setup I have above utilize multiple processors or do I need to do something el

Re: Solr/Lucene MoreLikeThis with RangeQuery

2009-01-08 Thread Chris Hostetter
: Solr/Lucene. I am in a situation where I think that I can improve the : quality of the LikeThis-documents significantly by restricting the : MoreLikeThis-query to documents where one field has its term in a : specified range. That is, I would like to add a RangeQuery to the : default MoreLikeThi

Re: Dismax Minimum Match/Stopwords Bug

2009-01-08 Thread Chris Hostetter
: Hmm, that makes sense to me - however I still think that even if we have mm : set to "2" and we have "the 7449078" it should still match 7449078 in a : productId field (it does not: : http://zeta.zappos.com/search?department=&term=the+7449078). This seems like : it works against the way one woul

Re: Beginner: importing own data

2009-01-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
Did you explore using SolrJ to index data? http://wiki.apache.org/solr/Solrj or DataImportHandler: http://wiki.apache.org/solr/DataImportHandler On Fri, Jan 9, 2009 at 3:49 AM, Otis Gospodnetic wrote: > Phil, > > The easiest thing to do at this stage in the Solr learning experience is to > restart

Re: Solr on a multiprocessor machine

2009-01-08 Thread smock
Mike, I should have more than enough RAM to fit the index in, I don't think my searches will be IO bound. One question - just to make sure I understand - did you use one Jetty instance per shard? In my case, what I'm doing is using one Tomcat instance to run multiple Solr webapps. I'm not sur

Re: Solr on a multiprocessor machine

2009-01-08 Thread Yonik Seeley
On Thu, Jan 8, 2009 at 9:25 PM, smock wrote: > I should have more than enough RAM to fit the index in, I don't think my > searches will be IO bound. There is still overhead to distributed search - if the actual CPU bound search/faceting stuff isn't your bottleneck, or if the index is too small, t

Re: Problem with WT parameter when upgrading from Solr1.2 to solr1.3

2009-01-08 Thread Chris Hostetter
: I just upgraded my system from Solr 1.2 to Solr 1.3. I am using the same : plugin for the queryResponseWriter that I used in Solr1.2. Problem here is : that when I am using *wt* parameter as the plugin name with full package : then I don't get the response which I used to get in 1.2 and when I d

Re: Solr on a multiprocessor machine

2009-01-08 Thread smock
Yonik, I don't mean to be argumentative - just trying to understand, what is the difference between distributed search across processors, and distributed search across boxes (again, assuming that my searches are truly CPU bound)? My only basis for comparison is sphinx, which I was able to get to

Re: Solr on a multiprocessor machine

2009-01-08 Thread Yonik Seeley
On Thu, Jan 8, 2009 at 10:03 PM, smock wrote: > I don't mean to be argumentative - just trying to understand, what is the > difference between distributed search across processors, and distributed > search across boxes (again, assuming that my searches are truly CPU bound)? Even if your searches

Re: Problem with WT parameter when upgrading from Solr1.2 to solr1.3

2009-01-08 Thread Yonik Seeley
On Thu, Jan 8, 2009 at 9:40 PM, Chris Hostetter wrote: > you have a custom response writer you had working in > Solr 1.2, and now you are trying to use that same custom response writer in > Solr 1.3 with distributed requests? Right, that's probably the crux of it - distributed search required som

Re: Missing high-scoring results in 1.3

2009-01-08 Thread Yonik Seeley
On Thu, Jan 8, 2009 at 5:07 PM, Walter Underwood wrote: > Hmm, this was fixed by restarting Solr. That's a bit spooky. > When does Solr/Lucene check for index file formats? We switched from > a Lucene 1.9 index to a Lucene 2.4 index without a restart. Could that > cause this? It should be whene

Re: Flipping data dirs for an (/multiple) SolrCore without affecting search / IndexReaders

2009-01-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Kay, do you wish to change the data dir or just the index dir? It is possible to switch indexDir with minimal overhead. dataDir can contain an index.properties file, which can contain a property called 'index' that points to your new index. But you will have to find a way to populate this index. Replica

Re: Solr on a multiprocessor machine

2009-01-08 Thread smock
Hi Yonik, I see, I didn't realize that there was a 2nd phase to retrieve stored values. Sphinx also queries the top n number of documents and combines the results - unless the algorithm is very different, I wouldn't expect that this adds a lot of overhead as sphinx has a very definite performance
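
The two-phase idea discussed in this thread (collect the top n hits from every shard, merge them, then retrieve stored values only for the winners) can be sketched generically. This is a toy illustration, not Solr's or Sphinx's actual implementation; the function names, the `(score, doc_id)` tuples and the in-memory `stores` dict are all assumptions made for the example.

```python
import heapq


def merge_top_n(shard_results, n):
    """Phase 1: merge per-shard (score, doc_id) lists into a global top n."""
    all_hits = (hit for hits in shard_results for hit in hits)
    return heapq.nlargest(n, all_hits)


def fetch_stored_fields(doc_ids, stores):
    """Phase 2: look up stored fields only for the winning documents."""
    return [stores[doc_id] for doc_id in doc_ids]


# Toy data standing in for two shards on the same box.
shard_a = [(0.9, "a1"), (0.4, "a2")]
shard_b = [(0.7, "b1"), (0.6, "b2")]
stores = {"a1": {"name": "A1"}, "a2": {"name": "A2"},
          "b1": {"name": "B1"}, "b2": {"name": "B2"}}

top = merge_top_n([shard_a, shard_b], 3)
docs = fetch_stored_fields([doc_id for _, doc_id in top], stores)
print(top)
print(docs)
```

Even in this toy form, every query touches every shard once for scoring and then again for the stored fields, which is the extra work a single undivided index avoids.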

RE: Query regarding Spelling Suggestions

2009-01-08 Thread Deshpande, Mukta
Yes. I send the build command as: http://localhost:8080/solr/select/?q=documnet&spellcheck=true&spellcheck.build=true&spellcheck.count=2&spellcheck.q=parfect&spellcheck.dictionary=dict The Tomcat log shows: Jan 9, 2009 9:55:19 AM org.apache.solr.core.SolrCore execute INFO: [] webapp=/solr path=
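
Long request URLs like the one above are easy to break when they wrap across lines. One way to avoid that is to assemble the query string from a dictionary; the sketch below uses only the host, path and parameters quoted in the message and Python's standard library, and does not actually send the request.

```python
from urllib.parse import urlencode

# Parameters exactly as given in the message above.
params = {
    "q": "documnet",
    "spellcheck": "true",
    "spellcheck.build": "true",
    "spellcheck.count": 2,
    "spellcheck.q": "parfect",
    "spellcheck.dictionary": "dict",
}

url = "http://localhost:8080/solr/select/?" + urlencode(params)
print(url)
```

Comparing the printed URL against the path and params that appear in the Tomcat log is a quick check that every parameter arrived intact.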

Re: Solr on a multiprocessor machine

2009-01-08 Thread Yonik Seeley
Maybe we should back up a bit and look at your requirements: both query latency and throughput. If the index is small enough, distributed search is definitely not the first step to take to address performance issues - there are many other things to look into first. Start by looking at what queries

Re: Solr on a multiprocessor machine

2009-01-08 Thread smock
Hi Yonik, In some ways I have a 'small index' (~8 million documents at the moment). However, I have a lot of attributes (currently about 30, but I'm expecting that number to keep growing) and am interested in faceting across all of them for every search (on a completely unrelated note, if you h

Re: Solr on a multiprocessor machine

2009-01-08 Thread Yonik Seeley
Are you on Solr 1.3 or a recent nightly build? The development version of 1.4 has a number of scalability enhancements. -Yonik On Fri, Jan 9, 2009 at 12:18 AM, smock wrote: > > Hi Yonik, > > In some ways I have a 'small index' (~8 million documents at the moment). > However, I have a lot of at

Re: Solr on a multiprocessor machine

2009-01-08 Thread smock
I'm using 1.3 - are the nightly builds stable enough to use in production? yonik wrote: > > Are you on Solr 1.3 or a recent nightly build? The development > version of 1.4 has a number of scalability enhancements. > > -Yonik > > On Fri, Jan 9, 2009 at 12:18 AM, smock wrote: >> >> Hi Yonik,

Re: Problem in Out Put of Search

2009-01-08 Thread rohit arora
Hi, it gives this output: 5.361002 8232 Quality Testing International Quality Testing International the ideal exhibition for measuring technique testing of materials and quality assurance. Profile for exhibit include Customer profiling; customer marketing; loyalty systems

Re: Problem in Out Put of Search

2009-01-08 Thread Shalin Shekhar Mangar
There are two documents in that response. Are you adding the same document multiple times to Solr? You can also specify a uniqueKey in the schema.xml which will make sure that Solr keeps only one document for a given key and removes the duplicate documents. In the response you have pasted, the 'i

Querying Solr Index for date fields

2009-01-08 Thread Rayudu
Hi All, I have a field which is solr.DateField in my schema file. If I want to get the docs for a given date, e.g. get all the docs whose date value is 2009-01-09, then how can I query my index? As Solr's date format is yyyy-mm-ddThh:mm:ss, if I give the date as 2009-01-09T00

Re: Problem in Out Put of Search

2009-01-08 Thread rohit arora
Hi, I have added one document only a single time, but the output provided by Lucene gives me the same document multiple times. If I specify rows=2, the same document appears 2 times in the output. If I specify rows=10, the same document appears 10 times. I have already defined the 'id' field as a unique

Re: Querying Solr Index for date fields

2009-01-08 Thread Akshay
You will have to URL encode the string correctly and supply the date in the format Solr expects. Please check this: http://wiki.apache.org/solr/SolrQuerySyntax On Fri, Jan 9, 2009 at 12:21 PM, Rayudu wrote: > > Hi All, > I have a field which is solr.DateField in my schema file. If I want to > get th
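
A minimal sketch of that advice, assuming the field holds UTC timestamps in the form `2009-01-09T00:00:00Z` and that matching "a given date" means a range covering the whole day. The helper name `day_range_query` and the field name `date_field` are hypothetical.

```python
from datetime import date
from urllib.parse import urlencode


def day_range_query(field, day):
    """Range clause covering one calendar day, bounds in UTC with a trailing Z."""
    start = f"{day.isoformat()}T00:00:00Z"
    end = f"{day.isoformat()}T23:59:59Z"
    return f"{field}:[{start} TO {end}]"


q = day_range_query("date_field", date(2009, 1, 9))
print(q)

# URL-encode before putting it into the request.
encoded = urlencode({"q": q})
print(encoded)
```

Because the upper bound stops at a whole second, values stored with a fractional part in the last second of the day would fall outside this range.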