largish test data set?

2007-09-17 Thread David Welton
Hi, I'm in the process of evaluating solr and sphinx, and have come to realize that actually having a large data set to run them against would be handy. However, I'm pretty new to both systems, so thought that perhaps asking around might produce something useful. What *I* mean by largish is somethi

solr locked itself out

2007-09-17 Thread vanderkerkoff
Hello everyone. I've been reading some posts on this forum and I thought it best to start my own post as our situation is different from everyone else's, isn't it always :-) We've got a django powered website that has solr as its search engine. We're using the example solr application and star

Re: solr locked itself out

2007-09-17 Thread vanderkerkoff
I found another post that suggested editing the unlockOnStartup value in solrconfig.xml. Is that a wise idea? I'm attaching my .xml config file as well as it might shed some light on the matter http://www.nabble.com/file/p12735259/solrconfig.xml solrconfig.xml -- View this message in contex

Can we build complex filter queries in SOLR

2007-09-17 Thread Dilip.TS
Hi, I would like to know if we can build a complex filter queryString in SOLR using the following condition. (Field1 = "abc" AND Field2 = "def") OR (Field3 = "abcd" AND Field4 = "defgh" AND (...)). so on... Thanks in advance Regards, Dilip TS

Re: solr locked itself out

2007-09-17 Thread Ryan McKinley
vanderkerkoff wrote: I found another post that suggested editing the unlockOnStartup value in solrconfig.xml. Is that a wise idea? If you only have a single solr instance at a time, it should be totally fine.

Re: Can we build complex filter queries in SOLR

2007-09-17 Thread Alessandro Ferrucci
yeah that is possible, I just tried on one of my solr instances..let's say you have an index of player names: (first-name:Tim AND last-name:Anderson) OR (first-name:Anwar AND last-name:Johnson) OR (conference:Mountain West) will give you the results that logically match this query.. HTH. Alessa
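A minimal sketch of how a nested boolean query like the one above could be assembled and URL-encoded on the client side. The field names and values come from the post; the helper names (`field_clause`, `all_of`, `any_of`) are hypothetical, and passing the string as the `q` parameter is only one option (the same string could be sent as `fq` instead). Values are wrapped in double quotes here because an unquoted multi-word value such as Mountain West would otherwise be split, with only the first word applied to the named field.

```python
from urllib.parse import urlencode


def field_clause(field, value):
    """Return field:"value", quoted so a multi-word value stays in one field."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_of(*clauses):
    """Join clauses with AND inside one pair of parentheses."""
    return "(" + " AND ".join(clauses) + ")"


def any_of(*clauses):
    """Join clauses with OR inside one pair of parentheses."""
    return "(" + " OR ".join(clauses) + ")"


# Field names and values taken from the post above.
query = any_of(
    all_of(field_clause("first-name", "Tim"), field_clause("last-name", "Anderson")),
    all_of(field_clause("first-name", "Anwar"), field_clause("last-name", "Johnson")),
    field_clause("conference", "Mountain West"),
)

# URL-encode the query so it can be appended to a select request.
params = urlencode({"q": query})
print(params)
```

Building the expression from small helpers keeps the parentheses balanced as more groups are added, which is the part that tends to go wrong when such strings are concatenated by hand.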

Re: Triggering snapshooter through web admin interface

2007-09-17 Thread Bill Au
There is currently no way to trigger a snapshot through Solr's admin interface. Taking a snapshot is a very light-weight operation. It uses hard links so each snapshot doesn't take up much additional disk space. If you don't want to replicate your index while the big batch job is still running,

Re: largish test data set?

2007-09-17 Thread Grant Ingersoll
You might be interested in the Lucene Java contrib/Benchmark task, which provides an indexing implementation of a download of Wikipedia (available at http://people.apache.org/~gsingers/wikipedia/) It is pretty trivial to convert the indexing code to send add commands to Solr. HTH, Grant

Re: largish test data set?

2007-09-17 Thread Daniel Alheiros
Hi Yonik. Do you have any performance statistics about those changes? Is it possible to upgrade to this new Lucene version using the Solr 1.2 stable version? Regards, Daniel On 17/9/07 17:37, "Yonik Seeley" <[EMAIL PROTECTED]> wrote: > If you want to see what performance will be like on the ne

Re: 'suggest' query sorting

2007-09-17 Thread Matthew Runo
Hello! Were you able to find out anything? I'd be interested to know what you found out. Matthew Runo | Zappos Development | [EMAIL PROTECTED] | 702-943-7833 On Sep 15,

Re: largish test data set?

2007-09-17 Thread Yonik Seeley
If you want to see what performance will be like on the next release, you could try upgrading Solr's internal version of lucene to trunk (current dev version)... there have been some fantastic improvements in indexing speed. For query speed/throughput, Solr 1.2 or trunk should do fine. -Yonik On

Re: largish test data set?

2007-09-17 Thread Karl Wettin
On 17 Sep 2007 at 12:06, David Welton wrote: I'm in the process of evaluating solr and sphinx, and have come to realize that actually having a large data set to run them against would be handy. However, I'm pretty new to both systems, so thought that perhaps asking around might produce something us

Re: Re[2]: multiple indices

2007-09-17 Thread Matt Kangas
Jack, the JNDI-enabling jarfiles now ship as part of the main .zip distribution. There is no need for a separate "JettyPlus" download as of Jetty 6. I used Jetty 6.1.3 (http://dist.codehaus.org/jetty/jetty-6.1.x/jetty-6.1.3.zip) at the time, and I am using only these jarfiles from the mai

Re: commit, concurrency, full text search

2007-09-17 Thread Mike Klaas
On 16-Sep-07, at 11:23 PM, Dilip.TS wrote: Hi, 1) How does commit work with multiple requests? Multiple updates? They block while the commit completes. 2) Does SOLR handle concurrency during updates? It is parallelized as much as possible, yes. 3) Does solr support anything like,

Re: Indexing Speed

2007-09-17 Thread Mike Klaas
On 16-Sep-07, at 8:01 PM, erolagnab wrote: Hi, Just an FYI. I've seen some posts mention that Solr can index 100-150 docs/s and the comparison between embedded solr and HTTP. I've tried to do the indexing with 1.7+ million docs, each doc has 30 fields among which 10 fields are indexed

Re: 'suggest' query sorting

2007-09-17 Thread Ryan McKinley
The prefix query works fine with EdgeNGramFilterFactory, but I'm still not sure how to get the sorting to work. I'm using: maxGramSize="20"/> If you have any ideas on the sorting, let me know! Matthew Runo wrote: Hello! Were you able to find out anythin

RE: Triggering snapshooter through web admin interface

2007-09-17 Thread Wu, Daniel
> There is no way to trigger snapshots taking through Solr's admin interface > now. Taking a snapshot is a very light-weight operation. It uses hard > links so each snapshot doesn't take up much additional disk space. If you [Wu, Daniel] It is not a concern on the snapshot performance. Rather,

Faceting Vs using lucene filters ?

2007-09-17 Thread cricdigs
Hi, I have a collection of blogs. Each Solr document has one blog with 3 fields - blogger(id), title and blog text. The search is performed over all 3 fields. When doing the search I need to show 2 things: 1. Bloggers block with all the matching bloggers (so if a title, blog or blogger contains

RE: Triggering snapshooter through web admin interface

2007-09-17 Thread Chris Hostetter
: I was also suggesting a new feature to allow sending messages to Solr : through http interface and a mechanism for handling the message on the : Solr server; in this case, a message to trigger snapshooter script. It : seems to me, a very useful feature to help simplify operational issues. it's

Re: Faceting Vs using lucene filters ?

2007-09-17 Thread Chris Hostetter
: 1. Bloggers block with all the matching bloggers (so if a title, blog or : blogger contains the search term, I show the blogger's id) : The first block is my problem since it shows multiple instances of the same : blogger if that blogger has multiple matching blogs. I can use faceting to : show

Re: Combining Proximity & Range search

2007-09-17 Thread Chris Hostetter
: My document will have a multivalued compound field like : : revision_01012007 : review_02012007 : : i am thinking of a query like comp:"type:review date:[02012007 TO : 02282007]"~0 your best bet is to change that so "revision" and "review" are the names of a field, and do a range search on t

Re: 'suggest' query sorting

2007-09-17 Thread Chris Hostetter
: How can I boost words where the whole value (not just the token) is closer to : the front of the value? That is, I want 'ca' to return: : 1. Canon PowerShot : 2. Canon EX PIXMA : 3. iPod Cable : 4. Video Card : (actually 1&2 could be swapped) i would argue that you don't want #3 and #4 at

Re: EdgeNGramTokenFilter, term position?

2007-09-17 Thread Chris Hostetter
: Should the EdgeNGramFilter use the same term position for the ngrams within a : single token? i can see the argument going both ways ... imagine a hypothetical CharSplitterTokenFilter that replaces each token in the stream with one token per character in the original token (ie: "hello"

Re: EdgeNGramTokenFilter, term position?

2007-09-17 Thread Yonik Seeley
On 9/16/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > Should the EdgeNGramFilter use the same term position for the ngrams > within a single token? It feels like that is the right approach. I don't see value in having them sequential, and I can think of uses for having them overlap. -Yonik
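To make the two options in this thread concrete, here is a small illustrative sketch of the "overlapping" approach, where every edge n-gram produced from one token is reported at that token's position. This is not Solr's or Lucene's implementation; the function name `edge_ngrams` and its parameters are hypothetical, and the default of 20 for the maximum gram size simply mirrors the `maxGramSize="20"` setting quoted elsewhere in the archive.

```python
def edge_ngrams(tokens, min_gram=1, max_gram=20):
    """Return (gram, position) pairs for front-edge n-grams of each token.

    All grams derived from the same token share that token's position,
    which is the overlapping behaviour discussed in the thread.
    """
    out = []
    for position, token in enumerate(tokens):
        upper = min(max_gram, len(token))
        for size in range(min_gram, upper + 1):
            out.append((token[:size], position))
    return out


grams = edge_ngrams(["canon", "powershot"], min_gram=2, max_gram=3)
for gram, position in grams:
    print(position, gram)
```

With sequential positions instead, each gram would get its own increasing position, so a phrase query spanning two original tokens would no longer line up with the positions of the source text.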

Re: Control index/store at document level

2007-09-17 Thread Chris Hostetter
: nope, the field options are created on startup -- you can't change them : dynamically (i don't know all the details, but I think it is a file format : issue, not just a configuration issue) In the underlying Lucene library most of these options can be controlled per document, but Solr simplifi

Re: Solr - rudimentary problems

2007-09-17 Thread Chris Hostetter
: The corresponding entry for this field in schema.xml is : : i'm guessing "text" is from the example schema.xml ... this is not a good type to use for a uniqueId field ... that alone might be causing some of your problems with replacing docs ... try "string" : 2) Also, at the time of dele

Re: context-relative solr/home

2007-09-17 Thread Chris Hostetter
: As you can see, I've used a variable called ${context.name}, so that : the different contexts can use the same fragment, but be named something : different. This allows for a much simpler deployment, where no : context.xml needs to be generated and packaged into the war at deploy : time. Un

Re: solr locked itself out

2007-09-17 Thread Adrian Sutton
ulimit is unlimited and cat /proc/sys/fs/file-max 11769 I just went through the same kind of mistake - ulimit doesn't report what you think it does, what you should check is ulimit -n (the -n isn't just the option to set the value). If you're using bash as your shell that will almost certa
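As a cross-check on the point above, the per-process open-file limit (what `ulimit -n` reports) can also be read programmatically. This is a minimal sketch using Python's standard `resource` module, which is available on Unix-like systems only; the helper names `open_file_limits` and `describe` are hypothetical.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = open_file_limits()
print(f"open files: soft={describe(soft)} hard={describe(hard)}")
```

The soft value is the one a process actually runs into; the system-wide figure in /proc/sys/fs/file-max is a separate, global ceiling.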

UserTagDesign

2007-09-17 Thread Karl Wettin
I've been looking at on and off for a while and think all the use cases could be explained with simple UML class diagram semantics: [Taggable](tag:Tag)-- {0..*} |--- {0..*} --(tag:Tag)[Tagger] |

Re: 'suggest' query sorting

2007-09-17 Thread Ryan McKinley
if you really want #3 and #4 to show up, then have two fields: one using whitespace tokenizer, one using keyword tokenizer; both using EdgeNGramFilter ... boost the query to the first field higher than the second field (or just rely on the coordFactor and the fact that "ca" will match on both

RE: Triggering snapshooter through web admin interface

2007-09-17 Thread Wu, Daniel
> -Original Message- > From: Chris Hostetter [mailto:[EMAIL PROTECTED] > Sent: Monday, September 17, 2007 1:28 PM > To: solr-user@lucene.apache.org > Subject: RE: Triggering snapshooter through web admin interface > > > : I was also suggesting a new feature to allow sending messages to

Re: Solr - rudimentary problems

2007-09-17 Thread Venkatraman S
C'est Parfait! .. yes - that was the problem. thanks a lot. I am compiling a complete list of FAQs - will update it in the wiki soon. -vEnKAt On 9/18/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: > > : The corresponding entry for this field in schema.xml is : > : : stored="true" multiValued

Authentication for REST-RPC Webservices

2007-09-17 Thread Dilip.TS
Hi, Has anybody successfully called a REST-RPC Webservice with basic authentication? I would like to know which is better, REST-RPC or REST with SOAP/WSDL, and why. Regards Dilip

RE: Authentication for REST-RPC Webservices

2007-09-17 Thread Dilip.TS
Hi, To add to my earlier query: which would be better, a) using REST-RPC or b) using RESTful Webservices using JAX-WS? Regards Dilip -Original Message- From: Dilip.TS [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 18, 2007 11:41 AM To: solr-user@lucene.apache.org Subject: Authentic

Searching items within the search results with SOLR

2007-09-17 Thread Dilip.TS
Hi, Is it possible to "search items within the search results" using SOLR? If so, how? Thanks in advance, Regards, Dilip