Re: Composition of multiple smaller fields into another larger field?

2008-05-06 Thread Brian Johnson
Thank you for the reference to the ${foo} format. I am looking at trying to minimize the redundant data in my document feed since I have lots of records with an overall small footprint per record. This simple change can save me maybe 20% of my data set size. It also provides a mechanism to isol

Re: SOLR-470 & default value in schema with NOW (update)

2008-05-06 Thread Brian Johnson
Unfortunately that data set is long gone, but I can say that I am quite sure the data was consistently sent to Solr with 3 digits of millis when I provided the data in the documents. I confirmed this using luke and the data was consistent, but the exception persisted. I looked into the associate

Re: Solr (text) <> RDMBS (dynamic data) - best practies?

2008-05-06 Thread Ryan McKinley
* write response using custom response writer? (this may not be right, I'd have to check) that grabs the extra data from cache and includes it with each hit Not a custom response writer... use a custom QueryComponent to augment the document. Localsolr has a good example of this

Re: top documented in faceted query?

2008-05-06 Thread Chris Hostetter
: I could then get the top document for each value by issuing a sequence of : queries : q=x&fq=f:a&row=1 : q=x&fq=f:b&row=1 : q=x&fq=f:c&row=1 : ... : : Is there a way to do this in one query? Only if you write your own plugin ... Solr doesn't have anything to do it for you. -Hoss
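The one-query-per-value approach described in the question can be sketched as below. This is only an illustration: the function name, the host, and the `/solr/select` path are assumptions, not taken from the thread. Note that the post writes `row=1`, while the Solr parameter that limits the number of returned documents is `rows`.

```python
from urllib.parse import urlencode


def top_doc_queries(select_url, q, field, values):
    """Build one request URL per field value, each asking for a single row."""
    urls = []
    for value in values:
        # fq restricts the main query to one value of the field;
        # rows=1 keeps only the top-scoring document for that value.
        params = {"q": q, "fq": f"{field}:{value}", "rows": 1}
        urls.append(f"{select_url}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    # Hypothetical endpoint, used only to show the shape of the URLs.
    base = "http://localhost:8983/solr/select"
    for url in top_doc_queries(base, "x", "f", ["a", "b", "c"]):
        print(url)
```

This still costs one round trip per value, which is the limitation the reply points out.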

Re: Sorting results

2008-05-06 Thread Chris Hostetter
: I perform the search like Matahari. The returned results may include "A big : life: Matahari", "War and Matahari", "Matahari" (in that order). How can I : return results by sorting first the results that match the beginning of : string? I want to score higher the results that start with sea

Re: stemming the synonyms

2008-05-06 Thread Chris Hostetter
: things related to vacation. However, when I enter in travelling it does : not find anything related to vacation, I assume it's because I'm not : explicitly putting travelling in the synonyms file. Is there a way to : activate stemming for all of the synonym terms in the file without having t

Re: SOLR-470 & default value in schema with NOW (update)

2008-05-06 Thread Chris Hostetter
: Second Try: : * same date column setup : * 2 files uploaded into the index. Updated the file with the timestamps : to be 3 digit millis to 'match' what NOW was supposed to be doing. I : left the other file alone. : --> got the exception.. check data in Luke to confirm it was all 3 digit : mi

Re: Multiple Index creation

2008-05-06 Thread Shalin Shekhar Mangar
Hi Vajinath, I believe you want multiple schemas. Take a look at http://wiki.apache.org/solr/MultiCore Note that this feature is available only with the Solr 1.3 trunk code. With Solr 1.2, you can have two instances of Tomcat or two solr webapps deployed in one Tomcat instance. You can also thin

Re: access control list

2008-05-06 Thread Chris Hostetter
: I thought of that method. The problem I was thinking of is that if a new : customer is added, that could potentially cause an update of about : 2,000,000 records or so. Fortunately, this does not happen every day. It FWIW: at some point in the future, LUCENE-1231 might make this type of thin

Re: Your valuable suggestion on autocomplete

2008-05-06 Thread Walter Underwood
Query logs are full of junk. We fill from the correct values in the search index. We used to fill directly from the DB, but there were updates in the DB that weren't in Solr. Every two hours, it does a search for "type:movie" and retrieves the title field for every match. Those are loaded into the

Re: Solr (text) <> RDMBS (dynamic data) - best practies?

2008-05-06 Thread igrigorik
Otis Gospodnetic wrote: > AideRSS, eh, nice, welcome :) ;-) > > Since for 1) you will have to go to your DB, why not just store the > retrieved data somewhere (JVM, memcached...) and simply re-use it for 2? > * get query > * get data from DB for filtering > * store data from DB in cache > * run

Multiple Index creation

2008-05-06 Thread Vaijanath N. Rao
Hi All, I searched the Solr archive but could not find out how to create multiple indexes within Solr. With Lucene I can create an IndexWriter with a new index, and hence have multiple indexes and allow searching across them. How can I create in So

Re: complex queries

2008-05-06 Thread Kevin Osborn
Unfortunately, I don't know value1, value2, value3, etc. This goes back to my question about access control lists. So, I have all my documents, which are products. And then someone suggested that I have a separate user document type with a multi-value field of productIDs. In SQL, this would be

Re: Solr (text) <> RDMBS (dynamic data) - best practies?

2008-05-06 Thread Otis Gospodnetic
AideRSS, eh, nice, welcome :) Since for 1) you will have to go to your DB, why not just store the retrieved data somewhere (JVM, memcached...) and simply re-use it for 2? * get query * get data from DB for filtering * store data from DB in cache * run query * write response using custom respo

Re: Composition of multiple smaller fields into another larger field?

2008-05-06 Thread Otis Gospodnetic
Brian, I think most people would just manipulate the data prior to sending it to Solr for indexing but you don't want that. Your composeField proposal looks fine to me - I can't think of a problem there. It sounds like you are asking about the language/syntax for field specification. Coul

Re: Help optimizing

2008-05-06 Thread Otis Gospodnetic
Daniel, The main difference is that string type fields are not tokenized, while text type fields are. Example: input text: milk with honey is god String fields will end up with a single token: "milk with honey is god" Text fields will end up with 5 tokens (assuming no stop word filtering)
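The string/text distinction above can be illustrated with a deliberately simplified sketch. It assumes plain whitespace splitting and no stop word filtering; a real Solr text field runs whatever analyzer chain the schema configures, so the function names and behaviour here are illustrative only.

```python
def string_field_tokens(value):
    """A string field is not tokenized: the whole value is one token."""
    return [value]


def text_field_tokens(value):
    """A text field is tokenized; here approximated by whitespace splitting."""
    return value.split()


if __name__ == "__main__":
    text = "milk with honey is god"
    print(string_field_tokens(text))
    print(text_field_tokens(text))
```

With the example input from the message, the first call yields a single token and the second yields five, which is why an exact-match lookup suits string fields and word-level search suits text fields.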

Re: Help optimizing

2008-05-06 Thread Otis Gospodnetic
Daniel - regarding query time - yes, look at the response (assuming you are using XML responses) and look for "QTime" in the top part of the response. That's the number of milliseconds it took to execute the query. This time does not include the network time (request to Solr + time to send the
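A small sketch of pulling that value out of an XML response follows. In Solr's XML output the element is named `QTime` and sits inside the `responseHeader` list; the sample document and the function name below are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative response body, not taken from the thread.
SAMPLE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">12</int>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>"""


def query_time_ms(xml_text):
    """Return the QTime value in milliseconds, or None if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='responseHeader']/int[@name='QTime']")
    return int(node.text) if node is not None else None


if __name__ == "__main__":
    print(query_time_ms(SAMPLE))
```

Comparing this number with the wall-clock time measured by the client gives a rough idea of how much of a slow request is spent outside query execution.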

RE: Help optimizing

2008-05-06 Thread Lance Norskog
There are two integer types, 'sint' and 'integer'. On an integer, you cannot do a range check (that makes sense). But! Lucene sort makes an array of integers for every record. On an integer field, it creates an integer array. On any other kind of field, each array item has a lot more. So, if you

Re: Searching for empty fields

2008-05-06 Thread Brendan Grainger
Hi, Not sure if this is what you want, but to search for 'empty' fields we use something like this: (*:* AND -color:[* TO *]) Hope that helps. Brendan On May 6, 2008, at 6:43 PM, Daniel Andersson wrote: Hi (again) One of the fields in my database is color. It can either contain a valu
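The query suggested above can be generated for any field name. The sketch below only builds the query string and the encoded request parameters; the helper names and the default `rows` value are assumptions for illustration.

```python
from urllib.parse import urlencode


def missing_field_query(field):
    """Match all documents, then exclude those with any value in `field`."""
    return f"(*:* AND -{field}:[* TO *])"


def select_params(field, rows=10):
    """URL-encode the query so it can be appended to a select request."""
    return urlencode({"q": missing_field_query(field), "rows": rows})


if __name__ == "__main__":
    print(missing_field_query("color"))
    print(select_params("color"))
```

The leading `*:*` matters because a purely negative clause has nothing to subtract from; the range `[* TO *]` matches every document that has some value in the field.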

Re: complex queries

2008-05-06 Thread Erik Hatcher
On May 6, 2008, at 8:57 PM, Kevin Osborn wrote: I don't think this is possible, but I figure that I would ask. So, I want to find documents that match a search term and where a field in those documents is also in the results of a subquery. Basically, I am looking for the Solr equivalent of

complex queries

2008-05-06 Thread Kevin Osborn
I don't think this is possible, but I figure that I would ask. So, I want to find documents that match a search term and where a field in those documents is also in the results of a subquery. Basically, I am looking for the Solr equivalent of doing a SQL IN clause. As I said, I don't think it

Solr (text) <> RDMBS (dynamic data) - best practies?

2008-05-06 Thread igrigorik
We're investigating migrating from an RDBMS to Solr to add text search support, as well as to offload the text storage from our RDBMS (which is arguably not designed for this kind of stuff). While whiteboarding the basic requirements, we realized that we have some 'special' requirements: Basic set

Re: Welcome, Koji

2008-05-06 Thread Koji Sekiguchi
Hi Erik and everyone! I'm looking forward to working with you. :) Cheers, Koji Erik Hatcher wrote: A warm welcome to our newest Solr committer, Koji Sekiguchi! He's been providing solid patches and improvements to Solr and the Ruby (solr-ruby/Flare) integration for a while now. Erik

Re: multi-language searching with Solr

2008-05-06 Thread Mike Klaas
On 5-May-08, at 1:28 PM, Eli K wrote: Wouldn't this impact both indexing and search performance and the size of the index? It is also probable that I will have more than one free text field later on and with at least 20 languages this approach does not seem very manageable. Are there other opt

Searching for empty fields

2008-05-06 Thread Daniel Andersson
Hi (again) One of the fields in my database is color. It can either contain a value (blue, red etc) or be blank. When I perform a search with facet counts on, I get a count for "_empty_". How do I go about searching for this? I've tried color:"" which gives me an error. Same with color:.

Re: Help optimizing

2008-05-06 Thread Daniel Andersson
On May 6, 2008, at 7:26 PM, Lance Norskog wrote: One cause of out-of-memory is multiple simultaneous requests. If you limit the query stream to one or two simultaneous requests, you might fix this. No, Solr does not have an option for this. The servlet containers have controls for this that

Re: Help optimizing

2008-05-06 Thread Daniel Andersson
On May 6, 2008, at 2:19 PM, Grant Ingersoll wrote: On May 3, 2008, at 1:06 PM, Daniel Andersson wrote: When performing a search, the results vary between 1.5 seconds up to 60 seconds. Is this pure Solr time or overall application time? I ask, b/c it is often the case that people are mea

Re: Help optimizing

2008-05-06 Thread Daniel Andersson
On May 6, 2008, at 4:00 AM, Mike Klaas wrote: On 3-May-08, at 10:06 AM, Daniel Andersson wrote: How do I optimize Solr to better use all the RAM? I'm using java6, 64bit version, and start Solr using: java -Xmx7500M -Xms4096M -jar start.jar But according to top it only seems to be using 7.

Re: Help optimizing

2008-05-06 Thread Daniel Andersson
Thanks Otis! On May 4, 2008, at 4:32 AM, Otis Gospodnetic wrote: You have a lot of fields of type text, but a number of fields sound like they really need not be tokenized and should thus be of type string. I've changed quite a few of them over to string. Still not sure about the differe

Re: Multiple SpellCheckRequestHandlers

2008-05-06 Thread Otis Gospodnetic
I don't think so. I just prefer shorter (cleaner?) URLs. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: solr_user <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Tuesday, May 6, 2008 3:35:43 PM > Subject: Re: Multiple SpellChec

Re: Multiple SpellCheckRequestHandlers

2008-05-06 Thread solr_user
Thanks Otis, Actually, I am planning to make use of the qt parameter to specify which handler should be used for the query. Would there be any downside to that? Otis Gospodnetic wrote: > > Hello, > > If you configured "/sc1" and "/sc2", then use something like > http://../sc1?. fo

Re: Help optimizing

2008-05-06 Thread Otis Gospodnetic
Hello, If you are using Jetty, you don't have to dig very deep - just look for the section about threads. Here is a snippet from Jetty 6.1.9's jetty.xml: 10 50 25 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr

Re: Multiple SpellCheckRequestHandlers

2008-05-06 Thread Otis Gospodnetic
Hello, If you configured "/sc1" and "/sc2", then use something like http://../sc1?. for the first one and http://./sc2? for the second one. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: solr_user <[EMAIL PROTECTED]> > To:

Composition of multiple smaller fields into another larger field?

2008-05-06 Thread Brian Johnson
I am interested in using the suggest feature against a composition of other more granular facets. Let me provide an example to help explain my problem and proposed approaches. Say I have a set of facets for these artifacts: So far things work OK. Now I want my suggest feature to wor

RE: multi-language searching with Solr

2008-05-06 Thread Tim Mahy
Hi, you could also use multiple Solr instances having specific settings and stopwords etc for the same field and upload your documents to the correct instance and then merge the indexes into one searchable index ... greetings, Tim From: Eli K [EMAIL PR

Re: Delete's increase while adding new documents

2008-05-06 Thread Mike Klaas
On 6-May-08, at 4:56 AM, Tim Mahy wrote: Hi all, it seems that we get errors during the auto-commit : java.io.FileNotFoundException: /opt/solr/upload/nl/archive/data/index/_4x.fnm (No such file or directory) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAcc

Re: Multiple SpellCheckRequestHandlers

2008-05-06 Thread solr_user
And how do I specify in the query which requesthandler to use? Otis Gospodnetic wrote: > > Yes, just define two instances (with two distinct names) in solrconfig.xml > and point each of them to a different index. > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > -

RE: Help optimizing

2008-05-06 Thread Lance Norskog
One cause of out-of-memory is multiple simultaneous requests. If you limit the query stream to one or two simultaneous requests, you might fix this. No, Solr does not have an option for this. The servlet containers have controls for this that you have to dig very deep to find. Lance Norskog

Welcome, Koji

2008-05-06 Thread Erik Hatcher
A warm welcome to our newest Solr committer, Koji Sekiguchi! He's been providing solid patches and improvements to Solr and the Ruby (solr-ruby/Flare) integration for a while now. Erik

Re: Your valuable suggestion on autocomplete

2008-05-06 Thread Otis Gospodnetic
Hi Wunder, - Original Message > From: Walter Underwood <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Tuesday, May 6, 2008 11:21:31 AM > Subject: Re: Your valuable suggestion on autocomplete > > I wrote a prefix map (ternary search tree) in Java and load it with > queries

Re: multi-language searching with Solr

2008-05-06 Thread Eli K
Peter, Thanks for your help, I will prototype your solution and see if it makes sense for me. Eli On Mon, May 5, 2008 at 5:38 PM, Binkley, Peter <[EMAIL PROTECTED]> wrote: > It won't make much difference to the index size, since you'll only be > populating one of the language fields for each do

Re: Your valuable suggestion on autocomplete

2008-05-06 Thread Walter Underwood
I wrote a prefix map (ternary search tree) in Java and load it with queries to Solr every two hours. That keeps the autocomplete and search index in sync. Our autocomplete gets over 25M hits per day, so we don't really want to send all that traffic to Solr. wunder On 5/6/08 2:37 AM, "Nishant Son
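For readers unfamiliar with the data structure, here is a minimal ternary search tree with prefix completion. It is a sketch of the general technique, not the Java implementation described in the message; the class and method names are made up, and the titles are assumed to have been fetched from the search index beforehand.

```python
class _Node:
    __slots__ = ("ch", "lo", "eq", "hi", "is_end")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_end = False


class TernarySearchTree:
    """In-memory prefix map: insert words, then list completions of a prefix."""

    def __init__(self):
        self._root = None

    def insert(self, word):
        if word:
            self._root = self._insert(self._root, word, 0)

    def _insert(self, node, word, i):
        ch = word[i]
        if node is None:
            node = _Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1)
        else:
            node.is_end = True
        return node

    def complete(self, prefix, limit=10):
        """Return up to `limit` stored words starting with `prefix`, sorted."""
        if not prefix:
            return []
        node, i = self._root, 0
        while node is not None:
            ch = prefix[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(prefix):
                    break  # node now matches the last prefix character
                node = node.eq
        if node is None:
            return []
        results = [prefix] if node.is_end else []
        self._collect(node.eq, prefix, results, limit)
        return results[:limit]

    def _collect(self, node, path, results, limit):
        # In-order walk: smaller characters, this character, larger characters.
        if node is None or len(results) >= limit:
            return
        self._collect(node.lo, path, results, limit)
        if node.is_end and len(results) < limit:
            results.append(path + node.ch)
        self._collect(node.eq, path + node.ch, results, limit)
        self._collect(node.hi, path, results, limit)


if __name__ == "__main__":
    tree = TernarySearchTree()
    for title in ["matahari", "matrix", "mad max", "war"]:
        tree.insert(title)
    print(tree.complete("mat"))
```

Rebuilding the tree on a schedule from a search such as the one described keeps completions limited to values that actually exist in the index. Inserting words in sorted order makes this naive version unbalanced, so a production version would likely shuffle or balance the input.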

[poll] Change logging to SLF4J?

2008-05-06 Thread Ryan McKinley
Hello- There has been a long running thread on solr-dev proposing switching the logging system to use something other than JDK logging. http://www.nabble.com/Solr-Logging-td16836646.html http://www.nabble.com/logging-through-log4j-td13747253.html We are considering using http://www.slf4j.org/. C

Re: Help optimizing

2008-05-06 Thread Grant Ingersoll
On May 3, 2008, at 1:06 PM, Daniel Andersson wrote: Hi (again) people We've now invested in a server with 8 GB of RAM after too many OutOfMemory-errors. Our database/index is 3.5 GB and contains 4,352,471 documents. Most documents are less than 1 kb. When performing a search, the results

RE: Delete's increase while adding new documents

2008-05-06 Thread Tim Mahy
Hi all, it seems that we get errors during the auto-commit : java.io.FileNotFoundException: /opt/solr/upload/nl/archive/data/index/_4x.fnm (No such file or directory) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)

Re: Your valuable suggestion on autocomplete

2008-05-06 Thread Nishant Soni
Just FYI, we have also implemented a Trie approach (outside of solr, even though our mail search uses solr) at the link in the signature. You can try out the auto-completion working on the comparison tool on the home page. - nishant www.reviewgist.com - Original Message From: Va

Re: Your valuable suggestion on autocomplete

2008-05-06 Thread Vaijanath N. Rao
Hi Rantjil Bould, I would suggest considering a Trie data structure, which is commonly used for auto-complete. Hitting Solr for every prefix looks like a time-consuming job, but I might be wrong. I have a Trie implementation and it works very fast (of course it is an in-memory data structure, unlike solr