Re: Document field data not getting indexed

2007-11-30 Thread Phillip Farber
Well this one falls into the category of bald-faced embarrassment. It's a bug in my process. Thanks to all for taking the time to respond. Have I said how great Solr support is? :-) Phil Phillip Farber wrote: Hi Yonik, Hoss, et al. I'm using numItems=2000 in the luke url so I am seeing a

Re: Multi-core Solr Wiki docs?

2007-11-30 Thread Ryan McKinley
Otis Gospodnetic wrote: Hi, I was looking for the Wiki docs about the multi-core stuff Henri contributed, but couldn't find any. Do we just not have that yet? I found http://wiki.apache.org/solr/MultipleIndexes , but that's "the old way". Currently it is only possible programmatically. Ch

Re: Solr Highlighting, word index

2007-11-30 Thread Erick Erickson
Oh, good luck on this! I've had similar issues and have just thrown up my hands. How do you expect to be able to correlate a word in the index with the bounding box in the OCR? I'm not sure this is a solved problem unless your OCR is *very* regular and clean. Even if you can calculate the ordinal p

Embedded SOLR - Commit issue

2007-11-30 Thread Sunny Bassan
I have implemented the embedded SOLR approach for indexing of database records. I am indexing approximately 10 million records, querying and indexing 20,000 records at a time. Each record is added to the updateHandler via the updateHandler.addDoc() function once all 20,000 records have been added

Tips for searching

2007-11-30 Thread Dave C .
Hello, I was wondering if someone could offer some suggestions for a problem I am having. When I perform a query on a field using a single character, I get no results back: for example, if I search on the "description" field for the character 'a', i.e. (description:a), I get no results back.

Re: Tips for searching

2007-11-30 Thread Mike Klaas
On 30-Nov-07, at 4:43 PM, Dave C. wrote: Thanks for the quick response Mike... Ideally it should match more than just a single character, i.e. "the" in "weather" or "pro" in "profile" or "000" in "18000". Would these cases be taken care of by the StopFilterFactory? No... you are looking f

RE: Tips for searching

2007-11-30 Thread Dave C .
Actually, yes, I want to be able to match *any* single character, not just 'a'. The StopFilterFactory wiki doesn't help with that, apparently...correct? -David > From: [EMAIL PROTECTED] > To: solr-user@lucene.apache.org > Subject: RE: Tips for searching > Date: Sat, 1 Dec 2007 00:43:21 + >

RE: Tips for searching

2007-11-30 Thread Dave C .
Thanks for the quick response Mike... Ideally it should match more than just a single character, i.e. "the" in "weather" or "pro" in "profile" or "000" in "18000". Would these cases be taken care of by the StopFilterFactory? Thanks, david > From: [EMAIL PROTECTED] > Subject: Re: Tips for sear

RE: Solr Highlighting, word index

2007-11-30 Thread Owens, Martin
> Or I'm just completely off base here. A little, we already have the locations for each word on every ocr, we just need the word index to feed into the existing program. Best Regards, Martin Owens

RE: LowerCaseFilterFactory and spellchecker

2007-11-30 Thread Norskog, Lance
What would also help is a query to find records for the spellcheck dictionary builder. We would like to make separate spelling indexes for all records in English, one in Spanish, etc. We would also like to slice and dice the records by other dimensions as well, and have separate spelling DBs for each p

Multi-core Solr Wiki docs?

2007-11-30 Thread Otis Gospodnetic
Hi, I was looking for the Wiki docs about the multi-core stuff Henri contributed, but couldn't find any. Do we just not have that yet? I found http://wiki.apache.org/solr/MultipleIndexes , but that's "the old way". Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Re: What does "Missing sort order" error mean

2007-11-30 Thread Yonik Seeley
On Nov 30, 2007 4:40 PM, Phillip Farber <[EMAIL PROTECTED]> wrote: > Just getting my head around Solr queries. I get HTTP ERROR: 400 > Missing sort order. when I issue this query: > > http://localhost:8983/solr/select/?q=car&sort=title order is asc or desc so try sort=title asc or sort=title des
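
A minimal sketch of the fix described in the reply above: the sort parameter needs a direction (asc or desc) after the field name. The helper name build_select_url is hypothetical and only for illustration; the base URL, query, and field name are taken from the question. The space between field and direction is URL-encoded as "+".

```python
from urllib.parse import urlencode


def build_select_url(base, query, sort_field, direction="asc"):
    """Build a select URL whose sort parameter includes a direction.

    build_select_url is a hypothetical helper, not part of Solr or any client.
    """
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    # "title" alone triggers "Missing sort order"; "title asc" does not.
    params = {"q": query, "sort": f"{sort_field} {direction}"}
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("http://localhost:8983/solr/select/", "car", "title"))
```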

Re: Solr Highlighting, word index

2007-11-30 Thread Mike Klaas
On 30-Nov-07, at 1:02 PM, Owens, Martin wrote: Hello everyone, We're working to replace the old Linux version of dtSearch with Lucene/Solr, using the http requests for our perl side and java for the indexing. The functionality that is causing the most problems is the highlighting since

Re: SOLR 1.2 - Updates sent containing fields that are not on the Schema fail silently

2007-11-30 Thread Ryan McKinley
Daniel Alheiros wrote: Hi Hoss. I'm using Solr 1.2 and a SolrJ client built from the trunk some time ago (21st of June 2007). One thing I just thought of do you have a request handler defined at: If not, it uses a legacy 1.1 format that returns a 200 response code and puts the error

Re: Document field data not getting indexed

2007-11-30 Thread Yonik Seeley
On Nov 30, 2007 9:03 AM, Phillip Farber <[EMAIL PROTECTED]> wrote: > I'm using numItems=2000 in the luke url so I am seeing all the items in > the index or at least up through c in the alphabet: If Luke is sorting by high term, you wouldn't necessarily see it. Regardless, the search you did below

Re: Local Disk and SAN

2007-11-30 Thread John Stewart
Jae, We recently benchmarked local, SAN and NFS using a real-world Lucene-based benchmark. For searching we found that SAN was marginally slower than local disks, about 1% slower, while for adding documents the SAN was 3x faster, doubtless because of the high parallelism in the writes. I would s

Re: Document field data not getting indexed

2007-11-30 Thread Phillip Farber
Hi Yonik, Hoss, et al. I'm using numItems=2000 in the luke url so I am seeing all the items in the index or at least up through c in the alphabet: http://localhost:8983/solr/admin/luke?fl=ocr&numTerms=2000 When I index all 22 of my documents including doc id=44 which contains the word "Camp

Re: Boosting and copy fields

2007-11-30 Thread Robert Young
Right, ok, thanks. On Nov 30, 2007 2:14 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > > On Nov 30, 2007 7:18 AM, Robert Young <[EMAIL PROTECTED]> wrote: > > How does the copy field work with boosted fields? If I have three fields > > with different boost values and they all get copied into a copy field

Re: Boosting and copy fields

2007-11-30 Thread Yonik Seeley
On Nov 30, 2007 7:18 AM, Robert Young <[EMAIL PROTECTED]> wrote: > How does the copy field work with boosted fields? If I have three fields > with different boost values and they all get copied into a copy field, > are these boosts taken into account during searching? They are all multiplied together (

Boosting and copy fields

2007-11-30 Thread Robert Young
Hi, How does the copy field work with boosted fields? If I have three fields with different boost values and they all get copied into a copy field, are these boosts taken into account during searching? Cheers Rob

Re: SOLR 1.2 - Updates sent containing fields that are not on the Schema fail silently

2007-11-30 Thread Daniel Alheiros
Hi Hoss. I'm using Solr 1.2 and a SolrJ client built from the trunk some time ago (21st of June 2007). When a document is indexed I can see that INFO message on my logs showing exactly what you said, but nothing is logged in this situation I've described initially. I'm using this logging conf:

Re: Solr Highlighting, word index

2007-11-30 Thread Erick Erickson
It's good you already have the data because if you somehow got it from some sort of calculations I'd have to tell my product manager that the feature he wanted that I told him couldn't be done with our data was possible after all ... About page breaks: Another approach to paging is to index a spe

Re: Distribution without SSH?

2007-11-30 Thread Marcus Stratmann
Justin Knoll wrote: We plan to attempt to rewrite the snappuller (and possibly other distribution scripts, as required) to eliminate this dependency on SSH. I thought I'd ask the list in case anyone has experience with this same situation or any insights into the reasoning behind requiring SSH ac

Local Disk and SAN

2007-11-30 Thread Jae Joo
Hi, I have about 20 GB of index with 1 million transactions per day. I am choosing between local disk and a SAN-based system (not NFS) for storage. Is there any performance difference between running a Solr instance with a 20 GB index on local disk and on SAN-based disk which is connected with fiber ch

Solr Highlighting, word index

2007-11-30 Thread Owens, Martin
Hello everyone, We're working to replace the old Linux version of dtSearch with Lucene/Solr, using the http requests for our perl side and java for the indexing. The functionality that is causing the most problems is the highlighting since we're not storing the text in solr (only indexing) and

What does "Missing sort order" error mean

2007-11-30 Thread Phillip Farber
Hi, Just getting my head around Solr queries. I get HTTP ERROR: 400 Missing sort order. when I issue this query: http://localhost:8983/solr/select/?q=car&sort=title My title field is: stored="false" required="true"/> where "myAlphaOnlySort" is based on the alphaOnlySort from the example

Re: LowerCaseFilterFactory and spellchecker

2007-11-30 Thread Mike Klaas
That's a pretty difficult proposition. Currently the spellcheck doesn't look at documents at all: only the top-level term&count data is used to create the index. Adding select-by-query would be considerably more complicated and expensive (I think a near-full iteration of TermDocs would be

Re: Tips for searching

2007-11-30 Thread Mike Klaas
On 30-Nov-07, at 4:27 PM, Dave C. wrote: When I perform a query on a field using a single character, I get no results back: for example, if I search on the "description" field for the character 'a', i.e. (description:a), I get no results back. I would like to get all documents where the le