Re: Deleting with the DIH sometimes doesn't delete

2010-08-13 Thread Qwerky
I'm using Solr 1.4.1 and I've got about 280,000 docs in the index. I'm using a multi-core setup (if that makes any difference) with 2 cores. When I check the stats from the JSP, my updateHandler reports 3 deletes (cumulative_deletesById: 3), but when I search from the admin page the docs are still found
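A common cause of this symptom (the updateHandler counts the delete, but a search still finds the doc) is a missing commit: in Solr, deletes only become visible to searchers once a commit opens a new searcher, and on a multi-core setup the update must also be sent to the same core being searched. A minimal sketch of the two update messages, reusing the doc id from the related thread below as a placeholder:

```xml
<!-- POST to the update handler of the core being searched,
     e.g. /solr/<core>/update -->
<delete><id>594039</id></delete>

<!-- then, as a second request, make the delete visible to searchers -->
<commit/>
```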

Deleting with the DIH sometimes doesn't delete

2010-08-12 Thread Qwerky
I'm doing deletes with the DIH but getting mixed results. Sometimes the documents get deleted; other times I can still find them in the index. What would prevent a doc from being deleted? For example, I delete 594039 and get this in the logs: 2010-08-12 14:41:55,625 [Thread-210] INFO [DataImp
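For reference, DIH has two built-in delete paths: a deletedPkQuery run during delta-import, and the $deleteDocById special command a custom EntityProcessor can emit as a row. A data-config.xml sketch of the first (the entity, table, and column names here are assumptions, not from the post); note that either path still needs a commit before the deletes show up in searches:

```xml
<entity name="item" pk="id"
        query="SELECT id, title FROM item"
        deltaQuery="SELECT id FROM item
                    WHERE updated > '${dataimporter.last_index_time}'"
        deletedPkQuery="SELECT id FROM item_deleted
                        WHERE deleted > '${dataimporter.last_index_time}'">
  <field column="id" name="id"/>
  <field column="title" name="title"/>
</entity>
```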

Re: Multi word synonyms

2010-08-04 Thread Qwerky
It would be nice if you could configure some kind of filter to be processed before the query string is passed to the parser. The QueryComponent class seems a nice place for this: a filter could be run against the raw query, and the ResponseBuilder's queryString value could be modified before the QParse
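The idea in the post can be illustrated with a self-contained sketch: a pre-parse rewrite that maps multi-word phrases in the raw query string before any parser tokenizes it. The class name QueryPreFilter and the synonym entries are hypothetical; this is not part of Solr's API, just the shape of the rewrite being proposed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical pre-parse rewrite: replace multi-word source phrases
// in the raw query before the query parser splits on whitespace.
public class QueryPreFilter {
    private final Map<String, String> synonyms = new LinkedHashMap<>();

    public QueryPreFilter() {
        // multi-word source phrase -> single replacement term (example data)
        synonyms.put("exercise dvds", "fitness");
    }

    public String rewrite(String rawQuery) {
        String q = rawQuery;
        for (Map.Entry<String, String> e : synonyms.entrySet()) {
            // case-insensitive, whole-phrase, literal replacement
            q = q.replaceAll("(?i)" + Pattern.quote(e.getKey()), e.getValue());
        }
        return q;
    }
}
```

In a real SearchComponent this rewrite would run in prepare(), swapping the modified string back into the ResponseBuilder before parsing.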

Multi word synonyms

2010-08-03 Thread Qwerky
I'm having trouble getting multi-word synonyms to work. As an example, I have the following synonym: exercise dvds => fitness. When I search for exercise dvds, I want to return all docs in the index which contain the keyword fitness. I've read the wiki about solr.SynonymFilterFactory, which recommen
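The usual explanation for this failure: at query time the parser splits "exercise dvds" on whitespace before the analyzer runs, so the SynonymFilterFactory never sees the whole phrase, which is why the wiki steers multi-word synonyms to the index-time analyzer. A schema.xml sketch of that setup (the fieldType name is an assumption; the synonyms.txt line would be an equivalence like `exercise dvds, fitness` so docs containing either form index both):

```xml
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- multi-word synonyms applied at index time only -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```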

Re: Delta import processing duration

2010-07-23 Thread Qwerky
I found my problem! It was a bad custom EntityProcessor I wrote. My EntityProcessor wasn't checking for hasNext() on the Iterator from my FileImportDataImportHandler; it was just returning next(). The second bug was that when the Iterator ran out of records, it was returning an empty Map (it now r
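The two bugs described above can be shown in a self-contained sketch. DIH treats a null row from nextRow() as end-of-data, so the fix is to guard next() with hasNext() and return null when the iterator is exhausted. RowSource is a hypothetical stand-in for the custom EntityProcessor so the example compiles without Solr on the classpath.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Stand-in for the custom EntityProcessor's row loop. The original bug
// was calling next() unconditionally and returning an empty Map at the
// end; DIH only stops pulling rows when nextRow() returns null.
public class RowSource {
    private final Iterator<Map<String, Object>> rows;

    public RowSource(List<Map<String, Object>> data) {
        this.rows = data.iterator();
    }

    public Map<String, Object> nextRow() {
        if (!rows.hasNext()) {
            return null;  // signals end-of-data, as DIH expects
        }
        return rows.next();
    }
}
```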

Delta import processing duration

2010-07-22 Thread Qwerky
I'm using Solr to index data from our data warehouse. The data is imported through text files. I've written a custom FileImportDataImportHandler that extends DataSource, and it works fine: I've tested it with 280,000 records and it manages to build the index in about 3 minutes. My problem is that
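As a rough illustration of what such a handler hands to DIH, here is a self-contained iterator that turns delimited lines into row maps. The field names ("id", "title") and the pipe delimiter are assumptions since the real file format isn't shown in the post, and a production version would stream lines from a BufferedReader instead of holding them in a List.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical row iterator for a file-based import: each delimited
// line becomes one DIH-style row map.
public class LineRowIterator implements Iterator<Map<String, Object>> {
    private final Iterator<String> lines;

    public LineRowIterator(List<String> lines) {
        this.lines = lines.iterator();
    }

    @Override
    public boolean hasNext() {
        return lines.hasNext();
    }

    @Override
    public Map<String, Object> next() {
        // split into at most two fields: id|title
        String[] parts = lines.next().split("\\|", 2);
        Map<String, Object> row = new HashMap<>();
        row.put("id", parts[0]);
        row.put("title", parts.length > 1 ? parts[1] : null);
        return row;
    }
}
```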