It is looking for documents with "Emory" in the specified field OR "Labs" in the default search field.

-- Jack Krupansky

-----Original Message-----
From: Kissue Kissue
Sent: Wednesday, September 26, 2012 7:47 AM
To: solr-user@lucene.apache.org
Subject: Re: Items disappearing from Solr index

I have just solved this problem.

We have a field called catalogueId. One possible value for this field could
be "Emory Labs". I found out that when the following delete by query is
sent to solr:

getSolrServer().deleteByQuery(catalogueId + ":" + Emory Labs)  [Notice that
there are no quotes surrounding the catalogueId value - Emory Labs]

For some reason this delete by query ends up deleting the contents of some
other random catalogues too, which is the reason why we are losing items
from the index. When the query is changed to:

getSolrServer().deleteByQuery(catalogueId + ":" + "Emory Labs"), then it
starts to correctly delete only items in the Emory Labs catalogue.
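
A minimal sketch of building the quoted form in Java follows. Note that the quote characters have to be part of the query string itself: concatenating the Java literal "Emory Labs" onto the field name still sends catalogueId:Emory Labs to Solr. The class QueryQuoting and the method phraseQuery are hypothetical names used for illustration only, not part of SolrJ. SolrJ's ClientUtils.escapeQueryChars may be an alternative if the value should be escaped term by term rather than sent as a phrase.

```java
// Hypothetical helper (not part of SolrJ): builds a phrase query for one field.
final class QueryQuoting {
    private QueryQuoting() {}

    /**
     * Returns field:"value". Backslashes and double quotes inside the value
     * are escaped so that they cannot terminate the phrase early.
     */
    static String phraseQuery(String field, String value) {
        String escaped = value
                .replace("\\", "\\\\")   // escape backslashes first
                .replace("\"", "\\\"");  // then escape embedded double quotes
        return field + ":\"" + escaped + "\"";
    }
}
```

With this helper, getSolrServer().deleteByQuery(QueryQuoting.phraseQuery("catalogueId", "Emory Labs")) would send catalogueId:"Emory Labs", so both words are matched against catalogueId only.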

So my first question is, what exactly does deleteByQuery do in the first
query without the quotes? How is it determining which catalogues to delete?

Secondly, shouldn't the correct behaviour be not to delete anything at all
in this case since when a search is done for the same catalogueId without
the quotes it just simply returns no results?

Thanks.


On Mon, Sep 24, 2012 at 3:12 PM, Kissue Kissue <kissue...@gmail.com> wrote:

Hi Erick,

Thanks for your reply. Yes, I am using delete by query. I am currently
logging the number of items to be deleted before handing off to solr. And
from the solr logs I can see it deleted exactly that number. I will verify
further.

Thanks.


On Mon, Sep 24, 2012 at 1:21 PM, Erick Erickson <erickerick...@gmail.com> wrote:

How do you delete items? By ID or by query?

My guess is that one of two things is happening:
1> your delete process is deleting too much data.
2> your index process isn't indexing what you think.

I'd add some logging to the SolrJ program to see what
it thinks it has deleted or added to the index and go from there.

Best
Erick

On Mon, Sep 24, 2012 at 6:55 AM, Kissue Kissue <kissue...@gmail.com> wrote:
> Hi,
>
> I am running Solr 3.5, using SolrJ and using StreamingUpdateSolrServer to
> index and delete items from solr.
>
> I basically index items from the db into solr every night. Existing items
> can be marked for deletion in the db and a delete request sent to solr to
> delete such items.
>
> My process runs as follows every night:
>
> 1. Check if items have been marked for deletion and delete from solr. I
> commit and optimize after the entire solr deletion runs.
> 2. Index any new items to solr. I commit and optimize after all the new
> items have been added.
>
> Recently I started noticing that huge chunks of items that have not been
> marked for deletion are disappearing from the index. I checked the solr
> logs and the logs indicate that it is deleting exactly the number of items
> requested but still a lot of other items disappear from the index from time
> to time. Any ideas what might be causing this or what I am doing wrong?
>
>
> Thanks.



