That makes sense on the surface, but Kissue makes a good point.  Shouldn't
the delete match the same documents as the search?  He said no documents
come back when he searches on the phrase, but documents are deleted when he
uses the same phrase.
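
For what it's worth, here is a minimal sketch of how the two query strings differ and one way to build the quoted form. The class and helper names (DeleteQueryBuilder, phraseQuery) are made up for illustration and are not part of SolrJ. The code only builds strings and never talks to Solr.

```java
class DeleteQueryBuilder {

    // Hypothetical helper: wraps the value in double quotes so the query
    // parser treats it as a single phrase within the given field.
    static String phraseQuery(String field, String value) {
        // Escape backslashes first, then double quotes, so the value
        // cannot terminate the phrase early.
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        String field = "catalogueId";
        String value = "Emory Labs";

        // Unquoted: per Jack's explanation, this is read as
        // catalogueId:Emory OR <default field>:Labs
        String naive = field + ":" + value;

        // Quoted: the whole value stays attached to catalogueId.
        String safe = phraseQuery(field, value);

        System.out.println(naive); // catalogueId:Emory Labs
        System.out.println(safe);  // catalogueId:"Emory Labs"
    }
}
```

The quoted string is what would be handed to deleteByQuery, as in Kissue's second call. I believe SolrJ's ClientUtils.escapeQueryChars is another option if the value should be treated as one escaped term rather than a phrase, but I have not tried it against 3.5.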

cheers,
Travis

On Wed, Sep 26, 2012 at 9:37 AM, Jack Krupansky <j...@basetechnology.com> wrote:

> It is looking for documents with "Emory" in the specified field OR "Labs"
> in the default search field.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Kissue Kissue
> Sent: Wednesday, September 26, 2012 7:47 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Items disappearing from Solr index
>
> I have just solved this problem.
>
> We have a field called catalogueId. One possible value for this field could
> be "Emory Labs". I found out that when the following delete by query is
> sent to solr:
>
> getSolrServer().deleteByQuery(catalogueId + ":" + Emory Labs)  [Notice that
> there are no quotes surrounding the catalogueId value - Emory Labs]
>
> For some reason this delete by query ends up deleting the contents of some
> other random catalogues too, which is the reason why we are losing items
> from the index. When the query is changed to:
>
> getSolrServer().deleteByQuery(catalogueId + ":" + "Emory Labs"), then it
> starts to correctly delete only items in the Emory Labs catalogue.
>
> So my first question is, what exactly does deleteByQuery do in the first
> query without the quotes? How is it determining which catalogues to delete?
>
> Secondly, shouldn't the correct behaviour be not to delete anything at all
> in this case since when a search is done for the same catalogueId without
> the quotes it just simply returns no results?
>
> Thanks.
>
>
> On Mon, Sep 24, 2012 at 3:12 PM, Kissue Kissue <kissue...@gmail.com>
> wrote:
>
>> Hi Erick,
>>
>> Thanks for your reply. Yes, I am using delete by query. I am currently
>> logging the number of items to be deleted before handing off to solr. And
>> from the solr logs I can see it deleted exactly that number. I will verify
>> further.
>>
>> Thanks.
>>
>>
>> On Mon, Sep 24, 2012 at 1:21 PM, Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>
>>> How do you delete items? By ID or by query?
>>>
>>> My guess is that one of two things is happening:
>>> 1> your delete process is deleting too much data.
>>> 2> your index process isn't indexing what you think.
>>>
>>> I'd add some logging to the SolrJ program to see what
>>> it thinks it has deleted or added to the index and go from there.
>>>
>>> Best
>>> Erick
>>>
>>> On Mon, Sep 24, 2012 at 6:55 AM, Kissue Kissue <kissue...@gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > I am running Solr 3.5, using SolrJ and using StreamingUpdateSolrServer to
>>> > index and delete items from solr.
>>> >
>>> > I basically index items from the db into solr every night. Existing items
>>> > can be marked for deletion in the db and a delete request sent to solr to
>>> > delete such items.
>>> >
>>> > My process runs as follows every night:
>>> >
>>> > 1. Check if items have been marked for deletion and delete from solr. I
>>> > commit and optimize after the entire solr deletion runs.
>>> > 2. Index any new items to solr. I commit and optimize after all the new
>>> > items have been added.
>>> >
>>> > Recently I started noticing that huge chunks of items that have not been
>>> > marked for deletion are disappearing from the index. I checked the solr
>>> > logs and the logs indicate that it is deleting exactly the number of items
>>> > requested but still a lot of other items disappear from the index from time
>>> > to time. Any ideas what might be causing this or what I am doing wrong?
>>> >
>>> >
>>> > Thanks.
>>>
>>>
>>
>>
>


-- 
Travis Low, Director of Development
<t...@4centurion.com>
Centurion Research Solutions, LLC
14048 ParkEast Circle • Suite 100 • Chantilly, VA 20151
703-956-6276 • 703-378-4474 (fax)
http://www.centurionresearch.com

