Actually, this problem occurs even when I am doing just deletes. I tested by
sending only one delete query for a single catalogue and had the same
problem. I always optimize once.

I changed to the syntax you suggested ({!term f=catalogueId}Emory Labs)
and it works like a charm. Thanks for the pointer; it saved me from another
issue that could have occurred at some point.
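
For anyone finding this thread later, here is a rough sketch of how the two
safe forms of the delete query string can be built. The class and method
names (SolrDeleteQueries, termQuery, phraseQuery) are made up for
illustration and are not part of SolrJ; only the resulting strings are what
gets passed to deleteByQuery.

```java
// Hypothetical helper, not part of SolrJ: it only builds query strings.
public final class SolrDeleteQueries {
    private SolrDeleteQueries() {}

    /** {!term} form: the value after the closing brace is matched as one raw term. */
    static String termQuery(String field, String value) {
        return "{!term f=" + field + "}" + value;
    }

    /** Quoted phrase form for the standard parser; escapes backslashes and double quotes. */
    static String phraseQuery(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // prints {!term f=catalogueId}Emory Labs
        System.out.println(termQuery("catalogueId", "Emory Labs"));
        // prints catalogueId:"Emory Labs"
        System.out.println(phraseQuery("catalogueId", "Emory Labs"));
        // In the nightly job the string would then be handed to SolrJ, e.g.
        // getSolrServer().deleteByQuery(termQuery("catalogueId", "Emory Labs"));
    }
}
```

The unquoted form (catalogueId:Emory Labs) is the one to avoid, since the
second word ends up being searched in the default field.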

Thanks.



On Thu, Sep 27, 2012 at 12:30 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Wild shot in the dark....
>
> What happens if you switch from StreamingUpdateSolrServer to
> HttpSolrServer?
>
> What I'm wondering is if somehow you're getting a queueing problem. If you
> have multiple threads defined for SUSS, it might be possible (and I'm
> guessing) that the delete bit is getting sent after some of the adds.
> Frankly I doubt this is the case, but this issue is so weird that I'm
> grasping at straws.
>
> BTW, there's no reason to optimize twice. Actually, the new thinking is
> that optimizing usually isn't necessary anyway. But if you insist on
> optimizing there's no reason to do it _both_ after the deletes and after
> the adds, just do it after the adds.
>
> Best
> Erick
>
> On Thu, Sep 27, 2012 at 4:31 AM, Kissue Kissue <kissue...@gmail.com>
> wrote:
> > #What is the field type for that field - string or text?
> >
> > It is a string type.
> >
> > Thanks.
> >
> > On Wed, Sep 26, 2012 at 8:14 PM, Jack Krupansky <j...@basetechnology.com>
> > wrote:
> >
> >> What is the field type for that field - string or text?
> >>
> >>
> >> -- Jack Krupansky
> >>
> >> -----Original Message----- From: Kissue Kissue
> >> Sent: Wednesday, September 26, 2012 1:43 PM
> >>
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Items disappearing from Solr index
> >>
> >> # It is looking for documents with "Emory" in the specified field OR
> >> "Labs" in the default search field.
> >>
> >> This does not seem to be the case. For instance issuing a deleteByQuery
> >> for catalogueId: "PEARL LINGUISTICS LTD" also deletes the contents of a
> >> catalogueId with the value: "Ncl_MacNaughtonMcGregorCoaching_vf010811".
> >>
> >> Thanks.
> >>
> >> On Wed, Sep 26, 2012 at 2:37 PM, Jack Krupansky <j...@basetechnology.com>
> >> wrote:
> >>
> >>> It is looking for documents with "Emory" in the specified field OR "Labs"
> >>> in the default search field.
> >>>
> >>> -- Jack Krupansky
> >>>
> >>> -----Original Message----- From: Kissue Kissue
> >>> Sent: Wednesday, September 26, 2012 7:47 AM
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Re: Items disappearing from Solr index
> >>>
> >>>
> >>> I have just solved this problem.
> >>>
> >>> We have a field called catalogueId. One possible value for this field
> >>> could be "Emory Labs". I found out that when the following delete by
> >>> query is sent to solr:
> >>>
> >>> getSolrServer().deleteByQuery(catalogueId + ":" + Emory Labs)
> >>>
> >>> [Notice that there are no quotes surrounding the catalogueId value -
> >>> Emory Labs]
> >>>
> >>> For some reason this delete by query ends up deleting the contents of
> >>> some other random catalogues too, which is the reason why we are losing
> >>> items from the index. When the query is changed to:
> >>>
> >>> getSolrServer().deleteByQuery(catalogueId + ":" + "Emory Labs"),
> >>>
> >>> then it starts to correctly delete only items in the Emory Labs catalogue.
> >>>
> >>> So my first question is, what exactly does deleteByQuery do in the
> >>> first query without the quotes? How is it determining which catalogues
> >>> to delete?
> >>>
> >>> Secondly, shouldn't the correct behaviour be not to delete anything at
> >>> all in this case, since when a search is done for the same catalogueId
> >>> without the quotes it just simply returns no results?
> >>>
> >>> Thanks.
> >>>
> >>>
> >>> On Mon, Sep 24, 2012 at 3:12 PM, Kissue Kissue <kissue...@gmail.com>
> >>> wrote:
> >>>
> >>>  Hi Erick,
> >>>
> >>>>
> >>>> Thanks for your reply. Yes, I am using delete by query. I am currently
> >>>> logging the number of items to be deleted before handing off to solr.
> >>>> And from the solr logs I can see it deleted exactly that number. I will
> >>>> verify further.
> >>>>
> >>>> Thanks.
> >>>>
> >>>>
> >>>> On Mon, Sep 24, 2012 at 1:21 PM, Erick Erickson
> >>>> <erickerick...@gmail.com> wrote:
> >>>>
> >>>>
> >>>>  How do you delete items? By ID or by query?
> >>>>
> >>>>>
> >>>>> My guess is that one of two things is happening:
> >>>>> 1> your delete process is deleting too much data.
> >>>>> 2> your index process isn't indexing what you think.
> >>>>>
> >>>>> I'd add some logging to the SolrJ program to see what
> >>>>> it thinks it has deleted or added to the index and go from there.
> >>>>>
> >>>>> Best
> >>>>> Erick
> >>>>>
> >>>>> On Mon, Sep 24, 2012 at 6:55 AM, Kissue Kissue <kissue...@gmail.com>
> >>>>> wrote:
> >>>>> > Hi,
> >>>>> >
> >>>>> > I am running Solr 3.5, using SolrJ and using StreamingUpdateSolrServer
> >>>>> > to index and delete items from solr.
> >>>>> >
> >>>>> > I basically index items from the db into solr every night. Existing
> >>>>> > items can be marked for deletion in the db and a delete request sent to
> >>>>> > solr to delete such items.
> >>>>> >
> >>>>> > My process runs as follows every night:
> >>>>> >
> >>>>> > 1. Check if items have been marked for deletion and delete from solr.
> >>>>> > I commit and optimize after the entire solr deletion runs.
> >>>>> > 2. Index any new items to solr. I commit and optimize after all the
> >>>>> > new items have been added.
> >>>>> >
> >>>>> > Recently I started noticing that huge chunks of items that have not
> >>>>> > been marked for deletion are disappearing from the index. I checked the
> >>>>> > solr logs and the logs indicate that it is deleting exactly the number
> >>>>> > of items requested, but still a lot of other items disappear from the
> >>>>> > index from time to time. Any ideas what might be causing this or what I
> >>>>> > am doing wrong?
> >>>>> >
> >>>>> >
> >>>>> > Thanks.
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>
