Hi Mohsin,
There's some work in progress for in-place updates to docValues fields,
https://issues.apache.org/jira/browse/SOLR-5944. Can you try the latest
patch there (or ping me if you need a git branch)?
It would be nice to know how fast the updates go for your use case with that
patch. Please note that for that patch, both the version field and the
updated field need to have stored=false, indexed=false, docValues=true.
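
For reference, here is a rough sketch of how a batch of per-document update
requests could be built. It uses Solr's standard atomic-update JSON syntax
(the "set" and "inc" modifiers). Whether the patch accepts exactly this
request shape is an assumption on my part, and the field names (tenant_s,
hits_l), the "id" key field and the helper names are all made up for
illustration.

```python
import json

# Modifiers used in this sketch; Solr's atomic-update syntax has others too.
ALLOWED_OPS = ("set", "inc")


def atomic_update(doc_id, field, op, value, id_field="id"):
    """Build one atomic-update document: {id_field: doc_id, field: {op: value}}.

    'id_field' defaults to "id", which is only an assumption about the
    schema's uniqueKey.
    """
    if op not in ALLOWED_OPS:
        raise ValueError("unsupported modifier: %r" % (op,))
    return {id_field: doc_id, field: {op: value}}


def build_batch(doc_ids, field, op, value):
    """Serialize a batch of atomic updates as a JSON array.

    The resulting string is what would be POSTed to the core's update
    handler with Content-Type: application/json.
    """
    return json.dumps([atomic_update(d, field, op, value) for d in doc_ids])


if __name__ == "__main__":
    # Hypothetical example: replace one field's value for three documents.
    print(build_batch(["doc1", "doc2", "doc3"], "tenant_s", "set", "new-value"))
    # Hypothetical example: increment a numeric field by 1.
    print(build_batch(["doc1"], "hits_l", "inc", 1))
```

Sending many documents per request rather than one request per document
usually helps throughput, but the right batch size is something to measure.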
Regards,
Ishan


On Thu, Mar 17, 2016 at 10:55 PM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> It would be nice to have a wiki/doc for "Bulk Field Update" that listed all
> of these techniques and tricks.
>
> And, of course, it would be so much better to have an explicit Lucene
> feature for this. It could work in the background like merge and process
> one segment at a time as efficiently as possible.
>
> Have several modes:
>
> 1. Set a field of all documents to an explicit value.
> 2. Set a field of documents matching a query to an explicit value.
> 3. Increment by n.
> 4. Add a new field to all documents, or maybe by query.
> 5. Delete an existing field for all documents.
> 6. Delete a field value for all documents or a specified query.
>
>
> -- Jack Krupansky
>
> On Thu, Mar 17, 2016 at 12:31 PM, Ken Krugler <kkrugler_li...@transpac.com>
> wrote:
>
> > As others noted, currently updating a field means deleting and inserting
> > the entire document.
> >
> > Depending on how you use the field, you might be able to create another
> > core/container with that one field (plus the key field), and use join
> > support.
> >
> > Note that https://issues.apache.org/jira/browse/LUCENE-6352 is an
> > improvement, which looks like it's in the 5.x code line, though I don't
> > see a fix version.
> >
> > -- Ken
> >
> > > From: Mohsin Beg Beg
> > > Sent: March 16, 2016 3:52:47pm PDT
> > > To: solr-user@lucene.apache.org
> > > Subject: how to update billions of docs
> > >
> > > Hi,
> > >
> > > I have a requirement to replace a value of a field in 100B's of docs in
> > > 100's of cores.
> > > The field is multiValued=false docValues=true type=StrField stored=true
> > > indexed=true.
> > >
> > > Atomic Updates performance is on the order of 5K docs per sec per core
> > > in solr 5.3 (other fields are quite big).
> > >
> > > Any suggestions ?
> > >
> > > -Mohsin
> >
> >
> > --------------------------
> > Ken Krugler
> > +1 530-210-6378
> > http://www.scaleunlimited.com
> > custom big data solutions & training
> > Hadoop, Cascading, Cassandra & Solr
> >
>
