Yup, thanks for the clarification.  I see now that some of the items I listed
under 2 are moot.

On Tue, Sep 18, 2018 at 4:16 PM Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> Uhm, inline:
>
> On 18 September 2018 at 17:05, Dan Brown <d...@likethecolor.com> wrote:
> > 1. Thank you.
> >
> > 2. I think this is what you're looking for.  You'd be able to be more
> > specific than with bin/post.  For instance:
> > a. specify the CSV delimiter, CSV quote character, and multivalued field
> > delimiter
>
> http://lucene.apache.org/solr/guide/7_4/uploading-data-with-index-handlers.html
> separator - (global and field local for multivalued)
> encapsulator - for CSV quote characters
>
> > b. the dynamic-fields feature lets you write plugins in Java to define
> > values (very simple example: combine field values f_name, m_name, l_name
> > to populate a full_name field)
> UpdateRequestProcessors. Your example specifically:
>
> > c. specify field order for mapping onto SOLR fields, data types, date
> > formats of source data; perhaps your CSV headers/JSON keys don't cleanly
> > map to SOLR field names
> > d. flag whether the first row of a CSV is the header and should not be
> > indexed
> > e. use literal values - e.g., instead of having to alter the source data
> > to have a column whose value is "foo" you can configure a field to always
> > have the same literal value for all documents
> > f. set the number of times to retry when there is an error and the amount
> > of time between retries (e.g., sometimes zk was not consistently
> > responsive)
> > g. skip fields - e.g., your data have 10 columns but you only want to
> > index columns 1, 3, 5, and 9
> > h. send soft commits after a specified number of batches
> > i. combine fields to generate the uniqueKey value
> >
> > 3. Yes, atomic updates.  For instance, index data using DIH then use this
> > index to provide additional values to fields in those documents (e.g.,
> > maybe the extra data come from a different data source like BigQuery).
> >
> > I hope this brings more clarity to this tool's features and answers all
> > your questions.  Please ask questions if anyone has more.
> >
> > Dan
> >
> >
> > On Tue, Sep 18, 2018 at 3:21 PM Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA256
> >>
> >> Dan,
> >>
> >> On 9/18/18 2:51 PM, Dan Brown wrote:
> >> > I've been working on this for a while and it's finally in a state
> >> > where it's ready for public consumption.
> >> >
> >> > This is a command line indexer that will index CSV or JSON
> >> > documents: https://github.com/likethecolor/solr-indexer
> >> >
> >> > There are quite a few parameters/options that can be set.
> >> >
> >> > One thing to note is that it will update individual fields.  That
> >> > is, unlike the Data Import Handler, it does not replace entire
> >> > documents.
> >> >
> >> > Please check it out and let me know what you think.
> >>
> >> How is this different from the bin/post tool that ships with Solr?
> >>
> >> Or is that what you meant when you said "this is unlike the Data Import
> >> Handler"?
> >>
> >> AIUI, Solr doesn't support updating a single field in a document. The
> >> document is replaced no matter how hard you try to be surgical about
> >> updating a single field.
> >>
> >> - -chris
> >> -----BEGIN PGP SIGNATURE-----
> >> Comment: GPGTools - http://gpgtools.org
> >> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
> >>
> >> iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAluhXlYACgkQHPApP6U8
> >> pFjIeQ/+PRIx+I+IDW9XTqGNV5TIWYf+yQKC/4JpTV4Ndj7MZLsEEw+cfMvFTvQt
> >> 44dK7CnDKEDgQHZlMccWKd9/Th1k/5g40VMugBMsayRwUc83Onawdi4HQfnig4et
> >> VN0/RaZ/IBo2AThsgEvUNplXYyY3BtyrUt6miiBsVkhKstI/BnmKqZvsRgvVjH0P
> >> K1Xc5F2LNyXswvoIZqd3YmEa9p7CYMy7COsFV9KOeSymKlB7UoHulZqpJ9MRYkmn
> >> YWjc9dHIRjpz5TUrJqWhZUG03uGXGtTnaXEku1Hb98WyIUZcHxkwN8W7qm6/B0CG
> >> inPxfGRFH9EbUdcK4qeXmbQqty2sbKMQ6hogpRd/NEzgSWjDapiEUT1xz+p5V6wG
> >> XM0ILaiLJ8zHJA6oUY0w5SNNyhdnd76CDpCK7T7YBm+aIxUDv9zoj6TLNceEaLi0
> >> SjfI83LvaR1gM/ZeVO77d+1IY9maU1+5m0EZFjAETfMGj5dwYRvBub0Oo6QQuLUm
> >> roF5R5b/bg/WjjPF1n4CJ7gTr/WBMzahKFnnQvoYD3OQqZpoasoEUifPpSd9OgvO
> >> yEok0VqwxPeXdHgE+Vy+BlXn6QqshB3BYnUSNbpFXlNsOIQojfJXkjcCa+dP1nyF
> >> JCElvmEgBG8K1WzGo4WAtVqJs7WDzQlmY2RDrETGsVbnqkTojXA=
> >> =AmkJ
> >> -----END PGP SIGNATURE-----
> >>
>
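
For reference, several of the CSV options from item 2 (delimiter, quote
character, multivalued split, header row, skipped columns, literal values)
correspond to request parameters of Solr's own CSV update handler. Below is a
minimal sketch that only builds the request URL. The host, the collection
name "mycollection", and the column/field names are made-up placeholders, and
none of this is taken from solr-indexer itself.

```python
from urllib.parse import urlencode


def build_csv_update_url(base_url, collection, params):
    """Return the /update URL for a CSV upload with the given parameters."""
    return f"{base_url}/{collection}/update?{urlencode(params)}"


# Hypothetical values; adjust to your own schema and data.
csv_params = {
    "separator": "|",          # column delimiter (item a)
    "encapsulator": "'",       # CSV quote character (item a)
    "f.tags.split": "true",    # treat the "tags" column as multivalued
    "f.tags.separator": ";",   # per-field delimiter inside "tags"
    "header": "true",          # first row is a header, not a document (item d)
    "skip": "col2,col4",       # columns to leave out of the index (item g)
    "literal.source": "foo",   # constant value added to every document (item e)
    "commit": "true",
}

url = build_csv_update_url("http://localhost:8983/solr", "mycollection", csv_params)
print(url)
```

The CSV body itself would then be POSTed to that URL with a Content-type of
application/csv, for example with curl --data-binary.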

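On the single-field question: Solr does accept atomic updates, where the
request names only the fields to change. In the general case Solr still
rebuilds and re-indexes the whole document internally from its stored values,
so both descriptions in this thread are partly right. Here is a sketch of the
JSON body for such a request. The document id and field names are
hypothetical, and this is not necessarily how solr-indexer builds its
requests.

```python
import json


def atomic_update(doc_id, changes, id_field="id"):
    """Build one atomic-update document; each changed field is wrapped in a modifier."""
    doc = {id_field: doc_id}
    for field, (modifier, value) in changes.items():
        doc[field] = {modifier: value}  # e.g. {"set": 9.99} or {"add": "on-sale"}
    return doc


payload = json.dumps([
    atomic_update("book-1", {
        "price": ("set", 9.99),      # replace the current value
        "tags": ("add", "on-sale"),  # append to a multivalued field
    })
])
print(payload)
```

The payload would be POSTed to the collection's /update endpoint with a
Content-type of application/json. Fields not named in the request keep their
existing values, provided the schema stores them (or keeps them as docValues).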