1. Thank you.

2. I think this is what you're looking for.  It lets you be more specific
than bin/post does.  For instance:
a. specify the CSV delimiter, CSV quote character, and multivalued field
delimiter
b. the dynamic-fields feature lets you write plugins in Java to define
values (very simple example: combine field values f_name, m_name, l_name to
populate a full_name field)
c. specify field order for mapping onto Solr fields, data types, date
formats of source data; perhaps your CSV headers/JSON keys don't cleanly
map to Solr field names
d. flag whether the first row of a CSV is the header and should not be
indexed
e. use literal values - e.g., instead of having to alter the source data to
have a column whose value is "foo" you can configure a field to always have
the same literal value for all documents
f. set the number of times to retry when there is an error and the amount
of time between retries (e.g., sometimes ZooKeeper is not consistently
responsive)
g. skip fields - e.g., your data have 10 columns but you only want to index
columns 1, 3, 5, and 9
h. send soft commits after a specified number of batches
i. combine fields to generate the uniqueKey value
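
To make a few of these concrete, here is a rough Python sketch of what
options a, d, e, g, and i amount to.  It is not the indexer's code or its
configuration syntax; the function name and every parameter name below are
made up for illustration only.

```python
import csv
import io


def rows_to_docs(text, field_names, delimiter=",", quotechar='"',
                 multivalue_delimiter=None, multivalued=(), skip=(),
                 literals=None, key_fields=(), key_name="id",
                 first_row_is_header=True):
    """Turn CSV text into a list of dicts shaped like Solr documents.

    All parameter names are hypothetical; they only mirror the kinds of
    options described in the list above.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quotechar=quotechar)
    docs = []
    for i, row in enumerate(reader):
        if i == 0 and first_row_is_header:
            continue  # (d) the header row is not indexed
        doc = {}
        for name, value in zip(field_names, row):
            if name in skip:
                continue  # (g) skipped columns never reach the document
            if name in multivalued and multivalue_delimiter:
                # (a) split a multivalued field on its own delimiter
                doc[name] = value.split(multivalue_delimiter)
            else:
                doc[name] = value
        # (e) literal values are the same for every document
        doc.update(literals or {})
        if key_fields:
            # (i) build the uniqueKey value from several source columns
            doc[key_name] = "-".join(
                row[field_names.index(f)] for f in key_fields)
        docs.append(doc)
    return docs


sample = "first|last|tags|notes\nAda|Lovelace|math;poetry|ignored\n"
docs = rows_to_docs(
    sample,
    field_names=["f_name", "l_name", "tags", "notes"],
    delimiter="|",
    multivalue_delimiter=";",
    multivalued={"tags"},
    skip={"notes"},
    literals={"source": "foo"},
    key_fields=["f_name", "l_name"],
)
print(docs)
```

Passing field_names separately from the file is what lets the CSV headers
differ from the Solr field names (point c).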

3. Yes, atomic updates.  For instance, index data using the Data Import
Handler (DIH), then use this indexer to provide additional values to fields
in those documents (e.g., maybe the extra data come from a different data
source like BigQuery).
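
For anyone who has not used atomic updates, here is a minimal Python sketch
of the JSON that Solr's update handler accepts for them: each field to
change is wrapped in a modifier such as "set", and only the uniqueKey is
sent as a plain value.  This is not how the indexer itself is implemented;
the helper names, the base URL, the collection name, and the field names are
all assumptions for the example.

```python
import json
import urllib.request


def atomic_update_doc(doc_id, fields, key_name="id"):
    """Build one atomic-update document: {"id": ..., "field": {"set": value}}."""
    doc = {key_name: doc_id}
    for name, value in fields.items():
        doc[name] = {"set": value}  # "set" replaces the field's current value
    return doc


def build_update_request(base_url, collection, docs):
    """Prepare (but do not send) a POST to the collection's /update handler."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{collection}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document id, field, and collection, for illustration only.
docs = [atomic_update_doc("doc-1", {"full_name": "Ada Lovelace"})]
request = build_update_request("http://localhost:8983/solr", "mycollection", docs)
print(request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Fields not named in the request keep their existing values, which is why
documents first loaded through DIH can be enriched afterwards.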

I hope this clarifies the tool's features and answers your questions.  If
anyone has more, please ask.

Dan


On Tue, Sep 18, 2018 at 3:21 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Dan,
>
> On 9/18/18 2:51 PM, Dan Brown wrote:
> > I've been working on this for a while and it's finally in a state
> > where it's ready for public consumption.
> >
> > This is a command line indexer that will index CSV or JSON
> > documents: https://github.com/likethecolor/solr-indexer
> >
> > There are quite a few parameters/options that can be set.
> >
> > One thing to note is that it will update individual fields.  That
> > is, unlike the Data Import Handler, it does not replace entire
> > documents.
> >
> > Please check it out and let me know what you think.
>
> How is this different from the bin/post tool that ships with Solr?
>
> Or is that what you meant when you said "this is unlike the Data Import
> Handler"?
>
> AIUI, Solr doesn't support updating a single field in a document. The
> document is replaced no matter how hard you try to be surgical about
> updating a single field.
>
> - -chris
> -----BEGIN PGP SIGNATURE-----
> Comment: GPGTools - http://gpgtools.org
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAluhXlYACgkQHPApP6U8
> pFjIeQ/+PRIx+I+IDW9XTqGNV5TIWYf+yQKC/4JpTV4Ndj7MZLsEEw+cfMvFTvQt
> 44dK7CnDKEDgQHZlMccWKd9/Th1k/5g40VMugBMsayRwUc83Onawdi4HQfnig4et
> VN0/RaZ/IBo2AThsgEvUNplXYyY3BtyrUt6miiBsVkhKstI/BnmKqZvsRgvVjH0P
> K1Xc5F2LNyXswvoIZqd3YmEa9p7CYMy7COsFV9KOeSymKlB7UoHulZqpJ9MRYkmn
> YWjc9dHIRjpz5TUrJqWhZUG03uGXGtTnaXEku1Hb98WyIUZcHxkwN8W7qm6/B0CG
> inPxfGRFH9EbUdcK4qeXmbQqty2sbKMQ6hogpRd/NEzgSWjDapiEUT1xz+p5V6wG
> XM0ILaiLJ8zHJA6oUY0w5SNNyhdnd76CDpCK7T7YBm+aIxUDv9zoj6TLNceEaLi0
> SjfI83LvaR1gM/ZeVO77d+1IY9maU1+5m0EZFjAETfMGj5dwYRvBub0Oo6QQuLUm
> roF5R5b/bg/WjjPF1n4CJ7gTr/WBMzahKFnnQvoYD3OQqZpoasoEUifPpSd9OgvO
> yEok0VqwxPeXdHgE+Vy+BlXn6QqshB3BYnUSNbpFXlNsOIQojfJXkjcCa+dP1nyF
> JCElvmEgBG8K1WzGo4WAtVqJs7WDzQlmY2RDrETGsVbnqkTojXA=
> =AmkJ
> -----END PGP SIGNATURE-----
>
