On 7/18/2015 9:49 AM, Charlie Hubbard wrote:
> So I want to allow people to upload any CSV/XML/JSON to solr they want so
> having a predefined schema isn't going to cut it.  After reading about my
> options I figured my choices were schema-less mode and dynamic fields using
> the * with a type other than ignore.  I know the docs say schema-less isn't
> something for production, but it seems like that is changing if I read
> between the lines.  With dynamic fields can I still use the schema API to
> describe all of the fields that have been indexed?
> 
> I like how easy dynamic fields are to configure, so what are the pros and
> cons of both?

Dynamic fields are a reasonable way to do *some* things.  Using a full
wildcard of "*" for them is generally NOT a good way to do it, although
it will work, because every unknown field then gets the same fieldType.
It's better to do things like "*_i" for integer, "*_s" for string, etc.

Schemaless mode has some inherent risks.  You are asking Solr to *guess*
what fieldType the new field will get ... if Solr guesses wrong, you
can't fix it without manually modifying the schema, which will almost
certainly require a reindex.

https://wiki.apache.org/solr/HowToReindex
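
To make the guessing risk concrete, here is a toy Python sketch.  It is
NOT Solr's actual type-guessing code; the function names, the type names
and the "part_no" field are all made up for illustration.  It only shows
the general failure mode: the type is locked in from the first value
seen, and a later value that does not fit gets rejected.

```python
# Toy illustration only: this is NOT how Solr implements schemaless mode.

# Which guessed value types each locked field type will accept.
ACCEPTS = {
    "long": {"long"},
    "double": {"long", "double"},
    "string": {"long", "double", "string"},
}


def guess_type(value: str) -> str:
    """Guess a type name from a single string value."""
    for caster, name in ((int, "long"), (float, "double")):
        try:
            caster(value)
            return name
        except ValueError:
            pass
    return "string"


def index(docs):
    """Lock each field's type on first sight; collect values that no longer fit."""
    schema, rejected = {}, []
    for doc in docs:
        for field, value in doc.items():
            locked = schema.setdefault(field, guess_type(value))
            if guess_type(value) not in ACCEPTS[locked]:
                rejected.append((field, value))
    return schema, rejected


# The first part number looks numeric, so the field is locked as "long";
# the second one is alphanumeric and no longer fits.
schema, rejected = index([{"part_no": "12345"}, {"part_no": "12A45"}])
```

In this sketch the only way to accept "12A45" is to change the type of
"part_no" by hand and feed the documents in again, which mirrors the
schema change plus reindex described above.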

Schemaless mode is great during prototyping and initial setup, but I
personally would not want to run in that mode in production.  There's
nothing *wrong* with doing so, but I would not want my production schema
to change because the data guys added a new field and didn't tell me.  I
would rather have the indexing fail loudly so everyone is aware that the
config needs attention.  At that point, I can fix the config, and I will
know it's fixed correctly.  A reindex might *still* be required after
the change, of course.

Your situation sounds like it might be a little different from mine.  If
it were me, I would require that the users conform their field names to
something like the *_i and *_s that I mentioned above, and use dynamic
fields.  The schema is still completely static and behavior is entirely
predictable.
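
If the users cannot be made to rename their fields themselves, one
option is a small client-side step that does it before the documents are
sent.  The Python below is only a sketch under assumptions: conform() and
the SUFFIXES table are hypothetical, the "_f" suffix for floats is an
assumed convention that would need a matching dynamicField in the
schema, and the data is assumed to be parsed into Python values already.

```python
# Hypothetical helper: append a type suffix so each field name matches a
# dynamicField pattern such as "*_i" or "*_s".
SUFFIXES = {
    int: "_i",    # matches the "*_i" pattern mentioned above
    str: "_s",    # matches the "*_s" pattern mentioned above
    float: "_f",  # assumption: a "*_f" dynamicField exists in the schema
}


def conform(doc: dict) -> dict:
    """Return a copy of doc whose field names carry a type suffix."""
    out = {}
    for name, value in doc.items():
        suffix = SUFFIXES.get(type(value))
        if suffix is None:
            # Fail loudly instead of guessing a type for unsupported values.
            raise TypeError(f"no suffix defined for field {name!r}")
        # Leave names alone if they already end in the right suffix.
        out[name if name.endswith(suffix) else name + suffix] = value
    return out


renamed = conform({"title": "Widget", "qty": 3, "price": 9.5, "count_i": 7})
# renamed == {"title_s": "Widget", "qty_i": 3, "price_f": 9.5, "count_i": 7}
```

A value of a type with no suffix raises an error rather than being
indexed under a guessed type, which keeps the "fail loudly" behavior.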

Thanks,
Shawn
