Re: Dynamic schema design: feedback requested

Chris Hostetter Mon, 11 Mar 2013 14:52:11 -0700

: we needed to, we could just assert that the schema file is the
: persistence mechanism, as opposed to the system of record, hence if
: you hand edit it and then use the API to change it, your hand edit may
: be lost.  Or we may decide to do away with "local FS" mode altogether.


presuming that it's just a persistence mechanism, but also assuming that
the user may edit it directly, still creates burdens/complexity in when solr
reads/writes to that file -- even if we say that user edits to that file
might be overridden (ie: does solr guarantee if/when the file will be
written to if you use the REST api to modify things? -- that's going to be
important if we let people read/edit that file)

: I guess my main point is, we shouldn't decide a priori that using the
: API means you can no longer hand edit.

and my point is that if we build a feature where solr has the ability to
read/write some piece of information, we should start with the assumption
that it's OK for us to decide that a priori, and not walk into things
assuming we have to support a lot of much more complicated use cases.  if
at some point during the implementation we find that supporting a more lax 
"it's ok, you can edit this by hand" approach won't be a burden, then so 
be it -- we can relax that a priori assertion.

: My thoughts on this are probably heavily influenced on how I initially

my thoughts on this are based directly on:

A) the confusion & implementation complexity observed in the "dual
nature" of solr.xml over the years.

B) having spent a lot of time maintaining code that did programmatic
reading/writing of solr schema.xml files while also trying to treat them as
"config" files that users were allowed to hand edit -- it's a pain in the
ass.

: envisioned implementation working in cloud mode (which I thought about
: first since it's harder).  A human readable file on ZK that represents
: the system of record for the schema seemed to be the best.  I never

1) i never said the data couldn't/shouldn't be human readable -- i said it 
should be an implementation detail (ie: subject to change automatically on
upgrade just like the index format), and that end users shouldn't be
allowed to edit it arbitrarily

2) cloud mode, as i understand it, is actually much *easier* (if you want
to allow arbitrary user edits to these "files") because you can set ZK
watches on those nodes, so any code that is maintaining internal state
based on them (ie: REST API round trip serialization code that just read
the file in to modify the DOM before writing it back out) can be notified
if the file has changed.  I also believe i was told that writes to "files"
in ZK are atomic, which also means you never have to worry about reading
partial data in the middle of someone else's write.
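
A minimal sketch of the two properties mentioned above (change notification
and whole-payload writes), using a toy in-memory stand-in rather than a real
ZooKeeper client. `ToyNode`, `VersionConflict` and the method names are
hypothetical and exist only for illustration. The version check on write is
modelled on ZooKeeper's conditional `setData`, and the watch is one-shot, as
classic ZK watches are.

```python
import threading


class VersionConflict(Exception):
    """Raised when a conditional write names a stale version."""


class ToyNode:
    """In-memory stand-in for one ZK node (hypothetical, not the ZooKeeper API)."""

    def __init__(self, data: bytes):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0
        self._watchers = []  # one-shot callbacks, cleared when they fire

    def get(self, watch=None):
        """Return (data, version) as one consistent snapshot; optionally set a watch."""
        with self._lock:
            if watch is not None:
                self._watchers.append(watch)
            return self._data, self._version

    def set(self, data: bytes, expected_version: int) -> int:
        """Replace the whole payload only if nobody has written since expected_version."""
        with self._lock:
            if expected_version != self._version:
                raise VersionConflict(
                    f"expected {expected_version}, node is at {self._version}"
                )
            self._data = data
            self._version += 1
            new_version = self._version
            fired, self._watchers = self._watchers, []
        for callback in fired:  # notify outside the lock
            callback(new_version)
        return new_version


# The REST layer reads the schema, remembers its version, and sets a watch.
node = ToyNode(b"<schema/>")
events = []
data, version = node.get(watch=events.append)

# Someone edits the node directly; the watch fires with the new version.
node.set(b"<schema><!-- hand edit --></schema>", expected_version=version)

# The REST layer's write-back is rejected instead of clobbering the edit.
try:
    node.set(b"<schema><!-- api edit --></schema>", expected_version=version)
    conflict = False
except VersionConflict:
    conflict = True
    data, version = node.get(watch=events.append)  # re-read, then retry
```

In this model the REST code never has to guess whether its cached copy is
stale: either the watch told it so, or the conditional write fails and it
re-reads before retrying.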

in the general situation of "config files on disk" we can't even try to 
enforce a lock file type approach, because we shouldn't assume a user will 
remember to obey our locks before editing the file.
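
For contrast, a rough sketch of about the best a local-FS implementation can
do without cooperation from the user: remember a digest of what was loaded,
re-check it just before writing back, and swap the new file in atomically.
`SchemaFile` and `HandEditDetected` are hypothetical names, and this is an
assumption about one possible approach, not anything Solr does.

```python
import hashlib
import os
import tempfile


class HandEditDetected(Exception):
    """Raised when the file on disk no longer matches what was loaded."""


class SchemaFile:
    """Hypothetical wrapper that detects out-of-band edits before write-back."""

    def __init__(self, path):
        self.path = path
        self.loaded_digest = None

    def load(self) -> bytes:
        with open(self.path, "rb") as f:
            content = f.read()
        self.loaded_digest = hashlib.sha256(content).hexdigest()
        return content

    def save(self, new_content: bytes) -> None:
        with open(self.path, "rb") as f:
            on_disk = hashlib.sha256(f.read()).hexdigest()
        if on_disk != self.loaded_digest:
            raise HandEditDetected(self.path)
        # Write to a temp file in the same directory, then rename over the
        # original so readers never observe a half-written file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as f:
            f.write(new_content)
        os.replace(tmp_path, self.path)
        self.loaded_digest = hashlib.sha256(new_content).hexdigest()


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "schema.xml")
with open(path, "wb") as f:
    f.write(b"<schema/>")

api = SchemaFile(path)
api.load()

# Simulated hand edit made behind the API's back.
with open(path, "wb") as f:
    f.write(b"<schema><!-- hand edit --></schema>")

try:
    api.save(b"<schema><!-- api edit --></schema>")
    conflict = False
except HandEditDetected:
    conflict = True
```

Note that this only narrows the window: an editor can still save the file
between the digest check and the rename, and nothing notifies the running
code that its in-memory state has gone stale until it next tries to write.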

If you & sarowe & others feel that:

1) it's important to allow arbitrary user editing of schema.xml files in 
zk mode even when REST read/writes are enabled
2) that allowing arbitrary user edits w/o risk of conflict or complexity 
in the REST read/write code is easy to implement in ZK mode
3) it's reasonable to require ZK mode in order to support read/write mode
in the REST API

...then that would certainly resolve my concerns stemming from "B"
above.  i'm still worried about "A", but perhaps the ZK nature of things 
and the watches & atomicity provided there will reduce confusion.

But as long as we are talking about this REST api supporting reads & 
writes to schema info even when running in single node mode with files on 
disk -- i think it is a *HUGE* fucking mistake to start with the 
assumption that the serialization mechanism of the REST api needs to be 
able to play nicely with arbitrary user editing of schema.xml.


-Hoss
