An annotation field would be much better than the current "anything goes" schema-less schema.xml.
Has anyone built an XML Schema for schema.xml? I know it is extensible, but it would be worth a try.

wunder

On Jul 31, 2013, at 6:21 PM, Steve Rowe wrote:

> In thinking about making the entire Solr schema REST-API-addressable
> (SOLR-4898), I'd like to be able to add arbitrary metadata at both the top
> level of the schema and at each leaf node, and allow read/write access to
> that metadata via the REST API.
>
> Some uses I've thought of for such a facility:
>
> 1. The managed schema now drops XML comments from schema.xml upon conversion
> to managed-schema format, but it would be much better if these were somehow
> preserved, as well as round-trippable when retrieving the schema and its
> constituents via the REST API.
>
> 2. Some comments in the example schemas don't refer to just one or to all
> leaf nodes, but rather to a group of them. I'd like to be able to group nodes
> by adding same-named "tags" to multiple nodes, and also have a top-level
> (optional) "tag description" - this description could then be presented with
> tagged nodes in various output formats.
>
> 3. Some comments in the example schema are documentation about a feature,
> e.g. copyFields. A top-level "documentation" annotation could take a leaf
> node element name (or maybe an XPath? probably overkill) and apply to all
> matching elements.
>
> 4. When modifying the schema via REST API, a "last-modified" annotation could
> be automatically added.
>
> 5. There were a couple of user complaints recently when schema.xml parsing
> was tightened to disallow unknown attributes on field declarations
> (SOLR-4641): people were storing their own information there. User-level
> metadata would support this in a round-trippable way - I'm thinking we could
> restrict it to flat string-typed key/value pairs, with no nested structure.
>
> W3C XML Schema has a similar facility:
> <http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/structures.html#element-annotation>.
>
> Thoughts?
>
> Some concrete examples of what I'm thinking of in schema.xml format
> (syntax/naming as yet unsettled):
>
> <schema name="example" version="1.5">
>   <annotation>
>     <description element="tag" content="plain-numeric-field-types">
>       Plain numeric field types store and index the text value verbatim.
>     </description>
>     <documentation element="copyField">
>       copyField commands copy one field to another at the time a document
>       is added to the index. It's used either to index the same field
>       differently, or to add multiple fields to the same field for
>       easier/faster searching.
>     </documentation>
>     <last-modified>2014-03-08T12:14:02Z</last-modified>
>     …
>   </annotation>
>   …
>   <fieldType name="pint" class="solr.IntField">
>     <annotation>
>       <tag>plain-numeric-field-types</tag>
>     </annotation>
>   </fieldType>
>   <fieldType name="plong" class="solr.LongField">
>     <annotation>
>       <tag>plain-numeric-field-types</tag>
>     </annotation>
>   </fieldType>
>   …
>   <copyField source="cat" dest="text">
>     <annotation>
>       <todo>Should this field really be copied to the catchall text
>       field?</todo>
>     </annotation>
>   </copyField>
>   …
>   <field name="text" type="text_general">
>     <annotation>
>       <description>catchall field</description>
>       <visibility>public</visibility>
>     </annotation>
>   </field>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>

--
Walter Underwood
wun...@wunderwood.org
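
To make use case 2 (grouping nodes by tag) a bit more concrete, here is a minimal Python sketch of how a client might read the proposed annotations. It assumes the element names from the quoted example, which the proposal itself describes as unsettled, and works on a trimmed copy of that example. The helper names tag_descriptions and nodes_by_tag are hypothetical and not part of Solr.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the proposed (unsettled) annotation syntax from the thread.
SCHEMA = """\
<schema name="example" version="1.5">
  <annotation>
    <description element="tag" content="plain-numeric-field-types">
      Plain numeric field types store and index the text value verbatim.
    </description>
  </annotation>
  <fieldType name="pint" class="solr.IntField">
    <annotation><tag>plain-numeric-field-types</tag></annotation>
  </fieldType>
  <fieldType name="plong" class="solr.LongField">
    <annotation><tag>plain-numeric-field-types</tag></annotation>
  </fieldType>
  <field name="text" type="text_general">
    <annotation>
      <description>catchall field</description>
      <visibility>public</visibility>
    </annotation>
  </field>
</schema>
"""


def tag_descriptions(root):
    """Map each tag name to its top-level description (hypothetical helper)."""
    found = {}
    for desc in root.findall("./annotation/description[@element='tag']"):
        # Collapse the indentation whitespace inside the element text.
        found[desc.get("content")] = " ".join((desc.text or "").split())
    return found


def nodes_by_tag(root):
    """Group the names of tagged leaf nodes by tag (hypothetical helper)."""
    groups = defaultdict(list)
    for node in root:
        if node.tag == "annotation":
            continue  # the top-level annotation is not a leaf node
        for tag in node.findall("./annotation/tag"):
            groups[(tag.text or "").strip()].append(node.get("name"))
    return dict(groups)


root = ET.fromstring(SCHEMA)
descriptions = tag_descriptions(root)
groups = nodes_by_tag(root)

for tag_name, members in groups.items():
    print(tag_name, members, "-", descriptions.get(tag_name, "(no description)"))
```

An output format could then present each tagged group alongside its description, which is roughly what the tag-description idea in the quoted message is after.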