Thanks! Indeed, one of my issues is that I cannot know which fields will be indexed until I have seen the browsed documents (and run some entity extraction on them). That is why I wanted to avoid defining the schema up front ...
The schema API sounds interesting! Does it exist via SolrJ?

Many thanks!

Benjamin

On Thu, Apr 30, 2015 at 6:27 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Could you explain a bit more _why_ you want to do this? As you're
> probably well aware, there are multiple ways to shoot yourself in
> the foot in lower-level Lucene.
>
> If you have some situation where you're creating indexes on the fly
> that may vary then you could consider the "managed schema" that lets
> you create a schema via API calls, then you wouldn't need to mess
> with editing the schema.xml file for instance.
>
> Best,
> Erick
>
> On Thu, Apr 30, 2015 at 8:12 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 4/30/2015 8:43 AM, Sznajder ForMailingList wrote:
> >> I am interested to index some documents in Solr, as I did in Lucene.
> >>
> >> I mean: giving via solrJ all the information about the field I am adding
> >> (Tokenize, store, facet etc...)
> >>
> >> can we do that? Or is it mandatory to define a schema on the collection?
> >
> > All that information is defined on the server. You do not have direct
> > access to the Lucene index - Solr is intended as an abstraction, so the
> > admin and the users/applications that use Solr do not need to understand
> > all the low-level details that go into a Lucene application. The admin
> > just has to deal with configuration files like schema.xml, and the users
> > just need to know what fields are in each document and how the query
> > syntax works. Deeper Lucene knowledge is helpful, but not strictly
> > necessary.
> >
> > If you want Lucene-level control, you'll need to write the search server
> > yourself using Lucene. If you have very specific needs that Solr's
> > approach can't satisfy, you always have this option.
> >
> > The newest Solr versions do have an example of what's known as a
> > "data-driven" schema, or schemaless mode. In this mode, Solr builds up
> > the schema automatically, guessing the field type based on what kind of
> > data is the first to arrive for each field. This is good for
> > prototyping, but for production use, I would want to be in full manual
> > control of the schema.
> >
> > Thanks,
> > Shawn
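
For reference, the "create a schema via API calls" route mentioned in the thread can be sketched without any client library, since the Schema API accepts JSON commands over HTTP. The sketch below only builds the body of an "add-field" command for fields discovered at runtime. Everything named here is an assumption for illustration: the helper class SchemaFieldCommand, the field names, and the field types "string" and "text_general" (which must already exist in the target schema). It also assumes the collection runs with a managed schema; whether SolrJ offers a wrapper for this depends on the version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper (not part of SolrJ): builds Schema API "add-field" JSON bodies. */
final class SchemaFieldCommand {

    private SchemaFieldCommand() {}

    /** Escapes backslashes and double quotes so a value can sit inside a JSON string. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Returns the JSON body for one "add-field" command. */
    static String addFieldJson(String name, String type, boolean indexed, boolean stored) {
        return "{\"add-field\":{"
                + "\"name\":\"" + escape(name) + "\","
                + "\"type\":\"" + escape(type) + "\","
                + "\"indexed\":" + indexed + ","
                + "\"stored\":" + stored
                + "}}";
    }

    public static void main(String[] args) {
        // Hypothetical fields found by entity extraction, mapped to assumed field types.
        Map<String, String> discovered = new LinkedHashMap<>();
        discovered.put("person_name", "string");
        discovered.put("summary", "text_general");

        // Print one command body per discovered field.
        for (Map.Entry<String, String> e : discovered.entrySet()) {
            System.out.println(addFieldJson(e.getKey(), e.getValue(), true, true));
        }
    }
}
```

Each printed body would then be POSTed with Content-type application/json to the collection's schema endpoint, for example http://localhost:8983/solr/mycollection/schema (host, port and collection name are placeholders), before the documents that use the new fields are indexed.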