[jira] [Commented] (SOLR-14701) Deprecate Schemaless Mode (Discussion)

Alexandre Rafalovitch (Jira) Thu, 10 Sep 2020 22:18:18 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193989#comment-17193989
 ]


Alexandre Rafalovitch commented on SOLR-14701:
----------------------------------------------

OK, I think I have working code for this that basically slots in in place of 
AddSchemaFieldsURP. The type mapping will need to be different to support 
multiValued fields, but that's minor. I still need to do a proper test run, but 
it survives learning and then indexing the films example without any pre-warming 
(which is currently required), as well as all the XML files in exampledocs.

A couple of messy things:
 * The current implementation secretly supports field name inclusions and 
exclusions in AddSchemaFieldsURP by calling FieldMutatingUpdateProcessor 
directly (not by inheriting from it). I don't think we document this anywhere, 
nor do we use it in our examples. Do I need to preserve this secret 
functionality anyway?
 * The actual schema-creating logic is now in processCommit, and the post tool 
sends the commit in a separate request. That means the mapping cache is hoisted 
to the URPFactory level and lives longer than a single request. That is nice: I 
can run the post tool on a whole directory, and it will be nice for scripts 
iterating over collections too. But it is higher-level storage. I've seen this 
done in other places, though, and it really seems like the only option.
 * To prevent documents from actually being created in processAdd, I have to 
abort the chain (only when the learning parameter is passed in). Clearing either 
the command or the attached doc fails, as things expect content there. Again, I 
don't seem to have any other options, but I would definitely prefer to make this 
a learning-example configuration rather than part of the default chain.
 * I am hardcoding the int->long->number chain. Interestingly, the films example 
fails because something parses the value into 'Double', which we don't declare a 
mapping for. So, it ends up as strings anyway. But maybe there is a question 
there that I am not sure how to phrase.
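
To make the int->long->number chain and the longer-lived mapping cache above 
more concrete, here is a minimal, self-contained sketch. It is not the actual 
patch and uses no Solr classes: FieldTypeLearner, Guess, observe and resolve 
are hypothetical names. observe stands in for what would happen per document 
in processAdd, and resolve for what would happen in processCommit. The sketch 
also assumes that any other Number (including Double) widens to NUMBER, which 
differs from the behaviour described above, where Double falls through to 
strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a type-widening cache; not Solr's API. */
public class FieldTypeLearner {

    /** Declared in widening order: INT -> LONG -> NUMBER -> STRING. */
    public enum Guess { INT, LONG, NUMBER, STRING }

    // Lives on the learner (think: factory level), so it survives across
    // several "add" calls until a "commit" resolves it.
    private final Map<String, Guess> guesses = new LinkedHashMap<>();

    /** Maps a parsed value to the narrowest guess that can hold it. */
    public static Guess classify(Object value) {
        if (value instanceof Integer) return Guess.INT;
        if (value instanceof Long) return Guess.LONG;
        // Assumption: Double, Float, etc. all count as NUMBER here.
        if (value instanceof Number) return Guess.NUMBER;
        return Guess.STRING;
    }

    /** Stand-in for the per-document step: only widen, never narrow. */
    public void observe(String field, Object value) {
        Guess seen = classify(value);
        guesses.merge(field, seen,
                (old, cur) -> old.compareTo(cur) >= 0 ? old : cur);
    }

    /** Stand-in for the commit step: hand back the mapping and reset. */
    public Map<String, Guess> resolve() {
        Map<String, Guess> result = new LinkedHashMap<>(guesses);
        guesses.clear();
        return result;
    }

    public static void main(String[] args) {
        FieldTypeLearner learner = new FieldTypeLearner();
        learner.observe("year", 1999);
        learner.observe("year", 3_000_000_000L); // widens INT to LONG
        learner.observe("rating", 7);
        learner.observe("rating", 7.5);          // widens INT to NUMBER
        learner.observe("name", "Solaris");
        System.out.println(learner.resolve());
        // prints {year=LONG, rating=NUMBER, name=STRING}
    }
}
```

Because the guesses are kept outside any single call, documents posted in 
several separate requests can contribute to one mapping before the commit 
arrives, which is the property the post tool workflow relies on.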

 

> Deprecate Schemaless Mode (Discussion)
> --------------------------------------
>
>                 Key: SOLR-14701
>                 URL: https://issues.apache.org/jira/browse/SOLR-14701
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>            Reporter: Marcus Eagan
>            Priority: Major
>         Attachments: image-2020-08-04-01-35-03-075.png
>
>
> I know this won't be the most popular ticket out there, but I am growing more 
> and more sympathetic to the idea that we should rip many of the freedoms out 
> that cause users more harm than not. One of the freedoms I saw time and time 
> again to cause issues was schemaless mode. It doesn't work as named or 
> documented, so I think it should be deprecated. 
> If you use it in production reliably and in a way that cannot be accomplished 
> another way, I am happy to hear from more knowledgeable folks as to why 
> deprecation is a bad idea. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
