[ 
https://issues.apache.org/jira/browse/SOLR-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192884#comment-17192884
 ] 

Jan Høydahl commented on SOLR-14701:
------------------------------------

Thanks for doing this, Alexandre!
bq. * Widen to default
Sure, but we should support int->float->string
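
To illustrate the int->float->string widening, here is a minimal sketch in Python. It is not code from Solr; the names classify and widen are hypothetical, and Python's own int()/float() parsing stands in for whatever parsing rules the real tool would apply.

```python
from typing import Iterable

# Widening order, narrowest to widest: int -> float -> string.
WIDENING_ORDER = ["int", "float", "string"]


def classify(value: str) -> str:
    """Return the narrowest type name that can represent one raw value."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "string"


def widen(values: Iterable[str]) -> str:
    """Return the narrowest type that can hold every value seen.

    The result only ever moves right along WIDENING_ORDER, never back.
    An empty input yields "int", the narrowest type (an arbitrary choice).
    """
    widest = 0
    for value in values:
        widest = max(widest, WIDENING_ORDER.index(classify(value)))
    return WIDENING_ORDER[widest]


print(widen(["1", "2"]))           # all values parse as int
print(widen(["1", "2.5"]))         # one value forces float
print(widen(["1", "2.5", "abc"]))  # one value forces string
```

A single non-numeric value is enough to push the whole field to string, which is why the fallback type at the end of the chain matters.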

bq. * Use single-value FieldTypes and only assign multiValued on fields 
themselves
Agree

bq. * Look at only new fields (for now), especially since the learning schema 
may not be matching production one
Yep 

bq. * Output structured advice and leave room to do JSON generation for future 
Jira
I'd say update the schema at the end. But I'm not fully clear on how you
intend this tool to be used. Is it a standalone command-line tool, or just an
update chain you specify on HTTP POST to your existing collection? Or an
implicit handler /update/learning that has this update chain preconfigured?

bq. Do the field type definitions actually need to exist in schema if we never 
create the real fields that use them (we check right now). Or does this just 
become an abstract parsing and mapping exercise?
I have earlier proposed that the major field types be made implicit and need 
not be part of schema. I think I proposed a new tag <standardTypes/> or 
something, which would bring 'int', 'long' etc with it. That would let you do 
it explicitly if you don't include that tag...

Wrt the choice between analyzed text or string, it is impossible to know just
by looking at text length or content whether the intention is to search as
analyzed text or to use the field for sorting/faceting or matching. Likewise
whether docValues, vectors, norms etc. are needed. I still think the approach
from the AddFieldsToSchema processor makes sense: index as both text and
string (capped at 256 and with a _str suffix) as a default you know will be
there.
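
As a rough sketch of that default (again hypothetical Python, not Solr code): each incoming text value is kept in full for the analyzed field, and a copy truncated to 256 characters goes into a companion field with the _str suffix. The function name and the dict-based document shape are assumptions made for illustration.

```python
STR_SUFFIX = "_str"    # suffix for the companion string field
MAX_STR_CHARS = 256    # cap on the string copy, in characters


def expand_text_field(name: str, value: str) -> dict:
    """Return the two fields to index for one incoming text value.

    The original field keeps the full value (to be analyzed as text);
    the companion "<name>_str" field holds a truncated copy suitable
    for sorting, faceting or exact matching.
    """
    return {
        name: value,
        name + STR_SUFFIX: value[:MAX_STR_CHARS],
    }


doc = expand_text_field("title", "x" * 300)
print(sorted(doc))            # ['title', 'title_str']
print(len(doc["title"]))      # 300
print(len(doc["title_str"]))  # 256
```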

> Deprecate Schemaless Mode (Discussion)
> --------------------------------------
>
>                 Key: SOLR-14701
>                 URL: https://issues.apache.org/jira/browse/SOLR-14701
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>            Reporter: Marcus Eagan
>            Priority: Major
>         Attachments: image-2020-08-04-01-35-03-075.png
>
>
> I know this won't be the most popular ticket out there, but I am growing more 
> and more sympathetic to the idea that we should rip many of the freedoms out 
> that cause users more harm than not. One of the freedoms I saw time and time 
> again to cause issues was schemaless mode. It doesn't work as named or 
> documented, so I think it should be deprecated. 
> If you use it in production reliably and in a way that cannot be accomplished 
> another way, I am happy to hear from more knowledgeable folks as to why 
> deprecation is a bad idea. 



