[ https://issues.apache.org/jira/browse/SOLR-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192884#comment-17192884 ]
Jan Høydahl commented on SOLR-14701:
------------------------------------

Thanks for doing this Alexandre!

bq. * Widen to default

Sure, but we should support int->float->string

bq. * Use single-value FieldTypes and only assign multiValued on fields themselves

Agree

bq. * Look at only new fields (for now), especially since the learning schema may not be matching production one

Yep

bq. * Output structured advice and leave room to do JSON generation for future Jira

I'd say update schema at the end. But I'm not fully clear on how you intend this tool to be used. Is it a standalone cmd tool, or just an update-chain you specify on http POST to your existing collection? Or an implicit handler /update/learning that has this update-chain preconfigured?

bq. Do the field type definitions actually need to exist in schema if we never create the real fields that use them (we check right now). Or does this just become an abstract parsing and mapping exercise?

I have earlier proposed that the major field types be made implicit and need not be part of schema. I think I proposed a new tag <standardTypes/> or something, which would bring 'int', 'long' etc with it. That would let you do it explicitly if you don't include that tag...

Wrt the choice between analyzed text or string, it is impossible to know just by looking at text length or content whether the intention is to search as analyzed text or to use the field for sorting/faceting or matching. Likewise whether docvalues, vectors, norms etc. are needed. I still think the approach from the AddFieldsToSchema processor makes sense: index as both text and string (the string copy capped at 256 chars and with a _str suffix), as a default you know will be there.
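
To make the int->float->string widening suggested above concrete, here is a minimal sketch in Python. It is only an illustration, not code from Solr or from the patch under discussion: the names WIDENING_ORDER, guess_type, widen and infer_field_type are hypothetical, and the parsing uses Python's own int()/float() rules, which will not match Solr's field-value parsing in every case.

```python
# Hypothetical widening chain: a type may be widened to any type to its right.
WIDENING_ORDER = ["int", "float", "string"]


def guess_type(value):
    """Return the narrowest type in WIDENING_ORDER that can hold one raw value."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "string"


def widen(current, observed):
    """Return the wider of two types; None means no value has been seen yet."""
    if current is None:
        return observed
    return max(current, observed, key=WIDENING_ORDER.index)


def infer_field_type(values):
    """Fold widen() over every observed value of one field."""
    result = None
    for value in values:
        result = widen(result, guess_type(value))
    return result


if __name__ == "__main__":
    print(infer_field_type(["1", "2", "3"]))
    print(infer_field_type(["1", "2.5"]))
    print(infer_field_type(["1", "2.5", "n/a"]))
```

Because widening only ever moves to the right in the chain, the result does not depend on the order in which documents are seen, which matters if the learning data arrives in arbitrary batches.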

> Deprecate Schemaless Mode (Discussion)
> --------------------------------------
>
>                 Key: SOLR-14701
>                 URL: https://issues.apache.org/jira/browse/SOLR-14701
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>            Reporter: Marcus Eagan
>            Priority: Major
>         Attachments: image-2020-08-04-01-35-03-075.png
>
>
> I know this won't be the most popular ticket out there, but I am growing more
> and more sympathetic to the idea that we should rip many of the freedoms out
> that cause users more harm than not. One of the freedoms I saw time and time
> again to cause issues was schemaless mode. It doesn't work as named or
> documented, so I think it should be deprecated.
> If you use it in production reliably and in a way that cannot be accomplished
> another way, I am happy to hear from more knowledgeable folks as to why
> deprecation is a bad idea.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)