I looked at SOLR-7290, but I think the discussion should stay on the mailing list for at least one more iteration.

My understanding is that the reason copyField exists is so that search works out of the box. Without knowing the field names, one cannot say what to search. So copying everything to a general field and searching that is a classic strategy, though usually it is not done with a *match all* wildcard. But for schemaless, *match all* is all we get, as we don't even have prefix/suffix strategies to rely on.

So, saying *remove* without offering an alternative way to achieve easy search is not - to me - a terribly useful contribution for a default setup.

Regards,
   Alex.
P.s. As to the field renaming, I have no opinion. It would be nice if somebody checked the consistency now that a couple more special names were introduced with smart JSON parsing.

----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/

On 22 March 2015 at 20:32, Erick Erickson <erickerick...@gmail.com> wrote:
> I think you mean https://issues.apache.org/jira/browse/SOLR-7290?
>
> Erick
>
> On Sun, Mar 22, 2015 at 2:30 PM, Mike Murphy <mmurphy3...@gmail.com> wrote:
>> That's it!
>> I hand edited the file that says you are not supposed to edit it and
>> removed that copyField.
>> Indexing performance is now back to expected levels.
>>
>> I created an issue for this, https://issues.apache.org/jira/browse/SOLR-7284
>>
>> --Mike
>>
>> On Sun, Mar 22, 2015 at 3:29 PM, Yonik Seeley <ysee...@gmail.com> wrote:
>>> I took a quick look at the stock schemaless configs... unfortunately
>>> they contain a performance trap.
>>> There's a copyField by default that copies *all* fields to a catch-all
>>> field called "_text".
>>>
>>> IMO, that's not a great default. Double the index size (well, the
>>> "index" portion of it at least... not stored fields), and slower
>>> indexing performance.
>>>
>>> The other unfortunate thing is the name. Nowhere else in Solr (that
>>> I know of) do we have a single underscore field name. _text looks
>>> more like a dynamicField pattern. Our other fields with underscores
>>> look like _version_ and _root_. If we're going to start a new naming
>>> convention (or expand the naming conventions) we need to have some
>>> consistency and logic behind it.
>>>
>>> -Yonik
>>>
>>> On Sun, Mar 22, 2015 at 12:32 PM, Mike Murphy <mmurphy3...@gmail.com> wrote:
>>>> I start up Solr schemaless and index a bunch of data, and it takes a
>>>> lot longer to finish indexing.
>>>> No configuration changes, just straight schemaless.
>>>>
>>>> --Mike
>>>>
>>>> On Sun, Mar 22, 2015 at 12:27 PM, Erick Erickson
>>>> <erickerick...@gmail.com> wrote:
>>>>> Please review: http://wiki.apache.org/solr/UsingMailingLists
>>>>>
>>>>> You haven't quantified the slowdown. Or given any details on how
>>>>> you're measuring the "slowdown". Or how you've configured your setups
>>>>> in 4.10 and 5.0. Or... As Hossman would say, "details matter".
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>> On Sun, Mar 22, 2015 at 8:35 AM, Mike Murphy <mmurphy3...@gmail.com>
>>>>> wrote:
>>>>>> I'm trying out schemaless in Solr 5.0, but the indexing seems quite a
>>>>>> bit slower than it did in the past on 4.10. Any pointers?
>>>>>>
>>>>>> --Mike
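
The catch-all rule under discussion can be sketched as the schema fragment below. This is an illustration, not a copy of the shipped config: only the match-all source and the "_text" destination come from the thread. The field type `text_general` and the field attributes are assumptions, and `title` and `description` in the narrower alternative are hypothetical field names.

```xml
<!-- Catch-all search field. The type and attributes are assumptions,
     not taken from the stock schemaless config. -->
<field name="_text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Match-all rule described in the thread: the content of every
     field is also indexed into _text. -->
<copyField source="*" dest="_text"/>

<!-- Alternative to the rule above, the more usual narrower form:
     copy only named fields. "title" and "description" are
     hypothetical field names used for illustration. -->
<copyField source="title" dest="_text"/>
<copyField source="description" dest="_text"/>
```

With the match-all rule, all indexed text is written a second time into "_text", which is the index-size and indexing-speed cost Yonik describes. The narrower form pays that cost only for the listed fields, but it requires knowing the field names in advance, which is exactly what a schemaless setup lacks.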