Re: schemaless slow indexing

2015-03-23 Thread Steve Rowe
> On Mar 23, 2015, at 11:09 AM, Yonik Seeley wrote: > > On Mon, Mar 23, 2015 at 1:54 PM, Alexandre Rafalovitch > wrote: >> I looked at SOLR-7290, but I think the discussion should stay on the >> mailing list for at least one more iteration. >> >> My understanding that the reason copyField exist

Re: schemaless slow indexing

2015-03-23 Thread Steve Rowe
> On Mar 23, 2015, at 11:51 AM, Alexandre Rafalovitch > wrote: > For example, I am not even sure if we can create a copyField > definition via REST API yet.

Re: schemaless slow indexing

2015-03-23 Thread Alexandre Rafalovitch
Yonik, those are all facts. Which I do not disagree with at all. But there are also consequences when you bring the rest of the facts and the assumptions and documented workflows into play. My comment was trying to address the situation on that level. I am all for improving performance. I am just

Re: schemaless slow indexing

2015-03-23 Thread Yonik Seeley
On Mon, Mar 23, 2015 at 1:54 PM, Alexandre Rafalovitch wrote: > I looked at SOLR-7290, but I think the discussion should stay on the > mailing list for at least one more iteration. > > My understanding is that the reason copyField exists is so that a search > actually worked out of the box. Without k

Re: schemaless slow indexing

2015-03-23 Thread Alexandre Rafalovitch
I looked at SOLR-7290, but I think the discussion should stay on the mailing list for at least one more iteration. My understanding is that the reason copyField exists is so that a search actually worked out of the box. Without knowing the field names, one cannot say what to search. So, the copyField

Re: schemaless slow indexing

2015-03-22 Thread Erick Erickson
I think you mean https://issues.apache.org/jira/browse/SOLR-7290? Erick On Sun, Mar 22, 2015 at 2:30 PM, Mike Murphy wrote: > That's it! > I hand edited the file that says you are not supposed to edit it and > removed that copyField. > Indexing performance is now back to expected levels. > > I c

Re: schemaless slow indexing

2015-03-22 Thread Mike Murphy
That's it! I hand edited the file that says you are not supposed to edit it and removed that copyField. Indexing performance is now back to expected levels. I created an issue for this, https://issues.apache.org/jira/browse/SOLR-7284 --Mike On Sun, Mar 22, 2015 at 3:29 PM, Yonik Seeley wrote: >

Re: schemaless slow indexing

2015-03-22 Thread Yonik Seeley
I took a quick look at the stock schemaless configs... unfortunately they contain a performance trap. There's a copyField by default that copies *all* fields to a catch-all field called "_text". IMO, that's not a great default. Double the index size (well, the "index" portion of it at least... no
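For readers who would rather not hand-edit the managed schema, the catch-all copyField described above could in principle be removed through Solr's Schema API instead. The sketch below only builds the JSON command body and does not send it. It assumes a Solr release whose Schema API accepts a `delete-copy-field` command (the thread does not confirm this for 5.0), and the host and collection name in `SCHEMA_URL` are hypothetical placeholders.

```python
import json


def delete_copy_field_command(source: str, dest: str) -> str:
    """Build the JSON body for a Schema API delete-copy-field command.

    Assumption: the target Solr version supports "delete-copy-field".
    """
    return json.dumps({"delete-copy-field": {"source": source, "dest": dest}})


# Hypothetical endpoint; replace host and collection with your own.
SCHEMA_URL = "http://localhost:8983/solr/mycollection/schema"

# The stock schemaless config copies every field ("*") into "_text".
body = delete_copy_field_command("*", "_text")
print(body)
```

The body would be POSTed to `SCHEMA_URL` with a `Content-Type: application/json` header. Keep in mind that removing the copyField also removes the catch-all search target, so queries then need explicit field names.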

Re: schemaless slow indexing

2015-03-22 Thread Alexandre Rafalovitch
Same data with same version of Solr with the only difference between Schema vs. Schemaless? How much longer, 10%, 2x, 20x? Schemaless mode has a much more complex UpdateRequestProcessor chain, that's partially what makes it schemaless. But I hesitate pointing fingers at that without any real detai

Re: schemaless slow indexing

2015-03-22 Thread Mike Murphy
I start up solr schemaless and index a bunch of data, and it takes a lot longer to finish indexing. No configuration changes, just straight schemaless. --Mike On Sun, Mar 22, 2015 at 12:27 PM, Erick Erickson wrote: > Please review: http://wiki.apache.org/solr/UsingMailingLists > > You haven't qu

Re: schemaless slow indexing

2015-03-22 Thread Erick Erickson
Please review: http://wiki.apache.org/solr/UsingMailingLists You haven't quantified the slowdown. Or given any details on how you're measuring the "slowdown". Or how you've configured your setups in 4.10 and 5.0. Or... As Hossman would say "details matter". Best, Erick On Sun, Mar 22, 2015 at 8:

schemaless slow indexing

2015-03-22 Thread Mike Murphy
I'm trying out schemaless in solr 5.0, but the indexing seems quite a bit slower than it did in the past on 4.10. Any pointers? --Mike