I've been a bit snowed under, but I've found the difference is because the
_default config has the dynamic schema building in it, which I assume is
pushing it down a different code path.

  <updateRequestProcessorChain name="add-unknown-fields-to-the-schema"
      default="${update.autoCreateFields:true}"
      processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">

I'm using the vanilla Solr 8.3.0 binary (8.3.0
2aa586909b911e66e1d8863aa89f173d69f86cd2 - ishan - 2019-10-25 23:15:22) with
Eclipse OpenJ9 Eclipse OpenJ9 VM 1.8.0_232 openj9-0.17.0,
and I've checked with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM
1.8.0_191 25.191-b12 as well.

I've put a testcase and configsets in Google Drive:
https://drive.google.com/open?id=1ibKNWvowT8cXTwSa3bcTwKYLSRNur86U
The configsets are a copy of the _default configset, except the "problem"
configset has autoCreateFields set to false.
I created a collection with 4 shards, replication factor 1 for each
configset. The test case reliably fails on the "problem" collection and
reliably passes against the "no_problem" collection.

The test (well it's not actually a @Test but still) has static data (though
it was originally generated randomly). The data is a bit mad... but it was
easier to reproduce the problem reliably with this data, than with the
normal documents we use in our product.
Each document has a different (dynamically named) field to index data into,
but it's the same data in each field.
The problem only appears (or is probably just more likely to appear) when
the field names in the request are of different lengths.
The length and value of the data don't appear to matter, or at least matter
less than variations in the field names.
*If you run the test 10 times you will see a variety of different errors.
i.e. it's not the same error every time.*
I've included some examples of the errors in the Drive folder. One of the
most fundamental (which probably points at the root cause) is this:

2019-11-21 17:02:53.720 ERROR (updateExecutor-3-thread-6-processing-x:problem_collection_shard2_replica_n2 r:core_node5 null n:10.0.75.1:8983_solr c:problem_collection s:shard2) [c:problem_collection s:shard2 r:core_node5 x:problem_collection_shard2_replica_n2] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=ForwardNode: http://10.0.75.1:8983/solr/problem_collection_shard3_replica_n4/ to http://10.0.75.1:8983/solr/problem_collection_shard3_replica_n4/ => java.lang.StringIndexOutOfBoundsException
    at java.lang.String.<init>(String.java:668)
java.lang.StringIndexOutOfBoundsException: null
    at java.lang.String.<init>(String.java:668) ~[?:1.8.0_232]
    at org.noggit.CharArr.toString(CharArr.java:182) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.lambda$getStringProvider$1(JavaBinCodec.java:966) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec$$Lambda$668.0000000000000000.apply(Unknown Source) ~[?:?]
    at org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr(ByteArrayUtf8CharSequence.java:156) ~[?:?]
    at org.apache.solr.common.util.ByteArrayUtf8CharSequence.toString(ByteArrayUtf8CharSequence.java:235) ~[?:?]
    at org.apache.solr.common.util.ByteArrayUtf8CharSequence.convertCharSeq(ByteArrayUtf8CharSequence.java:215) ~[?:?]
    at org.apache.solr.common.SolrInputField.getValue(SolrInputField.java:128) ~[?:?]
    at org.apache.solr.common.SolrInputDocument.lambda$writeMap$0(SolrInputDocument.java:55) ~[?:?]
    at org.apache.solr.common.SolrInputDocument$$Lambda$743.000000002774E7B0.accept(Unknown Source) ~[?:?]
    at java.util.LinkedHashMap.forEach(LinkedHashMap.java:684) ~[?:1.8.0_232]
    at org.apache.solr.common.SolrInputDocument.writeMap(SolrInputDocument.java:59) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:658) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:383) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeMapEntry(JavaBinCodec.java:813) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:411) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:750) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:395) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:248) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:355) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
    at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:167) ~[?:?]
    at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:102) ~[?:?]
    at org.apache.solr.client.solrj.impl.BinaryRequestWriter.write(BinaryRequestWriter.java:83) ~[?:?]
    at org.apache.solr.client.solrj.impl.Http2SolrClient.send(Http2SolrClient.java:340) ~[?:?]
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient$Runner.sendUpdateStream(ConcurrentUpdateHttp2SolrClient.java:231) ~[?:?]

And

java.lang.StringIndexOutOfBoundsException: String index out of range: 39
    at java.lang.String.<init>(String.java:205)
    at org.noggit.CharArr.toString(CharArr.java:182)
    at org.apache.solr.common.util.JavaBinCodec._readStr(JavaBinCodec.java:929)
    at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:918)
    at org.apache.solr.common.util.JavaBinCodec.readExternString(JavaBinCodec.java:1194)
    at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:303)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:281)
    at org.apache.solr.common.util.JavaBinCodec.readSolrInputDocument(JavaBinCodec.java:625)
    at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:340)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:281)
    at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:321)

*...*

Sometimes the indexing will succeed because of the nature of the dynamic
field, but retrieving the documents shows that the field names have been
corrupted:

Exception in thread "main" java.lang.IllegalStateException: Doc 224 does
not have field *name_wmJmiiWghggUHmNiQAg_prop_s* it has [id,
*SomebodymiiWghggUHmNiQAg_prop_s*, _version_]

The corrupted name is a concatenation of the data value "Somebody" from some
record and the tail (*miiWghggUHmNiQAg_prop_s*) of the actual field name
name_wmJmiiWghggUHmNiQAg_prop_s.
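
To show why that corrupted name looks to me like a reused character buffer,
here is a minimal plain-Java sketch. It is only a guess at the mechanism and
is not Solr's code: the class, the shared `buf` array and the `write` helper
are all hypothetical, and the "interleaving" is simulated on a single thread
so the result is deterministic. Note that "Somebody" and the overwritten
prefix "name_wmJ" are both 8 characters long.

```java
public class SharedBufferSketch {
    // Hypothetical stand-in for a reusable scratch buffer shared by two writers.
    static final char[] buf = new char[64];

    // Copy s into the start of the shared buffer and return its length.
    static int write(String s) {
        s.getChars(0, s.length(), buf, 0);
        return s.length();
    }

    static String corruptedFieldName() {
        // Writer A stores a field name and remembers its length.
        int nameLen = write("name_wmJmiiWghggUHmNiQAg_prop_s");
        // Writer B reuses the same buffer for a (shorter) data value
        // before A has turned its characters into a String.
        write("Somebody");
        // Writer A now builds its String using its own length,
        // picking up B's characters at the front.
        return new String(buf, 0, nameLen);
    }

    public static void main(String[] args) {
        // prints "SomebodymiiWghggUHmNiQAg_prop_s"
        System.out.println(corruptedFieldName());
    }
}
```

The result matches the field name reported for Doc 224, which is why I
suspect something along these lines, but I haven't confirmed it against the
Solr source.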

On Thu, 21 Nov 2019 at 13:16, Jason Gerlowski <gerlowsk...@gmail.com> wrote:

> Very curious what the config change that's related to reproducing this
> looks like.  Maybe it's something that is worth adding
> test-randomization around?  Just thinking aloud.
>
