Further experiments:
-- Updated the schema so that the type field accepts multiple values:
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-dynamic-field":{
"name":"*type_s",
"type":"string",
"indexed":true,
"multiValued":true
}
}' http://localhost:8985/solr/my_collection/schema
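To double-check that the schema change took effect, the dynamic field can be read back through the Schema API. A minimal Python sketch, assuming the same host, port, and collection as the curl call above. The names `find_dynamic_field`, `fetch_dynamic_fields`, and `example_response` are illustrative, and the example response is a hand-written stand-in that mirrors the payload I sent, not real server output:

```python
import json
from urllib.request import urlopen

# Assumed to match the curl call above; adjust host, port, and collection as needed.
SCHEMA_URL = "http://localhost:8985/solr/my_collection/schema/dynamicfields"


def fetch_dynamic_fields(url=SCHEMA_URL):
    """GET the list of dynamic fields from the Schema API (needs a running Solr)."""
    with urlopen(url) as resp:
        return json.load(resp)


def find_dynamic_field(response, name):
    """Return the dynamic field definition with the given name, or None."""
    for field in response.get("dynamicFields", []):
        if field.get("name") == name:
            return field
    return None


# Hand-written stand-in for a server response (an assumption, not captured output).
example_response = {
    "dynamicFields": [
        {"name": "*type_s", "type": "string", "indexed": True, "multiValued": True}
    ]
}

field = find_dynamic_field(example_response, "*type_s")
print(field["multiValued"])
```

Against a live instance, `find_dynamic_field(fetch_dynamic_fields(), "*type_s")` would be the call to make.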
-- Re-ran indexing:
solr-5.5.0$ bin/post -c my_collection ../../data/data-solr.json -p 8985
java -classpath /Users/<omitted>/solr-5.5.0/dist/solr-core-5.5.0.jar -Dauto=yes
-Dport=8985 -Dc=enron_path_w_ts -Ddata=files
org.apache.solr.util.SimplePostTool ../../data/data-solr.json
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8985/solr/my_collection/update...
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file data-solr-path-w-ts-suffix.json (application/json) to
[base]/json/docs
SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url:
http://localhost:8985/solr/my_collection/update/json/docs
SimplePostTool: WARNING: Response:
{"responseHeader":{"status":400,"QTime":12},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"ERROR:
[doc=AVNzOoBsX6g-H6sC3dgo] multiple values encountered for non multiValued
field _childDocuments_._childDocuments_._childDocuments_.relevance_tf:
[0.918377, 0.737646, 0.700964, 0.659539, 0.657294, 0.62809, 0.612241, 0.609963,
0.873428, 0.764, 0.763825, 0.552016, 0.472819, 0.30331, 0.292935, 0.285799,
0.278851, 0.936158, 0.790093, 0.722639, 0.649841, 0.576905, 0.570454, 0.445547,
0.429439, 0.410347, 0.391091, 0.293075, 0.253883, 0.252494, 0.250084, 0.242866,
0.24142, 0.239883, 0.239827, 0.239563, 0.239507, 0.238434, 0.238193, 0.237804,
0.237769, 0.237022, 0.236955, 0.2364, 0.236164, 0.236129, 0.236025,
0.235973]","code":400}}
SimplePostTool: WARNING: IOException while reading response:
java.io.IOException: Server returned HTTP response code: 400 for URL:
http://localhost:8985/solr/my_collection/update/json/docs
1 files indexed.
COMMITting Solr index changes to
http://localhost:8985/solr/my_collection/update...
Time spent: 0:00:05.137
So now it dumps all the values of relevance_tf into one array, disregarding
which type of nested document each value actually belonged to. It really does
not seem to handle hierarchies whose branches are of different types. :(
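To make the suspicion concrete: this error would follow if values were keyed only by their `_childDocuments_` depth path rather than by the branch they sit in. The Python sketch below is not Solr's code, just a toy flattener written under that assumption (`flatten` and `sample` are hypothetical names), run on a trimmed version of the sample from my original message:

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Collect leaf values under a dotted path, ignoring which branch they came from."""
    merged = defaultdict(list)
    for key, value in doc.items():
        path = prefix + key
        if key == "_childDocuments_":
            # Every child at this depth shares the same path prefix,
            # so values from sibling branches end up in the same list.
            for child in value:
                for child_path, child_values in flatten(child, path + ".").items():
                    merged[child_path].extend(child_values)
        else:
            merged[path].append(value)
    return dict(merged)


# Trimmed version of the sample document: two branches of different types.
sample = {
    "type_s": "doc",
    "_childDocuments_": [
        {
            "type_s": "doc.userData",
            "_childDocuments_": [{"type_s": "doc.userData.parts"}],
        },
        {
            "type_s": "doc.enriched",
            "_childDocuments_": [{"type_s": "doc.enriched.text"}],
        },
    ],
}

flat = flatten(sample)
print(flat["_childDocuments_._childDocuments_.type_s"])
```

It prints `['doc.userData.parts', 'doc.enriched.text']`, the same pair of values reported in the original type_s error.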
-- Alisa
>On Friday, 25 March 2016, 18:19 -04:00, Alisa Z. <[email protected]> wrote:
>
>Hi all,
>This is partly a question and partly a discussion.
>I am working with documents with deep levels of nesting. The documents are in
>a single JSON file (see a sample below).
>
>When I was on Solr 5.3.1,
>solr-5.3.1$ bin/post -c my_collection ../data/data-solr.json
>caused no problems.
>
>Now I am trying to run exactly the same command on Solr 5.5.0:
>
>solr-5.5.0$ bin/post -c my_collection ../data/data-solr.json
>java -classpath /Users/<omitted>/solr-5.5.0/dist/solr-core-5.5.0.jar
>-Dauto=yes -Dc=enron_path_w_ts -Ddata=files
>org.apache.solr.util.SimplePostTool ../data/data-solr.json
>SimplePostTool version 5.0.0
>Posting files to [base] url http://localhost:8983/solr/my_collection/update
>...
>Entering auto mode. File endings considered are
>xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
>POSTing file data-solr.json (application/json) to [base]/json/docs
>SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url:
>http://localhost:8983/solr/my_collection/update/json/docs
>SimplePostTool: WARNING: Response:
>{"responseHeader":{"status":400,"QTime":5},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"ERROR:
> [doc=AVNzOoBsX6g-H6sC3dgo] multiple values encountered for non multiValued
>field _childDocuments_._childDocuments_.type_s: [doc.userData.parts,
>doc.enriched.text]","code":400}}
>SimplePostTool: WARNING: IOException while reading response:
>java.io.IOException: Server returned HTTP response code: 400 for URL:
>http://localhost:8983/solr/my_collection/json/docs
>1 files indexed.
>COMMITting Solr index changes to
>http://localhost:8983/solr/my_collection/update...
>Time spent: 0:00:05.078
>
>So evidently my collection no longer gets uploaded and indexed properly.
>
>
>The question is:
> - What to do?
>
>The discussion is:
>- Is this the intended behavior? It used to be smooth on Solr 5.3.1: I did not
>need to know exactly how many levels of nesting I have, or to specify whether
>the _childDocuments_ were of the same type or not.
>
>
>A partial sample follows:
>
>[
> {
> "type_s": "doc",
> "_childDocuments_": [
> {
> "type_s": "doc.userData",
> "id": "AVNzOoBsX6g-H6sC3dgo-userData-23461",
> "Mime-Version_t": "1.0",
> "_childDocuments_": [
> {
> "type_s": "doc.userData.parts",
> "content_t": "----- SOMETEXT",
> "id": "AVNzOoBsX6g-H6sC3dgo-parts-15557",
> "contentType_t": "text/plain"
> }
> ],
> "Content-Transfer-Encoding_t": "7bit"
> },
> {
> "type_s": "doc.enriched",
> "_childDocuments_": [
> {
> "type_s": "doc.enriched.text",
> "language_t": "english",
> "_childDocuments_": [
> {
> "type_s": "doc.enriched.text.docSentiment",
> "id": "AVNzOoBsX6g-H6sC3dgo-docSentiment-17692",
> "type_t": "positive"
> },
> {
> "type_s": "doc.enriched.text.taxonomy",
> "label_t": "/business",
> "id": "AVNzOoBsX6g-H6sC3dgo-taxonomy-12728"
> },
> {
> "type_s": "doc.enriched.text.concepts",
> "id": "AVNzOoBsX6g-H6sC3dgo-concepts-98530",
> "text_t": "Stephen",
> "_childDocuments_": [
> {
> "type_s": "doc.enriched.text.concepts.knowledgeGraph",
> "id": "AVNzOoBsX6g-H6sC3dgo-knowledgeGraph-20811",
> "typeHierarchy_t": "/people/children/stephen"
> }
> ]
> },
> {
> "type_s": "doc.enriched.text.concepts",
> "id": "AVNzOoBsX6g-H6sC3dgo-concepts-12396",
> "text_t": "Thought",
> "_childDocuments_": [
> {
> "type_s": "doc.enriched.text.concepts.knowledgeGraph",
> "id": "AVNzOoBsX6g-H6sC3dgo-knowledgeGraph-20316",
> "typeHierarchy_t": "/people/ideas/thought"
> }
> ]
> },
> ...
> }]
> },
>{"type_s": "doc", ....
>},
>...
>]
>
>
>Thank you for your consideration,
>--
>Alisa Zhila