Further experiments:

-- updated the schema to account for multiple values: 

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-dynamic-field":{
     "name":"*type_s",
     "type":"string",
     "indexed":true, 
     "multiValued":true
 }
}' http://localhost:8985/solr/my_collection/schema

-- Re-ran indexing:
solr-5.5.0$ bin/post -c my_collection ../../data/data-solr.json -p 8985
java -classpath /Users/<omitted>/solr-5.5.0/dist/solr-core-5.5.0.jar -Dauto=yes 
-Dport=8985 -Dc=enron_path_w_ts -Ddata=files 
org.apache.solr.util.SimplePostTool ../../data/data-solr.json
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8985/solr/my_collection/update...
Entering auto mode. File endings considered are 
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file data-solr-path-w-ts-suffix.json (application/json) to 
[base]/json/docs
SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: 
http://localhost:8985/solr/my_collection/update/json/docs
SimplePostTool: WARNING: Response: 
{"responseHeader":{"status":400,"QTime":12},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"ERROR:
 [doc=AVNzOoBsX6g-H6sC3dgo] multiple values encountered for non multiValued 
field  _childDocuments_._childDocuments_._childDocuments_.relevance_tf: 
[0.918377, 0.737646, 0.700964, 0.659539, 0.657294, 0.62809, 0.612241, 0.609963, 
0.873428, 0.764, 0.763825, 0.552016, 0.472819, 0.30331, 0.292935, 0.285799, 
0.278851, 0.936158, 0.790093, 0.722639, 0.649841, 0.576905, 0.570454, 0.445547, 
0.429439, 0.410347, 0.391091, 0.293075, 0.253883, 0.252494, 0.250084, 0.242866, 
0.24142, 0.239883, 0.239827, 0.239563, 0.239507, 0.238434, 0.238193, 0.237804, 
0.237769, 0.237022, 0.236955, 0.2364, 0.236164, 0.236129, 0.236025, 
0.235973]","code":400}}
SimplePostTool: WARNING: IOException while reading response: 
java.io.IOException: Server returned HTTP response code: 400 for URL: 
http://localhost:8985/solr/my_collection/update/json/docs
1 files indexed.
COMMITting Solr index changes to 
http://localhost:8985/solr/my_collection/update...
Time spent: 0:00:05.137

So now it dumps all the values of relevance_tf into one array, disregarding 
which nested document each value actually belonged to. It does not seem to 
handle a hierarchy whose branches are of different types properly.  :(  
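
To illustrate what the error messages suggest is happening, here is a minimal Python sketch. It is not Solr code and makes no claim about Solr's internals. It only assumes that nested fields get collected under their dotted path with the array position ignored, which is what field names like _childDocuments_._childDocuments_.type_s in the errors look like. The function name flatten_by_path and the reduced sample are mine, made up for illustration.

```python
from collections import defaultdict


def flatten_by_path(doc, prefix=""):
    """Collect every scalar value under its dotted path, ignoring array positions.

    Assumption: this mimics the flattening implied by the error message;
    it is an illustration only, not Solr's actual implementation.
    """
    values = defaultdict(list)
    for key, value in doc.items():
        path = prefix + key
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Siblings share the same path, so their fields get merged.
                for sub_path, sub_values in flatten_by_path(item, path + ".").items():
                    values[sub_path].extend(sub_values)
            else:
                values[path].append(item)
    return values


# Reduced version of the sample document: two branches of different types.
sample = {
    "type_s": "doc",
    "_childDocuments_": [
        {"type_s": "doc.userData",
         "_childDocuments_": [{"type_s": "doc.userData.parts"}]},
        {"type_s": "doc.enriched",
         "_childDocuments_": [{"type_s": "doc.enriched.text"}]},
    ],
}

flat = flatten_by_path(sample)
# Paths that end up holding more than one value, i.e. would need multiValued.
collisions = {path: vals for path, vals in flat.items() if len(vals) > 1}
for path, vals in collisions.items():
    print(path, vals)
```

Under that assumption, the values of doc.userData.parts and doc.enriched.text land in the same field, which matches the pair reported in the first error. Making type_s multiValued only moves the collision on to the next field, such as relevance_tf.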

-- Alisa 


>Friday, 25 March 2016, 18:19 -04:00 from Alisa Z. <prol...@mail.ru>:
>
>Hi all, 
>It is partially a question, partially a discussion. 
>I am working with documents with deep levels of nesting. The documents are in 
>a single JSON file (see a sample below).
>
>When I was on Solr 5.3.1, 
>solr-5.3.1$ bin/post -c my_collection ../data/data-solr.json
>caused no problems.
>
>Now, I am trying to run exactly the same command on Solr 5.5.0: 
>
>solr-5.5.0$ bin/post -c my_collection ../data/data-solr.json
>java -classpath /Users/<omitted>/solr-5.5.0/dist/solr-core-5.5.0.jar 
>-Dauto=yes -Dc=enron_path_w_ts -Ddata=files 
>org.apache.solr.util.SimplePostTool ../data/data-solr.json
>SimplePostTool version 5.0.0
>Posting files to [base] url  http://localhost:8983/solr/my_collection/update 
>...
>Entering auto mode. File endings considered are 
>xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
>POSTing file data-solr.json (application/json) to [base]/json/docs
>SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: 
>http://localhost:8983/solr/my_collection/update/json/docs
>SimplePostTool: WARNING: Response: 
>{"responseHeader":{"status":400,"QTime":5},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"ERROR:
> [doc=AVNzOoBsX6g-H6sC3dgo] multiple values encountered for non multiValued 
>field _childDocuments_._childDocuments_.type_s: [doc.userData.parts, 
>doc.enriched.text]","code":400}}
>SimplePostTool: WARNING: IOException while reading response: 
>java.io.IOException: Server returned HTTP response code: 400 for URL: 
>http://localhost:8983/solr/my_collection/json/docs
>1 files indexed.
>COMMITting Solr index changes to  
>http://localhost:8983/solr/my_collection/update ..  .
>Time spent: 0:00:05.078
>
>So obviously I don't get my collection uploaded and indexed properly anymore.  
> 
>
>The question is: 
> - What to do?  
>
>The discussion is: 
>- Is this the intended behavior?  It used to work smoothly on Solr 5.3.1: I 
>did not need to know exactly how many levels of nesting I have, or to specify 
>whether the _childDocuments_ were of the same type or not. 
> 
>
>A partial sample follows: 
>
>[
>    {
>        "type_s": "doc",
>        "_childDocuments_": [
>            {
>                "type_s": "doc.userData",
>                "id": "AVNzOoBsX6g-H6sC3dgo-userData-23461",
>                "Mime-Version_t": "1.0",
>                "_childDocuments_": [
>                    {
>                        "type_s": "doc.userData.parts",
>                        "content_t": "----- SOMETEXT",
>                        "id": "AVNzOoBsX6g-H6sC3dgo-parts-15557",
>                        "contentType_t": "text/plain"
>                    }
>                ],
>                "Content-Transfer-Encoding_t": "7bit"
>            },
>            {
>                "type_s": "doc.enriched",
>                "_childDocuments_": [
>                    {
>                        "type_s": "doc.enriched.text",
>                        "language_t": "english",
>                        "_childDocuments_": [
>                            {
>                                "type_s": "doc.enriched.text.docSentiment",
>                                "id": "AVNzOoBsX6g-H6sC3dgo-docSentiment-17692",
>                                "type_t": "positive"
>                            },
>                            {
>                                "type_s": "doc.enriched.text.taxonomy",
>                                "label_t": "/business",
>                                "id": "AVNzOoBsX6g-H6sC3dgo-taxonomy-12728"
>                            },
>                            {
>                                "type_s": "doc.enriched.text.concepts",
>                                "id": "AVNzOoBsX6g-H6sC3dgo-concepts-98530",
>                                "text_t": "Stephen",
>                                "_childDocuments_": [
>                                    {
>                                        "type_s": "doc.enriched.text.concepts.knowledgeGraph",
>                                        "id": "AVNzOoBsX6g-H6sC3dgo-knowledgeGraph-20811",
>                                        "typeHierarchy_t": "/people/children/stephen"
>                                    }
>                                ]
>                            },
>                            {
>                                "type_s": "doc.enriched.text.concepts",
>                                "id": "AVNzOoBsX6g-H6sC3dgo-concepts-12396",
>                                "text_t": "Thought",
>                                "_childDocuments_": [
>                                    {
>                                        "type_s": "doc.enriched.text.concepts.knowledgeGraph",
>                                        "id": "AVNzOoBsX6g-H6sC3dgo-knowledgeGraph-20316",
>                                        "typeHierarchy_t": "/people/ideas/thought"
>                                    }
>                                ]
>                            },
>                            ...
>                          }]
>     },
>{"type_s": "doc", ....
>},
>...
>]
>
>
>Thank you for your consideration,
>-- 
>Alisa Zhila
