Ok. I have revisited this issue as deeply as possible using simplistic unit tests, tossing out indexes, and starting fresh.
A typical Solr document might have a label, e.g. the string inside the quotes: "Node Type". According to what I've been able to read, that would be queried as a phrase query, which means including the quotes around the text. When I use the admin query panel with this query:

    label:"Node Type"

a fragment of the full document is returned. It is this:

    <doc>
      <str name="locator">NodeType</str>
      <arr name="label">
        <str>Node Type</str>
      </arr>

In my code using SolrJ, I have two println statements: one prints the "escaped" query string just as it comes in, and one shows what the SolrQuery looks like after it has been set up to go online. I then show what came back:

    Solr3Client.runQuery- label:"Node Type" 0 10
    Solr3Client.runQuery-1 q=label%3A%22Node+Type%22&start=0&rows=10
    ZZZZ {numFound=1,start=0,docs=[SolrDocument{locator=NodeType, smallIcon=cogwheel.png, subOf=ClassType, details=The TopicQuests typology node type., isPrivate=false, creatorId=SystemUser, label=Node Type, largeIcon=cogwheel.png, lastEditDate=Sat Feb 23 20:43:22 PST 2013, createdDate=Sat Feb 23 20:43:22 PST 2013, _version_=1427826019119661056}]}

What that says is that SolrQuery inserted a + inside the query string, and that it found 1 document, but did not return it.

In the larger picture, I have returned to using XMLResponseParser on the theory that I will now be able to take advantage of partialUpdates on multi-valued fields (List<String>), but I haven't tested that yet. I am not yet escaping such things as "<" or ">"; I am only escaping the characters that the Solr documentation lists as reserved.

So, the current update is this: learning about phrase queries and judiciously escaping reserved characters seem to be helping. Next up are two issues: more robust testing of escaped characters, and discovering the best approach to dealing with characters that must be escaped to get past XML, e.g. '<', '>', and others.
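For what it's worth, here is a minimal sketch of the kind of escaping I mean, using only the JDK so it stands alone. The class and method names (QueryEscapeSketch, escapeQueryChars, phraseQuery, urlEncode, escapeXml) are made up for illustration and are not the code in Solr3Client. The reserved-character list is my reading of the Lucene query parser syntax and is an assumption that may need adjusting; SolrJ's ClientUtils.escapeQueryChars does a similar job.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryEscapeSketch {

    // Characters the Lucene query parser treats as syntax.
    // Assumption: this list is illustrative, not authoritative.
    private static final String RESERVED = "\\+-!():^[]\"{}~*?|&/";

    /** Backslash-escape every reserved character in a raw field value. */
    public static String escapeQueryChars(String raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (RESERVED.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Build field:"value" so that whitespace stays inside one phrase clause. */
    public static String phraseQuery(String field, String value) {
        return field + ":\"" + escapeQueryChars(value) + "\"";
    }

    /** Form-encode a query string the way it travels in the request URL. */
    public static String urlEncode(String query) {
        try {
            return URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Escape the characters that would otherwise be read as XML markup. */
    public static String escapeXml(String raw) {
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String q = phraseQuery("label", "Node Type");
        System.out.println(q);                     // label:"Node Type"
        System.out.println("q=" + urlEncode(q));   // q=label%3A%22Node+Type%22
        System.out.println(escapeXml("<Something to use as a source node>"));
    }
}
```

The second line printed has the same q= value as the Solr3Client.runQuery-1 line above: in form encoding a space is written as +, so the + is the encoded space rather than an extra query operator.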
Many thanks
Jack

On Fri, Feb 22, 2013 at 2:44 PM, Jack Park <jackp...@topicquests.org> wrote:
> Michael,
> I don't think you misunderstood. I will soon give a full response here, but
> am on the road at the moment.
>
> Many thanks
> Jack
>
>
> On Friday, February 22, 2013, Michael Della Bitta
> <michael.della.bi...@appinions.com> wrote:
>> My mistake, I misunderstood the problem.
>>
>> Michael Della Bitta
>>
>> ------------------------------------------------
>> Appinions
>> 18 East 41st Street, 2nd Floor
>> New York, NY 10017-6271
>>
>> www.appinions.com
>>
>> Where Influence Isn’t a Game
>>
>>
>> On Fri, Feb 22, 2013 at 3:55 PM, Chris Hostetter
>> <hossman_luc...@fucit.org> wrote:
>>>
>>> : If you're submitting documents as XML, you're always going to have to
>>> : escape meaningful XML characters going in. If you ask for them back as
>>> : XML, you should be prepared to unescape special XML characters as
>>>
>>> that still wouldn't explain the discrepancy he's claiming to see between
>>> the json & xml responses (the json containing an empty string)
>>>
>>> Jack: please elaborate with specifics about your solr version, field,
>>> field type, how you indexed your doc, and what the request urls & raw
>>> responses that you get are (ie: don't trust the XML you see in your
>>> browser, it may be unescaping escaped sequences in element text to be
>>> "helpful" .. use something like curl)
>>>
>>> For example...
>>>
>>> ----BEGIN GOOD EXAMPLE OF SPECIFICS---
>>>
>>> I'm using Solr 4.x with the 4.x example schema which has the following
>>> field...
>>>
>>> <field name="cat" type="string" indexed="true" stored="true"
>>> multiValued="true"/>
>>> <fieldType name="string" class="solr.StrField" sortMissingLast="true"
>>> />
>>>
>>> I indexed a doc like this...
>>>
>>> $ curl "http://localhost:8983/solr/update?commit=true" -H
>>> 'Content-type:application/json' -d '[{"id":"hoss", "cat":"<Something to use
>>> as a source node>" } ]'
>>>
>>> And this is what i get from the following requests...
>>>
>>> $ curl
>>> "http://localhost:8983/solr/select?q=id:hoss&wt=xml&indent=true&omitHeader=true"
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <response>
>>>
>>> <result name="response" numFound="1" start="0">
>>>   <doc>
>>>     <str name="id">hoss</str>
>>>     <arr name="cat">
>>>       <str><Something to use as a source node></str>
>>>     </arr>
>>>     <long name="_version_">1427705631375097856</long></doc>
>>> </result>
>>> </response>
>>>
>>> $ curl
>>> "http://localhost:8983/solr/select?q=id:hoss&wt=json&indent=true&omitHeader=true"
>>> {
>>>   "response":{"numFound":1,"start":0,"docs":[
>>>       {
>>>         "id":"hoss",
>>>         "cat":["<Something to use as a source node>"],
>>>         "_version_":1427705631375097856}]
>>>   }}
>>>
>>> $ curl
>>> "http://localhost:8983/solr/select?q=cat:%22<Something+to+use+as+a+source+node>%22&wt=json&indent=true&omitHeader=true"
>>> {
>>>   "response":{"numFound":1,"start":0,"docs":[
>>>       {
>>>         "id":"hoss",
>>>         "cat":["<Something to use as a source node>"],
>>>         "_version_":1427705631375097856}]
>>>   }}
>>>
>>> ----END GOOD EXAMPLE OF SPECIFICS---
>>>
>>> : > Even more curious, if I use this query at the console:
>>> : >
>>> : > details:<Something to use as a source node>
>>> : >
>>> : > I get nothing back.
>>>
>>> note in my last example above the importance of using quotes (or the
>>> {!term} qparser) to query string fields that contain special characters
>>> like whitespace -- whitespace is syntactically meaningful to the lucene query
>>> parser, it separates clauses of a boolean query.
>>>
>>>
>>> -Hoss
>>