As Erick was suggesting, add &debug=query (or &debugQuery=true) to your
Solr query request, and Solr will display more detail about the parsed
query.
For example, I see this on a query of an integer field:
curl "http://localhost:8983/solr/select/?q=++(i_i:123)+&debugQuery=true&indent=true"
...
<lst name="debug">
<str name="rawquerystring"> (i_i:123) </str>
<str name="querystring"> (i_i:123) </str>
<str name="parsedquery">i_i:123</str>
<str name="parsedquery_toString">i_i:`#8;#0;#0;#0;{</str>
The "parsedquery_toString" value comes from a plain Query.toString call,
which is the same call you are making in your logging code. But note that
"parsedquery" displays the source term, exactly as you expected. This is
because the Solr debug component uses a Solr utility method,
QueryParsing.toString, which is a hand-coded, schema-aware counterpart of
Query.toString. Query.toString itself is not schema-aware because it is a
Lucene method and Lucene has no concept of a schema.
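To make "schema-aware" concrete, here is a minimal sketch of the idea. It
does not use any Solr or Lucene classes: SchemaAwareTermPrinter, the
field-to-type map standing in for the schema, and the decodeInt helper are
all hypothetical names. The decoding assumes the full-precision int layout
(a 0x60 marker byte followed by five 7-bit groups), which is consistent with
the bytes shown in the debug output above. In real code you would call
QueryParsing.toString rather than anything like this.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustration only: prints field:term using a tiny stand-in for a schema. */
class SchemaAwareTermPrinter {
    // Hypothetical schema: field name -> type name ("int" or anything else).
    private final Map<String, String> fieldTypes;

    SchemaAwareTermPrinter(Map<String, String> fieldTypes) {
        this.fieldTypes = fieldTypes;
    }

    /** Decodes the indexed term if the "schema" says the field is an int. */
    String print(String field, byte[] indexedTerm) {
        String text = "int".equals(fieldTypes.get(field))
                ? Integer.toString(decodeInt(indexedTerm))
                : new String(indexedTerm, StandardCharsets.UTF_8);
        return field + ":" + text;
    }

    /** Assumed layout: marker byte 0x60, then five bytes of 7 bits each. */
    static int decodeInt(byte[] term) {
        if (term.length != 6 || term[0] != 0x60) {
            throw new IllegalArgumentException("not a full-precision int term");
        }
        int sortableBits = 0;
        for (int i = 1; i < term.length; i++) {
            sortableBits = (sortableBits << 7) | (term[i] & 0x7f);
        }
        return sortableBits ^ 0x80000000; // undo the sign-bit flip
    }

    public static void main(String[] args) {
        Map<String, String> schema = new HashMap<>();
        schema.put("i_i", "int");
        SchemaAwareTermPrinter printer = new SchemaAwareTermPrinter(schema);
        // Same bytes as the parsedquery_toString value in the debug output.
        byte[] term = {0x60, 8, 0, 0, 0, 0x7B};
        System.out.println(printer.print("i_i", term));
    }
}
```

Without the type lookup, the only option is to print the raw bytes, which is
what a schema-unaware toString ends up doing.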
-- Jack Krupansky
-----Original Message-----
From: Andrew Lundgren
Sent: Tuesday, March 19, 2013 12:08 PM
To: solr-user@lucene.apache.org
Subject: RE: Query.toString printing binary in the output...
This is perhaps more clear:
Assuming you have a schema where:
<field name="collection_id" type="integer" indexed="true" stored="false"
required="true" omitTermFreqAndPositions="true"/>
Then:
void testSamplePrint() throws IOException, SAXException,
        ParserConfigurationException {
    SolrConfig config = new SolrConfig("solrconfig.xml");
    IndexSchema schema = new IndexSchema(config, "schema.xml", null);
    TermQuery aTerm = new TermQuery(new Term("TestString", "123456"));
    TermQuery bTerm = new TermQuery(new Term("TestString",
            schema.getField("collection_id").getType().readableToIndexed("123456")));
    System.out.printf("%s\n", aTerm.toString());
    System.out.printf("%s\n", bTerm.toString());
    assertEquals(aTerm.toString(), bTerm.toString());
}
The test output is:
java.lang.AssertionError:
Expected :TestString:123456
Actual :TestString:`
I believe that this is because the Term does not know that it contains an
encoded integer, and thus cannot decode it. If the TermQuery knew the type,
it could decode the value when printing. But without consulting the schema,
I don't know how to get toString to produce readable output.
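For anyone following along, here is a small stand-alone sketch of why the
encoded term prints as a backquote followed by unprintable characters. It is
a simplified re-implementation modeled on my understanding of Lucene's prefix
coding of ints at full precision (shift 0), not the library code itself. The
class and method names are made up for illustration.

```java
import java.util.Arrays;

/** Illustration only: full-precision (shift 0) prefix coding of an int. */
class PrefixCodedIntDemo {
    // Marker byte for a full-precision int term; 0x60 is the '`' character.
    static final int SHIFT_START_INT = 0x60;

    static byte[] encode(int value) {
        byte[] out = new byte[6];
        out[0] = (byte) SHIFT_START_INT;
        int sortableBits = value ^ 0x80000000; // flip sign bit so terms sort numerically
        for (int i = 5; i >= 1; i--) {
            out[i] = (byte) (sortableBits & 0x7f); // 7 bits per byte
            sortableBits >>>= 7;
        }
        return out;
    }

    static int decode(byte[] term) {
        if (term.length != 6 || term[0] != SHIFT_START_INT) {
            throw new IllegalArgumentException("not a full-precision int term");
        }
        int sortableBits = 0;
        for (int i = 1; i < term.length; i++) {
            sortableBits = (sortableBits << 7) | (term[i] & 0x7f);
        }
        return sortableBits ^ 0x80000000;
    }

    /** Renders control bytes as #n; so the term can be shown in plain text. */
    static String toPrintable(byte[] term) {
        StringBuilder sb = new StringBuilder();
        for (byte b : term) {
            if (b < 0x20 || b == 0x7f) {
                sb.append('#').append(b).append(';');
            } else {
                sb.append((char) b);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] term = encode(123);
        System.out.println(Arrays.toString(term));
        System.out.println(toPrintable(term));
        System.out.println(decode(term));
    }
}
```

The term itself carries only a marker byte and the shifted bits. Nothing in
it names the field type, so the decision to decode has to come from
something that knows the schema.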
-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Monday, March 18, 2013 7:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...
If you simply attach &debug=all to your URL, you should see the query come
back in your response, XML, JSON, whatever. If that also shows bizarre
characters, then that will give you some idea whether it's in Solr or not.
But you haven't given us much info about how/where you call toString. You
may be getting into trouble with character sets (although I'd find that
quite odd, it's a possibility).
What I'm really finding confusing is that you're mentioning Term alongside
query.toString() (at least that's what I think you're saying), which has
nothing at all to do with Terms, it's just the query string passed in. So
I'm really puzzled as to what you're doing to get this kind of output, it
almost looks like you're trying to print out the _results_ of a query, not
the query.
So some clarification would be helpful...
Best
Erick
On Mon, Mar 18, 2013 at 12:01 PM, Andrew Lundgren
<lundg...@familysearch.org> wrote:
I am sorry, I don't follow what you mean by debug=query. Can you
elaborate on that a bit?
Thanks!
-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Sunday, March 17, 2013 8:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...
Hmmm, without looking at the code, somehow when you specify
debug=query you get readable results, maybe that code would be a place to
start?
And are you looking for the parsed output? Otherwise you could print the
original query.
Not much help....
Erick
On Fri, Mar 15, 2013 at 3:24 PM, Andrew Lundgren
<lundg...@familysearch.org> wrote:
> We use the toString call on the query in our logs. For some numeric
> types, the encoded form of the number is being printed instead of
> the readable form.
>
> This makes tail and some other tools very unhappy...
>
> Here is a partial example of a query.toString() that would have had
> binary in it. As a short term work around I replaced all
> non-printable characters in the string with an '_'.
>
> (collection_id:`__z_[^0.027 collection_id:`__nB+^0.026
> collection_id:`__Zl_^0.025 collection_id:`__i49^0.024
> collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022
> collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02
> collection_id:`__UF&^0.019 collection_id:`__I2g^0.018
> collection_id:`__PP_^0.016999999 collection_id:`__Ysv^0.015999999
> collection_id:`__Oe_^0.014999999 collection_id:`__Ysw^0.013999999
> collection_id:`__Wi_^0.012999998 collection_id:`__fLi^0.011999998
> collection_id:`__XRk^0.010999998 collection_id:`__Uz[^0.009999998
> collection_id:`__SE_^0.008999998 collection_id:`__Ysx^0.007999998
> collection_id:`__Ysh^0.0069999974 collection_id:`__fLh^0.0059999973
> collection_id:`__f _^0.004999997 collection_id:`__`^C^0.003999997
> collection_id:`__fKM^0.002999997 collection_id:`__Szo^0.001999997
> collection_id:`__f ]^9.99997E-4)
>
> But, as you can see, that is less than useful...
>
> I spent some time looking at the source and found that Term does not
> contain the type of the embedded data. Any possible solutions to
> this short of walking the query and getting the type of each field
> from the schema and creating my own print function?
>
> Thanks!
>
> --
> Andrew
>
>
>
>
> NOTICE: This email message is for the sole use of the intended
> recipient(s) and may contain confidential and privileged information.
> Any unauthorized review, use, disclosure or distribution is
> prohibited. If you are not the intended recipient, please contact
> the sender by reply email and destroy all copies of the original
> message.
>
>