Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] wrote:
Hi,
I'm trying to see if I can use termVectors for a use case I have. Essentially, what I want
to know is: where in the indexed value does the query hit occur? I think either tv.positions
or tv.offsets would provide that info, but I don't really grok the result. Below I've pasted
the URL and part of the result. What is <lst name="#1;00">? And why so many
offsets?
http://localhost:8080/solr/select?q=idxPartition:CONNECTED_ASSETS%20AND%20srcSpan:CR1434&rows=1&indent=on&qt=tvrh&tv.offsets=true&fl=srcSpan
<result name="response" numFound="1" start="0">
<doc>
<str name="srcSpan">|CR1434-Occ1|abcCR1434 is a token for searching with
WILDCI|testuser|System of
Registries|2010-01-12T23:00:00.000Z|2010-01-12T23:00:00.000Z|testuser|System of Registries</str>
</doc>
</result>
<lst name="termVectors">
<lst name="doc-960">
<str name="uniqueKey">f57488c1d041a1de5bd6a70b09428d119ed1de29</str>
<lst name="srcSpan">
<lst name="#1;00">
<lst name="offsets">
<int name="start">104</int>
<int name="end">106</int>
<int name="start">107</int>
<int name="end">109</int>
<int name="start">129</int>
<int name="end">131</int>
<int name="start">132</int>
<int name="end">134</int>
</lst>
</lst>
"#1;00" is a token that your <analyzer/> produced from the srcSpan field
value when you indexed the field. Each start/end pair under "offsets" is one
occurrence of that token, so it seems the token occurred four times in the
field.
If "#1;00" is an unexpected token, you should check the
<analyzer type="index"/> definition for the srcSpan field.
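
To make the pairing of offsets explicit, here is a minimal Python sketch that reads a termVectors fragment and groups the start/end values per token. The fragment is the one pasted above with the closing tags filled in by hand; the function name term_offsets is hypothetical, and the sketch assumes the response always lists each "start" immediately followed by its "end", as it does in this output.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the response above; closing tags were added by hand
# so that it parses as a standalone document.
SAMPLE = """
<lst name="termVectors">
  <lst name="doc-960">
    <str name="uniqueKey">f57488c1d041a1de5bd6a70b09428d119ed1de29</str>
    <lst name="srcSpan">
      <lst name="#1;00">
        <lst name="offsets">
          <int name="start">104</int>
          <int name="end">106</int>
          <int name="start">107</int>
          <int name="end">109</int>
          <int name="start">129</int>
          <int name="end">131</int>
          <int name="start">132</int>
          <int name="end">134</int>
        </lst>
      </lst>
    </lst>
  </lst>
</lst>
"""


def term_offsets(xml_text, field):
    """Return {token: [(start, end), ...]} for one field of a termVectors block.

    Hypothetical helper: assumes the root element is <lst name="termVectors">
    and that start/end values alternate in document order.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for doc in root.findall("lst"):            # one <lst> per document
        for fld in doc.findall("lst"):         # one <lst> per field
            if fld.get("name") != field:
                continue
            for term in fld.findall("lst"):    # one <lst> per token
                offsets = term.find("lst[@name='offsets']")
                if offsets is None:
                    continue
                starts = [int(e.text) for e in offsets.findall("int[@name='start']")]
                ends = [int(e.text) for e in offsets.findall("int[@name='end']")]
                # Pair the i-th start with the i-th end: one pair per occurrence.
                result.setdefault(term.get("name"), []).extend(zip(starts, ends))
    return result


if __name__ == "__main__":
    for token, pairs in term_offsets(SAMPLE, "srcSpan").items():
        print(repr(token), len(pairs), "occurrence(s):", pairs)
```

Each pair is a character range in the original field value, so slicing the stored srcSpan string with value[start:end] should show what text the token came from (assuming the end offset is exclusive, as is usual for Lucene offsets). That is a quick way to see which part of the input the analyzer turned into "#1;00".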
Koji
--
http://www.rondhuit.com/en/