Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] wrote:
Hi,
I'm trying to see if I can use termVectors for a use case I have.  Essentially, what I want to 
know is: where in the indexed value does the query hit occur?  I think either tv.positions 
or tv.offsets would provide that info, but I don't really grok the result.  Below I've pasted 
the URL and part of the result.  What is <lst name="#1;00">?  And why so many 
offsets?

http://localhost:8080/solr/select?q=idxPartition:CONNECTED_ASSETS%20AND%20srcSpan:CR1434&rows=1&indent=on&qt=tvrh&tv.offsets=true&fl=srcSpan

<result name="response" numFound="1" start="0">
<doc>
<str name="srcSpan">|CR1434-Occ1|abcCR1434 is a token for searching with 
WILDCI|testuser|System of 
Registries|2010-01-12T23:00:00.000Z|2010-01-12T23:00:00.000Z|testuser|System of Registries</str>
</doc>
</result>
<lst name="termVectors">
<lst name="doc-960">
<str name="uniqueKey">f57488c1d041a1de5bd6a70b09428d119ed1de29</str>
<lst name="srcSpan">
<lst name="#1;00">
<lst name="offsets">
<int name="start">104</int>
<int name="end">106</int>
<int name="start">107</int>
<int name="end">109</int>
<int name="start">129</int>
<int name="end">131</int>
<int name="start">132</int>
<int name="end">134</int>
</lst>
</lst>

"#1;00" is a token that your <analyzer/> produced from the srcSpan field value when you indexed the field. It seems the token occurred four times in the field, which is why there are four start/end offset pairs. If "#1;00" is an unexpected token, you should check the <analyzer type="index"/>
definition for the srcSpan field.
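
To see which part of the stored value each offset pair points at, a minimal Python sketch along these lines may help. It makes some assumptions: the XML fragment is the one pasted above with its closing tags added by hand, the field value is rejoined from the wrapped lines of the mail with single spaces, and the names TV_XML, FIELD_VALUE and offsets_by_term are made up for illustration. Lucene offsets are character positions where start is inclusive and end is exclusive, so a plain slice recovers the source text.

```python
import xml.etree.ElementTree as ET

# Fragment of the termVectors response quoted above, closed by hand so that
# it parses on its own (assumption: nothing else sits inside these elements).
TV_XML = """
<lst name="srcSpan">
  <lst name="#1;00">
    <lst name="offsets">
      <int name="start">104</int>
      <int name="end">106</int>
      <int name="start">107</int>
      <int name="end">109</int>
      <int name="start">129</int>
      <int name="end">131</int>
      <int name="start">132</int>
      <int name="end">134</int>
    </lst>
  </lst>
</lst>
"""

# Stored srcSpan value, rejoined from the wrapped lines in the mail
# (assumption: each wrap replaced exactly one space).
FIELD_VALUE = (
    "|CR1434-Occ1|abcCR1434 is a token for searching with WILDCI"
    "|testuser|System of Registries"
    "|2010-01-12T23:00:00.000Z|2010-01-12T23:00:00.000Z"
    "|testuser|System of Registries"
)


def offsets_by_term(xml_text):
    """Map each term name to its list of (start, end) offset pairs."""
    root = ET.fromstring(xml_text)
    result = {}
    for term in root.findall("lst"):
        offsets = term.find("lst[@name='offsets']")
        if offsets is None:
            continue
        starts = [int(e.text) for e in offsets if e.get("name") == "start"]
        ends = [int(e.text) for e in offsets if e.get("name") == "end"]
        result[term.get("name")] = list(zip(starts, ends))
    return result


spans = offsets_by_term(TV_XML)["#1;00"]
hits = [FIELD_VALUE[start:end] for start, end in spans]
for (start, end), text in zip(spans, hits):
    print(start, end, repr(text))
```

Under those assumptions, all four pairs fall inside the two timestamps in the field value, so the token seems to come from how the index analyzer splits the dates rather than from the CR1434 text that the query matched.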

Koji

--
http://www.rondhuit.com/en/
