On 10/5/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
On Oct 5, 2006, at 7:17 AM, Marcio Pinto Motta wrote:
> &lt;br&gt;&lt;p&gt;  A Brasil Telecom ... </str>
>
> the html code was "changed".

It wasn't "changed" per se... but rather it was encoded.  If you use
an XML API to read the response you would not see these encoded
characters.
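
As a rough illustration of that point, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. The response fragment and the field name "body" are made up for the example and are not taken from the actual response; only the encoded text comes from the quoted message.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of an XML response; the field name "body" is an
# assumption for illustration only.
response = '<doc><str name="body">&lt;br&gt;&lt;p&gt;  A Brasil Telecom</str></doc>'

# Any XML parser decodes the entities while reading the document.
root = ET.fromstring(response)
stored = root.find("str").text

print(stored)  # prints: <br><p>  A Brasil Telecom
```

The &lt; and &gt; sequences exist only in the serialized XML; the value handed back by the parser is the original markup.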

You can also use a different output format to verify that the internal
form is unchanged. For example, add wt=json to the HTTP parameters to
see the results in JSON format.
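
A small sketch of what that could look like from a client, in Python. The host, port, path, query, and field name "body" are all assumptions for illustration, and the JSON body is a hand-written stand-in rather than real server output; only the wt=json parameter comes from the advice above.

```python
import json
from urllib.parse import urlencode

# Hypothetical query; only wt=json is the parameter discussed above.
params = {"q": "telecom", "wt": "json"}
# Assumed host and path, adjust for your own installation.
url = "http://localhost:8983/solr/select?" + urlencode(params)

# Hand-written stand-in for a JSON response body (not real server output).
body = '{"response": {"docs": [{"body": "<br><p>  A Brasil Telecom"}]}}'

# JSON needs no XML entity encoding, so the markup appears as stored.
value = json.loads(body)["response"]["docs"][0]["body"]

print(url)
print(value)
```

If the value seen this way matches what an XML parser returns for the same document, the stored content was never altered; only the XML serialization encoded it.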

See HTMLStripWhitespaceTokenizerFactory if you don't want XML/HTML
tags indexed.  As Erik said, regardless of how you analyze a field,
you can always get the un-analyzed version back when you mark the
field as "stored".

-Yonik
