On Aug 27, 2007, at 10:00 AM, Michael Kimsal wrote:
What's odd about this is that the error seems to indicate that I did.

Actually the error message looks like you escaped too much. You should _not_ escape <field>, only the contents of it.
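
For illustration only, here is a minimal Python sketch of the distinction, using the standard library rather than anything Solr-specific. The `field_values` helper and the two sample strings are hypothetical and not taken from the actual request. It escapes only the field contents in one document and the whole `<field>` element in the other, then checks what an XML parser sees in each.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

html_line = '<a href="foobar"><b><i>linktext</i></b></a>'

# Contents escaped only: the <field> tags stay as real markup.
correct = '<add><doc><field name="line">' + escape(html_line) + '</field></doc></add>'

# Escaped too much: the <field> element itself has become plain text.
over_escaped = ('<add><doc>'
                + escape('<field name="line">' + html_line + '</field>')
                + '</doc></add>')

def field_values(xml_text):
    """Return the text of every <field> element under <doc> (hypothetical helper)."""
    doc = ET.fromstring(xml_text).find("doc")
    return [field.text for field in doc.findall("field")]

correct_fields = field_values(correct)
over_fields = field_values(over_escaped)
print(correct_fields)
print(over_fields)
```

In the first document the parser finds one field whose text is the original HTML line. In the second it finds no `<field>` element at all, only text inside `<doc>`.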

        Erik




The full text (minus the stack trace) was

org.xmlpull.v1.XmlPullParserException: parser must be on START_TAG or TEXT to read text (position: START_TAG seen ...&lt;field name="line"&gt;&lt;a href="foobar"&gt;... @4:37)

Or is that just a byproduct of how SOLR reports the errors back - always
escaping them?

Thanks guys - I'll have another crack at this tonight.


On 8/27/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:

Michael,

I think the issue is that you're not escaping the <field> values.
Send something like this to Solr instead:

  <field name="line">&lt;a href="foobar"&gt;&lt;b&gt;&lt;i&gt;linktext&lt;/i&gt;&lt;/b&gt;&lt;/a&gt;</field>
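
One way to avoid hand-escaping is to let an XML library build the message. The following is a rough sketch in Python, assuming the documents are generated by a script. The `build_add_xml` function name is made up for this example, and sending the result to Solr is left out.

```python
import xml.etree.ElementTree as ET

def build_add_xml(doc_id, line):
    """Build an <add> message; ElementTree escapes the field text when serializing."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = str(doc_id)
    ET.SubElement(doc, "field", name="line").text = line
    return ET.tostring(add, encoding="unicode")

html_line = '<a href="foobar"><b><i>linktext</i></b></a>'
payload = build_add_xml(4, html_line)
print(payload)

# Parsing the payload again gives back the original, unescaped line.
round_trip = ET.fromstring(payload).find("doc/field[@name='line']").text
```

Because the escaping happens at serialization time, the `<field>` tags remain markup and only the value is escaped.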

        Erik


On Aug 27, 2007, at 9:29 AM, Michael Kimsal wrote:

Hello

I'm trying to index individual lines of an HTML file, and I'm hitting this error:

TEXT must be immediately followed by END_TAG and not START_TAG

I've got something that looks like

<add>
<doc>
<field name="id">4</field>
<field name="line"><a href="foobar"><b><i>linktext</i></b></a></field>
</doc>
</add>

Actually, that sample code above, as its own data file POSTed to SOLR, throws

parser must be on START_TAG or TEXT to read text (position: START_TAG seen ...&lt;field name="line"&gt;&lt;a href="foobar"&gt;... @4:37

as an error.

Any clues as to how I can do this?  I'd like to keep the original copy of each line intact in the index.

Thanks!

--
Michael Kimsal
http://webdevradio.com




