2006/9/25, Walter Underwood <[EMAIL PROTECTED]>:

This document has two problems. First, the document is not well-formed
XML. Open it in Firefox and you will see this error:

   XML Parsing Error: mismatched tag. Expected: </doc>.
   Location: file:///Users/wunderwood/Desktop/jl.xml
   Line Number 15, Column 3:

After I fix that, it still is not legal UTF-8.
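
Both problems described above (a mismatched tag and bytes that are not
legal UTF-8) can be checked before posting a file. The following is only
a minimal sketch in Python, not part of Solr or post.sh; the function
name check_xml_bytes is made up for illustration, and it assumes the
file is meant to be UTF-8.

```python
import xml.etree.ElementTree as ET


def check_xml_bytes(data):
    """Return a list of problems found in raw XML bytes (empty if none)."""
    problems = []

    # Check 1: are the bytes legal UTF-8 at all?
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append("not legal UTF-8: %s" % exc)

    # Check 2: is the document well-formed XML (e.g. no mismatched tags)?
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        problems.append("not well-formed XML: %s" % exc)

    return problems
```

To check a file such as jl.xml, read it in binary mode and pass the
bytes: check_xml_bytes(open("jl.xml", "rb").read()).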


I'm sorry that it has an extra <doc>; I was testing more data in Solr.
To send it as an attachment I cut jl.xml down and did not check it, so
you found this problem.
Yes, it is not legal UTF-8.
By "UTF-8 encoding" I mean the file encoding mode.
When you create a new XML file in EditPlus and save it, a window appears
that lets you select the encoding mode (you can see it in the attachments).
That file is jl.xml; index it with post.sh.

If you use a scripting language, such as solrphp (my solrphp is not the
one from Solr's wiki; I modified it), you must send your XML with UTF-8
encoding. For instance, when I send my.xml to
http://localhost:8983/solr/update, the request headers for this URL
should include "Content-Type: text/xml;charset=utf-8".
Solr works well once that header is set.
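
As an illustration of the header described above, here is a minimal
sketch in Python (not the modified solrphp code). It encodes the XML
text as UTF-8 and sets the same Content-Type value. The URL is the one
from this thread; the function names build_update_request and
post_update are hypothetical.

```python
import urllib.request

# Update URL as given in this thread; adjust for your own setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_text, url=SOLR_UPDATE_URL):
    """Build a POST request whose body and declared charset are both UTF-8."""
    body = xml_text.encode("utf-8")  # the bytes actually sent on the wire
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml;charset=utf-8"},
        method="POST",
    )


def post_update(xml_text, url=SOLR_UPDATE_URL):
    """Send the request and return (HTTP status, response text)."""
    with urllib.request.urlopen(build_update_request(xml_text, url)) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

The point of the sketch is that the declared charset and the actual
byte encoding of the body agree; build_update_request does not contact
the server, only post_update does.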


Does Solr report parsing errors? It really should. Maybe a 400 Bad Request
response with a text/plain body showing the error message.
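
To make the suggestion concrete, here is a rough sketch in Python of
the proposed behavior. It is not Solr code and says nothing about what
Solr currently does; handle_update is a hypothetical function that maps
an XML parse failure to a 400 response with a text/plain body.

```python
import xml.etree.ElementTree as ET

PLAIN_TEXT = "text/plain; charset=utf-8"


def handle_update(body):
    """Return (status, content_type, response_bytes) for a posted XML body."""
    try:
        ET.fromstring(body)
    except ET.ParseError as exc:
        # Report the parser's message to the client instead of hiding it.
        message = "XML parsing error: %s\n" % exc
        return 400, PLAIN_TEXT, message.encode("utf-8")
    # A real handler would index the document here; this sketch only parses.
    return 200, PLAIN_TEXT, b"OK\n"
```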


After I fixed the extra <doc> problem, Solr worked well.

wunder


On 9/22/06 6:24 PM, "James liu" <[EMAIL PROTECTED]> wrote:
>
> 2006/9/23, Walter Underwood <[EMAIL PROTECTED]>:
>> On 9/21/06 5:37 PM, "James liu" <[EMAIL PROTECTED]> wrote:
>>
>>> > Yes, it is working. The root of my problem is that the XML must be
>>> > encoded in UTF-8.
>>> > If you use PHP, it is not about the web browser; just note that the
>>> > curl header information must specify UTF-8.
>>> > If you use post.sh, the XML must be encoded in UTF-8 (my EditPlus
>>> > default encoding is ANSI).
>>
>> This might be a Solr bug. Solr should be able to accept XML in any
>> of the required encodings (ASCII, Latin 1, UTF-8, and UTF-16).
>> Getting XML content types exactly right is tricky, see RFC 3023.
>>
>> What curl command line was used?
>
> I use no special curl command, just post.sh from
> solr-nightly/example/exampledocs.
> But my jl.xml is encoded in UTF-8 (I use EditPlus; I tried to use the
> XML encoding utf-8, but it had no effect).
> In solrphp I use curl:
> $header = array("Content-Type: text/xml;charset=utf-8");
> curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
> This is PHP.
>
>> What encoding is the XML?
>>
>> Can you give a sample XML file?
>
> See the attachments; if you need anything else, mail me.
>
>> wunder
>> --
>> Walter Underwood
>> Search Guru, Netflix
>>
>
>






--
regards
jl
