On Jul 26, 2007, at 11:49 AM, Yonik Seeley wrote:
Could you try it with jetty to see if it's the servlet container?
It should be simple to just copy the index directory into solr's
example/solr/data directory.
Yonik, sorry for my delay, but I did just try this in jetty -- it
works (it doe
On 7/26/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
>
> On Jul 26, 2007, at 11:25 AM, Yonik Seeley wrote:
>
> > OK, then perhaps it's a jetty bug with charset handling.
> >
>
> I'm using resin btw
Could you try it with jetty to see if it's the servlet container?
It should be simple to just copy t
On Jul 26, 2007, at 11:25 AM, Yonik Seeley wrote:
OK, then perhaps it's a jetty bug with charset handling.
I'm using resin btw
Could you run the same query, but use the python output?
wt=python
Seems to be OK:
{'responseHeader':{'status':0,'QTime':0,'params':{'start':'7','fl':'c
onten
On 7/26/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
>
> On Jul 26, 2007, at 11:10 AM, Yonik Seeley wrote:
>
> >
> > If the '<' truly got destroyed, it's a server (Solr or Jetty) bug.
> >
> > One possibility is that the '<' does exist, but due to a charset
> > mismatch, it's being slurped into a m
On Jul 26, 2007, at 11:10 AM, Yonik Seeley wrote:
If the '<' truly got destroyed, it's a server (Solr or Jetty) bug.
One possibility is that the '<' does exist, but due to a charset
mismatch, it's being slurped into a multi-byte char.
Just dumped it with curl and did a hexdump:
5a0
On 7/26/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
> I ended up with this doc in solr:
>
>
>
> 0 name="QTime">17 name="fl">content"Pez"~1 name="rows">1 numFound="5381" start="7">Akatsuki - PE'Z
> ҳ | ̳ | պ | ŷ | >>> Akatsuki - PE'Z ר | и  |
> Ů  | ֶ  | պ  | ¸  | tӺ
> &
Looks to me as if your document is not valid UTF-8 and is missing one
byte at the end.
Then the '<' of '' is absorbed into the previous character.
Did you create the text snippet yourself? Maybe check if the string
functions you are using are multi-byte aware.
Greetings, Marc
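
To illustrate the failure Marc describes, here is a minimal Python sketch. It is not taken from Solr, Resin, or Jetty. The sample text, the closing tag "</str>", and the helper names (naive_utf8_decode, is_valid_utf8) are all assumptions made up for the demonstration. It cuts the last byte off a UTF-8 string, appends a tag, and shows that a lenient decoder which trusts the lead byte's declared length swallows the '<', while a strict decoder rejects the input.

```python
def naive_utf8_decode(data: bytes) -> str:
    """Hypothetical lenient decoder: trusts each lead byte's declared
    length and never checks that the following bytes are continuations."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            n, cp = 1, b                 # ASCII
        elif b >> 5 == 0b110:
            n, cp = 2, b & 0x1F          # 2-byte sequence
        elif b >> 4 == 0b1110:
            n, cp = 3, b & 0x0F          # 3-byte sequence
        elif b >> 3 == 0b11110:
            n, cp = 4, b & 0x07          # 4-byte sequence
        else:
            n, cp = 1, 0xFFFD            # stray continuation or bad lead byte
        # Blindly fold in the next n-1 bytes, whatever they are.
        for c in data[i + 1:i + n]:
            cp = (cp << 6) | (c & 0x3F)
        if cp > 0x10FFFF:
            cp = 0xFFFD
        out.append(chr(cp))
        i += n
    return "".join(out)


def is_valid_utf8(data: bytes) -> bool:
    """Strict check using Python's built-in UTF-8 codec."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "日本語"                          # assumed sample, 3 bytes per character

# Byte-oriented truncation: drops the final byte of the last character.
cut_by_bytes = text.encode("utf-8")[:-1]
# Character-oriented truncation: always ends on a character boundary.
cut_by_chars = text[:2].encode("utf-8")

# "</str>" stands in for whatever markup follows the field value.
broken = cut_by_bytes + b"</str>"
intact = cut_by_chars + b"</str>"

print(is_valid_utf8(broken), is_valid_utf8(intact))
print("<" in naive_utf8_decode(broken), "<" in naive_utf8_decode(intact))
```

If the text snippet is produced by slicing bytes rather than characters, validating it with a strict decoder before indexing should surface the problem early.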
On 26-jul-2