Bill, somewhere in the process I think you might be treating your UTF-8 text as ISO-8859-1.

Your character: 00B5 (µ)
Bits: 10110101
UTF-8 encoded: 11000010 10110101 (0xC2 0xB5)

If you were to treat these bytes as ISO-8859-1 (e.g. when reading from a file with the wrong charset, or because of wrong URL encoding), then it looks like 0xC2 (Â) followed by 0xB5 (µ), i.e. "Âµ".

On Tue, Jul 28, 2009 at 3:26 PM, Bill Au<bill.w...@gmail.com> wrote:
> I am using SolrJ to index the word µTorrent. After a commit I was not able
> to query for it. It turns out that the document in my Solr index contains
> the word ÂµTorrent instead of µTorrent. Any one has any idea what's going
> on???
>
> Bill
>

--
Robert Muir
rcm...@gmail.com
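
P.S. In case it helps to see the byte mix-up described above, here is a minimal Java sketch. It only uses the standard library, not SolrJ, and the class and method names (MojibakeDemo, misdecode, repair) are made up for illustration. It encodes the string as UTF-8 and then decodes those same bytes as ISO-8859-1, which is my guess at what is happening somewhere in your pipeline, not a confirmed diagnosis.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Hypothetical helper: encode as UTF-8, then wrongly decode as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: undo the mistake by reversing the two steps.
    // This only works if the garbled text went through exactly one wrong decode.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Print each UTF-16 code unit in hex so the result does not depend
    // on the console's own encoding.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String original = "\u00B5Torrent";      // micro sign + "Torrent"
        String garbled = misdecode(original);   // starts with U+00C2 U+00B5
        System.out.println("original: " + hex(original));
        System.out.println("garbled:  " + hex(garbled));
        System.out.println("repaired: " + hex(repair(garbled)));
    }
}
```

If the garbled line starts with 00C2 00B5, that matches what you are seeing in your index, and the place to look is wherever bytes are turned into a String (or a URL is decoded) without an explicit UTF-8 charset.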