Re: Best way to index without diacritics

2009-04-22 Thread lupiss
Hello again! Thanks for the help, I managed to sort out the accents :D In case the tip is useful to anyone, it all comes down to putting the isolatin class in schema.xml, more or less like this:

Re: Best way to index without diacritics

2009-04-21 Thread wiserweb
Friend! Long live Solr :) -Original Message- From: lupiss Date: Tue, 21 Apr 2009 16:17:04 Subject: Re: Best way to index without diacritics Hello, thanks for replying. Yes, I think that is the class that will work for me, but

Re: Best way to index without diacritics

2009-04-21 Thread lupiss
Hello, thanks for replying. Yes, I think that is the class that will work for me, but I don't know how to implement it. Could you tell me whether you have already used it and, if so, which lines you added to schema.xml and config.xml, which .jar you attached, etc., all the details, or even if you have an examp

Re: Best way to index without diacritics

2009-04-21 Thread Otis Gospodnetic
he.org > Sent: Tuesday, April 21, 2009 12:22:08 PM > Subject: Re: Best way to index without diacritics > > > Hello! > > I have the same problem too. I already have the indexes for my documents; > I changed the charset to iso-8859-1 and can now see the ñ and accents. Now I have hooked the >

Re: Best way to index without diacritics

2009-04-21 Thread lupiss
Hello! I have the same problem too. I already have the indexes for my documents; I changed the charset to iso-8859-1 and can now see the ñ and accents. Now I have hooked the search engine up to my application, and searches are run from JSP pages. The problem is that when the user types in the text field that asks for the parame

Re: Best way to index without diacritics

2008-08-14 Thread Norberto Meijome
On Thu, 14 Aug 2008 11:34:47 -0400 "Steven A Rowe" <[EMAIL PROTECTED]> wrote: [...] > The kind of filter Walter is talking about - a generalized language-aware > character normalization Solr/Lucene filter - does not yet exist. My guess is > that if/when it does materialize, both the Solr and th

RE: Best way to index without diacritics

2008-08-14 Thread Steven A Rowe
Hi Norberto, On 08/14/2008 at 8:10 AM, Norberto Meijome wrote: > > On 8/13/08 9:16 AM, "Steven A Rowe" <[EMAIL PROTECTED]> wrote: > > > > > Hi Norberto, > > > > > > https://issues.apache.org/jira/browse/LUCENE-1343 > > hi Steve, > thanks for the pointer. this is a Lucene entry... I thought the

Re: Best way to index without diacritics

2008-08-14 Thread Norberto Meijome
(2 in 1 reply) On Wed, 13 Aug 2008 09:59:21 -0700 Walter Underwood <[EMAIL PROTECTED]> wrote: > Stripping accents doesn't quite work. The correct translation > is language-dependent. In German, o-dieresis should turn into > "oe", but in English, it should be "o" (as in "coöperate" or > "Mötle

Re: Best way to index without diacritics

2008-08-13 Thread Walter Underwood
Stripping accents doesn't quite work. The correct translation is language-dependent. In German, o-dieresis should turn into "oe", but in English, it should be "o" (as in "coöperate" or "Mötley Crüe"). In Swedish, it should not be converted at all. There are other character-to-string conversions:
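
The language dependence described above can be illustrated with a minimal Python sketch. It is not a Solr or Lucene filter. The `LANGUAGE_FOLDS` table, the language codes, and the `fold` function are hypothetical names introduced here for illustration, and the table covers only the o-dieresis cases mentioned in the message.

```python
import unicodedata

# Hypothetical per-language overrides (illustration only).
# German: o-dieresis becomes "oe". Swedish: it is kept unchanged.
# Any language without an entry falls back to plain accent stripping.
LANGUAGE_FOLDS = {
    "de": {"\u00f6": "oe", "\u00d6": "Oe"},
    "sv": {"\u00f6": "\u00f6", "\u00d6": "\u00d6"},
}


def strip_accents(ch):
    """Drop combining marks from one character (o-dieresis -> o)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fold(text, lang):
    """Fold text using the overrides for lang, else generic stripping."""
    overrides = LANGUAGE_FOLDS.get(lang, {})
    # Compose first so the overrides see single precomposed characters.
    text = unicodedata.normalize("NFC", text)
    return "".join(overrides.get(ch, strip_accents(ch)) for ch in text)
```

Called as `fold(word, "de")`, `fold(word, "en")` or `fold(word, "sv")`, the same input character can come out three different ways, which is why a single language-blind accent stripper cannot be right for every index.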

RE: Best way to index without diacritics

2008-08-13 Thread Steven A Rowe
Hi Norberto, https://issues.apache.org/jira/browse/LUCENE-1343 :) Steve On 08/13/2008 at 12:35 AM, Norberto Meijome wrote: > On Tue, 12 Aug 2008 11:44:42 -0400 > "Steven A Rowe" <[EMAIL PROTECTED]> wrote: > > > Solr is Unicode aware. The ISOLatin1AccentFilterFactory > handles diacritics for t

Re: Best way to index without diacritics

2008-08-12 Thread Norberto Meijome
On Tue, 12 Aug 2008 11:44:42 -0400 "Steven A Rowe" <[EMAIL PROTECTED]> wrote: > Solr is Unicode aware. The ISOLatin1AccentFilterFactory handles diacritics > for the ISO Latin-1 section of the Unicode character set. UTF (do you mean > UTF-8?) is a (set of) Unicode serialization(s), and once Sol

RE: Best way to index without diacritics

2008-08-12 Thread Steven A Rowe
Hi Alejandro, Solr is Unicode aware. The ISOLatin1AccentFilterFactory handles diacritics for the ISO Latin-1 section of the Unicode character set. UTF (do you mean UTF-8?) is a (set of) Unicode serialization(s), and once Solr has deserialized it, it is just Unicode characters (Java's in-memor
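
The distinction drawn above between a serialization (such as UTF-8) and the Unicode characters it decodes to can be shown with a short Python sketch. The `fold_latin1` function is a hypothetical helper loosely modelled on the description of a filter that only touches the ISO Latin-1 range; it is an assumption for illustration and does not reproduce the actual mapping table of ISOLatin1AccentFilterFactory.

```python
import unicodedata

text = "canci\u00f3n"  # "cancion" with an acute accent on the second "o"

# Two different byte serializations of the same characters.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")
assert utf8_bytes != latin1_bytes

# Once decoded, both are the same sequence of Unicode characters.
assert utf8_bytes.decode("utf-8") == latin1_bytes.decode("iso-8859-1")


def fold_latin1(s):
    """Strip diacritics only from characters in U+00C0..U+00FF.

    Hypothetical helper: characters outside that range pass through
    untouched, as do characters in it with no decomposition.
    """
    out = []
    for ch in s:
        if "\u00c0" <= ch <= "\u00ff":
            decomposed = unicodedata.normalize("NFD", ch)
            ch = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append(ch)
    return "".join(out)
```

Because the folding works on decoded characters, it gives the same result whether the input arrived as UTF-8 or ISO-8859-1. A character outside the Latin-1 range, such as U+0151 (o with double acute), is left alone by this sketch.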