Boosting in version 1.2

2007-06-08 Thread Thierry Collogne
Hello, Our documents contain three fields: title, keywords, content. What we want is to give priority to the field keywords, then title, and last content. So we did the following in our xml file that is to be indexed we put the following letters This is a test foobar This is a test lett

How does HTMLStripWhitespaceTokenizerFactory work?

2007-06-08 Thread Thierry Collogne
Hello, I am trying to use the solr.HTMLStripWhitespaceTokenizerFactory analyzer with no luck. I have a field content that contains the following When I do a search I get the following test link post po_1_NL post This is a test Is

Re: Multi-language indexing and searching

2007-06-08 Thread Daniel Alheiros
Thank you for your reply. Yes, I realize that hitting a query against the whole content would come with these problems, but what I'm trying to say is that I will always narrow by the language (from my users' point of view). I would like to know if it is possible (and appropriate) to have all my conte

How can I use dates to boost my results?

2007-06-08 Thread Daniel Alheiros
Hi, For my search use, document freshness is a relevant aspect that should be considered to boost results. I have a field in my index like this: How can I make good use of this to boost my results? I'm using the DisMaxRequestHandler to boost other textual fields based on the query, but i

Re: Multi-language indexing and searching

2007-06-08 Thread Henrib
Hi Daniel, If it is functionally 'ok' to search in only one lang at a time, you could try having one index per lang. Each per-lang index would have one schema where you would describe field types (the lang part coming through stemming/snowball analyzers, per-lang stopwords & al) and the same field

Re: Multi-language indexing and searching

2007-06-08 Thread Daniel Alheiros
Hi Henri. Thanks for your reply. I've just looked at the patch you referred to, but doing this I will lose the out of the box Solr installation... I'll have to create my own Solr application responsible for creating the multiple cores and I'll have to change my indexing process to something able to n

problem with schema.xml

2007-06-08 Thread mirko
Hi, I just started playing around with Solr 1.2. It has some nice improvements. I noticed that errors in the schema.xml get reported in a verbose way now, but the following steps cause a problem for me: 1. start with a correct schema.xml - Solr works fine 2. edit it in a way that is no longer co

Re: How does HTMLStripWhitespaceTokenizerFactory work?

2007-06-08 Thread Yonik Seeley
On 6/8/07, Thierry Collogne <[EMAIL PROTECTED]> wrote: I am trying to use the solr.HTMLStripWhitespaceTokenizerFactory analyzer with no luck. [...] Is this normal? Shouldn't the html code and the white spaces be removed from the field? For indexing purposes, yes. The stored field you get bac

Cannot index '&' this character using post.jar

2007-06-08 Thread Tiong Jeffrey
Hi all, I tried to index a document that has '&' using post.jar. But during the indexing it causes an error and it won't finish the indexing. Can I know why this is and how to prevent this? Thanks! Jeffrey

Re: Boosting in version 1.2

2007-06-08 Thread Mike Klaas
On 8-Jun-07, at 2:07 AM, Thierry Collogne wrote: Hello, Our documents contain three fields: title, keywords, content. What we want is to give priority to the field keywords, then title, and last content <> In our schema.xml we have put text Now when we do a search like this http://lo

Re: Cannot index '&' this character using post.jar

2007-06-08 Thread Mike Klaas
On 8-Jun-07, at 10:19 AM, Tiong Jeffrey wrote: Hi all, I tried to index a document that has '&' using post.jar. But during the indexing it causes an error and it won't finish the indexing. Can I know why this is and how to prevent this? Thanks! XML requires &'s to be escaped. & -> &amp; -Mike
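The escaping rule in the reply above can be illustrated with a minimal Python sketch. It builds a Solr-style `<add><doc>` update message and escapes each value with the standard library before the file is handed to post.jar. The helper name `solr_add_doc` and the field names `id` and `title` are assumptions made for illustration only; they do not come from the thread.

```python
from xml.sax.saxutils import escape, quoteattr


def solr_add_doc(fields):
    """Return an <add><doc>...</doc></add> string with XML-escaped values.

    `fields` maps field names to string values. The helper and the field
    names used with it are hypothetical, for illustration only.
    """
    parts = ["<add><doc>"]
    for name, value in fields.items():
        # quoteattr() escapes the name and wraps it in quotes;
        # escape() turns & into &amp;, < into &lt; and > into &gt;.
        parts.append("<field name=%s>%s</field>" % (quoteattr(name), escape(value)))
    parts.append("</doc></add>")
    return "".join(parts)


if __name__ == "__main__":
    print(solr_add_doc({"id": "1", "title": "Tom & Jerry"}))
```

Writing the returned string to a file, rather than pasting raw text into a hand-written XML document, means the ampersand reaches the parser as `&amp;`.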

Re: Cannot index '&' this character using post.jar

2007-06-08 Thread Ryan McKinley
Are you sending a valid XML document? For XML, & needs to be sent as &amp; Tiong Jeffrey wrote: Hi all, I tried to index a document that has '&' using post.jar. But during the indexing it causes an error and it won't finish the indexing. Can I know why this is and how to prevent this? Thanks! Jeffr

To make sure XML is UTF-8

2007-06-08 Thread Tiong Jeffrey
Hi, Though this is not directly related to Solr, I have an XML output from a mysql database, but during indexing the XML output is not working. And the problem is part of the XML output is not in UTF-8 encoding, how can I convert it to UTF-8 and how do I know what kind of encoding it uses in the

Re: To make sure XML is UTF-8

2007-06-08 Thread Mike Klaas
On 8-Jun-07, at 11:20 AM, Tiong Jeffrey wrote: Though this is not directly related to Solr, I have an XML output from a mysql database, but during indexing the XML output is not working. And the problem is part of the XML output is not in UTF-8 encoding, how can I convert it to UTF-8 and h

Re: problem with schema.xml

2007-06-08 Thread Ryan McKinley
I don't use tomcat, so I can't be particularly useful. The behavior you describe does not happen with resin or jetty... My guess is that tomcat is caching the error state. Since fixing the problem is outside the webapp directory, it does not think it has changed so it stays in a broken state

Re: problem with schema.xml

2007-06-08 Thread mirko
Hi Ryan, I have my .war file located outside the webapps folder (I am using multiple Solr instances with a config as suggested on the wiki: http://wiki.apache.org/solr/SolrTomcat). Nevertheless, I touched the .war file, the config file, the directory under webapps, but nothing seems to be working

Re: To make sure XML is UTF-8

2007-06-08 Thread Funtick
Tiong Jeffrey wrote: > > Though this is not directly related to Solr, I have an XML output from a > mysql database, but during indexing the XML output is not working. And the > problem is part of the XML output is not in UTF-8 encoding, how can I > convert it to UTF-8 and how do I know what ki

Re: To make sure XML is UTF-8

2007-06-08 Thread funtick
Though this is not directly related to Solr, I have an XML output from a mysql database, but during indexing the XML output is not working. And the problem is part of the XML output is not in UTF-8 encoding, how can I convert it to UTF-8 and how do I know what kind of encoding it uses in the first

Re: Multi-language indexing and searching

2007-06-08 Thread Chris Hostetter
: Can't I have the same index, using one single core, same field names being : processed by language specific components based on a field/parameter? yes, but you don't really need the complexity you describe below ... you don't need separate request handlers per language, just separate fields per

Re: Solr 1.2 released

2007-06-08 Thread Jack L
Hello Yonik, This is great news. Will it be a drop-in replacement for 1.1? I.e., do I need to make any changes other than replacing the jar files? I suppose the index files will still be good. Are 1.2 schema files and config files compatible with those of 1.1? -- Best regards, Jack Thursday, Ju

Re: Solr 1.2 released

2007-06-08 Thread Yonik Seeley
On 6/8/07, Jack L <[EMAIL PROTECTED]> wrote: This is great news. Will it be a drop-in replacement for 1.1? I.e., do I need to make any changes other than replacing the jar files? I suppose the index files will still be good. Are 1.2 schema files and config files compatible with those of 1.1? It

Re: host logging options (was Re: Schema validator/debugger)

2007-06-08 Thread Chris Hostetter
: > SEVERE: org.apache.solr.core.SolrException: undefined field text : As someone who has used both Jetty and Tomcat in production (and has : come to prefer Tomcat), what are my choices to get the "undefined field : xxx" error in the catalina log files (or is it stashed somewhere I'm : overlooking

Re: Wildcards / Binary searches

2007-06-08 Thread Chris Hostetter
: Do you mean something like below ? : w wo wor word yeah, but there are some Tokenizers that make this trivial (EdgeNGramTokenizer I think is the name) : project, definitively not a good practice for portability of indexes. A : duplicate field with an analyser to produce a sortable ASCII versi
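The "w wo wor word" expansion quoted above is an edge n-gram expansion: every prefix of a token is emitted as its own term, so a plain term lookup on "wor" matches "word". The Python sketch below only illustrates the idea and is not the tokenizer's implementation; the function name `edge_ngrams` and the parameters `min_size` and `max_size` are assumptions for this example.

```python
def edge_ngrams(token, min_size=1, max_size=None):
    """Return the prefixes of `token` from min_size to max_size characters.

    Illustrative only: the name and parameters are hypothetical and do not
    mirror any Solr or Lucene API.
    """
    if max_size is None:
        max_size = len(token)
    # Never ask for a prefix longer than the token itself.
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


if __name__ == "__main__":
    print(" ".join(edge_ngrams("word")))
```

Indexing these prefixes trades a larger index for prefix matching without wildcard queries.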

RE: Solr 1.2 released

2007-06-08 Thread Teruhiko Kurosaka
I noticed there is no example/ext directory or jars that were found there in 1.1 (commons-el.jar, commons-logging.jar, jasper-*.jar, mx4j-*.jar) I have a jar that my Solr plugin depends on. This jar contains a class that needs to be loaded only once per container because it is a JNI library. Fo

Re: solr+hadoop = next solr

2007-06-08 Thread Jeff Rodenburg
On 6/7/07, Rafael Rossini <[EMAIL PROTECTED]> wrote: Hi, Jeff and Mike. Would you mind telling us about the architecture of your solutions a little bit? Mike, you said that you implemented a highly-distributed search engine using Solr as indexing nodes. What does that mean? You guys implemen

RE: Solr 1.2 released

2007-06-08 Thread Chris Hostetter
: I noticed there is no example/ext : directory or jars that was found there : in 1.1 (commons-el.jar, commons-logging.jar, : jasper-*.jar, mx4j-*.jar) the example/ext directory was an entirely Jetty based artifact. when we upgraded the Jetty used in the example setup, Jetty no longer had an ext d