RE: add CJKTokenizer to solr

2007-06-21 Thread Xuesong Luo
Thanks, Toru and Chris. I tried both the CJKTokenizer and CJKAnalyzer. Both return some unexpected highlight results when I tested with German. The field value I searched is "Ein Mann beißt den Hund". The search criterion is beißt. When using CJKAnalyzer, beißt is treated as 2 single terms (bei

RE: Faceted Search!

2007-06-21 Thread Chris Hostetter
: generating XML feed file and feeding to the Solr server. However, I was : also looking into implementing having sub-categories within the : categories if that makes sense. For example, in the shopper.com we have : the categories of by price, manufacturers and so on and within them, they : are su

Re: All facet.fields for a given facet.query?

2007-06-21 Thread Chris Hostetter
: > facet.mincount is a way to tell solr not to bother giving you those 0 : > counts ... : : An aside: shouldn't that be the default? All of the people using : facets that I have seen always have to set facet.mincount=1 (or : facet.zeros=false) Hmmm... maybe, but it's a really easy option to tur
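As a rough illustration of the option discussed above, the following sketch builds a query string that asks Solr to omit zero-count facet values. The query text and the field names `cat` and `manu_exact` are placeholders chosen for the example, not taken from the thread.

```python
from urllib.parse import urlencode


def facet_query_string(query, facet_fields, mincount=1):
    """Build a Solr query string that hides facet values below `mincount`."""
    params = [
        ("q", query),
        ("facet", "true"),
        ("facet.mincount", str(mincount)),
    ]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return urlencode(params)


# Hypothetical query and field names, for illustration only.
print(facet_query_string("ipod", ["cat", "manu_exact"]))
```

Passing `mincount=0` would restore the zero counts, which is the behaviour the thread describes as the default at the time.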

Re: All facet.fields for a given facet.query?

2007-06-21 Thread Chris Hostetter
: I get your point, but how to know where additional metadata is of value : if not : just trying? Currently I start with a generic approach to see what Man power. For simple schemas the brute force facet on everything approach can scale well .. but as soon as you start talking about having hunda

Re: Facets & Links

2007-06-21 Thread Chris Hostetter
: solr.zappos.com/select/&fq=brand_exact:VALUE ? that will work (just remember to URL escape the brand name, and probably put it in quotes too if you think it might contain whitespace). : Should I not be sending the same facets the second time to SOLR? Do : you remove the facet they've just cli
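The escaping advice above could be sketched roughly as follows. The helper name `facet_link`, the base URL, and the sample brand are assumptions made for the example; only the `fq=brand_exact:VALUE` shape comes from the thread.

```python
from urllib.parse import quote


def facet_link(base_url, query, field, value):
    """Build a link that filters on one facet value via fq."""
    # Wrap the value in double quotes so embedded whitespace stays in one
    # term, escaping any backslashes or quotes the value itself contains.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    fq = f'{field}:"{escaped}"'
    # URL-escape both parameters; safe="" also escapes ':' and '/'.
    return f"{base_url}?q={quote(query, safe='')}&fq={quote(fq, safe='')}"


# Hypothetical host and brand name, for illustration only.
print(facet_link("http://localhost:8983/solr/select", "shoes",
                 "brand_exact", "Steve Madden"))
```

Quoting first and URL-escaping second keeps the two concerns apart: the quotes are part of the query syntax, while the percent-encoding only protects the value in transit.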

Re: commit script with solr 1.2 response format

2007-06-21 Thread Chris Hostetter
: I guess we should look for 'status="0"><' ? that wouldn't quite work. : Or, if you get a response code of 200, it's a success unless : you see status="" we could always make it an option in the scripts.conf file -- what substring to match on ... just in case people want to write their own cr
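One way to picture the "configurable match" idea above is a check that accepts a list of patterns instead of a single hard-coded substring. This is only a sketch: the two response shapes in the patterns are assumptions about the 1.1 and 1.2 formats (the archive stripped the XML from the thread), and `update_succeeded` is a hypothetical helper, not part of the Solr scripts.

```python
import re

# Assumed response shapes; neither appears intact in the archived thread.
STATUS_PATTERNS = (
    re.compile(r'<result\s+status="(\d+)"'),          # assumed 1.1 style
    re.compile(r'<int\s+name="status">(\d+)</int>'),  # assumed 1.2 style
)


def update_succeeded(http_code, body, patterns=STATUS_PATTERNS):
    """Return True only for HTTP 200 plus a recognizable status of 0."""
    if http_code != 200:
        return False
    for pattern in patterns:
        match = pattern.search(body)
        if match:
            return match.group(1) == "0"
    # No recognizable status element: treat as failure to stay conservative.
    return False


print(update_succeeded(200, '<result status="0"></result>'))
```

Treating an unrecognized body as a failure is a deliberate choice here; a script that prefers the "200 means success unless told otherwise" rule from the thread would return True on that last line instead.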

Re: add CJKTokenizer to solr

2007-06-21 Thread Chris Hostetter
: Regarding reflection - even if reflection is slower, and I'm sure it is, : I just don't know exactly how much slower it is, couldn't we cache the : instantiated instances keyed off by name? Such instances would have to : be thread-safe, but I imagine most/all Tokenizers already are : thread-saf

Facet searching on single field with multiple words value.

2007-06-21 Thread ashwani kabra
Hi friends, I tried to implement facet searching in some sample code, tested it with various cases, and found no results in one case. I wanted to narrow by one field, "title", and gave it a multiple-word value, that is, a phrase. So first, in this preparing the lucene query and converting it into QueryF

Re: commit script with solr 1.2 response format

2007-06-21 Thread James liu
Aha, I found the same question a few days ago. I'm sorry I forgot to submit it. 2007/6/22, Yonik Seeley <[EMAIL PROTECTED]>: On 6/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > I just started running the scripts and > > The commit script seems to run fine, but it says there was an error. I > looke

Re: add CJKTokenizer to solr

2007-06-21 Thread Otis Gospodnetic
Eh, I was looking at these Factories just the other day and wondering about the similar stuff as Daniel. Regarding reflection - even if reflection is slower, and I'm sure it is, I just don't know exactly how much slower it is, couldn't we cache the instantiated instances keyed off by name? Such

Re: Multiple doc types in schema

2007-06-21 Thread Frédéric Glorieux
After further reading, especially (Thanks Hoss) Depending on update patterns and index sizes, you can probably get better efficiency with multiple indexes, but not really more functionality (in your case), rig

Re: Multiple doc types in schema

2007-06-21 Thread Frédéric Glorieux
Thanks, Yonik, for sharing your thoughts. This doesn't sound like true federated search, I'm afraid I don't understand "federated search"; you seem to have a precise idea in mind. since you have a number of fields that are the same in each index that you search across, and you treat t

Re: DismaxRequestHandler reports sort by score as invalid

2007-06-21 Thread Chris Hostetter
as mentioned, the warning is misleading in the case where you sort by score, i filed a bug as a reminder to fix it (and so people searching for it will understand what's going on) patches welcome! :) http://issues.apache.org/jira/browse/SOLR-270 : WARNING: Invalid sort "score desc" was specifi

Re: add CJKTokenizer to solr

2007-06-21 Thread Chris Hostetter
: Why don't we instead create an UbberFactory that takes the Tokenizer : class as a parameter and instantiates the proper Tokenizer? The idea has come up before ... and there's really no reason why it wouldn't be okay to include a reflection based factory like this in Solr -- it just hasn
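To make the reflection-based factory idea concrete, here is a small Python analogue. It is not Solr or Lucene code: `ReflectiveFactory` is a hypothetical name, and `io.StringIO` merely stands in for a tokenizer class so the example runs on its own.

```python
import importlib


class ReflectiveFactory:
    """Hypothetical generic factory: builds whatever class its config names."""

    def __init__(self, class_path):
        module_name, _, class_name = class_path.rpartition(".")
        # Resolve the class once, at configuration time, so each later
        # create() call is an ordinary constructor call with no lookup.
        self._cls = getattr(importlib.import_module(module_name), class_name)

    def create(self, *args, **kwargs):
        # A fresh instance per call; no instances are shared or cached.
        return self._cls(*args, **kwargs)


# io.StringIO is only a stand-in for a class named in configuration.
factory = ReflectiveFactory("io.StringIO")
print(factory.create("Ein Mann beißt den Hund").read())
```

Caching the resolved class rather than the instances keeps the reflective lookup out of the hot path while leaving open the question raised elsewhere in the thread of whether instances can safely be shared.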

Facets & Links

2007-06-21 Thread Matthew Runo
Hello! Let's say I have a query which is returning facets. Let's say they are various brand names, and map back to a brand_exact field in the index. What is the proper format for the link that these facets should have? something like.. solr.zappos.com/select/facets>&fq=brand_exact:VALUE

Re: commit script with solr 1.2 response format

2007-06-21 Thread Yonik Seeley
On 6/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: I just started running the scripts and The commit script seems to run fine, but it says there was an error. I looked into it, and the scripts expect 1.1 style response: 1.2 /update returns: 0 44 I guess

commit script with solr 1.2 response format

2007-06-21 Thread Ryan McKinley
I just started running the scripts and The commit script seems to run fine, but it says there was an error. I looked into it, and the scripts expect 1.1 style response: 1.2 /update returns: 0 44 ryan

Re: Multiple doc types in schema

2007-06-21 Thread Yonik Seeley
On 6/21/07, Frédéric Glorieux <[EMAIL PROTECTED]> wrote: >> I will also need multiple indexes searches, > > Do you mean: > 2) Multiple indexes with different schemas, search will search across > all or some subset and combine the results (federated search) Exactly that. I'm coming from a quite

Re: Multiple doc types in schema

2007-06-21 Thread Frédéric Glorieux
Hi Sonic, I will also need multiple indexes searches, Do you mean: 2) Multiple indexes with different schemas, search will search across all or some subset and combine the results (federated search) Exactly that. I'm coming from a quite old lucene based project, called SDX

Re: DismaxRequestHandler reports sort by score as invalid

2007-06-21 Thread Yonik Seeley
A little background: I originally conceived of query operation chains (based on some of my previous hacking in mechanical investing stock screens: select all stocks; take top 10% lowest PE; then take the top 20 highest growth rate; then sort descending by 13 week relative strength). So, I thought

Re: DismaxRequestHandler reports sort by score as invalid

2007-06-21 Thread J.J. Larrea
Because "score desc" is the default Lucene & Solr behavior when no explicit sort is specified, QueryParsing.parseSort() returns a null sort so that the non-sort versions of the query execution routines get called. However the caller SolrPluginUtils.parseSort issues that warning whenever it gets

Re: Recent updates to Solrsharp

2007-06-21 Thread Jeff Rodenburg
great, thanks Yonik. On 6/20/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 6/21/07, Jeff Rodenburg <[EMAIL PROTECTED]> wrote: > As an aside, it would be nice to record these issues more granularly in > JIRA. Could we get a component created for our client library, similar to > java/php/ruby?

Re: Multiple doc types in schema

2007-06-21 Thread Yonik Seeley
On 6/21/07, Frédéric Glorieux <[EMAIL PROTECTED]> wrote: I will also need multiple indexes searches, Do you mean: 1) Multiple unrelated indexes with different schemas, that you will search separately... but you just want them in the same JVM for some reason. 2) Multiple indexes with different

DismaxRequestHandler reports sort by score as invalid

2007-06-21 Thread gerard sychay
Hello all, This is a minor issue and does not affect Solr operation, but I could not find it in the issue tracking. To reproduce: - I set up a Solr server with the example docs indexed by following the Solr tutorial. - I clicked on the following example search under the "Sorting" section:

Re: Multiple doc types in schema

2007-06-21 Thread Frédéric Glorieux
Otis, Thanks for the link and the work! Maybe around September, I will need this patch, if it's not already committed to the Solr sources. I will also need multiple indexes searches, but understand that there is no simple, fast and generic solution in solr context. Maybe I should lose solr

Re: Multiple doc types in schema

2007-06-21 Thread Walter Underwood
I used Solr with indexes on NFS and I do not recommend it. It was either 100 or 1000 times slower than local disc for indexing, I forget which. Unusable. This is not a problem with Solr/Lucene, I have seen the same NFS performance cost with other search engines. wunder On 6/21/07 3:22 AM, "Otis

Multi-language Tokenizers / Filters recommended?

2007-06-21 Thread Daniel Alheiros
Hi I'm now considering how to improve query results on a set of languages and would like to hear considerations based on your experience in that. I'm using the tokenizer HTMLStringWhitespaceTokenizerFactory with the WordDelimiterFilterFactory, LowerCaseFilterFactory and RemoveDuplicatesTokenFilte

Re: add CJKTokenizer to solr

2007-06-21 Thread Daniel Alheiros
Hi Well, creating a Factory for each new Tokenizer we want to add means you are replicating the same code again and again just to bind the Factory (Solr interface) to the Tokenizer (Lucene interface). Why don't we instead create an UbberFactory that takes the Tokenizer class as a paramete

Re: All facet.fields for a given facet.query?

2007-06-21 Thread Thomas Traeger
: Faceting on manufacturers and categories first and then present the : corresponding facets might be used under some circumstances, but in my case : the category structure is quite deep, detailed and complex. So when : the user enters a query I like to say to him "Look, here are the : manufactu

Re: problems getting data into solr index

2007-06-21 Thread vanderkerkoff
Hi Mike, Brian Thanks for helping with this, and for clearing up my misunderstanding. Solr the python module and Solr the package being two different things, I've got you. The issues I have are compounded by the fact that we're hovering between using the Unicode branch of Django and the older b

Re: Multiple doc types in schema

2007-06-21 Thread Otis Gospodnetic
SOLR-215 supports multiple indices on a single Solr instance. It does *not* support searching of multiple indices at once (e.g. parallel search) and merging of results. This has nothing to do with NFS, though. Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www

Re: Multi-language indexing and searching

2007-06-21 Thread Daniel Alheiros
Hi Hoss. I tried that yesterday using the same approach you just described (I've created the base fields for any language with basic analyzers) and it worked alright. Thanks again for your time. Regards, Daniel On 20/6/07 21:00, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : So far it sound

Re: delete changed?

2007-06-21 Thread James liu
Aha, sorry, I missed it. 2007/6/21, Chris Hostetter <[EMAIL PROTECTED]>: : curl http://192.168.7.6:8080/solr0/update --data-binary : 'nodeid:20' : : i remember it is ok when i use solr 1.1 ... : HTTP Status 400 - missing content stream please note the "Upgrading from Solr 1.1" section o

Re: delete changed?

2007-06-21 Thread Chris Hostetter
: curl http://192.168.7.6:8080/solr0/update --data-binary : 'nodeid:20' : : i remember it is ok when i use solr 1.1 ... : HTTP Status 400 - missing content stream please note the "Upgrading from Solr 1.1" section of the 1.2 CHANGES.txt file, which states... The Solr "Request Handler" fr
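The quoted curl command lost its XML body in archiving and the CHANGES.txt excerpt is cut off, so the following is only a guess at the kind of fix being pointed to: sending the update with an explicit XML Content-Type. Both the `<delete><query>` body and the header value are assumptions, and `build_delete_request` is a hypothetical helper; the request is built but never sent.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_delete_request(update_url, query):
    """Build (but do not send) a delete-by-query POST for a Solr /update URL."""
    # Assumed body shape; the archived message shows only 'nodeid:20'.
    body = f"<delete><query>{escape(query)}</query></delete>".encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        # Assumed header: declares the POST body as XML, not form data.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = build_delete_request("http://192.168.7.6:8080/solr0/update", "nodeid:20")
print(req.get_method(), req.get_header("Content-type"))
```

Calling `urllib.request.urlopen(req)` would actually send it; that step is left out so the sketch has no side effects.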