Re: tomcat install

2006-09-19 Thread Nick Snels
Hi James, now you should make a FilterFactory; there are a few examples in c:\solr-nightly\src\java\org\apache\solr\analysis. You should also place your FilterFactory in this directory and rerun 'ant dist'. I have made a DutchStemFilterFactory class and this is the code: package org.apache.solr

Re: tomcat install

2006-09-19 Thread James liu
Today it is still not OK. I checked the CJK sources, CJKAnalyzer.java and CJKTokenizer.java (both from the Lucene 2.0 source code), and your code. I wrote CJKJLFilterFactory.java and CJKJLTokenizerFactory.java. ant runs OK. I copied the new solr.war to Tomcat's webapps and modified schema.xml. Using the admin page, I use http://localh

Re: Facet performance with heterogeneous 'facets'?

2006-09-19 Thread Yonik Seeley
On 9/18/06, Michael Imbeault <[EMAIL PROTECTED]> wrote: Yonik Seeley wrote: > For cases like "author", if there is only one value per document, then > a possible fix is to use the field cache. If there can be multiple > occurrences, there doesn't seem to be a good way that preserves exact > coun
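
The field-cache idea for single-valued fields mentioned above can be sketched in plain Java, without any Lucene or Solr classes. This is only an illustration under stated assumptions: the class `FieldCacheFacetSketch`, the method `countFacets`, and the `valueByDoc` array (standing in for a per-document value cache) are hypothetical names, not Solr's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; this is not Solr's real faceting code.
final class FieldCacheFacetSketch {
    private FieldCacheFacetSketch() {}

    /**
     * Counts facet values for a single-valued field.
     *
     * @param valueByDoc   value of the field for each document id (index = doc id);
     *                     null means the document has no value. Assumed to be
     *                     built once and reused, like a field cache.
     * @param matchingDocs ids of the documents in the current result set
     * @return map from field value to the number of matching documents with that value
     */
    static Map<String, Integer> countFacets(String[] valueByDoc, int[] matchingDocs) {
        Map<String, Integer> counts = new HashMap<>();
        for (int doc : matchingDocs) {
            String value = valueByDoc[doc];
            if (value != null) {
                // One lookup and one increment per matching document.
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

In this sketch the work grows with the size of the result set rather than with the number of distinct values in the field, and it only applies when each document has at most one value.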

Re: Facet performance with heterogeneous 'facets'?

2006-09-19 Thread Joachim Martin
Michael Imbeault wrote: Also, are there any plans to add an option not to run a facet search if the result set is too big? To avoid 40-second queries if the docset is too large... You could run one query with facet=false, check the result size and then run it again (should be fast because i
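
The two-pass approach suggested above could look roughly like the following client-side sketch. Everything here is an assumption made for illustration: `GuardedFacetSketch`, `Searcher`, `Result`, and the `maxDocsForFacets` threshold are hypothetical names, and the `Searcher` stands in for whatever code actually sends the request to Solr.

```java
// Hypothetical sketch; Searcher is a stand-in for real client code that talks to Solr.
final class GuardedFacetSketch {
    private GuardedFacetSketch() {}

    /** Minimal stand-in for a search response. */
    static final class Result {
        final int numFound;     // number of matching documents
        final boolean faceted;  // whether facet counts were computed
        Result(int numFound, boolean faceted) {
            this.numFound = numFound;
            this.faceted = faceted;
        }
    }

    /** Runs a query with faceting switched on or off. */
    interface Searcher {
        Result search(String query, boolean facet);
    }

    /**
     * First runs the query without facets; repeats it with facets
     * only when the result set is small enough.
     */
    static Result searchWithFacetGuard(Searcher searcher, String query, int maxDocsForFacets) {
        Result first = searcher.search(query, false);
        if (first.numFound > maxDocsForFacets) {
            return first; // too many hits: skip the expensive facet pass
        }
        return searcher.search(query, true);
    }
}
```

The threshold would have to be tuned by measurement; nothing in the thread suggests a particular value.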

Re: Facet performance with heterogeneous 'facets'?

2006-09-19 Thread Yonik Seeley
I just updated the comments in solrconfig.xml: On 9/18/06, Michael Imbeault <[EMAIL PROTECTED]> wrote: Another followup: I bumped all the caches in solrconfig.xml to size="1600384" initialSize="400096" autowarmCount="400096" It seemed to fix the problem on a very smal

Re: tomcat install

2006-09-19 Thread Nick Snels
Hi James, don't give up, you're very close to having it work. If you can get CJKAnalyzer and CJKTokenizer to work in Lucene, you should also be able to get it to work in Solr. Look at the bright side, at least ant doesn't throw any errors. And my code isn't going to work, since it really can't han

copyField to a dynamic field

2006-09-19 Thread Paul Terray
Hi, I know this is a complex one, but it would help me to be able to make a dynamic copy field, like: The goal is to have, for each string index, a tokenized one. It does not seem possible at the moment, but will it be in the foreseeable future? Thanks anyway! > Paul Terray

Multivalued vs single valued

2006-09-19 Thread Paul Terray
Hi, Using a lot of dynamic fields, I’d like to simplify the field types. I had a question on this: is there an advantage to having a field declared as single valued, as opposed to multi-valued? Thanks > Paul Terray Consultant Avant-Vente > SOLLAN 27, bis rue du Prog

Re: Multivalued vs single valued

2006-09-19 Thread Yonik Seeley
On 9/19/06, Paul Terray <[EMAIL PROTECTED]> wrote: Using a lot of dynamic fields, I'd like to simplify the field types. I had a question on this: is there an advantage to having a field declared as single valued, as opposed to multi-valued? The response of single valued fields is smaller (no enca

strange highlighting behavior

2006-09-19 Thread Brian Lucas
I’m experiencing some unusual behavior when I perform a search with highlighting enabled. I’ve set up “id” as “sint” and indexed properly, but performing a search gives the following result: 3.0647626 2 369845 1 Microsoft Reorganizes Microsoft Reorganizes 3.0647626

Re: strange highlighting behavior

2006-09-19 Thread Yonik Seeley
On 9/19/06, Brian Lucas <[EMAIL PROTECTED]> wrote: The unusual characters on lst name="…" are what I can't figure out, as it DEFINITELY is not the id. I've tried indexing id with "integer", "sint", and "string", all with the same result. Yes, looks like you hit a bug where you are seeing the "

Re: strange highlighting behavior

2006-09-19 Thread Yonik Seeley
On 9/19/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: The fix would be to use FieldType.indexedToReadable() to convert the indexed form back to a readable form. Oops, that should be storedToReadable since the id is obtained from the stored fields, not from the index. Hmmm, a quick look at the co

RE: strange highlighting behavior

2006-09-19 Thread Brian Lucas
Yonik, thanks for the tip. Converting to 'integer' and deleting/reindexing fixed it. Can 'sint' be used for the id with highlighting, or does one need to use integer or string for that? Just trying to figure out if it's a bug with sint, or possibly due to the fact I could have changed sint to i

Re: strange highlighting behavior

2006-09-19 Thread Yonik Seeley
On 9/19/06, Brian Lucas <[EMAIL PROTECTED]> wrote: Converting to 'integer' and deleting/reindexing fixed it. Can 'sint' be used for the id with highlighting, or does one need to use integer or string for that? It should be usable (but I personally haven't tested that). If it's not, it's a bug a

Re: tomcat install

2006-09-19 Thread Chris Hostetter
: I have gone through my archives and I have found that people also have used : something similar to: : : : : Correct. If you want to use a Lucene analyzer "as is" all you need to do is specify the class name. If you want to make an analyzer on the fly from a tokenizer and some tokenfil

Re: no example to CollectionDistribution?

2006-09-19 Thread Chris Hostetter
: Maybe I should get cron through cygwin.. : : My system is Win2003, not Unix. : : Today I tried ./snappuller, but it seems wrong, and I set the master port, : directory, snap directory The CollectionDistribution scripts may not work well on Windows -- many of them require hardlinks which may or may not

Re: Facet performance with heterogeneous 'facets'?

2006-09-19 Thread Chris Hostetter
Quick Question: did you say you are faceting on the first name field separately from the last name field? ... why? You'll probably see a sharp increase in performance if you have a single untokenized author field containing the full name and you facet on that -- there will be a lot less unique te

Re: Facet performance with heterogeneous 'facets'?

2006-09-19 Thread Yonik Seeley
On 9/19/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: Quick Question: did you say you are faceting on the first name field separately from the last name field? ... why? You'll probably see a sharp increase in performance if you have a single untokenized author field containing the full name an

relational design in solr?

2006-09-19 Thread Joachim Martin
I am trying to integrate Solr search results with results from an RDBMS query. It's working OK, but it is fairly complicated due to the large size of the results from the database and many different sort requirements. I know that Solr/Lucene was not designed to intelligently handle multiple document t

Re: Facet performance with heterogeneous 'facets'?

2006-09-19 Thread Chris Hostetter
: > when we facet on the authors, we start with : > that list and go in order, generating their facet constraint count using : > the DocSet intersection just like we currently do ... if we reach our : > facet.limit before we reach the end of the list and the lowest constraint : > count is higher t

Re: Facet performance with heterogeneous 'facets'?

2006-09-19 Thread Chris Hostetter
: I just updated the comments in solrconfig.xml: I've tweaked the SolrCaching wiki page to include some of this info as well, feel free to add any additional info you think would be helpful to other people (or ask any questions about it if any of it still doesn't seem clear to you)... htt

wana use CJKAnalyzer

2006-09-19 Thread James liu
My steps to support CJK: 1: add lucene-analyzers-2.0.0.jar to "C:\cygwin\tmp\solr-nightly\lib". 2: use cmd: "cd C:\cygwin\tmp\solr-nightly", then "ant dist". 3: copy "C:\cygwin\tmp\solr-nightly\dist\solr-1.0.war" to "C:\cygwin\tmp\solr-nightly\example\webapps\solr.war". 4: modify schema (conf/schema.conf), like

Re: no example to CollectionDistribution?

2006-09-19 Thread James liu
I see, thank you. 2006/9/20, Chris Hostetter <[EMAIL PROTECTED]>: : Maybe I should get cron through cygwin.. : : My system is Win2003, not Unix. : : Today I tried ./snappuller, but it seems wrong, and I set the master port, : directory, snap directory The CollectionDistribution scripts may not work well o

Re: tomcat install

2006-09-19 Thread James liu
I'd like to hear "I would start by trying to use the CJKAnalyzer as is with the syntax described above." If you need a tester, call me. 2006/9/20, Chris Hostetter <[EMAIL PROTECTED]>: : I have gone through my archives and I have found that people also have used : something similar to: : : :