RE: Strange behaviour with copyField

2009-06-03 Thread Radha C.
What is the defaultOperator set in your solrconfig.xml? Are you sure that it matches for au and not author? -Original Message- From: Grant Ingersoll [mailto:gsing...@apache.org] Sent: Thursday, June 04, 2009 2:53 AM To: solr-user@lucene.apache.org Subject: Re: Strange behaviour with cop

Re: Token filter on multivalue field

2009-06-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
Isn't it better to use an UpdateProcessor for this? On Thu, Jun 4, 2009 at 1:52 AM, Otis Gospodnetic wrote: > > Hello, > > It's ugly, but the first thing that came to mind was ThreadLocal. > >  Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original Message

Re: Which caches should use the solr.FastLRUCache

2009-06-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
FastLRUCache is designed to be lock free so it is well suited for caches which are hit several times in a request. I guess there is no harm in using FastLRUCache across all the caches. On Thu, Jun 4, 2009 at 3:22 AM, Robert Purdy wrote: > > Hey there, > > Anyone got any advice on which caches (fi

Re: indexing Chienese langage

2009-06-03 Thread Fer-Bj
We are trying Solr 1.3 with the Paoding Chinese Analyzer, and after reindexing the index size went from 1.5 GB to 2.7 GB. Is that expected behavior? Is there any switch or trick to avoid roughly doubling the index file size? Koji Sekiguchi-2 wrote: > > CharFilter can normalize (convert) tradit

where to find solr help/consultant

2009-06-03 Thread Larry Eitel
I am implementing Solr on a CentOS server. It involves handling multiple languages. Where is the best place to look for developers experienced in Solr who may be interested in a little consulting work? Mostly to give some guidance, etc. IRC is rather quiet. Thank you :)

OPI: Article on Sunspot

2009-06-03 Thread Glen Newton
"Sunspot: A Solr-Powered Search Engine for Ruby" http://www.linux-mag.com/id/7341 glen http://zzzoot.blogspot.com/ -- -

Re: Is there Downside to a huge synonyms file?

2009-06-03 Thread anuvenk
A small addition to my earlier post. I wonder if it's because of the 'mm' param, which requires that, for search phrases of up to 3 words, all the words be matched. If I alter this now, I'd get irrelevant results for a lot of popular 1, 2, 3 word search terms. How do I solve this? anuvenk wro

Re: Is there Downside to a huge synonyms file?

2009-06-03 Thread anuvenk
I tried adding some city-to-state mappings in the synonyms file. I'm using the dismax handler for phrase matching. As I add more and more city-to-state mappings, I end up with zero results for state-based searches. Eg: ca,california,los angeles ca,california,san diego ca,californ
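To make the effect of lines such as "ca,california,los angeles" easier to see, here is a rough plain-Python model. It is not Solr code; it assumes the synonym filter is configured with expand=true and that entries for the same term are merged across lines, as described in the comments of the example synonyms.txt. The function name `expand_synonyms` is made up for this illustration.

```python
from collections import defaultdict


def expand_synonyms(lines):
    """Model comma-separated equivalence lines (assumes expand=true).

    Every term on a line maps to all terms on that line, and mappings
    for the same term are merged across lines.
    """
    mapping = defaultdict(list)
    for line in lines:
        terms = [t.strip() for t in line.split(",") if t.strip()]
        for term in terms:
            for other in terms:
                if other not in mapping[term]:
                    mapping[term].append(other)
    return dict(mapping)


# The two complete example lines from the post above.
rules = [
    "ca,california,los angeles",
    "ca,california,san diego",
]
expanded = expand_synonyms(rules)

for term, synonyms in expanded.items():
    print(term, "->", synonyms)
```

Under these assumptions, "ca" and "california" pick up every city that shares a line with them, so a state term grows with each added line. Whether that interacts with 'mm' in the way suspected in the follow-up post is not something this sketch can confirm.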

Re: NPE when unloading an absent

2009-06-03 Thread Peter Wolanin
I did not find any relevant issue, so here's a new issue with a patch: https://issues.apache.org/jira/browse/SOLR-1200 -Peter On Wed, Jun 3, 2009 at 4:56 PM, Peter Wolanin wrote: > Is this a known bug?  When I try to unload a core that does not exist, > Solr throws a NullPointerException > > ja

Re: synonyms

2009-06-03 Thread anuvenk
I happened to revisit this post that I had started a long time back. I'm still using the same query-time synonyms. Now I want to be able to map cities to states in the synonyms, and I continue to have this issue with the multi-word synonyms. Could you please explain what you've done to overcome this

Which caches should use the solr.FastLRUCache

2009-06-03 Thread Robert Purdy
Hey there, Anyone got any advice on which caches (filterCache, queryResultCache, documentCache, fieldValueCache) should be implemented using the solr.FastLRUCache in solr 1.4 and what are the pros & cons vs the solr.LRUCache. Thanks Robert. -- View this message in context: http://www.nabble.

Re: Seattle / PNW Hadoop + Lucene User Group?

2009-06-03 Thread Bradford Stephens
Sorry, no videos this time. The conversation wasn't very structured... next month I'll record it :) On Wed, Jun 3, 2009 at 1:59 PM, Bhupesh Bansal wrote: > Great Bradford, > > Can you post some videos if you have some ? > > Best > Bhupesh > > > > On 6/3/09 11:58 AM, "Bradford Stephens" > wrote:

Re: Solr search by segment

2009-06-03 Thread aidahaj
I should clarify that I am running nutch-solr-integration and both schema.xml files are the same in Nutch and in Solr. -- View this message in context: http://www.nabble.com/Solr-search-by-segment-tp23856569p23859728.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Strange behaviour with copyField

2009-06-03 Thread Grant Ingersoll
On Jun 3, 2009, at 5:09 AM, James Grant wrote: I've been hitting my head against a wall all morning trying to figure this out and haven't managed to get anywhere and wondered if anybody here can help. I have defined a field type positionIncrementGap="100"> I have t

Re: Solr search by segment

2009-06-03 Thread aidahaj
Yes it's already defined as String: When I make query by id or url it works but not the segment... -- View this message in context: http://www.nabble.com/Solr-search-by-segment-tp23856569p23859699.html Sent from the Solr - User mailing list archive at Nabble.com.

NPE when unloading an absent

2009-06-03 Thread Peter Wolanin
Is this a known bug? When I try to unload a core that does not exist, Solr throws a NullPointerException java.lang.NullPointerException at org.apache.solr.handler.admin.CoreAdminHandler.handleUnloadAction(CoreAdminHandler.java:319) at org.apache.solr.handler.admin.CoreAdminHandl

Re: fq vs. q

2009-06-03 Thread Otis Gospodnetic
Martin, One option is: q=name:iPod&fq=brand:Apple That way, when you want to search for some other Apple product, Solr will reuse the Apple filter if you again use fq=brand:Apple with the new q=name:foo query. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Origina
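As a small illustration of the q/fq split Otis describes, the sketch below builds two request URLs that share the same fq clause. The host, port and path in `SOLR_SELECT` and the helper `build_select_url` are assumptions made for this example, not something given in the thread.

```python
from urllib.parse import urlencode

# Assumed endpoint, for illustration only.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_select_url(base_url, q, filters):
    """Return a select URL with one q parameter and one fq per filter."""
    params = [("q", q)] + [("fq", f) for f in filters]
    return base_url + "?" + urlencode(params)


# Both requests carry the identical fq=brand:Apple clause, which is
# what lets the filter be reused as described above.
first = build_select_url(SOLR_SELECT, "name:iPod", ["brand:Apple"])
second = build_select_url(SOLR_SELECT, "name:foo", ["brand:Apple"])

print(first)
print(second)
```

Only the q part differs between the two URLs; the fq string is byte-for-byte the same, and the colons are percent-encoded by `urlencode`.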

Re: Token filter on multivalue field

2009-06-03 Thread Otis Gospodnetic
Hello, It's ugly, but the first thing that came to mind was ThreadLocal. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: David Giffin > To: solr-user@lucene.apache.org > Sent: Wednesday, June 3, 2009 1:57:42 PM > Subject: Token filter on

Re: Solr vs Sphinx

2009-06-03 Thread Otis Gospodnetic
Hi, Could you please start a new thread? Thanks, Otis - Original Message > From: sunnyfr > To: solr-user@lucene.apache.org > Sent: Wednesday, June 3, 2009 10:20:06 AM > Subject: Re: Solr vs Sphinx > > > Hi guys, > > I work now for serveral month on solr and really you provide qui

Re: Solr search by segment

2009-06-03 Thread Otis Gospodnetic
What is the type of this field? Use something like "string" type. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: aidahaj > To: solr-user@lucene.apache.org > Sent: Wednesday, June 3, 2009 2:18:37 PM > Subject: Solr search by segment > >

Re: fq vs. q

2009-06-03 Thread Martin Davidsson
On Wed, Jun 3, 2009 at 1:53 AM, Marc Sturlese wrote: > > It's definitely not proper documentation but maybe can give you a hand: > > http://www.derivante.com/2009/04/27/100x-increase-in-solr-performance-and-throughput/ > > > Martin Davidsson-2 wrote: > > > > I've tried to read up on how to decide,

Re: Seattle / PNW Hadoop + Lucene User Group?

2009-06-03 Thread Bradford Stephens
Hey everyone! I just wanted to give a BIG THANKS to everyone who came. We had over a dozen people, and a few got lost at UW :) [I would have sent this update earlier, but I flew to Florida the day after the meeting]. If you didn't come, you missed quite a bit of learning and topics. Such as: -B

Solr search by segment

2009-06-03 Thread aidahaj
Hi, I have an index in which I am always indexing the same documents (re-indexing). So I need to search for them by their segment number. When I ask solrj for the documents by their segment [for example: solrj.query("segment:20090603142546");], it doesn't return anything. I checked the schema.

MoreLikeThis query

2009-06-03 Thread SergeyG
Hi, I'm adding the "MoreLikeThis" functionality to my search. 1. Do I understand it right that the query: q=id:1&mlt=true&mlt.fl=content will bring back documents in which the most important terms of the content field are partly the same as those of the content field of the doc with id=1? 2. Al
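For reference, the query string quoted in the question can be assembled as below. This only shows how the three parameters from the post fit together; the helper name `mlt_params` is invented for this sketch, and nothing here says what Solr returns for the request.

```python
from urllib.parse import urlencode


def mlt_params(doc_id, field):
    """Parameters from the post: select one doc, enable MoreLikeThis,
    and name the field whose terms are compared."""
    return {
        "q": f"id:{doc_id}",
        "mlt": "true",
        "mlt.fl": field,
    }


# safe=":" keeps the colon readable so the result matches the post.
query_string = urlencode(mlt_params(1, "content"), safe=":")
print(query_string)
```

Passing a different field name as the second argument changes only the mlt.fl value.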

filter on millions of IDs from external query

2009-06-03 Thread Ryan McKinley
I am working with an index of ~10 million documents. The index does not change often. I need to perform some external search criteria that will return some number of results -- this search could take up to 5 mins and return anywhere from 0-10M docs. I would like to use the output of t

Token filter on multivalue field

2009-06-03 Thread David Giffin
Hi There, I'm working on a unique token filter, to eliminate duplicates on a multivalue field. My filter works properly for a single value field. It seems that a new TokenFilter is created for each value in the multivalue field. I need to maintain an array of used tokens across all of the values i
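The requirement described here, one set of already-used tokens shared across all values of a field, can be pictured with the plain-Python sketch below. It deliberately does not use the Lucene TokenFilter API; whitespace splitting stands in for the real tokenizer, and the function name `unique_tokens` is hypothetical.

```python
def unique_tokens(values):
    """Drop tokens already seen in this or any earlier value of the field.

    `values` is the list of values of one multivalued field. The `seen`
    set lives outside the per-value loop, which is the state the post
    says needs to survive from one value to the next.
    """
    seen = set()
    result = []
    for value in values:
        kept = []
        for token in value.split():  # stand-in for the real tokenizer
            if token not in seen:
                seen.add(token)
                kept.append(token)
        result.append(kept)
    return result


print(unique_tokens(["red blue", "blue green red"]))
```

If `seen` were created inside the loop instead, each value would be deduplicated on its own, which corresponds to the behaviour reported when a new filter is created per value.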

Re: Keyword Density

2009-06-03 Thread Otis Gospodnetic
I don't think this is possible without changing Solr. Or maybe it's possible with a custom Search Component that looks at all hits and checks the "df" (document frequency) for a term in each document? Sounds like a very costly operation... Otis -- Sematext -- http://sematext.com/ -- Lucene -

Re: Strange behaviour with copyField

2009-06-03 Thread Otis Gospodnetic
James, I don't see the error, but this is exactly what Solr Admin's analysis page will quickly help you with! :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: James Grant > To: solr-user@lucene.apache.org > Sent: Wednesday, June 3, 20

Re: NPE in dataimport.DebugLogger.peekStack (DIH Development Console)

2009-06-03 Thread Shalin Shekhar Mangar
This is fixed in trunk. The next nightly build will have this fix. Thanks! On Tue, Jun 2, 2009 at 9:49 PM, Steffen B. wrote: > > Glad to hear that it's not a problem with my setup. > Thanks for taking care of it! :) > > > Shalin Shekhar Mangar wrote: > > > > On Tue, Jun 2, 2009 at 8:06 PM, Steffe

Re: Solr vs Sphinx

2009-06-03 Thread sunnyfr
Hi guys, I have been working with Solr for several months now, and you really do provide quick answers ... and you're very nice to work with. But I've got a huge issue that I couldn't fix after a lot of posts. My indexing takes one or two days to complete, for 8G of data indexed and 1.5M docs (ok I've plenty of links i

Re: Keyword Density

2009-06-03 Thread Alex Shevchenko
So, is there an ability to perform filtering as I described? On Mon, Jun 1, 2009 at 22:24, Alex Shevchenko wrote: > But I don't need to sort using this value. I need to cut results, where > this value (for particular term of query!) not in some range. > > > On Mon, Jun 1, 2009 at 22:20, Walter U

Strange behaviour with copyField

2009-06-03 Thread James Grant
I've been hitting my head against a wall all morning trying to figure this out and haven't managed to get anywhere and wondered if anybody here can help. I have defined a field type positionIncrementGap="100"> I have two fields required="false" multiValued="true"/

Re: How to avoid space on facet field

2009-06-03 Thread Marc Sturlese
Yeah, that's the point. Once you have this, you can use copyField as was written above with the "string" example. Bny Jo wrote: > > Anshuman, thanks for your input. I will try that, I can understand what you > are trying. > > Marcus, I did not understand how your KeywordTokenizer works. Is that

Re: How to avoid space on facet field

2009-06-03 Thread Bny Jo
Anshuman, thanks for your input. I will try that, I can understand what you are trying. Marcus, I did not understand how your KeywordTokenizer works. Do I have to define a separate field like the one in the example schema and call that field? This is what I came up with.

Re: indexing/crawling HTML + solr

2009-06-03 Thread Otis Gospodnetic
Gena, Besides droids (simpler, smaller components you can put together) there is also Nutch, a bigger beast for large scale crawling that indexes crawled pages into Solr - http://lucene.apache.org/nutch . Otis - Original Message > From: Gena Batsyan > To: solr-user@lucene.apache.org

Re: indexing/crawling HTML + solr

2009-06-03 Thread Olivier Dobberkau
Hi, have a look at the Droids project in the Incubator. Olivier Sent from my iPhone On 03.06.2009 at 12:09, Gena Batsyan wrote: Hi! to be short, where to start with the subject? Any pointers to some [semi-]functional solutions that crawl the web as a normal crawler, take care ab

Alphabetical index for faceting

2009-06-03 Thread Bertrand Mathieu
Hello, My goal is to get an index for alphabetical faceting of titles. For this I'm trying to define a fieldType meant to index first letter of text, with stopwords removed. My problem is that without WordDelimiterFilterFactory stopwords are not removed, and with it I end up with 2 tokens (and I'd

indexing/crawling HTML + solr

2009-06-03 Thread Gena Batsyan
Hi! to be short, where to start with the subject? Any pointers to some [semi-]functional solutions that crawl the web as a normal crawler, take care about html parsing, etc, and feed the crawled stuff as solr-documents per ? regards!

Re: How contrib for solr memcache query cache

2009-06-03 Thread 林彬 陈
https://issues.apache.org/jira/browse/SOLR-1197 --- On Wed, June 3, 2009, chenl...@yahoo.com.cn wrote: From: chenl...@yahoo.com.cn Subject: How contrib for solr memcache query cache To: solr-user@lucene.apache.org Date: Wednesday, June 3, 2009, 3:44 PM Hi all: I want to contrib memcache implement solr cache (only test

Re: fq vs. q

2009-06-03 Thread Anshuman Manur
wow! that was a good read!!! On Wed, Jun 3, 2009 at 2:23 PM, Marc Sturlese wrote: > > It's definitely not proper documentation but maybe can give you a hand: > > http://www.derivante.com/2009/04/27/100x-increase-in-solr-performance-and-throughput/ > > > Martin Davidsson-2 wrote: > > > > I've trie

Re: fq vs. q

2009-06-03 Thread Marc Sturlese
It's definitely not proper documentation but maybe can give you a hand: http://www.derivante.com/2009/04/27/100x-increase-in-solr-performance-and-throughput/ Martin Davidsson-2 wrote: > > I've tried to read up on how to decide, when writing a query, what > criteria goes in the q parameter and

Re: How to avoid space on facet field

2009-06-03 Thread Marc Sturlese
You can configure a "facet_text" instead of the normal "text" type. There you use KeyWordTokenizer instead of StandardTokenizer. One of the advantages of using it instead of "string" is that it will allow you to use synonyms, stopwords and filters and all the properties from an analyzer. Anshuma

Re: How contrib for solr memcache query cache

2009-06-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
please raise this as an issue in Jira https://issues.apache.org/jira/browse/SOLR let us see what others think about this On Wed, Jun 3, 2009 at 1:14 PM, wrote: > > Hi all: > > I want to contrib memcache implement solr cache (only test query result cache) > > patch for solr 1.3 http://code.googl

How contrib for solr memcache query cache

2009-06-03 Thread chenlbya
Hi all: I want to contribute a memcache-based implementation of the Solr cache (only tested with the query result cache). Patch for Solr 1.3: http://code.google.com/p/solr-side/issues/detail?id=1&can=1 solr-memcache.zip: http://solr-side.googlecode.com/files/solr-memcache.zip =readme.txt=