Re: Reasonable number of maxWarming searchers

2009-08-07 Thread Chris Hostetter
: Is there a problem if I set maxWarmingSearchers to something like 30 or 40? my personal opinion: anything higher than 3 indicates a serious architecture problem. On a master, doing lots of updates, the "warming" time should be zero, so there shouldn't ever be more than 2 searchers at one ti

Re: update some index documents after indexing process is done with DIH

2009-08-07 Thread Chris Hostetter
: What is confusing me now is that I have to implement my logic in you're certainly in a fuzzy grey area here ... none of this stuff was designed for the kind of thing you're doing. : But in processCommit, having access to the core I can get the IndexReader : but I still don't know how to get t

Re: solr indexing on same set of records with different value of unique field, not working fine.

2009-08-07 Thread Chris Hostetter
: Sorry, schema.xml file is here in this mail... in the schema.xml file you attached, the uniqueKey field is "evid" you only provided one example of the type of input you are indexing, and in that example... : > 501 ...but in your original email (see below) you said you were using a

Re: solr/home in web.xml relative to web server home

2009-08-07 Thread Chris Hostetter
: the environment variable (env-entry) in web.xml to configure the solr/home is : relative to the web server's working directory. I find this unusual as all the : servlet paths are relative to the web applications directory (webapp context, : that is). So, I specified solr/home relative to the web

Re: 99.9% uptime requirement

2009-08-07 Thread Chris Hostetter
: Subject: 99.9% uptime requirement : In-Reply-To: <4a730d0f.3050...@btelligent.de> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even

Re: Question regarding merging Solr indexes

2009-08-07 Thread Shalin Shekhar Mangar
On Fri, Aug 7, 2009 at 10:45 PM, ahammad wrote: > > Hello, > > I have a MultiCore setup with 3 cores. I am trying to merge the indexes of > core1 and core2 into core3. I looked at the wiki but I'm somewhat unclear > on > what needs to happen. > > This is what I used: > > > http://localhost:9085/s

Re: Can multiple Solr webapps access the same lucene index files?

2009-08-07 Thread Otis Gospodnetic
Yes, they could all point to an index that lives on a NAS or SAN, for example. You'd still have to make sure only one server is writing to the index at a time. Zookeeper can help with coordination of that. Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nut

How to use key with facet.prefix?

2009-08-07 Thread Jón Helgi Jónsson
I'm trying to facet multiple times on same field using key. This works fine except when I use prefixes for these facets. What I got so far (and not functional): .. &facet=true &facet.field=category&f.category.facet.prefix=01 &facet.field={!key=subcat}category&f.subcat.facet.prefix=00 This will g

MoreLikeThis: How to get quality terms from html from content stream?

2009-08-07 Thread Jay Hill
I'm using the MoreLikeThisHandler with a content stream to get documents from my index that match content from an html page like this: http://localhost:8080/solr/mlt?stream.url=http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2009/08/06/SP5R194Q13.DTL&mlt.fl=body&rows=4&debugQuery=true But, not su

Can multiple Solr webapps access the same lucene index files?

2009-08-07 Thread Mark Diggory
Hello, I have a question I can't find an answer to in the list. Can multiple solr webapps (for instance in separate cluster nodes) share the same lucene index files stored within a shared filesystem? We do this with a custom Lucene search application right now, I'm trying to switch to using solr

Re: solr v1.4 in production?

2009-08-07 Thread Ian Connor
Pubget has been using 1.4 for a while now to make the replication easier. http://pubget.com We compiled a while back and are thinking of updating to the latest build to start playing with distributed spell checking. On Fri, Aug 7, 2009 at 7:42 AM, Shalin Shekhar Mangar < shalinman...@gmail.com>

spellcheck component in 1.4 distributed

2009-08-07 Thread mike anderson
I am e-mailing to inquire about the status of the spellchecking component in 1.4 (distributed). I saw SOLR-785, but it is unreleased and for 1.5. Any help would be much appreciated. Thanks in advance, Mike

Re: Solr CMS Integration

2009-08-07 Thread Paul Libbrecht
Hello Wojtek, I don't want to discourage all the famous CMSs around nor solr uptake but xwiki is quite a powerful CMS and has a search that is lucene based. paul On 07 Aug 09 at 22:42, Olivier Dobberkau wrote: I've been asked to suggest a framework for managing a website's content an

Re: Solr CMS Integration

2009-08-07 Thread Olivier Dobberkau
On 07.08.2009 at 19:01, wojtekpia wrote: I've been asked to suggest a framework for managing a website's content and making all that content searchable. I'm comfortable using Solr for search, but I don't know where to start with the content management system. Is anyone using a CMS (open so

Solr Security

2009-08-07 Thread Francis Yakin
Has anyone had experience setting up Solr security? http://wiki.apache.org/solr/SolrSecurity I would like to implement it using HTTP Authentication or Path Based Authentication. So, in webdefault.xml I set the following: Solr authenticated application /core

PhoneticFilterFactory related questions

2009-08-07 Thread Reuben Firmin
Hi, I have a schema with three (relevant to this question) fields: title, author, book_content. I found that if PhoneticFilterFactory is used as a filter on book_content, it was bringing back all kinds of unrelated results, so I have it applied only against title and author. Questions -- 1) I ha

Re: Solr CMS Integration

2009-08-07 Thread wojtekpia
Thanks for the responses. I'll give Drupal a shot. It sounds like it'll do the trick, and if it doesn't then at least I'll know what I'm looking for. Wojtek -- View this message in context: http://www.nabble.com/Solr-CMS-Integration-tp24868462p24870218.html Sent from the Solr - User mailing lis

Re: Solr CMS Integration

2009-08-07 Thread Tim Archambault
I would second that and add that you may want to consider acquia.com as they provide a solid infrastructure to support the solr instance. On Fri, Aug 7, 2009 at 11:20 AM, Andre Hagenbruch wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > wojtekpia wrote: > > Hi Wojtek, > > > I've bee

Re: Is kill -9 safe or not?

2009-08-07 Thread solrcoder
Thanks for the confirmation and reassurance! - Michael Yonik Seeley-2 wrote: > > On Fri, Aug 7, 2009 at 12:04 PM, Otis > Gospodnetic wrote: >> Yonik, >> >> Uncommitted (as in solr un"commit"ed) or unflushed? > > Solr uncommitted. Even if the docs hit the disk via a segment flush, > they aren'

Re: localSolr install

2009-08-07 Thread Bhargava Sriram
Hi All, I also need the same information. I am planning to set up solr. I have around 20 to 30 million records, in CSV format. Your help would be highly appreciated. Regards, Bhargava S Akula. 2009/8/7 Brian Klippel > Is there any sort of guide to installing and configuring localSo

localSolr install

2009-08-07 Thread Brian Klippel
Is there any sort of guide to installing and configuring localSolr into an existing solr implementation? I'm not extremely versed with java applications, but I've managed to cobble together jetty and solr multicore fairly reliably. I've downloaded localLucene 2.0 and localSolr 6.1, and this is

Re: Solr CMS Integration

2009-08-07 Thread Grant Ingersoll
lucidimagination.com is powered off of Drupal and we index it using Solr (but not the Drupal plugin, as we have non CMS data as well). It has blogs, articles, white papers, mail archives, JIRA tickets, wikis etc. On Aug 7, 2009, at 1:01 PM, wojtekpia wrote: I've been asked to suggest a

Re: Solr CMS Integration

2009-08-07 Thread Andre Hagenbruch
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 wojtekpia wrote: Hi Wojtek, > I've been asked to suggest a framework for managing a website's content and > making all that content searchable. I'm comfortable using Solr for search, > but I don't know where to start with the content management sys

Question regarding merging Solr indexes

2009-08-07 Thread ahammad
Hello, I have a MultiCore setup with 3 cores. I am trying to merge the indexes of core1 and core2 into core3. I looked at the wiki but I'm somewhat unclear on what needs to happen. This is what I used: http://localhost:9085/solr/core3/admin/?action=mergeindexes&core=core3&indexDir=/solrHome/cor
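For reference, a merge request of this kind is usually sent to the CoreAdmin handler rather than to a single core's admin page, with one indexDir parameter per source index. The following is only a minimal Python sketch of how such a URL could be assembled. It assumes the default adminPath of /admin/cores; the index directory paths are hypothetical placeholders, not the (truncated) paths from the message above, and merge_indexes_url is an illustrative helper, not part of Solr or SolrJ.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, target_core, index_dirs):
    """Build a CoreAdmin mergeindexes URL with one indexDir parameter per source index.

    Assumes the CoreAdmin handler is mounted at <base_url>/admin/cores.
    """
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("indexDir", path) for path in index_dirs]
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)


# Hypothetical index locations, used for illustration only.
url = merge_indexes_url(
    "http://localhost:9085/solr",
    "core3",
    ["/path/to/core1/data/index", "/path/to/core2/data/index"],
)
print(url)
```

Passing the parameters as a list of pairs (rather than a dict) is what allows indexDir to be repeated once per source index.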

Re: Preserving "C++" and other weird tokens

2009-08-07 Thread solrcoder
Ach, sorry I didn't find this before posting! - Michael Yonik Seeley-2 wrote: > > http://search.lucidimagination.com/search/document/2d325f6178afc00a/how_to_search_for_c > > -Yonik > http://www.lucidimagination.com > -- View this message in context: http://www.nabble.com/Preserving-%22C%2B

Solr CMS Integration

2009-08-07 Thread wojtekpia
I've been asked to suggest a framework for managing a website's content and making all that content searchable. I'm comfortable using Solr for search, but I don't know where to start with the content management system. Is anyone using a CMS (open source or commercial) that you've integrated with S

Re: Is kill -9 safe or not?

2009-08-07 Thread Yonik Seeley
On Fri, Aug 7, 2009 at 12:04 PM, Otis Gospodnetic wrote: > Yonik, > > Uncommitted (as in solr un"commit"ed) or unflushed? Solr uncommitted. Even if the docs hit the disk via a segment flush, they aren't part of the index until the index descriptor (segments_n) is written pointing to that new segm

Re: Attempt to query for max id failing with exception

2009-08-07 Thread Reuben Firmin
Yep, thanks - this turned out to be a systems configuration error. Our sysadmin hadn't opened up the http port on the server's internal network interface; I could browse to it from outside (i.e. firefox on my machine), but the apache landing page was being returned when CommonsHttpSolrServer tried

Re: Is kill -9 safe or not?

2009-08-07 Thread Otis Gospodnetic
Yonik, Uncommitted (as in solr un"commit"ed) or unflushed? Thanks, Otis - Original Message > From: Yonik Seeley > To: solr-user@lucene.apache.org > Sent: Friday, August 7, 2009 11:10:49 AM > Subject: Re: Is kill -9 safe or not? > > Kill -9 will not corrupt your index, but you would

Re: Attempt to query for max id failing with exception

2009-08-07 Thread Yonik Seeley
I just tried this sample code... it worked fine for me on trunk. -Yonik http://www.lucidimagination.com On Thu, Aug 6, 2009 at 8:28 PM, Reuben Firmin wrote: > I'm using SolrJ. When I attempt to set up a query to retrieve the maximum id > in the index, I'm getting an exception. > > My setup code i

Re: Preserving "C++" and other weird tokens

2009-08-07 Thread Yonik Seeley
http://search.lucidimagination.com/search/document/2d325f6178afc00a/how_to_search_for_c -Yonik http://www.lucidimagination.com On Thu, Aug 6, 2009 at 11:38 AM, Michael _ wrote: > Hi everyone, > I'm indexing several documents that contain words that the StandardTokenizer > cannot detect as token

Re: Is kill -9 safe or not?

2009-08-07 Thread Yonik Seeley
Kill -9 will not corrupt your index, but you would lose any uncommitted documents. -Yonik http://www.lucidimagination.com On Fri, Aug 7, 2009 at 11:07 AM, Michael _ wrote: > I've seen several threads that are one or two years old saying that > performing "kill -9" on the java process running Sol

Re: Preserving "C++" and other weird tokens

2009-08-07 Thread Michael _
On Thu, Aug 6, 2009 at 11:38 AM, Michael _ wrote: > Hi everyone, > I'm indexing several documents that contain words that the > StandardTokenizer cannot detect as tokens. These are words like > C# > .NET > C++ > which are important for users to be able to search for, but get treated as > "

Is kill -9 safe or not?

2009-08-07 Thread Michael _
I've seen several threads that are one or two years old saying that performing "kill -9" on the java process running Solr either CAN, or CAN NOT corrupt your index. The more recent ones seem to say that it CAN NOT, but before I bake a kill -9 into my control script (which first tries a normal "kil
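The "normal kill first, kill -9 as a fallback" pattern mentioned in this message could look roughly like the Python sketch below. It is not the poster's control script: stop_process and the grace period are assumptions made for illustration, and a short-lived Python child stands in for the Java process that would run Solr.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=10.0):
    """Ask the process to exit (SIGTERM); force-kill it (SIGKILL) only if it is still alive."""
    proc.terminate()  # polite request, the equivalent of a plain "kill"
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # last resort, the equivalent of "kill -9"
        proc.wait()
        return "killed"


# Stand-in child process; a real script would manage the servlet container instead.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
outcome = stop_process(child, grace_seconds=5.0)
print(outcome)
```

Whichever path is taken, documents added since the last commit are lost if the process is force-killed, as noted in the replies to this thread.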

Re: Item Facet

2009-08-07 Thread David Lojudice Sobrinho
The behavior I'm expecting is something similar to a GROUP BY in a relational database. SELECT product_name, model, min(price), max(price), count(*) FROM t GROUP BY product_name, model The current schema: product_name (type: text) model (type: text) price (type: sfloat) On Fri, Aug 7, 2009 at
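To make the requested result shape concrete, the GROUP BY above can be mimicked on the client side over documents that have already been fetched. This is only a sketch of the aggregation itself, not a Solr feature: group_prices is an illustrative helper and the sample rows are made up; only the field names (product_name, model, price) come from the message.

```python
from collections import defaultdict


def group_prices(docs):
    """Group docs by (product_name, model) and compute min, max and count of price."""
    prices_by_key = defaultdict(list)
    for doc in docs:
        prices_by_key[(doc["product_name"], doc["model"])].append(doc["price"])
    return {
        key: {"min": min(prices), "max": max(prices), "count": len(prices)}
        for key, prices in prices_by_key.items()
    }


# Made-up sample documents for illustration.
docs = [
    {"product_name": "camera", "model": "A1", "price": 199.0},
    {"product_name": "camera", "model": "A1", "price": 249.0},
    {"product_name": "camera", "model": "B2", "price": 99.0},
]
summary = group_prices(docs)
for key, stats in sorted(summary.items()):
    print(key, stats)
```

Doing this on the client only works when the whole result set fits in memory, which is why the question asks for a server-side equivalent.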

Re: CorruptIndexException: Unknown format version

2009-08-07 Thread Yonik Seeley
Wow, that is an interesting one... I bet there is more than one Lucene version kicking around the classpath somehow. Try removing all of the servlet container's working directories. -Yonik http://www.lucidimagination.com On Fri, Aug 7, 2009 at 4:41 AM, Maximilian Hütter wrote: > Hi, > > how can t

Re: Item Facet

2009-08-07 Thread Yao Ge
Are your product_name* fields numeric fields (integer or float)? Dals wrote: > > Hi... > > Is there any way to group values like shopping.yahoo.com or > shopper.cnet.com do? > > For instance, I have documents like: > > doc1 - product_name1 - value1 > doc2 - product_name1 - value2 > doc3 - p

Re: Solr 1.4 in Production Environment-- Is it stable?

2009-08-07 Thread Jeff Newburn
We also use 1.4 which has gotten hit with load tests of up to 2000 queries/sec. Biggest thing is to make sure you are using the slaves for that kind of load. Other than that 1.4 is pretty impressive. -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 > From: Otis Gosp

Re: Item Facet

2009-08-07 Thread David Lojudice Sobrinho
Thanks Avlesh. But I didn't get it. How would a dynamic field aggregate values at query time? On Thu, Aug 6, 2009 at 11:14 PM, Avlesh Singh wrote: > Dynamic fields might be an answer. If you had a field called "product_*" and > these were populated with the corresponding values during indexing th

Re: Language Detection for Analysis?

2009-08-07 Thread Grant Ingersoll
There are several free Language Detection libraries out there, as well as a few commercial ones. I think Karl Wettin has even written one as a plugin for Lucene. Nutch also has one, AIUI. I would just Google "language detection". Also see http://www.lucidimagination.com/search/?q=languag

Re: Solr 1.4 in Production Environment-- Is it stable?

2009-08-07 Thread Otis Gospodnetic
I know a number of large companies using 1.4-dev. But you could also wait another month or so and get the real 1.4. Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR - Original Message > From: Ninad Ra

Re: solr v1.4 in production?

2009-08-07 Thread Shalin Shekhar Mangar
On Wed, Jul 1, 2009 at 6:17 PM, Ed Summers wrote: > Here at the Library of Congress we've got several production Solr > instances running v1.3. We've been itching to get at what will be v1.4 > and were wondering if anyone else happens to be using it in production > yet. Any information you can pr

Solr 1.4 in Production Environment-- Is it stable?

2009-08-07 Thread Ninad Raut
Hi, Has anyone used Solr 1.4 in production? There are some really nice features in it like - Directly adding POJOs to Solr - ReplicationHandler etc. Is 1.4 stable enough to be used in production?

Re: mergeFactor / indexing speed

2009-08-07 Thread Chantal Ackermann
Thanks for the tip, Shalin. I'm happy with 6 indexes running in parallel and completing in less than 10min, right now, but I'll have a look anyway. Shalin Shekhar Mangar wrote: On Fri, Aug 7, 2009 at 3:58 PM, Chantal Ackermann < chantal.ackerm...@btelligent.de> wrote: Juhu, great news, guys.

Re: mergeFactor / indexing speed

2009-08-07 Thread Shalin Shekhar Mangar
On Fri, Aug 7, 2009 at 3:58 PM, Chantal Ackermann < chantal.ackerm...@btelligent.de> wrote: > Juhu, great news, guys. I merged my child entity into the root entity, and > changed the custom entityprocessor to handle the additional columns > correctly. > And - indexing 160k documents now takes 5min

Help creating schema for indexable document

2009-08-07 Thread rossputin
Hi Guys. I am struggling to create a schema with a deterministic content model for a set of documents I want to index. My indexable documents will look something like: 1 code1 code2 mycategory My service will be mission critical and will accept batch imports from a potent

Re: Language Detection for Analysis?

2009-08-07 Thread Jukka Zitting
Hi, On Fri, Aug 7, 2009 at 12:31 PM, Andrzej Bialecki wrote: > .. and a Nutch plugin with similar functionality: > > http://lucene.apache.org/nutch/apidocs-1.0/org/apache/nutch/analysis/lang/LanguageIdentifier.html See also TIKA-209 [1] where I'm currently integrating the Nutch code to work with

Re: Language Detection for Analysis?

2009-08-07 Thread Andrzej Bialecki
Otis Gospodnetic wrote: Bradford, If I may: Have a look at http://www.sematext.com/products/language-identifier/index.html And/or http://www.sematext.com/products/multilingual-indexer/index.html .. and a Nutch plugin with similar functionality: http://lucene.apache.org/nutch/apidocs-1.0/org/

Re: mergeFactor / indexing speed

2009-08-07 Thread Chantal Ackermann
Juhu, great news, guys. I merged my child entity into the root entity, and changed the custom entityprocessor to handle the additional columns correctly. And - indexing 160k documents now takes 5min instead of 1.5h! (Now I can go relaxed on vacation. :-D ) Conclusion: In my case performance w

CorruptIndexException: Unknown format version

2009-08-07 Thread Maximilian Hütter
Hi, how can that happen, it is a new index, and it is already corrupt? Has anybody else seen something like this? WARN - 2009-08-07 10:44:54,925 | Solr index directory 'data/solr/index' doesn't exist. Creating new index... WARN - 2009-08-07 10:44:56,583 | solrconfig.xml uses deprecated , Please updat

Re: Documentation for Master-Slave Replication missing for Solr1.3. No mirror site for Solr 1.4 distribution.

2009-08-07 Thread Ninad Raut
Hi Noble, can these builds be used in production environment? Are they stable? we are not going live now, but in a few months we will. as such when will 1.4 be officially released? 2009/8/7 Noble Paul നോബിള്‍ नोब्ळ् > 1.4 is not released yet. you can grab a nightly from here > http://people.apac

Re: Documentation for Master-Slave Replication missing for Solr1.3. No mirror site for Solr 1.4 distribution.

2009-08-07 Thread Shalin Shekhar Mangar
On Fri, Aug 7, 2009 at 12:47 PM, Ninad Raut wrote: > Hi, > I want to know how to setup master-slave configuration for Solr 1.3 . I > can't get documentation on the net. I found one for 1.4 but not for 1.3 . > ReplicationHandler is not present in 1.3. > Also, I would like to know from will I get

Re: Documentation for Master-Slave Replication missing for Solr1.3. No mirror site for Solr 1.4 distribution.

2009-08-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
1.4 is not released yet. you can grab a nightly from here http://people.apache.org/builds/lucene/solr/nightly/ On Fri, Aug 7, 2009 at 12:47 PM, Ninad Raut wrote: > Hi, > I want to know how to setup  master-slave configuration for Solr  1.3 . I > can't get documentation on the net. I found one for

Documentation for Master-Slave Replication missing for Solr1.3. No mirror site for Solr 1.4 distribution.

2009-08-07 Thread Ninad Raut
Hi, I want to know how to set up a master-slave configuration for Solr 1.3. I can't find documentation on the net. I found one for 1.4 but not for 1.3. ReplicationHandler is not present in 1.3. Also, I would like to know where I can get the Solr 1.4 distribution. The Solr Site lists mirrors only fo