Creating document schema at runtime

2007-12-11 Thread Shalin Shekhar Mangar
Hi, I'm looking for some tips on how to create a new document schema and add it to a Solr core at runtime. The use case that I'm trying to solve is: 1. Using a custom configuration tool, the user creates a Solr schema. 2. The schema is added (uploaded) to a Solr instance (on a remote machine). 3. Documen

Creating user-defined field types

2007-12-11 Thread Rishabh Joshi
Hi, Can anyone guide me as to how one can go about implementing user-defined field types in Solr? I could not find anything on the Solr wiki. Help of any kind would be appreciated. Regards, Rishabh

Facets - What's a better term for non technical people?

2007-12-11 Thread Benjamin O'Steen
Whilst many of the people on this list (myself included) have a pretty good grasp of what is meant by the term facet, this is not clear to people who approach the system from a more fresh point of view. So, has anyone got a good example of the language they might use over, say, a set of radio bu

Re: Facets - What's a better term for non technical people?

2007-12-11 Thread Adrian Sutton
On 11/12/2007, at 8:32 PM, Benjamin O'Steen wrote: So, has anyone got a good example of the language they might use over, say, a set of radio buttons and fields on a web form, to indicate that selecting one or more of these would return facets. 'Show grouping by' or 'List the sets that the result

RE: Facets - What's a better term for non technical people?

2007-12-11 Thread DAVIGNON Andre - CETE NP/DIODé/PANDOC
Hi, > So, has anyone got a good example of the language they might use over, > say, a set of radio buttons and fields on a web form, to indicate that > selecting one or more of these would return facets. 'Show grouping by' > or 'List the sets that the results fall into' or something similar. Here

Re: Replication hooks

2007-12-11 Thread Tracy Flynn
That's what I was after. As always, thanks for the quick response. Tracy On Dec 11, 2007, at 12:18 AM, Yonik Seeley wrote: On Dec 10, 2007 11:22 PM, climbingrose <[EMAIL PROTECTED]> wrote: I think there is an event listener interface for hooking into Solr events such as post commit, post opt

How to effectively search inside fields that should be indexed with changing them.

2007-12-11 Thread Brian Carmalt
Hello all, The titles of our docs have the form "ABC0001231-This is an important doc.pdf". I would like to be able to search for 'important', or '1231', or 'ABC000*', or 'This is an important doc' in the title field. I looked at the NGramTokenizer and tried to use it. In the index it doesn't

Two Solr Webapps, one folder for the index data?

2007-12-11 Thread Jörg Kiegeland
I have successfully configured two parallel Solr webapps; however, I see that all data gets stored in one folder of my Tomcat installation, namely C:\Tomcat\solr\data\index. How can I configure each Solr webapp to store the data in the folders I assigned at , where already the Solr

Re: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread patrick o'leary
I actually have a patch for the Solr config parser which allows you to use context environment variables in solrconfig.xml. I generally use it for development when I'm working with multiple instances and different data dirs. I'll add it to JIRA today if you want it. P Jörg Kiegeland wrote: I

Re: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread Jörg Kiegeland
I actually have a patch for the Solr config parser which allows you to use context environment variables in solrconfig.xml. I generally use it for development when I'm working with multiple instances and different data dirs. I'll add it to JIRA today if you want it. That would be nice. Howeve

Re: Facets - What's a better term for non technical people?

2007-12-11 Thread Charles Hornberger
FAST calls them "navigators" (which I think is a terrible term - YMMV of course :-)) I tend to think that "filters" -- or perhaps "dynamic filters" -- captures the essential function. On Dec 11, 2007 2:38 AM, "DAVIGNON Andre - CETE NP/DIODé/PANDOC" <[EMAIL PROTECTED]> wrote: > Hi, > > > So, has a

Re: Creating user-defined field types

2007-12-11 Thread Yonik Seeley
On Dec 11, 2007 5:17 AM, Rishabh Joshi <[EMAIL PROTECTED]> wrote: > Can anyone guide me as to how one can go about implementing user-defined > field types in Solr? At a higher level, what are you trying to accomplish? If you just want to customize analysis, just copy and modify an existing fieldTy

Re: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread Mike Klaas
I use JVM system properties for this; they seem to work well. -Mike On 11-Dec-07, at 7:39 AM, patrick o'leary wrote: I actually have a patch for the Solr config parser which allows you to use context environment variables in solrconfig.xml. I generally use it for development when I'm working w

Re: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread patrick o'leary
JVM properties restrict you to a single implementation within a JVM. For instance, if you want multiple instances of Solr running with the same schema but different data dirs in the one app server, you'll have to have several copies of solrconfig and schema.xml. By using context environment

Re: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread Chris Hostetter
: However I cannot believe that one cannot configure this by some configuration : file by now - what if only one index needs to be backed up and the other index is the option you are looking for the in solrconfig.xml ? -Hoss

Pattern that generates two tokens per match

2007-12-11 Thread Ken Krugler
Hi all, I've got a pattern in a document (call it "xy") that I want to turn into two tokens - "xy" and "y". One approach I could use is PatternTokenizer to extract "xy", and then a custom filter that returns "xy" and then "y" on the next call (caches the next result). Or I could extend Pat
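The two-tokens-per-match idea in this message can be sketched outside Solr. This is plain Python, not Solr's PatternTokenizer or any Lucene filter API; the pattern `x(y)` and the function name `two_tokens_per_match` are hypothetical, chosen only to mirror the "xy" / "y" example. It assumes the second token can be expressed as a capture group inside the first.

```python
import re

def two_tokens_per_match(text, pattern=r"x(y)"):
    """Yield the whole match, then its first capture group, for each match.

    The default pattern is an illustrative assumption: it matches "xy"
    and captures the trailing "y".
    """
    for match in re.finditer(pattern, text):
        yield match.group(0)  # the full match, e.g. "xy"
        yield match.group(1)  # the captured part, e.g. "y"

tokens = list(two_tokens_per_match("a xy b xy"))
```

Emitting both tokens from a single pass over the matches avoids caching state between calls, which is the complication the custom-filter approach would have to handle.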

Re: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread Chris Hostetter
: I actually have a patch for solr config parser which allows you to use : context environment variables in the solrconfig.xml : I generally use it for development when I'm working with multiple : instances and different data dirs. I'll add it to jira today if you : want it. yes please! ... Solr

Re: Pattern that generates two tokens per match

2007-12-11 Thread Mike Klaas
On 11-Dec-07, at 11:51 AM, Ken Krugler wrote: Hi all, I've got a pattern in a document (call it "xy") that I want to turn into two tokens - "xy" and "y". One approach I could use is PatternTokenizer to extract "xy", and then a custom filter that returns "xy" and then "y" on the next cal

Re: Solr, Multiple processes running

2007-12-11 Thread Otis Gospodnetic
Martin, Look into MultiCore (new stuff, some info on the Wiki) or into running multiple Solrs inside a single JVM. We just did this with Jetty 6.1.6 for a client and it works beautifully. This is also documented on the Wiki. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch -

Re: How to effectively search inside fields that should be indexed with changing them.

2007-12-11 Thread Otis Gospodnetic
Brian, This is not really a job for n-grams. It sounds like you'll want to write a custom Tokenizer that has knowledge about this particular pattern, knows how to split input like the one in your example, and produce multiple tokens out of it. For the natural language part you can probably ge

Re: Solr, Multiple processes running

2007-12-11 Thread Otis Gospodnetic
Keeping track of 1000+ indices is actually not that hard. I've implemented Simpy - http://simpy.com - in a way that keeps each member's index (or indices - some users have multiple indices) separate. I can't give out the total number of Simpy users, but I can tell you it is weeell beyo

RE: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread Arnone, Anthony
I asked a question similar to this back in http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200709.mbox/[EMAIL PROTECTED] and didn't really find anyone who was doing this. What I wound up doing was adding a variable to the context.xml file called contextRelativeHome: solr/contextRel

Re: Solr, Multiple processes running

2007-12-11 Thread Erick Erickson
You're right, I'm wrong. I certainly am willing to defer to someone who's been there before. On Dec 11, 2007 4:44 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > Keeping track of 1000+ indices is actually not that hard. I've > implemented Simpy - http://simpy.com - in a way that keeps each me

Re: Two Solr Webapps, one folder for the index data?

2007-12-11 Thread Otis Gospodnetic
Maybe I'm confused. Can't you use the brand-spanking new MultiCore stuff for this, or JNDI, as I just mentioned in the "Re: Solr, Multiple processes running" thread? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: patrick o'leary <[EMAIL PRO

Re: Solr, Multiple processes running

2007-12-11 Thread Erick Erickson
How much data are we talking about here? Because it seems *much* simpler to just index a field with each document indicating the user and then just AND that user's ID in with your query. Or think about facets (although I admit I don't know enough about facets to weigh in on its merits, it's just b

Re: SOLR X FAST

2007-12-11 Thread Matthew Runo
I think it all depends, what do you want out of Solr or FAST? Thanks! Matthew Runo Software Developer 702.943.7833 On Dec 11, 2007, at 2:09 PM, William Silva wrote: Hi, How is the best way to compare SOLR and FAST Search ? Thanks, William.

Solr and Flex

2007-12-11 Thread jenix
Has anyone used Solr in a Flex application? Any code snippets to share? Thank you. Jennifer -- View this message in context: http://www.nabble.com/Solr-and-Flex-tp14284703p14284703.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR X FAST

2007-12-11 Thread William Silva
Hi, Why use FAST and not use SOLR ? For example. What will FAST offer that will justify the investment ? I would like a matrix comparing both. Thanks, William. On Dec 11, 2007 8:15 PM, Matthew Runo <[EMAIL PROTECTED]> wrote: > I think it all depends, what do you want out of Solr or FAST? > >

Re: SOLR X FAST

2007-12-11 Thread Ravish Bhagdev
Stability and better Support (at great cost obviously) On Dec 11, 2007 10:20 PM, William Silva <[EMAIL PROTECTED]> wrote: > Hi, > Why use FAST and not use SOLR ? For example. > What will FAST offer that will justify the investment ? > I would like a matrix comparing both. > Thanks, > William. > >

Re: Solr, Multiple processes running

2007-12-11 Thread Walter Underwood
Since they all use the same schema, can you add a client ID to each document when it is indexed? Filter by "clientid:4" and you get a subset of the index. wunder On 12/11/07 1:01 PM, "Owens, Martin" <[EMAIL PROTECTED]> wrote: > Hello everyone, > > The system we're moving from (dtSearch) allows
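A minimal sketch of the client-ID filter suggested here, assuming the restriction is passed as a Solr filter query (`fq`) alongside the user's query. The field name `clientid` comes from the message; the base URL, the `/select` path, the sample query text, and the helper name `build_select_url` are assumptions for illustration only.

```python
from urllib.parse import urlencode

def build_select_url(base_url, user_query, client_id):
    """Build a select URL that restricts results to one client's documents."""
    params = {
        "q": user_query,                # what the user typed
        "fq": f"clientid:{client_id}",  # restrict to this client's subset
    }
    return f"{base_url}/select?{urlencode(params)}"

# Hypothetical host and query, used only to show the resulting URL shape.
url = build_select_url("http://localhost:8983/solr", "ipod", 4)
```

Keeping the client restriction separate from the user's query means the user-entered text never has to be rewritten to add the ID clause.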

Re: distributing indexes via solr

2007-12-11 Thread Mike Klaas
On 10-Dec-07, at 12:50 PM, Doug T wrote: I have been using parallelmultisearches on multi-CPU machines, and seen sizable benefit over a single large index (even if all of the fragments are on 1 disk). Is there a way to quickly enable this on a solr server? Or do I need to go into the so

Re: SOLR X FAST

2007-12-11 Thread Nuno Leitao
Depends, if you are looking for a small sized index (gigabytes rather than dozens or hundreds of gigabytes or terabytes) with relatively simple requirements (a few facets, simple tokenization, English only linguistics, etc.) Solr is likely to be appropriate for most cases. FAST however give

Re: SOLR X FAST

2007-12-11 Thread Ravish Bhagdev
Could you please elaborate on what you mean by ingestion pipeline and horizontal scalability? I apologize if this is a stupid question everyone else on the forum is familiar with. Thanks, Ravi On Dec 12, 2007 1:09 AM, Nuno Leitao <[EMAIL PROTECTED]> wrote: > Depends, if you are looking for a sma

Re: SOLR X FAST

2007-12-11 Thread Nuno Leitao
FAST uses two pipelines - an ingestion pipeline (for document feeding) and a query pipeline which are fully programmable (i.e., you can customize it fully). At ingestion time you typically prepare documents for indexing (tokenize, character normalize, lemmatize, clean up text, perform ent

RE: SOLR X FAST

2007-12-11 Thread Norskog, Lance
FAST is a little less flexible (no dynamic fields) and not programmable at the Lucene level. We recently switched from FAST to Solr because of cost reasons. They did not know how to license us; they are used to, say, IBM running FAST on hundreds of servers. We are a startup with very specific ne

Re: Facets - What's a better term for non technical people?

2007-12-11 Thread Mike Klaas
"category counts" On 11-Dec-07, at 6:38 PM, Norskog, Lance wrote: In SQL terms they are: 'select unique'. Except on only one field. -Original Message- From: Charles Hornberger [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 11, 2007 9:49 AM To: solr-user@lucene.apache.org Subject: Re

RE: Facets - What's a better term for non technical people?

2007-12-11 Thread Norskog, Lance
In SQL terms they are: 'select unique'. Except on only one field. -Original Message- From: Charles Hornberger [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 11, 2007 9:49 AM To: solr-user@lucene.apache.org Subject: Re: Facets - What's a better term for non technical people? FAST call

Re: SOLR X FAST

2007-12-11 Thread Otis Gospodnetic
Just to comment on that last part: "There are FAST deployments out there which run on dozens, in some cases hundreds of nodes serving multiple terabyte size indexes and achieving hundreds of queries per seconds." There are also a lot of Lucene or Solr deployments with similar setups - I've wo

Re: Facets - What's a better term for non technical people?

2007-12-11 Thread Otis Gospodnetic
Isn't that GROUP BY ColumnX, count(1) type of thing? I'd think "group by" would be a good label. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: "Norskog, Lance" <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Tuesday, December 11, 20
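The GROUP BY comparison can be made concrete with a small sketch: a facet on one field amounts to counting the matching documents per distinct value of that field. This is plain Python over made-up documents, not Solr code; the field name `format`, the sample data, and the function name `facet_counts` are all hypothetical.

```python
from collections import Counter

def facet_counts(docs, field):
    """Count documents per distinct value of `field`, like GROUP BY field, count(1)."""
    return Counter(doc[field] for doc in docs if field in doc)

# Made-up search results, for illustration only.
results = [
    {"id": 1, "format": "pdf"},
    {"id": 2, "format": "html"},
    {"id": 3, "format": "pdf"},
]
counts = facet_counts(results, "format")
```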