Re: document retrieval, nested field and HTMLStripStandardTokenizerFactory

2008-03-26 Thread Chris Hostetter
: 1. Can I limit the number of returned document in config file to avoid
: misconfiguration pull down the server?
You can configure it with an invariant value in your requestHandler config ... so it won't matter how many the client asks for, they'll get the number you pick (or less if there are
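The effect of an invariant can be sketched outside Solr. The following is a minimal Python model, not Solr code: it assumes a handler resolves each parameter by starting from configured defaults, overlaying what the client sent, and overlaying invariants last, so a client-supplied `rows` cannot override the configured value. The function name `resolve_params` and all sample values are hypothetical.

```python
def resolve_params(defaults, request, invariants):
    """Merge parameter sources; later sources win, so invariants always apply."""
    resolved = dict(defaults)      # lowest priority: configured defaults
    resolved.update(request)       # the client may override defaults
    resolved.update(invariants)    # invariants override whatever the client sent
    return resolved


# Hypothetical handler configuration (illustrative values only).
handler_defaults = {"rows": "10", "fl": "*,score"}
handler_invariants = {"rows": "50"}

# A client asking for far too many documents still gets the pinned value.
client_request = {"q": "solr", "rows": "1000000"}
params = resolve_params(handler_defaults, client_request, handler_invariants)
print(params["rows"])
```

In this model the client still controls `q`, and `fl` falls back to the default because the client did not send it; only the pinned `rows` is out of the client's hands.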

Re: Master Slave Replication

2008-03-26 Thread Chris Hostetter
: I want to know if we can use index replication when we have segmented indexes
: over multiple solr instances?
I'm not sure I understand your question, but a slave just knows about its master and the data in it -- it doesn't care or need to know if the master index is really just a subset of

Re: synonyms

2008-03-26 Thread Chris Hostetter
: And if I search for "refrigerador", I'll have all results for "refrigerador", : for "geladeira", and all results for the flexed words for what i've typed : (refrigerador, refrigerado, refrigeração, etc). But I won't find the results : for the flexed words of the synonym that i've defined (geladei

Re: Highlight - get terms used by lucene

2008-03-26 Thread Chris Hostetter
: we use highlighting and snippets for our searches. Besides those two, I : would want to have a list of terms that lucene used for the : highlighting, so that I can pull out of a "Tim OR Antwerpen AND Ekeren" : the following terms : Antwerpen, Ekeren if let's say these are the only : terms th

Re: Solr commits automatically on appserver shutdown

2008-03-26 Thread Yonik Seeley
On Thu, Mar 27, 2008 at 12:11 AM, Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]> wrote:
> Can I make an API call to remove the stale indexsearcher so that the
> documents do not get committed?
>
> Basically what I need is a 'rollback' feature
This should be possible when Solr starts using Lucene

Re: Solr commits automatically on appserver shutdown

2008-03-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
Can I make an API call to remove the stale indexsearcher so that the documents do not get committed? Basically what I need is a 'rollback' feature --Noble On Wed, Mar 26, 2008 at 9:08 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > > On Wed, Mar 26, 2008 at 10:18 AM, Noble Paul നോബിള്‍ नोब्ळ् > <

Re: Making stop-words optional with DisMax?

2008-03-26 Thread Walter Underwood
We use two fields, one with and one without stopwords. The exact field has a higher boost than the other. That works pretty well. It helps to have an automated relevance test when tuning the boost (and other things). I extracted queries and clicks from the logs for a couple of months. Not perfect,
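The two-field setup described above can be sketched as a request builder. This is a minimal Python sketch under stated assumptions: the field names `title_exact` and `title_stopped`, the boost values, and the helper `dismax_params` are hypothetical, and the request is assumed to go to a DisMax handler that reads its field weights from the `qf` parameter.

```python
from urllib.parse import urlencode, parse_qs


def dismax_params(user_query, exact_field, stopped_field, exact_boost, stopped_boost):
    """Build a query string that weights the exact field above the stopword-filtered one."""
    # qf lists the fields to search, each with an optional ^boost suffix.
    qf = f"{exact_field}^{exact_boost} {stopped_field}^{stopped_boost}"
    return urlencode({"q": user_query, "qf": qf})


# Hypothetical field names and boosts; tune the boosts against a relevance test.
query_string = dismax_params("the who", "title_exact", "title_stopped", 2.0, 1.0)
print(query_string)
print(parse_qs(query_string)["qf"][0])
```

Keeping the boosts as arguments makes it easy to sweep them from an automated relevance test such as the one built from logged queries and clicks.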

RE: How to index multiple sites with option of combining results in search

2008-03-26 Thread Lance Norskog
In fact, 55m records works fine in Solr; assuming they are small records. The problem is that the index files wind up in the tens of gigabytes. The logistics of doing backups, snapping to query servers, etc. is what makes this index unwieldy, and why multiple shards are useful. Lance -Origina

Re: positionIncrementGap - what is its value meaning?

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 10:15 PM, Vinci wrote: Erik Hatcher wrote: ..The value you set that gap to depends on whether you'll be using sloppy phrase queries, and how sloppy they'll be and whether you desire matching across field instances. 1. If I doesn't care the sloppy queries, I just set a nu

Re: Facet searching and facet hierarchies.

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 3:34 PM, A.Z wrote: As I understand, after passing facets to Solr, one must manually add facet results to search to narrow the search. ex. i search for "foo bar" and click some facet. must i now search for 'foo bar facet:value' ? Must I include + signs? I'm using solrphpclien
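The reply is cut off here, so the following is an assumption rather than a reconstruction of Erik's answer. One common way to narrow a search after a facet click is to leave `q` untouched and add one Solr filter query (`fq`) per selected facet value. The Python sketch below only builds the query string; the helper `narrowed_query` and the field names `category` and `format` are hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def narrowed_query(user_query, selected_facets):
    """Keep the user's query as-is and add one fq per clicked facet value."""
    params = [("q", user_query), ("facet", "true")]
    for field, value in selected_facets:
        # Quote the value so multi-word facet values stay one term.
        # Values that themselves contain double quotes would need escaping.
        params.append(("fq", f'{field}:"{value}"'))
    return urlencode(params)


# The user searched for "foo bar" and then clicked two facet values.
query_string = narrowed_query("foo bar", [("category", "books"), ("format", "hard cover")])
print(query_string)
print(parse_qs(query_string)["fq"])
```

With this approach the original query text is never rewritten, so no `+` signs or extra clauses have to be spliced into what the user typed.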

Re: positionIncrementGap - what is its value meaning?

2008-03-26 Thread Vinci
Hi Erik, Thank you for your help. This is useful. Some follow up questions, Erik Hatcher wrote: > > ..The value you set that gap to depends on whether you'll > be using sloppy phrase queries, and how sloppy they'll be and whether > you desire matching across field instances. > 1. If I

Re: Search fail if copyField absent?(+ Jetty Question)

2008-03-26 Thread Vinci
Hi hossman, Thank you for your reply, it help a lots...just little more question here: hossman wrote: > > > : it should be indexed, so I comment this > : > : > : However, the search fail. After I clear up the index and, uncomment the > : copyField and commit the document again, the search

Re: Adding custom field for sorting?

2008-03-26 Thread Vinci
Hi hossman, Thank you for your reply. Some question on sorting: 1. Does Solr have a limit, e.g a % or a number to limit the number of document involved in sorting? or just sort all document? 2. Does the order in parameter 'sort' refer to the sorting order? (sort the first argument first, then th

Re: Making stop-words optional with DisMax?

2008-03-26 Thread Ronald K. Braun
Hi Otis, > I skimmed your email. You are indexing book and music titles. Those tend to > be short. > Do you really benefit from removing stop words in the first place? I'd try > keeping all the stop > words and seeing if that has any negative side-effects in your context. Thanks for your ski

Re: Beginner questions: Jetty and solr with utf-8 + cached page + dedup

2008-03-26 Thread Thorsten Scherler
On Tue, 2008-03-25 at 10:56 -0700, Vinci wrote: > Hi, > > Thank for your reply. > Question for apply xslt: If I use saxon, where should the saxon.jar located > if I using the example jetty server? lib/ inside example/ or outside the > example/? http://wiki.apache.org/solr/mySolr "... Typically it

Re: Highlighting Quoted Phrases

2008-03-26 Thread Chris Harris
On Tue, Mar 25, 2008 at 4:25 PM, Brian Whitman <[EMAIL PROTECTED]> wrote: > > On Mar 25, 2008, at 6:31 PM, Chris Harris wrote: > > > working pretty well, but my testers have > > discovered something they find borderline unacceptable. If they search > > for > > > >"stock market" > > > >

Re: Document Path issue and change the layout in the example

2008-03-26 Thread Chris Hostetter
: I started the indexing with jetty and then I come with some question... : 1. If I use the example start.jar, what should be my document system layout? : What is the essential folder? : solr_jar : |_start.jar : |_solrhome : |_etc : |_lib : |_logs i'm not sure what "solr_jar" is ... but most of t

Re: Adding custom field for sorting?

2008-03-26 Thread Chris Hostetter
:
: Inspirited by the previous post, does it possible to add my custom field and
: use it for sorting the search result?
: If It is possible, what will be the step? Do I need to modify the source
: code?
adding a custom field is easy, just add the field to your schema.xml and put data in it. adding

Re: How to use Solr in java program

2008-03-26 Thread Chris Hostetter
: I am new user of Solr and I want to know how can I use Solr in my own java
http://wiki.apache.org/solr/SolJava
: program, what are the different possibilities of using solr. Is a web
: servlet container is necessary to run and use Solr, Is servlet Container as
: Tomcat is enough to use all t

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Chris Hostetter
: > Top often requested feature: : > 1. Make the option on using the "RAMDirectory" to hook in Terracotta( : > billion(s) of items in an index anyone?..it would be possible using : > this.) : : This is noted in: https://issues.apache.org/jira/browse/SOLR-465 ...and if people posted comments in t

Using Field Collapsing and Filter Query to implement JOIN

2008-03-26 Thread Lester Scofield
Hello solr people, I'm very new to solr so please forgive any misunderstanding on my part. I am hoping to do a JOIN across documents. Let me start with the 4 documents: part1 ABC this is a test part2 ABC of a fake JOIN part1 XYZ this is a test part2 XYZ o

Re: Search fail if copyField absent?(+ Jetty Question)

2008-03-26 Thread Chris Hostetter
: it should be indexed, so I comment this : : : However, the search fail. After I clear up the index and, uncomment the : copyField and commit the document again, the search work again. : : That I feeling very confusing as wiki and the schema.xml said this is : optional...is this a bug or wiki

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Jeryl Cook
i wouldn't call Terracotta approach magic(smile)..., it's being used quite a bit in many scalable high performing projects... i personally used Terracotta and Lucene, and it worked but did not try to "cluster" it with multiple terracotta(workers) across nodes , and the Terracotta(master)..just a s

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 4:41 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> just intuition - haven't tried it, so i'd love to be proved wrong.
> Instrumenting Objects and magically passing them around seems like it
> would be slower than a tuned approach used in SOLR-303.
Yep, that's my sense to

Re: Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Ryan McKinley
good point: http://svn.apache.org/viewvc/lucene/solr/trunk/CHANGES.txt?r1=641573&r2=641572&pathrev=641573 ryan Chris Harris wrote: Looks like that can't-go-back bit hasn't made it into CHANGES.txt yet. Might want to eventually add that somewhere particularly obvious, to help out people who assu

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Ryan McKinley
just intuition - haven't tried it, so i'd love to be proved wrong. Instrumenting Objects and magically passing them around seems like it would be slower than a tuned approach used in SOLR-303. It looks like they have a lucene example: http://www.terracotta.org/confluence/display/integrations/Lu

Re: Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Chris Harris
Looks like that can't-go-back bit hasn't made it into CHANGES.txt yet. Might want to eventually add that somewhere particularly obvious, to help out people who assume they could downgrade. Maybe under "Upgrading from Solr 1.2"? On Wed, Mar 26, 2008 at 12:59 PM, Ryan McKinley <[EMAIL PROTECTED]> wr

Re: Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Ryan McKinley
It *should* work as a drop-in replacement. Check: http://svn.apache.org/repos/asf/lucene/solr/trunk/CHANGES.txt So you should be good. Note that trunk has a newer version of lucene, so the index will be automatically upgraded and you can't go back from there. So make sure to back up before t

Re: Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 3:05 PM, Chris Harris <[EMAIL PROTECTED]> wrote: > What are the odds that I can plop an index created in Solr 1.2 into a > Solr 1.3 and/or Solr trunk install and have things work correctly? Should be relatively high. I'd never do it on a live index, regardless of what is a

Facet searching and facet hierarchies.

2008-03-26 Thread A . Z
I have a couple of question concerning facet searching. As I understand, after passing facets to Solr, one must manually add facet results to search to narrow the search. ex. i search for "foo bar" and click some facet. must i now search for 'foo bar facet:value' ? Must I include + signs? I'm usin

Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Chris Harris
What are the odds that I can plop an index created in Solr 1.2 into a Solr 1.3 and/or Solr trunk install and have things work correctly? This would be more convenient than reindexing, but I'm wondering how dangerous it is, and hence how much testing is required.

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Otis Gospodnetic
Ah, that's a very different number. Yes, assuming your docs are web pages, a single reasonably equipped machine should be able to handle that and a few dozen QPS. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Dietrich <[EMAIL PROTECTED]> To

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Dietrich
Makes sense, but probably overkill for my requirements. I wasn't really talking 275*20, more likely the total would be something like four million documents. I was under the assumption that a single machine, or a simple distributed index, should be able to handle that, is that wrong? -ds On W

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Otis Gospodnetic
Hey Ryan, why do you say a Lucene/Solr index served via Terracotta would be substantially slower? I often wanted to try Terracotta + Lucene, but... time. Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ryan McKinley <[EMAIL PROTECTED]>

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Otis Gospodnetic
Dietrich, I don't think there are established practices in the open (yet). You could design your application with a site(s)->shard mapping and then, knowing which sites are involved in the query, search only the relevant shards. This will be efficient, but it would require careful management

Re: Making stop-words optional with DisMax?

2008-03-26 Thread Otis Gospodnetic
Hi Ron,, I skimmed your email. You are indexing book and music titles. Those tend to be short. Do you really benefit from removing stop words in the first place? I'd try keeping all the stop words and seeing if that has any negative side-effects in your context. Otis -- Sematext -- http:

Search fail if copyField absent?(+ Jetty Question)

2008-03-26 Thread Vinci
Hi, While I am testing the Solr schema (1.3 nightly) with example mySolr on jetty, for the exampledocs and the default schema, I see the declaration: it should be indexed, so I comment this However, the search fail. After I clear up the index and, uncomment the copyField and commit the docum

Re: Replication of Segmented indexes

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 11:34 AM, oleg_gnatovskiy <[EMAIL PROTECTED]> wrote: > Hello, this is actually a repost of a question posed by Swarag. I don't think > he made the question quite clear, so let me give it a shot. It is known that > Solr has support for index replication, and it has support

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Daniel Papasian
[EMAIL PROTECTED] wrote: Quoting Daniel Papasian <[EMAIL PROTECTED]>: Or if you're adding a new field to the schema (perhaps the most common need for editing schema.xml), you don't need to reindex any documents at all, right? Unless I'm missing something? Well, it all depends on if that "fiel

Making stop-words optional with DisMax?

2008-03-26 Thread Ronald K. Braun
I've followed the stop-word discussion with some interest, but I've yet to find a solution that completely satisfies our needs. I was wondering if anyone could suggest some other options to try short of a custom handler or building our own queries (DisMax does such a fine job generally!). We are

Re: Solr commits automatically on appserver shutdown

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 10:18 AM, Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]> wrote: > If my appserver fails during an update or if I do a planned shutdown > without wanting to commit my changes Solr does not allow it?. > It commits whatever unfinished changes. > Is it by design? > Can I cha

Replication of Segmented indexes

2008-03-26 Thread oleg_gnatovskiy
Hello, this is actually a repost of a question posed by Swarag. I don't think he made the question quite clear, so let me give it a shot. It is known that Solr has support for index replication, and it has support for index segmentation. The question is, how would you use the replication tools wit

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Ryan McKinley
Jeryl Cook wrote: Top often requested feature: 1. Make the option on using the "RAMDirectory" to hook in Terracotta( billion(s) of items in an index anyone?..it would be possible using this.) This is noted in: https://issues.apache.org/jira/browse/SOLR-465 Out of curiosity, any sense of perfo

Term frequency

2008-03-26 Thread Tim Mahy
Hi All, is there a way to get the term frequency per found result back from Solr? Greetings, Tim Info Support - http://www.infosupport.com

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Dietrich
I understand that, and that makes sense. But, coming back to the original question: > > When performing searches, > > I need to be able to search against any combination of sites. > > Does anybody have suggestions what the best practice for a scenario > > like that would be, considering bot

Re: Update a field without reindexing the entire document?

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 4:28 AM, Vinci wrote: One question: If the target field is a multi-value field, what will be the consequence of the update for SOLR-139: overriding or appending? You can specify when you update a field how that works. SOLR-139, though, seems a long way from being include

Index "corruption" makes it return a different result

2008-03-26 Thread Lucas F. A. Teixeira
Hello all! I had a problem this week, and I'd like to share it with you all. My weblogic server that generates my index throws its logs in a shared storage. During my indexing process (SOLR+Lucene), this shared storage became 100% full, and everything collapsed (all servers that use this shared stor

Solr commits automatically on appserver shutdown

2008-03-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
hi, If my appserver fails during an update or if I do a planned shutdown without wanting to commit my changes Solr does not allow it?. It commits whatever unfinished changes. Is it by design? Can I change this behavior? --Noble

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread solr
Quoting Daniel Papasian <[EMAIL PROTECTED]>: [EMAIL PROTECTED] wrote: Quoting Jeryl Cook <[EMAIL PROTECTED]>: 2. Make the "schema.xml" configurable at runtime, not really sure the best way to address this, because changing the schema would require "re-indexing" the documents. Isn't the best

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Daniel Papasian
[EMAIL PROTECTED] wrote: > Quoting Jeryl Cook <[EMAIL PROTECTED]>: > >> 2. Make the "schema.xml" configurable at runtime, not really sure the >> best way to address this, because changing the schema would require >> "re-indexing" the documents. > > Isn't the best way to address this just to leave

document retrieval, nested field and HTMLStripStandardTokenizerFactory

2008-03-26 Thread Vinci
Hi all, I am working for developing the interface for Solr with JSON. And some question here: 1. Can I limit the number of returned document in config file to avoid misconfiguration pull down the server? 2. How can I retrieve the document by unique key for result view purpose ? And how can I do t

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread solr
Quoting Jeryl Cook <[EMAIL PROTECTED]>: 2. Make the "schema.xml" configurable at runtime, not really sure the best way to address this, because changing the schema would require "re-indexing" the documents. Isn't the best way to address this just to leave it to the persons that integrate sol

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Jeryl Cook
Top often requested feature: 1. Make the option on using the "RAMDirectory" to hook in Terracotta( billion(s) of items in an index anyone?..it would be possible using this.) 2. Make the "schema.xml" configurable at runtime, not really sure the best way to address this, because changing the schema w

Re: positionIncrementGap - what is its value meaning?

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 3:11 AM, Vinci wrote: While I changing the default schema.xml, I found this attribute where defined the analyzer...seems it will add some space when multiple fields appear in document, but what is its effect appear in query and what is the values mean here? Suppose you
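The interaction between the gap and sloppy phrase queries can be illustrated with a small simulation. This is a simplified Python model, not Lucene code: it assumes whitespace tokenization, that each value after the first starts `gap` extra positions later, and that a two-term in-order phrase needs a slop of at least the number of positions between the two terms. The helper names and sample values are hypothetical.

```python
def assign_positions(values, gap):
    """Return (token, position) pairs for the values of one multi-valued field."""
    positions = []
    next_pos = 0
    for index, value in enumerate(values):
        if index > 0:
            next_pos += gap  # the gap is applied between field instances
        for token in value.lower().split():
            positions.append((token, next_pos))
            next_pos += 1
    return positions


def phrase_matches(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions in between."""
    firsts = [pos for token, pos in positions if token == first]
    seconds = [pos for token, pos in positions if token == second]
    return any(0 <= b - a - 1 <= slop for a in firsts for b in seconds)


authors = ["John Smith", "Mary Jones"]

# With no gap, the phrase "smith mary" matches across the two values.
print(phrase_matches(assign_positions(authors, 0), "smith", "mary", 0))

# With a gap of 100, the same phrase does not match, even with a slop of 10.
print(phrase_matches(assign_positions(authors, 100), "smith", "mary", 10))
```

In this model a gap only prevents cross-value matches while it stays larger than the largest slop in use, which is why the right value depends on how sloppy the phrase queries will be.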

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread solr
Quoting Ryan McKinley <[EMAIL PROTECTED]>: In general, you need to be very careful when you change the schema without reindexing. Many changes will break all search, some may be just fine. for example, if you change sint to slong anything already indexed as an "sint" will be incompatible with

Re: Update a field without reindexing the entire document?

2008-03-26 Thread Vinci
Hi Otis, One question: If the target field is a multi-value field, what will be the consequence of the update for SOLR-139: overriding or appending? Thank you, Vinci Otis Gospodnetic wrote: > > Hi Galen, > > See SOLR-139 (this is from memory) issue in JIRA. Doable, but not in Solr > nightli

RE: Update a field without reindexing the entire document?

2008-03-26 Thread Ard Schrijvers
Hello Otis, I have been looking for something similar for Jackrabbit's lucene index, but I still have some uncertainty about whether I understand correctly what the patches in SOLR-139 supply: Do they just retrieve formerly stored fields of a lucene Document, change some field, and then analyze an

Re: Highlighting Quoted Phrases

2008-03-26 Thread Vinci
Hi, Would it be easier if you turn off the highlighting while viewing full document (but summary highlighting is still available) and use javascript to do the matching? (As long as we are need highlighting only when looking at specific document in runtime) Thank you, Vinci Brian Whitman wrote:

positionIncrementGap - what is its value meaning?

2008-03-26 Thread Vinci
Hi all, While I changing the default schema.xml, I found this attribute where defined the analyzer...seems it will add some space when multiple fields appear in document, but what is its effect appear in query and what is the values mean here? Thank you, Vinci -- View this message in context: