Re: Solr Transaction Log Question

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 11:30 PM, Jamie Johnson wrote: > How large will the transaction log grow, and how long should it be kept > around? We keep around enough logs to satisfy a minimum of 100 updates lookback. Unneeded log files are deleted automatically. When a hard commit is done, we create

Re: Indexing taking so much time to complete.

2012-02-25 Thread Erick Erickson
Right. My situation is simple, I have a 32G dump of Wikipedia data in a big XML file. I can parse it and dump it into a (local) Solr instance at 5-7K records/second. But it's stupid-simple, just a few fields and no database involved. Much of the 32G is XML. But that serves to illustrate that the si

RE: Indexing taking so much time to complete.

2012-02-25 Thread Mike O'Leary
What's your secret? OK, that question is not the kind recommended in the UsingMailingLists suggestions, so I will write again soon with a description of my data and what I am trying to do, and ask more specific questions. And I don't mean to hijack the thread, but I am in the same boat as the p

Re: TermsComponent show only terms that matched query?

2012-02-25 Thread Lance Norskog
I think you have to walk the term positions and offsets, look in the stored field, and find the terms that matched. Which is exactly what highlighting does. And this will only find the actual terms in the text, no synonyms. So if you search for Sempranillo and find Sempranillo in some wines and Tem

Re: TermsComponent show only terms that matched query?

2012-02-25 Thread Erick Erickson
Jay: I've seen this question go 'round before, but I don't remember a satisfactory solution. Are you talking on a per-document basis here? If so, I vaguely remember it being possible to do something with highlighting, just counting the tags returned after highlighting. Best Erick On Fri, Feb 2

Re: Indexing taking so much time to complete.

2012-02-25 Thread Erick Erickson
You have to tell us a lot more about what you're trying to do. I can import 32G in about 20 minutes, so obviously you're doing something different than I am... Perhaps you might review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Sat, Feb 25, 2012 at 12:00 AM, Suneel wrote: > Hi

Re: TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

2012-02-25 Thread Erick Erickson
Well, you can give it a try, I don't know if anyone's done that before. And you're on your own, I haven't a clue what the results would be... Sorry I can't be more help here... Erick On Thu, Feb 23, 2012 at 10:44 PM, bing wrote: > Hi, all, > > I am using > org.apache.solr.update.processor.TikaLa

Re: lucene operators interfearing in edismax

2012-02-25 Thread William Bell
Please backport to 3x. On Mon, Feb 20, 2012 at 2:22 PM, Yonik Seeley wrote: > This should be fixed in trunk by LUCENE-2566 > > QueryParser: Unary operators +,-,! will not be treated as operators if > they are followed by whitespace. > > -Yonik > lucidimagination.com > > > > On Mon, Feb 20, 2012 a

RE: Problem with SolrCloud + Zookeeper + DataImportHandler

2012-02-25 Thread Agnieszka Kukałowicz
Hi, As you've asked. https://issues.apache.org/jira/browse/SOLR-3165 If you have any questions or need more details I can debug this problem more. Agnieszka > -Original Message- > From: Mark Miller [mailto:markrmil...@gmail.com] > Sent: Friday, February 24, 2012 10:11 PM > To: solr-user

Re: Solr 4.0 Question

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 3:39 PM, Jamie Johnson wrote: > "Unfortunately, Apache Solr still uses this horrible code in a lot of > places, leaving us with a major piece of work undone. Major parts of > Solr’s facetting and filter caching need to be rewritten to work per > atomic segment! For those im

Re: upgrading Solr - org.apache.lucene.search.Filter and acceptDocs

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 3:37 PM, Jamie Johnson wrote: >  I.e. just do if(!acceptDocs.get(doc)) return false; at > the top? Yep, that should do it. -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10
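For anyone reading along in the archive, the check confirmed above can be sketched roughly as follows. This is only an illustration, not code from the thread: `AcceptDocsSketch`, its nested `Bits` interface (a stand-in for `org.apache.lucene.util.Bits` so the example compiles without Lucene), the `bitsOf` helper, and the `customCheck` predicate are all hypothetical names. The null guard is an assumption, based on the Lucene convention that a null `acceptDocs` means no restriction.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class AcceptDocsSketch {

    /** Stand-in for org.apache.lucene.util.Bits, so the sketch compiles without Lucene. */
    public interface Bits {
        boolean get(int index);
        int length();
    }

    /** Hypothetical helper: expose a java.util.BitSet through the Bits stand-in. */
    public static Bits bitsOf(final BitSet set, final int maxDoc) {
        return new Bits() {
            @Override public boolean get(int index) { return set.get(index); }
            @Override public int length() { return maxDoc; }
        };
    }

    /**
     * Mirrors the body of a match(int doc) override: reject docs that
     * acceptDocs does not allow first, then apply the filter's own rule.
     */
    public static boolean match(int doc, Bits acceptDocs, IntPredicate customCheck) {
        // Assumption: null acceptDocs means "no restriction".
        if (acceptDocs != null && !acceptDocs.get(doc)) {
            return false;
        }
        return customCheck.test(doc);
    }

    public static void main(String[] args) {
        BitSet live = new BitSet();
        live.set(0);
        live.set(2); // doc 1 is left unset, i.e. treated as not acceptable
        Bits acceptDocs = bitsOf(live, 3);
        IntPredicate evenOnly = doc -> doc % 2 == 0; // hypothetical custom rule
        for (int doc = 0; doc < 3; doc++) {
            System.out.println(doc + " -> " + match(doc, acceptDocs, evenOnly));
        }
    }
}
```

In a real filter the same two lines would sit at the top of the `match` override, with the existing custom logic following them.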

Solr 4.0 Question

2012-02-25 Thread Jamie Johnson
I just got done reading http://www.searchworkings.org/blog/-/blogs/uwe-says%3A-is-your-reader-atomic and was specifically interested in the following line "Unfortunately, Apache Solr still uses this horrible code in a lot of places, leaving us with a major piece of work undone. Major parts of Solr

Re: upgrading Solr - org.apache.lucene.search.Filter and acceptDocs

2012-02-25 Thread Jamie Johnson
I am assuming you meant should not be returned right? I basically return a filtered doc id set and do the following return new FilteredDocIdSet(startingFilter.getDocIdSet(readerCtx, acceptDocs)) { @Override public boolean match(int doc) {

Re: upgrading Solr - org.apache.lucene.search.Filter and acceptDocs

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 3:16 PM, Jamie Johnson wrote: > I'm trying to upgrade an application I have from an old snapshot of > solr to the latest stable trunk and see that the constructor for > Filter has changed, specifically there is another parameter named > acceptDocs, the API says the followin

upgrading Solr - org.apache.lucene.search.Filter and acceptDocs

2012-02-25 Thread Jamie Johnson
I'm trying to upgrade an application I have from an old snapshot of solr to the latest stable trunk and see that the constructor for Filter has changed, specifically there is another parameter named acceptDocs, the API says the following acceptDocs - Bits that represent the allowable docs to match

Re: SIREn integration with SOLR

2012-02-25 Thread Anuj Kumar
Hi Chitra, You can download the distribution using the details given here- http://siren.sindice.com/download.html License has been changed to AGPL3.0 Source code is available here- https://github.com/rdelbru/SIREn/ - Anuj On Wed, Feb 22, 2012 at 3:45 PM, chitra wrote: > Hi, > > We would

Re: nutch and solr

2012-02-25 Thread alessio crisantemi
This is the problem! Because in my root there is a URL! Here is my step-by-step configuration of Nutch (I use Cygwin because I work on Windows): *1. Extract the Nutch package* *2. Configure Solr* (*Copy the provided Nutch schema from directory apache-nutch-1.0/conf to directory apache-solr-1.3