from:"Simon Willnauer"

[ANNOUNCE] Apache Solr 4.7.0 released.

2014-02-26 Thread Simon Willnauer

February 2014, Apache Solr™ 4.7 available The Lucene PMC is pleased to announce the release of Apache Solr 4.7 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted searc

[ANNOUNCE] Apache Solr 4.6 released.

2013-11-24 Thread Simon Willnauer

24 November 2013, Apache Solr™ 4.6 available The Lucene PMC is pleased to announce the release of Apache Solr 4.6 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted se

[ANNOUNCE] Apache Solr 4.3 released

2013-05-06 Thread Simon Willnauer

May 2013, Apache Solr™ 4.3 available The Lucene PMC is pleased to announce the release of Apache Solr 4.3. Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, d

Re: Memory leak?? with CloseableThreadLocal with use of Snowball Filter

2012-08-01 Thread Simon Willnauer

On Thu, Aug 2, 2012 at 7:53 AM, roz dev wrote: > Thanks Robert for these inputs. > > Since we do not really Snowball analyzer for this field, we would not use > it for now. If this still does not address our issue, we would tweak thread > pool as per eks dev suggestion - I am bit hesitant to do th

Re: Solr 4.0 IllegalStateException: this writer hit an OutOfMemoryError; cannot commit

2012-07-10 Thread Simon Willnauer

It really seems that you are hitting an OOM during auto warming. Could this be the case for your failure? Can you raise the JVM memory and see if you still hit the spike and go OOM? This is very unlikely to be an IndexWriter problem. I'd rather look at your warmup queries, i.e. fieldcache, FieldValueCache usa

Re: Multiple document types

2012-01-25 Thread Simon Willnauer

of the url, so that the url would > determine which index was to be loaded by the dataimport command. seems like you should look at solr's multicore feature: http://wiki.apache.org/solr/CoreAdmin simon > > F > > -Original Message- > From: Simon Willnauer [mailt

Call for Submission Berlin Buzzwords 2012 - http://berlinbuzzwords.de

2012-01-11 Thread Simon Willnauer

ittee Chairs: * Isabel Drost (Nokia & Apache Mahout) * Jan Lehnardt (CouchBase & Apache CouchDB) * Simon Willnauer (SearchWorkings & Apache Lucene) * Grant Ingersoll (Lucid Imagination & Apache Lucene) * Owen O’Malley (Yahoo Inc. & Apache Hadoop) * Jim Webber (Neo Tec

Re: Solr Scoring question

2012-01-05 Thread Simon Willnauer

hey, On Thu, Jan 5, 2012 at 9:31 PM, Christopher Gross wrote: > I'm getting different results running these queries: > > http://localhost:8080/solr/select?&q=*:*&fq=source:wiki&fq=tag:car&sort=score+desc,dateSubmitted+asc&fl=title,score,dateSubmitted&rows=100 > > http://localhost:8080/solr/select

Heads Up - Index File Format Change on Trunk

2012-01-05 Thread Simon Willnauer

Folks, I just committed LUCENE-3628 [1] which cuts over Norms to DocValues. This is an index file format change and if you are using trunk you need to reindex before updating. happy indexing :) simon [1] https://issues.apache.org/jira/browse/LUCENE-3628

Re: spellcheck-index is rebuilt on commit

2012-01-03 Thread Simon Willnauer

On Tue, Jan 3, 2012 at 9:12 AM, OliverS wrote: > Hi all > > Thanks a lot, and it seems to be a bug, but not of 4.0 only. You are right, > I was doing a commit on an optimized index without adding any new docs (in > fact, I did this for replication on the master). I will open a ticket as > soon as

Re: spellcheck-index is rebuilt on commit

2012-01-02 Thread Simon Willnauer

hey, is it possible that during those commits nothing has changed in the index? I mean, are you committing even though there are no changes? If so, this could happen since the spellchecker gets a new event that you did a commit but the index didn't really change. The spellchecker really only checks if t

Re: Matching all documents in the index

2011-12-13 Thread Simon Willnauer

try *:* instead of *.* simon On Tue, Dec 13, 2011 at 5:03 PM, Kissue Kissue wrote: > Hi, > > I have come across this query in the admin interface: *.* > Is this meant to match all documents in my index? > > Currently when I run a query with q= *.*, numFound is 130310 but the actual > number of do
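As a rough illustration of the advice above, the sketch below builds a select URL that uses the match-all query `*:*`. The host, port, and path are assumptions (borrowed from the localhost URLs that appear elsewhere in this listing), and `match_all_url` is a hypothetical helper, not part of Solr.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust host, port, and path for your own installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def match_all_url(rows=0):
    """Build a select URL whose query matches every document."""
    # "*:*" means "any field : any value"; rows=0 returns only the hit count.
    params = {"q": "*:*", "rows": rows}
    return SOLR_SELECT + "?" + urlencode(params)


print(match_all_url())
```

Comparing the numFound of this request with the expected document count is one way to check whether `*.*` was silently matching something narrower.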

Re: Seek past EOF

2011-11-30 Thread Simon Willnauer

can you give us some details about what filesystem you are using? simon On Wed, Nov 30, 2011 at 3:07 PM, Ruben Chadien wrote: > Happened again…. > > I got 3 directories in my index dir > > 4096 Nov 4 09:31 index.2004083156 > 4096 Nov 21 10:04 index.2021090440 > 4096 Nov 30 14:55 index.2

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Simon Willnauer

I wonder if you have an explicitly configured merge policy? In Solr 1.4, i.e. Lucene 2.9, LogMergePolicy was the default but in 3.5 TieredMergePolicy is used by default. This could explain the differences segment wise since from what I understand you are indexing the same data on 1.4 and 3.5? simon O

[ANNOUNCE] Apache Solr 3.5 released

2011-11-26 Thread Simon Willnauer

27 November 2011, Apache Solr™ 3.5.0 available The Lucene PMC is pleased to announce the release of Apache Solr 3.5.0. Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, fa

JVM Bugs affecting Lucene & Solr

2011-11-15 Thread Simon Willnauer

hey folks, we lately looked into https://issues.apache.org/jira/browse/LUCENE-3235 again, an issue where a class using ConcurrentHashMap hangs / deadlocks on specific JVMs in combination with specific CPUs. It turns out it's a JVM bug in Sun / Oracle Java 1.5 as well as Java 1.6. It's apparently fix

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Simon Willnauer

On Fri, Oct 28, 2011 at 9:17 PM, Simon Willnauer wrote: > Hey Roman, > > On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov > wrote: >> Hi everyone, >> >> I'm looking for some help with Solr indexing issues on a large scale. >> >> We are indexing fe

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Simon Willnauer

Hey Roman, On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov wrote: > Hi everyone, > > I'm looking for some help with Solr indexing issues on a large scale. > > We are indexing few terabytes/month on a sizeable Solr cluster (8 > masters / serving writes, 16 slaves / serving reads). After certain

Re: changing omitNorms on an already built index

2011-10-28 Thread Simon Willnauer

On Fri, Oct 28, 2011 at 12:20 AM, Robert Muir wrote: > On Thu, Oct 27, 2011 at 6:00 PM, Simon Willnauer > wrote: >> we are not actively removing norms. if you set omitNorms=true and >> index documents they won't have norms for this field. Yet, other >> segment st

Re: changing omitNorms on an already built index

2011-10-27 Thread Simon Willnauer

we are not actively removing norms. if you set omitNorms=true and index documents they won't have norms for this field. Yet, other segments still have norms until they get merged with a segment that has no norms for that field, i.e. omits norms. omitNorms is anti-viral so once you set it to true it wi

Re: How can I force the threshold for a fuzzy query?

2011-10-27 Thread Simon Willnauer

I am not sure if there is such an option but you might be able to override your query parser and reset that value if it is too fuzzy. look for protected Query newFuzzyQuery(Term term, float minimumSimilarity, int prefixLength) there you can change the actual value used for minimumSimilarity sim
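The check one might place inside such an overridden newFuzzyQuery can be sketched independently of the parser. The snippet below is plain Python, not Lucene API; `clamp_min_similarity` and the floor value are hypothetical and only show the idea of raising a too-low minimumSimilarity before the query is built.

```python
def clamp_min_similarity(requested, floor=0.5):
    """Return the similarity to use, never lower (fuzzier) than `floor`."""
    # A lower minimum similarity means a fuzzier match, so "too fuzzy"
    # is handled by raising the value to the configured floor.
    return max(requested, floor)


print(clamp_min_similarity(0.2))  # raised to the floor
print(clamp_min_similarity(0.8))  # already strict enough, left unchanged
```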

Re: Optimization /Commit memory

2011-10-25 Thread Simon Willnauer

RAM costs during optimize / merge are generally low. Optimize is basically a merge of all segments into one, however there are exceptions. Lucene streams existing segments from disk and serializes the new segment on the fly. When you optimize or in general when you merge segments you need disk space

Re: some basic information on Solr

2011-10-25 Thread Simon Willnauer

hey, 2011/10/24 Dan Wu : > Hi all, > > I am doing a student project on search engine research. Right now I have > some basic questions about Slor. > > 1. How many types of data file Solr can support (estimate)? i.e. No. of > file types solr can look at for indexing and searching. basically you ca

Re: accessing the query string from inside TokenFilter

2011-10-25 Thread Simon Willnauer

On Tue, Oct 25, 2011 at 3:51 PM, Bernd Fehling wrote: > Dear list, > while writing some TokenFilter for my analyzer chain I need access to > the query string from inside of my TokenFilter for some comparison, but the > Filters are working with a TokenStream and get seperate Tokens. > Currently I c

Re: How to make UnInvertedField faster?

2011-10-22 Thread Simon Willnauer

ere. simon > > Hopefully we can fix that at some point :) > > Mike McCandless > > http://blog.mikemccandless.com > > On Fri, Oct 21, 2011 at 7:50 AM, Simon Willnauer > wrote: >> In trunk we have a feature called IndexDocValues which basically >> creates the uninv

Re: How to make UnInvertedField faster?

2011-10-21 Thread Simon Willnauer

In trunk we have a feature called IndexDocValues which basically creates the uninverted structure at index time. You can then simply suck that into memory or even access it on disk directly (RandomAccess). Even if I can't help you right now this is certainly going to help you here. There is no need

Re: Painfully slow indexing

2011-10-21 Thread Simon Willnauer

On Wed, Oct 19, 2011 at 3:58 PM, Pranav Prakash wrote: > Hi guys, > > I have set up a Solr instance and upon attempting to index document, the > whole process is painfully slow. I will try to put as much info as I can in > this mail. Pl. feel free to ask me anything else that might be required. >

Checkout SearchWorkings.org - it just went live!

2011-09-09 Thread Simon Willnauer

Hey folks, Some of you might have heard, myself and a small group of other passionate search technology professionals have been working hard in the last few months to launch a community site known as SearchWorkings.org [1]. This initiative has been set up for other search professionals to have a s

Re: Requiring multiple matches of a term

2011-08-22 Thread Simon Willnauer

On Mon, Aug 22, 2011 at 8:10 PM, Chris Hostetter wrote: > > : One simple way of doing this is maybe to write a wrapper for TermQuery > : that only returns docs with a Term Frequency > X as far as I > : understand the question those terms don't have to be within a certain > : window right? > > I d

Re: heads up: re-index 3.x branch Lucene/Solr indices

2011-08-22 Thread Simon Willnauer

Shawn, as long as you are only using a release version of Lucene / Solr you don't need to be worried at all. This is an index format change that has never been released. Only if you use an svn checkout should you reindex. simon On Mon, Aug 22, 2011 at 8:56 PM, Shawn Heisey wrote: > On 8/22/2011 12:

heads up: re-index 3.x branch Lucene/Solr indices

2011-08-22 Thread Simon Willnauer

I just reverted a previous commit related to CompoundFile in the 3.x stable branch. If you are using unreleased 3.x branch you need to reindex. See here for details: https://issues.apache.org/jira/browse/LUCENE-3218 If you are using a released version of Lucene/Solr then you can ignore this m

Re: Requiring multiple matches of a term

2011-08-21 Thread Simon Willnauer

On Fri, Aug 19, 2011 at 6:26 PM, Michael Ryan wrote: > Is there a way to specify in a query that a term must match at least X times > in a document, where X is some value greater than 1? > One simple way of doing this is maybe to write a wrapper for TermQuery that only returns docs with a Term F
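The idea of only returning documents whose term frequency reaches a threshold can be illustrated with a toy filter. This is plain Python over in-memory strings, not a TermQuery wrapper; `docs_with_min_tf` and the sample documents are made up for the example.

```python
def docs_with_min_tf(docs, term, min_tf):
    """Return ids of documents in which `term` occurs at least `min_tf` times."""
    hits = []
    for doc_id, text in docs.items():
        # Naive whitespace tokenization stands in for a real analyzer.
        tf = text.lower().split().count(term.lower())
        if tf >= min_tf:
            hits.append(doc_id)
    return hits


docs = {1: "solr solr lucene", 2: "solr lucene"}
print(docs_with_min_tf(docs, "solr", 2))
```

A real implementation would read the frequency from the postings while scoring rather than re-tokenizing stored text.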

Re: OOM due to JRE Issue (LUCENE-1566)

2011-08-16 Thread Simon Willnauer

hey, On Tue, Aug 16, 2011 at 9:34 AM, Pranav Prakash wrote: > Hi, > > This might probably have been discussed long time back, but I got this error > recently in one of my production slaves. > > SEVERE: java.lang.OutOfMemoryError: OutOfMemoryError likely caused by the > Sun VM Bug described in htt

Re: Can I delete the stored value?

2011-07-11 Thread Simon Willnauer

On Mon, Jul 11, 2011 at 8:28 AM, Andrzej Bialecki wrote: > On 7/10/11 2:33 PM, Simon Willnauer wrote: >> >> Currently there is no easy way to do this. I would need to think how >> you can force the index to drop those so the answer here is no you >> can't! >>

Re: Can I delete the stored value?

2011-07-10 Thread Simon Willnauer

Currently there is no easy way to do this. I would need to think how you can force the index to drop those so the answer here is no you can't! simon On Sat, Jul 9, 2011 at 11:11 AM, Gabriele Kahlout wrote: > I've stored the contents of some pages I no longer need. How can I now > delete the stor

Re: DelimitedPayloadTokenFilter and Highlighter

2011-07-10 Thread Simon Willnauer

Hey hannes, the simplest solution here is maybe using a second field that is for highlighting only. This field would then store your content without the payloads. The other way would be stripping off the payloads during rendering which is not a nice option I guess. Since I am not a highlighter exp

Heads Up - Index File Format Change on Trunk

2011-06-10 Thread Simon Willnauer

Hey folks, I just committed LUCENE-3108 (Landing DocValues on Trunk) which adds a byte to FieldInfo. If you are running on trunk you must / should re-index any trunk indexes once you update to the latest trunk. It's likely that if you open up old trunk (4.0) indexes, you will get an exception related t

Travel Assistance applications now open for ApacheCon NA 2011

2011-06-06 Thread Simon Willnauer

The Apache Software Foundation (ASF)'s Travel Assistance Committee (TAC) is now accepting applications for ApacheCon North America 2011, 7-11 November in Vancouver BC, Canada. The TAC is seeking individuals from the Apache community at-large --users, developers, educators, students, Committers, an

Re: why query chinese character with bracket become phrase query by default?

2011-05-16 Thread Simon Willnauer

On Mon, May 16, 2011 at 3:51 PM, Yonik Seeley wrote: > On Mon, May 16, 2011 at 5:30 AM, Michael McCandless > wrote: >> To be clear, I'm asking that Yonik revert his commit from yesterday >> (rev 1103444), where he added "text_nwd" fieldType and dynamic fields >> *_nwd to the example schema.xml. >

Berlin Buzzwords - conference schedule released

2011-04-12 Thread Simon Willnauer

gives a presentation on how to integrate Solr with J2EE applications. The second day features presentations by Jonathan Gray on Facebook's use of HBase in their Messaging architecture, Dawid Weiss, Simon Willnauer and Uwe Schindler are showing the latest Apache Lucene developments, Mark M

[GSoC] Apache Lucene @ Google Summer of Code 2011 [STUDENTS READ THIS]

2011-03-11 Thread Simon Willnauer

Hey folks, Google Summer of Code 2011 is very close and the Project Applications Period has started recently. Now it's time to get some excited students on board for this year's GSoC. I encourage students to submit an application to the Google Summer of Code web-application. Lucene & Solr are ama

Re: Lucene 2.9.x vs 3.x

2011-01-16 Thread Simon Willnauer

On Sat, Jan 15, 2011 at 2:19 PM, Salman Akram wrote: > Hi, > > SOLR 1.4.1 uses Lucene 2.9.3 by default (I think so). I have few questions > > Are there any major performance (or other) improvements in Lucene > 3.0.3/Lucene 2.9.4? you can see all major changes here: http://lucene.apache.org/java/3

Re: Lucene Scorer Extension?

2011-01-09 Thread Simon Willnauer

you should look into this http://wiki.apache.org/solr/FunctionQuery simon On Fri, Jan 7, 2011 at 3:59 PM, dante stroe wrote: > Hello, > > What I am trying to do is build a personalized search engine. The aim > is to have the resulting documents' scores depend on users' preferences. > I've al

Re: The search response time is too loong

2010-09-27 Thread Simon Willnauer

2010/9/27 newsam : > I have setup a SOLR searcher instance with Tomcat 5.5.21. However, the > response time is too long. Here is my scenario: > 1. The index file is 8.2G. The doc num is 6110745. > 2. DELL Server: Intel(R) Xeon(TM) CPU (4 cores) 3.00GHZ, 6G Mem. > > I used "Key:*" to query all reco

Re: trie

2010-09-21 Thread Simon Willnauer

2010/9/21 Péter Király : > You can read about it in Lucene in Action second edition. have a look at http://www.lucidimagination.com/developer/whitepaper/Whats-New-in-Apache-Lucene-3-0 page 4 to 8 should give you a good intro to the topic simon > > Péter > > 2010/9/21 Papp Richard : >> is there

Re: Can I tell Solr to merge segments more slowly on an I/O starved system?

2010-09-18 Thread Simon Willnauer

On Sun, Sep 19, 2010 at 6:04 AM, Ron Mayer wrote: > My system which has documents being added pretty much > continually seems pretty well behaved except, it seems, > when large segments get merged. During that time > the system starts really dragging, and queries that took > only a couple seco

Re: No more trunk support for 2.9 indexes

2010-09-18 Thread Simon Willnauer

On Sat, Sep 18, 2010 at 4:13 AM, Chris Hostetter wrote: > > : Since Lucene 3.0.2 is 'out there', does this mean the format is nailed down, > : and some sort of porting is possible? > : Does anyone know of a tool that can read the entire contents of a Solr index > : and (re)write it another? (as an

Re: Field names

2010-09-13 Thread Simon Willnauer

On Tue, Sep 14, 2010 at 1:39 AM, Peter A. Kirk wrote: > Fantastic - that is exactly what I was looking for! > > But here is one thing I don't undertstand: > > If I call the url: > http://localhost:8983/solr/admin/luke?numTerms=10&fl=name > > Some of the result looks like: > > > > > 18

Re: mm=0?

2010-09-13 Thread Simon Willnauer

On Mon, Sep 13, 2010 at 8:07 PM, Lance Norskog wrote: > "Java Swing" no longer gives ads for "swinger's clubs". damned no i have to explicitly enter it?! - argh! :) simon > > On Mon, Sep 13, 2010 at 9:37 AM, Dennis Gearon wrote: >> I just tried several searches again on google. >> >> I think th

Re: stopwords in AND clauses

2010-09-13 Thread Simon Willnauer

On Mon, Sep 13, 2010 at 3:27 PM, Xavier Noria wrote: > Let's suppose we have a regular search field body_t, and an internal > boolean flag flag_t not exposed to the user. > > I'd like > > body_t:foo AND flag_t:true this is solr right? why don't you use a filter query for your unexposed flag_t fiel
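A minimal sketch of the filter-query suggestion: the user-facing clause goes into q and the internal flag into fq, which restricts the result set without contributing to the score. The base URL is an assumption and `search_url` is a hypothetical helper; body_t and flag_t are the field names from the question.

```python
from urllib.parse import urlencode


def search_url(user_term, base="http://localhost:8983/solr/select"):
    """Build a select URL with the hidden flag applied as a filter query."""
    params = [
        ("q", "body_t:" + user_term),  # what the user actually searched for
        ("fq", "flag_t:true"),         # internal restriction, not scored
    ]
    return base + "?" + urlencode(params)


print(search_url("foo"))
```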

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Simon Willnauer

On Mon, Sep 13, 2010 at 8:02 AM, Dennis Gearon wrote: > BTW, what is a segment? On the Lucene level an index is composed of one or more index segments. Each segment is an index by itself and consists of several files like doc stores, proximity data, term dictionaries etc. During indexing Lucene /

Re: Solr memory use, jmap and TermInfos/tii

2010-09-12 Thread Simon Willnauer

On Sun, Sep 12, 2010 at 12:42 PM, Robert Muir wrote: > On Sat, Sep 11, 2010 at 7:51 PM, Michael McCandless < > luc...@mikemccandless.com> wrote: > >> On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom >> wrote: >> > Is there an example of how to set up the divisor parameter in >> solrconfig.xml

Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Simon Willnauer

On Sun, Sep 12, 2010 at 1:51 AM, Michael McCandless wrote: > On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom wrote: >> Is there an example of how to set up the divisor parameter in >> solrconfig.xml somewhere? > > Alas I don't know how to configure terms index divisor from Solr... You can s

Re: How to give path in SCRIPT tag?

2010-09-07 Thread Simon Willnauer

ankita, your question seems to be somewhat unrelated to solr / lucene and should be asked somewhere else but not on this list. Please try to keep the focus of your questions to Solr related topics or use java-user@ for lucene related topics. Thanks, Simon On Tue, Sep 7, 2010 at 3:46 PM, ankita

Re: minMergeDocs supported ?

2010-08-24 Thread Simon Willnauer

Hey, I guess this option has been removed in Lucene 2.0 - you could look at maxBufferedDocs and ramBufferSizeMB to control how many documents / heap space is used to buffer documents before they are flushed and merged into a new segment. Don't know what you are trying to do but those are the factor

Re: search multiple default fields

2010-07-05 Thread Simon Willnauer

Have a look at http://wiki.apache.org/solr/DisMaxRequestHandler and http://wiki.apache.org/solr/DisMaxRequestHandler#qf_.28Query_Fields.29 that might help with what you are looking for... simon On Tue, Jul 6, 2010 at 3:48 AM, bluestar wrote: > hi there, > > is it possible to define multiple def
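For illustration, a dismax request that searches several fields at once might be built as below. The base URL, the field names (title, body), and the boost are assumptions, and `dismax_url` is a hypothetical helper; the qf parameter itself is the one described on the wiki page linked above.

```python
from urllib.parse import urlencode


def dismax_url(user_query, fields, base="http://localhost:8983/solr/select"):
    """Build a dismax select URL that searches `fields` for `user_query`."""
    params = [
        ("defType", "dismax"),
        ("q", user_query),
        ("qf", " ".join(fields)),  # space-separated fields, optional ^boost
    ]
    return base + "?" + urlencode(params)


print(dismax_url("ipod", ["title^2", "body"]))
```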

Re: Not split a field on whitespaces?

2010-07-05 Thread Simon Willnauer

Use solr.StrField or solr.KeywordTokenizerFactory instead. simon On Mon, Jul 5, 2010 at 2:47 PM, Sebastian Funk wrote: > Hey there, > > I might be just to blind to see this, but isn't it possible to have a > solr.TextField not getting filtered in any way. That means the input > "Michael Jackson"
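The difference can be pictured with a toy comparison. The two functions below are plain Python stand-ins, not Solr classes: one splits on whitespace the way a typical tokenized text field would, the other keeps the whole value as a single token, which is the effect of StrField or a keyword tokenizer.

```python
def whitespace_tokens(value):
    """Roughly what a whitespace-tokenized text field indexes."""
    return value.split()


def keyword_token(value):
    """Roughly what an untokenized field indexes: the entire value."""
    return [value]


print(whitespace_tokens("Michael Jackson"))
print(keyword_token("Michael Jackson"))
```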

Re: Weird memory error.

2007-11-21 Thread Simon Willnauer

Actually when I look at the error message, this has nothing to do with memory. The error message: java.lang.OutOfMemoryError: unable to create new native thread means that the OS cannot create any new native threads for this JVM. So the limit you are running into is not the JVM Memory. I guess you

Re: Weird memory error.

2007-11-20 Thread Simon Willnauer

I'm using the Eclipse TPTP platform and I'm very happy with it. You will also find good howto or tutorial pages on the web. - simon On Nov 20, 2007 5:29 PM, Brian Carmalt <[EMAIL PROTECTED]> wrote: > Can you recommend one? I am not familiar with how to profile under Java. > > Yonik Seeley schrieb

Re: Extending Solr's Admin functionality

2006-09-27 Thread Simon Willnauer

cation, and allowing users to plug it in or not. If I'm understanding that correctly then I'm quite +1 on JMX! And I suppose some of these adapters already have built in web service interfaces. Erik On Sep 27, 2006, at 6:20 AM, Simon Willnauer wrote: > @Otis: I suggest we go a

Re: Extending Solr's Admin functionality

2006-09-27 Thread Simon Willnauer

@Otis: I suggest we go a bit more in detail about the features solr should expose via JMX and talk about the contribution. I'd love to extend solr with more JMX support. On 9/27/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 9/26/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > On the other h

Re: Extending Solr's Admin functionality

2006-09-24 Thread Simon Willnauer

I followed the discussion for the last 3 days and I am still wondering why nobody has come up with an integration of solr monitoring and administration functionality using Java's fantastic management extension, JMX. I joined a team 2 years ago building a distributed webspider / searcher (similar to nutch). In

Re: update partial document

2006-09-18 Thread Simon Willnauer

I'm not into the code of Solr at all but I know that Solr is based on the lucene core which has no kind of update mechanism. To update a document using lucene you have to delete and reinsert the document. That might be the reason for the solr behaviour as well. You should consider that lucene is

Re: does solr know classpath

2006-09-16 Thread Simon Willnauer

/solrwebapp/WEB-INF/lib to point out one solution best regards simon On 9/16/06, James liu <[EMAIL PROTECTED]> wrote: i set classpath where i put lucene-analyzers-2.0.0.jar...i can use it. but solr not find it.. where i should put it in?

Solr in production env.

2006-09-11 Thread Simon Willnauer

Hello, I almost convinced my boss to use Solr in production for a new project and hopefully for lots of following projects but I'm a bit confused that there is no release available for download. Is Solr still in a beta state, are there solr servers in production. Is it recommendable to use it in

65 matches
