Benchmarking Solr

2010-04-09 Thread Blargy
I am about to deploy Solr into our production environment and I would like to do some benchmarking to determine how many slaves I will need to set up. Currently the only way I know how to benchmark is to use Apache Benchmark but I would like to be able to send random requests to the Solr... not ju
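One way to get random requests rather than a single repeated URL is to pre-generate a list of query URLs and hand that list to a load tool that accepts one. The sketch below only builds the URLs and sends nothing. The base URL, the term list, and the `rows` value are placeholder assumptions, not details from the post; in practice the terms would come from real query logs.

```python
import random
from urllib.parse import urlencode


def build_random_queries(base_url, terms, count, seed=None, max_terms=3):
    """Return `count` select URLs, each querying 1..max_terms random terms."""
    rng = random.Random(seed)  # a fixed seed makes a benchmark run repeatable
    urls = []
    for _ in range(count):
        k = rng.randint(1, min(max_terms, len(terms)))
        query = " ".join(rng.sample(terms, k))
        urls.append(base_url + "?" + urlencode({"q": query, "rows": 10}))
    return urls


if __name__ == "__main__":
    # Hypothetical host and vocabulary, for illustration only.
    sample_terms = ["solr", "lucene", "index", "facet", "replication"]
    for url in build_random_queries(
        "http://localhost:8983/solr/select", sample_terms, 5, seed=42
    ):
        print(url)
```

Writing the URLs to a file keeps query generation separate from load generation, so the same query mix can be replayed against different numbers of slaves.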

Re: Solr date "NOW" - format?

2010-04-09 Thread Lance Norskog
The example function seems to round time to years, so you're boosting by year? Your dates are stored as UTC 64-bit longs counting the number of milliseconds since Jan 1, 1970. That's it. They're in milliseconds whether you supplied them that way or not. So I think the example is what you want. Fu

Solr date "NOW" - format?

2010-04-09 Thread Shawn Heisey
I've been trying to work out how SOLR thinks about dates internally so I can boost newer documents. My post_date field is stored as seconds since the epoch, so I think the following is probably what I want. I used 3.17 instead of the 3.16 in all the examples because my own math suggests that'
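The arithmetic behind the constant can be checked directly. This is a sketch under one assumption: that the 3.16 and 3.17 above refer to a constant of the form 3.16e-11, meaning roughly one divided by the number of milliseconds in a year. The helper names are made up for illustration. It also shows the seconds-to-milliseconds conversion that matters when a field holds epoch seconds but the date math works in milliseconds.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 milliseconds


def reciprocal_ms_per_year(days_per_year):
    """Return 1 / (milliseconds in a year of the given length)."""
    return 1.0 / (days_per_year * MS_PER_DAY)


def seconds_to_millis(epoch_seconds):
    """Convert a Unix timestamp in seconds to milliseconds."""
    return epoch_seconds * 1000


if __name__ == "__main__":
    for days in (365, 365.25):
        print(days, "days/year ->", reciprocal_ms_per_year(days))
    print(seconds_to_millis(1270771200))
```

With either a 365-day or a 365.25-day year, the reciprocal rounds to 3.17e-11 at three significant figures, which is consistent with preferring 3.17 over 3.16.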

OOM while indexing with Tika

2010-04-09 Thread Lance Norskog
There is a low-level memory "leak" (really an unfortunate retention) in Lucene which can cause OOMs when using the Tika tools on large files like PDF. A patch will be in the trunk sometime soon. http://markmail.org/thread/lhr7wodw4ctsekik https://issues.apache.org/jira/browse/LUCENE-2387 -- Lan

Re: including external files in config by corename

2010-04-09 Thread Shawn Heisey
On 4/8/2010 1:15 PM, Chris Hostetter wrote: ...i suspect you want something like... where handlers.xml looks like... The xpointer you mentioned above didn't work. I finally found something that did, though: href="/index/solr/config/requestHandlers.xml#xpointer(/*/node())"

Re: use a solr-built index with lucene?

2010-04-09 Thread Lance Norskog
Are the Trie types in Lucene 2.9.2? Otherwise, be sure to use the old int (or sint?) types in your schema. On Fri, Apr 9, 2010 at 4:12 AM, Erik Hatcher wrote: > Oh, sorry, I got the direction backwards in my initial reply. > > Yes, of course you can use an index from Solr with Lucene directly.  

Re: StreamingUpdateSolrServer hangs

2010-04-09 Thread Yonik Seeley
Stephen, were you running stock Solr 1.4, or did you apply any of the SolrJ patches? I'm trying to figure out if anyone still has any problems, or if this was fixed with SOLR-1711: * SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that could halt the streaming of documents. (At

Re: Questions about Solr

2010-04-09 Thread Smiley, David W.
If the user query is not going to have wildcards then use NGrams. I talk about the black art of ngrams in my book. There are multiple ways of configuring it. If the query will have wildcards, Solr comes with a sample schema with a field type named "text_rev" (I think that's what it's named)
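To see why n-grams let a plain query like "Ado" match "Adolescent" without a wildcard, it helps to look at what gets indexed. The function below is a conceptual sketch of edge n-gram generation in plain Python. It is not Solr's analyzer code, and the minimum and maximum gram sizes are arbitrary assumptions.

```python
def edge_ngrams(term, min_size=2, max_size=5):
    """Return the lowercased prefixes of `term` from min_size to max_size characters."""
    term = term.lower()
    upper = min(max_size, len(term))
    return [term[:n] for n in range(min_size, upper + 1)]


if __name__ == "__main__":
    grams = edge_ngrams("Adolescent")
    print(grams)           # ['ad', 'ado', 'adol', 'adole']
    print("ado" in grams)  # True: the query term matches an indexed gram
```

Because every prefix is stored as its own term at index time, the match is an ordinary term lookup, at the cost of a larger index.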

Questions about Solr

2010-04-09 Thread noel
Hi, I would like to know the answer to the following: - How am I able to use wildcard searches with Solr? EX: querying Ado with a result that would retrieve something like Adolescent. - Phrase searches with stop words completely ruin the query and find no results. How can I query something lik

Re: Minimum Should Match the other way round

2010-04-09 Thread MitchK
I have searched for a tutorial in Lucene - instead of Solr itself - and I've found something on lucenetutorials.com: String querystr = args.length > 0 ? args[0] : "lucene"; // the "title" arg specifies the default field to use // when no field is explicitly specified in the query. Q

Re: "json.nl=arrarr" does not work with "facet.date"

2010-04-09 Thread fabritw
Yonik Seeley-2-2 wrote: > > If order is more important here, then it should have been a NamedList. > Hi Yonik, thanks for your quick reply! Unfortunately I cannot use the NamedList as I need to use the dateField parameters in my query also. I am trying to compile a list of facets, displayin

Re: "json.nl=arrarr" does not work with "facet.date"

2010-04-09 Thread Yonik Seeley
On Fri, Apr 9, 2010 at 1:04 PM, fabritw wrote: > > Apologies for the second post, I noticed the "json.nl=arrarr" does work with > "facet.field" but not with "facet.date"? Hmmm, this is because date faceting uses a SimpleOrderedMap instead of a NamedList (implying that access-like-a-map is more im
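For readers unfamiliar with the parameter, the sketch below illustrates the output shapes that `json.nl` selects between when an ordered list of name/value pairs is written as JSON. It is a plain-Python illustration of the shapes only, not Solr's response writer, and the sample facet values are invented.

```python
import json


def render_named_list(pairs, style):
    """Render ordered (name, value) pairs in one of three json.nl-like shapes."""
    if style == "flat":
        # One flat array: name, value, name, value, ...
        return [item for pair in pairs for item in pair]
    if style == "map":
        # A JSON object: order is not guaranteed by JSON, duplicates collapse
        return dict(pairs)
    if style == "arrarr":
        # An array of [name, value] arrays: order is preserved
        return [[name, value] for name, value in pairs]
    raise ValueError("unknown style: " + style)


if __name__ == "__main__":
    facets = [("2010-04-01", 12), ("2010-04-02", 7)]  # invented sample counts
    for style in ("flat", "map", "arrarr"):
        print(style, json.dumps(render_named_list(facets, style)))
```

As the reply above explains, date facets are held in a map-like structure rather than a NamedList, which is why they come out in the `map` shape regardless of the requested style.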

Re: "json.nl=arrarr" does not work with "facet.date"

2010-04-09 Thread fabritw
Apologies for the second post, I noticed the "json.nl=arrarr" does work with "facet.field" but not with "facet.date"? Is there a separate parameter required for "facet.date" to make it display as an array? Any help is much appreciated, Will { "responseHeader":{ "status":0, "QTime":2, "p

Re: solr.WordDelimiterFilterFactory problem with hyphenated terms?

2010-04-09 Thread Robert Muir
but this behavior is correct, as you have position increments enabled. if you want the second query (which has 2 gaps) to match, you need to either use slop, or disable these increments altogether. On Fri, Apr 9, 2010 at 11:44 AM, Demian Katz wrote: > I've given it a try, and it definitely seems

RE: solr.WordDelimiterFilterFactory problem with hyphenated terms?

2010-04-09 Thread Demian Katz
I've given it a try, and it definitely seems to have improved the situation. However, there is still one weird case that's clearly related to term positions. If I do this search, it fails: title:"love customs in eighteenthcentury spain" ...but if I do this search, it succeeds: title:"love cu

Re: Replication process on Master/Slave slowing down slave read/search performance

2010-04-09 Thread Walter Underwood
You don't need multi-core. Solr already does this automatically. It creates a new Searcher and auto-warms the cache. But, it will still be slow. If you use auto-warming, it uses most of one CPU, which slows down queries during warming. Also, warming isn't perfect, so queries will be slower afte

RE: index corruption / deployment strategy

2010-04-09 Thread Nagelberg, Kallin
Thanks Erik, I forwarded your thoughts to management and put in a good word for Lucid Imagination. Regards, Kallin Nagelberg -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Thursday, April 08, 2010 2:18 PM To: solr-user@lucene.apache.org Subject: Re: index cor

refreshing synonyms.txt - or other configs

2010-04-09 Thread Markus.Rietzler
I am wondering how config files like synonyms.txt or stopwords.txt can be refreshed without restarting Solr, and maybe also how changes in solrconfig.xml or schema.xml can be refreshed. I can use a multicore setup - I just tested it with a "multicore" setup with one core (core0); there I can

Re: Minimum Should Match the other way round

2010-04-09 Thread MitchK
Hoss, before I run into any misunderstandings, I want to come back to the topic first. I will have a look at some classes later, to find out whether some other ideas which are not directly related to this topic (like the multiword-synonyms at query-time) will work or not. I'm sorry for being off-t

Re: Solr giving 500's

2010-04-09 Thread Yonik Seeley
Looks like you're missing one of the index files... segments_ It points to all the other index files. -Yonik Apache Lucene Eurocon 2010 18-21 May 2010 | Prague On Fri, Apr 9, 2010 at 6:20 AM, william pink wrote: > Hi, > > I was seeing this error from Solr this morning > > "Severe_errors_in_solr

Re: use a solr-built index with lucene?

2010-04-09 Thread Erik Hatcher
Oh, sorry, I got the direction backwards in my initial reply. Yes, of course you can use an index from Solr with Lucene directly. It's just a Lucene index. Just make sure you use the same version of Lucene (pull the JARs from solr.war, I'd say). For example, you can open a "Solr index" w

Re: Faceting on a multi-valued field by index

2010-04-09 Thread Erik Hatcher
Though if you added a prefix to all your root id's, say "root" format, then you could use facet.prefix=root Erik On Apr 8, 2010, at 10:24 PM, Lance Norskog wrote: Nope! Lucene is committed to maintaining the order of values added to a field, but does not have this feature. On Th
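A request using this suggestion might be assembled as in the sketch below. The field name `ids`, the match-all query, and the host are hypothetical; only `facet.prefix=root` comes from the message above.

```python
from urllib.parse import urlencode


def facet_prefix_url(base_url, field, prefix):
    """Build a select URL that facets on `field`, keeping only values starting with `prefix`."""
    params = [
        ("q", "*:*"),
        ("rows", 0),              # only the facet counts are of interest here
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(facet_prefix_url("http://localhost:8983/solr/select", "ids", "root"))
```

The prefix filter applies to the indexed terms, so it only helps if the root ids were written with that prefix at index time.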

Solr giving 500's

2010-04-09 Thread william pink
Hi, I was seeing this error from Solr this morning "Severe_errors_in_solr_configuration__Check_your_log_files_for_more_detailed_infomation_on_what_may_be_wrong__If_you_want_solr_to_continue_after_configuration_errors_changeabortOnConfigurationErrorfalseabortOnConfigurationError__in_solrconfig

Re: Replication process on Master/Slave slowing down slave read/search performance

2010-04-09 Thread Marco Martinez
Hi Marcin, This is because when you do the replication, all the caches are rebuilt because the index has changed, so search performance decreases. You can change your architecture to a multicore one to reduce the impact of the replication. Using two cores, one to do the replication, and other to

Replication process on Master/Slave slowing down slave read/search performance

2010-04-09 Thread Marcin
Hi guys, I have noticed that the Master/Slave replication process slows down slave read/search performance while replication is running. Please help. Cheers

Re: use a solr-built index with lucene?

2010-04-09 Thread Tommy Chheng
I was thinking of the reverse case: from solr to lucene. lucene doesn't use a schema.xml Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com On 4/9/10 12:15 AM, Paul Libbrecht wrote: This looks like an interesting avenue for a smooth transition

Re: use a solr-built index with lucene?

2010-04-09 Thread Paul Libbrecht
This looks like an interesting avenue for a smooth transition from lucene to solr. Thanks for any further hints you come across. (e.g. maybe it is not too hard to pre-generate a schema.xml from an actual index for the field-types?) paul On 09 Apr 2010 at 02:32, Erik Hatcher wrote: Yes... gott