Re: performance crossover between single index and sharding

2011-08-03 Thread Bernd Fehling
Hi Shawn, the 0.05 seconds for search time at peak times (3 qps) is my target for Solr. The numbers for Solr are from Solr's statistics report page. So 39.5 seconds average per request is definitely too long and I have to change to sharding. For FAST system the numbers for the search dispatcher ar

csv responsewriter and numfound

2011-08-03 Thread Pooja Verlani
Hi, Is there any way to get numFound from the csv response format? Some parameter? Or shall I change the code of csvResponseWriter for this? Thanks, Pooja

java.lang.IllegalStateException: Committed error in the logs

2011-08-03 Thread Anand.Nigam
I am getting the following error in the logs when trying to search. Any idea why this error occurs? Search results arrive only after a long delay. SEVERE: org.mortbay.jetty.EofException at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791) at org.mortbay.jetty.AbstractGenerator$

Highlighting does not works with uniqueField set

2011-08-03 Thread Anand.Nigam
Hi, I am new to Solr. I am facing an issue wherein highlighting of matches in the search results does not work when I have set a unique field as: id If this is commented out, highlighting starts working. I need to have a unique field. Could someone please explain this erratic behaviour? I a

Re: SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.ICUTokenizerFactory'

2011-08-03 Thread Satish Talim
Guys, I am still stuck. Any help? Thanks, Satish On Tue, Aug 2, 2011 at 5:23 PM, Robert Muir wrote: > did you add the analysis-extras jar itself? thats what has this factory. > > On Tue, Aug 2, 2011 at 5:03 AM, Satish Talim > wrote: > > I am using Solr 3.3 on a Windows box. > > > > I want to

Re: Is there anyway to sort differently for facet values?

2011-08-03 Thread Erick Erickson
have you looked at the facet.sort parameter? The "index" value is what I think you want. Best Erick On Aug 3, 2011 7:03 PM, "Way Cool" wrote: > Hi, guys, > > Is there anyway to sort differently for facet values? For example, sometimes > I want to sort facet values by their values instead of # of
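The facet.sort suggestion above can be sketched as a request URL. This is a minimal sketch, assuming a local Solr at http://localhost:8983/solr/select and a hypothetical facet field named "category"; neither appears in the thread. Only the facet.sort parameter and its "index" value come from the reply itself.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, facet_field, sort_by_index=True):
    """Build a Solr select URL that facets on a single field.

    facet.sort=index orders facet values by their indexed term,
    facet.sort=count orders them by number of matching documents.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),  # only the facet counts are of interest here
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.sort", "index" if sort_by_index else "count"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host and field name, for illustration only.
url = facet_query_url("http://localhost:8983/solr/select", "category")
print(url)
```

Passing sort_by_index=False switches back to ordering by document count. A predefined, hand-picked order is not something facet.sort expresses, so that part of the question would presumably need client-side reordering.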

Re: Records skipped when using DataImportHandler

2011-08-03 Thread Erick Erickson
Sorry, I'm on a restricted machine so can't get the precise URL. But, there's a debug page for DIH that might allow you to see what the query actually returns. I'd guess one of two things: 1> you aren't getting the number of rows you think. 2> you aren't committing the documents you add. But that'

solr-user@lucene.apache.org

2011-08-03 Thread Erick Erickson
Hmmm, the only thing that comes to mind is the "join" feature being added to Solr 4.x, but I confess I'm not entirely familiar with that functionality so can't tell if it really solves your problem. Other than that I'm out of ideas, but then again it's late and I'm tired so maybe I'm not being very

Re: MultiSearcher/ParallelSearcher - searching over multiple cores?

2011-08-03 Thread Erick Erickson
As far as I know, you're right. There's no built-in way to do what you want, especially since the fact that you're talking about different search fields implies that the scores from the documents aren't comparable anyway. How do you intend to combine the results for presentation to the user? Best

Re: lucene/solr, raw indexing/searching

2011-08-03 Thread Erick Erickson
I predict you'll spend a lot of time on the admin/analysis page understanding what the various combinations of tokenizers and filters do. Because, you see, you already have differences, to wit: your Solr schema has LowercaseFilter and removeDuplicates. Have you determined *why* Solr indexing is s

Re: indexing taking very long time

2011-08-03 Thread Erick Erickson
What version of Solr are you using? If it's a recent version, then optimizing is not that essential, you can do it during off hours, perhaps nightly or weekly. As far as indexing speed, have you profiled your application to see whether it's Solr or your indexing process that's the bottleneck? A q

Is there anyway to sort differently for facet values?

2011-08-03 Thread Way Cool
Hi, guys, Is there any way to sort differently for facet values? For example, sometimes I want to sort facet values by their values instead of # of docs, and I want to be able to have a predefined order for certain facets as well. Is that possible in Solr? Thanks, YH

Help with ShardParams

2011-08-03 Thread John Brewer
Hello, Can someone point me to a good example or two of usage of the ShardParams shards.start and shards.rows? I have a Solr instance of 250M documents spread across 4 shards. And I need to be able to reliably and quickly access the records by "page" at the request of the user. I understan

Re: Does solr support multiple index set

2011-08-03 Thread Helton Alponti
Hello Sharath, Yes you can create many indexes. See this article: http://wiki.apache.org/solr/CoreAdmin See you, Helton On Wed, Aug 3, 2011 at 4:55 PM, Sharath Jagannath wrote: > Hey, > > This might be completely naive question. > > Could, I create more than one instance of index sets on a sin

Does solr support multiple index set

2011-08-03 Thread Sharath Jagannath
Hey, This might be a completely naive question. Could I create more than one instance of index sets on a single instance of Solr server? If so, how could I specify which schema to use and which index set to use? I am planning to create 2 separate index sets using a single Solr server. Data that n

RE: question on solr.ASCIIFoldingFilterFactory

2011-08-03 Thread cquezel
lboutros wrote: > > I used Spanish stemming, put the ASCIIFoldingFilterFactory before the > stemming filter and added it in the query part too. > > Ludovic. > My experiments with the French stemmer do not yield good results with this order. Applying the ASCIIFoldingFilterFactory before stemming

Re: Setting up Namespaces to Avoid Running Multiple Solr Instances

2011-08-03 Thread Jonathan Rochkind
I think that Solr "multi-core" (nothing to do with CPU cores, just what it's called in Solr) is what you're looking for. http://wiki.apache.org/solr/CoreAdmin On 8/3/2011 2:25 PM, Mike Papper wrote: Hi, we run several independent websites on the same machines. Each site uses a similar codebase
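The multi-core suggestion above could look like this from the client side. This is a minimal sketch, assuming each website is given its own core on one Solr server and that cores are addressed as /solr/<core>/select; the site names, core names, host, and port are all hypothetical and not taken from the thread.

```python
# Hypothetical site-to-core mapping; names are illustrative only.
SITE_CORES = {
    "site-a": "core_a",
    "site-b": "core_b",
}


def select_url(site, host="localhost", port=8983):
    """Return the select URL of the core that serves the given site.

    All sites share one Solr server (one port); only the core name
    in the path differs, which acts as the per-site namespace.
    """
    core = SITE_CORES[site]
    return "http://{}:{}/solr/{}/select".format(host, port, core)


print(select_url("site-a"))
print(select_url("site-b"))
```

Each core keeps its own schema and index, so the sites stay independent while sharing one server process. The CoreAdmin wiki page linked above describes how cores are declared and managed.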

Setting up Namespaces to Avoid Running Multiple Solr Instances

2011-08-03 Thread Mike Papper
Hi, we run several independent websites on the same machines. Each site uses a similar codebase for search. Currently each site contacts its own solr server on a slightly different port. This means of course that we are running several solr servers (each on their own port) on the same machine. I wo

Records skipped when using DataImportHandler

2011-08-03 Thread anand sridhar
Hi, I am a newbie to Solr and have been trying to learn using DataImportHandler. I have a query in data-config.xml that fetches about 5 records when I run it in SQL Query manager. However, when Solr does a full import, it is skipping 4 records and only importing 1 record. What could be the reason

Re: Dismax mm per field

2011-08-03 Thread Dmitriy Shvadskiy
Thanks Jonathan. I thought it would be possible via nested queries but somehow could not get it to work. I'll give it another shot. On Wed, Aug 3, 2011 at 12:32 PM, Jonathan Rochkind [via Lucene] < ml-node+3222792-952640420-221...@n3.nabble.com> wrote: > There is not, and the way dismax works mak

A rant about field collapsing

2011-08-03 Thread baronDodd
I am working on an implementation of search within our application using solr. About 2 months ago we had the need to group results by a certain field. After some searching I came across the JIRA in progress for this - field collapsing: https://issues.apache.org/jira/browse/SOLR-236 It was schedul

Re: Solr request filter and indexing process

2011-08-03 Thread 于浩
Aha, I have found the root cause. Solr returned the result properly; the root cause is the SolrPHPClient. The SolrPHPClient uses the file_get_contents function for connecting to Solr by default. This function is not stable and usually returns an HTTP status error. Thanks to everybody who gives me

Re: Strategies for sorting by array, when you can't sort by array?

2011-08-03 Thread Jonathan Rochkind
Not so much that it's a corner case in the sense of being unusual necessarily (I'm not sure), it's just something that fundamentally doesn't fit well into lucene's architecture. I'm not sure that filing a JIRA will be much use, it's really unclear how one would get lucene to do this, it would

Re: Strategies for sorting by array, when you can't sort by array?

2011-08-03 Thread Jonathan Rochkind
There's no great way to do this. I understand your problem as: It's a multi-valued field, but you want to sort on whichever of those values matched the query, not on the values that didn't. (Not entirely clear what to do if the documents are in the result set because of a match in an entirely di

Re: Dismax mm per field

2011-08-03 Thread Jonathan Rochkind
There is not, and the way dismax works makes it not really that feasible in theory, sadly. One thing you could do instead is combine multiple separate dismax queries using the nested query syntax. This will affect your relevancy ranking possibly in odd ways, but anything that accomplishes 'mm
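The nested-query workaround mentioned above might be assembled as follows. This is a minimal sketch, assuming Solr's `_query_:"{!dismax ...}"` nested-query form with the default lucene parser on the outside; the field names and terms echo the original question, while the mm values are made up for illustration.

```python
from urllib.parse import urlencode


def nested_dismax(field, terms, mm):
    """Wrap one dismax sub-query in the _query_ nested-query syntax.

    Each sub-query carries its own qf and mm local params, which is
    what allows a different mm per field. Terms that contain double
    quotes would need escaping; this sketch does not handle that.
    """
    return '_query_:"{!dismax qf=' + field + " mm=" + mm + "}" + terms + '"'


# Assumed mm values: all terms required on field1, half of them on field2.
q = " AND ".join([
    nested_dismax("field1", "blah blah", "100%"),
    nested_dismax("field2", "foo bar", "50%"),
])
query_string = urlencode({"q": q})
print(q)
```

As the reply warns, the scores of the sub-queries are combined by the outer boolean query, so relevancy ranking may differ from what a single dismax query would give.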

RE: Strategies for sorting by array, when you can't sort by array?

2011-08-03 Thread Olson, Ron
*Sigh*...I had thought maybe reversing it would work, but that would require creating a whole new index, on a separate core, as the existing index is used for other purposes. Plus, given the volume of data, that would be a big deal, update-wise. What would be better would be to remove that parti

solr-user@lucene.apache.org

2011-08-03 Thread karsten-solr
Hi Erick, our two "changeable" fields are used for linking between documents on application level. From lucene point of view they are just two searchable fields with stored term vector for one of them. Our queries will use one of these fields and a couple of fields from the "stable" fields. So

Re: Strategies for sorting by array, when you can't sort by array?

2011-08-03 Thread Smiley, David W.
Hi Ron. This is an interesting problem you have. One idea would be to create an index with the entity relationship going in the other direction. So instead of one to many, go many to one. You would end up with multiple documents with varying names but repeated parent entity information -- perh

Dismax mm per field

2011-08-03 Thread Dmitriy Shvadskiy
Hello, Is there a way to apply (e)dismax mm parameter per field? If I have a query field1:(blah blah) AND field2:(foo bar) is there a way to apply mm only to field2? Thanks, Dmitriy -- View this message in context: http://lucene.472066.n3.nabble.com/Dismax-mm-per-field-tp3222594p3222594.html S

RE: Strategies for sorting by array, when you can't sort by array?

2011-08-03 Thread Olson, Ron
Right, the search term is the sort field. I can manually sort an individual page, but when the user clicks on the next page, the sort is "reset", visually. -Original Message- From: Mike Sokolov [mailto:soko...@ifactory.com] Sent: Wednesday, August 03, 2011 9:52 AM To: solr-user@lucene.apa

Re: Strategies for sorting by array, when you can't sort by array?

2011-08-03 Thread Mike Sokolov
Although you weren't very clear about it, it sounds as if you want the results to be sorted by a name that actually matched the query? In general that is not going to be easy, since it is not something that can be computed in advance and thus indexed. -Mike On 08/03/2011 10:39 AM, Olson, Ro

Strategies for sorting by array, when you can't sort by array?

2011-08-03 Thread Olson, Ron
Hi all- Well, this is a problem. I have a list of names as a multi-valued field and I am searching on this field and need to return the results sorted. I know from searching and reading the documentation (and getting the error) that sorting on a multi-valued field isn't possible. Okay, so, what

Re: performance crossover between single index and sharding

2011-08-03 Thread Shawn Heisey
Replies inline. On 8/3/2011 2:24 AM, Bernd Fehling wrote: To show that I compare apples and oranges here are my previous FAST Search setup: - one master server (controlling, logging, search dispatcher) - six index server (4.25 mio docs per server, 5 slices per index) (searching and indexing a

RE: Joining on multi valued fields

2011-08-03 Thread matthew . fowler
No I haven't. I will get the latest out of the trunk and report back. Cheers again, Matt -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: 03 August 2011 14:51 To: Fowler, Matthew (Markets Eikon) Cc: solr-user@lucene.apache.org Subject:

solr-user@lucene.apache.org

2011-08-03 Thread Erick Erickson
How are these fields used? Because if they're not used for searching, you could put them in their own core and rebuild that index at your whim, then querying that core when you need the relationship information. If you have a DB backing your system, you could perhaps store the info there and query

Re: Joining on multi valued fields

2011-08-03 Thread Yonik Seeley
Hmmm, if these are real responses from a solr server "at rest" (i.e. documents not being changed between queries) then what you show definitely looks like a bug. That's interesting, since TestJoin implements a random test that should cover cases like this pretty well. I assume you are using a vers

Re: Different Access Permissions?

2011-08-03 Thread Erick Erickson
Sure, it's possible. It's just that you have to do the work yourself ... You could define a series of request handlers for various classes of user and route the request to the correct handler based on that user's attributes. You could construct the query manually based on the user's attributes. Yo

Re: Looking for a senior search engineer

2011-08-03 Thread Erick Erickson
Here's a page where you can "hire guns" that you might be interested in... http://wiki.apache.org/solr/Support Best ERick On Fri, Jul 29, 2011 at 8:59 PM, Michael Economy wrote: > Hi, > > Sorry if this isn't the right place for this message, but it's a very > specific role we're looking for and

Re: dealing with so many different sorting options

2011-08-03 Thread Erick Erickson
Well, you're kind of stuck unfortunately. It's pretty much required that you'll have to reindex when you add new fields if you want existing documents to have that field. I don't think there's any good way to use the DB to sort Solr results that would be performant. About using Solr as your data

Re: Possible to use quotes in dismax qf?

2011-08-03 Thread Erick Erickson
Did you look at phrase fields (pf) in dismax? Best Erick On Thu, Jul 28, 2011 at 11:26 AM, O. Klein wrote: > I removed the post as it might confuse people. > > But because of analysers combining 2 words in a phrase query using shingles > and positionfilter and the usage of dismax, I need q to be

Re: about the Solr request filter

2011-08-03 Thread Erick Erickson
Can we see the queries you're running and the data you expect back? And an idea of the documents you're expecting to be matched, including the field definitions from your schema.xml for the fields in question. Are you using SolrJ? Just a URL in a browser? How do you mean "manually"? It might hel

Re: changing the root directory where solrCloud stores info inside zookeeper File system

2011-08-03 Thread Mark Miller
Thanks - let me try and do this here manually later today and get back to you. - Mark On Aug 2, 2011, at 7:41 AM, Yatir Ben Shlomo wrote: > Thanks A lot mark, > Since My SolrCloud code was old I tried downloading and building the > newest code from here > https://svn.apache.org/repos/asf/lucene/

RE: Joining on multi valued fields

2011-08-03 Thread matthew . fowler
Hi Yonik Sorry for my late reply. I have been trying to get to the bottom of this but I'm getting inconsistent behaviour. Here's an example: Query = "pi:rcs100" - Here going to use "pid_rcs" as join value rcs100 rcs G1 Emerging Market Countries All business events rela

Re: Why Slop doens't match anything?

2011-08-03 Thread Ahmet Arslan
> Hm... > > No. Can you paste output of &debugQuery=on for two queries?

Re: Why Slop doens't match anything?

2011-08-03 Thread Alexander Ramos Jardim
Hm... No. 2011/8/3 Gora Mohanty > On Wed, Aug 3, 2011 at 1:33 AM, Alexander Ramos Jardim > wrote: > [...] > > I am not using dismax. I didn't find the solution for the problem. I just > > made a full-import and the problem ended. Still odd. > [...] > > Maybe you changed the type of the field i

Dispatching a query to multiple different cores

2011-08-03 Thread Ahmed Boubaker
Hello there! I have a multicore Solr with 6 different "simple" cores and somewhat different schemas and I defined another "meta" core which I would like to be a dispatcher: the requests are sent to "simple" cores and results are aggregated before sending back the results to the user. Any idea or h

Re: performance crossover between single index and sharding

2011-08-03 Thread Dmitry Kan
OK, here is a brief on our sharded setup. We have 10 shards, 3 per high-end Amazon machine. Majority of the searches are done on 2 shards at most, that have the latest data in their indices. We use logical sharding, not hash based. These two lead to a situation, where given a user query that *will

Re: performance crossover between single index and sharding

2011-08-03 Thread Bernd Fehling
On 02.08.2011 21:00, Shawn Heisey wrote: ... I did try some early tests with a single large index. Performance was pretty decent once it got warmed up, but I was worried about how it would perform under a heavy load, and how it would cope with frequent updates. I never really got very far with

Re: Jetty error message regarding EnvEntry in WebAppContext

2011-08-03 Thread Marian Steinbach
On Tue, Aug 2, 2011 at 18:42, Jonathan Rochkind wrote: > > You know that Solr distro comes with a jetty with a Solr in it, right, as an > example application? Even if you don't want to use it for some reason, that > would probably be the best model to look at for a working jetty with solr. Sure,