I have now optimized the index - down to 325MB; it compresses down to 20MB.
I think the new replication feature in Solr is great, but if it could compress
the files it's sending, it would be an awful lot more useful when replicating,
as we are, between sites.
---
Hi,
Is the new replication feature based on HTTP requests between sites?
If yes, then I guess it might be possible to configure an HTTP server
with mod_deflate so the data is compressed on the fly.
C.
Simon Collins wrote:
I have now optimized the index - down to 325MB; it compresses down to 20MB.
Hi,
I'm also trying to stress test Solr, and I would love some advice on how to
manage it properly.
I'm using Solr 1.3 and Tomcat 5.5.
Thanks a lot,
zqzuk wrote:
>
> Hi, I am doing a stress testing of my solr application to see how many
> concurrent requests it can handle and how long it takes. But I m
Open a JIRA issue. We will use gzip on both ends of the pipe. On
the slave side you can say
true
as an extra option to compress and send data from the server.
--Noble
On Wed, Oct 29, 2008 at 3:06 PM, Simon Collins
<[EMAIL PROTECTED]> wrote:
> I have now optimized the index - down to 325MB; it compresses down to 20MB.
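For the curious, the idea is roughly the following - a minimal, hypothetical
sketch using the standard java.util.zip classes, not the actual patch (file
handling and buffer size are illustrative):

import java.io.*;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipPipe {
    // Master side: stream an index file through gzip into the response.
    static void send(File file, OutputStream rawOut) throws IOException {
        InputStream in = new FileInputStream(file);
        GZIPOutputStream out = new GZIPOutputStream(rawOut);
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) != -1; ) {
            out.write(buf, 0, n);
        }
        in.close();
        out.finish(); // write the gzip trailer without closing rawOut
    }

    // Slave side: unwrap the gzip stream back into an index file.
    static void receive(InputStream rawIn, File file) throws IOException {
        GZIPInputStream in = new GZIPInputStream(rawIn);
        OutputStream out = new FileOutputStream(file);
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) != -1; ) {
            out.write(buf, 0, n);
        }
        in.close();
        out.close();
    }
}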
Hi,
first, have a look at the firstSearcher and warming section of
http://wiki.apache.org/solr/SolrCaching. Search engines rely on caching, so
first searches will be slow. I think for fair testing it is necessary to
warm up the search engine by sending the most frequently used and/or most
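A rough SolrJ sketch of such a warm-up pass (the query list and server URL
here are made up - substitute your own most frequent production queries):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class WarmUp {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // Replace with your own most frequent production queries.
        String[] frequent = { "ipod", "video", "title:engineer" };
        for (String q : frequent) {
            long start = System.currentTimeMillis();
            server.query(new SolrQuery(q));
            System.out.println(q + ": " + (System.currentTimeMillis() - start) + " ms");
        }
    }
}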
Hi!
Out of curiosity: How would one implement "search by example" with Solr?
What I mean:
Say I have a result entry with these fields/attributes:
id: 1
title: blue big slow car
color: blue
size: 30
maxspeed: 100
make: buses inc.
What would I have to do in order to find "similar" items? Do a se
Hi,
This can be done with 'more like this' functionality in Solr:
http://wiki.apache.org/solr/MoreLikeThis
Bye,
Jaco.
2008/10/29 Marian Steinbach <[EMAIL PROTECTED]>
> Hi!
>
> Out of curiosity: How would one implement "search by example" with Solr?
>
> What I mean:
>
> Say I have a result ent
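For example, a minimal SolrJ sketch against the standard request handler,
using the fields from the example above (server URL and MLT parameter values
are just illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MltExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // Start from the example document and ask for similar ones.
        SolrQuery query = new SolrQuery("id:1");
        query.set("mlt", "true");
        query.set("mlt.fl", "title,color,make"); // fields to mine for interesting terms
        query.set("mlt.mintf", "1");             // min term frequency in the source doc
        query.set("mlt.mindf", "1");             // min document frequency in the index
        QueryResponse rsp = server.query(query);
        System.out.println(rsp.getResponse().get("moreLikeThis"));
    }
}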
Hi
Now I am using Solr with two different types of data indexed and
searched. For example:
1) JobRec
2) JobSel
I store the data by specifying type:JobRec and, similarly, type:JobSel
while indexing. If I want to retrieve the data, I get it by
querying with type:JobRec.
This works perfectly.
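A small SolrJ sketch of such a query - the title field here is hypothetical,
only the type filter comes from the setup above:

import org.apache.solr.client.solrj.SolrQuery;

public class TypeFilter {
    public static void main(String[] args) {
        // The type restriction goes into a filter query (fq), so it is
        // cached independently of the main query.
        SolrQuery query = new SolrQuery("title:engineer"); // hypothetical field
        query.addFilterQuery("type:JobRec");
        System.out.println(query); // q=title%3Aengineer&fq=type%3AJobRec
    }
}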
Do keep in mind that compression is a CPU intensive process, so it is a trade
off between CPU utilization and network bandwidth. I have seen cases where
compressing the data before a network transfer ended up being slower than
without compression because the cost of compression and un-compression wa
Hello,
I'm doing some experiments with the morelikethis functionality using the
standard request handler to see if it also works with distributed search (I
saw that it will not yet work with the MoreLikeThis handler,
https://issues.apache.org/jira/browse/SOLR-788). As far as I can see, this
also d
I don't really get httpstone. My ant dist works fine,
but then when I do a java -jar I get an error:
~/httpstone-read-only/dist/lib$ java -jar httpstone.jar
Failed to load Main-Class manifest attribute from httpstone.jar
Any idea? Sorry, I'm new to Java.
zqzuk wrote:
>
Why invent something when compression is standard in HTTP? --wunder
On 10/29/08 4:35 AM, "Noble Paul നോബിള് नोब्ळ्" <[EMAIL PROTECTED]>
wrote:
> open a JIRA issue. we will use a gzip on both ends of the pipe . On
> the slave side you can say
> true
> as an extra option to compress and
> send data fr
Awesome! Thanks for the pointer, I will check this out.
Marian
On Wed, Oct 29, 2008 at 1:52 PM, Jaco wrote:
> Hi,
>
> This can be done with 'more like this' functionality in Solr:
> http://wiki.apache.org/solr/MoreLikeThis
: As far as our application goes, Commits and reads are done to the index
: during the normal business hours. However, we observed the max warmers
: error happening during a nightly job when the only operation is 4
: parallel threads commits data to index and Optimizes it finally. We
: increas
Just a question about your httpstone configuration:
I would like to know how you simulated several word searches.
Did you create a lot of different workers with lots of different word
searches?
Thanks,
zqzuk wrote:
>
> Hi,
>
> try to firstly have a look at http://wiki.apache.o
I think you may be right; I've opened SOLR-830.
: We may have identified the root cause but wanted to run it by the community.
: We figure there is a bug in the snappuller shell script, line 181:
-Hoss
Hi,
I'm doing the following query:
q=text:abc AND type:typeA
And I ask to return highlighting (query.setHighlight(true);). The search
term for field "type" (typeA) is also highlighted in the "text" field.
Any way to avoid this?
Thanks
Christophe
: 1) solr-core artifact contains org.apache.solr.client.solrj packages, and at
: the same time, the solr-core artifact depends on the solr-solrj artifact.
what you are seeing isn't specific to the maven jars; that's the way it is
in the standard release.
i believe the inclusion of solrj code in
Yes, I created different workers, with the same or different queries.
Sorry, it has been a while since I used it; the link to its source code
(no jars) should be here:
http://code.google.com/p/httpstone/source/checkout
sunnyfr wrote:
>
> just a question about your httpstone's configuration
On Wed, Oct 29, 2008 at 9:11 PM, Chris Hostetter
<[EMAIL PROTECTED]>wrote:
>
> i believe the inclusion of solrj code in the core jar is intentional, the
> core jar is intended (as i understand it) to encapsulate everything needed
> to run "Solr" (and because of the built in distributed search feat
Depends on your use cases. Having things in one index will generally
make things easier in the long run, and generally shouldn't be a
bottleneck. However, if the two types will be treated very differently
it may make sense to have two cores - say one type is not changed very
often, while the ot
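A minimal SolrJ sketch of the two-core option (the core names and server URL
here are hypothetical):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class TwoCores {
    public static void main(String[] args) throws Exception {
        // One client per core; each core keeps its own index, caches,
        // and commit/optimize schedule.
        SolrServer jobRec = new CommonsHttpSolrServer("http://localhost:8983/solr/jobrec");
        SolrServer jobSel = new CommonsHttpSolrServer("http://localhost:8983/solr/jobsel");
        jobRec.ping();
        jobSel.ping();
    }
}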
: > On the main lucene web page: http://lucene.apache.org/index.html
: > There is a list of news items spanning all the lucene subprojects. Does
FYI: that news section is just a manually maintained list of items as
regular forrest content (forrest is the tool used to generate the site and
buil
christophe wrote:
Hi,
I'm doing the following query:
q=text:abc AND type:typeA
And I ask to return highlighting (query.setHighlight(true);). The
search term for field "type" (typeA) is also highlighted in the "text"
field.
Any way to avoid this?
Thanks
Christophe
I haven't really used solrj,
: > I'm not sure if there's any reason for solr-core to declare a maven
: > dependency on solr-solrj.
: When creating the POMs, I had (incorrectly) assumed that the core jar does
: not contain SolrJ classes, hence the dependency.
I consider it a totally justifiable assumption. the current packa
we are not doing anything non-standard.
GZIPInputStream/GZIPOutputStream are standard. But asking users to
set up an extra Apache server is not fair if we can manage it with, say, 5
lines of code
On Wed, Oct 29, 2008 at 7:44 PM, Walter Underwood
<[EMAIL PROTECTED]> wrote:
> Why invent something when compres
You propose to do compressed transfers over HTTP ignoring the standard
support for compressed transfers in HTTP. Programming that with a
library doesn't make it "standard".
In Ultraseek, we implemented index synchronization over HTTP with
compression. It wasn't that hard.
I doubt that compression
: I want to partition my index based on category information. Also, while
: indexing I want to store particular category data to corresponding index
: partition. In the same way I need to search for category information on
: corresponding partition.. I found some information on wiki link
: h
I am getting this error quite frequently on my Solr installation:
SEVERE: org.apache.solr.common.SolrException: Error opening new
searcher. exceeded limit of maxWarmingSearchers=8, try again later.
I've done some googling but the common explanation of it being related
to autocommit doesn't a
Have you looked at how long your warm up is taking?
If it's taking longer to warm up a searcher than it does for you to do
an update, you will be behind the curve and eventually run into this no
matter how big that number.
-Original Message-
From: news [mailto:[EMAIL PROTECTED] On Behalf
I'm having the same issue.. have you had any progress with this?
> I'm doing the following query:
> q=text:abc AND type:typeA
> And I ask to return highlighting (query.setHighlight(true);). The search
> term for field "type" (typeA) is also highlighted in the "text" field.
> Any way to avoid this?
Use setHighlightRequireFieldMatch(true) on the query object [1]
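For example, with SolrJ (the server URL is illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class HighlightMatch {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("text:abc AND type:typeA");
        query.setHighlight(true);
        // Only highlight a term in the field it actually matched,
        // so "typeA" is not marked up inside the "text" field.
        query.setHighlightRequireFieldMatch(true);
        QueryResponse rsp = server.query(query);
        System.out.println(rsp.getHighlighting());
    }
}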
I was just looking at Mark Miller's Qsol parser for Lucene (
http://www.myhardshadow.com/qsol.php), and my users would really like to
have a similar ability to combine proximity and boolean search in arbitrary,
nested ways. The simplest use case I'm interested in is "phrase proximity",
where you sa
Hi -- using solr 1.3 -- roughly 11M docs on a 64 gig 8 core machine.
Fairly simple schema -- no large text fields, standard request
handler. 4 small facet fields.
The index is an event log -- a primary search/retrieval requirement is
date range queries.
A simple query without a date rang
Do you need to search down to the minutes and seconds level? If searching by
date provides sufficient granularity, for instance, you can normalize all
the time-of-day portions of the timestamps to midnight while indexing. (So
index any event happening on Oct 01, 2008 as 2008-10-01T00:00:00Z.) That
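A small Java sketch of that normalization using only standard java.util
classes (how you feed the result into your indexing code is up to you):

import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

public class MidnightNormalizer {
    public static void main(String[] args) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        // Zero out the time-of-day portion before indexing.
        cal.set(Calendar.HOUR_OF_DAY, 0);
        cal.set(Calendar.MINUTE, 0);
        cal.set(Calendar.SECOND, 0);
        cal.set(Calendar.MILLISECOND, 0);
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println(fmt.format(cal.getTime())); // e.g. 2008-10-01T00:00:00Z
    }
}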
Well, no - we don't care so much about the seconds, but hours &
minutes are indeed crucial.
---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
[EMAIL PROTECTED]
On Oct 29, 2008, at 4:41 PM, Chris Harris wrote:
Do you need to search down to the minutes and seconds
Feak, Todd wrote:
Have you looked at how long your warm up is taking?
If it's taking longer to warm up a searcher than it does for you to do
an update, you will be behind the curve and eventually run into this no
matter how big that number.
Most of them say warmupTime=0. It ranges from 0 to 37. I hope that is
msec and not seconds!!
It strikes me that removing just the seconds could very well reduce
overhead to 1/60 of original. 30 second query turns into 500ms query.
Just a swag though.
-Todd
-Original Message-
From: Alok Dhir [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 29, 2008 1:48 PM
To: solr-user@lucene.
My understanding of Noble's comment (and i could be wrong, i'm reading
between the lines) is that if you specify the new setting he's suggesting
when initializing the replication handler on the slave, then the slave
should start using an "Accept-Encoding: gzip" header when querying the
master,
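In plain Java that negotiation would look roughly like this (the master URL
is hypothetical, and this is just the standard HTTP mechanism, not code from
the handler):

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.zip.GZIPInputStream;

public class GzipFetch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://master:8983/solr/replication"); // hypothetical URL
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Advertise gzip support the standard HTTP way...
        conn.setRequestProperty("Accept-Encoding", "gzip");
        InputStream in = conn.getInputStream();
        // ...and unwrap only if the server actually compressed the body.
        if ("gzip".equals(conn.getContentEncoding())) {
            in = new GZIPInputStream(in);
        }
        in.close();
    }
}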
Chris Harris wrote:
I was just looking at Mark Miller's Qsol parser for Lucene (
http://www.myhardshadow.com/qsol.php), and my users would really like to
have a similar ability to combine proximity and boolean search in arbitrary,
nested ways. The simplest use case I'm interested in is "phrase pr
: The doc of HashDocSet says "it can be a better choice if there are few
: docs in the set". What does 'few' mean in this context?
it's relative to the total size of your index. if you have a million docs,
but you are dealing with DocSets that are only going to contain 10 docs,
then both the m
: Tomcat is using about 98mb memory, mysql is about 500mb. Tomcat
: completely freezes up - can't do anything other than restart the
: service.
a thread dump from the jvm running tomcat would probably be helpful in
figuring out what's going on
: timing out well before getting to the commit. As
I've also seen the suggestion (more from a pure Lucene perspective) of
breaking apart your dates. Remember that the time/space issues are due to
the number of terms. So it's possible (although I haven't tried it) to
index many fewer distinct terms, e.g. break your dates into some number of
fields,
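For example, a hypothetical SolrJ sketch - the field names are made up, the
point is just that each field has far fewer distinct terms than a single
full-resolution timestamp field:

import org.apache.solr.common.SolrInputDocument;

public class SplitDate {
    public static void main(String[] args) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("event_date", "2008-10-29"); // day resolution
        doc.addField("event_hour", 16);           // 0-23
        doc.addField("event_minute", 45);         // 0-59
        System.out.println(doc);
    }
}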
I saw a similar subject posted earlier. This is not a continuation of that
thread, but the problem is similar. I have a large, fast, dedicated machine,
that despite boosting various parameters in solrconfig.xml (attached) and in
the JVM, utilizes at most 10% of the cpu while importing: (from t
On Wed, Oct 29, 2008 at 9:48 PM, Barnett, Jeffrey
<[EMAIL PROTECTED]> wrote:
> Reported import rates start at 70 docs per second, and decrease as more
> records are added.
It might just be segment merges (that takes more time as segments grow in size).
From the solrconfig.xml I see you have autoc
Hoss,
You are partially right. Instead of the HTTP header, we use a request
parameter. (RequestHandlers cannot read HTTP headers.) If the param is
present, it wraps the response in a zip output stream. It is configured
in the slave because every slave may not want compression. Slaves
which are nea
: You are partially right. Instead of the HTTP header, we use a request
: parameter. (RequestHandlers cannot read HTTP headers.) If the param is
hmmm, i'm with walter: we shouldn't invent new mechanisms for
clients to request compression over HTTP from servers.
replication is both special enoug
I thought it was turned off already. (Lucene vs Solr?) Where do I make this
change?
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Wednesday, October 29, 2008 11:28 PM
To: solr-user@lucene.apache.org
Subject: Re: where's the bottlen
On Thu, Oct 30, 2008 at 2:46 AM, Jon Drukman <[EMAIL PROTECTED]> wrote:
>
> Most of them say warmupTime=0. It ranges from 0 to 37. I hope that is
> msec and not seconds!!
>
Correct, that is in milliseconds.
--
Regards,
Shalin Shekhar Mangar.
Hi,
I've been running solr 1.3 on an ec2 instance for a couple of weeks and I've
had some stability issues. It seems like I need to bounce the app once a day.
That I could live with and ultimately maybe troubleshoot, but what's more
disturbing is that three times in the last 2 weeks my index ha