First of all you need to index your data in Solr. I suggest
DataImportHandler because it can help you join multiple tables and
index the data.
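For illustration, a minimal data-config.xml sketch that joins two tables via a nested entity (the driver, URL, and table/column names here are hypothetical):

```xml
<!-- Hypothetical DIH config: the nested entity runs one query per outer
     row, so parent and child tables are flattened into one document. -->
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/mydb"/>
  <document>
    <entity name="employee" query="SELECT id, name FROM employee">
      <entity name="department"
              query="SELECT dept_name FROM department WHERE emp_id = '${employee.id}'"/>
    </entity>
  </document>
</dataConfig>
```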
On Fri, Oct 31, 2008 at 10:20 AM, Raghunandan Rao
<[EMAIL PROTECTED]> wrote:
> Thank you so much.
>
> Here goes my Use case:
>
> I need to search the database f
Run full-import with clean=false.
For full-import, clean is set to true by default; for delta-import,
clean is false by default.
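Assuming the handler is registered at /dataimport, the request would look something like:

```
http://localhost:8983/solr/dataimport?command=full-import&clean=false
```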
On Fri, Oct 31, 2008 at 9:16 AM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> I have a DataImportHandler configured to index from an RSS feed. It is a
> "latest stuff" fe
It could also be that the C version is a lot more efficient than
the Java version and it could take longer regardless. I could not
find a benchmark on that, but C is usually better for bit twiddling.
wunder
On 10/30/08 10:36 PM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:
> man gzip:
>
>
man gzip:
-# --fast --best
     Regulate the speed of compression using the specified digit #, where
     -1 or --fast indicates the fastest compression method (less
     compression) and -9 or --best indicates the slowest compression
     method (best compression). The
Yes, you can change the mergeFactor. More important than the mergeFactor is
this:
<ramBufferSizeMB>32</ramBufferSizeMB>
Pump it up as much as your hardware/JVM allows. And use appropriate -Xmx, of
course.
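For reference, these indexing knobs live in solrconfig.xml under <indexDefaults>; a sketch (the values are illustrative, not recommendations):

```xml
<indexDefaults>
  <!-- buffer documents in RAM up to this size before flushing a segment;
       raise it as far as your heap (-Xmx) allows -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</indexDefaults>
```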
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: "Barnett, Jeffre
On Thu, 30 Oct 2008 20:46:16 -0700
"Lance Norskog" <[EMAIL PROTECTED]> wrote:
> Now: a few hours later there are a different 100 "latest" documents. How do
> I add those to the index so I will have 200 documents? 'full-import' throws
> away the first 100. 'delta-import' is not implemented. What
Thank you so much.
Here goes my Use case:
I need to search the database for a collection of input parameters, which
touches 'n' tables. The data volume is huge, and the search query itself is
very dynamic. I use a lot of views for the same search. How do I make use of
Solr in this case?
-Origi
I'd like to say that this is part of https://issues.apache.org/jira/browse/SOLR-783,
but looking at it closely it might be different.
I think the issue is that delta-import does not have anything to match
its last_index_time against when doing feeds. I'm also interested in
that type of merge
I have a DataImportHandler configured to index from an RSS feed. It is a
"latest stuff" feed. It reads the feed and indexes the 100 documents
harvested from the feed. So far, works great.
Now: a few hours later there are a different 100 "latest" documents. How do
I add those to the index so I wi
On Thu, 30 Oct 2008 15:50:58 -0300
"Jorge Solari" <[EMAIL PROTECTED]> wrote:
>
>
> in the schema file.
or use Dismax query handler.
b
_
{Beto|Norberto|Numard} Meijome
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you
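For what it's worth, dismax can be wired up as a request handler in solrconfig.xml; a minimal sketch (the qf fields are assumptions to adapt to your schema):

```xml
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- fields searched by default, with optional boosts -->
    <str name="qf">AnimalName^2 text</str>
  </lst>
</requestHandler>
```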
Wow, 30k in under 3 seconds.
On 10/30/08, Stephen Weiss <[EMAIL PROTECTED]> wrote:
> I've actually seen cases on our site where it's possible to bring up
> over 30,000 facets for one query. And they actually come up quickly -
> like, 3 seconds. It takes longer for the browser to render them.
I understand what you mean. I am building a system that will
dynamically generate facets, which could possibly be thousands, but
at most about 6 or 7 facets will be returned using a facet ranking
algorithm, so I get what you mean if I request in my query that I
want 1000 facets back compared to
I've actually seen cases on our site where it's possible to bring up
over 30,000 facets for one query. And they actually come up quickly -
like, 3 seconds. It takes longer for the browser to render them.
--
Steve
On Oct 30, 2008, at 6:04 PM, Ryan McKinley wrote:
the only 'limit' is th
the only 'limit' is the effect on your query times... you could have
1000+ facets if you are ok with the response time.
Sorry to give the "it depends" answer, but it totally depends on your
data and your needs.
On Oct 30, 2008, at 7:28 AM, Jeryl Cook wrote:
is there a limit on the numb
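As a sketch, a request with several facet fields is just repeated facet.field parameters (the field names here are made up):

```
/select?q=*:*&rows=0&facet=true&facet.field=color&facet.field=brand&facet.limit=1000
```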
: myfacet, ASC, limit 1
: myfacet, DESC, limit 1
: So I can get the first value and the last one.
:
: Do you think I will get more performance with this way than using stats?
I'm guessing that by all measurable metrics, the StatsComponent will blow
that out of the water -- i was just putting it
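For comparison, a StatsComponent request (Solr 1.4) returns min and max in one call; something like (the field name is assumed):

```
/select?q=*:*&rows=0&stats=true&stats.field=myfacet
```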
The http://wiki.apache.org/lucene-java/ImproveIndexingSpeed page suggests that
indexing will be sped up by using higher values of mergeFactor, while search
speed improves with lower values. I need to create an index using multiple
batches of documents. My question is, can I begin building with
I didn't mean that that was the way to define the default
field in the schema, only a generic way to say "default field name".
The default field name seems to be "text" in your case.
If the search query doesn't say which field to search on, the word will be
searched in that field.
in the
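In schema.xml the default field is declared like this (assuming yours is named "text"):

```xml
<defaultSearchField>text</defaultSearchField>
```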
Never mind,
I understand now.
I have text.
I was searching on a string field with a space in it and with no quotes.
This causes it to search the text field (since the default search field is
text) in the schema.
Also in my schema there is an indexed field(AnimalNameText) which is not
populated whi
Hmm,
I don't have any defined in the schema.xml.
Can you give the exact syntax of how it looks in schema.xml?
I have text.
Does it mean that if the requested count is not available, it looks for the
search string in any of the text fields that are indexed?
Thanks
Ravi
Jorge Solari wrote:
>
Hello,
Yes I understand, like :
myfacet, ASC, limit 1
myfacet, DESC, limit 1
So I can get the first value and the last one.
Do you think I will get more performance with this way than using stats?
Thanks !
Vincent
: hundred thousands)... so, I have to display the range (minimum and maximum
: values) from a facet. Is there any way to do that?
: I found the new statistics components, follow the link :
: http://wiki.apache.org/solr/StatsComponent
: But it's for solr 1.4.
there haven't been many changes on the
: Yeah. I'm just not sure how much benefit in terms of data transfer this
: will save. Has anyone tested this to see if this is even worth it?
one man's trash is another man's treasure ... if you're replicating
snapshots very frequently within a single datacenter, speed is critical
and bandwidt
: I'm doing some experiments with the morelikethis functionality using the
: standard request handler to see if it also works with distributed search (I
: saw that it will not yet work with the MoreLikeThis handler,
: https://issues.apache.org/jira/browse/SOLR-788). As far as I can see, this
: also
Thanks Yonik, I'll try changing the lock type to see how that works.
Looking closer at the logs I see the app was started at Oct 28, 2008 9:49:38,
but not long afterwards it got its first exception when warming the index:
INFO: [] webapp=/solr path=/update params={} status=0 QTime=3
Oct 28,
Your query
AnimalName:German Shepard.
actually means
AnimalName:German defaultField:Shepard.
where defaultField is whatever your default search field is.
Try with
AnimalName:"German Shepard"
or
AnimalName:German AND AnimalName:Shepard.
On Thu, Oct 30, 2008 at 12:58 PM, Yerraguntla <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I have a data set w
One small correction below:
Yonik Seeley wrote:
- I've seen OOM exceptions during warming. I've changed
maxWarmingSearchers=1, which I suspect will do the trick
OOM errors are really tricky - if they happen in the wrong place, it's
hard to recover gracefully from. Correctly cleaning up after
CPU was at 100%, it was not IO bound. --wunder
On 10/30/08 8:58 AM, "christophe" <[EMAIL PROTECTED]> wrote:
> Gzipping on disk requires quite some I/O. I guess that on-the-fly zipping
> should be faster.
>
> C.
>
> Walter Underwood wrote:
>> About a factor of 2 on a small, optimized index. Gzipp
Gzipping on disk requires quite some I/O. I guess that on-the-fly zipping
should be faster.
C.
Walter Underwood wrote:
About a factor of 2 on a small, optimized index. Gzipping took 20 seconds,
so it isn't free.
$ cd index-copy
$ du -sk
134336 .
$ gzip *
$ du -sk
62084 .
wunder
On 10/30/0
Hi,
I have a data set with the following schema.
PersonName:Text
AnimalName:Text
PlantName:Text
< lot more attributes about each of them, like nick name, animal nick
name, plant generic name etc., which are mutually exclusive>
UniqueId:long
For each of the document data set, there will be on
About a factor of 2 on a small, optimized index. Gzipping took 20 seconds,
so it isn't free.
$ cd index-copy
$ du -sk
134336 .
$ gzip *
$ du -sk
62084 .
wunder
On 10/30/08 8:20 AM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:
> Yeah. I'm just not sure how much benefit in terms of data tran
On Thu, Oct 30, 2008 at 2:06 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> I've been running solr 1.3 on an ec2 instance for a couple of weeks and I've
> had some stability issues. It seems like I need to bounce the app once a day.
> That I could live with and ultimately maybe troubleshoot, but wh
Yeah. I'm just not sure how much benefit in terms of data transfer this will
save. Has anyone tested this to see if this is even worth it?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Erik Hatcher <[EMAIL PROTECTED]>
> To: solr-user@l
I realize you said caching won't help because the searches are
different, but what about Document caching? Is every document returned
different? What's your hit rate on the Document cache? Can you throw
memory at the problem by increasing Document cache size?
I ask all this, as the Document cache
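The document cache is configured in solrconfig.xml; a sketch (the sizes are placeholders to tune against your hit rate):

```xml
<!-- documentCache holds retrieved stored fields; it cannot be autowarmed
     because its keys are internal Lucene docids -->
<documentCache class="solr.LRUCache"
               size="16384"
               initialSize="4096"
               autowarmCount="0"/>
```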
+1 - the GzipServletFilter is the way to go.
Regarding request handlers reading HTTP headers, yeah,... this will
improve, for sure.
Erik
On Oct 30, 2008, at 12:18 AM, Chris Hostetter wrote:
: You are partially right. Instead of the HTTP header, we use a request
: parameter. (Re
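Such a filter is typically wired into the webapp's web.xml; a sketch (the filter class name is a placeholder for whichever gzip filter you deploy):

```xml
<filter>
  <filter-name>gzip</filter-name>
  <filter-class>com.example.GzipServletFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>gzip</filter-name>
  <url-pattern>/select/*</url-pattern>
</filter-mapping>
```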
Grant Ingersoll schrieb:
Have you gone through
http://wiki.apache.org/solr/SolrPerformanceFactors ?
Can you explain a little more about your testcase, maybe even share
code? I only know a little PHP, but maybe someone else who is better
versed might spot something.
I just wrote my JSP scrip
On Thu, Oct 30, 2008 at 7:28 AM, Jeryl Cook <[EMAIL PROTECTED]> wrote:
> is there a limit on the number of facets that i can create in
> Solr?(dynamically generated facets.)
Not really. It's practically limited by CPU and memory, which can vary
widely with what the facet fields look like (number o
Generally, you need to get your head out of the database world and into
the search world to be successful with Lucene. For instance, one
of the cardinal tenets of database design is to normalize your
data. It goes against every instinct to *denormalize* your data when
creating a Lucene index expli
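Concretely, denormalizing means each Solr document carries the joined fields flat; an add document might look like (the field names are hypothetical):

```xml
<add>
  <doc>
    <field name="id">42</field>
    <field name="employee_name">Jane Doe</field>
    <!-- pulled in from the department table at index time -->
    <field name="dept_name">Finance</field>
    <field name="is_manager">true</field>
  </doc>
</add>
```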
For those attending ApacheCon in New Orleans next week, the Lucene
Search and Machine Learning Birds of a Feather (BOF) will be held
Wednesday night. Please indicate your interest at: http://wiki.apache.org/apachecon/BirdsOfaFeatherUs08
Also, note there are a number of Lucene/Solr/Mahout tal
Not really. If you explain your use case it will be more clear.
On Thu, Oct 30, 2008 at 6:20 PM, Raghunandan Rao
<[EMAIL PROTECTED]> wrote:
> Thanks Noble.
>
> So you mean to say that I need to create a view according to my query and
> then index on the view and fetch?
>
> -Original Messag
Have you gone through http://wiki.apache.org/solr/
SolrPerformanceFactors ?
Can you explain a little more about your testcase, maybe even share
code? I only know a little PHP, but maybe someone else who is better
versed might spot something.
On Oct 30, 2008, at 8:39 AM, Kraus, Ralf | pixe
All search requests are different, so caching doesn't do it for me.
P.S. If caching is not helping you, turn it off. It costs to populate /
maintain the cache, so if it's not helping, it's only hurting.
Thanks Noble.
So you mean to say that I need to create a view according to my query and then
index on the view and fetch?
-Original Message-
From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 30, 2008 6:16 PM
To: solr-user@lucene.apache.org
Subject: Re:
On Thu, Oct 30, 2008 at 8:39 AM, Kraus, Ralf | pixelhouse GmbH
<[EMAIL PROTECTED]> wrote:
> Okay okay :-) I am writing a new JSP handler for my requests as we speak :-)
> I really hope performance will be better than with {wt=javabin}
What are your requirements for requests/sec, and how many
hi,
There are two sides to this.
1. Indexing (getting data into Solr): SolrJ or DataImportHandler can be
used for this.
2. Querying (getting data out of Solr): here you do not have the choice
of joining multiple tables. There is only one index in Solr.
On Thu, Oct 30, 2008 at 5:34 PM, Raghunandan Ra
Mark Miller schrieb:
Kraus, Ralf | pixelhouse GmbH wrote:
Mark Miller schrieb:
Right now I am stress testing the performance and sending 2500 search
requests via the JSON protocol from my PHPUnit testcase.
All search requests are different, so caching doesn't do it for me.
Right now our old Lucene
On Thu, Oct 30, 2008 at 1:02 AM, Barnett, Jeffrey
<[EMAIL PROTECTED]> wrote:
> I thought it was turned off already. ( Lucene vs Solr ?) Where do I make
> this change?
Comment out this part in your solrconfig.xml
2
4
-Yonik
> -Original Message-
> From: [EMAIL PROTECTED] [
On Thu, Oct 30, 2008 at 5:22 PM, Kraus, Ralf | pixelhouse GmbH <
[EMAIL PROTECTED]> wrote:
> Right now I am using these PHP classes to send and receive my requests:
>
> - Apache_Solr_Service.php
> - Responce.php
>
> It has the advantage that I don't need to write extra JSP or Java code...
>
Un
Hi,
I am trying to use Solrj for my web application. I am indexing a table
using the @Field annotation tag. Now I need to index or query multiple
tables. Like, get all the employees who are managers in Finance
department (interacting with 3 entities). How do I do that?
Does anyone have any ide
Kraus, Ralf | pixelhouse GmbH wrote:
Mark Miller schrieb:
Right now I am stress testing the performance and sending 2500 search
requests via the JSON protocol from my PHPUnit testcase.
All search requests are different, so caching doesn't do it for me.
Right now our old Lucene-JSPs are about 4 time
Mark Miller schrieb:
Right now I am stress testing the performance and sending 2500 search
requests via the JSON protocol from my PHPUnit testcase.
All search requests are different, so caching doesn't do it for me.
Right now our old Lucene-JSPs are about 4 times faster than my Solr
solution :-(
Shalin Shekhar Mangar schrieb:
Well, with Lucene it is an API call in the same JVM in the same web
application. With Solr, you are making HTTP calls across the network,
serializing requests and de-serializing responses. So the comparison is not
exactly apples to apples.
Look at what Solr offers
Shalin Shekhar Mangar wrote:
On Thu, Oct 30, 2008 at 4:12 PM, Kraus, Ralf | pixelhouse GmbH <
[EMAIL PROTECTED]> wrote:
I am validating Solr 1.3 now for about 3 weeks... My goal is to migrate
from Lucene to Solr because of the much better plugins and search
functions.
Very nice!
is there a limit on the number of facets that i can create in
Solr?(dynamically generated facets.)
--
Jeryl Cook
/^\ Pharaoh /^\
http://pharaohofkush.blogspot.com/
"Whether we bring our enemies to justice, or bring justice to our
enemies, justice will be done."
--George W. Bush, Address to a Join
On Thu, Oct 30, 2008 at 4:12 PM, Kraus, Ralf | pixelhouse GmbH <
[EMAIL PROTECTED]> wrote:
> I am validating Solr 1.3 now for about 3 weeks... My goal is to migrate
> from Lucene to Solr because of the much better plugins and search
> functions.
Very nice!
> Right now I am stress testing the p
Hello,
I am validating Solr 1.3 now for about 3 weeks... My goal is to migrate
from Lucene to Solr because of the much better plugins and search functions.
Right now I am stress testing the performance and sending 2500 search
requests via the JSON protocol from my PHPUnit testcase.
All search r
Hello,
I'm using Solr 1.3. I would like to get only minimum and maximum values from
a facet.
In fact I'm using a range to get the results : [value TO value], and I don't
need to get the facets list in my XML results (which could be more than
hundred thousands)... so, I have to display the range (
Hi Lars,
Thanks for it: it works great.
BR
Christophe
Lars Kotthoff wrote:
I'm doing the following query:
q=text:abc AND type:typeA
And I ask to return highlighting (query.setHighlight(true);). The search
term for field "type" (typeA) is also highlighted in the "text" field.
Anyway to avoid