date:20081105

Re: Search in SOLR multi cores in a single request

2008-11-05 Thread Shalin Shekhar Mangar

The idea behind multicore is that you use cores when you have completely different types of documents (basically multiple schemas). You might want to look at Distributed Search, which allows for sharding the data across multiple servers and searching them all. http://wiki.apache.org/solr/Distribute
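
As a rough illustration of the Distributed Search suggestion above, the sketch below builds a select URL that uses Solr's `shards` request parameter to send one query to several servers. The host names, ports, and query string are hypothetical placeholders, not taken from the thread.

```python
from urllib.parse import urlencode


def distributed_query_url(base_url, shards, query, rows=10):
    """Build a Solr select URL that searches several shards in one request."""
    params = {
        "q": query,
        "rows": rows,
        # Comma-separated host:port/path entries, written without "http://".
        "shards": ",".join(shards),
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical hosts, for illustration only.
url = distributed_query_url(
    "http://localhost:8983/solr",
    ["localhost:8983/solr", "localhost:7574/solr"],
    "title:solr",
)
print(url)
```

The request goes to one server, which forwards the query to every entry listed in `shards` and merges the results.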

Search in SOLR multi cores in a single request

2008-11-05 Thread gurudev

I have been reading the SOLR 1.3 wiki, which says that to fetch documents from each core in a multi-core setup we need to query each core independently. I was under the impression that the SOLR multi-core feature might be using Lucene's MultiSearcher to search across multiple cores. Anyone with

Re: Large Data Set Suggestions

2008-11-05 Thread Noble Paul നോബിള്‍ नोब्ळ्

DIH is likely to be faster than SolrJ because it does not have the overhead of an HTTP request. What is your data source? I am assuming it is XML. SolrJ cannot index XML directly; you may need to read the docs from the XML before SolrJ can index them. --Noble On Wed, Nov 5, 2008 at

Re: Regex Transformer Error

2008-11-05 Thread Noble Paul നോബിള്‍ नोब्ळ्

did you try w/o escaping the '<' characters? On Wed, Nov 5, 2008 at 11:48 PM, Ahmed Hammad <[EMAIL PROTECTED]> wrote: > Hi, > > I am using Solr 1.3 data import handler. One of my table fields has html > tags, I want to strip it of the field text. So obviously I need the Regex > Transformer. > > I

Bias score proximity for a given field

2008-11-05 Thread Nguyen, Joe

Hi Is there a way to specify a range boosting for a numeric/date field? Suppose I have articles whose published dates are in 2005,...,2008,...,2011. I want to boost the score of 2008 article by 20%. Articles whose published dates 3-year distance from 2008 article would be boosted by 0%, e.g. 2
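
One way to read the requirement in this question is a boost that peaks at the target year and falls to zero at a distance of three years. The sketch below computes such a multiplier with a linear falloff. The linear shape and the function and parameter names are assumptions for illustration; the message does not say how the boost should vary between the two endpoints, and this is not a Solr query.

```python
def date_boost(year, center=2008, max_boost=0.20, width=3):
    """Return a score multiplier: 1 + max_boost at `center`, 1.0 at `width` years away.

    The linear falloff between those two points is an assumption.
    """
    distance = abs(year - center)
    falloff = max(0.0, 1.0 - distance / width)
    return 1.0 + max_boost * falloff


for y in (2005, 2007, 2008, 2009, 2011):
    print(y, round(date_boost(y), 3))
```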

RE: Regex Transformer Error

2008-11-05 Thread Norskog, Lance

There is a nice HTML stripper inside Solr. "solr.HTMLStripStandardTokenizerFactory" -Original Message- From: Ahmed Hammad [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 05, 2008 10:43 AM To: solr-user@lucene.apache.org Subject: Re: Regex Transformer Error Hi, It works with the att

Re: How to use multicore feature in JBOSS

2008-11-05 Thread Norberto Meijome

On Tue, 4 Nov 2008 23:45:40 -0800 (PST) con <[EMAIL PROTECTED]> wrote: > But for the first question, I am still not clear. > I think to use the multicore feature we should inform the server. In the > Jetty server, we are starting the server using: java > -Dsolr.solr.home=multicore -jar start.jar

Re: Large Data Set Suggestions

2008-11-05 Thread souravm

Hi Fergus, Do the 6.6M docs reside on a single box (node) or on multiple boxes? Do you use distributed search? Regards, Sourav - Original Message - From: Fergus McMenemie <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Wed Nov 05 08:21:45 2008 Subject: Re: Large Data Set Sugge

Re: Throughput Optimization

2008-11-05 Thread Yonik Seeley

On Wed, Nov 5, 2008 at 5:18 PM, wojtekpia <[EMAIL PROTECTED]> wrote: > I'd like to integrate this improvement into my deployment. Is it just a > matter of getting the latest Lucene jars (Lucene nightly build)? You need to apply this source code patch to Solr: https://issues.apache.org/jira/browse/

Re: Throughput Optimization

2008-11-05 Thread wojtekpia

I'd like to integrate this improvement into my deployment. Is it just a matter of getting the latest Lucene jars (Lucene nightly build)? Yonik Seeley wrote: > > You're probably hitting some contention with the locking around the > reading of index files... this has been recently improved in Luc

Highlighting Oddities

2008-11-05 Thread Chris Harris

I'm testing out the default (gap) fragmenter with some simple, single-word queries on a patched 1.3.0 release populated with some real-world data. (I think the primary quirk in my setup is that I'm using ShingleFilterFactory to put word bigrams (aka shingles) into my index. I was worried that this

Re: Throughput Optimization

2008-11-05 Thread Yonik Seeley

On Wed, Nov 5, 2008 at 2:44 PM, wojtekpia <[EMAIL PROTECTED]> wrote: > I'll try changing my other caches to LRUCache and observe performance. > Interestingly, the FastLRUCache has given me a ~10% increase in performance, > much lower than I've read on the SOLR-667 thread. That's better than I woul

Re: Throughput Optimization

2008-11-05 Thread Yonik Seeley

On Wed, Nov 5, 2008 at 2:29 PM, Feak, Todd <[EMAIL PROTECTED]> wrote: > Yonik said something about the FastLRUCache giving the most gain for > high hit-rates and the LRUCache being faster for low hit-rates. Right, for single-threaded requests. FastLRUCache has faster gets and slower puts (on aver

RE: Throughput Optimization

2008-11-05 Thread wojtekpia

I'll try changing my other caches to LRUCache and observe performance. Interestingly, the FastLRUCache has given me a ~10% increase in performance, much lower than I've read on the SOLR-667 thread. Would compressing some of my stored fields significantly improve performance? Most of my stored fie

RE: Throughput Optimization

2008-11-05 Thread Feak, Todd

Yonik said something about the FastLRUCache giving the most gain for high hit-rates and the LRUCache being faster for low hit-rates. It's in his Nov 1 comment on SOLR-667. I'm not sure if anything changed since then, as it's an active issue, but you may want to try the LRUCache for your query cache

Re: Retrieving a non-indexed but stored field

2008-11-05 Thread Yonik Seeley

On Wed, Nov 5, 2008 at 2:09 PM, Andrew Nagy <[EMAIL PROTECTED]> wrote: > Nope - I made the schema change and then indexed all of my content. > > I can confirm that the URL string is included, cause when I change my schema > back to have both stored and indexed, it shows the URL data in the search

Re: Retrieving a non-indexed but stored field

2008-11-05 Thread Erick Erickson

What's the query you're hitting SOLR with? If it's on the URL field, that would match your behavior, i.e. if you're getting results based upon whether you index the field or not, it would be neatly explained by whether you're *searching* on that field. Best [EMAIL PROTECTED] P.S. Luke might h

RE: Retrieving a non-indexed but stored field

2008-11-05 Thread Andrew Nagy

Nope - I made the schema change and then indexed all of my content. I can confirm that the URL string is included, cause when I change my schema back to have both stored and indexed, it shows the URL data in the search results. When I change it to stored and not indexed, no data is returned. A

RE: Throughput Optimization

2008-11-05 Thread wojtekpia

My documentCache hit rate is ~.7, and my queryCache is ~.03. I'm using FastLRUCache on all 3 of the caches. Feak, Todd wrote: > > What are your other cache hit rates looking like? > Which caches are you using the FastLRUCache on? > > -Todd Feak > > -Original Message- > From: wojtekpia

Re: Throughput Optimization

2008-11-05 Thread wojtekpia

Where is the alt directory in the source tree (or what is the JIRA issue number)? I'd like to apply this patch and re-run my tests. Does changing the lockType in solrconfig.xml address this issue? (My lockType is the default - single). markrmiller wrote: > > The latest alt directory patch uses

Re: Redirecting output of post.jar and start.jar

2008-11-05 Thread Ryan McKinley

On Nov 5, 2008, at 7:30 AM, Muhammed Sameer wrote: Salaam, When I run post.jar or start.jar it throws a lot of information on the screen. I even tried redirecting the output, but that does not seem to help. I have configured a cron job to run post.jar every 2 minutes to keep the index updat

Re: Regex Transformer Error

2008-11-05 Thread Ahmed Hammad

Hi, It works with the attribute regex="<(.|\n)*?>" Sorry for the disturbance. Regards, ahmd On Wed, Nov 5, 2008 at 8:18 PM, Ahmed Hammad <[EMAIL PROTECTED]> wrote: > Hi, > > I am using Solr 1.3 data import handler. One of my table fields has html > tags, I want to strip it of the field text.
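
To show what the pattern quoted in this message matches, the sketch below applies the same regular expression with Python's `re` module. This is only an illustration of the pattern itself; the DataImportHandler RegexTransformer runs on Java's regex engine, and the sample strings here are made up.

```python
import re

# Same pattern as in the message: "<", then any characters (including
# newlines) matched lazily, up to the first ">".
TAG_RE = re.compile(r"<(.|\n)*?>")


def strip_tags(text):
    """Remove anything that looks like an HTML tag, leaving the text between tags."""
    return TAG_RE.sub("", text)


print(strip_tags("<p>Hello <b>world</b></p>"))
print(strip_tags("<a\nhref='x'>link</a>"))
```

The `(.|\n)` alternation is what lets a tag span a line break, since `.` alone does not match a newline by default.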

Re: Retrieving a non-indexed but stored field

2008-11-05 Thread Yonik Seeley

On Wed, Nov 5, 2008 at 11:47 AM, Andrew Nagy <[EMAIL PROTECTED]> wrote: > Sorry for the late follow-up. I am doing this, but get nothing back. Did you change the field to "stored" in the schema after you added the document? I've never seen anyone having this problem, so perhaps verify that you ar

Regex Transformer Error

2008-11-05 Thread Ahmed Hammad

Hi, I am using Solr 1.3 data import handler. One of my table fields has html tags, I want to strip it of the field text. So obviously I need the Regex Transformer. I added transformer="RegexTransformer" attribute to my entity and a new field with: Every thing works fine. The text is replace wi

Re: question about Solr directories on mounted file systems

2008-11-05 Thread Walter Underwood

I do not recommend using Lucene or Solr on a mounted file system. My implementation was 100X faster after I moved it from NFS to local disk. --wunder On 11/5/08 10:01 AM, "Jim Adams" <[EMAIL PROTECTED]> wrote: > I have an application that is using SOLR on a mounted file system. However, > machin

question about Solr directories on mounted file systems

2008-11-05 Thread Jim Adams

I have an application that is using SOLR on a mounted file system. However, machine or human error can sometimes unmount the file system. This causes Solr to write index files to a different area from the index I am using. This also means that the index instance becomes corrupt, because some ent

Need help with Parsing user Query

2008-11-05 Thread Rajiv2

Hi, I need help with solving a particular problem I'm having. I have a one box search website where users can type "cosmetic surgery houston tx" or "cosmetic surgery 22151". I need to come up with a reliable way to parse out the geo terms/or zipcodes from the user query so that I can submit a qu
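
For the ZIP code half of this question, a regular expression is often enough. The sketch below pulls a standalone five-digit token out of the query and returns the remaining terms. It is an assumption-laden starting point: it treats any isolated five-digit number as a ZIP code, and it does not attempt to recognize city or state names such as "houston tx", which would need a lookup table of place names.

```python
import re

# A standalone run of exactly five digits.
ZIP_RE = re.compile(r"\b(\d{5})\b")


def split_zip(query):
    """Return (remaining_terms, zip_code); zip_code is None if no ZIP is found."""
    match = ZIP_RE.search(query)
    if not match:
        return " ".join(query.split()), None
    remainder = query[:match.start()] + query[match.end():]
    return " ".join(remainder.split()), match.group(1)


print(split_zip("cosmetic surgery 22151"))
print(split_zip("cosmetic surgery houston tx"))
```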

RE: Throughput Optimization

2008-11-05 Thread Feak, Todd

What are your other cache hit rates looking like? Which caches are you using the FastLRUCache on? -Todd Feak -Original Message- From: wojtekpia [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 05, 2008 8:15 AM To: solr-user@lucene.apache.org Subject: Re: Throughput Optimization Yes,

RE: Retrieving a non-indexed but stored field

2008-11-05 Thread Andrew Nagy

Sorry for the late follow-up. I am doing this, but get nothing back. Can anyone replicate this problem? Andrew From: Erik Hatcher [EMAIL PROTECTED] Sent: Tuesday, October 14, 2008 12:36 PM To: solr-user@lucene.apache.org Subject: Re: Retrieving a non-inde

Re: Throughput Optimization

2008-11-05 Thread Yonik Seeley

On Wed, Nov 5, 2008 at 11:14 AM, wojtekpia <[EMAIL PROTECTED]> wrote: > Yes, I am seeing evictions. I've tried setting my filterCache higher, but > then I start getting Out Of Memory exceptions. My filterCache hit ratio is > > .99. It looks like I've hit a RAM bound here. Evictions on the filterCa

RE: Throughput Optimization

2008-11-05 Thread Feak, Todd

If you are seeing < 90% CPU usage and are not IO (File or Network) bound, then you are most probably bound by lock contention. If your CPU usage goes down as you throw more threads at the box, that's an even bigger indication that that is the issue. A good profiling tool should help you locate thi

Re: Throughput Optimization

2008-11-05 Thread christophe

Does the number of searchers affect CPU usage? I'm not totally sure about it, but I think some versions of Tomcat did not scale well beyond 4 CPUs (or 4 cores). C. wojtekpia wrote: Yes, I am seeing evictions. I've tried setting my filterCache higher, but then I start getting Out Of Memory exc

RE: Retrieving a non-indexed but stored field

2008-11-05 Thread Andrew Nagy

Sorry for the late follow-up. I am doing this, but get nothing back. Can anyone replicate this problem? Andrew From: Erik Hatcher [EMAIL PROTECTED] Sent: Tuesday, October 14, 2008 12:36 PM To: solr-user@lucene.apache.org Subject: Re: Retrieving a non-inde

Re: Large Data Set Suggestions

2008-11-05 Thread Fergus McMenemie

>Greetings! > >I've been asked to do some indexing performance testing on Solr 1.3 >using large XML document data sets (10M-60M docs) with DIH versus SolrJ. > > >Does anyone have any suggestions where I might find a good data set this >size? > >I saw the wikipedia dump reference in the DIH wik

Re: Throughput Optimization

2008-11-05 Thread wojtekpia

Yes, I am seeing evictions. I've tried setting my filterCache higher, but then I start getting Out Of Memory exceptions. My filterCache hit ratio is > .99. It looks like I've hit a RAM bound here. I ran a test without faceting. The response times / throughput were both significantly higher, there

Large Data Set Suggestions

2008-11-05 Thread Steven Anderson

Greetings! I've been asked to do some indexing performance testing on Solr 1.3 using large XML document data sets (10M-60M docs) with DIH versus SolrJ. Does anyone have any suggestions where I might find a good data set this size? I saw the wikipedia dump reference in the DIH wiki, but tha

Re: Throughput Optimization

2008-11-05 Thread Mark Miller

The latest alt directory patch uses it. - Mark On Nov 5, 2008, at 9:25 AM, "Yonik Seeley" <[EMAIL PROTECTED]> wrote: You're probably hitting some contention with the locking around the reading of index files... this has been recently improved in Lucene for non-Windows boxes, and we're integra

Fwd: [Solr Wiki] Update of "SolrResources" by GrantIngersoll

2008-11-05 Thread Shalin Shekhar Mangar

Thank you Grant, very nicely written! http://www.ibm.com/developerworks/java/library/j-solr-update/?S_TACT=105AGX01&S_CMP=HP -- Forwarded message -- From: Apache Wiki <[EMAIL PROTECTED]> Date: Wed, Nov 5, 2008 at 7:25 PM Subject: [Solr Wiki] Update of "SolrResources" by GrantInger

Re: Throughput Optimization

2008-11-05 Thread Yonik Seeley

You're probably hitting some contention with the locking around the reading of index files... this has been recently improved in Lucene for non-Windows boxes, and we're integrating that into Solr (should def be in the next release). -Yonik On Tue, Nov 4, 2008 at 9:01 PM, wojtekpia <[EMAIL PROTECT

Trying to run solr-1.3.0 under tomcat 5.5.20 on OS X 10.5.5

2008-11-05 Thread Fergus McMenemie

Hello all, I downloaded everything and set it up as per the instructions, and while it does run under jetty, I can not get it to start under tomcat at all. I get the following errors. This is with solrconfig.xml straight from the tgz file. HTTP Status 500 - Severe errors in solr configuratio

Redirecting output of post.jar and start.jar

2008-11-05 Thread Muhammed Sameer

Salaam, When I run post.jar or start.jar it throws a lot of information on the screen. I even tried redirecting the output, but that does not seem to help. I have configured a cron job to run post.jar every 2 minutes to keep the index updated, and each time it runs it throws a lot of stuff on th

Re: Need to write a start.jar file

2008-11-05 Thread Muhammed Sameer

Salaam, Thanks for the response, I'll only change this if I need any customization done Regards, Muhammed Sameer --- On Wed, 11/5/08, Erik Hatcher <[EMAIL PROTECTED]> wrote: > From: Erik Hatcher <[EMAIL PROTECTED]> > Subject: Re: Need to write a start.jar file > To: solr-user@lucene.apache.org >

Re: How to handle large field values.

2008-11-05 Thread Luca Molteni

This worked, thank you very much. Any idea on how I can help documenting it? Can I write in the wiki? maybe in http://wiki.apache.org/solr/SolrConfigXml#head-13e17f74dde0751b8a7cfe539f631d58029b8080 L.M. 2008/11/5 Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]>: > the fl must have the unique i

Re: How to handle large field values.

2008-11-05 Thread Noble Paul നോബിള്‍ नोब्ळ्

The fl must also include the unique id field, because if fl is specified, only the listed fields are returned. On Wed, Nov 5, 2008 at 4:36 PM, Luca Molteni <[EMAIL PROTECTED]> wrote: > Uhm, this works great when using only one server, because I can > specify the fields in the configuration file, but It g
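
The advice in this reply can be captured in a small helper that builds the `fl` value and always includes the unique key. This is a minimal sketch: the field names `id`, `title`, and `score` are hypothetical, and the actual unique key is whatever the schema declares.

```python
def build_field_list(fields, unique_key="id"):
    """Return a comma-separated fl value that always contains the unique key."""
    ordered = list(fields)
    if unique_key not in ordered:
        # Put the unique key first so it is never dropped from the response.
        ordered.insert(0, unique_key)
    return ",".join(ordered)


print(build_field_list(["title", "score"]))
print(build_field_list(["id", "title"]))
```

The resulting string can be passed as the `fl` request parameter or placed in the search handler defaults in solrconfig.xml.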

ndis push search results to top

2008-11-05 Thread Simon Collins

Hi, Is there an easy way (bearing in mind I'm still very new to this Solr lark) to push certain items to the top of search results? For instance, if customers are searching for boots on our site, I might want to push higher-margin products to the top of the results, or push popular items

Re: How to handle large field values.

2008-11-05 Thread Luca Molteni

Uhm, this works great when using only one server, because I can specify the fields in the configuration file, but it gives me a nice NullPointerException when using distributed shards: HTTP Status 500 - null java.lang.NullPointerException at org.apache.solr.handler.component.QueryComponent.return

Re: Throughput Optimization

2008-11-05 Thread Erik Hatcher

One quick question: are you seeing any evictions from your filterCache? If so, it isn't set large enough to handle the faceting you're doing. Erik On Nov 4, 2008, at 8:01 PM, wojtekpia wrote: I've been running load tests over the past week or 2, and I can't figure out my sy

Re: Need to write a start.jar file

2008-11-05 Thread Noble Paul നോബിള്‍ नोब्ळ्

can you tell what exactly you wish to customize? On Wed, Nov 5, 2008 at 10:46 AM, Muhammed Sameer <[EMAIL PROTECTED]> wrote: > Salaam, > > I read somewhere that it is better to write a new start.jar file than use the > one that is provided within the example directory, can someone please guide

Re: How to handle large field values.

2008-11-05 Thread Noble Paul നോബിള്‍ नोब्ळ्

The 'fl' parameter can be added to the defaults for your search handler in solrconfig.xml. On Wed, Nov 5, 2008 at 3:22 PM, Luca Molteni <[EMAIL PROTECTED]> wrote: > Hello everybody, > > dealing with very large fields, let's say text documents, I found that there > is a global slowness (on my comput

Re: Need to write a start.jar file

2008-11-05 Thread Erik Hatcher

I've never heard of this need to provide a customized start.jar. Could you send us a pointer to where you read that if you still have that available? But, no, there is no need to provide a different start.jar. However, Jetty is really just one example of how you deploy Solr - any modern

How to handle large field values.

2008-11-05 Thread Luca Molteni

Hello everybody, dealing with very large fields, let's say text documents, I found that there is a general slowness (on my computer) in returning those fields. Since most of the time what we want is a "highlight" value of the field and not the entire field, I thought that we can omit these field f

51 matches
