RE: solr highlighting

2008-05-22 Thread Kevin Xiao
Just in case anyone wants to know: I figured out that you have to set uniqueKey stored="true" for highlighting to work. Thanks for everyone's help. Thanks, - Kevin -Original Message- From: Kevin Xiao [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 13, 2008 11:22 PM To: solr-user@lucene.apac

Re: solr sorting problem

2008-05-22 Thread pmg
I forgot to mention that I made changes to schema after indexing. pmg wrote: > > I have problem sorting solr results. Here is my solr config > > > > stored="true"/> > > > > > > > > search query > > select/?&rows=100&start=0&q=artistId:100346%20AND%20type:trac

solr sorting problem

2008-05-22 Thread pmg
I have a problem sorting Solr results. Here is my Solr config search query select/?&rows=100&start=0&q=artistId:100346%20AND%20type:track&sort=alphaTrackSort%20desc&fl=track does not sort by track. I don't understand what is missing from the config -- View this message i
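For anyone reproducing this request from a script, here is a minimal Python sketch of how the query string could be assembled. The field names and parameter values come from the message above; the base URL `http://localhost:8983/solr` and the helper name `build_select_url` are assumptions made for illustration. It only builds the request. As the follow-up notes, the schema was changed after indexing, so the index may need rebuilding before the sort field has any effect.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, sort_field, descending=True,
                     fields="track", rows=100, start=0):
    """Build a Solr select URL with an explicit sort clause.

    base_url is an assumed deployment address, not taken from the thread.
    """
    direction = "desc" if descending else "asc"
    params = {
        "rows": rows,
        "start": start,
        "q": query,
        # Solr expects "<field> <direction>"; urlencode escapes the space.
        "sort": f"{sort_field} {direction}",
        "fl": fields,
    }
    return f"{base_url}/select/?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr",          # assumed host and port
    "artistId:100346 AND type:track",
    "alphaTrackSort",
)
print(url)
```

Letting `urlencode` do the escaping avoids hand-writing `%20` sequences and makes it harder to drop the space between the sort field and its direction.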

Re[3]: the time factor

2008-05-22 Thread JLIST
Hello Chris, > : If this is how it works, it sounds like the bq will be used first > : to get a result set, then the result set will be sorted by q > : (relevance)? > no. bq doesn't influence what matches -- that's q -- bq only influence > the scores of existing matches if they also match the bq

Re: Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread alan.
Thanks for the new jar. I ended up building solr+dataimporthandler but an updated jar is a blessing for folks trying DataImportHandler. I've updated the example-solr-home.jar on the DataImportHandler wiki page with the latest code. Please let us know if you find any issues. -- View this messa

Re[2]: the time factor

2008-05-22 Thread Chris Hostetter
: I'm not quite understanding how boost query works though. How does it : "influence" the score exactly? Does it just simply append to the "q" : param? From the wiki: Essentially yes, but documents must match at least one clause of the "q", matching the "bq" is optional (and when it happens,
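The distinction described here (q decides what matches, bq only adjusts the score of documents that already match) can be sketched with a toy model in Python. This is not Lucene's scoring formula: the flat base score of 1.0, the additive boost, and the function name `rank` are all assumptions made for illustration.

```python
def rank(docs, matches_q, matches_bq, boost=1.0):
    """Return document ids ordered by a toy score.

    Only documents matching q are returned. Matching bq is optional and
    merely raises the score of a document that q already matched.
    """
    scored = []
    for doc in docs:
        if not matches_q(doc):
            continue  # bq alone can never bring a document into the results
        score = 1.0 + (boost if matches_bq(doc) else 0.0)
        scored.append((score, doc["id"]))
    # Highest score first; ties broken by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical documents: "recent" stands in for whatever the bq targets.
docs = [
    {"id": "a", "text": "solr tips", "recent": False},
    {"id": "b", "text": "solr news", "recent": True},
    {"id": "c", "text": "cooking", "recent": True},
]
ranked = rank(docs,
              matches_q=lambda d: "solr" in d["text"],
              matches_bq=lambda d: d["recent"])
print(ranked)
```

Document "c" matches the boost condition but not the main query, so it is absent from the results; "b" matches both and is ranked ahead of "a".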

Re: DocSet to BitSet

2008-05-22 Thread Kevin Osborn
In v1.3, it is public. In v1.2, it is still protected. - Original Message From: Chris Hostetter <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, May 22, 2008 1:50:22 PM Subject: Re: DocSet to BitSet : That is more or less what I did. Once I found that function, it ju

Re: DocSet to BitSet

2008-05-22 Thread Chris Hostetter
: That is more or less what I did. Once I found that function, it just : took a small patch to expose that functionality, and then the problem : was solved. I'm not sure why you needed a patch at all ... SolrIndexSearcher.getDocSet(List) and getDocSet(Query) are both public methods. as is DocS

Re: DocSet to BitSet

2008-05-22 Thread Kevin Osborn
That is more or less what I did. Once I found that function, it just took a small patch to expose that functionality, and then the problem was solved. - Original Message From: Chris Hostetter <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, May 22, 2008 12:32:56 PM Su

Re: DocSet to BitSet

2008-05-22 Thread Chris Hostetter
: One of the primary reasons that I was doing it this way is because I am : sending several filters, one is a big docset and others are BooleanQuery : objects (products in stock, etc.). : Since, the interface for SolrIndexSearcher.getDocListAndSet supports : only (Query, DocSet,...) or (Query,

RE: Indexing HTML Content

2008-05-22 Thread Lance Norskog
The HTMLStripReader tool worked very well for us. It handles garbled HTML well. The only hole we found was that it does not find alt-text attributes for images. Also, note that this code is written as a Java Reader class rather than a Solr class. This makes it useful for other projects. Given the

Re: SOLR OOM (out of memory) problem

2008-05-22 Thread Mike Klaas
On 22-May-08, at 4:27 AM, gurudev wrote: Hi Rong, My cache hit ratio are: filtercache: 0.96 documentcache:0.51 queryresultcache:0.58 Note that you may be able to reduce the _size_ of the document cache without materially affecting the hit rate, since typically some documents are much m
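The point about cache size versus hit rate can be illustrated with a small LRU simulation in Python. This is a toy model, not Solr's document cache. The access pattern (ten frequently requested documents interleaved with documents that are each requested once) is an assumption chosen for illustration, as is the function name `lru_hit_rate`.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Replay `accesses` against an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(accesses) if accesses else 0.0


# Assumed workload: 10 "hot" documents, each followed by a one-off request.
accesses = []
for i in range(1000):
    accesses.append(i % 10)     # hot document, requested repeatedly
    accesses.append(1000 + i)   # cold document, requested exactly once

for capacity in (10, 20, 2000):
    print(capacity, lru_hit_rate(accesses, capacity))
```

Under this workload a cache of 20 entries gets the same hit rate as one of 2000, because the extra entries only hold documents that are never requested again. A cache of 10 entries is too small to keep the hot documents resident at all.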

Re: Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread Otis Gospodnetic
Julio, no worries, I'm 99% sure DIH is going to be in 1.3 and be in a nightly in a week or two. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Julio Castillo <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, May 22,

Re: Indexing HTML Content

2008-05-22 Thread Otis Gospodnetic
John, Solr already has some of this stuff: $ ff \*HTML\*java ./src/test/org/apache/solr/analysis/HTMLStripReaderTest.java ./src/java/org/apache/solr/analysis/HTMLStripStandardTokenizerFactory.java ./src/java/org/apache/solr/analysis/HTMLStripReader.java ./src/java/org/apache/solr/analysis/HTMLStr

Re: SOLR OOM (out of memory) problem

2008-05-22 Thread Otis Gospodnetic
Hi, Seriously, try making that monster document cache smaller. Sure, there will be more evictions and more cache misses, but at least you will be less likely to get OOMs :). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: gurudev <[EMAIL

Re: How to limit number of pages per domain

2008-05-22 Thread Otis Gospodnetic
You mean you don't understand the difference. Here is an example of each: 1) field collapsing: http://www.google.com/search?q=lucene+in+action Note how Google figures out that the first 2 hits are from the same site (manning.com) and after showing those 2 hits offer "More results from www.mann
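To make the contrast concrete, here is a toy Python sketch of the two operations over the same ranked result list. Faceting reports counts per field value. Field collapsing keeps the ranked list but limits how many hits from each value are shown. The hit data, the limit of two per site, and the function names are assumptions made for illustration. This is not how Solr or the patch implements either feature.

```python
from collections import Counter


def facet_counts(hits, field):
    """Faceting: count how many hits share each value of `field`."""
    return Counter(hit[field] for hit in hits)


def collapse(hits, field, max_per_group=2):
    """Field collapsing: keep at most `max_per_group` hits per value.

    Rank order is preserved. The second return value says how many hits
    were hidden per value (the "more results from ..." link).
    """
    shown, seen, hidden = [], Counter(), Counter()
    for hit in hits:
        value = hit[field]
        if seen[value] < max_per_group:
            seen[value] += 1
            shown.append(hit)
        else:
            hidden[value] += 1
    return shown, dict(hidden)


# Hypothetical ranked hits, best first.
example_hits = [
    {"url": "manning.com/1", "site": "manning.com"},
    {"url": "manning.com/2", "site": "manning.com"},
    {"url": "amazon.com/1", "site": "amazon.com"},
    {"url": "manning.com/3", "site": "manning.com"},
]

print(dict(facet_counts(example_hits, "site")))
shown, hidden = collapse(example_hits, "site")
print([hit["url"] for hit in shown], hidden)
```

Faceting leaves the result list untouched and adds a summary next to it, whereas collapsing changes which hits appear in the list itself.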

RE: Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread Julio Castillo
Thanks Shalin, I will add my vote for it becoming an integral part of Solr ASAP. I don't know why indexing DB content is not the highest on the project's list. I've seen other messages regarding the status of SOLR 1.3, so I'm not holding my breath there. My request is that it makes it to

Re: How to limit number of pages per domain

2008-05-22 Thread Otis Gospodnetic
I don't know yet, so I asked directly in that JIRA issue :) Applying patches is done something like this: Ah, just added it to the Solr FAQ on the Wiki for everyone: http://wiki.apache.org/solr/FAQ#head-bd01dc2c65240a36e7c0ee78eaef88912a0e4030 Can you provide feedback about this particular

Re: Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread Shalin Shekhar Mangar
I've updated the example-solr-home.jar on the DataImportHandler wiki page with the latest code. Please let us know if you find any issues. On Thu, May 22, 2008 at 10:19 PM, Shalin Shekhar Mangar <[EMAIL PROTECTED]> wrote: > It is scheduled to be released with the next release of Solr. > Shouldn't

Re: Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread Shalin Shekhar Mangar
It is scheduled to be released with the next release of Solr. Shouldn't be too long before it becomes part of the trunk/nightly code. If you find it useful, please do tell us here or vote/comment in the Jira issue. Bug reports are welcome too :) You can also add yourself as a watcher to the SOLR-4

RE: Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread Julio Castillo
Question on the status of the DataImportHandler. For the time being we are applying the recent patch. What are the plans for incorporating it as part of the nightly build or at least part of the subversion tree? I just want to make sure that I get updates/fixes/enhancements to this module when th

Re: How to limit number of pages per domain

2008-05-22 Thread Jack
I think I'll give it a try. I haven't done this before. Are there any instructions regarding how to apply the patch? I see 9 files, some displayed in gray links, some in blue links; some named as .diff, some .patch; one has 1.3 in file name, one has 1.3, I suppose the other files are for both versi

Re: Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread Shalin Shekhar Mangar
Hi Alan, Yes, it is a bit out of date. Please try using the SOLR-469.patch directly from the jira issue On Thu, May 22, 2008 at 9:23 PM, alan. <[EMAIL PROTECTED]> wrote: > > I downloaded example-solr-home.jar and was experimenting with > '${dataimporter.functions.escapeSql(item.ID)}' > > It didn'

Is example-solr-home.jar synchronized with DataImportHandler documentation?

2008-05-22 Thread alan.
I downloaded example-solr-home.jar and was experimenting with '${dataimporter.functions.escapeSql(item.ID)}' It didn't work, so I looked in dataimporter.jar and noticed that it didn't include classes for EvaluatorBag et al. I'm assuming example-solr-home.jar on http://wiki.apache.org/solr/DataImpor

RE: SOLR OOM (out of memory) problem

2008-05-22 Thread Yongjun Rong
Those caches look worth using. Keeping them will help improve your search performance. Try the concurrent GC and see if you get better results. Please let me know the results. Best, Yongjun Rong -Original Message- From: gurudev [mailto:[EMAIL PROTECTED] Sent: Thursday, May 22

RE: SOLR OOM (out of memory) problem

2008-05-22 Thread Yongjun Rong
-Original Message- From: gurudev [mailto:[EMAIL PROTECTED] Sent: Thursday, May 22, 2008 7:28 AM To: solr-user@lucene.apache.org Subject: RE: SOLR OOM (out of memory) problem Hi Rong, My cache hit ratio are: filtercache: 0.96 documentcache:0.51 queryresultcache:0.58 Thanx Pravesh

Re: [poll] Change logging to SLF4J?

2008-05-22 Thread Henrib
Ryan McKinley wrote: > >> [ ] Keep solr logging as it is. (JDK Logging) >> [X ] Use SLF4J. > Can't "keep as is" since this strictly precludes configuring logging in a container agnostic way. -- View this message in context: http://www.nabble.com/-poll--Change-logging-to-SLF4J--tp17084684p1

Re: [poll] Change logging to SLF4J?

2008-05-22 Thread Grant Ingersoll
On May 6, 2008, at 10:40 AM, Ryan McKinley wrote: [ ] Keep solr logging as it is. (JDK Logging) [X ] Use SLF4J. But you already knew that...

Re: How to limit number of pages per domain

2008-05-22 Thread Jonathan Ariel
Sorry, but I can't really understand the difference with facets. On Thu, May 22, 2008 at 2:09 AM, Otis Gospodnetic < [EMAIL PROTECTED]> wrote: > Actually, the best documentation are really the comments in the JIRA issue > itself. > Is there anyone actually using Solr with this patch? > > > Otis >

RE: SOLR OOM (out of memory) problem

2008-05-22 Thread gurudev
Hi Rong, My cache hit ratio are: filtercache: 0.96 documentcache:0.51 queryresultcache:0.58 Thanx Pravesh Yongjun Rong-2 wrote: > > I had the same problem some weeks before. You can try these: > 1. Check the hit ratio for the cache via the solr/admin/stats.jsp. If > the hit ratio is very low

Re: Indexing HTML Content

2008-05-22 Thread David Arpad Geller
Actually, it's very easy: http://us2.php.net/strip_tags I also store the data in a separate field with the html intact for display. In that case, I use urlencode on the string. David McBride, John wrote: Hello, In my application I wish to index articles which are stored in HTML format. Up
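For clients not written in PHP, a roughly comparable tag stripper can be sketched in Python with the standard library's `html.parser`. This is a minimal sketch, not a port of `strip_tags`: the class and function names are made up for illustration, and text inside `<script>` or `<style>` elements is kept rather than dropped.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML document."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Join chunks with a space so adjacent block elements do not run
        # together, then collapse all runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_tags(html):
    """Return the text content of `html` with all markup removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


print(strip_tags("<p>Hello <b>world</b></p><p>Second paragraph</p>"))
```

Joining chunks with a space suits indexing, where keeping words from neighbouring elements apart matters more than exact spacing. The cost is that a word split by inline markup (for example `<b>H</b>ello`) comes out as two tokens.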

Re: Indexing HTML Content

2008-05-22 Thread solr
Hi, Maybe this one? http://htmlparser.sourceforge.net/ /Jimi Quoting "McBride, John" <[EMAIL PROTECTED]>: Hello, In my application I wish to index articles which are stored in HTML format. Upon indexing these the html gets stored along with the content of the article, which is undesirable.

Indexing HTML Content

2008-05-22 Thread McBride, John
Hello, In my application I wish to index articles which are stored in HTML format. Upon indexing these the html gets stored along with the content of the article, which is undesirable. Do you know of any common way of parsing the text content from HTML before adding to SOLR? I understand SOLR 1

RE: solr highlighting

2008-05-22 Thread Kevin Xiao
Thanks, Mike. Sorry, I was busy with something else. What does "field F must have an analyzer defined" mean? My F is defined as: text is defined as: Do you see