How can I get Solr-Cell to extract to multi-valued fields?

2010-03-02 Thread Mark Roberts
Hi, I have a schema with a multivalued field like so: I am uploading html documents to the Solr extraction handler which contain meta in the head, like so: I want the extraction handler to map each of these pieces of meta onto the product field, however, there seems to be a problem - onl

How to find the location of the highlighted snippet?

2010-03-04 Thread Mark Roberts
Hi, I want to display my results in Google-style "...snippet...snippet..." form, except I need to be able to determine whether a snippet is at the beginning or the end of the content, so that I know whether or not to add leading/trailing "..."s. At the moment, I'm using string comparison with the content field, b

Position of snippet within highlighted field

2010-03-08 Thread Mark Roberts
Does anyone know if it's possible to get the position of the highlighted snippet within the field that's being highlighted? It would be really useful for me to know if the snippet is at the beginning or at the end of the text field that it comes from. Thanks, Mark.
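
The string-comparison workaround mentioned in the earlier thread could be sketched roughly as follows. This is only an illustration, not a Solr feature: it assumes the snippets use Solr's default <em>...</em> highlight markers, that the stored field text is available to the client, and that the names snippet_position and with_ellipses are hypothetical helpers.

```python
import re

# Assumption: snippets are wrapped in Solr's default <em>...</em> highlight tags.
HIGHLIGHT_TAG = re.compile(r"</?em>")


def snippet_position(content, snippet):
    """Return (at_start, at_end) for a highlighted snippet within content."""
    # Remove the highlight markers so the snippet can be compared with the raw text.
    plain = HIGHLIGHT_TAG.sub("", snippet).strip()
    text = content.strip()
    return text.startswith(plain), text.endswith(plain)


def with_ellipses(content, snippet):
    """Add a leading/trailing "..." only where the snippet is cut off."""
    at_start, at_end = snippet_position(content, snippet)
    return ("" if at_start else "...") + snippet + ("" if at_end else "...")
```

The comparison breaks down if the stored text itself contains literal <em> tags, or if the highlighter changes whitespace or escaping, so it is best treated as a heuristic.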

HTML encode extracted docs

2010-03-08 Thread Mark Roberts
I'm uploading .htm files to be extracted - some of these files are "include" files that have snippets of HTML rather than fully formed HTML documents. Solr-Cell stores the raw HTML for these items, rather than extracting the text. Is there any way I can get Solr to encode this content prior to s

RE: HTML encode extracted docs - Problems with solr.HTMLStripCharFilter

2010-03-09 Thread Mark Roberts
enough? http://www.lucidimagination.com/search/document/CDRG_ch05_5.7.2 On Mon, Mar 8, 2010 at 5:50 AM, Mark Roberts wrote: > I'm uploading .htm files to be extracted - some of these files are "include" > files that have snippets of HTML rather than fully formed html documen

Dummy boost question

2010-03-09 Thread Mark Roberts
Hi, I have indexed some documents that have title, content and keyword (multi-value). I want to *search* on title and content, and then, within these results *boost* by keyword. I have set up my qf as such: content^0.5 title^1.0 And my bq as such: keyword:(*.*

Best way to get usable terms out of TermComponent

2010-03-12 Thread Mark Roberts
Hi, I want to implement a suggestion/autocomplete dropdown on my searchbox - can anyone help with: 1) Is the TermsComponent the advised way for autocomplete? 2) How can I ensure that the returned terms are complete, valid and English words? Any help much appreciated.
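
For context, a prefix lookup against the TermsComponent might be built as in the sketch below. It uses the component's terms.fl, terms.prefix and terms.limit parameters; the handler path /terms, the base URL, the field name and the helper name terms_url are assumptions for illustration and depend on the local solrconfig.xml and schema.

```python
from urllib.parse import urlencode


def terms_url(base_url, field, prefix, limit=10):
    """Build a TermsComponent request URL for prefix-based suggestions."""
    params = {
        "terms": "true",          # enable the component
        "terms.fl": field,        # field whose indexed terms are listed
        "terms.prefix": prefix,   # only return terms starting with this prefix
        "terms.limit": limit,     # maximum number of terms to return
        "wt": "json",
    }
    # Assumption: a request handler named /terms is registered in solrconfig.xml.
    return base_url + "/terms?" + urlencode(params)
```

The component returns terms as they were indexed, so whether the suggestions look like complete words depends on the analysis chain of the chosen field; a stemmed field will return stems rather than whole words.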

RE: Field Collapsing SOLR-236

2010-03-25 Thread Mark Roberts
Yeah got it working fine - but I needed to revert to Trunk (1.5) to get the patch to apply. It does certainly have some performance implications, but tweaking configuration can help here. Overall the benefits very much outweigh the costs for us :) Mark. -Original Message- From: Denn