RE: How to copy and extract information from a multi-line text before the tokenizer

2011-08-25 Thread Jaeger, Jay - DOT
h SNOBOL. Uh oh: I just dated myself. 8^) ). JRJ -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, August 25, 2011 7:54 AM To: solr-user@lucene.apache.org Subject: Re: How to copy and extract information from a multi-line text before the tokenizer

Re: How to copy and extract information from a multi-line text before the tokenizer

2011-08-25 Thread Erick Erickson
You could consider writing your own UpdateHandler. It gives you access to the underlying SolrInputDocument and lets you freely modify the fields before the document even reaches the analysis chain defined in your schema. So you can get your "AllData" out of the doc, split it apart as many ways as you want
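
The idea in the snippet above (pull "AllData" out of the document and split it into separate fields before analysis) can be sketched roughly as follows. This is only an illustration in Python, not Solr code: a plain dict stands in for the SolrInputDocument, and the "Key: value" line layout, the `split_all_data` helper and the `x_` field prefix are all assumptions, since the thread does not show the real data format. In Solr itself the same logic would live in Java inside the update chain.

```python
import re

# ASSUMPTION: each interesting line looks like "Key: value".
# The thread does not describe the real layout of the multi-line text.
LINE_PATTERN = re.compile(r"^\s*(\w+)\s*:\s*(.*?)\s*$")


def split_all_data(doc, source_field="AllData", prefix="x_"):
    """Return a copy of doc with one extra list-valued field per matched key.

    `doc` is a plain dict standing in for a SolrInputDocument (hypothetical).
    The source field is left untouched so it can still be indexed as a whole.
    """
    extracted = {}
    for line in str(doc.get(source_field, "")).splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        key, value = match.groups()
        # Repeated keys are collected in a list, like a multi-valued field.
        extracted.setdefault(prefix + key.lower(), []).append(value)

    result = dict(doc)
    result.update(extracted)
    return result


doc = {"id": "1", "AllData": "Title: Example\nAuthor: Someone\nfree text line"}
processed = split_all_data(doc)
```

Here `processed` keeps `id` and `AllData` unchanged and gains the fields `x_title` and `x_author`; the line without a colon is ignored.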

Re: How to copy and extract information from a multi-line text before the tokenizer

2011-08-23 Thread Chantal Ackermann
Hi Michael, have you considered the DataImportHandler? You could use the LineEntityProcessor to create fields per line and then copyField to collect everything for the AllData field. http://wiki.apache.org/solr/DataImportHandler#LineEntityProcessor Chantal On Tue, 2011-08-23 at 12:28 +02