Thanks Jack.

I tried it (RegexTransformer) and indexing has become really slow. Is 
RegexTransformer slower than N-Gram indexing? They may be apples and oranges, 
but after extracting the field I still want to N-Gram index it. So it seems 
that N-Gram indexing the full text (i.e. without first extracting what I need 
with RegexTransformer) is the better solution, if I ignore space 
complexity?

Any views?

THANKS!!

-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Thursday, May 10, 2012 4:09 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr On Fly Field creation from full text for N-Gram Indexing

You can use "Regex Transformer" to extract from a source field.

See:
http://wiki.apache.org/solr/DataImportHandler#RegexTransformer

-- Jack Krupansky

-----Original Message-----
From: Husain, Yavar
Sent: Thursday, May 10, 2012 6:04 AM
To: solr-user@lucene.apache.org
Subject: Solr On Fly Field creation from full text for N-Gram Indexing

I have full text in my database and I am indexing it with Solr. At indexing 
time, can I extract certain parameters with a regex and have Solr create 
another field/column on the fly for the extracted text?

For example my DB has just 2 columns (DocId & FullText):

DocId    FullText
1            My name is Avi. RoleId: GYUIOP-MN-1087456. .....

Now, while indexing, I want to extract the RoleId, place it in another column 
created on the fly, and index that column using N-Gram indexing. I don't want 
to N-Gram index the full text, as that would take too much time.
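
To make the idea concrete, here is a minimal Python sketch of it outside Solr: 
pull the RoleId out of the full text with a regex, then build character 
N-Grams from only that value. The regex pattern, the function names 
(extract_role_id, char_ngrams) and the gram size of 3 are assumptions for 
illustration; this is not how Solr implements its transformers or N-Gram 
filters.

```python
import re

# Hypothetical pattern, guessed from the RoleId format in the sample row above.
ROLE_ID_PATTERN = re.compile(r"RoleId:\s*([A-Z0-9-]+)")


def extract_role_id(full_text):
    """Return the first RoleId found in full_text, or None if there is none."""
    match = ROLE_ID_PATTERN.search(full_text)
    return match.group(1) if match else None


def char_ngrams(text, min_n, max_n):
    """Return every character n-gram of text with min_n <= n <= max_n."""
    return [
        text[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(text) - n + 1)
    ]


full_text = "My name is Avi. RoleId: GYUIOP-MN-1087456. ....."
role_id = extract_role_id(full_text)

# Grams for the extracted field only, versus grams for the whole text.
role_grams = char_ngrams(role_id, 3, 3)
full_grams = char_ngrams(full_text, 3, 3)
print(role_id, len(role_grams), len(full_grams))
```

The number of grams grows with the length of the text being grammed, which is 
why gramming only the extracted value should keep the N-Gram field much 
smaller than gramming the whole FullText column.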

Thanks!! Any clues would be appreciated.
</PRE>
<PRE> 
