LineEntityProcessor Usage
Hello, I have a question regarding the configuration of LineEntityProcessor. How do we configure LineEntityProcessor to read a line of text from a file, parse the line, and assign the parts to specific fields in the schema? How exactly does the text in a line get mapped to fields in the schema? I have searched a lot and didn't find any example of how to do that. Can somebody please give me an example? Also, please help me understand the concept of LineEntityProcessor. Thanks & Regards, Kiran Bushireddy
Re: How do we use HTMLStripCharFilterFactory
Hi,

Specify transformer="HTMLStripTransformer" at the entity level, and for the field whose HTML you want to strip, set stripHTML="true". It should work.

Kiran

On Thu, Jun 28, 2012 at 4:09 PM, derohit wrote:
> Hi All,
>
> I am new to SOLR. Please help me with the configuration of
> HTMLStripCharFilterFactory.
>
> If there is a tutorial, it will be of great help.
>
> Regards
> Rohit
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-do-we-use-HTMLStripCharFilterFactory-tp3991955.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Thanks & Regards, Kiran Kumar
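For illustration, the suggestion in this reply might look roughly like the following fragment of a DataImportHandler data-config.xml. This is only a sketch: the entity name "pages", the query, and the column name "body" are made up for the example and are not from the original messages. Only transformer="HTMLStripTransformer" and stripHTML="true" come from the reply itself.

```xml
<dataConfig>
  <document>
    <!-- Hypothetical entity: the name and query are placeholders for your own source. -->
    <entity name="pages"
            query="select id, body from pages"
            transformer="HTMLStripTransformer">
      <field column="id" name="id"/>
      <!-- stripHTML="true" asks the transformer to remove markup from this column
           before it is indexed; "body" is an assumed column name. -->
      <field column="body" name="body" stripHTML="true"/>
    </entity>
  </document>
</dataConfig>
```

Note that this strips HTML at import time. HTMLStripCharFilterFactory, which the original question asked about, is a separate mechanism configured on a field type's analyzer in schema.xml.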
Indexing Wikipedia
Hi, In our office we have a Wikipedia (MediaWiki) setup for the intranet, and I want to index it. From what I have been reading recently, all the wiki pages are stored in a database, and the schema follows the standard MediaWiki layout. I am also considering whether to use xmldumper to dump all the wiki pages into XML and index from there. Has anybody done something like this? If so, which way is more efficient and easier to implement? To me the DB schema looks quite complicated. Can somebody please help me understand which is the better implementation for this? Thanks, Kiran Bushireddy.