I'm having some trouble getting the PlainTextEntityProcessor to populate a
field in an index. I'm using the TemplateTransformer to fill 2 fields, and
have a timestamp field in schema.xml, and these fields make it into the
index. Only the plainText data is missing. Here is my configuration:

<dataConfig>
    <dataSource type="FileDataSource" encoding="UTF-8" />
    <document>
        <entity
            name="f"
            processor="FileListEntityProcessor"
            baseDir="/Users/jayhill/test/dir"
            fileName=".*txt"
            recursive="true"
            rootEntity="true"
            >

            <entity
                name="pt"
                processor="PlainTextEntityProcessor"
                url="${f.fileAbsolutePath}"
                transformer="RegexTransformer,TemplateTransformer"
                >
                <field column="plainText" name="text"/>
                <field column="datasource" template="textfiles" />
            </entity>

        </entity>
    </document>
</dataConfig>

I've tried adding "plainText" as a field in schema.xml, but that didn't work
either.

When I look at what the PlainTextEntityProcessor class is doing I see that
it has correctly parsed the file and has the text in a StringWriter:
    row.put(PLAIN_TEXT, sw.toString());
I just don't know how to get that text into a field in the index.
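
To make it clearer what I mean, here is a simplified sketch of what I
understand that step to be doing. This is not the actual Solr source: the
class name PlainTextRowSketch and the readRow helper are made up for
illustration, and I'm assuming PLAIN_TEXT resolves to the string "plainText",
which is the column name I reference in my config.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the processor's row-building step.
public class PlainTextRowSketch {
    // Assumed value of the PLAIN_TEXT constant; matches column="plainText".
    static final String PLAIN_TEXT = "plainText";

    // Copy everything from the reader into a StringWriter, then
    // return a single row holding the whole text under one key.
    static Map<String, Object> readRow(Reader reader) throws IOException {
        StringWriter sw = new StringWriter();
        char[] buf = new char[1024];
        int len;
        while ((len = reader.read(buf)) != -1) {
            sw.write(buf, 0, len);
        }
        Map<String, Object> row = new HashMap<>();
        row.put(PLAIN_TEXT, sw.toString());
        return row;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the file reader here.
        Map<String, Object> row = readRow(new StringReader("contents of a text file"));
        System.out.println(row.get(PLAIN_TEXT));
    }
}
```

So as far as I can tell the row ends up with a single "plainText" key, and the
part I'm missing is how that key gets mapped onto an index field.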

Any pointers appreciated.

-Jay