On Mon, Feb 16, 2009 at 3:22 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:
> Hello.
>
> I have been beating my head around the data-config.xml listed
> at the end of this message. It breaks in a few different ways.
>
>  1) I have bodged TemplateTransformer to allow it to return
>     when one of the variables is undefined. This ensures my
>     uniqueKey is always defined. But thinking more on
>     Noble's comments there is use in having it work both ways.
>     ie leaving the column undefined or replacing the variable
>     with "". I still like my idea about using the default
>     value of a Solr field from schema.xml, but I can't figure
>     out how/where to best implement it.
When a value is missing from the template, we may end up constructing a
partial string, which may not be desired. If we leave the column out
instead, Solr automatically puts in the default value and the problem
should be solved. In case you need to know the default value from
schema.xml, you can get it from the API:

List<Map<String, String>> fields = context.getAllEntityFields();
String defval = fields.get(0).get("defaultvalue");
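
To illustrate the "leave the column out" option, here is a minimal stand-alone sketch. It is not the TemplateTransformer code; the class name TemplateSketch, the method resolveOrNull and the flat variable map are hypothetical and exist only for this example. The method returns null as soon as one placeholder has no value, so the caller can skip row.put() and let the schema default apply.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Stand-alone illustration; not part of DataImportHandler. */
class TemplateSketch {
    // Matches ${name} placeholders such as ${x.vurl}.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /**
     * Fills in the template, or returns null if any placeholder has no value,
     * so that no partial string is ever produced.
     */
    static String resolveOrNull(String template, Map<String, Object> vars) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = vars.get(m.group(1));
            if (value == null) {
                return null; // undefined variable: give up on the whole column
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> vars = new HashMap<>();
        vars.put("jc.fileAbsolutePath", "/content/a.xml"); // x.vurl is deliberately missing
        Map<String, Object> row = new HashMap<>();
        String key = resolveOrNull("${jc.fileAbsolutePath}#${x.vurl}", vars);
        if (key != null) {
            row.put("vdkvgwkey", key); // only set the column when fully resolved
        }
        System.out.println("vdkvgwkey set: " + row.containsKey("vdkvgwkey"));
    }
}
```

The opposite behavior (replacing a missing variable with "") would only need the early return swapped for an empty-string substitution, so both modes could sit behind a flag.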
>
>  2) Having used TemplateTransformer to assign a value to an
>     entity column that column cannot be used in other
>     TemplateTransformer operations. In my project I am
>     attempting to reuse "x.fileWebPath". To fix this, the
>     last line of transformRow() in TemplateTransformer.java
>     needs replaced with the following which as well as
>     'putting' the templated string in 'row' also saves it
>     into the 'resolver'.
>
>     **originally**
>      row.put(column, resolver.replaceTokens(expr));
>      }
>
>     **new**
>      String columnName = map.get(DataImporter.COLUMN);
>      expr=resolver.replaceTokens(expr);
>      row.put(columnName, expr);
>      resolverMapCopy.put(columnName, expr);
>      }

Isn't it better to write a custom transformer to achieve this? I did
not want a standard component to change the state of the
VariableResolver.

I am not sure what the best way is.
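
As a rough sketch of the custom-transformer idea, the class below keeps its own private copy of the variables and adds each templated column to that copy, so later templates in the same row can reuse it without touching any shared resolver state. It is stand-alone on purpose: it does not extend the DIH Transformer class, and the names ChainedTemplateSketch, transformRow, sharedVars and entityName are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone illustration; it does not extend the DIH Transformer class. */
class ChainedTemplateSketch {
    /**
     * Applies column -> template pairs in iteration order. Each result goes
     * into the row and into a private copy of the variables under
     * "<entityName>.<column>", so later templates can refer to it. The
     * caller's sharedVars map is never modified.
     */
    static Map<String, Object> transformRow(Map<String, Object> row,
                                            Map<String, String> templates,
                                            Map<String, Object> sharedVars,
                                            String entityName) {
        Map<String, Object> local = new HashMap<>(sharedVars); // private copy
        // Expose the columns already in the row as ${entityName.column}.
        for (Map.Entry<String, Object> e : row.entrySet()) {
            local.put(entityName + "." + e.getKey(), e.getValue());
        }
        for (Map.Entry<String, String> t : templates.entrySet()) {
            String value = t.getValue();
            for (Map.Entry<String, Object> v : local.entrySet()) {
                value = value.replace("${" + v.getKey() + "}", String.valueOf(v.getValue()));
            }
            row.put(t.getKey(), value);
            local.put(entityName + "." + t.getKey(), value); // visible to later templates only
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> shared = new HashMap<>();
        shared.put("jc.fileAbsolutePath", "/content/a.xml");
        Map<String, Object> row = new HashMap<>();
        row.put("vurl", "pic1");
        Map<String, String> templates = new LinkedHashMap<>(); // keeps declaration order
        templates.put("fileAbsolutePath", "${jc.fileAbsolutePath}");
        templates.put("vdkvgwkey", "${x.fileAbsolutePath}#${x.vurl}");
        System.out.println(transformRow(row, templates, shared, "x").get("vdkvgwkey"));
    }
}
```

Because the templates are applied in declaration order, swapping the two entries in the example would leave ${x.fileAbsolutePath} unresolved in vdkvgwkey.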

>
>     As an aside I think I ran into the issues covered by
>     SOLR-993. It took a while to figure out I could not add
>     a single columnname/value to the resolver. I had instead
>     to add to the map that was already stored within the
>     resolver.
>
>  3) No entity column names can be used within RegexTransformer.
>     I guess all the stuff that was added to TemplateTransformer
>     to allow column names to be used in templates needs re-added
>     into RegexTransformer. I am doing that now... but am confused
>     by the fragment of code which copies from resolverMap into
>     resolverMapCopy. As best I can see resolverMap is always
>     empty; but I am barely able to follow the code! Can somebody
>     explain when/why resolverMap would be populated.

The behavior is like this: the expression ${currentEntity.colName}
does not work automatically, because the row is not added to the
VariableResolver. TemplateTransformer has a hack to make it work.

We can think about modifying this behavior.
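
For the RegexTransformer case, one possible shape is to resolve the ${...} variables in both regex and replaceWith before the pattern is applied. The sketch below is stand-alone and is not the RegexTransformer source; the class RegexWithVariablesSketch, the method apply and the flat variable map are hypothetical. It also makes one design assumption worth stating: values substituted into the regex are treated as literal text (Pattern.quote), and values substituted into the replacement are escaped (Matcher.quoteReplacement) so that only the original $1 acts as a group reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Stand-alone illustration; not the RegexTransformer source. */
class RegexWithVariablesSketch {
    /**
     * Substitutes ${name} variables into regex and replaceWith, then applies
     * the regex to source. Variable values are quoted so they are matched and
     * inserted literally.
     */
    static String apply(String source, String regex, String replaceWith,
                        Map<String, String> vars) {
        String resolvedRegex = regex;
        String resolvedReplacement = replaceWith;
        for (Map.Entry<String, String> v : vars.entrySet()) {
            String token = "${" + v.getKey() + "}";
            resolvedRegex = resolvedRegex.replace(token, Pattern.quote(v.getValue()));
            resolvedReplacement =
                resolvedReplacement.replace(token, Matcher.quoteReplacement(v.getValue()));
        }
        return source.replaceAll(resolvedRegex, resolvedReplacement);
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("x.test", "/Volumes/spare/ts/solr/content");
        vars.put("x.vurl", "pic1");
        // Mirrors the fileWebPath and imgWebPathICON fields from the config.
        String webPath = apply("/Volumes/spare/ts/solr/content/a/b.xml",
                               "${x.test}(.*)", "/ford$1", vars);
        String icon = apply(webPath, "(.*)/.*", "$1/imagery/${x.vurl}s.jpg", vars);
        System.out.println(webPath);
        System.out.println(icon);
    }
}
```

One caveat with this approach: a placeholder that stays unresolved in replaceWith will make replaceAll throw, because Java reads ${...} in a replacement string as a named-group reference. A real implementation would have to decide how to handle that case.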
>
>     Also, I begin to understand comments made by Noble in
>     SOLR-1001 about resolving "entity attributes in
>     ContextImpl.getEntityAttribute" and I guess Shalin was
>     right as well. However it also seems wrong that at the
>     top of every transformer we are going to repeat the
>     same code to load the resolver with information about the
>     entity.
>
>  4) In that I am reusing template output within other templates
>     the order of execution becomes important. Can I assume that
>     the explicitly listed columns in an entity are processed by
>     the various transformers in the order they appear within
>     data-config.xml. I *think* that the list of columns within
>     an entity as returned by getAllEntityFields() is actually
>     an ArrayList which I think is order dependent. Is this
>     correct?

Yes, that is correct.
>
>  5) Should I raise this as a single JIRA issue?
Do not raise ONE issue for everything by default. If the items are
logically connected, put all of them into one. If not, split them into
as many issues as needed.
>
>  6) Having played with this stuff, I was going to add a bit
>     more to the wiki highlighting some of the possibilities
>     and issues with transformers. But want to check with the
>     list first!
>
>
>   <dataConfig>
>   <dataSource name="myfilereader" type="FileDataSource"/>
>    <document>
>    <entity name="jc"
>               processor="FileListEntityProcessor"
>               fileName="^.*\.xml$"
>               newerThan="'NOW-1000DAYS'"
>               recursive="true"
>               rootEntity="false"
>               dataSource="null"
>               baseDir="/Volumes/spare/ts/solr/content"
>               >
>    <entity name="x"
>                  dataSource="myfilereader"
>                  processor="XPathEntityProcessor"
>                  url="${jc.fileAbsolutePath}"
>                  rootEntity="true"
>                  stream="false"
>                  forEach="/record | /record/mediaBlock"
>                  
> transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer">
>
> <field column="fileAbsolutePath"       template="${jc.fileAbsolutePath}" />
> <field column="fileWebPath"            regex="${x.test}(.*)" 
> replaceWith="/ford$1" sourceColName="fileAbsolutePath"/>
> <field column="title"                  xpath="/record/title" />
> <field column="para1" name="para"      xpath="/record/sect1/para" />
> <field column="para2" name="para"      xpath="/record/list/listitem/para" />
> <field column="pubdate"                
> xpath="/record/metadata/da...@qualifier='pubDate']" dateTimeFormat="yyyyMMdd" 
>   />
>
> <field column="vurl"                   
> xpath="/record/mediaBlock/mediaObject/@vurl" />
> <field column="imgSrcArticle"          
> template="${dataimporter.request.fordinstalldir}" />
> <field column="imgCpation"             xpath="/record/mediaBlock/caption"  />
>
> <field column="test"                   
> template="${dataimporter.request.contentinstalldir}" />
> <!-- **problem is that vurl is just a fragment of the info needed to access 
> the picture. -->
> <field column="imgWebPathICON"         regex="(.*)/.*" 
> replaceWith="$1/imagery/${x.vurl}s.jpg" sourceColName="fileWebPath"/>
> <field column="imgWebPathFULL"         regex="(.*)/.*" 
> replaceWith="$1/imagery/${x.vurl}.jpg"  sourceColName="fileWebPath"/>
> <field column="vdkvgwkey"              
> template="${jc.fileAbsolutePath}#${x.vurl}" />
>       </entity>
>       </entity>
>       </document>
>    </dataConfig>
>
> Regards Fergus.
>
> --
>
> ===============================================================
> Fergus McMenemie               Email:fer...@twig.me.uk
> Techmore Ltd                   Phone:(UK) 07721 376021
>
> Unix/Mac/Intranets             Analyst Programmer
> ===============================================================
>



-- 
--Noble Paul
