Re: using DIH with mets/alto file sets

2010-11-26 Thread Alexey Serba
> The idea is to create a full text index of the alto content, accompanied by
> the author/title info from the mets file for purposes of results display.

- Then you need to list only alto files in your landscapes entity (fileName="^ID.{3}-ALTO\d{3}.xml$" or something like that), because you don't
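
A quick way to sanity-check a fileName pattern like the one quoted above is to run it against a few sample names before putting it in the DIH config. The sketch below is only an illustration: the file names are hypothetical (the thread does not show the real naming scheme), and the escaped variant is an assumption added to show what the unescaped dot before "xml" lets through. Python's `re` is used here for convenience; DIH itself evaluates the pattern with Java regular expressions, which should behave the same way for this expression.

```python
import re

# Pattern as quoted in the reply; note the dot before "xml" is unescaped,
# so it matches any single character, not only a literal ".".
AS_POSTED = re.compile(r"^ID.{3}-ALTO\d{3}.xml$")

# Stricter variant with the dot escaped (an assumption, not from the thread).
ESCAPED = re.compile(r"^ID.{3}-ALTO\d{3}\.xml$")

# Hypothetical file names, made up for this example.
names = [
    "ID001-ALTO001.xml",
    "ID001-ALTO002.xml",
    "ID001-METS.xml",
    "ID001-ALTO003_xml",
]


def select(pattern, candidates):
    """Return the candidates that the pattern matches."""
    return [name for name in candidates if pattern.search(name)]


alto_as_posted = select(AS_POSTED, names)
alto_escaped = select(ESCAPED, names)

print(alto_as_posted)
print(alto_escaped)
```

Neither pattern picks up the mets file, which is the point of the suggestion; the escaped form additionally rejects names that merely end in "xml" without a real extension.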

Re: using DIH with mets/alto file sets

2010-11-18 Thread Lance Norskog
Some ideas: XPathEntityProcessor parses a very limited XPath syntax. However, you can add an XSL script as an attribute, and this somehow gets called instead. With this, you might be able to create an XPath that selects out every combination that you want. A second option: SOLR-1499 is an entity

using DIH with mets/alto file sets

2010-11-18 Thread Fred Gilmore
mets/alto is an XML standard for describing physical objects. In this case, we're describing books. The mets file holds the metadata (author, title, etc.); the alto file holds the physical description (words on the page, formatting of the page). So it's a one (mets) to many (alto) relationship.