On 9 March 2009 at 22:29, Fergus McMenemie wrote:
>> How would I implement an entity processor if I were able to get the list of recently changed documents of our sites?
>
> Hmmmm, this sounds like a job for my manifestEnityProcessor. See if you can find the thread titled "a new DIH manifestEnityProcessor".
>
> Is your list of changed documents a list of additions and updates only, or does it contain deletes as well?

Fergus, I think you should rename it then... "Manifest" is not the right name to me (a manifest refers to something such as the manifest of a JAR or of an IMS content package, both of which are metadata about the data).
I looked at your original description and could not find anything about changed files.
The regex approach is a nice one for sure... I think a useful DIH entity processor that maintains its deltas well would take two parameters: a URL to a list of recently updated URLs, and a URL to a list of recently deleted URLs. Is that what yours does?
I would have one for URLs, with the list of recent items coming basically from an RSS feed; the transformer is custom in all cases.
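
To make the two-list idea concrete, here is a rough sketch in Python rather than as a real DIH entity processor. It assumes each list is plain text with one URL per line, and that a URL appearing in both lists should count as deleted; the function names (parse_url_list, plan_delta, fetch_text) are made up for illustration and are not part of Solr or DIH.

```python
from urllib.request import urlopen


def parse_url_list(text):
    """Return the non-empty, non-comment lines of a one-URL-per-line list."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def plan_delta(updated_text, deleted_text):
    """Build (action, url) pairs from the two lists.

    Assumption: a URL present in both lists is treated as deleted.
    """
    deleted = parse_url_list(deleted_text)
    deleted_set = set(deleted)
    ops = [("add", u) for u in parse_url_list(updated_text)
           if u not in deleted_set]
    ops += [("delete", u) for u in deleted]
    return ops


def fetch_text(list_url):
    """Download one list; assumes it is served as UTF-8 plain text."""
    with urlopen(list_url) as response:
        return response.read().decode("utf-8")
```

An indexer would call fetch_text on each of the two configured list URLs, pass the results to plan_delta, and then re-fetch and index each "add" URL and remove each "delete" URL from the index.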
paul