On 30 Nov 2003, at 22:04, Unico Hommes wrote:
Stefano Mazzocchi wrote:
I'm working on Doco and I finished my first phase: I have a repository that I like and does what I need. It's Slide, in case you haven't noticed ;-)
I did :-) I must say I am very happy you are doing this. I was involved with Slide some time ago but got frustrated with it because there were no decent store implementations for it. I must still have an X-Hive DB store implementation lying around somewhere. (B.t.w. I got the impression that there were fundamental flaws in Slide's design that would make it impossible to create a performant store implementation. I thoroughly hope these issues have been taken care of now.)
No, there are design issues, but we'll deal with those later on.
So, now I have a WebDAV/DeltaV/DASL/ACL repository and I want to connect to it.
There has been a lot of work in the area of "Repository API" lately, both inside and outside cocoon.
Cocoon currently hosts four different repository concepts:
1) two in the linotype block
2) one in the slide block
3) one in the repository block (which is a refactoring of the SourceRepository in linotype)
the linotype repository is a big-time hack: it does what linotype needed, but it's not reusable outside (concerns overlap in its interface). The SourceRepository is an implementation of the linotype Repository over a source instead of over a file system. While nicer, it inherits all the problems of the original interface. It does versioning but it doesn't do properties or property querying.
Yep, which is exactly where you can see the whole concept starting to break.
the repository in the slide block uses slide directly and, mostly, for authentication purposes... it's based on an older version of slide,
Hmm, really? I thought not much has changed to Slide's API since 1.16.
Uh, I looked again and you are right. Still, I don't like this since it prevents me from decoupling cocoon from the repository.
doesn't handle versioning, doesn't handle file properties. It's based on actions, generators and transformers. To me, it looks old, and the need to have the repository on the local machine (and keep it opaque to the outside world) makes it impossible to use for what I need.
I had already been thinking of resurrecting some of the stuff given your
recent activity on the slide dev. I've been lurking on that list for
more than two years and can definitely confirm that Slide was previously
dead compared to recent activity.
Hmmm, not sure it's worth the effort... I think a contract to WebDAV is much more future-compatible than a contract to the Slide API... also because we might change that contract to use JCR when it's ready.
the one in the repository block is the cleanest one, but IMO, its design is backwards. I'll explain what I mean in a second.
For now, I think it's a must that, just as we did with forms, we take a look at the various approaches and choose one to follow and ignore the other ones.
Total, definite +1.
Good
I think the repository block is the best effort, but it needs substantial redesign.
- o -
First of all, let me introduce what I mean with a "repository".
A repository is a place where I store my content.
Functionality I need is:
1) open/save document
2) create collection of documents
3) attach metadata to documents (externally to them!!)
4) query the repository against document metadata
5) versioning (autoversioning on saving and version update)
How all these functionalities are implemented should *NOT* be my concern, nor do I want it to be when I'm using the repository.
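To make the five points above concrete, here is a minimal sketch of what such a contract could look like. Every name in it (Repository, InMemoryRepository, the method names and signatures) is an assumption made up for illustration; it is not the linotype, slide-block or repository-block API, and the in-memory implementation only exists to show that the caller never sees how storage is done.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical contract covering the five needs listed above.
interface Repository {
    String open(String path);                                  // 1) open (latest version, or null)
    int save(String path, String content);                     // 1) save + 5) autoversion; returns version number
    void createCollection(String path);                        // 2) collections
    boolean isCollection(String path);
    void setProperty(String path, String name, String value);  // 3) metadata kept outside the document
    List<String> query(String name, String value);             // 4) query against metadata
    String openVersion(String path, int version);              // 5) read an older version (1-based)
}

// Toy implementation: keeps everything in maps. A WebDAV or JCR backed
// implementation would satisfy the same interface.
class InMemoryRepository implements Repository {
    private final Map<String, List<String>> versions = new TreeMap<>();
    private final Map<String, Map<String, String>> properties = new HashMap<>();
    private final Set<String> collections = new TreeSet<>();

    public String open(String path) {
        List<String> history = versions.get(path);
        return history == null ? null : history.get(history.size() - 1);
    }

    public int save(String path, String content) {
        // Every save appends a new version instead of overwriting.
        List<String> history = versions.computeIfAbsent(path, p -> new ArrayList<>());
        history.add(content);
        return history.size();
    }

    public void createCollection(String path) {
        collections.add(path);
    }

    public boolean isCollection(String path) {
        return collections.contains(path);
    }

    public void setProperty(String path, String name, String value) {
        properties.computeIfAbsent(path, p -> new HashMap<>()).put(name, value);
    }

    public List<String> query(String name, String value) {
        // Returns the paths of saved documents whose property matches, in path order.
        List<String> hits = new ArrayList<>();
        for (String path : versions.keySet()) {
            Map<String, String> props = properties.get(path);
            if (props != null && value.equals(props.get(name))) {
                hits.add(path);
            }
        }
        return hits;
    }

    public String openVersion(String path, int version) {
        List<String> history = versions.get(path);
        if (history == null || version < 1 || version > history.size()) {
            return null;
        }
        return history.get(version - 1);
    }
}
```

Code using this would only ever hold a Repository reference, so swapping InMemoryRepository for something else would not touch the caller.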
The linotype repository uses this design, while the one in the repository block does not.
Why not? well, it's fully based on sources and tries to obtain the above functionalities from the source abstractions. This means that the contract is not on the API but on the source URL.... but this also means that we cannot fully separate concerns since it's the driver of the repository who chooses which source the repository needs to write on.
This is not entirely true. If you take a look at the SourceRepository interface you will see no reference to Source whatsoever. All public methods only deal with Strings. I guess this means its naming is really bad. I was trying to avoid a name clash with the slide block Repository. Of course, if you are talking about SourceRepository's only implementation you are entirely right ;-)
Yes, I looked at the implementation and I see your point. I'll follow up with a discussion on the right interfaces for a Repository class.
I strongly dislike this design because I think it got it all backwards: it should be the Repository to implement Source and give source access to those components who want to access content (say a FileGenerator or even a TraxTransformer)
I can only agree with this wholeheartedly. Thanks for speaking your opinion!
I looked into the repository block and I find a *lot* of things (locking, permissions, properties) that look very much like a duplication of effort.
Historically, this is exactly how it happened. All these interfaces were "designed" in the slide block. We've only migrated them from there to the repository block, looking for an opportunity like this to discuss just what the hell we are going to do with them.
The Slide project spent years optimizing and polishing issues like transactionality and locking, do you really want to implement a layer to "emulate" those things in case the given source is not capable of handling it itself?
You talking to me? ;-) definitely no! If slide (for the moment still quite reserved about that if you don't mind), JCR, whatever can provide that for me, I'm game!
:-)
I think a much better approach would be to come up with a
Repository.java
interface and a few implementations that I can choose when I install cocoon. This implementation would also implement Source.java and provide its functionality thru a URL protocol.
This allows:
- clear separation of concerns: cocoon should *NOT* be doing repository stuff, which is already big and complex enough
- complete IoC: you choose the implementation and the implementation decides what to do and how to do it. Your contract remains the same (thru the source-provided URL protocol and thru the component interface)
- transparent polymorphism: you can have different implementations of a repository... file system, webdav, CVS, JCR, ... without having to change any code in your application
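A rough sketch of the shape proposed above, under loud assumptions: the Source interface below is a simplified stand-in written for this example (it is NOT the real Source.java used by Cocoon), and ContentRepository, MapRepository, SchemeResolver and the "repo:" scheme are all hypothetical names. The point is only that one object serves both contracts, the component interface for repository work and a URL protocol for components that just want to read content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a source abstraction (assumption, not the real interface).
interface Source {
    InputStream getInputStream(String path) throws IOException;
}

// Hypothetical component contract for callers that do repository work.
interface ContentRepository {
    void save(String path, String content);
}

// One possible implementation. A file system, WebDAV, CVS or JCR backed class
// would implement the same two interfaces, so callers would not change.
class MapRepository implements ContentRepository, Source {
    private final Map<String, String> documents = new HashMap<>();

    public void save(String path, String content) {
        documents.put(path, content);
    }

    public InputStream getInputStream(String path) throws IOException {
        String content = documents.get(path);
        if (content == null) {
            throw new IOException("No such document: " + path);
        }
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }
}

// Maps a URL scheme to whichever implementation was chosen at install time.
class SchemeResolver {
    private final Map<String, Source> sources = new HashMap<>();

    void register(String scheme, Source source) {
        sources.put(scheme, source);
    }

    InputStream resolve(String url) throws IOException {
        int colon = url.indexOf(':');
        if (colon < 0) {
            throw new IOException("Missing scheme: " + url);
        }
        Source source = sources.get(url.substring(0, colon));
        if (source == null) {
            throw new IOException("Unknown scheme: " + url);
        }
        // Everything after "scheme:" is handed to the repository as the path.
        return source.getInputStream(url.substring(colon + 1));
    }
}
```

With this arrangement, something like a generator would only call resolve("repo:/docs/index.xml") and never learn which repository implementation sits behind the scheme.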
Thoughts?
Well, yeah. I thought JCR was supposed to be this "Repository.java"? Why
not just use that? Do we really need another layer?
I think so, yes. JCR is incredibly powerful, but exactly because of this power, it feels a little "low level". JCR is sort of a virtual hypergranular file system with multidimensions. Think of it as a persistent DOM with enhanced serializing and query functionalities.
I think you will always need a sort of "application oriented API" on top of JCR... just like you need business objects on top of a relational database.
So, JCR is a sort of "JDBC for hierarchical databases". You could use that directly, sure, no problem, but you end up with the same troubles that you do with using JDBC directly.
This is why I think we need a higher level "repository" API that is *much* easier for people to learn and use, gives immediate gratification against the use of a relational database or a custom file system approach and solves 80% of the content storage needs.
For that remaining 20%, you will need to connect to JCR directly, but that's another story and, for now, JCR is not even there so...
-- Stefano.
