Re: [RT] Converging the repository concept in cocoon

Stefano Mazzocchi Tue, 02 Dec 2003 09:36:17 -0800

On 30 Nov 2003, at 22:04, Unico Hommes wrote:

Stefano Mazzocchi wrote:

I'm working on Doco and I finished my first phase: I have a repository
that I like and does what I need. It's Slide, in case you haven't
noticed ;-)


I did :-) I must say I am very happy you are doing this. I was involved
with Slide some time ago but got frustrated with it because there were
no decent store implementations for it. I must still have an X-Hive DB
store implementation lying around somewhere. (B.t.w. I got the
impression that there were fundamental flaws in Slide's design that
would make it impossible to create a performant store implementation. I
thoroughly hope these issues have been taken care of now.)

No, there are design issues, but we'll deal with those later on.

So, now I have a WebDAV/DeltaV/DASL/ACL repository and I want to
connect to it.

There has been a lot of work in the area of "Repository API" lately,
both inside and outside cocoon.

Cocoon currently hosts four different repository concepts:

  1) two in the linotype block
  2) one in the slide block
  3) one in the repository block (which is a refactoring of the
SourceRepository in linotype)

the linotype repository is a big-time hack: it does what linotype
needed, but it's not reusable outside (concerns overlap in its
interface). The SourceRepository is an implementation of the linotype
Repository over a source instead of over a file system. While nicer,
it inherits all the problems of the original interface. It does
versioning but it doesn't do properties or property querying.


Yep, which is exactly where you can see the whole concept starting to
break.

the repository in the slide block uses slide directly and, mostly, for
authentication purposes... it's based on an older version of slide,

Hmm, really? I thought not much had changed in Slide's API since 1.16.

Uh, I looked again and you are right. Still, I don't like this since it prevents me from decoupling cocoon from the repository.

doesn't handle versioning, doesn't handle file properties. It's based
on actions, generators and transformers. To me, it looks old, and the need
to have the repository on the local machine (and keep it opaque to the
outside world) makes it impossible to use for what I need.

I had already been thinking of resurrecting some of the stuff given your
recent activity on the slide dev list. I've been lurking on that list for
more than two years and can definitely confirm that Slide was previously
dead compared to recent activity.

Hmmm, not sure it's worth the effort... I think a contract to WebDAV is much more future-compatible than a contract to the Slide API... also because we might change that contract to use JCR when it's ready.

the one in the repository block is the cleanest one, but IMO, its
design is backwards. I'll explain what I mean in a second.

For now, I think it's a must that, just as we did with forms, we take a
look at the various approaches and choose one to follow and ignore the
other ones.

Total definite +1.

Good

I think the repository block is the best effort, but it needs
substantial redesign.

- o -

First of all, let me introduce what I mean with a "repository".

A repository is a place where I store my content.

Functionality I need is:

  1) open/save document
  2) create collection of documents
  3) attach metadata to documents (externally to them!!)
  4) query the repository against document metadata
  5) versioning (autoversioning on saving and version update)

how all these functionalities are implemented should *NOT* be my
concern, nor do I want it to be when I'm using the repository.
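
To make the five requirements above concrete, here is a minimal sketch of what such a contract might look like. Every name in it (Repository, InMemoryRepository, the method names) is hypothetical and chosen only for illustration; none of it is taken from the linotype, slide or repository blocks. The in-memory backend exists only to show that a caller never sees how storage, metadata or versioning are done.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical contract: these names are illustrative, not taken from any Cocoon block.
interface Repository {
    void createCollection(String path);                       // 2) collections
    int save(String path, String content);                    // 1) + 5) save, returns new version
    String open(String path);                                 // 1) latest version
    String open(String path, int version);                    // 5) a specific version
    void setProperty(String path, String name, String value); // 3) metadata kept outside the document
    List<String> findByProperty(String name, String value);   // 4) metadata query
}

// Toy in-memory backend; another backend would honor the same contract.
class InMemoryRepository implements Repository {
    private final Set<String> collections = new HashSet<>();
    private final Map<String, List<String>> versions = new TreeMap<>();
    private final Map<String, Map<String, String>> properties = new HashMap<>();

    public void createCollection(String path) {
        collections.add(path);
    }

    public int save(String path, String content) {
        int slash = path.lastIndexOf('/');
        String parent = slash > 0 ? path.substring(0, slash) : "/";
        if (!"/".equals(parent) && !collections.contains(parent)) {
            throw new IllegalArgumentException("No such collection: " + parent);
        }
        // Autoversioning: every save appends a new version to the history.
        List<String> history = versions.computeIfAbsent(path, k -> new ArrayList<>());
        history.add(content);
        return history.size();
    }

    public String open(String path) {
        List<String> history = versions.get(path);
        if (history == null) {
            throw new IllegalArgumentException("No such document: " + path);
        }
        return history.get(history.size() - 1);
    }

    public String open(String path, int version) {
        List<String> history = versions.get(path);
        if (history == null || version < 1 || version > history.size()) {
            throw new IllegalArgumentException("No version " + version + " of " + path);
        }
        return history.get(version - 1);
    }

    public void setProperty(String path, String name, String value) {
        properties.computeIfAbsent(path, k -> new HashMap<>()).put(name, value);
    }

    public List<String> findByProperty(String name, String value) {
        List<String> hits = new ArrayList<>();
        for (String path : versions.keySet()) {   // TreeMap keeps paths sorted
            Map<String, String> props = properties.get(path);
            if (props != null && value.equals(props.get(name))) {
                hits.add(path);
            }
        }
        return hits;
    }
}

class RepositoryDemo {
    public static void main(String[] args) {
        Repository repo = new InMemoryRepository();
        repo.createCollection("/docs");
        repo.save("/docs/a.xml", "<a/>");
        repo.setProperty("/docs/a.xml", "author", "stefano");
        System.out.println(repo.findByProperty("author", "stefano"));
    }
}
```

The caller in RepositoryDemo depends only on the interface, so swapping the backend would not touch it.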

The linotype repository uses this design, while the one in the
repository block does not.

Why not? Well, it's fully based on sources and tries to obtain the
above functionalities from the source abstractions. This means that the
contract is not on the API but on the source URL... but this also
means that we cannot fully separate concerns since it's the driver of
the repository who chooses which source the repository needs to write
on.


This is not entirely true. If you take a look at the SourceRepository
interface you will see no reference to Source whatsoever. All public
methods only deal with Strings. I guess this means its naming is really
bad. I was trying to avoid name clash with the slide block Repository.
Of course, if you are talking about SourceRepository's only
implementation you are entirely right ;-)

Yes, I looked at the implementation and I see your point. I'll follow up with a discussion on the right interfaces for a Repository class.

I strongly dislike this design because I think it got it all backwards:
it should be the Repository to implement Source and give source access
to those components who want to access content (say a FileGenerator or
even a TraxTransformer)


I can only agree with this wholeheartedly. Thanks for speaking your
opinion!

I looked into the repository block and I find a *lot* of things
(locking, permissions, properties) that look very much like a
duplication of effort.


Historically, this is exactly how it happened. All these interfaces
were "designed" in the slide block. We've only migrated them from there
to the repository block looking for an opportunity like this to discuss
just what the hell we are going to do with them.

The Slide project spent years optimizing and polishing issues like
transactionality and locking. Do you really want to implement a layer
to "emulate" those things in case the given source is not capable of
handling it itself?


You talking to me? ;-) definitely no! If slide (for the moment still
quite reserved about that if you don't mind), JCR, whatever can provide
that for me, I'm game!

:-)

I think a much better approach would be to come up with a

Repository.java

interface and a few implementations that I can choose when I install
cocoon. This implementation would also implement Source.java and
provide its functionality thru a URL protocol.

This allows:

  - clear separation of concerns: cocoon should *NOT* be doing
repository stuff, which is already big and complex enough

  - complete IoC: you choose the implementation and the implementation
decides what to do and how to do it. Your contract remains the same
(thru the source-provided URL protocol and thru the component
interface)

  - transparent polymorphism: you can have different implementations of
a repository... file system, webdav, CVS, JCR, ... without having to
change any code in your application
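
As a rough sketch of the URL-protocol side of this proposal: the code below resolves "repo:" URIs against whichever backend was installed, so the component reading the content never learns which one it is. All names here are assumptions made for illustration. ContentSource is a simplified stand-in and is NOT the real Excalibur Source interface, and ContentStore, MapStore, RepositoryProtocol and the "repo:" scheme are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a Source-like contract; NOT the real Excalibur Source.
interface ContentSource {
    String getURI();
    boolean exists();
    InputStream getInputStream();
}

// Hypothetical backend contract: each repository implementation provides one.
interface ContentStore {
    byte[] read(String path);   // returns null when the path is unknown
}

// One possible backend: content held in a map.
class MapStore implements ContentStore {
    private final Map<String, byte[]> data = new HashMap<>();

    MapStore put(String path, String content) {
        data.put(path, content.getBytes(StandardCharsets.UTF_8));
        return this;
    }

    public byte[] read(String path) {
        return data.get(path);
    }
}

// Resolves "repo:/path" URIs against whichever store was chosen at install time.
class RepositoryProtocol {
    static final String SCHEME = "repo:";   // hypothetical scheme name
    private final ContentStore store;

    RepositoryProtocol(ContentStore store) {
        this.store = store;
    }

    ContentSource resolve(final String uri) {
        if (!uri.startsWith(SCHEME)) {
            throw new IllegalArgumentException("Unsupported URI: " + uri);
        }
        final byte[] bytes = store.read(uri.substring(SCHEME.length()));
        return new ContentSource() {
            public String getURI() { return uri; }
            public boolean exists() { return bytes != null; }
            public InputStream getInputStream() {
                if (bytes == null) {
                    throw new IllegalStateException("No content at " + uri);
                }
                return new ByteArrayInputStream(bytes);
            }
        };
    }

    // Helper for callers that want the content as text.
    static String readText(ContentSource source) {
        try (InputStream in = source.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class ProtocolDemo {
    public static void main(String[] args) {
        ContentStore store = new MapStore().put("/docs/a.xml", "<a/>");
        RepositoryProtocol protocol = new RepositoryProtocol(store);
        System.out.println(RepositoryProtocol.readText(protocol.resolve("repo:/docs/a.xml")));
    }
}
```

Replacing MapStore with another ContentStore changes nothing in ProtocolDemo beyond the line that constructs it, which is the kind of polymorphism the list above asks for.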

Thoughts?

Well, yeah. I thought JCR was supposed to be this "Repository.java"? Why
not just use that? Do we really need another layer?

I think so, yes. JCR is incredibly powerful, but exactly because of this power, it feels a little "low level". JCR is sort of a virtual hypergranular file system with multidimensions. Think of it as a persistent DOM with enhanced serializing and query functionalities.

I think you will always need a sort of "application oriented API" on top of JCR... just like you need business objects on top of a relational database.

So, JCR is a sort of "JDBC for hierarchical databases". You could use that directly, sure, no problem, but you end up with the same troubles that you do with using JDBC directly.

This is why I think we need a higher level "repository" API that is *much* easier for people to learn and use, gives immediate gratification compared with using a relational database or a custom file system approach, and solves 80% of the content storage needs.

For that remaining 20%, you will need to connect to JCR directly, but that's another story and, for now, JCR is not even there so...

--
Stefano.
