RE: [cms-list] Content Management Framework (was: Vignette to Acquire Epicentric)

Charley Bay Sat, 02 Nov 2002 19:57:47 -0800

> From what I understand <...>, you have an advanced
> wiki-like system (I don't know how else to describe
> it).  The advantage is freeform editing.  The
> editor isn't restrained by predefined structures. 
> The interpretation by your engine afterwards 
> structures the information and builds the necessary
> links, relations, and presentation (HTML, TeX, PDF,
> etc. exports). Internally, it is "raw" content 
> intelligently interpreted for output.
> <snip>
> I think the inherent disadvantage is that you
> can't force structure - specify the information 
> you want.


Yes, I think that's a fair way to describe it.  The
advantage is freeform content (that was our specific
need, lots of cross-referenced content in biotech
that defied our structural assumptions), and the
disadvantage (you hit it right on the head!) is
that you don't force structure.  The author can
do anything, so it's terribly important that someone
play the role of "knowledge architect" or "content
architect".  Also, we can't "auto-gen" templated
UI data entry interfaces like some systems, nor
(as you said) ensure the same information
consistently exists across many things (records?
documents?) in the system that are supposed to be
held consistent.

Thus, content creation is authoring a text file.
It's terribly flexible, so the creative author
is quite happy.  OTOH, it kind of looks like
source code (there are keywords, and blank
lines are sometimes logically significant to 
separate paragraphs), so you *know* what that 
will do to a lot of people!  ;-)   Maybe with a
WYSIWYM editor with "bold" and "italic" text
formatting buttons it won't be so bad, but we haven't
written that.

Like a Wiki system, you just add text, and it's
in there.  That's simple.  Some of the text is
interpreted as references to images, a "heading"
for the beginning of a section, etc.  Unlike a Wiki,
the "back-end" is not a database, but simple text
files.  Our compiler starts with one file (the
"root" of the publication, any file can be the root),
and then reads that file, following links to other
files (like a spider), until there are no more
links or for some reason it decides to give up
(another topic for another day, I suppose).
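
As a rough illustration of that crawl (not our actual
compiler), here is a minimal sketch.  The "@link"
keyword, the crawl() function, the read() callback and
the max_files cutoff are all assumptions made up for
the example; the real syntax and give-up rules are not
shown in this post.

```python
import re
from collections import deque

# Hypothetical link syntax: a line of the form "@link other-file.txt".
LINK_RE = re.compile(r"^@link[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def crawl(root, read, max_files=1000):
    """Walk breadth-first from `root`, following links until none remain.

    `read(name)` returns the text of a file, or raises KeyError/OSError
    when the file cannot be found. `max_files` is a stand-in for
    "decides to give up". Returns a dict of name -> text in visit order.
    """
    seen = {}
    queue = deque([root])
    while queue and len(seen) < max_files:
        name = queue.popleft()
        if name in seen:
            continue  # already compiled; this also breaks link cycles
        try:
            text = read(name)
        except (KeyError, OSError):
            continue  # dangling link: skip it in this sketch
        seen[name] = text
        for target in LINK_RE.findall(text):
            if target not in seen:
                queue.append(target)
    return seen


# Usage: an in-memory "file system" keeps the example self-contained.
files = {
    "root.txt": "Intro\n@link a.txt\n@link b.txt\n",
    "a.txt": "Chapter A\n@link root.txt\n",
    "b.txt": "Chapter B\n@link missing.txt\n",
    "orphan.txt": "never reached from the root",
}
visited = crawl("root.txt", files.__getitem__)
```

Any file can serve as the root here, and files that
nothing links to (like "orphan.txt") never enter the
publication.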

In this "compile" process, an internal model is
created of atomic "media units", like text, formatted
text, images, videos, and links.  These happen to
be aggregated within higher-level (logical) entities
like paragraphs or sections (sections can have 
a "heading" and can recursively nest), based on how
they were found in the text file. Then, we have a
publishing engine that descends this web of cross-
referenced stuff and spits out one or more files,
the "publication(s)".
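
To make the shape of that internal model concrete,
here is a small sketch.  The class names (Text, Image,
Link, Paragraph, Section) and the outline() helper are
hypothetical, chosen for illustration only; they are
not our actual types.

```python
from dataclasses import dataclass, field
from typing import List, Union


# Atomic "media units".
@dataclass
class Text:
    value: str
    bold: bool = False  # stands in for "formatted text"


@dataclass
class Image:
    path: str


@dataclass
class Link:
    target: str
    label: str


MediaUnit = Union[Text, Image, Link]


# Higher-level logical entities that aggregate media units.
@dataclass
class Paragraph:
    units: List[MediaUnit] = field(default_factory=list)


@dataclass
class Section:
    heading: str
    # A section holds paragraphs and, recursively, nested sections.
    children: List[Union[Paragraph, "Section"]] = field(default_factory=list)


def outline(section, depth=0):
    """Descend the nested sections and return indented headings."""
    lines = ["  " * depth + section.heading]
    for child in section.children:
        if isinstance(child, Section):
            lines.extend(outline(child, depth + 1))
    return lines


doc = Section("Guide", [
    Paragraph([Text("See "), Link("setup.txt", "setup")]),
    Section("Setup", [Section("Details")]),
    Section("Usage"),
])
toc = outline(doc)
```

A publishing engine would walk the same tree, but emit
output for every paragraph and media unit instead of
only the headings.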

Our reference target platform is HTML, since it's
very flexible and makes it easy to represent
hyperlinks (and our content has *lots* of them).
However, we need both online and printed
documentation, so we're in the process of creating a
resource file and modifying the publishing engine to
write DocBook (PDF seems hard).

Curiously, we used to have a design where we derived
a new publishing engine for each target platform
(HTML, TeX, DocBook, etc.).  However, there was so
much in common that we instead created one publishing
engine, driven off resource files, that writes to
different platforms (HTML and DocBook now, TeX
next, others someday).
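
The "one engine, many resource files" idea can be
sketched as follows.  The RESOURCES table, its keys
and the publish() function are assumptions for the
sake of the example; our real resource file format is
not shown here, and real DocBook would wrap each
section in its own element rather than emit a bare
title.

```python
from xml.sax.saxutils import escape

# Hypothetical "resource files": one table of output patterns per target.
RESOURCES = {
    "html": {
        "heading": "<h2>{text}</h2>",
        "para": "<p>{text}</p>",
    },
    "docbook": {
        "heading": "<title>{text}</title>",
        "para": "<para>{text}</para>",
    },
}


def publish(items, target):
    """Render (kind, text) pairs with the patterns of one target.

    The engine itself knows nothing about HTML or DocBook; everything
    platform-specific lives in the resource table.
    """
    patterns = RESOURCES[target]
    return "\n".join(
        patterns[kind].format(text=escape(text)) for kind, text in items
    )


content = [("heading", "Genes & Proteins"), ("para", "Hello")]
html_out = publish(content, "html")
docbook_out = publish(content, "docbook")
```

Adding a target then means adding a table, not
deriving a new engine.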

> <snip>, a template is a file used to render
> content.   A structure is the pre-defined group of
> content elements (schema).
> ...<snip>.  The templates chooses what's shown and
> what isn't.

Oh, ok:  That seems like a good break.  So, we are
without templates and without schemas.  We've spec'd
how we are going to add schemas in the future to
force structure for *some* content (at the author's
behest), but we've taken a different approach for the
template part, which we're calling the "publication
context".

Our "publication context" comprises the target
technology (HTML? TeX?), its limits (images ok?
hyperlinks ok?  formatted [bolded] text ok?), the
number of "levels" deep you want your document to
go, and a bunch of other stuff (access/security,
font preference, should we dump a glossary of terms,
etc.).  For example, holding
everything else constant, we can publish for 
"infinite levels deep" (the default) or "five levels
deep", meaning don't go to a sixth nested section
(a reference to a sixth nested section would be
represented as a hyperlink to a different document,
or as plain text).
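
Here is a minimal sketch of how a depth limit in the
context could change the output while the content
stays the same.  The PublicationContext fields, the
render() function, the (heading, children) tuples and
the "#" heading marks are all invented for this
example and are not our actual format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PublicationContext:
    target: str = "html"
    hyperlinks_ok: bool = True
    max_depth: Optional[int] = None  # None means "infinite levels deep"


def render(section, ctx, depth=1):
    """Render a (heading, children) tuple under the given context."""
    heading, children = section
    if ctx.max_depth is not None and depth > ctx.max_depth:
        # Too deep for this publication: refer to it instead of nesting.
        if ctx.hyperlinks_ok:
            return [f"[link to separate document: {heading}]"]
        return [heading]  # target cannot link, so fall back to plain text
    lines = [f"{'#' * depth} {heading}"]
    for child in children:
        lines.extend(render(child, ctx, depth + 1))
    return lines


doc = ("A", [("B", [("C", [])])])
full = render(doc, PublicationContext())
two_levels = render(doc, PublicationContext(max_depth=2))
no_links = render(doc, PublicationContext(max_depth=2, hyperlinks_ok=False))
```

With max_depth=2, section "C" is no longer nested; it
becomes a link placeholder, or plain text when the
target cannot represent hyperlinks.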

So, I don't think it's really a template (we
call it a context).  That's probably the most
significant way we differ from most (?) CMSs.

We've found it good for consistently publishing
very complex, large, heterogeneous sets of content.
However, it's not a desktop publisher, and it's
not MSWord, because you don't get "one-off" pixel
alignment override decisions, and you don't get
platform-specific things like HTML client-side image
maps (but we did add capability to "embed native
code" specific to a target platform, so that's how
we handle exceptional HTML or TeX things).

> <snip>  The question really is whether you want to
> be editor-centric or integrator-centric I think.  
> Who gets to define the structure?

Hmmm... interesting.  If we don't have a central
authority that is able to force structure, and the
authors can do whatever, which one is that?  I'll
have to think about this, because I've not thought
in those terms before.  Our big focus was separation
of content from formatting, which is *why* we don't
have a visual editor, and *why* we ended up with
text files that look like source code.  As we
mature, though, we're interested in a more 
content-interactive environment like an IDE that
would include a WYSIWYM editor (probably still
for the advanced user for quite a while).

> With regard to your approach Charley, you
> throw a nice curveball.  ;)

Aren't I a stinker?  ;-))

> You don't have structure, and you don't need
> templates for content, yet you can manage the 
> content intelligently "internally". 

Yes, and yes.  I think that's a fair description.
The rules for managing the content are the rules
for our data language (describes paragraphs, 
images, sections, applets, footnotes, quotes, ...)

> I've been trying to think of a hybrid approach.

We started out in the "unstructured content"
domain, and we're finding it increasingly easy
to add features that permit you to impose
structure.  So, in the end, I think we'll be
a hybrid system.  In reality, much of what we
have now presupposes some form of structure
to embed applets or define tables, and we're
even looking at formal "records" defined through
"schemas" to rock down to Structured Avenue
(that's going to be a bit of work, though).

If CM systems were quite mature, I'm not sure too
many people would be interested in what we're doing.
Our interface is non-existent (we're a command-line
compiler).  However, since so many CM systems appear
to be owned and operated by script writers and
administrators who invest a lot of energy into
installing and configuring these systems, it's
probably about the same amount of work.  But, even
in that respect, we're currently just a static page 
publisher.

Isn't CM *FUN*??  ;-)

--charley
[EMAIL PROTECTED]


