On Mon, 10 May 2021 Emanuel Berg wrote:
...and this somewhat more complex-looking one...
"W3C RSS 1.0 News Feed Creation How-To"
https://www.w3.org/2001/10/glance/doc/howto
Great, but stops on <figure> and <figurecaption>,
Elsewhere in the thread you seem to have moved on from XSLT to more
promising options, but I'll make a few comments here anyways.
I suspect that those specific tags are not a primary cause of
difficulty.
I could be mistaken, of course. But I am unable to replicate this
without more information about the input document.
I wonder whether the input document is XML. (No unclosed tags, etc.)
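One quick way to test that question, as a minimal sketch: feed the page to a strict XML parser and see whether it complains. The example below uses Python's standard xml.etree.ElementTree; the two sample strings are hypothetical and are not taken from the actual page. Note that xsltproc's --html option uses a more forgiving HTML parser, so a failure here only matters when the page is parsed as XML.

```python
import xml.etree.ElementTree as ET

def is_well_formed_xml(text):
    """Return (True, None) if text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None

# Hypothetical samples: HTML5 allows an unclosed <img>, XML does not.
html5_style = ("<html><body><figure><img src='a.png'>"
               "<figcaption>A</figcaption></figure></body></html>")
xhtml_style = ("<html><body><figure><img src='a.png'/>"
               "<figcaption>A</figcaption></figure></body></html>")

print(is_well_formed_xml(html5_style))   # not well-formed: <img> is never closed
print(is_well_formed_xml(xhtml_style))   # well-formed
```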
these are HTML5 tags:
http://html5doctor.com/the-figure-figcaption-elements/
so either we must change the XSLT rules to make use of for
example the caption at least, _or_ we must either make
a rule or tell the tool to ignore them, if such an option
exists...
Here is the Makefile [last] only one problem, the XSLT file or
xsltproc tool (?) doesn't seem to transform the HTML into RSS,
really, output is basically a text file with no markup whatsoever
except for the first line which is
<?xml version="1.0" encoding="utf-8"?>
TLDR: What you describe will happen when none of an XSLT stylesheet's
template rules match anything in the input.
XSLT is template-based, a little bit like sed or awk, but instead of
processing records (lines) in a text file it processes the nodes of an
XML tree.
When a node matches no template in your stylesheet, then *built-in*
template rules are applied:
* The built-in template rule for the "document node" (at the top of
the tree) is to apply templates to that node's children.
* The built-in template rule for any element is the same as for the
document node -- apply templates to the children of that element.
* And, when we reach the leaves of the tree, the built-in template
rule for text nodes is to copy the text to the result tree.
As a consequence, applying an XSLT stylesheet to a document that
matches none of the templates in the stylesheet results in output that
looks identical to the output you would get by applying a trivial
stylesheet containing no template rules at all!
It's a little like how the output of
$ sed '' somefile
is indistinguishable from
$ cat somefile
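To make the effect of the built-in rules concrete, here is a rough imitation in Python (standard library only). It is not an XSLT processor, just a sketch of what those three rules amount to when no template matches: descend through every element and copy the text. The function name and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

def apply_builtin_rules(element):
    """Imitate XSLT's built-in rules: recurse into elements, copy text nodes."""
    parts = []
    if element.text:                 # text before the first child element
        parts.append(element.text)
    for child in element:
        parts.append(apply_builtin_rules(child))  # element rule: process children
        if child.tail:               # text that follows the child element
            parts.append(child.tail)
    return "".join(parts)

# Hypothetical input; all markup disappears and only the text survives.
doc = "<html><body><h1>Tree house</h1><p>Built in <b>2021</b>.</p></body></html>"
print(apply_builtin_rules(ET.fromstring(doc)))  # Tree houseBuilt in 2021.
```

That matches the symptom described: an XML declaration (which xsltproc writes itself) followed by text with no markup.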
Maybe I do something wrong?
I lack fluency in make/Makefile, and I have not dug into the weeds of
that stylesheet at https://www.w3.org/2001/10/glance/doc/howto .
However, when you call xsltproc it looks to me like you are not
supplying any of the five parameters that the stylesheet html2rss.xsl
expects:
<xsl:param name = "Base" />
<xsl:param name = "Channel" />
<xsl:param name = "xmlfile" />
<xsl:param name = "xslfile" />
<xsl:param name = "Page" />
You might supply them like so
$ xsltproc -o Overview.rss \
--stringparam xmlfile "$webpage" \
--stringparam xslfile html2rss.xsl \
--stringparam Base "$(dirname "$webpage")" \
--stringparam Page "$(dirname "$webpage")" \
--stringparam Channel Overview.rss \
html2rss.xsl \
"$webpage"
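If you end up driving xsltproc from a script, the same call can be built as an argument list, which sidesteps shell quoting. This is only a sketch: the parameter names follow the xsl:param declarations quoted above, the values mirror the command line in this message, and I have not checked what the stylesheet actually expects in each one. The function name and sample path are hypothetical.

```python
import os.path

def build_xsltproc_args(webpage, stylesheet="html2rss.xsl", output="Overview.rss"):
    """Build an xsltproc argument list that passes all five string parameters."""
    base = os.path.dirname(webpage)
    # Assumed values, copied from the shell example in this message.
    params = {
        "xmlfile": webpage,
        "xslfile": stylesheet,
        "Base": base,
        "Page": base,
        "Channel": output,
    }
    args = ["xsltproc", "-o", output]
    for name, value in params.items():
        args += ["--stringparam", name, value]
    args += [stylesheet, webpage]
    return args

print(build_xsltproc_args("site/tree-house.html"))
```

The resulting list can be handed to subprocess.run(args, check=True) on a machine where xsltproc is installed.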
name = tree-house
src = ${name}.html
srcpp = ${name}-pp.html
trans = html2rss.xsl
dst = ${name}.rss
opts = --html
all: ${dst}
${srcpp}: ${src}
	sed -e 's/<\/*fig\(ure\|caption\)>//g' $< > $@
${dst}: ${srcpp}
	xsltproc -o $@ ${opts} ${trans} $<
--
What is important is seldom urgent
and what is urgent is seldom important
-- Dwight David Eisenhower