I have some XML that includes a stylesheet maintained by another organization
that renders to HTML. The HTML is pretty good: it is not "structured" with
RDFa or schema.org, but it has classes and anchors that can be used to find
some key data. So, I can probably get all the meta-data I want from
1:47 AM
To: solr-user@lucene.apache.org
Subject: Indexing both meta-data and full content of HTML