Indexing both meta-data and full content of HTML

2016-03-19 Thread Davis, Daniel (NIH/NLM) [C]
I have some XML that includes a stylesheet maintained by another organization that renders to HTML. The HTML is pretty good - it is not "structured" in RDFa or schema.org, but has classes and anchors that can be used to find some key data. So, I can probably get all the meta-data I want from

RE: Indexing both meta-data and full content of HTML

2016-03-19 Thread Davis, Daniel (NIH/NLM) [C]
1:47 AM
To: solr-user@lucene.apache.org
Subject: Indexing both meta-data and full content of HTML

I have some XML that includes a stylesheet maintained by another organization that renders to HTML. The HTML is pretty good - it is not "structured" in RDFa or schema.org, but has classes