To follow up on this, there is resistance against implementing the more complex Microdata or RDFa specifications in Gecko.
We definitely now need some form of Linked Data support for Firefox OS 2.5, so I'm suggesting the following:

We should support Open Graph (because of its wide usage by existing web content) and JSON-LD (because it supports Gaia's more complex use cases). Both of these should be simple to implement in Gecko as events on the Browser API, without requiring any complex parsing on the Gecko side.

Open Graph just requires firing metachange events (see bug 962626 for an example) for all meta tags which specify a property attribute. (This would be a crude subset of RDFa. We don't need to specify particular vocabularies in Gecko, just include the value of the property attribute in the payload of the event.)

<meta property="og:title" content="The Rock" />

JSON-LD just requires firing a new linkeddatachange event whenever a JSON-LD script tag is encountered, sending the contents of the script tag in the payload of the event.

<script type="application/ld+json">

We can then easily parse the JSON in Gaia and even store it directly in our Places database.

If there's resistance against implementing the more complex Microdata and RDFa specifications in Gecko then I don't think we should implement Microformats either; the data I have and our experience through prototyping just don't justify it.

Unless there's a really good reason not to do so, I'm going to file the bugs and look towards getting this implemented on the Browser API as soon as possible.

Thanks

Ben

On 4 June 2015 at 10:19, Benjamin Francis <bfran...@mozilla.com> wrote:

> On 3 June 2015 at 19:42, Benjamin Francis <bfran...@mozilla.com> wrote:
>
>> This is what I'd really like to get more of, particularly usage data.
>>
>
> I've reached out to a few people at Yahoo, Google and a couple of
> universities and have managed to turn up a few studies with useful data
> [1][2][3][4].
>
> My conclusions so far are:
>
> - Microformats are used on a large number of web sites but are limited
>   by their case-by-case syntax and more fixed vocabulary, and are less
>   formally defined.
> - Microdata and RDFa are vocabulary agnostic, which makes them
>   inherently more extensible. They're increasing in popularity due to
>   schema.org and consumption by major search engines, whilst the use of
>   Microformats has remained relatively constant over time.
> - Microdata is a bit more concise than RDFa but doesn't allow for the
>   mixing of vocabularies.
> - Open Graph is a simplistic form of RDFa with a limited vocabulary
>   and limited usefulness in comparison to other formats, but is very
>   widely used due to Facebook and Twitter being major consumers.
> - Microformats is used by more websites (domains) but Microdata is
>   used by more web pages (more URLs, more typed entities and more triples)
>   and is growing the fastest. Microformats has the breadth, but Microdata
>   has the depth. In our case I think what we care about is the latter:
>   the amount of pinnable content.
> - JSON-LD is the newest format, the main difference being that it
>   isn't intended to be embedded in HTML markup, but is included
>   separately in a script tag. It's also useful as a canonical JSON-based
>   format to represent all of the other formats.
>
> That leads me to recommend that we do the following:
>
> - Parse Microdata and RDFa (including Open Graph) from web pages in
>   Gecko
> - Expose all of this data to Gaia via a single getLinkedData() or
>   getStructuredData() method on the Browser API which returns a Promise
>   that resolves with the data in a canonical JSON-LD format
> - Also consider supporting JSON-LD directly as no parsing is required,
>   we just need to detect a script tag
>
> If anyone finds any more usage data, or has a different interpretation of
> the data below, then please do share.
>
> Thanks
>
> Ben
>
> 1. Web Data Commons website based on Common Crawl corpus (2009-2014)
>    http://webdatacommons.org/
> 2. Web Data Commons paper based on Common Crawl corpus (2009-2012)
>    http://events.linkeddata.org/ldow2012/papers/ldow2012-inv-paper-2.pdf
> 3. Yahoo post based on Yahoo corpus (2011)
>    https://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/
> 4. Yahoo paper based on Bing corpus (2012)
>    http://events.linkeddata.org/ldow2012/papers/ldow2012-inv-paper-1.pdf
>

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
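
For illustration only, here is a rough sketch of the two events proposed at the top of this message, written in Python rather than as Gecko or Gaia code. It walks a page, records a metachange event for every meta tag that has a property attribute, and records a linkeddatachange event carrying the raw contents of each JSON-LD script tag. The class and function names, the shape of the payloads, and the sample page (including its JSON-LD contents) are all assumptions made up for this sketch; none of them come from bug 962626 or from the Browser API.

```python
import json
from html.parser import HTMLParser


class LinkedDataEvents(HTMLParser):
    """Collects hypothetical event payloads; not the real Browser API."""

    def __init__(self):
        super().__init__()
        self.events = []         # list of (event_name, payload) tuples
        self._in_jsonld = False  # True while inside a JSON-LD script tag
        self._buffer = []        # text chunks of the current script tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "property" in attrs:
            # No vocabulary knowledge needed: pass the property value through.
            # The payload shape is an assumption for this sketch.
            self.events.append(("metachange", {
                "property": attrs["property"],
                "content": attrs.get("content"),
            }))
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            # Send the raw script contents; parsing is left to the consumer.
            self.events.append(("linkeddatachange", "".join(self._buffer)))


def collect_events(html):
    """Return the events the sketch would fire for one page."""
    parser = LinkedDataEvents()
    parser.feed(html)
    parser.close()
    return parser.events


def parse_linked_data(payload):
    """Consumer side: parse a linkeddatachange payload, tolerating bad JSON."""
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None


# Hypothetical sample page; the JSON-LD contents are invented for the sketch.
PAGE = """
<html><head>
<meta property="og:title" content="The Rock" />
<meta name="viewport" content="width=device-width" />
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Movie", "name": "The Rock"}
</script>
</head></html>
"""

if __name__ == "__main__":
    for name, payload in collect_events(PAGE):
        print(name, payload)
```

The meta tag without a property attribute is ignored, which matches the "crude subset of RDFa" idea: the producer side only forwards attribute values and script text, and all interpretation (here, parse_linked_data) happens on the consumer side.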