Hey Subbu,
Is there an easy way to determine whether or not my extensions are using
parser hooks? For example, a canonical list of hooks I can grep for in my
code?
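
For instance, a rough first pass might look like the sketch below. It is only a heuristic under stated assumptions: the hook names in PARSER_HOOK_NAMES are a small illustrative sample of core parser hooks, not a canonical list (the mapping on the Extension API wiki page mentioned below would be the better source), and scan_extension and the other helper names are made up for this example. It checks the "Hooks" section of extension.json and looks for setHook / setFunctionHook calls in PHP files.

```python
import json
import re
from pathlib import Path

# Illustrative sample only, NOT a canonical list of core parser hooks.
PARSER_HOOK_NAMES = {
    "ParserFirstCallInit",
    "ParserBeforeInternalParse",
    "ParserAfterParse",
    "ParserAfterTidy",
    "InternalParseBeforeLinks",
    "ParserClearState",
    "ParserGetVariableValueSwitch",
    "ParserLimitReportPrepare",
}

# Parser methods that register tag handlers and parser functions.
REGISTRATION_CALL = re.compile(r"->\s*(setHook|setFunctionHook)\s*\(")


def looks_like_parser_hook(name):
    """Heuristic: a known sample name, or any hook whose name starts with 'Parser'."""
    return name in PARSER_HOOK_NAMES or name.startswith("Parser")


def hooks_in_extension_json(path):
    """Return the parser-looking hook names registered under "Hooks"."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return sorted(name for name in data.get("Hooks", {}) if looks_like_parser_hook(name))


def registration_calls_in_php(source):
    """Return the distinct registration methods called in a PHP source string."""
    return sorted(set(REGISTRATION_CALL.findall(source)))


def scan_extension(root):
    """Collect parser-hook evidence for one extension directory."""
    root = Path(root)
    report = {"hooks": [], "calls": {}}
    manifest = root / "extension.json"
    if manifest.is_file():
        report["hooks"] = hooks_in_extension_json(manifest)
    for php_file in sorted(root.rglob("*.php")):
        source = php_file.read_text(encoding="utf-8", errors="replace")
        calls = registration_calls_in_php(source)
        if calls:
            report["calls"][str(php_file.relative_to(root))] = calls
    return report
```

Calling scan_extension("extensions/MyExtension") (a hypothetical path) would return a dict with the matching hook names and, per PHP file, the registration methods found. It would miss hooks attached in other ways, so an empty report is not proof that an extension is unaffected.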

On Mon, Sep 14, 2020 at 1:17 PM Subramanya Sastry <[email protected]>
wrote:

> [---- Long mail - but only relevant to extension developers ----]
>
> Greetings!
>
> As some of you might know, on the Parsing Team [0], we are aspiring to
> replace the core wikitext parser with Parsoid [1] on Wikimedia wikis late
> next year and start to put to rest the two-parser ghost that has haunted us
> for many years. In recent years, we achieved two major milestones along
> the way: replace HTML4 tidy with HTML5 Remex [2], and port Parsoid from
> Javascript to PHP [3].
>
> Given that context, if you (help) maintain an extension that:
>
> * uses a "parser hook" and/or
> * uses the "parser API" (i.e. uses public properties / methods in
>    Parser.php, ParserOutput.php, ParserOptions.php, etc.)
>
> please read on. If you don't fit that description, you can stop reading
> now!
>
> Parsoid models and processes wikitext quite differently from the
> core parser - all that Parsoid guarantees is that the rendering is largely
> identical, not the specific process of generating the rendering. This
> means that extensions that extend the behavior of the parser will need to
> adapt to work with Parsoid instead to provide similar functionality. With
> that in mind, we have been working to more clearly specify how extensions
> need to adapt to the Parsoid regime.
>
> PARSOID & EXTENSIONS:
>
> At a high level, here are the questions we needed to answer, along with
> some highly simplified answers:
>
> 1. How do extensions "hook" into Parsoid?
> A. Extensions need to think in terms of transformations (convert this
>     to that) instead of parser pipeline events (at this point in the
>     pipeline, call this listener). An additional detail here is that
>     extensions cannot maintain global ordered state within extension code
>     since Parsoid doesn't guarantee handlers will be invoked in the same
>     order in which they showed up in page source. See the wiki [4] for
>     more details.
>
>     As for the mechanics of registration, Parsoid uses existing mechanisms
>     based on the extension.json file.
>
> 2. When the registered hook listeners are invoked by Parsoid, how do they
>     process any wikitext they need to process?
> A. Parsoid provides all registered listeners with an API object to interact
>     with it. Direct use of Parsoid's internal code is strongly discouraged,
>     and this restriction will be enforced in various ways, including via
>     code review.
>
> 3. How is the extension's output assimilated into the page output?
> A. The output is treated as a "fully-processed" page/DOM fragment (with
>     some caveats which will be clarified on wiki). It is appropriately
>     decorated with additional markup, and slotted into place into the page.
>     Extensions need not make any special efforts (aka strip state) to
>     protect it from the parsing pipeline.
>
> Slides 8-12 of the August 12, 2020 Tech Talk [7] go over the differences.
> Check the wiki [4] for more details of Parsoid's Extension API. It also
> maps core parser hooks to Parsoid's extension functionality.
>
> CURRENT STATUS:
>
> We consider the current proposal to be in late draft stage. That said, as
> we discover unsupported functionality, we will augment the set of hooks and
> the Parsoid Extension API as needed.
>
> While there are a wide variety of extensions in the MediaWiki universe
> with varied use cases, our initial goal for the next year is just Wikimedia
> wikis and hence extensions that are deployed on the Wikimedia wikis.
> Once we are done with that, we will turn our attention to supporting
> extension use cases in the wider MediaWiki universe. But, now is a
> good time for all extension developers to study and review this API
> and give us feedback.
>
> Since the beginning of this year, we've refactored all of the extensions
> we've written Parsoid versions of (Cite, Gallery, Poem, Pre, JSON) to
> now strictly use the Parsoid Extension API without cheating by virtue
> of being in the Parsoid codebase. So, this proposal is actually backed
> by an implementation that is in production for Wikimedia wikis.
>
> FEEDBACK:
>
> Here is where you come in.
>
> * If you maintain / develop an extension, please review the document
>    to see if your extension's use case is covered.
>
>    Ideally, leave your feedback on the Parsoid Extension API talk page [5]
>    since it helps keep it all in one place. Alternatively, you can also
>    leave questions / concerns / other feedback on the Phabricator task
>    we've filed for TechCom's RFC process [6].
>
> * If you feel bold, start the process of updating your extensions *now*.
>    Note that your extension will need to operate with both the existing
>    core parser and Parsoid until we deprecate and stop using the core
>    parser.
>
>    There are known functionality gaps related to exposing the ParserOutput
>    object and providing setFunctionHook functionality. If your extension
>    needs those, you should probably wait for us to fill those gaps.
>
> DOCS / MORE INFO / CONTACT:
>
> * Check the wiki page [4] for docs and discuss on the talk page [5]
> * Check the August 12, 2020 Tech Talk [7]
> * Look at Parsoid code for extensions [8]
> * Look at Parsoid docs for the Ext/ namespace [9]
> * Talk to us on IRC in the #mediawiki-parsoid channel
> * Email us at [email protected]
>
> Thanks!
> Subbu (on behalf of the Parsing Team).
>
> -------------------------------------------------------------------------
>
> 0. https://www.mediawiki.org/wiki/Parsing
> 1. https://www.mediawiki.org/wiki/Parsing/Parser_Unification
> 2. https://blog.wikimedia.org/2018/07/09/tidy-html5-replacement/
> 3. https://techblog.wikimedia.org/2020/02/12/parsoid-in-php-or-there-and-back-again/
>
> 4. https://www.mediawiki.org/wiki/Parsoid/Extension_API
> 5. https://www.mediawiki.org/wiki/Parsoid/Talk:Extension_API
> 6. https://phabricator.wikimedia.org/T260714
> 7. Slides: https://commons.wikimedia.org/wiki/File:Parsoid_%26_Extensions_August_2020_Tech_Talk.pdf
>    Video: https://www.youtube.com/watch?v=lS1xPkERWCM
> 8. https://github.com/wikimedia/parsoid/tree/master/src/Ext
> 9. https://doc.wikimedia.org/Parsoid-PHP/master/
> _______________________________________________
> MediaWiki-l mailing list
> To unsubscribe, go to:
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
>