On Tue, Oct 31, 2017, at 05:46 AM, Henri Sivonen wrote:
> (Context: I'm trying to understand the requirements for our
> serializers in case we rewrite them [in Rust].)
>
> The HTML fragment parsing algorithm can have only one context node.
> The context is never a chain of nodes towards the root, since such
> a thing wouldn't affect the result per the HTML parsing algorithm.
>
> However, when the HTML parsing algorithm is in the non-fragment mode,
> some tags get ignored without appropriate parent, so e.g. to represent
> <td> in the non-fragment mode, you need to include <table>, etc. But
> that's about it.
>
> The Windows CF_HTML clipboard format,
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms649015(v=vs.85).aspx
> , represents fragments by designating them in a full HTML document, so
> what are logically fragments have to work with non-fragment parsing.
>
> This indicates that when we export a fragment to the clipboard, we
> should serialize its parent if not table-related or reconstruct a full
> table if table-related.
>
> Yet, it seems that we serialize much more ancestor context.
>
> Is there a good reason to? For example, does Microsoft Office (our old
> bugs suggest that Excel is the pickiest consumer) or other CF_HTML
> consumers on Windows care about more context than the standard HTML
> parsing algorithm? What could consumers possibly do with knowledge
> about ancestors beyond parent or the nearest <table>? (I'm ignoring
> SVG and MathML for the moment.)
>
> OTOH, it seems that we include only some element types in the context
> (https://searchfox.org/mozilla-central/source/dom/base/nsDocumentEncoder.cpp#1540).
> It's unclear to me why. The first revision of the list came from jst
> during the Netscape 6 crunch without an explanation either in Bugzilla
> or code comments. (https://bugzilla.mozilla.org/show_bug.cgi?id=50742)
>
> Does anyone know why?

I don't know exactly why, but I did try to fix pasting table cells into Excel a long time ago (someone else eventually fixed it), and it was definitely tricky and underspecified:
https://bugzilla.mozilla.org/show_bug.cgi?id=137450

Comments on the bug indicate that there are non-table cases where the context is important, like `<ol><li>` to ensure you wind up pasting numbered list items.

-Ted
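
For illustration only, here is a minimal Rust sketch of the rule proposed in the quoted message: wrap a table-related fragment in a reconstructed table, and wrap anything else in just its parent. The names `minimal_context` and `wrap_fragment` are hypothetical and are not taken from nsDocumentEncoder.cpp. Using `tbody` for every row is an assumption (a real serializer would need to distinguish `thead`/`tfoot`), and attributes, SVG, and MathML are ignored.

```rust
/// Hypothetical helper: returns the wrapper element names, outermost first,
/// for a fragment whose top-level element is `tag` and whose DOM parent is
/// `parent`. Table-related elements get a reconstructed table; everything
/// else gets only its parent.
fn minimal_context<'a>(tag: &str, parent: &'a str) -> Vec<&'a str> {
    match tag {
        // Cells need the full table > tbody > tr chain to survive
        // non-fragment parsing. (Assumption: rows always go in tbody.)
        "td" | "th" => vec!["table", "tbody", "tr"],
        "tr" => vec!["table", "tbody"],
        "col" => vec!["table", "colgroup"],
        "tbody" | "thead" | "tfoot" | "caption" | "colgroup" => vec!["table"],
        // Non-table case: the parent alone, e.g. <ol> around <li>.
        _ => vec![parent],
    }
}

/// Hypothetical helper: wraps already-serialized `fragment` markup in the
/// context chosen by `minimal_context`.
fn wrap_fragment(fragment: &str, tag: &str, parent: &str) -> String {
    let ctx = minimal_context(tag, parent);
    let mut out = String::new();
    // Open the wrappers from the outermost inwards.
    for name in &ctx {
        out.push('<');
        out.push_str(name);
        out.push('>');
    }
    out.push_str(fragment);
    // Close them in the reverse order.
    for name in ctx.iter().rev() {
        out.push_str("</");
        out.push_str(name);
        out.push('>');
    }
    out
}
```

Under these assumptions, `wrap_fragment("<li>a</li>", "li", "ol")` keeps the `<ol>` parent, which covers the numbered-list case from the bug comments. It does not answer whether Excel or other CF_HTML consumers need more ancestors than this.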

