On 08/09/2024 04:22, Richard Owlett wrote:
> [My examples are from my experiments with re-formatting
> text at https://ebible.org/engkjvcpb/ for comfortable reading by fellow
> tri-focal wearing senior citizens - that I want to minimize the number
> of HTML tags & eliminating all CSS usage annoys some HTML5 purists ;]

Instead of Bash and regular expressions, use a programming language
for which a reliable HTML parser is available. E.g. in Python you may use
lxml.html.html5parser, lxml.etree.HTMLParser, or BeautifulSoup.
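
To illustrate the parser-based approach, here is a minimal sketch. It
uses html.parser from the Python standard library rather than the
libraries named above, only so that it runs without installing anything;
the same idea carries over to lxml or BeautifulSoup. The tag whitelist
and the names KEEP_TAGS, TagStripper and simplify_html are my own
assumptions for illustration, so adjust them to the markup you actually
want to keep.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tags kept in the output (without attributes).
# Any other tag is dropped, but the text inside it is preserved.
KEEP_TAGS = {"html", "head", "title", "body", "h1", "h2", "h3", "p", "br"}
# Elements removed together with their content.
DROP_CONTENT = {"style", "script"}
# Kept tags that never get a closing tag.
VOID_TAGS = {"br"}


class TagStripper(HTMLParser):
    """Re-emit a document with only whitelisted tags and no attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in KEEP_TAGS:
            # Attributes (class, style, id, ...) are discarded on purpose.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in KEEP_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Character references were decoded by the parser; re-escape.
            self.out.append(escape(data, quote=False))


def simplify_html(source: str) -> str:
    parser = TagStripper()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = (
        '<p class="v" style="color:red"><span class="verse">1</span>'
        " In the beginning &amp; end</p><style>p{margin:0}</style>"
    )
    print(simplify_html(sample))
```

Unlike a regular expression, the parser copes with attributes in any
order, quoted ">" characters and embedded style sheets, so the filter
stays a plain whitelist.
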
Calibre aggressively strips CSS and some markup during conversion of
HTML pages to various ebook formats.