Although implementations usually get this wrong, Markdown is supposed
to be an extension of HTML; that is, any HTML document is also a
Markdown document. Consequently, you can use cat(1) to convert:

    cat webpage.html > webpage.md

You will likely also want to remove some of the HTML tags and use the
M
Quoth Alexander Krotov:
> > Ideally, with sed/awk, or better in C.
>
> "Parsing" HTML with sed is simply wrong.
This is a good point that I should have mentioned. I spent years
using sed and awk to extract things from HTML, writing crawlers and
suchlike, for personal projects. It can work, of c
> Ideally, with sed/awk, or better in C.
"Parsing" HTML with sed is simply wrong.
You need to use a decent HTML parsing library, as parsing HTML is complex.
There is https://github.com/yujiahaol68/downmark, which uses the Go html
library, but I have not tried it.
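
To make the "use a parsing library" point concrete, here is a minimal sketch in Python using only the standard library's html.parser module. It is not how downmark works (that project is not examined here), and the names MarkdownSketch and html_to_markdown are hypothetical. It assumes only headings, paragraphs, links, and emphasis need converting, and it silently drops every other tag while keeping its text.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class MarkdownSketch(HTMLParser):
    """Hypothetical, deliberately tiny HTML-to-Markdown converter."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments, joined at the end
        self.hrefs = []  # stack of link targets for open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            # <h2> becomes "## ", and so on.
            self.out.append("#" * int(tag[1]) + " ")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.out.append("\n\n")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a" and self.hrefs:
            self.out.append("](" + self.hrefs.pop() + ")")

    def handle_data(self, data):
        # Text is passed through unchanged; no Markdown escaping is done.
        self.out.append(data)


def html_to_markdown(html):
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

For example, html_to_markdown('<h1>Title</h1><p>See <a href="http://example.com">this <em>page</em></a>.</p>') yields a "# Title" heading followed by a paragraph containing a Markdown link. The parser tracks tag boundaries, attributes, and entities, which is the part that line-oriented tools like sed tend to get wrong; a real converter would still need lists, code blocks, escaping, and whitespace normalization.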
Seriously though, if you are not g