On Wed, Sep 16, 2020 at 12:57:43PM +0300, Jani Nikula wrote:
> Email messages need two levels of decoding: First, content transfer
> encoding, such as base64 or quoted-printable. Second, charset decoding.
>
> We've done the first (with part.get_payload(decode=True)), but we've
> ignored the charset. Mostly, it has not mattered, since most email is
> ascii or utf-8 anyway, and python2 has been relaxed about it. However,
> python3 part.get_payload(decode=True) gives us binary instead of
> unicode, so we also need to do the charset decoding to get the result we
> want.
>
> The problem has likely been observed only now that 'python' no longer
> exists or points at python3 instead of python2.
>
> Use part.get_content_charset() for charset decoding, defaulting to
> 'us-ascii' source charset if nothing is specified.
>
> Cc: Rodrigo Vivi <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> Signed-off-by: Jani Nikula <[email protected]>

Reviewed-by: Rodrigo Vivi <[email protected]>
Tested-by: Rodrigo Vivi <[email protected]>

(Although it continues to fail with the encoded email)

Thanks,
Rodrigo.

> ---
>  dim | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/dim b/dim
> index c3a048db8956..3f489976c6bc 100755
> --- a/dim
> +++ b/dim
> @@ -447,7 +447,7 @@ def print_msg(file):
>      msg = email.message_from_file(file)
>      for part in msg.walk():
>          if part.get_content_type() == 'text/plain':
> -            print(part.get_payload(decode=True))
> +            print(part.get_payload(decode=True).decode(part.get_content_charset(failobj='us-ascii')))
>
>  print_msg(open('$1', 'r'))
>  EOF
> --
> 2.20.1
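
For reference, here is a minimal standalone sketch of the two decoding levels the commit message describes. It is not the dim code itself: the sample message RAW and the helper name text_parts are made up for illustration, and errors='replace' is an extra assumption that the patch does not include.

```python
import email

# Hypothetical sample message: quoted-printable transfer encoding on top of
# an iso-8859-1 charset, so both decoding levels are needed.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "caf=E9\n"
)


def text_parts(msg):
    """Yield each text/plain part of msg as a decoded str."""
    for part in msg.walk():
        if part.get_content_type() != 'text/plain':
            continue
        # Level 1: undo the content transfer encoding (base64,
        # quoted-printable, ...). On python3 this returns bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Level 2: charset decoding, defaulting to us-ascii as in the patch.
        charset = part.get_content_charset(failobj='us-ascii')
        # errors='replace' is an assumption added here, not part of the
        # patch; it avoids a UnicodeDecodeError on bytes that do not match
        # the declared charset.
        yield payload.decode(charset, errors='replace')


msg = email.message_from_string(RAW)
decoded = list(text_parts(msg))
print(decoded)
```

Without the second step, print() on python3 would show the bytes repr (b'...') rather than the text.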
