Package: mailscripts
Tags: patch
Severity: wishlist

The attached patch supplies a Python 3 script for extracting OpenPGP
certificates from a message/rfc822 input stream.  I wrote it (with some
guidance from anarcat and others), and I offer it (and the accompanying
documentation) under the GPLv3 for wider distribution with the
mailscripts package.
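
For anyone who wants to sanity-check the armored output of the script in
the patch below without a full OpenPGP implementation, here is a minimal
sketch of the reverse operation: strip the armor and verify the CRC-24
from RFC 4880 section 6.1.  It is not part of the patch; the names
crc24 and dearmor are hypothetical helpers chosen for illustration, and
it assumes a single PUBLIC KEY BLOCK per input string.

```python
import base64

BEGIN = '-----BEGIN PGP PUBLIC KEY BLOCK-----'
END = '-----END PGP PUBLIC KEY BLOCK-----'


def crc24(data: bytes) -> int:
    '''CRC-24 as described in RFC 4880 section 6.1.'''
    crc = 0xB704CE
    for b in data:
        crc ^= b << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


def dearmor(armored: str) -> bytes:
    '''Return the binary payload of one armored block (hypothetical helper).

    Raises ValueError if the framing or the checksum does not match.
    '''
    lines = [line.strip() for line in armored.strip().splitlines()]
    if len(lines) < 2 or lines[0] != BEGIN or lines[-1] != END:
        raise ValueError('not a PGP PUBLIC KEY BLOCK')
    body = lines[1:-1]
    # Armor headers (if any) end at the first blank line.
    if '' in body:
        body = body[body.index('') + 1:]
    if not body or not body[-1].startswith('='):
        raise ValueError('missing armor checksum line')
    data = base64.b64decode(''.join(body[:-1]))
    expected = int.from_bytes(base64.b64decode(body[-1][1:]), 'big')
    if crc24(data) != expected:
        raise ValueError('armor checksum mismatch')
    return data
```

Feeding one block of the script's output to dearmor() should give back
the same bytes that were armored; a ValueError points at broken framing
or a checksum that does not match the payload.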

Thanks for maintaining mailscripts!

       --dkg

From ae2f662d2200fb7edc4f5cfff90e29e41bd5046f Mon Sep 17 00:00:00 2001
From: Daniel Kahn Gillmor <d...@fifthhorseman.net>
Date: Thu, 25 Jul 2019 12:38:52 -0400
Subject: [PATCH] offer email-extract-openpgp-certs

---
 Makefile                          |  1 +
 debian/copyright                  |  1 +
 email-extract-openpgp-certs       | 99 +++++++++++++++++++++++++++++++
 email-extract-openpgp-certs.1.pod | 57 ++++++++++++++++++
 4 files changed, 158 insertions(+)
 create mode 100755 email-extract-openpgp-certs
 create mode 100644 email-extract-openpgp-certs.1.pod

diff --git a/Makefile b/Makefile
index 220aa6f..48cb2fa 100644
--- a/Makefile
+++ b/Makefile
@@ -1,5 +1,6 @@
 MANPAGES=mdmv.1 mbox2maildir.1 \
 	notmuch-slurp-debbug.1 notmuch-extract-patch.1 maildir-import-patch.1 \
+	email-extract-openpgp-certs.1 \
 	notmuch-import-patch.1
 
 all: $(MANPAGES)
diff --git a/debian/copyright b/debian/copyright
index b3d860e..d891ffd 100644
--- a/debian/copyright
+++ b/debian/copyright
@@ -3,6 +3,7 @@ Collection of scripts for manipulating e-mail on Debian
 
 Copyright (C)2017 Aurelien Aptel
 Copyright (C)2017-2019 Sean Whitton
+Copyright (C)2019 Daniel Kahn Gillmor
 
 These programs are free software: you can redistribute it and/or
 modify it under the terms of the GNU General Public License as
diff --git a/email-extract-openpgp-certs b/email-extract-openpgp-certs
new file mode 100755
index 0000000..dfe6138
--- /dev/null
+++ b/email-extract-openpgp-certs
@@ -0,0 +1,99 @@
+#!/usr/bin/python3
+
+'''Extract all OpenPGP certificates from an e-mail message
+
+This is a simple script that is designed to take an e-mail
+(message/rfc822) on standard input and produce a series of
+ASCII-armored OpenPGP certificates on standard output.
+
+It currently tries to find OpenPGP certificates based on MIME types of
+attachments (application/pgp-keys), and by pulling out anything that
+looks like an Autocrypt: or Autocrypt-Gossip: header (see
+https://autocrypt.org).
+
+'''
+
+import email
+import sys
+import base64
+import binascii
+import codecs
+from typing import Generator
+
+# parse email from stdin
+message = email.message_from_binary_file(sys.stdin.buffer)
+
+def openpgp_ascii_armor_checksum(data: bytes) -> bytearray:
+    '''OpenPGP ASCII-armor checksum
+
+(see https://tools.ietf.org/html/rfc4880#section-6.1)'''
+
+    init = 0xB704CE
+    poly = 0x1864CFB
+    crc = init
+    for b in data:
+        crc ^= b << 16
+        for i in range(8):
+            crc <<= 1
+            if crc & 0x1000000:
+                crc ^= poly
+    val = crc & 0xFFFFFF
+    out = bytearray(3)
+    out[0] = (val >> 16) & 0xFF
+    out[1] = (val >> 8) & 0xFF
+    out[2] = val & 0xFF
+    return out
+
+def enarmor_certificate(data: bytes) -> str:
+    '''OpenPGP ASCII-armor
+
+(see https://tools.ietf.org/html/rfc4880#section-6.2)'''
+
+    cksum = openpgp_ascii_armor_checksum(data)
+    key = codecs.decode(base64.b64encode(data), 'ascii')
+    linelen = 64
+    key = '\n'.join([key[i:i+linelen] for i in range(0, len(key), linelen)])
+    return '-----BEGIN PGP PUBLIC KEY BLOCK-----\n\n' +\
+        key + \
+        '\n=' + codecs.decode(base64.b64encode(cksum), 'ascii') +\
+        '\n-----END PGP PUBLIC KEY BLOCK-----\n'
+
+def get_autocrypt_keys(m: email.message.Message) -> Generator[str, None, None]:
+    '''Extract all Autocrypt headers from message
+
+Note that we ignore the addr= property.
+'''
+    hdrs = m.get_all('Autocrypt')
+    if hdrs is None: # the email.get_all() api is kind of sad.
+        hdrs = []
+    ghdrs = m.get_all('Autocrypt-Gossip')
+    if ghdrs is None: # the email.get_all() api is kind of sad.
+        ghdrs = []
+    for ac in hdrs + ghdrs:
+        # parse the base64 part
+        try:
+            keydata = str(ac).split('keydata=')[1].strip()
+            keydata = keydata.replace(' ', '').replace('\t', '')
+            keydatabin = base64.b64decode(keydata)
+            yield enarmor_certificate(keydatabin)
+        except (binascii.Error, IndexError) as e:
+            print("failure to parse Autocrypt header: %s" % e,
+                  file=sys.stderr)
+
+def extract_attached_keys(m: email.message.Message) -> Generator[str, None, None]:
+    for part in m.walk():
+        if part.get_content_type() == 'application/pgp-keys':
+            p = part.get_payload(decode=True)
+            if not isinstance(p, bytes):
+                raise TypeError('Expected part payload to be bytes')
+            if p.lstrip().startswith(b'-----BEGIN PGP PUBLIC KEY BLOCK-----'):
+                yield codecs.decode(p, 'ascii')
+            else: # this is probably binary-encoded, let's pretend that it is!
+                yield enarmor_certificate(p)
+
+# FIXME: should we try to decrypt encrypted messages as well?
+
+for a in get_autocrypt_keys(message):
+    print(a, end='')
+for a in extract_attached_keys(message):
+    print(a, end='')
diff --git a/email-extract-openpgp-certs.1.pod b/email-extract-openpgp-certs.1.pod
new file mode 100644
index 0000000..8b7916e
--- /dev/null
+++ b/email-extract-openpgp-certs.1.pod
@@ -0,0 +1,57 @@
+=head1 NAME
+
+email-extract-openpgp-certs - extract OpenPGP certificates from an e-mail
+
+=head1 SYNOPSIS
+
+B<email-extract-openpgp-certs> < B<message.eml> | B<gpg> B<--import>
+
+=head1 DESCRIPTION
+
+B<email-extract-openpgp-certs> extracts all the things it can find
+that look like they might be OpenPGP certificates in an e-mail, and
+produces them on standard output.
+
+It currently knows how to find OpenPGP certificates in attachments
+of MIME type application/pgp-keys, and in Autocrypt: and
+Autocrypt-Gossip: headers.
+
+=head1 OPTIONS
+
+None.
+
+=head1 EXAMPLE
+
+=over 4
+
+    $ notmuch show --format-raw id:b7e48905-8...@example.net > test.eml
+    $ email-extract-openpgp-certs < test.eml | gpg --import
+
+=back
+
+=head1 LIMITATIONS
+
+B<email-extract-openpgp-certs> currently does not try to decrypt
+encrypted e-mails, so it cannot find certificates that are inside the
+message's cryptographic envelope.
+
+B<email-extract-openpgp-certs> does not attempt to validate the
+certificates it finds in any way.  It does not ensure that they are
+valid OpenPGP certificates, or even that they are of a sane size. It
+does not try to establish any relationship between the extracted
+certificates and the messages in which they are sent.  For example, it
+does not check the Autocrypt addr= attribute against the message's From:
+header.
+
+Importing certificates extracted from an arbitrary e-mail in this way
+into a curated keyring is not a good idea.  Better to extract into an
+ephemeral location, inspect, filter, and then selectively import.
+
+=head1 SEE ALSO
+
+gpg(1), https://autocrypt.org, https://tools.ietf.org/html/rfc4880, https://tools.ietf.org/html/rfc3156
+
+=head1 AUTHOR
+
+B<email-extract-openpgp-certs> and this manpage were written by Daniel
+Kahn Gillmor, with guidance and advice from many others.
-- 
2.20.1
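
The LIMITATIONS section of the manpage above notes that the script does
not check the Autocrypt addr= attribute against the message's From:
header.  A minimal sketch of what such a check could look like follows.
It is not part of the patch: parse_autocrypt and keys_matching_sender
are hypothetical names, the attribute parsing is deliberately naive
(split on ";" and the first "="), and the case-insensitive address
comparison is an assumption rather than a full implementation of the
Autocrypt rules.

```python
import base64
import binascii
from email import message_from_string
from email.message import Message
from email.utils import parseaddr
from typing import Dict, List


def parse_autocrypt(value: str) -> Dict[str, str]:
    '''Split an Autocrypt header value into its attributes (naive).'''
    attrs = {}
    for field in value.split(';'):
        name, sep, val = field.partition('=')
        if sep:
            # Drop folding whitespace; base64 padding "=" stays in val.
            attrs[name.strip()] = ''.join(val.split())
    return attrs


def keys_matching_sender(m: Message) -> List[bytes]:
    '''Return keydata only from headers whose addr= matches From:.'''
    sender = parseaddr(str(m.get('From', '')))[1].lower()
    keys = []
    for hdr in m.get_all('Autocrypt') or []:
        attrs = parse_autocrypt(str(hdr))
        if not sender or attrs.get('addr', '').lower() != sender:
            continue
        try:
            keys.append(base64.b64decode(attrs.get('keydata', '')))
        except binascii.Error:
            continue  # unparseable keydata: skip this header
    return keys
```

Autocrypt-Gossip: headers are left out on purpose here, since they
describe other recipients rather than the sender.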
