Package: pandoc
Version: 2.17.1.1-1.1
Severity: important
Tags: security upstream patch
Control: found -1 2.2.1-3
Control: found -1 2.9.2.1-1
X-Debbugs-Cc: guil...@debian.org

Hi,

The following vulnerability was published for pandoc.

CVE-2023-35936[0]:
| Starting in version 1.13 and prior to version 3.1.4, Pandoc is
| susceptible to an arbitrary file write vulnerability, which can be
| triggered by providing a specially crafted image element in the input
| when generating files using the `--extract-media` option or outputting
| to PDF format. This vulnerability allows an attacker to create or
| overwrite arbitrary files on the system, depending on the privileges of
| the process running pandoc. It only affects systems that pass untrusted
| user input to pandoc and allow pandoc to be used to produce a PDF or
| with the `--extract-media` option. […] Note that the `--sandbox`
| option, which only affects IO done by readers and writers themselves,
| does not block this vulnerability.

I discovered that the upstream fix was incomplete while backporting it
to buster (LTS). I reported the finding upstream, who promptly fixed it
in 3.1.6 [1]. Another CVE ID was assigned for this, namely
CVE-2023-38745 [2].

The Security Team decided not to issue a DSA for these vulnerabilities,
but given they're about to be patched in buster it makes sense to patch
other suites, too. Please consider MR !3 for unstable:
https://salsa.debian.org/haskell-team/pandoc/-/merge_requests/3 .
debdiff attached for convenience.

I've also prepared (and tested) a fix for bullseye [3] which I'm planning
to submit to -pu once sid is patched. I'm also planning to rebuild the
targeted fix for bookworm and submit it to s-pu. Let me know if you
object :-)

Cheers,
-- 
Guilhem.

[0] https://security-tracker.debian.org/tracker/CVE-2023-35936
    https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g
[1] https://github.com/jgm/pandoc/commit/eddedbfc14916aa06fc01ff04b38aeb30ae2e625
[2] https://security-tracker.debian.org/tracker/CVE-2023-38745
    https://nvd.nist.gov/vuln/detail/CVE-2023-38745
[3] https://salsa.debian.org/lts-team/packages/pandoc/-/compare/debian%2F2.9.2.1-1...debian%2Fbullseye?from_project_id=22949&straight=false

diffstat for pandoc-2.17.1.1 pandoc-2.17.1.1

 changelog                    |    9 +
 patches/CVE-2023-35936.patch |  205 +++++++++++++++++++++++++++++++++++++++++++
 patches/CVE-2023-38745.patch |   98 ++++++++++++++++++++
 patches/series               |    2
 4 files changed, 314 insertions(+)

diff -Nru pandoc-2.17.1.1/debian/changelog pandoc-2.17.1.1/debian/changelog
--- pandoc-2.17.1.1/debian/changelog	2022-11-19 14:13:51.000000000 +0100
+++ pandoc-2.17.1.1/debian/changelog	2023-07-21 20:22:42.000000000 +0200
@@ -1,3 +1,12 @@
+pandoc (2.17.1.1-1.2) unstable; urgency=high
+
+  * Non-maintainer upload.
+  * Cherry-pick upstream fixes for CVE-2023-35936 from 3.1.4 release. (Closes:
+    #-1)
+  * Cherry-pick upstream fix for CVE-2023-35936 from 3.1.6 release.
+
+ -- Guilhem Moulin <guil...@debian.org>  Fri, 21 Jul 2023 20:22:42 +0200
+
 pandoc (2.17.1.1-1.1) unstable; urgency=low
 
   * Non-maintainer upload.
diff -Nru pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch
--- pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch	1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch	2023-07-21 20:22:42.000000000 +0200
@@ -0,0 +1,205 @@
+From: John MacFarlane <j...@berkeley.edu>
+Date: Tue, 20 Jun 2023 13:50:13 -0700
+Subject: Fix a security vulnerability in MediaBag and
+ T.P.Class.IO.writeMedia.
+
+This vulnerability, discovered by Entroy C, allows users to write
+arbitrary files to any location by feeding pandoc a specially crafted
+URL in an image element.  The vulnerability is serious for anyone
+using pandoc to process untrusted input.
+
+Origin: https://github.com/jgm/pandoc/commit/5e381e3878b5da87ee7542f7e51c3c1a7fd84b89
+Origin: https://github.com/jgm/pandoc/commit/54561e9a6667b36a8452b01d2def9e3642013dd6
+Origin: https://github.com/jgm/pandoc/commit/df4f13b262f7be5863042f8a5a1c365282c81f07
+Origin: https://github.com/jgm/pandoc/commit/fe62da61dfd33e6b4c0c03895c528a47a0405bf7
+Origin: https://github.com/jgm/pandoc/commit/5246f02f0bb9c176a6d2f6e3d0c03407d8a67445
+Bug: https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-35936
+---
+ pandoc.cabal                |  2 ++
+ src/Text/Pandoc/Class/IO.hs | 12 ++++++------
+ src/Text/Pandoc/MediaBag.hs | 27 ++++++++++++++++-----------
+ test/Tests/MediaBag.hs      | 37 +++++++++++++++++++++++++++++++++++++
+ test/test-pandoc.hs         |  2 ++
+ 5 files changed, 63 insertions(+), 17 deletions(-)
+ create mode 100644 test/Tests/MediaBag.hs
+
+diff --git a/pandoc.cabal b/pandoc.cabal
+index 52506e3..c5129a8 100644
+--- a/pandoc.cabal
++++ b/pandoc.cabal
+@@ -791,6 +791,7 @@ test-suite test-pandoc
+                   tasty-lua >= 1.0 && < 1.1,
+                   tasty-quickcheck >= 0.8 && < 0.11,
+                   text >= 1.1.1.0 && < 2.1,
++                  temporary >= 1.1 && < 1.4,
+                   time >= 1.5 && < 1.14,
+                   xml >= 1.3.12 && < 1.4,
+                   zip-archive >= 0.2.3.4 && < 0.5
+@@ -800,6 +801,7 @@ test-suite test-pandoc
+                   Tests.Lua
+                   Tests.Lua.Module
+                   Tests.Shared
++                  Tests.MediaBag
+                   Tests.Readers.LaTeX
+                   Tests.Readers.HTML
+                   Tests.Readers.JATS
+diff --git a/src/Text/Pandoc/Class/IO.hs b/src/Text/Pandoc/Class/IO.hs
+index 5d4dbc7..5043266 100644
+--- a/src/Text/Pandoc/Class/IO.hs
++++ b/src/Text/Pandoc/Class/IO.hs
+@@ -49,7 +49,7 @@ import Network.HTTP.Client.Internal (addProxy)
+ import Network.HTTP.Client.TLS (mkManagerSettings)
+ import Network.HTTP.Types.Header ( hContentType )
+ import Network.Socket (withSocketsDo)
+-import Network.URI (unEscapeString)
++import Network.URI (URI(..), parseURI, unEscapeString)
+ import System.Directory (createDirectoryIfMissing)
+ import System.Environment (getEnv)
+ import System.FilePath ((</>), takeDirectory, normalise)
+@@ -120,11 +120,11 @@ newUniqueHash = hashUnique <$> liftIO Data.Unique.newUnique
+ 
+ openURL :: (PandocMonad m, MonadIO m) => Text -> m (B.ByteString, Maybe MimeType)
+ openURL u
+- | Just u'' <- T.stripPrefix "data:" u = do
+-     let mime = T.takeWhile (/=',') u''
+-     let contents = UTF8.fromString $
+-                     unEscapeString $ T.unpack $ T.drop 1 $ T.dropWhile (/=',') u''
+-     return (decodeLenient contents, Just mime)
++ | Just (URI{ uriScheme = "data:",
++              uriPath = upath }) <- parseURI (T.unpack u) = do
++     let (mime, rest) = break (== ',') $ unEscapeString upath
++     let contents = UTF8.fromString $ drop 1 rest
++     return (decodeLenient contents, Just (T.pack mime))
+  | otherwise = do
+      let toReqHeader (n, v) = (CI.mk (UTF8.fromText n), UTF8.fromText v)
+      customHeaders <- map toReqHeader <$> getsCommonState stRequestHeaders
+diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs
+index df71ff8..45b74b5 100644
+--- a/src/Text/Pandoc/MediaBag.hs
++++ b/src/Text/Pandoc/MediaBag.hs
+@@ -28,12 +28,14 @@ import Data.Data (Data)
+ import qualified Data.Map as M
+ import Data.Maybe (fromMaybe, isNothing)
+ import Data.Typeable (Typeable)
++import Network.URI (unEscapeString)
+ import System.FilePath
+ import Text.Pandoc.MIME (MimeType, getMimeTypeDef, extensionFromMimeType)
+ import Data.Text (Text)
+ import qualified Data.Text as T
+ import Data.Digest.Pure.SHA (sha1, showDigest)
+-import Network.URI (URI (..), parseURI)
++import Network.URI (URI (..), parseURI, isURI)
++import Data.List (isInfixOf)
+ 
+ data MediaItem =
+   MediaItem
+@@ -52,9 +54,12 @@ newtype MediaBag = MediaBag (M.Map Text MediaItem)
+ instance Show MediaBag where
+   show bag = "MediaBag " ++ show (mediaDirectory bag)
+ 
+--- | We represent paths with /, in normalized form.
++-- | We represent paths with /, in normalized form.  Percent-encoding
++-- is not resolved.
+ canonicalize :: FilePath -> Text
+-canonicalize = T.replace "\\" "/" . T.pack . normalise
++canonicalize fp
++  | isURI fp = T.pack fp
++  | otherwise = T.replace "\\" "/" . T.pack . normalise $ fp
+ 
+ -- | Delete a media item from a 'MediaBag', or do nothing if no item corresponds
+ -- to the given path.
+@@ -77,22 +82,22 @@ insertMedia fp mbMime contents (MediaBag mediamap) =
+                              , mediaContents = contents
+                              , mediaMimeType = mt }
+         fp' = canonicalize fp
++        fp'' = unEscapeString $ T.unpack fp'
+         uri = parseURI fp
+-        newpath = if isRelative fp
++        newpath = if isRelative fp''
+                        && isNothing uri
+-                       && ".." `notElem` splitDirectories fp
+-                   then T.unpack fp'
++                       && not (".." `isInfixOf` fp'')
++                   then fp''
+                    else showDigest (sha1 contents) <> "." <> ext
+-        fallback = case takeExtension fp of
+-                        ".gz" -> getMimeTypeDef $ dropExtension fp
+-                        _ -> getMimeTypeDef fp
++        fallback = case takeExtension fp'' of
++                        ".gz" -> getMimeTypeDef $ dropExtension fp''
++                        _ -> getMimeTypeDef fp''
+         mt = fromMaybe fallback mbMime
+-        path = maybe fp uriPath uri
++        path = maybe fp'' (unEscapeString . uriPath) uri
+         ext = case takeExtension path of
+                       '.':e -> e
+                       _ -> maybe "" T.unpack $ extensionFromMimeType mt
+ 
+-
+ -- | Lookup a media item in a 'MediaBag', returning mime type and contents.
+ lookupMedia :: FilePath
+             -> MediaBag
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+new file mode 100644
+index 0000000..b44232b
+--- /dev/null
++++ b/test/Tests/MediaBag.hs
+@@ -0,0 +1,37 @@
++{-# LANGUAGE OverloadedStrings #-}
++module Tests.MediaBag (tests) where
++
++import Test.Tasty
++import Test.Tasty.HUnit
++-- import Tests.Helpers
++import Text.Pandoc.Class (extractMedia, fillMediaBag, runIOorExplode)
++import System.IO.Temp (withTempDirectory)
++import Text.Pandoc.Shared (inDirectory)
++import System.FilePath
++import Text.Pandoc.Builder as B
++import System.Directory (doesFileExist, copyFile)
++
++tests :: [TestTree]
++tests = [
++  testCase "test fillMediaBag & extractMedia" $
++    withTempDirectory "." "extractMediaTest" $ \tmpdir -> inDirectory tmpdir $ do
++      copyFile "../../test/lalune.jpg" "moon.jpg"
++      let d = B.doc $
++                B.para (B.image "../../test/lalune.jpg" "" mempty) <>
++                B.para (B.image "moon.jpg" "" mempty) <>
++                B.para (B.image "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
++                B.para (B.image "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" "" mempty)
++      runIOorExplode $ do
++        fillMediaBag d
++        extractMedia "foo" d
++      exists1 <- doesFileExist ("foo" </> "moon.jpg")
++      assertBool "file in directory is not extracted with original name" exists1
++      exists2 <- doesFileExist ("foo" </> "f9d88c3dbe18f6a7f5670e994a947d51216cdf0e.jpg")
++      assertBool "file above directory is not extracted with hashed name" exists2
++      exists3 <- doesFileExist ("foo" </> "2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua")
++      exists4 <- doesFileExist "a.lua"
++      assertBool "data uri with malicious payload gets written outside of destination dir"
++        (exists3 && not exists4)
++      exists5 <- doesFileExist ("foo" </> "d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
++      assertBool "data uri with gif is not properly decoded" exists5
++  ]
+diff --git a/test/test-pandoc.hs b/test/test-pandoc.hs
+index fcb157f..7d622eb 100644
+--- a/test/test-pandoc.hs
++++ b/test/test-pandoc.hs
+@@ -51,6 +51,7 @@ import qualified Tests.Writers.RST
+ import qualified Tests.Writers.AnnotatedTable
+ import qualified Tests.Writers.TEI
+ import qualified Tests.Writers.Markua
++import qualified Tests.MediaBag
+ import Text.Pandoc.Shared (inDirectory)
+ 
+ tests :: FilePath -> TestTree
+@@ -58,6 +59,7 @@ tests pandocPath = testGroup "pandoc tests"
+           [ Tests.Command.tests
+           , testGroup "Old" (Tests.Old.tests pandocPath)
+           , testGroup "Shared" Tests.Shared.tests
++          , testGroup "MediaBag" Tests.MediaBag.tests
+           , testGroup "Writers"
+             [ testGroup "Native" Tests.Writers.Native.tests
+             , testGroup "ConTeXt" Tests.Writers.ConTeXt.tests
diff -Nru pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch
--- pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch	1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch	2023-07-21 20:22:42.000000000 +0200
@@ -0,0 +1,98 @@
+From: John MacFarlane <j...@berkeley.edu>
+Date: Thu, 20 Jul 2023 09:26:38 -0700
+Subject: Fix new variant of the vulnerability in CVE-2023-35936.
+
+Guilhem Moulin noticed that the fix to CVE-2023-35936 was incomplete.
+An attacker could get around it by double-encoding the malicious
+extension to create or override arbitrary files.
+
+    $ echo '' >b.md
+    $ .cabal/bin/pandoc b.md --extract-media=bar
+    <p><img
+    src="bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+%2f%2e%2e%2f%2e%2e%2fb%2elua" /></p>
+    $ cat b.lua
+    print "hello"
+    $ find bar
+    bar/
+    bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+
+
+This commit adds a test case for this more complex attack and fixes
+the vulnerability.  (The fix is quite simple: if the URL-unescaped
+filename or extension contains a '%', we just use the sha1 hash of the
+contents as the canonical name, just as we do if the filename contains
+'..'.)
+
+Origin: https://github.com/jgm/pandoc/commit/eddedbfc14916aa06fc01ff04b38aeb30ae2e625
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-38745
+---
+ src/Text/Pandoc/Class/IO.hs |  2 ++
+ src/Text/Pandoc/MediaBag.hs |  7 ++++---
+ test/Tests/MediaBag.hs      | 12 +++++++++++-
+ 3 files changed, 17 insertions(+), 4 deletions(-)
+
+diff --git a/src/Text/Pandoc/Class/IO.hs b/src/Text/Pandoc/Class/IO.hs
+index 5043266..b3f2a32 100644
+--- a/src/Text/Pandoc/Class/IO.hs
++++ b/src/Text/Pandoc/Class/IO.hs
+@@ -222,6 +222,8 @@ writeMedia :: (PandocMonad m, MonadIO m)
+            -> m ()
+ writeMedia dir (fp, _mt, bs) = do
+   -- we normalize to get proper path separators for the platform
++  -- we unescape URI encoding, but given how insertMedia
++  -- is written, we shouldn't have any % in a canonical media name...
+   let fullpath = normalise $ dir </> unEscapeString fp
+   liftIOError (createDirectoryIfMissing True) (takeDirectory fullpath)
+   logIOError $ BL.writeFile fullpath bs
+diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs
+index 45b74b5..e02fc1a 100644
+--- a/src/Text/Pandoc/MediaBag.hs
++++ b/src/Text/Pandoc/MediaBag.hs
+@@ -87,16 +87,17 @@ insertMedia fp mbMime contents (MediaBag mediamap) =
+         newpath = if isRelative fp''
+                        && isNothing uri
+                        && not (".." `isInfixOf` fp'')
++                       && '%' `notElem` fp''
+                    then fp''
+-                   else showDigest (sha1 contents) <> "." <> ext
++                   else showDigest (sha1 contents) <> ext
+         fallback = case takeExtension fp'' of
+                         ".gz" -> getMimeTypeDef $ dropExtension fp''
+                         _ -> getMimeTypeDef fp''
+         mt = fromMaybe fallback mbMime
+         path = maybe fp'' (unEscapeString . uriPath) uri
+         ext = case takeExtension path of
+-                      '.':e -> e
+-                      _ -> maybe "" T.unpack $ extensionFromMimeType mt
++                      '.':e | '%' `notElem` e -> '.':e
++                      _ -> maybe "" (\x -> '.':T.unpack x) $ extensionFromMimeType mt
+ 
+ -- | Lookup a media item in a 'MediaBag', returning mime type and contents.
+ lookupMedia :: FilePath
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+index b44232b..c27a29b 100644
+--- a/test/Tests/MediaBag.hs
++++ b/test/Tests/MediaBag.hs
+@@ -19,7 +19,7 @@ tests = [
+       let d = B.doc $
+                 B.para (B.image "../../test/lalune.jpg" "" mempty) <>
+                 B.para (B.image "moon.jpg" "" mempty) <>
+-                B.para (B.image "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
++                B.para (B.image "data:image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
+                 B.para (B.image "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" "" mempty)
+       runIOorExplode $ do
+         fillMediaBag d
+@@ -34,4 +34,14 @@ tests = [
+         (exists3 && not exists4)
+       exists5 <- doesFileExist ("foo" </> "d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
+       assertBool "data uri with gif is not properly decoded" exists5
++      -- double-encoded version:
++      let e = B.doc $
++                B.para (B.image "data:image/png;base64,cHJpbnQgInB3bmVkIgo=;.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua" "" mempty)
++      runIOorExplode $ do
++        fillMediaBag e
++        extractMedia "bar" e
++      exists6 <- doesFileExist ("bar" </> "772ceca21a2751863ec46cb23db0e7fc35b9cff8.png")
++      exists7 <- doesFileExist "b.lua"
++      assertBool "data uri with double-encoded malicious payload gets written outside of destination dir"
++        (exists6 && not exists7)
+   ]
diff -Nru pandoc-2.17.1.1/debian/patches/series pandoc-2.17.1.1/debian/patches/series
--- pandoc-2.17.1.1/debian/patches/series	2022-08-13 16:27:42.000000000 +0200
+++ pandoc-2.17.1.1/debian/patches/series	2023-07-21 20:22:42.000000000 +0200
@@ -2,3 +2,5 @@
 020220531~9aff861.patch
 2001_templates_avoid_privacy_breach.patch
 2002_program_package_hint.patch
+CVE-2023-35936.patch
+CVE-2023-38745.patch
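
The double-encoding bypass described in the CVE-2023-38745 patch header can be sketched in a few lines of Haskell. This is a minimal illustration under stated assumptions, not pandoc's code: `percentDecode` is a hypothetical stand-in for `Network.URI.unEscapeString` that depends only on `base`, and `isSafeName` is a hypothetical simplification of the name check that `insertMedia` performs once both patches are applied.

```haskell
module Main (main) where

import Data.Char (chr, digitToInt, isHexDigit)
import Data.List (isInfixOf)

-- | Hypothetical stand-in for Network.URI.unEscapeString: decodes ONE layer
-- of percent-encoding and leaves malformed escapes untouched.
percentDecode :: String -> String
percentDecode ('%':a:b:rest)
  | isHexDigit a && isHexDigit b =
      chr (16 * digitToInt a + digitToInt b) : percentDecode rest
percentDecode (c:rest) = c : percentDecode rest
percentDecode []       = []

-- | Hypothetical simplification of the patched check: after one decoding
-- pass, a name is kept only if it contains neither ".." nor a leftover '%'.
-- A leftover '%' means a later decoding pass could still change the name.
isSafeName :: String -> Bool
isSafeName raw =
  let once = percentDecode raw
  in not (".." `isInfixOf` once) && '%' `notElem` once

main :: IO ()
main = do
  -- Payload suffixes taken from the test cases in the two patches.
  let single = ".lua+%2f%2e%2e%2f%2e%2e%2fa%2elua"
      double = ".lua+%252f%252e%252e%252f%252e%252e%252fb%252elua"
  putStrLn (percentDecode single)                  -- .lua+/../../a.lua
  putStrLn (percentDecode double)                  -- .lua+%2f%2e%2e%2f%2e%2e%2fb%2elua
  putStrLn (percentDecode (percentDecode double))  -- .lua+/../../b.lua
  print (map isSafeName [single, double, "moon.jpg"])  -- [False,False,True]
```

With only the `..` test of the first patch, the double-encoded name passes after one decoding pass, and the `unEscapeString` call in `writeMedia` then decodes it a second time into a traversal path. Rejecting any name that still contains a `%` closes that gap.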