* Fay Stegerman <[email protected]> [2024-08-23 17:38]:
> * Fay Stegerman <[email protected]> [2024-08-22 23:12]:
> > * Alan Coopersmith <[email protected]> [2024-08-22 20:56]:
> > > -------- Forwarded Message --------
> > > Subject: [Security-announce][CVE-2024-8088] Infinite loop when iterating
> > > over zip archive entry names
> > > Date: Thu, 22 Aug 2024 13:40:20 -0500
> > > From: Seth Larson <[email protected]>
> > > Reply-To: [email protected]
> > > To: [email protected]
> > >
> > > There is a HIGH severity vulnerability affecting the CPython "zipfile"
> > > module.
> > >
> > > When iterating over names of entries in a zip archive (for example,
> > > methods
> > > of "zipfile.ZipFile" like "namelist()", "iterdir()", "extractall()", etc)
> > > the process can be put into an infinite loop with a maliciously crafted
> > > zip archive. This defect applies when reading only metadata or extracting
> > > the contents of the zip archive. Programs that are not handling
> > > user-controlled zip archives are not affected.
> > >
> > > Please see the linked CVE ID for the latest information on affected
> > > versions:
> > >
> > > * https://www.cve.org/CVERecord?id=CVE-2024-8088
> > > * https://github.com/python/cpython/pull/122906
> > > * https://github.com/python/cpython/issues/122905
> >
> > A small correction/addendum based on reading the vulnerability report and
> > the PR
> > that fixes this (as well as being quite familiar with Python zipfile.ZipFile
> > internals and confused how this would affect it): it's not zipfile.ZipFile
> > and
> > its methods that are affected, at least not directly, but zipfile.Path. The
> > issue being this code in zipfile._path._ancestry():
> >
> > path = path.rstrip(posixpath.sep)
> > while path and path != posixpath.sep:
> > yield path
> > path, tail = posixpath.split(path)
> >
> > Which results in an infinite loop because for example posixpath.split("//")
> > ==
> > ("//", "") but "//" != posixpath.sep:
> >
> > >>> it = zipfile._path._parents("//foo")
> > >>> next(it)
> > '//'
> > >>> next(it)
> > '//'
> > >>> next(it)
> > '//'
> >
> > The infinite loop has been fixed by sanitising the paths.
>
> Forgot to mention this: the infinite loop is triggered when zipfile.Path adds
> "implied directories" -- using _parents(), which calls _ancestry() -- in the
> overridden .namelist() for the custom zipfile.ZipFile subclass it wraps.
> Which
> is (indirectly) used by almost all of the zipfile.Path methods like
> .iterdir(),
> .glob(), .exists(), .joinpath() etc.
>
> >>> zf = zipfile.ZipFile(io.BytesIO(), "w")
> >>> zf.filename = "foo.zip"
> >>> zf.writestr("a/b/c", "abc")
> >>> zf.writestr("d/e", "de")
> >>> zf.namelist()
> ['a/b/c', 'd/e']
> >>> zf.__class__
> <class 'zipfile.ZipFile'>
>
> >>> p = zipfile.Path(zf)
> >>> p.root.namelist()
> ['a/b/c', 'd/e', 'a/b/', 'a/', 'd/']
> >>> list(p.iterdir())
> [Path('foo.zip', 'a/'), Path('foo.zip', 'd/')]
> >>> zf.__class__
> <class 'zipfile._path.CompleteDirs'>
>
> >>> zf.writestr("//oops", "oops")
> >>> # infinite loop via joinpath -> resolve_dir -> _name_set -> namelist ->
> >>> # _implied_dirs -> _ancestry
> >>> p / "foo"
>
> As zipfile.Path modifies the class of the original ZipFile, calling
> .namelist()
> or .extractall() on the original ZipFile used to create the Path afterwards is
> also affected even though zipfile.ZipFile as such is not.
I just reported [1] that the patch introduced a regression:
>>> import io, zipfile
>>> zf = zipfile.ZipFile(io.BytesIO(), "w")
>>> zf.writestr("d:/foo", "bar")
>>> zf.extractall("a")
>>> open("a/d:/foo").read()
'bar'
>>> p = zipfile.Path(zf)
>>> x = p / "d" / "foo"
>>> y = p / "d:" / "foo"
>>> list(p.iterdir()) # before: [Path(None, 'd:/')]
[Path(None, 'd/')]
>>> p.root.namelist() # before: ['d:/foo', 'd:/']
['d/foo', 'd/']
>>> x.exists() # before: False
True
>>> y.exists() # before: True
False
>>> zf.extractall("b") # before: worked like above
KeyError: "There is no item named 'd/foo' in the archive"
>>> x.read_text() # before: FileNotFoundError
KeyError: "There is no item named 'd/foo' in the archive"
>>> y.read_text() # before: worked
FileNotFoundError: ...
- Fay
[1] https://github.com/python/cpython/issues/123270