[Bug libstdc++/102259] New: ifstream::read(…, count) fails when count >= 2^31 on darwin

2021-09-09 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

Bug ID: 102259
   Summary: ifstream::read(…, count) fails when count >= 2^31 on
darwin
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mimomorin at gmail dot com
  Target Milestone: ---

Created attachment 51431
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51431&action=edit
Testcase for ifstream::read(…, count >= 2^31)

I tried to read a large file using `ifstream::read` on a Mac, but it fails to
read any bytes when count >= 2^31. Note that the system is 64-bit and
`std::streamsize` is 8 bytes wide.
Here is a testcase.

#include <fstream>
#include <iostream>
int main() {
    std::ifstream is{"2GB.bin", std::ios::binary}; // filesize >= 2^31 bytes
    auto buffer = new char[1LL << 31];
    is.read(buffer, 1LL << 31);
    std::cout << is.good() << " (" << is.gcount() << " bytes)\n";
    // Expected output: "1 (2147483648 bytes)"
    // Actual output (on Mac): "0 (0 bytes)"
}

My system is macOS 10.15 running on x86_64 Mac. The testcase failed on
Homebrew's GCC (ver. 6, 9, 10, 11) and MacPorts' GCC (ver. 6), but it succeeded
on LLVM Clang (trunk) and Apple Clang (ver. 12).

`ifstream::read(…, count)` works fine when count < 2^31. So if we split 
is.read(buffer, 1LL << 31);
into
is.read(buffer, (1LL << 31) - 1);
is.read(buffer + (1LL << 31) - 1, 1);
then everything goes OK.
Additionally, `istringstream::read(…, count >= 2^31)` works fine both on GCC
and Clang.

I doubt such a simple issue could have gone unnoticed, so maybe I've missed something.

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2021-09-10 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #2 from Michel Morin  ---
Whoa, darwin's (and FreeBSD's too?) `read(…, …, nbyte)` fails when nbyte >=
2^31! This is the culprit, I think. 

I also found the following description in FreeBSD's manpage of read
(https://www.unix.com/man-page/FreeBSD/2/read/):

ERRORS
[EINVAL] The value nbytes is greater than INT_MAX.

Given that the testcase works fine when compiled with Clang, libc++ presumably
has some workaround for this.

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2021-09-10 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #4 from Michel Morin  ---
I googled and found that Rust and Python had the same issue (and fixed it): 
[Rust]
https://github.com/rust-lang/rust/issues/38590
(PR: https://github.com/ziglang/zig/pull/6333)
[Python]
https://bugs.python.org/issue24658
(PR: https://github.com/python/cpython/pull/1705)

These bug reports also say that darwin's `write(…, …, nbyte)` fails when
nbyte > INT_MAX, and I confirmed that.

> Maybe they do a loop around the read for sizes >= INT_MAX.

Sounds good to me.

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2021-09-10 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #5 from Michel Morin  ---
I put a wrong link for Rust's PR. 
The correct link is https://github.com/rust-lang/rust/pull/38622 .

[Bug c++/77565] `typdef int Int;` --> did you mean `typeof`?

2021-09-10 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77565

--- Comment #3 from Michel Morin  ---
There is a typo in this PR's Description. Here is a more readable one:

When the `typeof` GCC extension is enabled (e.g. with the `-std=gnu++**`
options), we get strange did-you-mean suggestions.

`typdef int Int;` ->
error: 'typdef' does not name a type; did you mean 'typeof'?

`typedeff int Int;` ->
error: 'typedeff' does not name a type; did you mean 'typeof'?

Confirmed on GCC 11.2.

[Bug c++/77565] `typdef int Int;` --> did you mean `typeof`?

2021-09-11 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77565

--- Comment #4 from Michel Morin  ---
The reason seems to be that `cp_keyword_starts_decl_specifier_p` in
`cp/parser.c` does not include `RID_TYPEDEF`.

Note that `typedef` is a decl-specifier ([dcl.spec] p.1 in the Standard).

[Bug c++/77565] `typdef int Int;` --> did you mean `typeof`?

2021-09-13 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77565

--- Comment #5 from Michel Morin  ---
Confirmed the fix. Will send a patch to ML.

> I had use -std=c++98

This comment helped me a lot in understanding what's going on. Thanks!

[Bug libstdc++/109891] New: Null pointer special handling in ostream's operator << for C-strings

2023-05-17 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891

Bug ID: 109891
   Summary: Null pointer special handling in ostream's operator <<
for C-strings
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mimomorin at gmail dot com
  Target Milestone: ---

This code

#include <iostream>
int main() { std::cout << (char*)nullptr; }

does not cause any bad things (like SEGV), because libstdc++'s
operator<<(ostream, char const*) has special handling of null pointers: 

template<typename _CharT, typename _Traits>
  inline basic_ostream<_CharT, _Traits>&
  operator<<(basic_ostream<_CharT, _Traits>& __out, const _CharT* __s)
  {
    if (!__s)
      __out.setstate(ios_base::badbit);
    else
      __ostream_insert(...);
    return __out;
  }

Passing a null pointer to this operator is a precondition violation, so the
current implementation perfectly conforms to the C++ standard. But, why don't
we remove this special handling? By doing so, we get
- better interoperability with tooling (i.e. sanitizers can find the bug
easily)
- an unnoticeable performance improvement
and we lose
- deterministic behavior (of poor code) on a particular stdlib
I believe the first gain matters more than the loss.

It seems that old special handling `if (s == NULL) s = "(null)";`
(https://github.com/gcc-mirror/gcc/blob/6599da0/libio/iostream.cc#L638) was
removed in GCC 3.0, but reintroduced (in the current form) in GCC 3.2 in
response to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=6518 .

[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings

2023-05-17 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891

--- Comment #3 from Michel Morin  ---
From the safety point of view, I agree with you. But, at the same time, I
thought that detectable UB (with the help of sanitizers) is more useful than a
silent bug.

How about `throw`ing as in std::string's constructor?

[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings

2023-05-18 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891

--- Comment #6 from Michel Morin  ---
True. "Detectable" is not correct; it's "maybe-detectable" at most, and the
bug is not silent either. In the code I checked, the buggy statement
(`std::cout << NullCharPtr;`) was the last print to std::cout, so I failed to
see the side effect.
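To illustrate why the side effect is easy to miss: once badbit is set, later insertions are silently discarded, so when the null-pointer insertion happens to be the last output, nothing visible changes. Here is a minimal, well-defined demo (setting badbit explicitly rather than via a null pointer, to avoid the precondition violation; `badbit_demo` is a made-up name):

```cpp
#include <sstream>
#include <string>

// Made-up demo: after badbit is set (as libstdc++ does for a null
// char*), subsequent insertions are silently discarded because the
// output sentry fails on a bad stream.
std::string badbit_demo()
{
    std::ostringstream os;
    os << "before ";
    os.setstate(std::ios_base::badbit);
    os << "after"; // ignored: the stream is in a failed state
    return os.str(); // "before " only
}
```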

The patchlet using `_GLIBCXX_DEBUG_PEDASSERT` works fine. Actually I would
prefer `_GLIBCXX_DEBUG_ASSERT` (because I've been using `_GLIBCXX_DEBUG` but
never `_GLIBCXX_DEBUG_PEDANTIC`), but I guess using `_GLIBCXX_DEBUG_PEDASSERT`
rather than `_GLIBCXX_DEBUG_ASSERT` in this case is a deliberate choice.

[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings

2023-05-20 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891

--- Comment #9 from Michel Morin  ---
> (which even mentions the std::string((const char*)nullptr) case):
> https://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode_semantics.html

Oh, that's good to know. Understood that PEDASSERT fits better.

> can we add a "pednonnull" attribute or something to produce a -Wnonnull 
> warning like the nonnull attribute but w/o affecting code generation as well?

I think such an attribute (like Clang's _Nonnull) would be a nice addition. So
I grepped for Nonnull in libc++, but strangely there are __no__ uses of
_Nonnull/__nonnull. I only found a few __gnu__::__nonnull__ in
__memory_resource/memory_resource.h. In libc++, the std::string constructors
have assertions for the null-pointer check, but no attributes.

[Bug libstdc++/110190] New: regex: incorrect match results on DFA engines

2023-06-09 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110190

Bug ID: 110190
   Summary: regex: incorrect match results on DFA engines
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mimomorin at gmail dot com
  Target Milestone: ---

libstdc++ produces incorrect matches with the sample code in
https://en.cppreference.com/w/cpp/regex/syntax_option_type . (Though that
page's description of the "leftmost longest rule" is not correct, its
expected results are fine.)

Here is a slightly shorter version:
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string text = "regexp";
    std::regex re(".*(ex|gexp)", std::regex::extended);
    std::smatch m;
    std::regex_search(text, m, re);
    std::cout << m[0] << '\n'; // => should be "regexp" on DFA engines
}
This should print "regexp", but libstdc++ prints "regex". (libc++ works fine.)
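For contrast, the same input under the default ECMAScript grammar is portable: ECMAScript alternation is ordered (the first matching alternative wins) rather than leftmost-longest, so there "regex" is the correct answer and libstdc++ and libc++ agree. A sketch (`ecma_match` is a made-up name):

```cpp
#include <regex>
#include <string>

// With the default ECMAScript grammar, "regex" is the correct result:
// `.*` backtracks until `ex` matches at position 3, and `ex` is tried
// before `gexp`, unlike the POSIX leftmost-longest rule.
std::string ecma_match(const std::string& text)
{
    std::regex re(".*(ex|gexp)"); // ECMAScript grammar by default
    std::smatch m;
    return std::regex_search(text, m, re) ? m[0].str() : std::string{};
}
```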

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2024-12-07 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #11 from Michel Morin  ---
Brilliant, I appreciate it!
I tested with an 8 GB file and confirmed that this fixes the issue on both
Intel and Apple silicon Macs.

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2024-12-20 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #14 from Michel Morin  ---
Thanks, the committed version works fine too.
Note that `read` fails only when n > INT_MAX (strictly greater, not >=),
so we can define _GLIBCXX_MAX_READ_SIZE simply as __INT_MAX__.

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2024-12-20 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #15 from Michel Morin  ---
FreeBSD's `read` manpage has been updated recently:

  https://github.com/freebsd/freebsd-src/commit/3e95158
  [2024-02-10] read.2: Describe debug.iosize_max_clamp
  … read() … will succeed unless:
  - The value nbytes is greater than INT_MAX.
  + The value nbytes is greater than SSIZE_MAX
  + (or greater than INT_MAX, if the sysctl debug.iosize_max_clamp is
non-zero).

Then I checked the source code to find the related changes. It turns out that
the manpage hadn't been updated to reflect the code changes for over ten years.

The configuration `iosize_max_clamp` (defaulting to 1) was added in FreeBSD
ver. 10:
  https://github.com/freebsd/freebsd-src/commit/526d0bd
  [2012-02-21] Fix found places where uio_resid is truncated to int.

The default was changed to 0 in FreeBSD ver. 11:
https://github.com/freebsd/freebsd-src/commit/cd4dd44
[2013-10-15] By default, allow up to SSIZE_MAX i/o for non-devfs files.

While the default is now "don't clamp to INT_MAX", users can still set
`iosize_max_clamp` to 1 through sysctl. So I think applying the fix without
conditioning on FreeBSD versions (i.e. the current fix) makes sense!

[Bug libstdc++/118162] New: ofstream::write(…, count) fails when count >= 2^31 on darwin

2024-12-20 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118162

Bug ID: 118162
   Summary: ofstream::write(…, count) fails when count >= 2^31 on
darwin
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mimomorin at gmail dot com
  Target Milestone: ---

Created attachment 59937
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59937&action=edit
Testcase for ofstream::write(…, count >= 2^31)

This is a companion PR to PR102259.
On macOS, `ofstream::write(…, count >= 2^31)` fails without partial write.
Here is the attached test case:
#include <fstream>
#include <iostream>
int main() {
    auto buffer = new char[1LL << 31];
    std::ofstream os{"2GiB.bin", std::ios::binary};
    os.write(buffer, 1LL << 31);
    std::cout << os.good() << " (" << os.tellp() << " bytes)\n";
    // Expected output: "1 (2147483648 bytes)"
    // Actual output on macOS 11 or newer: "0 (-1 bytes)"
}

Here are the manpages for `write` and `writev` in macOS and FreeBSD:

[macOS manpage]
https://keith.github.io/xcode-man-pages/write.2.html
  … write() … will fail and the file pointer will remain unchanged if:
  The value provided for nbyte exceeds INT_MAX.
  … writev() … may also return the following errors:
  The sum of the iov_len values in the iov array overflows a 32-bit integer.

[FreeBSD manpage]
https://man.freebsd.org/cgi/man.cgi?query=writev
  … write(), writev() … will fail and the file pointer will remain unchanged:
  The value nbytes is greater than SSIZE_MAX
  (or greater than INT_MAX, if the sysctl debug.iosize_max_clamp is non-zero).
  … writev() … may return one of the following errors:
  The sum of the iov_len values is greater than SSIZE_MAX
  (or greater than INT_MAX, if the sysctl debug.iosize_max_clamp is non-zero).

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2024-12-23 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #18 from Michel Morin  ---
I tested on old mac systems, including the 32-bit version of Mac OS X 10.5, and
confirmed that the `read` syscall with count = INT_MAX does not trigger EINVAL.
(Additionally, the same applies to the `write` syscall.)

[Bug libstdc++/118162] ofstream::write(…, count) fails when count >= 2^31 on darwin

2024-12-20 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118162

--- Comment #1 from Michel Morin  ---
Strictly speaking, the `writev` syscall with large counts (and hence the
attached testcase) succeeds on macOS 10.xx. It seems that the restriction
described in the manpage ("… return the following errors … the sum of the
iov_len values in the iov array overflows a 32-bit integer") is only
implemented on macOS 11 and later.
But I think applying the fix without conditioning on macOS versions is
beneficial, since the `write` syscall has the restriction (i.e. "… will fail …
if the value provided for nbyte exceeds INT_MAX") on all macOS versions.

[Bug libstdc++/102259] ifstream::read(…, count) fails when count >= 2^31 on darwin

2024-12-21 Thread mimomorin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102259

--- Comment #17 from Michel Morin  ---
> I thought I saw some docs saying >= INT_MAX fails, but maybe I'm wrong. 
> The Rust change uses INT_MAX - 1

The comment in the Rust code says
  On OSX ... by rejecting any read with a size larger than or equal to INT_MAX

But at least on the mac systems I tested (from 10.15 to 14), the read syscall
and istream::read work fine for count = INT_MAX.
If you'd like me to test on an old mac (e.g. 10.7), please let me know (I can
test it after the weekend).