[Numpy-discussion] Changes to `np.loadtxt`

Sebastian Berg Tue, 08 Feb 2022 06:09:25 -0800

Hi all,

just a brief heads up that:


    https://github.com/numpy/numpy/pull/20580

is now merged.  This moves `np.loadtxt` to C, mainly to make it much
faster.  There are also some other improvements and changes, though:

* It now accepts `quotechar='"'` to support Excel dialect CSV.
* Parsing of some numbers is stricter (e.g. `_` separators and hex
  floats are no longer accepted by default).
* `max_rows` now actually counts rows and not lines.  A warning
  is given if this makes a difference (blank lines).
* Some exceptions will change; parsing failures now (almost) always
  give an informative `ValueError`.
* `converters=callable` is now valid to provide a single converter
  for all columns.
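
A minimal sketch of how the changes listed above might look from the
user side.  It assumes a NumPy build that already includes the merged
PR; the sample data and the `decimal_comma` helper are made up for
illustration and are not taken from the PR itself.

```python
import io
import warnings

import numpy as np  # assumes a NumPy release that includes the merged PR

# quotechar: the quoted field keeps its embedded delimiter.
quoted = io.StringIO('alpha,"beta, gamma"\ndelta,"epsilon"\n')
labels = np.loadtxt(quoted, delimiter=",", quotechar='"',
                    dtype="U16", encoding="utf-8")

# converters=callable: a single converter applied to every column.
# encoding is passed explicitly so the converter receives str, not bytes.
def decimal_comma(field):
    """Hypothetical helper: parse numbers written with a decimal comma."""
    return float(field.replace(",", "."))

eu = io.StringIO("1,5;2,25\n3,0;4,75\n")
values = np.loadtxt(eu, delimiter=";", converters=decimal_comma,
                    encoding="utf-8")

# max_rows counts data rows, so the blank line is not counted.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the blank-line warning
    first_two = np.loadtxt(io.StringIO("1 2\n\n3 4\n5 6\n"), max_rows=2)

# Stricter parsing: underscores in numbers are rejected by default.
try:
    np.loadtxt(io.StringIO("1_000\n"))
    strict_error = None
except ValueError as exc:
    strict_error = exc
```

Passing `encoding="utf-8"` here is a choice made for the example: with
the `encoding="bytes"` default, converters are handed bytes rather than
str, so `decimal_comma` would need to be written differently.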

Please test, and let us know if there are any issues or followups you
would like to see.

We do have possible followups planned:
* Consider deprecating the `encoding="bytes"` default which exists
  for Python 2 compatibility.
* Consider renaming `skiprows` to the more precise `skip_lines`.

Moving to C unlocks possible further improvements, such as full
`csv.Dialect` support.  We do not have this on the roadmap, but such
contributions are possible now.
Similarly, it should be possible to rewrite `genfromtxt` based on this
work.

Cheers,

Sebastian

