Subject: CVS commit: pkgsrc/math/py-pandas
From: Adam Ciarcinski
Date: 2023-08-28 12:34:02
Message id: 20230828103402.E05CEFBDB@cvs.NetBSD.org
Log Message:
py-pandas: updated to 2.0.3
2.0.3
Fixed regressions
Bug in Timestamp.weekday() was returning incorrect results before '0000-02-29' \
(GH53738)
Fixed performance regression in merging on datetime-like columns (GH53231)
Fixed regression when DataFrame.to_string() creates extra space for string \
dtypes (GH52690)
Bug fixes
Bug in DataFrame.convert_dtypes() and Series.convert_dtypes() when trying to \
convert ArrowDtype with dtype_backend="numpy_nullable" (GH53648)
Bug in RangeIndex.union() when using sort=True with another RangeIndex (GH53490)
Bug in Series.reindex() when expanding a non-nanosecond datetime or timedelta \
Series would not fill with NaT correctly (GH53497)
Bug in read_csv() when defining dtype with bool[pyarrow] for the "c" \
and "python" engines (GH53390)
Bug in Series.str.split() and Series.str.rsplit() with expand=True for \
ArrowDtype with pyarrow.string (GH53532)
Bug in indexing methods (e.g. DataFrame.__getitem__()) where taking the entire \
DataFrame/Series would raise an OverflowError when Copy on Write was enabled and \
the length of the array was over the maximum size a 32-bit integer can hold \
(GH53616)
Bug when constructing a DataFrame with columns of an ArrowDtype with a \
pyarrow.dictionary type that reindexes the data (GH53617)
Bug when indexing a DataFrame or Series with an Index with a timestamp \
ArrowDtype would raise an AttributeError (GH53644)
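For context on the GH53616 entry above, the following is a minimal sketch of the 32-bit limit it mentions. It uses only the Python standard library, not pandas; the helper name fits_in_int32 is hypothetical and is not part of any pandas API.

```python
import ctypes

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1  # 2147483647


def fits_in_int32(length: int) -> bool:
    """Return True if `length` is representable as a signed 32-bit integer."""
    return -2**31 <= length <= INT32_MAX


# An array length one past the limit does not fit ...
too_long = INT32_MAX + 1

# ... and if it is forced into a C int32 anyway, it wraps to a negative value.
wrapped = ctypes.c_int32(too_long).value

print(fits_in_int32(INT32_MAX), fits_in_int32(too_long), wrapped)
```

This only illustrates the size limit named in the entry; it does not reproduce the pandas code path that raised the OverflowError.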
2.0.2
Fixed regressions
Fixed performance regression in GroupBy.apply() (GH53195)
Fixed regression in merge() on Windows when dtype is np.intc (GH52451)
Fixed regression in read_sql() dropping columns with duplicated column names \
(GH53117)
Fixed regression in DataFrame.loc losing MultiIndex name when enlarging object \
(GH53053)
Fixed regression in DataFrame.to_string() printing a backslash at the end of the \
first row of data, instead of headers, when the DataFrame doesn’t fit the line \
width (GH53054)
Fixed regression in MultiIndex.join() returning levels in wrong order (GH53093)
Bug fixes
Bug in arrays.ArrowExtensionArray incorrectly assigning dict instead of list for \
.type with pyarrow.map_ and raising a NotImplementedError with pyarrow.struct \
(GH53328)
Bug in api.interchange.from_dataframe() was raising IndexError on empty \
categorical data (GH53077)
Bug in api.interchange.from_dataframe() was returning DataFrames of incorrect \
sizes when called on slices (GH52824)
Bug in api.interchange.from_dataframe() was unnecessarily raising on bitmasks \
(GH49888)
Bug in merge() when merging on datetime columns on different resolutions (GH53200)
Bug in read_csv() raising OverflowError for engine="pyarrow" and \
parse_dates set (GH53295)
Bug in to_datetime() was inferring format to contain "%H" instead of \
"%I" if date contained “AM” / “PM” tokens (GH53147)
Bug in DataFrame.convert_dtypes() ignoring convert_* keywords when set to False \
with dtype_backend="pyarrow" (GH52872)
Bug in DataFrame.convert_dtypes() losing timezone for tz-aware dtypes and \
dtype_backend="pyarrow" (GH53382)
Bug in DataFrame.sort_values() raising for PyArrow dictionary dtype (GH53232)
Bug in Series.describe() treating pyarrow-backed timestamps and timedeltas as \
categorical data (GH53001)
Bug in Series.rename() not making a lazy copy when Copy-on-Write is enabled when \
a scalar is passed to it (GH52450)
Bug in pd.array() raising for NumPy array and pa.large_string or pa.large_binary \
(GH52590)
Bug in DataFrame.__getitem__() not preserving dtypes for MultiIndex partial keys \
(GH51895)
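The "%H" versus "%I" distinction in the to_datetime() entry above (GH53147) matters because only the 12-hour directive lets an AM/PM token adjust the hour. A minimal sketch, assuming a made-up sample string and using the standard library's datetime.strptime rather than pandas:

```python
from datetime import datetime

# Hypothetical sample value containing a "PM" token.
text = "2023-05-09 01:30 PM"

# %I is the 12-hour clock directive, so %p (AM/PM) adjusts the parsed hour.
twelve_hour = datetime.strptime(text, "%Y-%m-%d %I:%M %p")

# %H is the 24-hour clock directive; strptime still consumes the "PM"
# token but does not apply it to the hour.
twenty_four_hour = datetime.strptime(text, "%Y-%m-%d %H:%M %p")

print(twelve_hour.hour)       # 13
print(twenty_four_hour.hour)  # 1
```

A format inferred with "%H" would therefore silently read afternoon times as morning times, which is why the inferred format needs "%I" when AM/PM tokens are present.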
2.0.1
Fixed regressions
Fixed regression for subclassed Series when constructing from a dictionary (GH52445)
Fixed regression in SeriesGroupBy.agg() failing when grouping with categorical \
data, multiple groupings, as_index=False, and a list of aggregations (GH52760)
Fixed regression in DataFrame.pivot() changing Index name of input object (GH52629)
Fixed regression in DataFrame.resample() raising on a DataFrame with no columns \
(GH52484)
Fixed regression in DataFrame.sort_values() not resetting index when DataFrame \
is already sorted and ignore_index=True (GH52553)
Fixed regression in MultiIndex.isin() raising TypeError for Generator (GH52568)
Fixed regression in Series.describe() showing RuntimeWarning for extension dtype \
Series with one element (GH52515)
Fixed regression when adding a new column to a DataFrame when the \
DataFrame.columns was a RangeIndex and the new key was hashable but not a scalar \
(GH52652)
Bug fixes
Bug in Series.dt.days that would overflow int32 number of days (GH52391)
Bug in arrays.DatetimeArray constructor returning an incorrect unit when passed \
a non-nanosecond numpy datetime array (GH52555)
Bug in ArrowExtensionArray with duration dtype overflowing when constructed from \
data containing numpy NaT (GH52843)
Bug in Series.dt.round() when passing a freq of equal or higher resolution \
compared to the Series would raise a ZeroDivisionError (GH52761)
Bug in Series.median() with ArrowDtype returning an approximate median (GH52679)
Bug in api.interchange.from_dataframe() was unnecessarily raising on categorical \
dtypes (GH49889)
Bug in api.interchange.from_dataframe() was unnecessarily raising on large \
string dtypes (GH52795)
Bug in pandas.testing.assert_series_equal() where check_dtype=False would still \
raise for datetime or timedelta types with different resolutions (GH52449)
Bug in read_csv() casting PyArrow datetimes to NumPy when \
dtype_backend="pyarrow" and parse_dates is set causing a performance \
bottleneck in the process (GH52546)
Bug in to_datetime() and to_timedelta() when trying to convert numeric data with \
an ArrowDtype (GH52425)
Bug in to_numeric() with errors='coerce' and dtype_backend='pyarrow' with \
ArrowDtype data (GH52588)
Bug in ArrowDtype.__from_arrow__() not respecting if dtype is explicitly given \
(GH52533)
Bug in DataFrame.describe() not respecting ArrowDtype in include and exclude \
(GH52570)
Bug in DataFrame.max() and related casting different Timestamp resolutions \
always to nanoseconds (GH52524)
Bug in Series.describe() not returning ArrowDtype with pyarrow.float64 type with \
numeric data (GH52427)
Bug in Series.dt.tz_localize() incorrectly localizing timestamps with ArrowDtype \
(GH52677)
Bug in arithmetic between np.datetime64 and np.timedelta64 NaT scalars with \
units always returning nanosecond resolution (GH52295)
Bug in logical and comparison operations between ArrowDtype and numpy masked \
types (e.g. "boolean") (GH52625)
Fixed bug in merge() when merging with ArrowDtype on one side and a NumPy dtype \
on the other side (GH52406)
Fixed segfault in Series.to_numpy() with null[pyarrow] dtype (GH52443)
Other
DataFrame created from empty dicts had columns of dtype object. It is now a \
RangeIndex (GH52404)
Series created from empty dicts had index of dtype object. It is now a \
RangeIndex (GH52404)
Implemented Series.str.split() and Series.str.rsplit() for ArrowDtype with \
pyarrow.string (GH52401)
Implemented most str accessor methods for ArrowDtype with pyarrow.string (GH52401)
Supplying a non-integer hashable key that tests False in api.types.is_scalar() \
now raises a KeyError for RangeIndex.get_loc(), like it does for \
Index.get_loc(). Previously it raised an InvalidIndexError (GH52652).
Files: