./math/py-pandas, Python Data Analysis Library


Branch: CURRENT, Version: 2.1.3, Package name: py311-pandas-2.1.3, Maintainer: bad

pandas is an open source, BSD-licensed library providing
high-performance, easy-to-use data structures and data analysis tools
for the Python programming language.

Required to run:
[graphics/py-matplotlib] [devel/py-setuptools] [time/py-dateutil] [time/py-pytz] [databases/py-sqlite3] [math/py-scipy] [math/py-numpy] [math/py-numexpr] [math/py-bottleneck] [lang/python37] [math/py-tables]

Required to build:
[pkgtools/cwrappers] [devel/py-test-runner]

Master sites:

Filesize: 4172.71 KB

CVS history:

   2023-11-11 11:04:38 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-pandas: updated to 2.1.3

Pandas 2.1.3

This is a patch release in the 2.1.x series and includes some regression and bug \ 
fixes, and a security fix. We recommend that all users upgrade to this version.
   2023-10-29 18:39:51 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-pandas: updated to 2.1.2



Reverted deprecation of fill_method=None in DataFrame.pct_change(), \ 
Series.pct_change(), DataFrameGroupBy.pct_change(), and \ 
SeriesGroupBy.pct_change(); the values 'backfill', 'bfill', 'pad', and 'ffill' \ 
are still deprecated (GH 53491)
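
To make the fill_method distinction above concrete, here is a minimal plain-Python sketch. It does not use pandas; pct_change_sketch is a hypothetical helper that only mimics the idea of filling (or not filling) missing values before computing the change between consecutive elements, and it makes no claim about how pandas implements pct_change().

```python
import math


def pct_change_sketch(values, fill_method=None):
    """Fractional change between consecutive values (illustrative only).

    fill_method=None leaves missing values (NaN) untouched, so they
    propagate into the result. fill_method="ffill" first carries the last
    valid value forward, which is the behaviour the log entry lists as
    still deprecated.
    """
    data = list(values)
    if not data:
        return []
    if fill_method == "ffill":
        last = math.nan
        filled = []
        for v in data:
            if not math.isnan(v):
                last = v
            filled.append(last)
        data = filled
    # The first element has no predecessor, so its change is undefined.
    out = [math.nan]
    for prev, cur in zip(data, data[1:]):
        # Note: a zero predecessor raises ZeroDivisionError in plain Python.
        out.append(cur / prev - 1)
    return out


no_fill = pct_change_sketch([10.0, math.nan, 12.0])
forward_filled = pct_change_sketch([10.0, math.nan, 12.0], fill_method="ffill")
```

Without filling, every result around the gap is NaN; with forward fill, the gap is silently treated as "no change" and the next step is computed against the carried value.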

Fixed regressions

Fixed regression in DataFrame.join() where result has missing values and dtype \ 
is arrow backed string (GH 55348)
Fixed regression in rolling() where non-nanosecond index or on column would \ 
produce incorrect results (GH 55026, GH 55106, GH 55299)
Fixed regression in DataFrame.resample() which was extrapolating back to origin \ 
when origin was outside its bounds (GH 55064)
Fixed regression in DataFrame.sort_index() which was not sorting correctly when \ 
the index was a sliced MultiIndex (GH 55379)
Fixed regression in DataFrameGroupBy.agg() and SeriesGroupBy.agg() where if the \ 
option compute.use_numba was set to True, groupby methods not supported by the \ 
numba engine would raise a TypeError (GH 55520)
Fixed performance regression with wide DataFrames, typically involving methods \ 
where all columns were accessed individually (GH 55256, GH 55245)
Fixed regression in merge_asof() raising TypeError for by with datetime and \ 
timedelta dtypes (GH 55453)
Fixed regression in read_parquet() when reading a file with a string column \ 
consisting of more than 2 GB of string data and using the "string" \ 
dtype (GH 55606)
Fixed regression in DataFrame.to_sql() not roundtripping datetime columns \ 
correctly for sqlite when using detect_types (GH 55554)
Fixed regression in construction of certain DataFrame or Series subclasses (GH 54922)
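
The to_sql() entry above refers to sqlite's detect_types setting. The sketch below uses only the standard-library sqlite3 module (no pandas) to show what that setting changes when a datetime is written and read back. The table name, the roundtrip helper, and the adapter/converter pair are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime

# Explicit adapter/converter pair, registered globally for the sqlite3 module.
# The adapter stores datetimes as ISO text; the converter parses that text
# back for columns declared as "timestamp".
sqlite3.register_adapter(datetime, lambda dt: dt.isoformat(" "))
sqlite3.register_converter(
    "timestamp", lambda raw: datetime.fromisoformat(raw.decode())
)


def roundtrip(value, detect_types):
    """Write one datetime to an in-memory table and read it back."""
    con = sqlite3.connect(":memory:", detect_types=detect_types)
    try:
        con.execute("CREATE TABLE events (ts timestamp)")
        con.execute("INSERT INTO events VALUES (?)", (value,))
        return con.execute("SELECT ts FROM events").fetchone()[0]
    finally:
        con.close()


stamp = datetime(2023, 10, 29, 18, 39, 51)
plain = roundtrip(stamp, 0)                        # comes back as text
typed = roundtrip(stamp, sqlite3.PARSE_DECLTYPES)  # comes back as datetime
```

With detect_types left at 0 the value is returned as a string; with PARSE_DECLTYPES the declared column type selects the converter and the original datetime is recovered.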

Bug fixes

Fixed bug in DataFrameGroupBy reductions not preserving object dtype when \ 
infer_string is set (GH 55620)
Fixed bug in SeriesGroupBy.value_counts() returning incorrect dtype for string \ 
columns (GH 55627)
Fixed bug in Categorical.equals() if other has arrow backed string dtype (GH 55364)
Fixed bug in DataFrame.__setitem__() not inferring string dtype for \ 
zero-dimensional array with infer_string=True (GH 55366)
Fixed bug in DataFrame.idxmin() and DataFrame.idxmax() raising for arrow dtypes \ 
(GH 55368)
Fixed bug in DataFrame.interpolate() raising incorrect error message (GH 55347)
Fixed bug in Index.insert() raising when inserting None into Index with \ 
dtype="string[pyarrow_numpy]" (GH 55365)
Fixed bug in Series.all() and Series.any() not treating missing values correctly \ 
for dtype="string[pyarrow_numpy]" (GH 55367)
Fixed bug in Series.floordiv() for ArrowDtype (GH 55561)
Fixed bug in Series.mode() not sorting values for arrow backed string dtype (GH \ 
Fixed bug in Series.rank() for string[pyarrow_numpy] dtype (GH 55362)
Fixed bug in Series.str.extractall() for ArrowDtype dtype being converted to \ 
object (GH 53846)
Fixed bug where PDEP-6 warning about setting an item of an incompatible dtype \ 
was being shown when creating a new conditional column (GH 55025)
Silence Period[B] warnings introduced by GH 53446 during normal plotting \ 
activity (GH 55138)
Fixed bug in Series constructor not inferring string dtype when NA is the first \ 
value and infer_string is set (GH 55655)


Fixed non-working installation of optional dependency group output_formatting. \ 
Replacing underscore _ with a dash - fixes broken dependency resolution. A \ 
correct way to use now is pip install pandas[output-formatting].
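
As a rough illustration of the underscore/dash point above, the following sketch applies the extra-name normalization rule described in PEP 685 (runs of "-", "_" and "." collapse to a single "-", and the result is lowercased). The helper name normalize_extra is hypothetical, and whether this rule is the exact mechanism behind the broken resolution is an assumption.

```python
import re


def normalize_extra(name):
    """Normalize an extra name following the rule given in PEP 685."""
    return re.sub(r"[-_.]+", "-", name).lower()


old_name = "output_formatting"
new_name = normalize_extra(old_name)
requirement = f"pandas[{new_name}]"
```

Under this rule the two spellings refer to the same extra, which is why tooling that compares names without normalizing them can fail to match the underscore form.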
   2023-10-28 21:57:26 by Thomas Klausner | Files touched by this commit (516) | Package updated
Log message:
python/wheel.mk: simplify a lot, and switch to 'installer' for installation

This follows the recommended bootstrap method (flit_core, build, installer).

However, installer installs different files than pip, so update PLISTs
for all packages using wheel.mk and bump their PKGREVISIONs.
   2023-10-15 02:05:44 by David H. Gutteridge | Files touched by this commit (1)
Log message:
py-pandas: fix minimum meson dependency pattern

We need to force a minimum with the most recent Python multi-version
   2023-10-05 06:46:05 by David H. Gutteridge | Files touched by this commit (3) | Package updated
Log message:
py-pandas: fix (sandboxed) non-default Python builds

Another issue where Meson isn't versioned in pkgsrc, so we end up with
it "helpfully" supplying the path to Python it believes is correct,
which is wrong for any non-default Python version. (The 2.1.0 version
of this package carried a similar fix, which was removed in the update
to 2.1.1; a variation of it is restored here.)

Separately, this package directly expresses a minimum Meson version, so
reflect that as well.
   2023-09-28 18:01:24 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-pandas: updated to 2.1.1

What’s new in 2.1.1 (September 20, 2023)

These are the changes in pandas 2.1.1. See Release notes for a full changelog \ 
including other versions of pandas.

Fixed regressions

Fixed regression in concat() when DataFrames have two different extension \ 
dtypes (GH 54848)
Fixed regression in merge() when merging over a PyArrow string index (GH 54894)
Fixed regression in read_csv() when usecols is given and dtypes is a dict for \ 
engine="python" (GH 54868)
Fixed regression in read_csv() when delim_whitespace is True (GH 54918, GH 54931)
Fixed regression in GroupBy.get_group() raising for axis=1 (GH 54858)
Fixed regression in DataFrame.__setitem__() raising AssertionError when setting \ 
a Series with a partial MultiIndex (GH 54875)
Fixed regression in DataFrame.filter() not respecting the order of elements for \ 
filter (GH 54980)
Fixed regression in DataFrame.to_sql() not roundtripping datetime columns \ 
correctly for sqlite (GH 54877)
Fixed regression in DataFrameGroupBy.agg() when aggregating a DataFrame with \ 
duplicate column names using a dictionary (GH 55006)
Fixed regression in MultiIndex.append() raising when appending overlapping \ 
IntervalIndex levels (GH 54934)
Fixed regression in Series.drop_duplicates() for PyArrow strings (GH 54904)
Fixed regression in Series.interpolate() raising when fill_value was given (GH 54920)
Fixed regression in Series.value_counts() raising for numeric data if bins was \ 
specified (GH 54857)
Fixed regression in comparison operations for PyArrow backed columns not \ 
propagating exceptions correctly (GH 54944)
Fixed regression when comparing a Series with datetime64 dtype with None (GH 54870)
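
The DataFrame.filter() entry in the list above concerns the order of the returned labels. The plain-Python sketch below contrasts the two possible orderings; both function names are hypothetical, and the code only illustrates the difference rather than reproducing pandas internals.

```python
def filter_in_requested_order(columns, items):
    """Keep the requested labels that exist, in the order they were requested."""
    available = set(columns)
    return [label for label in items if label in available]


def filter_in_column_order(columns, items):
    """Keep the requested labels that exist, in the order of the columns."""
    wanted = set(items)
    return [label for label in columns if label in wanted]


columns = ["a", "b", "c"]
requested = ["c", "a", "z"]  # "z" does not exist and is dropped

by_request = filter_in_requested_order(columns, requested)
by_columns = filter_in_column_order(columns, requested)
```

The first variant respects the caller's ordering, which is the behaviour the entry describes as restored; the second silently reorders the result.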

Bug fixes

Fixed bug for ArrowDtype raising NotImplementedError for fixed-size list (GH 55000)
Fixed bug in DataFrame.stack() with future_stack=True and columns a \ 
non-MultiIndex consisting of tuples (GH 54948)
Fixed bug in Series.dt.tz() with ArrowDtype where a string was returned instead \ 
of a tzinfo object (GH 55003)
Fixed bug in Series.pct_change() and DataFrame.pct_change() showing unnecessary \ 
FutureWarning (GH 54981)


Reverted the deprecation that disallowed Series.apply() returning a DataFrame \ 
when the passed-in callable returns a Series object (GH 52116)
   2023-09-02 09:19:56 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-pandas: updated to 2.1.0

   2023-08-28 12:34:02 by Adam Ciarcinski | Files touched by this commit (4) | Package updated
Log message:
py-pandas: updated to 2.0.3


Fixed regressions

Bug in Timestamp.weekday() was returning incorrect results before '0000-02-29' \ 
Fixed performance regression in merging on datetime-like columns (GH53231)
Fixed regression when DataFrame.to_string() creates extra space for string \ 
dtypes (GH52690)

Bug fixes

Bug in DataFrame.convert_dtypes() and Series.convert_dtypes() when trying to \ 
convert ArrowDtype with dtype_backend="numpy_nullable" (GH53648)
Bug in RangeIndex.union() when using sort=True with another RangeIndex (GH53490)
Bug in Series.reindex() when expanding a non-nanosecond datetime or timedelta \ 
Series would not fill with NaT correctly (GH53497)
Bug in read_csv() when defining dtype with bool[pyarrow] for the "c" \ 
and "python" engines (GH53390)
Bug in Series.str.split() and Series.str.rsplit() with expand=True for \ 
ArrowDtype with pyarrow.string (GH53532)
Bug in indexing methods (e.g. DataFrame.__getitem__()) where taking the entire \ 
DataFrame/Series would raise an OverflowError when Copy on Write was enabled and \ 
the length of the array was over the maximum size a 32-bit integer can hold \ 
Bug when constructing a DataFrame with columns of an ArrowDtype with a \ 
pyarrow.dictionary type that reindexes the data (GH53617)
Bug when indexing a DataFrame or Series with an Index with a timestamp \ 
ArrowDtype would raise an AttributeError (GH53644)


Fixed regressions

Fixed performance regression in GroupBy.apply() (GH53195)
Fixed regression in merge() on Windows when dtype is np.intc (GH52451)
Fixed regression in read_sql() dropping columns with duplicated column names \ 
Fixed regression in DataFrame.loc() losing MultiIndex name when enlarging object \ 
Fixed regression in DataFrame.to_string() printing a backslash at the end of the \ 
first row of data, instead of headers, when the DataFrame doesn’t fit the line \ 
width (GH53054)
Fixed regression in MultiIndex.join() returning levels in wrong order (GH53093)

Bug fixes

Bug in arrays.ArrowExtensionArray incorrectly assigning dict instead of list for \ 
.type with pyarrow.map_ and raising a NotImplementedError with pyarrow.struct \ 
Bug in api.interchange.from_dataframe() was raising IndexError on empty \ 
categorical data (GH53077)
Bug in api.interchange.from_dataframe() was returning DataFrames of incorrect \ 
sizes when called on slices (GH52824)
Bug in api.interchange.from_dataframe() was unnecessarily raising on bitmasks \ 
Bug in merge() when merging on datetime columns on different resolutions (GH53200)
Bug in read_csv() raising OverflowError for engine="pyarrow" and \ 
parse_dates set (GH53295)
Bug in to_datetime() was inferring format to contain "%H" instead of \ 
"%I" if date contained “AM” / “PM” tokens (GH53147)
Bug in DataFrame.convert_dtypes() ignoring convert_* keywords when set to False \ 
with dtype_backend="pyarrow" (GH52872)
Bug in DataFrame.convert_dtypes() losing timezone for tz-aware dtypes and \ 
dtype_backend="pyarrow" (GH53382)
Bug in DataFrame.sort_values() raising for PyArrow dictionary dtype (GH53232)
Bug in Series.describe() treating pyarrow-backed timestamps and timedeltas as \ 
categorical data (GH53001)
Bug in Series.rename() not making a lazy copy when Copy-on-Write is enabled when \ 
a scalar is passed to it (GH52450)
Bug in pd.array() raising for NumPy array and pa.large_string or pa.large_binary \ 
Bug in DataFrame.__getitem__() not preserving dtypes for MultiIndex partial keys \ 
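
The to_datetime() entry in the list above turns on the difference between the "%H" and "%I" format codes. A short standard-library sketch (datetime.strptime, not pandas) shows why the choice matters when the text carries an AM/PM token. It assumes a locale in which "PM" is the recognized marker.

```python
from datetime import datetime

text = "01:30 PM"

# "%H" reads the hour on a 24-hour clock; strptime then ignores "%p"
# when computing the hour, so the afternoon marker is lost.
parsed_24h = datetime.strptime(text, "%H:%M %p")

# "%I" reads the hour on a 12-hour clock, so "%p" shifts it to the afternoon.
parsed_12h = datetime.strptime(text, "%I:%M %p")

hours = (parsed_24h.hour, parsed_12h.hour)
```

Inferring "%H" for such input therefore places afternoon times in the morning, which matches the symptom the entry describes.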


Fixed regressions

Fixed regression for subclassed Series when constructing from a dictionary (GH52445)
Fixed regression in SeriesGroupBy.agg() failing when grouping with categorical \ 
data, multiple groupings, as_index=False, and a list of aggregations (GH52760)
Fixed regression in DataFrame.pivot() changing Index name of input object (GH52629)
Fixed regression in DataFrame.resample() raising on a DataFrame with no columns \ 
Fixed regression in DataFrame.sort_values() not resetting index when DataFrame \ 
is already sorted and ignore_index=True (GH52553)
Fixed regression in MultiIndex.isin() raising TypeError for Generator (GH52568)
Fixed regression in Series.describe() showing RuntimeWarning for extension dtype \ 
Series with one element (GH52515)
Fixed regression when adding a new column to a DataFrame when the \ 
DataFrame.columns was a RangeIndex and the new key was hashable but not a scalar \ 

Bug fixes

Bug in Series.dt.days that would overflow int32 number of days (GH52391)
Bug in arrays.DatetimeArray constructor returning an incorrect unit when passed \ 
a non-nanosecond numpy datetime array (GH52555)
Bug in ArrowExtensionArray with duration dtype overflowing when constructed from \ 
data containing numpy NaT (GH52843)
Bug in Series.dt.round() when passing a freq of equal or higher resolution \ 
compared to the Series would raise a ZeroDivisionError (GH52761)
Bug in Series.median() with ArrowDtype returning an approximate median (GH52679)
Bug in api.interchange.from_dataframe() was unnecessarily raising on categorical \ 
dtypes (GH49889)
Bug in api.interchange.from_dataframe() was unnecessarily raising on large \ 
string dtypes (GH52795)
Bug in pandas.testing.assert_series_equal() where check_dtype=False would still \ 
raise for datetime or timedelta types with different resolutions (GH52449)
Bug in read_csv() casting PyArrow datetimes to NumPy when \ 
dtype_backend="pyarrow" and parse_dates is set causing a performance \ 
bottleneck in the process (GH52546)
Bug in to_datetime() and to_timedelta() when trying to convert numeric data with \ 
an ArrowDtype (GH52425)
Bug in to_numeric() with errors='coerce' and dtype_backend='pyarrow' with \ 
ArrowDtype data (GH52588)
Bug in ArrowDtype.__from_arrow__() not respecting if dtype is explicitly given \ 
Bug in DataFrame.describe() not respecting ArrowDtype in include and exclude \ 
Bug in DataFrame.max() and related casting different Timestamp resolutions \ 
always to nanoseconds (GH52524)
Bug in Series.describe() not returning ArrowDtype with pyarrow.float64 type with \ 
numeric data (GH52427)
Bug in Series.dt.tz_localize() incorrectly localizing timestamps with ArrowDtype \ 
Bug in arithmetic between np.datetime64 and np.timedelta64 NaT scalars with \ 
units always returning nanosecond resolution (GH52295)
Bug in logical and comparison operations between ArrowDtype and numpy masked \ 
types (e.g. "boolean") (GH52625)
Fixed bug in merge() when merging with ArrowDtype on one side and a NumPy dtype on \ 
the other side (GH52406)
Fixed segfault in Series.to_numpy() with null[pyarrow] dtype (GH52443)
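
The Series.dt.days entry in the list above mentions an int32 overflow. The arithmetic below is a rough, pandas-free sketch of why a 32-bit day count can be too narrow once durations are stored as 64-bit counts in a unit coarser than nanoseconds. The constant names and the as_int32 helper are assumptions for illustration, and the choice of seconds as the coarser unit is an assumed example.

```python
import ctypes

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1
SECONDS_PER_DAY = 86_400
NANOSECONDS_PER_DAY = SECONDS_PER_DAY * 10**9

# Largest whole-day count representable by a 64-bit duration in each unit.
max_days_ns = INT64_MAX // NANOSECONDS_PER_DAY
max_days_s = INT64_MAX // SECONDS_PER_DAY


def as_int32(value):
    """Reinterpret an integer as a signed 32-bit value (wraps on overflow)."""
    return ctypes.c_int32(value).value


fits_ns = max_days_ns <= INT32_MAX   # nanosecond durations fit in int32 days
fits_s = max_days_s <= INT32_MAX     # second-resolution durations do not
wrapped = as_int32(INT32_MAX + 1)    # one day past the limit wraps negative
```

With nanosecond resolution the day count always fits in 32 bits, so the overflow can only appear with coarser units.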


DataFrame created from empty dicts had columns of dtype object. It is now a \ 
RangeIndex (GH52404)
Series created from empty dicts had index of dtype object. It is now a \ 
RangeIndex (GH52404)
Implemented Series.str.split() and Series.str.rsplit() for ArrowDtype with \ 
pyarrow.string (GH52401)
Implemented most str accessor methods for ArrowDtype with pyarrow.string (GH52401)
Supplying a non-integer hashable key that tests False in api.types.is_scalar() \ 
now raises a KeyError for RangeIndex.get_loc(), like it does for \ 
Index.get_loc(). Previously it raised an InvalidIndexError (GH52652).