Micro-optimizations for the statistics module #152618

Merged
rhettinger merged 2 commits into python:main from rhettinger:opt_stats
Jun 29, 2026
Conversation

@rhettinger

@rhettinger rhettinger commented Jun 29, 2026

Contributor

Remove boundmethod optimizations that are no longer performant:

Baseline:
% ./python.exe -m timeit -s 'from statistics import _sum as sm, NormalDist as ND' -s 'data=ND().samples(1000)' 'sm(data)'
1000 loops, best of 5: 293 usec per loop

Patched:
% ./python.exe -m timeit -s 'from statistics import _sum as sm, NormalDist as ND' -s 'data=ND().samples(1000)' 'sm(data)'
1000 loops, best of 5: 282 usec per loop
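
For readers unfamiliar with the idiom, "boundmethod optimization" refers to hoisting a method lookup out of a loop. The following is a minimal sketch of the two styles, using hypothetical helper names (`tally_cached`, `tally_direct`) rather than the actual `_sum` code. Relative timings depend on the interpreter version, so no numbers are claimed here.

```python
from timeit import timeit

def tally_cached(values):
    # Older idiom: resolve the bound method once, outside the loop.
    counts = {}
    counts_get = counts.get
    for v in values:
        counts[v] = counts_get(v, 0) + 1
    return counts

def tally_direct(values):
    # Plain attribute access on every iteration.
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return counts

data = [i % 7 for i in range(1000)]
# Both styles must produce identical results; only the lookup cost differs.
assert tally_cached(data) == tally_direct(data)

if __name__ == "__main__":
    for fn in (tally_cached, tally_direct):
        print(fn.__name__, timeit(lambda: fn(data), number=2000))
```

If the two timings are close on a given build, the cached form only costs readability, which is presumably the motivation for removing it.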

Replace x ** 2 with x * x:

Baseline:
% ./python.exe -m timeit -s 'from statistics import _kernel_specs as ks' -s 'pdf = ks["quartic"]["pdf"]' 'pdf(0.1234)'
5000000 loops, best of 5: 73 nsec per loop

Patched:
% ./python.exe -m timeit -s 'from statistics import _kernel_specs as ks' -s 'pdf = ks["quartic"]["pdf"]' 'pdf(0.1234)'
5000000 loops, best of 5: 50.2 nsec per loop

Reduction in strength from `x ** 2` to `x * x`.
The latter is faster and more accurate.
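
One way to read the accuracy claim: IEEE 754 multiplication is correctly rounded, whereas the C library `pow()` behind float `x ** 2` carries no such guarantee (on many platforms the two give the same answer anyway). The sketch below checks both against an exact rational square; the helper name and the sample data are assumptions made for illustration.

```python
from fractions import Fraction
import random

random.seed(152618)
xs = [random.uniform(-1.0, 1.0) for _ in range(1000)]

def is_correctly_rounded_square(x, y):
    # Fraction(x) is exact, so float() of its square is the float
    # nearest to the true mathematical value of x squared.
    return float(Fraction(x) ** 2) == y

mult_ok = sum(is_correctly_rounded_square(x, x * x) for x in xs)
pow_ok = sum(is_correctly_rounded_square(x, x ** 2) for x in xs)

# mult_ok should equal len(xs); pow_ok depends on the platform's libm.
print(mult_ok, pow_ok, len(xs))
```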

@rhettinger rhettinger self-assigned this Jun 29, 2026
@rhettinger rhettinger added performance Performance or resource usage skip issue skip news labels Jun 29, 2026
Comment thread Lib/statistics.py
 @register('quartic', 'biweight')
 def quartic_kernel():
-    pdf = lambda t: 15/16 * (1.0 - t * t) ** 2
+    pdf = lambda t: 15/16 * (u := 1.0 - t * t) * u
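
As a quick sanity check on the change above, the two forms can be compared side by side. This is a sketch only: `old_pdf` and `new_pdf` are illustrative names, and the tolerance allows for the slightly different rounding order (`15/16 * u**2` versus `(15/16 * u) * u`).

```python
from math import isclose

# The two lambdas from the diff, under illustrative names.
old_pdf = lambda t: 15/16 * (1.0 - t * t) ** 2
new_pdf = lambda t: 15/16 * (u := 1.0 - t * t) * u  # walrus binds u inside the lambda

# Sample the kernel's support [-1, 1] and compare the two forms.
ts = [i / 100 for i in range(-100, 101)]
agree = all(
    isclose(old_pdf(t), new_pdf(t), rel_tol=1e-12, abs_tol=1e-15)
    for t in ts
)
print(agree)
```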

Contributor

This does not improve readability. Do you have a benchmark for the gain of this change?

@rhettinger rhettinger Jun 29, 2026

Contributor Author

Yes. The timings were in the individual commit messages. I just moved them to the top message in Conversation.

It is not a beautiful edit, but it isn't terrible either. Note, besides giving a nice speed-up, the reduction in strength from x ** 2 to x * x also improves accuracy.

Contributor

@rhettinger If it improves accuracy, it changes behavior. Can you add a news entry and a test?

(since you already merged the PR, this should be a followup PR)

eendebakpt added a commit to eendebakpt/cpython that referenced this pull request Jun 29, 2026
…patch

Builds on PR python#152618 (quartic walrus square; drop bound-method caching
in _sum/_ss). groupby() yields type-homogeneous groups, so resolve
typ.as_integer_ratio once per group instead of calling _exact_ratio()
on every value. The dispatch lives in a new _exact_ratios(typ, values)
generator so the accumulation loops stay unchanged; both _sum and _ss
use it. Rare types without as_integer_ratio fall back to _exact_ratio
(preserving its TypeError and Integral-ABC path); NAN/INF handled
inline. ~4% faster than the PR on _sum, all 400 test_statistics pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@rhettinger rhettinger merged commit 55a09ed into python:main Jun 29, 2026
56 checks passed
@rhettinger rhettinger deleted the opt_stats branch June 29, 2026 19:12
eendebakpt added a commit to eendebakpt/cpython that referenced this pull request Jun 29, 2026
…patch

Builds on PR python#152618 (quartic walrus square; drop bound-method caching
in _sum/_ss). groupby() yields type-homogeneous groups, so resolve
typ.as_integer_ratio once per group instead of calling _exact_ratio()
on every value. The dispatch lives in a new _exact_ratios(typ, values)
generator used by both _sum and _ss, so their accumulation loops are
unchanged. _exact_ratios delegates all conversion semantics back to
_exact_ratio (whole-group fallback for types without as_integer_ratio;
per-element for float/Decimal NAN/INF), so no logic is duplicated.

~4% faster than the PR on _sum; all 400 test_statistics pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
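
The per-group dispatch described in the commit message can be pictured roughly as follows. This is a hypothetical sketch, not the code from the referenced commit: `exact_ratios_sketch` and `exact_sum_sketch` are invented names, the fallback goes through `Fraction` instead of `_exact_ratio`, and NAN/INF handling is omitted, so it assumes finite inputs.

```python
from fractions import Fraction
from itertools import groupby

def exact_ratios_sketch(typ, values):
    # Hypothetical stand-in for the generator named in the commit message.
    try:
        to_ratio = typ.as_integer_ratio  # resolved once per group
    except AttributeError:
        # Rare types without the method: convert each value via Fraction.
        for x in values:
            f = Fraction(x)
            yield f.numerator, f.denominator
        return
    yield from map(to_ratio, values)

def exact_sum_sketch(data):
    # groupby(key=type) yields runs of same-typed values, so the method
    # lookup above happens once per run rather than once per value.
    total = Fraction(0)
    for typ, group in groupby(data, key=type):
        for n, d in exact_ratios_sketch(typ, group):
            total += Fraction(n, d)
    return total

print(exact_sum_sketch([0.5, 0.25, 1, 2, Fraction(1, 4)]))
```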