Add statistics.median_absolute_deviation #152227
Open
AnandSundar wants to merge 5 commits into
Add the median absolute deviation (MAD) function to the statistics module. MAD is a robust measure of statistical dispersion: the median of the absolute deviations from the median, optionally scaled by a consistency constant.

The default scale=1.4826 (the consistency constant for the normal distribution) produces an estimator of the population standard deviation that is consistent with statistics.stdev. Pass scale=1.0 for the raw value.

* data: a sequence or iterable of real-valued numbers
* scale: int or float (Decimal/Fraction raise TypeError)

Result type follows the data type (int/float input yields float; Decimal input yields Decimal; Fraction input yields Fraction). NaN propagates when at least one non-NaN value is present; all-NaN input raises StatisticsError (matching statistics.median()).

Includes module docstring updates and __all__ entry.
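The definition in this commit message can be sketched in a few lines. This is only an illustration, not the code in the PR: `mad_sketch` is a hypothetical name, it handles int/float data only, and it omits the Decimal/Fraction result types, the scale type guard, and the NaN handling described above.

```python
import statistics


def mad_sketch(data, *, scale=1.4826):
    """Hypothetical stand-in for the proposed function (int/float data only)."""
    values = list(data)  # materialize so generators can be traversed twice
    if not values:
        raise statistics.StatisticsError("no data points")
    center = statistics.median(values)
    # Absolute deviation of every point from the median.
    deviations = [abs(x - center) for x in values]
    # MAD is the median of those deviations, times the scale factor.
    return scale * statistics.median(deviations)


sample = [1, 2, 3, 4, 100]
raw = mad_sketch(sample, scale=1.0)  # deviations are [2, 1, 0, 1, 97]; their median is 1
scaled = mad_sketch(sample)          # same raw value times the default 1.4826
print(raw, scaled)
```

The outlier 100 moves neither the median (3) nor the median deviation (1), which is the robustness property the function is meant to provide.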
Add TestMedianAbsoluteDeviation class in Lib/test/test_statistics.py, following the TestMedian / TestStdev pattern.

Coverage:
- Happy path: known answers for ints and floats, default / scale=1.0 / scale=2.0 / scale=3.0 / negative scale
- Edge cases: empty (StatisticsError), single value, two-value symmetric, all-same, even-count averaging, generator input, tuple input
- Error paths: non-numeric data (TypeError), all-NaN (StatisticsError), partial NaN (propagates)
- Type acceptance: Decimal input -> Decimal result; Fraction input -> Fraction result (preserving precision); int input -> float result; mixed int+float -> float result
- Scale type guard: Decimal / Fraction / str / list / None scale all raise TypeError

Reuses NumericTestCase.assertApproxEqual for floating-point comparisons.
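A few of the edge cases listed above can be illustrated with plain unittest. This is a sketch under assumptions, not the PR's test class: it runs against a hypothetical stand-in (`mad_sketch`) so it works without the patch applied, and it uses `unittest.TestCase.assertAlmostEqual` rather than `NumericTestCase.assertApproxEqual`.

```python
import statistics
import unittest


def mad_sketch(data, *, scale=1.4826):
    # Hypothetical stand-in so the tests run without the PR applied.
    values = list(data)
    if not values:
        raise statistics.StatisticsError("no data points")
    center = statistics.median(values)
    return scale * statistics.median([abs(x - center) for x in values])


class MadSketchTests(unittest.TestCase):
    def test_empty_raises(self):
        with self.assertRaises(statistics.StatisticsError):
            mad_sketch([])

    def test_single_value(self):
        self.assertEqual(mad_sketch([42]), 0.0)

    def test_all_same(self):
        self.assertEqual(mad_sketch([7, 7, 7, 7]), 0.0)

    def test_even_count_averaging(self):
        # Median is 2.5; deviations are 1.5, 0.5, 0.5, 1.5; their median is 1.0.
        self.assertAlmostEqual(mad_sketch([1, 2, 3, 4], scale=1.0), 1.0)

    def test_generator_input(self):
        gen = (x for x in [1, 2, 3, 4, 100])
        self.assertAlmostEqual(mad_sketch(gen, scale=1.0), 1.0)

    def test_scale_multiplies_raw_value(self):
        self.assertAlmostEqual(mad_sketch([1, 2, 3, 4, 100], scale=2.0), 2.0)


def run_sketch_tests():
    # Build and run the suite explicitly so nothing calls sys.exit().
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MadSketchTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_sketch_tests()` returns a result object whose `wasSuccessful()` method reports whether every case passed.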
Document the new median_absolute_deviation function in Doc/library/statistics.rst:
- Add :func:`median_absolute_deviation` to the 'Measures of spread' autosummary table
- Add a new .. function:: directive with full description, doctest examples (int, Decimal, Fraction), and a .. versionadded:: 3.16 annotation
Add a new 'statistics' section under 'Improved modules' in Doc/whatsnew/3.16.rst, between shlex and tkinter (alphabetical order). The bullet describes the new statistics.median_absolute_deviation function, its default scale=1.4826 consistency constant, and the alternative scale=1 for the raw value.

TODO: replace [REPLACE WITH CONTRIBUTOR NAME] and [REPLACE WITH PR NUMBER] placeholders with the actual contributor attribution when the PR is filed.
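For readers wondering where the 1.4826 default comes from: for normally distributed data, the median of the absolute deviations from the center equals sigma times the 0.75 quantile of the standard normal distribution, so dividing by that quantile recovers sigma. A minimal check using the existing statistics.NormalDist class (the variable name is illustrative):

```python
from statistics import NormalDist

# For X ~ N(mu, sigma), median(|X - mu|) = sigma * q, where q is the
# 0.75 quantile of the standard normal. Hence sigma = MAD / q.
consistency_constant = 1 / NormalDist().inv_cdf(0.75)
print(round(consistency_constant, 4))
```

Rounded to four decimal places, the value matches the default scale used throughout this PR.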
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the
Replace [REPLACE WITH CONTRIBUTOR NAME] and [REPLACE WITH PR NUMBER] placeholders with Anand Sundar and pythongh-152227 now that the PR is open.
AnandSundar added a commit to AnandSundar/cpython that referenced this pull request on Jun 25, 2026
0fd689c to e2d8233
Summary
Add statistics.median_absolute_deviation(data, *, scale=1.4826) to the CPython standard library. The median absolute deviation (MAD) is the textbook robust measure of statistical dispersion: the median of the absolute deviations from the median.

For normally distributed data, the default scale=1.4826 (the consistency constant) produces an estimator of the population standard deviation that is consistent with statistics.stdev. Pass scale=1.0 for the raw value.

Why
The statistics module currently provides traditional spread measures (variance, stdev) but lacks a robust counterpart. Standard deviation is well known to be sensitive to outliers; MAD is the textbook alternative. It is the default "robust scale" in R (mad()) and is provided by SciPy (scipy.stats.median_abs_deviation). For users who don't want a third-party dependency for a single stat (educators, stdlib-only scripts, Python-first learners), stdlib MAD closes an obvious gap.

What changes
- Lib/statistics.py: new median_absolute_deviation(data, *, scale=1.4826) function. Result type follows input data (int/float → float; Decimal → Decimal; Fraction → Fraction). scale must be int or float (Decimal/Fraction raise TypeError). All-NaN input raises StatisticsError; partial NaN propagates. Top docstring autosummary table and "Measures of spread" table widened to fit the 25-character name. __all__ updated.
- Lib/test/test_statistics.py: new TestMedianAbsoluteDeviation class with 28 test methods covering happy path, edge cases, error paths, type acceptance, and scale type guard. Follows TestMedian / TestStdev patterns.
- Doc/library/statistics.rst: autosummary table entry + full function directive with int/Decimal/Fraction doctest examples + .. versionadded:: 3.16.
- Doc/whatsnew/3.16.rst: new "statistics" section under "Improved modules".

Out of scope (deliberately)
- Lib/statistics.pyi: that file does not exist in CPython main. Type stubs for stdlib modules live in the separate typeshed repo and will be filed there as a follow-up. (grep -L '\.pyi' Lib/*.pyi returns zero matches in Lib/.)
- NormalDist integration: could be added as NormalDist.from_mad() in a follow-up PR if requested.

TODO before merge
- Replace [REPLACE WITH CONTRIBUTOR NAME] and [REPLACE WITH PR NUMBER] placeholders in the Doc/whatsnew/3.16.rst entry.
- Replace gh-XXXXXX placeholders in the four commit messages with the actual issue/PR number (via git rebase -i origin/main and reword).
- median_absolute_deviation.
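The motivation given under "Why" (stdev is sensitive to outliers, MAD is not) can be seen on a small made-up data set. The sketch below uses a hypothetical stand-in, `mad_sketch`, rather than the function added by this PR, and the sample values are invented for illustration.

```python
import statistics


def mad_sketch(data, *, scale=1.4826):
    # Hypothetical stand-in for the proposed function (int/float data only).
    values = list(data)
    center = statistics.median(values)
    return scale * statistics.median([abs(x - center) for x in values])


clean = [10, 11, 12, 13, 14]
dirty = [10, 11, 12, 13, 1000]  # same data with one gross outlier

for name, data in (("clean", clean), ("dirty", dirty)):
    # stdev changes by orders of magnitude; the MAD-based estimate does not move.
    print(name, statistics.stdev(data), mad_sketch(data))
```

Both samples share the median 12 and the median absolute deviation 1, so the scaled MAD is identical, while the single outlier inflates stdev by more than two orders of magnitude.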