Mann-Kendall#
The Mann-Kendall test is a non-parametric statistical test used to detect trends in time series data. This module provides optimized implementations for both single time series and multidimensional arrays, making it ideal for climate data analysis.
Key Features#
Fast Multidimensional Workloads: Often tens to hundreds of times faster than looping over pymannkendall on dense gridded climate workloads
Climate Data Optimized: Efficient processing of typical climate grids (40×192×288)
Memory Efficient: Smart chunking with minimal memory overhead (~25MB for full climate datasets)
Robust Statistics: Handles missing data and provides comprehensive trend statistics
Multiple Interfaces: Support for NumPy arrays, xarray DataArrays, and Dask arrays
Supported Test Families#
The public API uses test=... to select the Mann-Kendall test family:
test="original": original Mann-Kendall test
test="yue_wang": Yue-Wang (2004) modified variance correction
test="seasonal": Hirsch-Slack (1984) seasonal Mann-Kendall test
test="correlated_seasonal": Hipel (1994) correlated seasonal Mann-Kendall test
test="correlated_multivariate": Libiseller-Grimvall correlated multivariate Mann-Kendall test
test="multivariate": grouped multivariate Mann-Kendall test
test="regional": grouped regional Mann-Kendall test
test="hamed_rao": Hamed-Rao (1998) variance correction
test="pre_whitening": Yue-Wang (2002) pre-whitening modification
test="trend_free_pre_whitening": Yue-Wang (2002) trend-free pre-whitening
All interfaces default to test="original".
The optional period=... argument is only used by
test="seasonal" and test="correlated_seasonal". When omitted,
these seasonal test families default to 12. Other test families ignore it.
The optional lag=... argument is only used by
test="yue_wang" and test="hamed_rao". Other test families ignore it.
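To illustrate what period controls, here is a minimal pure-Python sketch of the seasonal Mann-Kendall idea: observations are split into sub-series by their position in the cycle, S is computed within each season, and the per-season values are summed. The helper names mk_s and seasonal_mk_s are hypothetical and are not Skyborn functions; variance and tie handling are left out.

```python
def mk_s(x):
    """Mann-Kendall S: sum of sign(x_j - x_i) over all pairs i < j."""
    s = 0
    for i in range(len(x) - 1):
        for j in range(i + 1, len(x)):
            s += (x[j] > x[i]) - (x[j] < x[i])
    return s


def seasonal_mk_s(x, period=12):
    """Sum the per-season S statistics (hypothetical helper)."""
    # Season k holds x[k], x[k + period], x[k + 2 * period], ...
    return sum(mk_s(x[k::period]) for k in range(period))


# Three "years" of a 4-season cycle with a steady year-on-year increase.
cycle = [10.0, 20.0, 30.0, 20.0]
series = [v + year for year in range(3) for v in cycle]

print(seasonal_mk_s(series, period=4))  # 12: each season contributes S = 3
print(mk_s(series))                     # plain S mixes the cycle into the trend
```

Comparing values only within the same season keeps the seasonal cycle from masking or mimicking a trend, which is why the period must match the data's actual cycle length.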
Skyborn vs pymannkendall#
Compared with pymannkendall, Skyborn is designed not only for one
time series at a time, but also for multidimensional climate-analysis
workloads such as (time, lat, lon), (time, level, lat, lon), and
(ensemble, time, lat, lon) as well as labeled xarray inputs.
The practical differences are:
Skyborn supports multidimensional NumPy and xarray inputs directly, while pymannkendall is primarily a single-series API.
Skyborn uses batch-oriented compiled kernels for dense clean-series hot paths, so large gridded workloads do not have to loop through every grid point in Python.
Skyborn keeps the same core MK-family semantics as pymannkendall for the supported test families, while extending them to multidimensional analysis.
The table below summarizes the current public correspondence.
| Skyborn | Corresponding | Status | Main Skyborn difference |
|---|---|---|---|
| | | Aligned | Direct multidimensional NumPy/xarray support plus compiled batch execution |
| | | Aligned | Direct multidimensional support plus optional |
| | | Aligned | Direct multidimensional support and fast dense-batch execution |
| | | Aligned | Direct multidimensional support and fast dense-batch execution |
| | | Aligned | Direct multidimensional support plus optional |
| | | Aligned | Seasonal multidimensional analysis with grouped compiled kernels |
| | | Aligned | Seasonal multidimensional analysis with grouped covariance kernels |
| | | Aligned | Grouped multidimensional workflows are supported directly |
| | | Aligned | Grouped multidimensional workflows are supported directly |
| | | Aligned | Grouped multidimensional workflows are supported directly |
| | | Aligned | Separate partial-MK API for one-dimensional, multidimensional, and xarray inputs |
On representative real-data benchmarks in this project, Skyborn has often been
dramatically faster than looping over pymannkendall for gridded climate
data. Exact speedups depend on the test family, time length, missing-data
pattern, and memory layout. The implementation focus in Skyborn is dense
multidimensional workloads rather than one-series-at-a-time Python loops.
Partial Mann-Kendall#
Partial Mann-Kendall is exposed through a separate API because it needs two
inputs: a response series and a covariate series. It is therefore not selected
through test=... in the single-series MK interfaces.
Available functions:
partial_mann_kendall_test: one response series plus one covariate series
partial_mann_kendall_multidim: multidimensional NumPy arrays
partial_mann_kendall_xarray: xarray DataArray inputs
Quick Start#
Basic Trend Detection#
import numpy as np
from skyborn.calc import mann_kendall_test
# Create sample time series with trend
time = np.arange(50)
data = 0.02 * time + np.random.randn(50) * 0.5
# Perform Mann-Kendall test
result = mann_kendall_test(data)
print(f"Trend: {result['trend']:.4f} units per time step")
print(f"Significant: {result['h']} (p={result['p']:.3f})")
print(f"Mann-Kendall tau: {result['tau']:.3f}")
Multidimensional Climate Data#
import numpy as np
from skyborn.calc import mann_kendall_multidim
# Climate data: (time, lat, lon)
climate_data = np.random.randn(40, 192, 288)
# Add realistic warming trend
years = np.arange(40)
warming_pattern = np.random.randn(192, 288) * 0.02
for t, year in enumerate(years):
    climate_data[t] += warming_pattern * year
# Analyze trends across all grid points
results = mann_kendall_multidim(climate_data, axis=0)
print(f"Grid shape: {results['trend'].shape}") # (192, 288)
print(f"Significant trends: {np.sum(results['h'])} grid points")
print(f"Mean warming: {np.nanmean(results['trend']):.4f} ± {np.nanstd(results['trend']):.4f}")
Using xarray Interface#
import numpy as np
import pandas as pd
import xarray as xr
from skyborn.calc import mann_kendall_xarray
# Create xarray DataArray with annual time steps
# ('YS' = year start; it replaces the 'AS' alias deprecated in recent pandas)
time = pd.date_range('1980', periods=40, freq='YS')
data = xr.DataArray(
    np.random.randn(40, 96, 144),
    coords={'time': time, 'lat': np.arange(96), 'lon': np.arange(144)},
    dims=['time', 'lat', 'lon']
)
# Perform trend analysis
trends = mann_kendall_xarray(data, dim='time')
# Results are returned as xarray Dataset
print(trends.trend.attrs)  # Includes metadata
trends.trend.plot()  # Easy visualization
Grouped Multivariate / Regional MK#
Grouped MK families collapse both the analyzed time dimension and one grouping dimension, while preserving the remaining spatial dimensions.
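import numpy as np
import xarray as xr
from skyborn.calc import mann_kendall_xarray
# Grouped data: (time, member, lat, lon); 5 members is an illustrative choice
grouped = xr.DataArray(
    np.random.randn(40, 5, 96, 144),
    dims=["time", "member", "lat", "lon"],
)
regional = mann_kendall_xarray(
    grouped,
    dim="time",
    group_dim="member",
    test="regional",
)
print(regional.trend.dims)  # ('lat', 'lon')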
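The dimension bookkeeping described above can be sketched without any array library. The helper grouped_output_dims below is hypothetical (it is not part of Skyborn) and only illustrates which dimension names survive when both the time dimension and the grouping dimension are collapsed.

```python
def grouped_output_dims(dims, dim, group_dim):
    """Dimensions left after collapsing the time and grouping dimensions."""
    for name in (dim, group_dim):
        if name not in dims:
            raise ValueError(f"{name!r} is not one of {dims}")
    # Keep the remaining dimensions in their original order.
    return tuple(d for d in dims if d not in (dim, group_dim))


print(grouped_output_dims(("time", "member", "lat", "lon"), "time", "member"))
# ('lat', 'lon')
```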
Partial MK With a Covariate#
Partial MK is useful when the trend in a response variable should be tested after accounting for a second covariate, for example precipitation after controlling for an ENSO index or water quality after controlling for streamflow.
import numpy as np
from skyborn.calc import partial_mann_kendall_multidim
# Response field: (time, lat, lon)
response = np.random.randn(40, 96, 144)
# One global covariate shared by all grid points
covariate = np.linspace(-1.0, 1.0, 40)
partial = partial_mann_kendall_multidim(
    response,
    covariate,
    axis=0,
)
print(partial["trend"].shape) # (96, 144)
Performance Optimization#
For Large Datasets#
The implementation automatically optimizes performance for large climate datasets:
# For very large datasets, control memory usage.
# large_climate_data is assumed to be an existing (time, lat, lon) array.
results = mann_kendall_multidim(
    large_climate_data,
    axis=0,
    chunk_size=2000  # Process 2000 grid points at once
)
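How Skyborn splits the grid internally is not documented here, but the general idea behind chunk_size can be sketched as follows: the non-time dimensions are flattened into a list of grid points, and those points are visited in fixed-size index ranges. The generator iter_chunks is a hypothetical helper used only for illustration.

```python
def iter_chunks(n_points, chunk_size):
    """Yield (start, stop) index ranges that together cover n_points."""
    for start in range(0, n_points, chunk_size):
        yield start, min(start + chunk_size, n_points)


n_lat, n_lon = 192, 288
n_points = n_lat * n_lon  # 55,296 flattened grid points
chunks = list(iter_chunks(n_points, chunk_size=2000))

print(len(chunks))   # 28
print(chunks[-1])    # (54000, 55296): the last chunk is shorter
```

A smaller chunk_size lowers peak working memory at the cost of more iterations; a larger one does the opposite.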
Expected Performance#
Typical processing speeds for climate data:
Small grids (50×20×30): ~1,500 grid points/second
Climate grids (40×192×288): ~1,800 grid points/second (~30 seconds total)
Large grids (100×360×720): ~600 grid points/second
Thanks to chunking, the working-memory overhead beyond the input data stays small (~25MB) regardless of grid size.
API Reference#
Single Time Series Analysis#
- mann_kendall_test(data, alpha=0.05, method='theilslopes', test='original', period=None, lag=None)[source]#
Perform Mann-Kendall test for trend detection on 1D time series.
The Mann-Kendall test is a nonparametric test for detecting monotonic trends in time series data. It makes no assumptions about the distribution of the data.
- Parameters:
data (array-like) – 1D time series data. Can be numpy array or xarray DataArray.
alpha (float, default 0.05) – Significance level for hypothesis testing (Type I error probability).
method (str, default 'theilslopes') – Method for calculating slope:
- ‘theilslopes’: Theil-Sen slope estimator (robust, recommended)
- ‘linregress’: Linear regression slope (faster but less robust)
test (str, default 'original') – Mann-Kendall test family to apply:
- ‘original’: original Mann-Kendall test
- ‘yue_wang’: Yue and Wang (2004) modified variance correction
- ‘seasonal’: Hirsch-Slack (1984) seasonal Mann-Kendall test
- ‘correlated_seasonal’: Hipel (1994) correlated seasonal Mann-Kendall test
- ‘correlated_multivariate’: Libiseller-Grimvall correlated multivariate MK test
- ‘multivariate’: Hirsch-Slack / Helsel grouped multivariate MK test
- ‘regional’: Helsel regional MK test
- ‘hamed_rao’: Hamed and Rao (1998) variance correction
- ‘pre_whitening’: Yue and Wang (2002) pre-whitening modification
- ‘trend_free_pre_whitening’: Yue and Wang (2002) trend-free pre-whitening
period (int, optional) – Seasonal cycle length used by test="seasonal" and test="correlated_seasonal". When omitted, these seasonal test families default to 12. Other test families ignore this argument.
lag (int, optional) – Number of autocorrelation lags to include for test="yue_wang" and test="hamed_rao". None uses all available lags. This argument has no effect for other test families.
- Returns:
result – Dictionary containing trend analysis results:
- ‘trend’ (float): Slope of the trend (units per time step)
- ‘h’ (bool): Hypothesis test result. True if a significant trend exists at the specified alpha level.
- ‘p’ (float): Two-tailed p-value of the test
- ‘z’ (float): Normalized test statistic (z-score)
- ‘tau’ (float): Kendall’s tau correlation coefficient
- ‘std_error’ (float): Standard error of the detrended residuals
- Return type:
dict
Notes
The Mann-Kendall test statistic S is calculated as:
\[S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \text{sign}(x_j - x_i)\]
The test assumes that:
- Data are independent, unless a serial-correlation-aware test= family is used
- Data come from the same distribution
- Missing values are handled by removal
References
Mann, H. B. (1945). Nonparametric tests against trend. Econometrica, 13(3), 245-259.
Kendall, M. G. (1948). Rank correlation methods. Griffin, London.
Yue, S., & Wang, C. (2004). The Mann-Kendall test modified by effective sample size to detect trend in serially correlated hydrological series. Water Resources Management, 18(3), 201-218.
Examples
>>> import numpy as np
>>> from skyborn.calc import mann_kendall_test
>>>
>>> # Generate trend data
>>> t = np.arange(50)
>>> data = 0.3 * t + np.random.normal(0, 0.5, 50)
>>>
>>> # Test for trend
>>> result = mann_kendall_test(data)
>>> print(f"Trend slope: {result['trend']:.3f}")
>>> print(f"P-value: {result['p']:.3f}")
>>> print(f"Significant trend: {result['h']}")
Multidimensional Analysis#
- mann_kendall_multidim(data, axis=0, alpha=0.05, method='theilslopes', chunk_size=None, dim=None, test='original', period=None, lag=None, group_axis=None)[source]#
Optimized numpy-based Mann-Kendall test for multidimensional arrays.
This implementation is typically faster than xarray-based versions for large arrays because it avoids xarray overhead and uses optimized numpy operations.
Supports various input dimensions:
- 1D: (time,) - single time series
- 2D: (time, ensemble) or (ensemble, time) - ensemble time series
- 3D: (time, lat, lon) - spatial grid
- ND: arbitrary multidimensional arrays
- Parameters:
data (np.ndarray) – Input array with time series along one axis
axis (int or str, default 0) – Axis along which to compute trends. Can be integer index or dimension name. For numpy arrays with string names, the array must have named axes.
alpha (float, default 0.05) – Significance level
method (str, default 'theilslopes') – Slope calculation method (‘theilslopes’ or ‘linregress’)
chunk_size (int, optional) – Process data in chunks to manage memory usage
dim (int or str, optional) – Alternative name for axis parameter (XArray style). Takes precedence over axis.
test (str, default 'original') – Mann-Kendall test family to apply:
- ‘original’: original Mann-Kendall test
- ‘yue_wang’: Yue and Wang (2004) modified variance correction
- ‘seasonal’: Hirsch-Slack (1984) seasonal Mann-Kendall test
- ‘correlated_seasonal’: Hipel (1994) correlated seasonal Mann-Kendall test
- ‘correlated_multivariate’: Libiseller-Grimvall correlated multivariate MK test
- ‘multivariate’: Hirsch-Slack / Helsel grouped multivariate MK test
- ‘regional’: Helsel regional MK test
- ‘hamed_rao’: Hamed and Rao (1998) variance correction
- ‘pre_whitening’: Yue and Wang (2002) pre-whitening modification
- ‘trend_free_pre_whitening’: Yue and Wang (2002) trend-free pre-whitening
period (int, optional) – Seasonal cycle length used by test="seasonal" and test="correlated_seasonal". When omitted, these seasonal test families default to 12. Other test families ignore this argument.
lag (int, optional) – Number of autocorrelation lags to include for test="yue_wang" and test="hamed_rao". None uses all available lags. This argument has no effect for other test families.
group_axis (int or str, optional) – Grouping axis used by grouped test families such as test="multivariate", test="regional", and test="correlated_multivariate". If omitted, it is inferred only when there is exactly one non-time axis.
- Returns:
result – Dictionary with result arrays for each statistic:
- ‘trend’: slope values
- ‘h’: significance test results (boolean)
- ‘p’: p-values
- ‘z’: z-scores
- ‘tau’: Kendall’s tau values
- ‘std_error’: standard errors
- Return type:
dict
Notes
Parameter priority: dim > axis.
Both axis and dim can accept integer indices or string dimension names. For numpy arrays, string names require the array to have a ‘dims’ attribute or similar.
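The two resolution rules documented for this function (dim takes precedence over axis, and the grouping axis is inferred only when exactly one non-time axis exists) can be sketched as below. Both helpers are hypothetical illustrations restricted to integer indices; they are not Skyborn's internal code, and string dimension names are not handled.

```python
def resolve_time_axis(ndim, axis=0, dim=None):
    """Pick the time axis; `dim` wins over `axis` when both are given."""
    chosen = axis if dim is None else dim
    if not -ndim <= chosen < ndim:
        raise ValueError(f"axis {chosen} is out of bounds for {ndim} dimensions")
    return chosen % ndim  # normalize negative indices


def infer_group_axis(ndim, time_axis, group_axis=None):
    """Infer the grouping axis only when exactly one non-time axis exists."""
    if group_axis is not None:
        return group_axis % ndim
    others = [ax for ax in range(ndim) if ax != time_axis]
    if len(others) != 1:
        raise ValueError("group_axis must be given explicitly")
    return others[0]


print(resolve_time_axis(3, axis=0, dim=2))  # 2
print(infer_group_axis(2, time_axis=0))     # 1
```

For a (time, member, lat, lon) array the inference step would raise, since three non-time axes are candidates and the grouping axis has to be named explicitly.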
XArray Interface#
- mann_kendall_xarray(data, dim='time', alpha=0.05, method='theilslopes', use_dask=True, test='original', period=None, lag=None, group_dim=None)[source]#
Mann-Kendall test for xarray DataArray with intelligent dimension handling.
Automatically handles:
- 1D time series (pure time dimension)
- Multi-dimensional data with preserved dimension order
- Scalar dimensions (size=1) from sel() operations
- Correct dimension ordering in output
- Parameters:
data (xr.DataArray) – Input data array
dim (str, default 'time') – Time dimension name to analyze along
alpha (float, default 0.05) – Significance level
method (str, default 'theilslopes') – Slope calculation method
use_dask (bool, default True) – Use dask for computation if available
test (str, default 'original') – Mann-Kendall test family to apply:
- ‘original’: original Mann-Kendall test
- ‘yue_wang’: Yue and Wang (2004) modified variance correction
- ‘seasonal’: Hirsch-Slack (1984) seasonal Mann-Kendall test
- ‘correlated_seasonal’: Hipel (1994) correlated seasonal Mann-Kendall test
- ‘correlated_multivariate’: Libiseller-Grimvall correlated multivariate MK test
- ‘multivariate’: Hirsch-Slack / Helsel grouped multivariate MK test
- ‘regional’: Helsel regional MK test
- ‘hamed_rao’: Hamed and Rao (1998) variance correction
- ‘pre_whitening’: Yue and Wang (2002) pre-whitening modification
- ‘trend_free_pre_whitening’: Yue and Wang (2002) trend-free pre-whitening
period (int, optional) – Seasonal cycle length used by test="seasonal" and test="correlated_seasonal". When omitted, these seasonal test families default to 12. Other test families ignore this argument.
lag (int, optional) – Number of autocorrelation lags to include for test="yue_wang" and test="hamed_rao". None uses all available lags. This argument has no effect for other test families.
group_dim (str, optional) – Grouping dimension used by test="multivariate", test="regional", and test="correlated_multivariate". If omitted, it is inferred only when there is exactly one non-time dimension.
- Returns:
result – Dataset containing trend analysis results with preserved dimension order
- Return type:
xr.Dataset
Examples
>>> # 3D data (time, lat, lon) -> (lat, lon)
>>> ds = mann_kendall_xarray(data_3d, dim='time')
>>>
>>> # 3D data (year, ensemble, lat) -> (ensemble, lat)
>>> ds = mann_kendall_xarray(data_3d, dim='year')
>>>
>>> # Handle scalar dimension from sel
>>> subset = data.sel(lat=0)  # lat becomes scalar
>>> ds = mann_kendall_xarray(subset, dim='time')  # Works correctly
Partial Single-Series Analysis#
Partial Multidimensional Analysis#
Partial XArray Interface#
Unified Interface#
- trend_analysis(data, axis=0, alpha=0.05, method='theilslopes', dim=None, test='original', period=None, lag=None, group_axis=None, group_dim=None, **kwargs)[source]#
Unified interface for Mann-Kendall trend analysis.
Automatically chooses the best implementation based on input type.
- Parameters:
data (np.ndarray or xr.DataArray) – Input data array
axis (int or str, default 0) – Axis along which to compute trends. Can be integer index or dimension name.
alpha (float, default 0.05) – Significance level
method (str, default 'theilslopes') – Slope calculation method
dim (int or str, optional) – Alternative name for axis parameter (XArray style). Takes precedence over axis.
test (str, default 'original') – Mann-Kendall test family to apply:
- ‘original’: original Mann-Kendall test
- ‘yue_wang’: Yue and Wang (2004) modified variance correction
- ‘seasonal’: Hirsch-Slack (1984) seasonal Mann-Kendall test
- ‘correlated_seasonal’: Hipel (1994) correlated seasonal Mann-Kendall test
- ‘multivariate’: Hirsch-Slack / Helsel grouped multivariate MK test
- ‘regional’: Helsel regional MK test
- ‘hamed_rao’: Hamed and Rao (1998) variance correction
- ‘pre_whitening’: Yue and Wang (2002) pre-whitening modification
- ‘trend_free_pre_whitening’: Yue and Wang (2002) trend-free pre-whitening
period (int, optional) – Seasonal cycle length used by test="seasonal" and test="correlated_seasonal". When omitted, these seasonal test families default to 12. Other test families ignore this argument.
lag (int, optional) – Number of autocorrelation lags to include for test="yue_wang" and test="hamed_rao". None uses all available lags. This argument has no effect for other test families.
group_axis (int or str, optional) – Grouping axis used by test="multivariate" and test="regional" for NumPy inputs.
group_dim (str, optional) – Grouping dimension used by test="multivariate" and test="regional" for xarray inputs.
**kwargs – Additional arguments passed to underlying functions
- Return type:
Notes
Parameter priority: dim > axis.
Both axis and dim support integer indices and string dimension names.
Statistical Background#
The Mann-Kendall Test#
The Mann-Kendall test is based on the Mann-Kendall statistic S:
\[S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \text{sign}(x_j - x_i)\]
where \(\text{sign}(x)\) is the sign function. Under the null hypothesis of no trend, S is approximately normally distributed for large n.
Test Statistics:
S: Mann-Kendall statistic
tau: Kendall’s tau coefficient (normalized correlation)
z: Standardized test statistic
p: Two-tailed p-value
h: Boolean significance test result
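As a worked illustration of how these quantities relate, the following pure-Python sketch computes them for the original Mann-Kendall test on a short series without ties or missing values. It uses the textbook no-ties variance and the usual continuity correction; the function name mann_kendall_stats and the 's' key are hypothetical, and Skyborn's compiled kernels may differ in tie and missing-data handling.

```python
import math


def mann_kendall_stats(x, alpha=0.05):
    """Original Mann-Kendall statistics for a series without ties or gaps."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (x[j] > x[i]) - (x[j] < x[i])

    # Variance of S under the null hypothesis (no-ties form).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0

    # Continuity-corrected z-score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-tailed p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    tau = s / (0.5 * n * (n - 1))
    return {"s": s, "tau": tau, "z": z, "p": p, "h": p < alpha}


stats = mann_kendall_stats([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
print(stats["s"], stats["tau"])  # 45 1.0
print(stats["h"])                # True
```

For this strictly increasing series every one of the 45 pairs is concordant, so S = 45 and tau = 1.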
Slope Estimation:
The trend magnitude is estimated using either:
Theil-Sen estimator (default): Robust, non-parametric slope estimation
Linear regression: Ordinary least squares for comparison
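The robustness of the Theil-Sen estimator comes from taking the median of all pairwise slopes. A minimal sketch, assuming evenly spaced time steps when no time values are given (theil_sen_slope is a hypothetical helper, not the Skyborn implementation):

```python
import statistics


def theil_sen_slope(y, t=None):
    """Median of the slopes between all pairs of points."""
    if t is None:
        t = list(range(len(y)))  # assume unit time steps
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i in range(len(y) - 1)
        for j in range(i + 1, len(y))
        if t[j] != t[i]
    ]
    return statistics.median(slopes)


# A clean slope of 2 per step, with one gross outlier at index 4.
y = [0.0, 2.0, 4.0, 6.0, 100.0, 10.0, 12.0]
print(theil_sen_slope(y))  # 2.0
```

Only 6 of the 21 pairwise slopes involve the outlier, so the median is unaffected, whereas an ordinary least-squares fit would be pulled toward it.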
Serial-Correlation-Aware Tests#
For autocorrelated data, explicit test families can be selected:
result_yw = mann_kendall_test(data, test="yue_wang")
result_seasonal = mann_kendall_test(data, test="seasonal", period=12)
result_corr_seasonal = mann_kendall_test(data, test="correlated_seasonal", period=12)
result_corr_multi = mann_kendall_test(grouped_matrix, test="correlated_multivariate")
result_multi = mann_kendall_test(grouped_matrix, test="multivariate")
result_regional = mann_kendall_xarray(
    grouped_da, dim="time", group_dim="member", test="regional"
)
result_hr = mann_kendall_test(data, test="hamed_rao")
result_pw = mann_kendall_test(data, test="pre_whitening")
result_tfpw = mann_kendall_test(data, test="trend_free_pre_whitening")
These variants account for serial correlation using different correction strategies.
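As one example of such a strategy, the basic pre-whitening idea is to estimate the lag-1 autocorrelation r1 and test the series x_t - r1 * x_{t-1} instead of the raw data. The sketch below shows that generic step only; the helper names are hypothetical, and the exact estimator and any trend-removal step used by test="pre_whitening" or test="trend_free_pre_whitening" in Skyborn are not shown here.

```python
def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation coefficient of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def pre_whiten(x):
    """Remove the lag-1 autoregressive component: x_t - r1 * x_{t-1}."""
    r1 = lag1_autocorrelation(x)
    return [x[t] - r1 * x[t - 1] for t in range(1, len(x))]


series = [1.0, 2.0, 1.5, 2.5, 2.0, 3.0, 2.5, 3.5]
whitened = pre_whiten(series)
print(len(series), len(whitened))  # 8 7
```

The pre-whitened series is one element shorter because the first value has no predecessor.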
Advantages#
Non-parametric: No assumptions about data distribution
Robust: Resistant to outliers, and able to detect monotonic trends that are not linear
Missing data: Handles gaps in time series naturally
Significance testing: Built-in statistical inference
Climate appropriate: Well suited to geophysical time series
Use Cases#
Climate Science Applications#
Temperature trends: Global and regional warming patterns
Precipitation changes: Long-term rainfall trend detection
Sea level rise: Coastal monitoring and analysis
Extreme events: Frequency and intensity trend analysis
Model validation: Comparing observed vs. simulated trends
Quality Control#
Data homogeneity: Detecting artificial trends in observations
Station records: Long-term consistency checking
Reanalysis validation: Trend comparison across datasets
Best Practices#
Data Preparation#
Quality control: Remove obviously erroneous values
Homogenization: Ensure data consistency over time
Gap analysis: Understand missing data patterns
Deseasonalization: Remove seasonal cycles if needed
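For the deseasonalization step, one common approach is to subtract each season's long-term mean so that only anomalies remain. A minimal pure-Python sketch, assuming a regularly sampled series that starts at the beginning of a cycle (remove_seasonal_cycle is a hypothetical helper, not a Skyborn function):

```python
def remove_seasonal_cycle(x, period=12):
    """Subtract each season's mean from the values belonging to that season."""
    anomalies = list(x)
    for k in range(period):
        season = x[k::period]
        if season:
            season_mean = sum(season) / len(season)
            for idx in range(k, len(x), period):
                anomalies[idx] = x[idx] - season_mean
    return anomalies


# Three "years" of a 4-season cycle with a steady year-on-year increase.
cycle = [10.0, 20.0, 30.0, 20.0]
series = [v + year for year in range(3) for v in cycle]
print(remove_seasonal_cycle(series, period=4))
# [-1.0, -1.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
```

After the cycle is removed, the underlying year-on-year increase is directly visible in the anomalies.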
Interpretation#
Physical significance: Consider whether trends are physically meaningful
Spatial coherence: Look for consistent patterns across neighboring regions
Multiple variables: Cross-validate trends across different measurements
Uncertainty: Report confidence intervals and significance levels
References#
Bari, S. H., Rahman, M. T. U., Hoque, M. A., & Hussain, M. M. (2016). Analysis of seasonal and annual rainfall trends in the northern region of Bangladesh. Atmospheric Research, 176, 148-158. DOI: https://doi.org/10.1016/j.atmosres.2016.02.008
Conover, W. J. (1980). Some methods based on ranks (Chapter 5). In Practical nonparametric statistics (2nd ed.). John Wiley and Sons.
Cox, D. R., & Stuart, A. (1955). Some quick sign tests for trend in location and dispersion. Biometrika, 42(1/2), 80-95. DOI: https://doi.org/10.2307/2333424
Dietz, E. J. (1987). A comparison of robust estimators in simple linear regression. Communications in Statistics - Simulation and Computation, 16(4), 1209-1227. DOI: https://doi.org/10.1080/03610918708812645
Hamed, K. H., & Rao, A. R. (1998). A modified Mann-Kendall trend test for autocorrelated data. Journal of Hydrology, 204(1-4), 182-196. DOI: https://doi.org/10.1016/S0022-1694(97)00125-X
Helsel, D. R., & Frans, L. M. (2006). Regional Kendall test for trend. Environmental Science & Technology, 40(13), 4066-4073. DOI: https://doi.org/10.1021/es051650b
Hipel, K. W., & McLeod, A. I. (1994). Time series modelling of water resources and environmental systems (Vol. 45). Elsevier.
Hirsch, R. M., Slack, J. R., & Smith, R. A. (1982). Techniques of trend analysis for monthly water quality data. Water Resources Research, 18(1), 107-121. DOI: https://doi.org/10.1029/WR018i001p00107
Kendall, M. (1975). Rank correlation methods. Charles Griffin, London.
Libiseller, C., & Grimvall, A. (2002). Performance of partial Mann-Kendall tests for trend detection in the presence of covariates. Environmetrics, 13(1), 71-84. DOI: https://doi.org/10.1002/env.507
Mann, H. B. (1945). Nonparametric tests against trend. Econometrica, 13(3), 245-259. DOI: https://doi.org/10.2307/1907187
Sen, P. K. (1968). Estimates of the regression coefficient based on Kendall’s tau. Journal of the American Statistical Association, 63(324), 1379-1389. DOI: https://doi.org/10.1080/01621459.1968.10480934
Theil, H. (1950). A rank-invariant method of linear and polynomial regression analysis (Parts 1-3). In Ned. Akad. Wetensch. Proc. Ser. A (Vol. 53, pp. 1397-1412).
Yue, S., & Wang, C. (2004). The Mann-Kendall test modified by effective sample size to detect trend in serially correlated hydrological series. Water Resources Management, 18(3), 201-218. DOI: https://doi.org/10.1023/B:WARM.0000043140.61082.60
Yue, S., & Wang, C. Y. (2002). Applicability of prewhitening to eliminate the influence of serial correlation on the Mann-Kendall test. Water Resources Research, 38(6), 4-1. DOI: https://doi.org/10.1029/2001WR000861
Yue, S., Pilon, P., Phinney, B., & Cavadias, G. (2002). The influence of autocorrelation on the ability to detect trend in hydrological series. Hydrological Processes, 16(9), 1807-1829. DOI: https://doi.org/10.1002/hyp.1095