GridFill#

The skyborn.gridfill module provides functions to fill missing values in gridded data using iterative relaxation methods to solve Poisson’s equation.

This module is particularly useful for meteorological and oceanographic data where spatial interpolation of missing values is needed while preserving physical constraints.

Overview#

The gridfill algorithm solves the 2D Poisson equation:

\[\nabla^2 \phi = 0\]

where \(\phi\) represents the field to be filled. The iterative relaxation scheme converges to a solution that smoothly interpolates missing values while preserving the boundary conditions from observed data.

Key Features#

  • Poisson equation solver for gap filling

  • Support for cyclic and non-cyclic boundaries

  • Configurable initialization (zeros or zonal mean)

  • Multi-dimensional array support

  • Integration with xarray DataArrays

Interfaces#

The gridfill module provides two main interfaces:

  1. Standard Interface: Works with numpy arrays and masked arrays

  2. xarray Interface: Modern interface with automatic coordinate detection

Standard Interface#

Grid filling procedures for missing value interpolation.

This module provides functions to fill missing values in gridded data using iterative relaxation methods to solve Poisson’s equation. It is particularly useful for meteorological and oceanographic data where spatial interpolation of missing values is needed while preserving physical constraints.

This module is based on the gridfill package by Andrew Dawson: ajdawson/gridfill

Key Features:
  • Poisson equation solver for gap filling

  • Support for cyclic and non-cyclic boundaries

  • Configurable initialization (zeros or zonal mean)

  • Multi-dimensional array support

  • Integration with xarray DataArrays

Mathematical Background:

The algorithm solves the 2D Poisson equation: ∇²φ = 0 where φ represents the field to be filled. The iterative relaxation scheme converges to a solution that smoothly interpolates missing values while preserving the boundary conditions from observed data.

Examples

>>> import numpy as np
>>> import numpy.ma as ma
>>> from skyborn.gridfill import fill
>>>
>>> # Create test data with missing values
>>> data = np.random.rand(50, 100)
>>> mask = np.zeros_like(data, dtype=bool)
>>> mask[20:30, 40:60] = True  # Create a gap
>>> masked_data = ma.array(data, mask=mask)
>>>
>>> # Fill the missing values
>>> filled_data, converged = fill(masked_data, xdim=1, ydim=0, eps=1e-4)
>>> print(f"Convergence: {converged}")

Note

This implementation requires compiled Cython extensions for optimal performance. The core computational routines are implemented in _gridfill.pyx and compiled during package installation.

fill(grids, xdim, ydim, eps, relax=0.6, itermax=100, initzonal=False, initzonal_linear=False, cyclic=False, initial_value=0.0, verbose=False)[source]

Fill missing values in grids using Poisson equation solver.

This function fills missing values in masked arrays by solving Poisson’s equation (∇²φ = 0) using an iterative relaxation scheme. The method provides smooth interpolation of gaps while preserving boundary conditions from observed data.

Parameters:
  • grids (numpy.ma.MaskedArray) – Masked array containing data with missing values to fill. Missing values are indicated by the mask.

  • xdim (int) – Index of the dimension representing the x-coordinate (longitude)

  • ydim (int) – Index of the dimension representing the y-coordinate (latitude)

  • eps (float) – Convergence tolerance. Iteration stops when the maximum residual falls below this threshold.

  • relax (float, optional) – Relaxation parameter for the iterative scheme. Must be in range (0, 1). Values between 0.45-0.6 typically work well. Default: 0.6

  • itermax (int, optional) – Maximum number of iterations. Default: 100

  • initzonal (bool, optional) – Initialization method for missing values: - False: Initialize with zeros - True: Initialize with zonal (x-direction) mean Default: False

  • initzonal_linear (bool, optional) – Use linear interpolation for zonal initialization: - False: Use constant zonal mean (if initzonal=True) - True: Use linear interpolation between valid points in each latitude band This provides better initial conditions by connecting valid data points with linear interpolation rather than using a constant mean value. Can be used with both cyclic and non-cyclic data. Default: False

  • cyclic (bool, optional) – Whether the x-coordinate is cyclic (e.g., longitude wrapping around). When True, the algorithm treats the rightmost and leftmost columns as adjacent for interpolation purposes. Default: False

  • initial_value (float, optional) – Custom initial value for missing data points when using zero initialization (i.e., when both initzonal=False and initzonal_linear=False). This allows setting a more appropriate background value for specific applications. Default: 0.0

  • verbose (bool, optional) – Print convergence information for each grid. Default: False

Returns:

  • filled_grids (numpy.ndarray) – Array with missing values filled, same shape as input

  • converged (numpy.ndarray) – Boolean array indicating convergence status for each 2D grid slice

Raises:
  • TypeError – If input is not a masked array

  • ValueError – If xdim or ydim are invalid dimension indices

Return type:

Tuple[ndarray, ndarray]

Notes

The algorithm solves: ∇²φ = (∂²φ/∂x²) + (∂²φ/∂y²) = 0

using a finite difference relaxation scheme: φ[i,j]^(k+1) = φ[i,j]^k + relax * (residual[i,j] / 4)

where the residual is computed using a 5-point stencil.

Examples

Fill a simple 2D grid:

>>> import numpy as np
>>> import numpy.ma as ma
>>> from skyborn.gridfill import fill
>>>
>>> # Create test data with gaps
>>> data = np.random.rand(50, 100)
>>> mask = np.zeros_like(data, dtype=bool)
>>> mask[20:30, 40:60] = True
>>> masked_data = ma.array(data, mask=mask)
>>>
>>> # Fill missing values
>>> filled, converged = fill(masked_data, xdim=1, ydim=0, eps=1e-4)
>>> print(f"Converged: {converged[0]}")

Fill multiple time steps:

>>> # 3D data: (time, lat, lon)
>>> data_3d = np.random.rand(12, 50, 100)
>>> mask_3d = np.zeros_like(data_3d, dtype=bool)
>>> mask_3d[:, 20:30, 40:60] = True
>>> masked_3d = ma.array(data_3d, mask=mask_3d)
>>>
>>> filled_3d, converged_3d = fill(masked_3d, xdim=2, ydim=1, eps=1e-4)
>>> print(f"Convergence rate: {converged_3d.mean():.1%}")

See also

skyborn.gridfill.xarray.fill

Fill xarray DataArrays with automatic coordinate detection

Main Functions#

fill(grids, xdim, ydim, eps, relax=0.6, itermax=100, initzonal=False, initzonal_linear=False, cyclic=False, initial_value=0.0, verbose=False)[source]#

Fill missing values in grids using Poisson equation solver.

This function fills missing values in masked arrays by solving Poisson’s equation (∇²φ = 0) using an iterative relaxation scheme. The method provides smooth interpolation of gaps while preserving boundary conditions from observed data.

Parameters:
  • grids (numpy.ma.MaskedArray) – Masked array containing data with missing values to fill. Missing values are indicated by the mask.

  • xdim (int) – Index of the dimension representing the x-coordinate (longitude)

  • ydim (int) – Index of the dimension representing the y-coordinate (latitude)

  • eps (float) – Convergence tolerance. Iteration stops when the maximum residual falls below this threshold.

  • relax (float, optional) – Relaxation parameter for the iterative scheme. Must be in range (0, 1). Values between 0.45-0.6 typically work well. Default: 0.6

  • itermax (int, optional) – Maximum number of iterations. Default: 100

  • initzonal (bool, optional) – Initialization method for missing values: - False: Initialize with zeros - True: Initialize with zonal (x-direction) mean Default: False

  • initzonal_linear (bool, optional) – Use linear interpolation for zonal initialization: - False: Use constant zonal mean (if initzonal=True) - True: Use linear interpolation between valid points in each latitude band This provides better initial conditions by connecting valid data points with linear interpolation rather than using a constant mean value. Can be used with both cyclic and non-cyclic data. Default: False

  • cyclic (bool, optional) – Whether the x-coordinate is cyclic (e.g., longitude wrapping around). When True, the algorithm treats the rightmost and leftmost columns as adjacent for interpolation purposes. Default: False

  • initial_value (float, optional) – Custom initial value for missing data points when using zero initialization (i.e., when both initzonal=False and initzonal_linear=False). This allows setting a more appropriate background value for specific applications. Default: 0.0

  • verbose (bool, optional) – Print convergence information for each grid. Default: False

Returns:

  • filled_grids (numpy.ndarray) – Array with missing values filled, same shape as input

  • converged (numpy.ndarray) – Boolean array indicating convergence status for each 2D grid slice

Raises:
  • TypeError – If input is not a masked array

  • ValueError – If xdim or ydim are invalid dimension indices

Return type:

Tuple[ndarray, ndarray]

Notes

The algorithm solves: ∇²φ = (∂²φ/∂x²) + (∂²φ/∂y²) = 0

using a finite difference relaxation scheme: φ[i,j]^(k+1) = φ[i,j]^k + relax * (residual[i,j] / 4)

where the residual is computed using a 5-point stencil.

Examples

Fill a simple 2D grid:

>>> import numpy as np
>>> import numpy.ma as ma
>>> from skyborn.gridfill import fill
>>>
>>> # Create test data with gaps
>>> data = np.random.rand(50, 100)
>>> mask = np.zeros_like(data, dtype=bool)
>>> mask[20:30, 40:60] = True
>>> masked_data = ma.array(data, mask=mask)
>>>
>>> # Fill missing values
>>> filled, converged = fill(masked_data, xdim=1, ydim=0, eps=1e-4)
>>> print(f"Converged: {converged[0]}")

Fill multiple time steps:

>>> # 3D data: (time, lat, lon)
>>> data_3d = np.random.rand(12, 50, 100)
>>> mask_3d = np.zeros_like(data_3d, dtype=bool)
>>> mask_3d[:, 20:30, 40:60] = True
>>> masked_3d = ma.array(data_3d, mask=mask_3d)
>>>
>>> filled_3d, converged_3d = fill(masked_3d, xdim=2, ydim=1, eps=1e-4)
>>> print(f"Convergence rate: {converged_3d.mean():.1%}")

See also

skyborn.gridfill.xarray.fill

Fill xarray DataArrays with automatic coordinate detection

xarray Interface#

The xarray interface provides modern, metadata-aware gap filling with automatic coordinate detection.

Gridfill procedures for missing value interpolation with xarray interface.

This module provides functions to fill missing values in xarray DataArrays using iterative relaxation methods to solve Poisson’s equation. It preserves coordinate information and metadata throughout the computation process.

Main Functions:

fill : Fill missing values in xarray DataArray

Examples

>>> import xarray as xr
>>> import numpy as np
>>> from skyborn.gridfill.xarray import fill
>>>
>>> # Load data with missing values
>>> data = xr.open_dataarray('temperature_with_gaps.nc')
>>>
>>> # Fill missing values preserving metadata
>>> filled_data = fill(data, eps=1e-4)
>>> print(filled_data.attrs)  # Original attributes preserved
fill(data, eps, x_dim=None, y_dim=None, relax=0.6, itermax=100, initzonal=False, initzonal_linear=False, cyclic=None, initial_value=0.0, verbose=False, keep_attrs=True)[source]

Fill missing values in xarray DataArray using Poisson equation solver.

This function fills missing values (NaN or masked values) in gridded data by solving Poisson’s equation (∇²φ = 0) using an iterative relaxation scheme. The method provides smooth interpolation while preserving coordinate information and metadata from the input DataArray.

Parameters:
  • data (xarray.DataArray) – Input DataArray containing data with missing values to fill. Missing values can be NaN or masked values.

  • eps (float) – Convergence tolerance. Iteration stops when the maximum residual falls below this threshold.

  • x_dim (str, optional) – Name of the x-coordinate dimension (longitude). If None, will be detected automatically using coordinate metadata.

  • y_dim (str, optional) – Name of the y-coordinate dimension (latitude). If None, will be detected automatically using coordinate metadata.

  • relax (float, default 0.6) – Relaxation parameter for the iterative scheme. Must be in range (0, 1). Values between 0.45-0.6 typically work well.

  • itermax (int, default 100) – Maximum number of iterations.

  • initzonal (bool, default False) – Initialization method for missing values: - False: Initialize with zeros or initial_value - True: Initialize with zonal (x-direction) mean

  • initzonal_linear (bool, default False) – Use linear interpolation for zonal initialization: - False: Use constant zonal mean (if initzonal=True) - True: Use linear interpolation between valid points in each latitude band This provides better initial conditions by connecting valid data points with linear interpolation rather than using a constant mean value. Can be used with both cyclic and non-cyclic data.

  • cyclic (bool, optional) – Whether the x-coordinate is cyclic (e.g., longitude wrapping). If None, will be detected automatically for longitude coordinates.

  • initial_value (float, default 0.0) – Initial value to use for missing grid points when initzonal=False. This provides a custom starting guess for the iterative solver. When initzonal=True, this value may still be used in combination with the zonal mean for enhanced initialization.

  • verbose (bool, default False) – Print convergence information for each slice.

  • keep_attrs (bool, default True) – Preserve input DataArray attributes in output.

Returns:

filled_data – DataArray with missing values filled, preserving coordinates and optionally attributes from input.

Return type:

xarray.DataArray

Raises:
  • ValueError – If spatial coordinates cannot be identified or are invalid

  • TypeError – If input is not an xarray DataArray

Warning

Issues warning if algorithm fails to converge on any slices

Notes

The algorithm solves: ∇²φ = (∂²φ/∂x²) + (∂²φ/∂y²) = 0

using a finite difference relaxation scheme. The method automatically: - Detects spatial coordinates using metadata - Handles cyclic longitude boundaries for global data - Preserves all coordinate information and attributes - Works with multi-dimensional data (time series, levels, etc.)

For missing value detection, both NaN values and xarray/numpy masked arrays are supported.

Examples

Basic usage with automatic coordinate detection:

>>> import xarray as xr
>>> import numpy as np
>>> from skyborn.gridfill.xarray import fill
>>>
>>> # Load data with missing values
>>> data = xr.open_dataarray('sst_with_gaps.nc')
>>>
>>> # Fill missing values
>>> filled = fill(data, eps=1e-4)
>>> print(f"Original shape: {data.shape}")
>>> print(f"Filled shape: {filled.shape}")
>>> print(f"Attributes preserved: {filled.attrs == data.attrs}")

Advanced usage with explicit parameters:

>>> # Create test data with gaps
>>> lons = np.linspace(0, 360, 72, endpoint=False)
>>> lats = np.linspace(-90, 90, 36)
>>> time = pd.date_range('2020-01-01', periods=12, freq='M')
>>>
>>> # Create DataArray with metadata
>>> data = xr.DataArray(
...     np.random.rand(12, 36, 72),
...     coords={'time': time, 'lat': lats, 'lon': lons},
...     dims=['time', 'lat', 'lon'],
...     attrs={'units': 'K', 'long_name': 'temperature'}
... )
>>>
>>> # Add some missing values
>>> data = data.where(np.random.rand(*data.shape) > 0.1)
>>>
>>> # Fill with custom settings
>>> filled = fill(
...     data,
...     eps=1e-5,
...     relax=0.55,
...     initzonal=True,
...     verbose=True
... )

Working with specific coordinate dimensions:

>>> # Explicitly specify coordinate dimensions
>>> filled = fill(data, eps=1e-4, x_dim='longitude', y_dim='latitude')

See also

skyborn.gridfill.fill

Lower-level function for numpy arrays

fill_multiple(datasets, eps, x_dim=None, y_dim=None, **kwargs)[source]

Fill missing values in multiple DataArrays with consistent parameters.

This convenience function applies the same gridfill parameters to multiple DataArrays, ensuring consistent processing across related datasets.

Parameters:
  • datasets (list of xarray.DataArray) – List of DataArrays to process

  • eps (float) – Convergence tolerance for all datasets

  • x_dim (str, optional) – X-coordinate dimension name (applied to all)

  • y_dim (str, optional) – Y-coordinate dimension name (applied to all)

  • **kwargs – Additional parameters passed to fill()

Returns:

List of filled DataArrays in same order as input

Return type:

list of xarray.DataArray

Examples

>>> from skyborn.gridfill.xarray import fill_multiple
>>>
>>> # Fill multiple related variables
>>> temp_filled, humid_filled = fill_multiple(
...     [temperature_data, humidity_data],
...     eps=1e-4,
...     verbose=True
... )
validate_grid_coverage(data, x_dim=None, y_dim=None, min_coverage=0.1)[source]

Validate grid data coverage and suitability for gridfill.

This function analyzes the input data to determine if it’s suitable for gap filling and provides diagnostic information.

Parameters:
  • data (xarray.DataArray) – Input data to analyze

  • x_dim (str, optional) – X-coordinate dimension name

  • y_dim (str, optional) – Y-coordinate dimension name

  • min_coverage (float, default 0.1) – Minimum fraction of valid data required (0.0 to 1.0)

Returns:

Dictionary containing validation results: - ‘valid’: bool, whether data is suitable for filling - ‘coverage’: float, fraction of valid data points - ‘total_points’: int, total number of grid points - ‘missing_points’: int, number of missing points - ‘messages’: list, diagnostic messages

Return type:

dict

Examples

>>> from skyborn.gridfill.xarray import validate_grid_coverage
>>>
>>> # Check data quality before filling
>>> validation = validate_grid_coverage(data, min_coverage=0.2)
>>> if validation['valid']:
...     filled = fill(data, eps=1e-4)
... else:
...     print("Data quality issues:", validation['messages'])

Main Functions#

fill(data, eps, x_dim=None, y_dim=None, relax=0.6, itermax=100, initzonal=False, initzonal_linear=False, cyclic=None, initial_value=0.0, verbose=False, keep_attrs=True)[source]#

Fill missing values in xarray DataArray using Poisson equation solver.

This function fills missing values (NaN or masked values) in gridded data by solving Poisson’s equation (∇²φ = 0) using an iterative relaxation scheme. The method provides smooth interpolation while preserving coordinate information and metadata from the input DataArray.

Parameters:
  • data (xarray.DataArray) – Input DataArray containing data with missing values to fill. Missing values can be NaN or masked values.

  • eps (float) – Convergence tolerance. Iteration stops when the maximum residual falls below this threshold.

  • x_dim (str, optional) – Name of the x-coordinate dimension (longitude). If None, will be detected automatically using coordinate metadata.

  • y_dim (str, optional) – Name of the y-coordinate dimension (latitude). If None, will be detected automatically using coordinate metadata.

  • relax (float, default 0.6) – Relaxation parameter for the iterative scheme. Must be in range (0, 1). Values between 0.45-0.6 typically work well.

  • itermax (int, default 100) – Maximum number of iterations.

  • initzonal (bool, default False) – Initialization method for missing values: - False: Initialize with zeros or initial_value - True: Initialize with zonal (x-direction) mean

  • initzonal_linear (bool, default False) – Use linear interpolation for zonal initialization: - False: Use constant zonal mean (if initzonal=True) - True: Use linear interpolation between valid points in each latitude band This provides better initial conditions by connecting valid data points with linear interpolation rather than using a constant mean value. Can be used with both cyclic and non-cyclic data.

  • cyclic (bool, optional) – Whether the x-coordinate is cyclic (e.g., longitude wrapping). If None, will be detected automatically for longitude coordinates.

  • initial_value (float, default 0.0) – Initial value to use for missing grid points when initzonal=False. This provides a custom starting guess for the iterative solver. When initzonal=True, this value may still be used in combination with the zonal mean for enhanced initialization.

  • verbose (bool, default False) – Print convergence information for each slice.

  • keep_attrs (bool, default True) – Preserve input DataArray attributes in output.

Returns:

filled_data – DataArray with missing values filled, preserving coordinates and optionally attributes from input.

Return type:

xarray.DataArray

Raises:
  • ValueError – If spatial coordinates cannot be identified or are invalid

  • TypeError – If input is not an xarray DataArray

Warning

Issues warning if algorithm fails to converge on any slices

Notes

The algorithm solves: ∇²φ = (∂²φ/∂x²) + (∂²φ/∂y²) = 0

using a finite difference relaxation scheme. The method automatically: - Detects spatial coordinates using metadata - Handles cyclic longitude boundaries for global data - Preserves all coordinate information and attributes - Works with multi-dimensional data (time series, levels, etc.)

For missing value detection, both NaN values and xarray/numpy masked arrays are supported.

Examples

Basic usage with automatic coordinate detection:

>>> import xarray as xr
>>> import numpy as np
>>> from skyborn.gridfill.xarray import fill
>>>
>>> # Load data with missing values
>>> data = xr.open_dataarray('sst_with_gaps.nc')
>>>
>>> # Fill missing values
>>> filled = fill(data, eps=1e-4)
>>> print(f"Original shape: {data.shape}")
>>> print(f"Filled shape: {filled.shape}")
>>> print(f"Attributes preserved: {filled.attrs == data.attrs}")

Advanced usage with explicit parameters:

>>> # Create test data with gaps
>>> lons = np.linspace(0, 360, 72, endpoint=False)
>>> lats = np.linspace(-90, 90, 36)
>>> time = pd.date_range('2020-01-01', periods=12, freq='M')
>>>
>>> # Create DataArray with metadata
>>> data = xr.DataArray(
...     np.random.rand(12, 36, 72),
...     coords={'time': time, 'lat': lats, 'lon': lons},
...     dims=['time', 'lat', 'lon'],
...     attrs={'units': 'K', 'long_name': 'temperature'}
... )
>>>
>>> # Add some missing values
>>> data = data.where(np.random.rand(*data.shape) > 0.1)
>>>
>>> # Fill with custom settings
>>> filled = fill(
...     data,
...     eps=1e-5,
...     relax=0.55,
...     initzonal=True,
...     verbose=True
... )

Working with specific coordinate dimensions:

>>> # Explicitly specify coordinate dimensions
>>> filled = fill(data, eps=1e-4, x_dim='longitude', y_dim='latitude')

See also

skyborn.gridfill.fill

Lower-level function for numpy arrays

fill_multiple(datasets, eps, x_dim=None, y_dim=None, **kwargs)[source]#

Fill missing values in multiple DataArrays with consistent parameters.

This convenience function applies the same gridfill parameters to multiple DataArrays, ensuring consistent processing across related datasets.

Parameters:
  • datasets (list of xarray.DataArray) – List of DataArrays to process

  • eps (float) – Convergence tolerance for all datasets

  • x_dim (str, optional) – X-coordinate dimension name (applied to all)

  • y_dim (str, optional) – Y-coordinate dimension name (applied to all)

  • **kwargs – Additional parameters passed to fill()

Returns:

List of filled DataArrays in same order as input

Return type:

list of xarray.DataArray

Examples

>>> from skyborn.gridfill.xarray import fill_multiple
>>>
>>> # Fill multiple related variables
>>> temp_filled, humid_filled = fill_multiple(
...     [temperature_data, humidity_data],
...     eps=1e-4,
...     verbose=True
... )
validate_grid_coverage(data, x_dim=None, y_dim=None, min_coverage=0.1)[source]#

Validate grid data coverage and suitability for gridfill.

This function analyzes the input data to determine if it’s suitable for gap filling and provides diagnostic information.

Parameters:
  • data (xarray.DataArray) – Input data to analyze

  • x_dim (str, optional) – X-coordinate dimension name

  • y_dim (str, optional) – Y-coordinate dimension name

  • min_coverage (float, default 0.1) – Minimum fraction of valid data required (0.0 to 1.0)

Returns:

Dictionary containing validation results: - ‘valid’: bool, whether data is suitable for filling - ‘coverage’: float, fraction of valid data points - ‘total_points’: int, total number of grid points - ‘missing_points’: int, number of missing points - ‘messages’: list, diagnostic messages

Return type:

dict

Examples

>>> from skyborn.gridfill.xarray import validate_grid_coverage
>>>
>>> # Check data quality before filling
>>> validation = validate_grid_coverage(data, min_coverage=0.2)
>>> if validation['valid']:
...     filled = fill(data, eps=1e-4)
... else:
...     print("Data quality issues:", validation['messages'])

Examples#

Basic Usage with Numpy Arrays#

import numpy as np
import numpy.ma as ma
from skyborn.gridfill import fill

# Create test data with missing values
data = np.random.rand(50, 100)
mask = np.zeros_like(data, dtype=bool)
mask[20:30, 40:60] = True  # Create a gap
masked_data = ma.array(data, mask=mask)

# Fill the missing values
filled_data, converged = fill(masked_data, xdim=1, ydim=0, eps=1e-4)
print(f"Convergence: {converged[0]}")

Modern xarray Interface#

import xarray as xr
import numpy as np
from skyborn.gridfill.xarray import fill

# Load data with missing values
data = xr.open_dataarray('sst_with_gaps.nc')

# Fill missing values preserving metadata
filled = fill(data, eps=1e-4)
print(f"Attributes preserved: {bool(filled.attrs)}")

# Advanced usage with validation
from skyborn.gridfill.xarray import validate_grid_coverage

validation = validate_grid_coverage(data, min_coverage=0.2)
if validation['valid']:
    filled = fill(data, eps=1e-5, verbose=True)
else:
    print("Data quality issues:", validation['messages'])

Working with Time Series#

import numpy as np
import pandas as pd
import xarray as xr
from skyborn.gridfill.xarray import fill

# Create time series data with gaps
lons = np.linspace(0, 360, 72, endpoint=False)
lats = np.linspace(-90, 90, 36)
time = pd.date_range('2020-01-01', periods=12, freq='M')

# Create DataArray with metadata
data = xr.DataArray(
    np.random.rand(12, 36, 72),
    coords={'time': time, 'lat': lats, 'lon': lons},
    dims=['time', 'lat', 'lon'],
    attrs={'units': 'K', 'long_name': 'temperature'}
)

# Add some missing values
data = data.where(np.random.rand(*data.shape) > 0.1)

# Fill with custom settings
filled = fill(
    data,
    eps=1e-5,
    relax=0.55,
    initzonal=True,
    verbose=True
)

Multiple Dataset Processing#

from skyborn.gridfill.xarray import fill_multiple

# Fill multiple related variables consistently
temp_filled, humid_filled = fill_multiple(
    [temperature_data, humidity_data],
    eps=1e-4,
    verbose=True
)

Cyclic Boundary Conditions#

# For global data, cyclic boundaries are detected automatically
global_sst = xr.open_dataarray('global_sst.nc')
filled_sst = fill(global_sst, eps=1e-4)  # Automatic cyclic detection

# Or specify explicitly
filled_sst = fill(global_sst, eps=1e-4, cyclic=True)

Performance Considerations#

Algorithm Parameters#

  • eps: Convergence tolerance. Smaller values give higher accuracy but require more iterations

  • relax: Relaxation parameter (0 < relax < 1). Values 0.45-0.6 typically work well

  • itermax: Maximum iterations. Increase for difficult convergence cases

  • initzonal: Use zonal mean initialization for data with strong zonal patterns

Best Practices#

  1. Validate data quality before filling using validate_grid_coverage()

  2. Use appropriate convergence tolerance - 1e-4 is often sufficient

  3. Enable cyclic boundaries for global longitude data

  4. Consider zonal initialization for atmospheric data with strong zonal patterns

  5. Monitor convergence with verbose output for difficult cases

Mathematical Details#

Algorithm Description#

The gridfill algorithm uses a finite difference relaxation scheme to solve:

\[\nabla^2 \phi = \frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = 0\]

The iterative update formula is:

\[\phi_{i,j}^{(k+1)} = \phi_{i,j}^k + \text{relax} \times \frac{\text{residual}_{i,j}}{4}\]

where the residual is computed using a 5-point stencil:

\[\text{residual}_{i,j} = \phi_{i+1,j} + \phi_{i-1,j} + \phi_{i,j+1} + \phi_{i,j-1} - 4\phi_{i,j}\]

Boundary Conditions#

  • Non-cyclic: Natural boundary conditions at edges

  • Cyclic: Periodic boundary conditions for longitude (wrapping)

  • Missing data: Dirichlet boundary conditions from observed values

Convergence Criteria#

The algorithm converges when:

\[\max(|\text{residual}_{i,j}|) < \epsilon\]

for all missing data points.

References#

This implementation is based on the gridfill package by Andrew Dawson: ajdawson/gridfill

The algorithm is commonly used in meteorological and oceanographic applications for:

  • Sea surface temperature gap filling

  • Satellite data reconstruction

  • Climate model data processing

  • Observational data quality control

See Also#

  • skyborn.interp - For other interpolation methods

  • skyborn.windspharm - For spherical harmonic analysis

  • skyborn.calc - For atmospheric calculations