Getting started

Welcome to `xarray-einstats`!

xarray-einstats is an open source Python library that is part of the ArviZ project. It acts as a bridge between the xarray library for labelled arrays and libraries for raw arrays such as NumPy or SciPy.

One of xarray's main goals is “Compatibility with the broader ecosystem”, which is what allows xarray-einstats to play this bridge role with minimal code and duplication.

Overview

xarray-einstats provides wrappers for:

Most of the functions in numpy.linalg
A subset of scipy.stats
rearrange and reduce from einops

These wrappers have the same names and functionality as the original functions. The difference in behaviour is that the wrappers make no assumptions about the meaning of a dimension based on its position, nor do they have arguments like axis or axes. Instead, they have a dims argument that takes dimension names rather than integers indicating the positions of the dimensions on which to act.
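To make the contrast concrete, the sketch below compares a positional call with a name-based one. The helper `det_by_name` is hypothetical and built on `xarray.apply_ufunc` purely for illustration; it is not necessarily how xarray-einstats implements its wrappers, and the random input array is made up for the example.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
# The matrix dimensions are deliberately NOT the last two axes.
da = xr.DataArray(
    rng.normal(size=(4, 10, 4, 3)),
    dims=["dim", "batch", "dim2", "experiment"],
)


def det_by_name(da, dims):
    """Hypothetical name-based wrapper around numpy.linalg.det."""
    # input_core_dims moves the named dimensions to the end,
    # which is where numpy.linalg.det expects the matrices.
    return xr.apply_ufunc(np.linalg.det, da, input_core_dims=[list(dims)])


# Positional version: the caller has to reorder the axes by hand.
raw = np.moveaxis(da.values, [0, 2], [-2, -1])  # shape (10, 3, 4, 4)
det_positional = np.linalg.det(raw)

# Name-based version: no knowledge of the axis order is needed.
det_named = det_by_name(da, dims=["dim", "dim2"])
```

Both calls compute the same determinants; only the name-based one keeps working unchanged if the axes of the input are reordered.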

It also provides a handful of re-implemented functions:

These are partially reimplemented because the original functions don’t yet support multidimensional and/or batched computations. They share their names with functions in NumPy or SciPy, but implement only a subset of their features. The goal is for these to eventually become wrappers too.

Using `xarray-einstats`

DataArray inputs

Functions in xarray-einstats are designed to work on DataArray objects.

Let’s load some example data:

from xarray_einstats import linalg, stats, tutorial

da = tutorial.generate_matrices_dataarray(4)
da

<xarray.DataArray (batch: 10, experiment: 3, dim: 4, dim2: 4)> Size: 4kB
3.799 0.4308 3.24 0.1412 0.9402 0.7951 ... 0.6156 1.124 0.8559 2.108 0.7637
Dimensions without coordinates: batch, experiment, dim, dim2

and show an example:

stats.skew(da, dims=["batch", "dim2"])

<xarray.DataArray (experiment: 3, dim: 4)> Size: 96B
1.256 1.432 0.9728 1.762 1.612 1.188 1.033 2.388 2.196 1.455 1.631 1.373
Dimensions without coordinates: experiment, dim

xarray-einstats uses dims as the argument name throughout the codebase. It replaces both axis and axes, without distinguishing between them, and also replaces the (..., M, M) convention used by NumPy.

The use of dims follows dot, instead of the singular dim argument used for example in mean. Both a single dimension and multiple dimensions are valid inputs, and using dims emphasizes the fact that operations and reductions can be performed over multiple dimensions at the same time. Moreover, in linear algebra functions, dims is often restricted to a two-element list, as it indicates which dimensions define the matrices; all other dimensions are interpreted as batch dimensions.
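As a small illustration of these conventions, the sketch below reuses the tutorial data from above. It assumes that a single dimension can be passed as a plain string, as the paragraph above suggests; if that turns out not to hold for a given function, a one-element list should serve the same purpose.

```python
from xarray_einstats import linalg, stats, tutorial

da = tutorial.generate_matrices_dataarray(4)

# A single dimension name reduces over that dimension only
# (assumption: a bare string is accepted in place of a list).
skew_batch = stats.skew(da, dims="batch")

# A list of names reduces over all of them at once.
skew_both = stats.skew(da, dims=["batch", "dim2"])

# In linear algebra functions, dims names the two matrix dimensions;
# every other dimension is treated as a batch dimension.
dets = linalg.det(da, dims=["dim", "dim2"])

print(skew_batch.dims)
print(skew_both.dims)
print(dets.dims)
```

In each case the dimensions named in dims are consumed by the operation, and the remaining ones are carried through to the output.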

That means that the two calls below are equivalent even though the dimension order of their inputs differs, because the dimension names are the same. Thus,

linalg.det(da, dims=["dim", "dim2"])

<xarray.DataArray (batch: 10, experiment: 3)> Size: 240B
23.55 2.033 0.3923 -7.374 0.06645 ... 1.804 -0.1599 8.875 -0.04935 -8.428
Dimensions without coordinates: batch, experiment

returns the same as:

linalg.det(da.transpose("dim2", "experiment", "dim", "batch"), dims=["dim", "dim2"])

<xarray.DataArray (experiment: 3, batch: 10)> Size: 240B
23.55 -7.374 -5.617 -12.29 1.77 -0.6289 ... -11.07 -0.5096 -28.77 -0.1599 -8.428
Dimensions without coordinates: experiment, batch

Important

In xarray_einstats only the dimension names matter, not their order.
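The two printed outputs above list their values in a different sequence only because the remaining dimensions come out in a different order. A quick way to confirm that the results match is to align that order before comparing; the sketch below does so with `xarray.testing.assert_allclose`, and the variable names are chosen here for illustration.

```python
import xarray as xr
from xarray_einstats import linalg, tutorial

da = tutorial.generate_matrices_dataarray(4)

det_original = linalg.det(da, dims=["dim", "dim2"])
det_transposed = linalg.det(
    da.transpose("dim2", "experiment", "dim", "batch"),
    dims=["dim", "dim2"],
)

# Reorder the second result to match the first, then compare values.
# assert_allclose raises an AssertionError if the arrays differ.
xr.testing.assert_allclose(
    det_original, det_transposed.transpose(*det_original.dims)
)
```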
