
Einops tutorial, part 1: basics

Welcome to einops-land!

We don't write

y = x.transpose(0, 2, 3, 1)

We write comprehensible code

y = rearrange(x, 'b c h w -> b h w c')

einops supports widely used tensor packages (such as numpy, pytorch, chainer, gluon, and tensorflow) and extends them.

What's in this tutorial?

  • fundamentals: reordering, composition and decomposition of axes
  • operations: rearrange, reduce, repeat
  • how much you can do with a single operation!

Preparations

In [1]:
# Examples are given for `numpy`. This code also sets up ipython/jupyter
# so that numpy arrays in the output are displayed as images
import numpy
from utils import display_np_arrays_as_images
display_np_arrays_as_images()

Load a batch of images to play with

In [2]:
ims = numpy.load('./resources/test_images.npy', allow_pickle=False)
# There are 6 images of shape 96x96 with 3 color channels, packed into a single tensor
print(ims.shape, ims.dtype)
(6, 96, 96, 3) float64
In [3]:
# display the first image (whole 4d tensor can't be rendered)
ims[0]
Out[3]:
In [4]:
# second image in a batch
ims[1]
Out[4]:
In [5]:
# we'll use three operations
from einops import rearrange, reduce, repeat
In [6]:
# rearrange, as its name suggests, rearranges elements
# below we swap height and width.
# In other words, we transpose the first two axes (dimensions)
rearrange(ims[0], 'h w c -> w h c')
Out[6]:

Composition of axes

Transposition is very common and useful, but let's move on to other capabilities provided by einops.

In [7]:
# einops allows seamlessly composing batch and height into a new height dimension
# we just rendered all images by collapsing to a 3d tensor!
rearrange(ims, 'b h w c -> (b h) w c')
Out[7]:
In [8]:
# or compose a new dimension of batch and width
rearrange(ims, 'b h w c -> h (b w) c')
Out[8]:
In [9]:
# resulting dimensions are computed very simply:
# the length of a newly composed axis is the product of its components
# [6, 96, 96, 3] -> [96, (6 * 96), 3]
rearrange(ims, 'b h w c -> h (b w) c').shape
Out[9]:
(96, 576, 3)
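For intuition, the same composition can be written in plain numpy: move `b` next to `w` with a transpose, then merge the two axes with a reshape. A zero-filled toy array stands in for `ims` here; this is one equivalent spelling, not how einops is implemented.

```python
import numpy as np

# toy stand-in for ims, same shape (6, 96, 96, 3)
x = np.zeros((6, 96, 96, 3))

# 'b h w c -> h (b w) c' in plain numpy:
# 1) move b next to w:  (6, 96, 96, 3) -> (96, 6, 96, 3)
# 2) merge b and w:     (96, 6, 96, 3) -> (96, 576, 3)
y = x.transpose(1, 0, 2, 3).reshape(96, 6 * 96, 3)
print(y.shape)  # (96, 576, 3)
```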
In [10]:
# we can compose more than two axes.
# let's flatten the 4d array into 1d; the resulting array has as many elements as the original
rearrange(ims, 'b h w c -> (b h w c)').shape
Out[10]:
(165888,)

Decomposition of axis

In [11]:
# decomposition is the inverse process - represent an axis as a combination of new axes
# several decompositions are possible, so b1=2 specifies that 6 is decomposed into b1=2 and b2=3
rearrange(ims, '(b1 b2) h w c -> b1 b2 h w c ', b1=2).shape
Out[11]:
(2, 3, 96, 96, 3)
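Since a composed axis is laid out like digits of a number, decomposition is just a reshape in plain numpy, and a pattern like the next cell's `(b1 b2) h w c -> (b1 h) (b2 w) c` becomes reshape, transpose, reshape. A toy zero array is assumed in place of `ims`:

```python
import numpy as np

x = np.zeros((6, 96, 96, 3))

# '(b1 b2) h w c -> b1 b2 h w c' with b1=2 is a plain reshape:
y = x.reshape(2, 3, 96, 96, 3)
print(y.shape)  # (2, 3, 96, 96, 3)

# '(b1 b2) h w c -> (b1 h) (b2 w) c' with b1=2:
# reorder axes to (b1, h, b2, w, c), then merge the pairs
z = y.transpose(0, 2, 1, 3, 4).reshape(2 * 96, 3 * 96, 3)
print(z.shape)  # (192, 288, 3)
```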
In [12]:
# finally, combine composition and decomposition:
rearrange(ims, '(b1 b2) h w c -> (b1 h) (b2 w) c ', b1=2)
Out[12]:
In [13]:
# slightly different composition: b1 is merged with width, b2 with height
# ... so letters are ordered by w then by h
rearrange(ims, '(b1 b2) h w c -> (b2 h) (b1 w) c ', b1=2)
Out[13]:
In [14]:
# move part of the width dimension to height.
# we should call this width-to-height, as the image width shrank by 2x and the height doubled.
# but all pixels are the same!
# can you write the reverse operation (height-to-width)?
rearrange(ims, 'b h (w w2) c -> (h w2) (b w) c', w2=2)
Out[14]:

Order of axes matters

In [15]:
# compare with the next example
rearrange(ims, 'b h w c -> h (b w) c')
Out[15]:
In [16]:
# order of axes in the composition is different
# the rule is the same as for digits in a number: the leftmost axis is the most significant,
# while neighboring elements differ in the rightmost axis.

# you can also think of this as lexicographic sort
rearrange(ims, 'b h w c -> h (w b) c')
Out[16]:
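A tiny example makes the digit analogy concrete. Flattening with `(b w)` keeps `b` most significant, while `(w b)` makes `w` most significant, which in plain numpy means transposing before flattening. The small array here is made up for illustration:

```python
import numpy as np

x = np.array([[0, 1],
              [10, 11]])    # axes: b=2, w=2

# 'b w -> (b w)': b is the most significant "digit"
print(x.reshape(-1))        # [ 0  1 10 11]

# 'b w -> (w b)': w becomes most significant; transpose, then flatten
print(x.T.reshape(-1))      # [ 0 10  1 11]
```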
In [17]:
# what if b1 and b2 are reordered before composing to width?
rearrange(ims, '(b1 b2) h w c -> h (b1 b2 w) c ', b1=2) # produces 'einops'
rearrange(ims, '(b1 b2) h w c -> h (b2 b1 w) c ', b1=2) # produces 'eoipns'
Out[17]:

Meet einops.reduce

In einops-land you don't need to guess what happened

x.mean(-1)

Because you write what the operation does

reduce(x, 'b h w c -> b h w', 'mean')

if an axis is not present in the output — you guessed it — that axis was reduced.

In [18]:
# average over batch
reduce(ims, 'b h w c -> h w c', 'mean')
Out[18]:
In [19]:
# the previous is identical to the familiar:
ims.mean(axis=0)
# but is so much more readable
Out[19]:
In [20]:
# example of reducing over several axes
# besides mean, there are also min, max, sum, prod
reduce(ims, 'b h w c -> h w', 'min')
Out[20]:
In [21]:
# this is mean-pooling with a 2x2 kernel
# the image is split into 2x2 patches, and each patch is averaged
reduce(ims, 'b (h h2) (w w2) c -> h (b w) c', 'mean', h2=2, w2=2)
Out[21]:
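In plain numpy, the same 2x2 mean-pooling is a reshape that splits each spatial axis into a (size, kernel) pair, followed by a mean over the kernel axes. A small made-up batch is used here instead of `ims`:

```python
import numpy as np

x = np.arange(2 * 4 * 4 * 3, dtype=float).reshape(2, 4, 4, 3)  # b h w c

# 'b (h h2) (w w2) c -> b h w c', 'mean' with h2=2, w2=2:
# split h -> (h, h2) and w -> (w, w2), then average over h2 and w2
pooled = x.reshape(2, 2, 2, 2, 2, 3).mean(axis=(2, 4))
print(pooled.shape)  # (2, 2, 2, 3)
```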
In [22]:
# max-pooling is similar
# the result is not as smooth as for mean-pooling
reduce(ims, 'b (h h2) (w w2) c -> h (b w) c', 'max', h2=2, w2=2)
Out[22]:
In [23]:
# yet another example. Can you compute the resulting shape?
reduce(ims, '(b1 b2) h w c -> (b2 h) (b1 w)', 'mean', b1=2)
Out[23]:

Stack and concatenate

In [24]:
# rearrange can also take care of lists of arrays with the same shape
x = list(ims)
print(type(x), 'with', len(x), 'tensors of shape', x[0].shape)
# that's how we can stack inputs
# the "list axis" becomes the first one ("b" in this case), and here we leave it there
rearrange(x, 'b h w c -> b h w c').shape
<class 'list'> with 6 tensors of shape (96, 96, 3)
Out[24]:
(6, 96, 96, 3)
In [25]:
# but the new axis can appear in another place:
rearrange(x, 'b h w c -> h w c b').shape
Out[25]:
(96, 96, 3, 6)
In [26]:
# that's equivalent to numpy stacking
numpy.array_equal(rearrange(x, 'b h w c -> h w c b'), numpy.stack(x, axis=3))
Out[26]:
True
In [27]:
# ... or we can concatenate along an existing axis
rearrange(x, 'b h w c -> h (b w) c').shape
Out[27]:
(96, 576, 3)
In [28]:
# which matches the behavior of concatenation
numpy.array_equal(rearrange(x, 'b h w c -> h (b w) c'), numpy.concatenate(x, axis=1))
Out[28]:
True

Addition or removal of axes

You can write 1 to create a new axis of length 1. Similarly, you can remove such an axis.

There is also a synonym () that you can use: that's a composition of zero axes, and it also has unit length.

In [29]:
x = rearrange(ims, 'b h w c -> b 1 h w 1 c') # functionality of numpy.expand_dims
print(x.shape)
print(rearrange(x, 'b 1 h w 1 c -> b h w c').shape) # functionality of numpy.squeeze
(6, 1, 96, 96, 1, 3)
(6, 96, 96, 3)
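In plain numpy these unit-axis patterns correspond to inserting axes (as numpy.expand_dims does) and squeezing them away again. A toy zero array stands in for `ims`:

```python
import numpy as np

x = np.zeros((6, 96, 96, 3))

# 'b h w c -> b 1 h w 1 c' ~ inserting unit axes:
y = x[:, None, :, :, None, :]
print(y.shape)  # (6, 1, 96, 96, 1, 3)

# 'b 1 h w 1 c -> b h w c' ~ squeezing them away:
print(y.squeeze(axis=(1, 4)).shape)  # (6, 96, 96, 3)
```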

Reduce ⇆ repeat

reduce and repeat are like opposites of each other: the first decreases the number of elements, the second increases it.
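repeat itself only appears in code later (In [42]); in plain numpy its two flavors look roughly like this, on a small made-up array:

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)  # axes: h w

# repeat(x, 'h w -> h w c', c=3) ~ broadcasting against a new trailing axis:
y = np.broadcast_to(x[:, :, None], (2, 3, 3))
print(y.shape)  # (2, 3, 3)

# repeat(x, 'h w -> h (w w2)', w2=2) ~ np.repeat along an existing axis:
# each column is duplicated in place
z = np.repeat(x, 2, axis=1)
print(z.shape)  # (2, 6)
```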

In [30]:
# compute max in each image individually, then show the difference
x = reduce(ims, 'b h w c -> b () () c', 'max') - ims
rearrange(x, 'b h w c -> h (b w) c')
Out[30]:

Fancy examples in random order

(a.k.a. mad designer gallery)

In [31]:
# interweaving pixels of different pictures
# all letters are observable
rearrange(ims, '(b1 b2) h w c -> (h b1) (w b2) c ', b1=2)
Out[31]:
In [32]:
# interweaving along the vertical axis for pairs of images
rearrange(ims, '(b1 b2) h w c -> (h b1) (b2 w) c', b1=2)
Out[32]:
In [33]:
# interweaving lines for pairs of images
# exercise: achieve the same result without einops in your favourite framework
reduce(ims, '(b1 b2) h w c -> h (b2 w) c', 'max', b1=2)
Out[33]:
In [34]:
# color can also be composed into a dimension
# ... while the image is downsampled
reduce(ims, 'b (h 2) (w 2) c -> (c h) (b w)', 'mean')
Out[34]:
In [35]:
# disproportionate resize
reduce(ims, 'b (h 4) (w 3) c -> (h) (b w)', 'mean')
Out[35]:
In [36]:
# split each image in two halves, compute the mean of the two
reduce(ims, 'b (h1 h2) w c -> h2 (b w)', 'mean', h1=2)
Out[36]:
In [37]:
# split in small patches and transpose each patch
rearrange(ims, 'b (h1 h2) (w1 w2) c -> (h1 w2) (b w1 h2) c', h2=8, w2=8)
Out[37]:
In [38]:
# stop me someone!
rearrange(ims, 'b (h1 h2 h3) (w1 w2 w3) c -> (h1 w2 h3) (b w1 h2 w3) c', h2=2, w2=2, w3=2, h3=2)
Out[38]:
In [39]:
rearrange(ims, '(b1 b2) (h1 h2) (w1 w2) c -> (h1 b1 h2) (w1 b2 w2) c', h1=3, w1=3, b2=3)
Out[39]:
In [40]:
# patterns can be arbitrarily complicated
reduce(ims, '(b1 b2) (h1 h2 h3) (w1 w2 w3) c -> (h1 w1 h3) (b1 w2 h2 w3 b2) c', 'mean', 
       h2=2, w1=2, w3=2, h3=2, b2=2)
Out[40]:
In [41]:
# subtract background in each image individually and normalize
# pay attention to () - this is a composition of zero axes, a dummy axis with 1 element.
im2 = reduce(ims, 'b h w c -> b () () c', 'max') - ims
im2 /= reduce(im2, 'b h w c -> b () () c', 'max')
rearrange(im2, 'b h w c -> h (b w) c')
Out[41]:
In [42]:
# pixelate: first downscale by averaging, then upscale back using the same pattern
averaged = reduce(ims, 'b (h h2) (w w2) c -> b h w c', 'mean', h2=6, w2=8)
repeat(averaged, 'b h w c -> (h h2) (b w w2) c', h2=6, w2=8)
Out[42]:
In [43]:
rearrange(ims, 'b h w c -> w (b h) c')
Out[43]:
In [44]:
# let's bring the color dimension in as part of the horizontal axis
# at the same time the image is downsampled by 3x
reduce(ims, 'b (h h2) (w w2) c -> (h w2) (b w c)', 'mean', h2=3, w2=3)
Out[44]:

Ok, numpy is fun, but how do I use einops with some other framework?

Suppose ims is a numpy array and this is what you've written:

rearrange(ims, 'b h w c -> w (b h) c')

That's how you adapt the code for other frameworks:

# pytorch:
rearrange(ims, 'b h w c -> w (b h) c')
# tensorflow:
rearrange(ims, 'b h w c -> w (b h) c')
# chainer:
rearrange(ims, 'b h w c -> w (b h) c')
# gluon:
rearrange(ims, 'b h w c -> w (b h) c')
# cupy:
rearrange(ims, 'b h w c -> w (b h) c')
# jax:
rearrange(ims, 'b h w c -> w (b h) c')

...well, you got the idea.

Einops allows backpropagation as if all operations were native to the framework. Operations do not change when moving to another framework.

Summary

  • rearrange doesn't change the number of elements and covers different numpy functions (like transpose, reshape, stack, concatenate, squeeze, and expand_dims)
  • reduce combines the same reordering syntax with reductions (mean, min, max, sum, prod, and others)
  • repeat additionally covers repeating and tiling
  • composition and decomposition of axes are a cornerstone; they can and should be used together
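As a closing sketch, here is each operation next to the plain-numpy calls it covers, on a hypothetical zero array of the tutorial's shape (the einops patterns are shown in comments; numpy does the actual work):

```python
import numpy as np

x = np.zeros((6, 96, 96, 3))  # b h w c

# rearrange 'b h w c -> b c h w'        ~ transpose
assert x.transpose(0, 3, 1, 2).shape == (6, 3, 96, 96)

# rearrange 'b h w c -> (b h w c)'      ~ reshape / flatten
assert x.reshape(-1).shape == (6 * 96 * 96 * 3,)

# reduce   'b h w c -> h w c', 'mean'   ~ mean over an axis
assert x.mean(axis=0).shape == (96, 96, 3)

# repeat   'b h w c -> b h w c t', t=2  ~ broadcasting against a new axis
assert np.broadcast_to(x[..., None], (6, 96, 96, 3, 2)).shape == (6, 96, 96, 3, 2)
```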