Grouping a data variable by the events in the DataFrame

In this section we’re going to illustrate how to use the method groupby_events().

Let’s start by recalling that our Dataset contains the data variable ball_trajectory, which stores the Cartesian coordinates of the ball for each frame. Our goal in this tutorial is to find a representative ball position for each group, namely the median of the positions that belong to it.

We can start by generating the groups. Each group collects the positions in ball_trajectory whose frames fall within the span of one event, as given by its start_frame and end_frame. Then we’ll compute the median of each group. We can do that like this:

import numpy as np
import pandas as pd
import xarray as xr
import xarray_events

ds = xr.Dataset(
    data_vars={
        'ball_trajectory': (
            ['frame', 'cartesian_coords'],
            np.exp(np.linspace((-6, -8), (3, 2), 2450))
        )
    },
    coords={
        'frame': np.arange(1, 2451),
        'cartesian_coords': ['x', 'y'],
        'player_id': [2, 3, 7, 19, 20, 21, 22, 28, 34, 79]
    },
    attrs={'match_id': 12, 'resolution_fps': 25}
)

events = pd.DataFrame(
    {
        'event_type':
            ['pass', 'goal', 'pass', 'pass', 'pass',
             'penalty', 'goal', 'pass', 'pass', 'penalty'],
        'start_frame': [1, 425, 600, 945, 1100, 1280, 1890, 2020, 2300, 2390],
        'end_frame': [424, 599, 944, 1099, 1279, 1889, 2019, 2299, 2389, 2450],
        'player_id': [79, 79, 19, 2, 3, 2, 3, 79, 2, 79]
    }
)

# Load the events, group ball_trajectory by event, and take each group's median.
(
    ds
    .events.load(events, {'frame': ('start_frame', 'end_frame')})
    .events.groupby_events('ball_trajectory')
    .median()
)

<xarray.DataArray 'ball_trajectory' (event_index: 10, cartesian_coords: 2)>
array([[5.39252098e-03, 7.95626970e-04],
       [1.62105883e-02, 2.70288906e-03],
       [4.21467117e-02, 7.81448485e-03],
       [1.05625415e-01, 2.16889707e-02],
       [1.95479999e-01, 4.29810185e-02],
       [8.34698826e-01, 2.15651098e-01],
       [3.25129820e+00, 9.76995999e-01],
       [6.90622435e+00, 2.25647449e+00],
       [1.36302613e+01, 4.80287169e+00],
       [1.79888284e+01, 6.53714824e+00]])
Coordinates:
  * cartesian_coords  (cartesian_coords) <U1 'x' 'y'
  * event_index       (event_index) int64 0 1 2 3 4 5 6 7 8 9
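
To make the grouping more concrete, here is a minimal sketch that computes the same kind of per-event medians with plain NumPy. It is not how groupby_events() is implemented; it only assumes that an event covers every frame from start_frame to end_frame, both included, which is consistent with the output above. The helper median_per_event and the variable names are hypothetical and exist only for this illustration.

```python
import numpy as np

# Same synthetic trajectory as in the tutorial: 2450 frames, columns x and y.
trajectory = np.exp(np.linspace((-6, -8), (3, 2), 2450))
frames = np.arange(1, 2451)

# Event spans copied from the events DataFrame above.
start_frames = [1, 425, 600, 945, 1100, 1280, 1890, 2020, 2300, 2390]
end_frames = [424, 599, 944, 1099, 1279, 1889, 2019, 2299, 2389, 2450]


def median_per_event(values, frames, starts, ends):
    """Hypothetical helper: median of `values` over each [start, end] span.

    Assumes both ends of the span are inclusive.
    """
    medians = []
    for start, end in zip(starts, ends):
        # Select the rows whose frame number lies inside this event.
        mask = (frames >= start) & (frames <= end)
        medians.append(np.median(values[mask], axis=0))
    return np.array(medians)


manual_medians = median_per_event(trajectory, frames, start_frames, end_frames)
print(manual_medians.shape)  # one (x, y) row per event
```

Each row of manual_medians corresponds to one event, in the same order as the event_index coordinate, so it can be compared row by row against the result of groupby_events().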