Grouping a data variable by the events in the DataFrame
In this section we’re going to illustrate how to use the method groupby_events().
Let’s start by recalling that our Dataset contains the data variable ball_trajectory, which gives the Cartesian coordinates of the ball for each frame. Our goal in this tutorial is to find out where the majority of these points lie for each group, which we’ll summarize with the median position.
We can start by generating the groups. These groups tell us which positions in ball_trajectory correspond to the frames spanned by each event, that is, the frames from start_frame to end_frame. Then we’ll compute the median of each group. We can do that like this:
import numpy as np
import pandas as pd
import xarray as xr
import xarray_events

ds = xr.Dataset(
    data_vars={
        'ball_trajectory': (
            ['frame', 'cartesian_coords'],
            np.exp(np.linspace((-6, -8), (3, 2), 2450))
        )
    },
    coords={
        'frame': np.arange(1, 2451),
        'cartesian_coords': ['x', 'y'],
        'player_id': [2, 3, 7, 19, 20, 21, 22, 28, 34, 79]
    },
    attrs={'match_id': 12, 'resolution_fps': 25}
)

events = pd.DataFrame(
    {
        'event_type':
            ['pass', 'goal', 'pass', 'pass', 'pass',
             'penalty', 'goal', 'pass', 'pass', 'penalty'],
        'start_frame': [1, 425, 600, 945, 1100, 1280, 1890, 2020, 2300, 2390],
        'end_frame': [424, 599, 944, 1099, 1279, 1889, 2019, 2299, 2389, 2450],
        'player_id': [79, 79, 19, 2, 3, 2, 3, 79, 2, 79]
    }
)

(
    ds
    .events.load(events, {'frame': ('start_frame', 'end_frame')})
    .events.groupby_events('ball_trajectory')
    .median()
)
<xarray.DataArray 'ball_trajectory' (event_index: 10, cartesian_coords: 2)>
array([[5.39252098e-03, 7.95626970e-04],
       [1.62105883e-02, 2.70288906e-03],
       [4.21467117e-02, 7.81448485e-03],
       [1.05625415e-01, 2.16889707e-02],
       [1.95479999e-01, 4.29810185e-02],
       [8.34698826e-01, 2.15651098e-01],
       [3.25129820e+00, 9.76995999e-01],
       [6.90622435e+00, 2.25647449e+00],
       [1.36302613e+01, 4.80287169e+00],
       [1.79888284e+01, 6.53714824e+00]])
Coordinates:
  * cartesian_coords  (cartesian_coords) <U1 'x' 'y'
  * event_index       (event_index) int64 0 1 2 3 4 5 6 7 8 9
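To make the grouping more concrete, here is a minimal sketch of the same computation using only the Python standard library. It is not how groupby_events() is implemented; it assumes that each event covers every frame from start_frame to end_frame, both inclusive, which is consistent with the medians shown above. The helpers linspace and median_per_event are hypothetical names introduced just for this illustration.

```python
import math
from statistics import median

N_FRAMES = 2450  # frames are numbered 1..2450, as in the Dataset above


def linspace(start, stop, num):
    """Hypothetical helper: num evenly spaced values from start to stop."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# Same synthetic trajectory as in the tutorial, one list per coordinate.
xs = [math.exp(v) for v in linspace(-6.0, 3.0, N_FRAMES)]
ys = [math.exp(v) for v in linspace(-8.0, 2.0, N_FRAMES)]

start_frames = [1, 425, 600, 945, 1100, 1280, 1890, 2020, 2300, 2390]
end_frames = [424, 599, 944, 1099, 1279, 1889, 2019, 2299, 2389, 2450]


def median_per_event(values, starts, ends, first_frame=1):
    """Hypothetical helper: median of values over each event's frame span.

    Assumes both the start and the end frame belong to the event.
    """
    result = []
    for start, end in zip(starts, ends):
        lo = start - first_frame    # list index of the event's first frame
        hi = end - first_frame + 1  # one past the event's last frame
        result.append(median(values[lo:hi]))
    return result


# One (x, y) pair per event, in the order of the events table.
medians = list(zip(
    median_per_event(xs, start_frames, end_frames),
    median_per_event(ys, start_frames, end_frames),
))

for event_index, (mx, my) in enumerate(medians):
    print(event_index, mx, my)
```

Each printed row corresponds to one event_index in the result above. The median is taken separately for x and y, just as median() reduces over the frame dimension while keeping cartesian_coords.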