Expanding an events column to match the Dataset’s shape

In this section we’re going to demonstrate how to use expand_to_match_ds(). Some observations first:

  • The events DataFrame doesn’t have a custom index name, so fill_value_col keeps its default, event_index.

  • We’re going to fill the output DataArray with the index of each event, forward-filled. The fill column must contain unique values, and the index does.

  • We’re going to use start_frame as the dimension_matching_col, having first specified that it maps to frame in the Dataset. This mapping is consistent because the values of start_frame form a subset of the values of frame in the Dataset.

  • fill_value_col and fill_method use the default values, so we won’t explicitly specify them.
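The two preconditions in the observations above (a unique fill column and start_frame values that all exist in frame) can be checked by hand before calling the library. The following is a minimal sketch using only NumPy and pandas; the variable names frames, index_is_unique and starts_in_frames are illustrative and not part of xarray-events.

```python
import numpy as np
import pandas as pd

# Same frame range and start frames as the example in this section.
frames = np.arange(1, 2451)
events = pd.DataFrame(
    {'start_frame': [1, 425, 600, 945, 1100, 1280, 1890, 2020, 2300, 2390]}
)

# The default RangeIndex acts as the event index, so it must be unique.
index_is_unique = events.index.is_unique

# Every start_frame must be one of the Dataset's frame values.
starts_in_frames = bool(np.isin(events['start_frame'], frames).all())

print(index_is_unique, starts_in_frames)
```

If either check fails, the mapping from start_frame to frame would not be consistent in the sense described above.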

By calling expand_to_match_ds() this way we’ll be constructing a DataArray with the following properties:

  • The coordinate is frame.

  • Each position holds the (unique) index of an event, repeated forward until the next event starts. Therefore, each value represents the event that is taking place at the frame given by the coordinate.
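To make the forward-fill idea concrete, here is a sketch in plain pandas of the result described above. It is an illustration under the assumption that forward-filling behaves like pandas' reindex with method='ffill'; it is not how expand_to_match_ds() is implemented, and the names event_index and expanded are chosen only for this example.

```python
import numpy as np
import pandas as pd

frames = np.arange(1, 2451)
events = pd.DataFrame(
    {
        'event_type':
            ['pass', 'goal', 'pass', 'pass', 'pass',
             'penalty', 'goal', 'pass', 'pass', 'penalty'],
        'start_frame': [1, 425, 600, 945, 1100, 1280, 1890, 2020, 2300, 2390],
    }
)

# Place each event's index at the frame where the event starts.
event_index = pd.Series(
    events.index.to_numpy(dtype=float),
    index=events['start_frame'].to_numpy(),
    name='event_index',
)

# Expand to every frame, repeating each index until the next event starts.
expanded = event_index.reindex(frames, method='ffill')
expanded.index.name = 'frame'

# Look up which event is taking place at a given frame.
current = int(expanded.loc[1500])
print(current, events.loc[current, 'event_type'])
```

Frame 1500 falls between the start frames 1280 and 1890, so the lookup returns the event at index 5.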

To do this, we first call load() with the mapping and then call expand_to_match_ds() with the values already discussed:

import numpy as np
import pandas as pd
import xarray as xr
import xarray_events

ds = xr.Dataset(
    data_vars={
        'ball_trajectory': (
            ['frame', 'cartesian_coords'],
            np.exp(np.linspace((-6, -8), (3, 2), 2450))
        )
    },
    coords={
        'frame': np.arange(1, 2451),
        'cartesian_coords': ['x', 'y'],
        'player_id': [2, 3, 7, 19, 20, 21, 22, 28, 34, 79]
    },
    attrs={'match_id': 12, 'resolution_fps': 25}
)

events = pd.DataFrame(
    {
        'event_type':
            ['pass', 'goal', 'pass', 'pass', 'pass',
             'penalty', 'goal', 'pass', 'pass', 'penalty'],
        'start_frame': [1, 425, 600, 945, 1100, 1280, 1890, 2020, 2300, 2390],
        'end_frame': [424, 599, 944, 1099, 1279, 1889, 2019, 2299, 2389, 2450],
        'player_id': [79, 79, 19, 2, 3, 2, 3, 79, 2, 79]
    }
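)

# Register the events, mapping the Dataset's frame dimension to the
# (start_frame, end_frame) columns, then expand along start_frame.
(
    ds
    .events.load(events, {'frame': ('start_frame', 'end_frame')})
    .events.expand_to_match_ds('start_frame')
)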
<xarray.DataArray 'event_index' (frame: 2450)>
array([ 0., nan, nan, ..., nan, nan, nan])
Coordinates:
  * frame    (frame) int64 1 2 3 4 5 6 7 ... 2444 2445 2446 2447 2448 2449 2450

See “expand_to_match_ds(): A closer look” for a detailed explanation of how that happened.