A few issues and discussion points wrt #242 #257

Open
probberechts opened this issue Dec 14, 2023 · 9 comments

Comments

@probberechts
Contributor

Although a bit late, I see a few issues wrt the recently merged PR #242 by @DriesDeprest.

First, the PR incorporates the Opta "Challenge" event in the kloppy DuelEvent. This changes the definition of a DuelEvent that we agreed upon in #135. Previously, duels corresponded to events that require an intervention. By contrast, the main use of Opta's "Challenge" event is to describe the player who gets dribbled past when a dribbler takes them on. It means that the player who gets dribbled past either did nothing at all or was not able to touch the ball. Otherwise, the event would have been labeled as a "Tackle". Therefore, the definition of a DuelEvent in the Opta serializer is no longer consistent with the definition in the StatsBomb and Wyscout serializers.

I am not per se against adding the "Challenge" event, but then the StatsBomb and Wyscout serializers should be adapted accordingly and there should be a distinction between, in Opta terminology, a tackle and a challenge. A tackle is an intervention, while a challenge is an opportunity to tackle. To draw a parallel, giving the same label to a tackle and a challenge would be like labeling a big chance as a shot. This also sabotages my effort to integrate kloppy and socceraction because Challenges are not seen as actions in SPADL.

Second, the PR introduces a DuelType.TACKLE qualifier, which is equivalent to DuelType.GROUND + ~DuelType.LOOSE_BALL. Adding this was previously suggested by @MKlaasman in #135 (comment). Although I don't like redundant qualifiers, I am not strongly against it, but it should be added to the StatsBomb and Wyscout parsers too and be documented to avoid confusion regarding the difference between a ground duel and a tackle.
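
To make the redundancy concrete, here is a minimal sketch. It uses a stand-in `DuelType` enum defined locally rather than kloppy's own classes, and the two helper names are hypothetical.

```python
from enum import Enum


class DuelType(Enum):
    # Stand-in for illustration only; not imported from kloppy.
    AERIAL = "AERIAL"
    GROUND = "GROUND"
    LOOSE_BALL = "LOOSE_BALL"
    TACKLE = "TACKLE"


def is_tackle_implicit(duel_types: set) -> bool:
    """A ground duel that is not a loose-ball duel (GROUND + ~LOOSE_BALL)."""
    return DuelType.GROUND in duel_types and DuelType.LOOSE_BALL not in duel_types


def is_tackle_explicit(duel_types: set) -> bool:
    """Relies on the parser having set the redundant TACKLE qualifier."""
    return DuelType.TACKLE in duel_types
```

If only some parsers set TACKLE, the two checks give different answers for the same kind of event, which is the confusion the documentation would need to address.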

@JanVanHaaren JanVanHaaren changed the title A few issues and dicscussion points wrt #242 A few issues and discussion points wrt #242 Dec 15, 2023
@JanVanHaaren
Collaborator

Thank you for raising the issues and starting the discussion, @probberechts. I must admit that I didn't think much about the potential implications while reviewing #242. I also completely forgot about the discussion in #135.

As I recently mentioned in #240, it would be helpful if we had formal definitions for each of the kloppy event types and qualifiers.

@DriesDeprest
Contributor

Thanks for your input on this, @probberechts.

Regarding 1/ the difference between Challenge & Tackle:
To make sure I understand - are you saying that Opta's definition of a Challenge ("player unsuccessfully attempts to tackle an opponent as the opponent dribbles past them") does not mean an unsuccessful tackle but rather an unsuccessful attempt to even make the tackle? If so, I guess I misinterpreted the definition and agree that the current implementation is undesirable.

I think we should add a DribbledPast event, which would be identified as follows for each provider:

  • StatsBomb: 1 to 1 link with StatsBomb "Dribbled Past" event
  • Opta: 1 to 1 link with Opta "Challenge" event
  • Wyscout: An unsuccessful Dribble Past event. A successful Dribble Past event should be marked as a successful DuelEvent imo.
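
A rough sketch of that mapping, using plain strings instead of kloppy types. The function name, the provider keys, and the returned labels are all hypothetical and only illustrate the three rules listed above.

```python
from typing import Optional


def map_provider_event(
    provider: str, event_name: str, successful: Optional[bool] = None
) -> Optional[str]:
    """Return an illustrative kloppy-side label for a provider event, or None."""
    if provider == "statsbomb" and event_name == "Dribbled Past":
        return "DribbledPast"
    if provider == "opta" and event_name == "Challenge":
        return "DribbledPast"
    if provider == "wyscout" and event_name == "Dribble Past":
        if successful is None:
            raise ValueError("Wyscout events need an explicit success flag")
        # A successful defensive action becomes a won duel instead.
        return "DuelEvent(WON)" if successful else "DribbledPast"
    return None
```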

Regarding 2/ whether we want a Tackle qualifier or not. I forgot about the discussion in #135. Moving forward, I don't have a strong opinion on whether we have an explicit Tackle qualifier or whether it is up to the user to recognize tackle events as Ground DuelEvents that are not LooseBall. Thus, happy to follow what is decided.

@koenvo do you have an opinion on this?

@probberechts
Contributor Author

Yes, exactly. The Opta "Challenge" event agrees ~90% with StatsBomb's "Dribbled Past" event.

You can find some examples of "Challenge" events in BEL - POR at Euro2020 at

  • 3:31 (Vertonghen)
  • 5:00 (Tielemans)
  • 13:40 (Witsel)
  • 33:19 (Witsel)
  • 33:58 (Tielemans)
  • 39:19 (Witsel)
  • 47:04 (De Bruyne) only in Opta data

There are 27 Opta "Challenge" and 22 StatsBomb "Dribbled Past" events in this game.

Quoting @DriesDeprest: "I think we should add a DribbledPast event"

I like how StatsBomb defines a duel: "Duel events describe when a defender challenges an attacker in some way".
In a "Dribbled Past"/"Challenge" event, there is always some degree of contact / pressing between the player that dribbles and the one that gets dribbled past. Hence, intuitively, it is some kind of duel and could thus be incorporated in the DuelEvent.

One option would be to define a "Dribbled Past" event as DuelEvent with DuelType.GROUND qualifier and DuelResult.LOST outcome. A StatsBomb "Duel" with type "Tackle" and an Opta "Tackle" event would be mapped to a DuelEvent with DuelType.GROUND and DuelType.TACKLE qualifiers with DuelResult.WON or DuelResult.LOST outcome (depending on whether possession is regained).

I like this proposal because:

  • It limits the number of event types
  • A duel and challenge are conceptually very related
  • It makes it very explicit that a tackle is something "extra" that a player can do in a ground duel
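
A minimal sketch of this proposal, with locally defined stand-in types (not kloppy's classes) and hypothetical constructor helpers:

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class DuelType(Enum):
    GROUND = "GROUND"
    TACKLE = "TACKLE"


class DuelResult(Enum):
    WON = "WON"
    LOST = "LOST"


@dataclass(frozen=True)
class Duel:
    # Stand-in for a DuelEvent: only qualifiers and result are modeled.
    types: FrozenSet[DuelType]
    result: DuelResult


def from_dribbled_past() -> Duel:
    """StatsBomb "Dribbled Past" / Opta "Challenge": ground duel, always lost."""
    return Duel(frozenset({DuelType.GROUND}), DuelResult.LOST)


def from_tackle(possession_regained: bool) -> Duel:
    """Provider tackle: ground duel with the extra TACKLE qualifier."""
    result = DuelResult.WON if possession_regained else DuelResult.LOST
    return Duel(frozenset({DuelType.GROUND, DuelType.TACKLE}), result)
```

Under this scheme, a lost tackle and a dribbled-past event share the LOST result and differ only in the TACKLE qualifier.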

@DriesDeprest
Contributor

Thanks for sharing, I like your proposal.

So would you agree with the following next actions:

  • Add TACKLE qualifiers to our DuelEvents for the provider tackle events in the different deserializers:

  • Recognize the provider dribble(d) past / challenge events and mark them as DuelEvents with DuelType.GROUND qualifier and DuelResult.LOST outcome:

    • Opta: Handle tackle / challenge differently
    • Wyscout: Recognize Dribble Past event as DuelEvents with DuelType.GROUND qualifier and DuelResult.LOST / WON outcome depending on provider event success
    • StatsBomb: Recognize Dribbled Past event as DuelEvents with DuelType.GROUND qualifier and DuelResult.LOST

@probberechts
Contributor Author

I only have a minor remark regarding this point:

"Wyscout: Recognize Dribble Past event as DuelEvents with DuelType.GROUND qualifier and DuelResult.LOST / WON outcome depending on provider event success"

The Wyscout docs give the following examples of a won dribble past attempt:

  1. defending player dispossesses the attacker
  2. defending player kicks the ball out
  3. the attacker stays with the ball, but the defender forces him to go back

According to the current implementation of the StatsBomb deserializer, a team has to regain possession after a duel for it to be considered successful. Hence, only the first one would yield a DuelResult.WON outcome.

I am not sure what the best solution would be here. You could certainly argue that the second and third examples are—albeit to a lesser degree—also successful.
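
The two readings can be put side by side in a small sketch. The enum members mirror the three Wyscout examples plus the case where the attacker gets past; all names are hypothetical, and neither rule is an agreed definition.

```python
from enum import Enum


class DefenderOutcome(Enum):
    DISPOSSESSED_ATTACKER = 1  # Wyscout example 1
    KICKED_BALL_OUT = 2        # Wyscout example 2
    FORCED_ATTACKER_BACK = 3   # Wyscout example 3
    DRIBBLED_PAST = 4          # the attacker gets past the defender


def strict_result(outcome: DefenderOutcome) -> str:
    """StatsBomb-style rule: WON only when possession is regained."""
    return "WON" if outcome is DefenderOutcome.DISPOSSESSED_ATTACKER else "LOST"


def lenient_result(outcome: DefenderOutcome) -> str:
    """Wyscout-style rule: any of the three examples counts as WON."""
    return "LOST" if outcome is DefenderOutcome.DRIBBLED_PAST else "WON"
```

The two rules disagree exactly on `KICKED_BALL_OUT` and `FORCED_ATTACKER_BACK`.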

@DriesDeprest
Contributor

Okay, I understand. For Wyscout v3 I would then apply the logic shown in the screenshot to determine the DuelResult
[image: screenshot of the proposed DuelResult logic for Wyscout v3]

The stoppedProgress and recoveredPossession can be read from Wyscout v3's raw data.

Do you agree with this approach?

@DriesDeprest
Contributor

@probberechts I've created an overview of how kloppy currently parses the duel events of the different providers, along with two suggestions on how to change that to properly capture the dribbled past events.

I've added an explicit and implicit suggestion on how to adjust the kloppy duel type definitions to be able to support dribbled past events. In the explicit version, we would add a DribbledPast duel type to explicitly label dribbled past events. In the implicit version, a dribbled past event could be recognized as a duel event with a qualifier Ground and no qualifier Tackle or LooseBall.
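
The difference between the two variants can be sketched as follows. The `DuelType` enum is a local stand-in, and `DRIBBLED_PAST` exists only in the explicit proposal.

```python
from enum import Enum


class DuelType(Enum):
    # Stand-in for illustration; DRIBBLED_PAST is the proposed new member.
    AERIAL = "AERIAL"
    GROUND = "GROUND"
    LOOSE_BALL = "LOOSE_BALL"
    TACKLE = "TACKLE"
    DRIBBLED_PAST = "DRIBBLED_PAST"


def is_dribbled_past_explicit(duel_types: set) -> bool:
    """Explicit variant: the parser labels the event directly."""
    return DuelType.DRIBBLED_PAST in duel_types


def is_dribbled_past_implicit(duel_types: set) -> bool:
    """Implicit variant: GROUND with neither TACKLE nor LOOSE_BALL."""
    return (
        DuelType.GROUND in duel_types
        and DuelType.TACKLE not in duel_types
        and DuelType.LOOSE_BALL not in duel_types
    )
```

The implicit variant only works if every parser sets TACKLE on tackles; a tackle parsed without that qualifier would be indistinguishable from a dribbled past event.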

I've done this exercise for Opta, StatsBomb & Wyscout v3. I'm not planning to adjust the Wyscout v3 deserialization logic in the near future, but I wanted to do the thought exercise already to make sure our decision is future-proof in case we update the Wyscout v3 deserializer.

Do you like either of the two proposals? Which one do you prefer? Or am I still missing something in my suggestion?

@probberechts
Contributor Author

probberechts commented Jan 11, 2024

Thanks @DriesDeprest.

I am happy with the assignment of the DuelType qualifiers and I don't have a strong preference for the explicit or implicit DuelType qualifiers.

Determining the DuelOutcome is more challenging. I guess the first question is whether we stick to the existing outcomes (WON, LOST, NEUTRAL) or whether we add additional gradations of being successful. Looking at your analysis, I think the following criteria are used for a duel to be successful or unsuccessful by the data providers:

  • Which dueling player touches the ball first (this is used to determine won/lost for aerial duels)
  • Whether one of the dueling players performs the next action
  • Whether one of the dueling player's teammates performs the next action (i.e., keep or regain possession)
  • Whether the ball is knocked out of bounds
  • Whether the attacker is forced to go back

One idea would be to work with qualifiers (i.e., a DuelOutcome qualifier) for each of these criteria. It should be possible to derive each of them from the data. Then you can derive a default (WON, LOST, NEUTRAL) outcome by combining qualifiers, and users can modify this definition if they do not agree. It will require quite a lot of work to implement this, though.
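
A minimal sketch of that idea, assuming one boolean flag per criterion, recorded from the perspective of the player the duel is attributed to. The class, the field names, and both rules are hypothetical; in particular, the default rule is just one possible choice, not an agreed definition.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DuelCriteria:
    """Hypothetical per-duel flags, one per criterion in the list above."""
    touched_ball_first: bool = False  # unused by the rules below; relevant for aerial duels
    performs_next_action: bool = False
    teammate_performs_next_action: bool = False
    ball_out_of_bounds: bool = False
    attacker_forced_back: bool = False


def default_outcome(c: DuelCriteria) -> str:
    # Assumed default: possession decides; a stopped attack is NEUTRAL.
    if c.performs_next_action or c.teammate_performs_next_action:
        return "WON"
    if c.ball_out_of_bounds or c.attacker_forced_back:
        return "NEUTRAL"
    return "LOST"


def lenient_outcome(c: DuelCriteria) -> str:
    # Example of a user-supplied rule: stopping the attack also counts as WON.
    if c.ball_out_of_bounds or c.attacker_forced_back:
        return "WON"
    return default_outcome(c)


def outcome(c: DuelCriteria, rule: Callable[[DuelCriteria], str] = default_outcome) -> str:
    """Derive the outcome with the default rule unless the user passes their own."""
    return rule(c)
```

For example, `outcome(DuelCriteria(ball_out_of_bounds=True))` gives `"NEUTRAL"` under the assumed default, while passing `lenient_outcome` as the rule gives `"WON"`.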

The alternative would be to mostly rely on the provider's definitions as in your proposal. Trying to summarize this and stating what's still unclear:

  • Duel[Aerial, LooseBall]: we use the data provider's definition to define the outcome as WON or LOST.
    • Wyscout considers a duel won in favor of the player who touches the ball first, no matter what happens next. An aerial duel that results in a foul is considered won in favor of the player who suffered a foul.
    • StatsBomb: TODO[I cannot find an exact definition for the outcomes]
    • Opta: TODO[I cannot find an exact definition for the outcomes]
  • Duel[Ground, LooseBall]:
    • StatsBomb: we consider the duel as WON if the player's team regains possession, otherwise as LOST.
    • Opta: we use the data provider's definition to define the outcome as WON or LOST. TODO[I cannot find an exact definition for the outcomes]
    • Wyscout: TODO[Wyscout V2 has won / neutral / lost and accurate / not accurate qualifiers. Does V3 have these too? What does won / neutral / lost mean? What do we use to define the outcome?]
  • Duel[Ground, Tackle]:
    • StatsBomb: we consider the duel as WON if the player's team regains possession, otherwise as LOST.
    • Opta: we consider the duel as WON if the player's team regains possession or if the ball goes out of play, otherwise as LOST. TODO["if the ball goes out of play" is not compatible with StatsBomb]
    • Wyscout: TODO
  • Duel[Ground, DribbledPast]:
    • StatsBomb: we consider a "DribbledPast" event as a lost duel
    • Opta: we consider a "Challenge" as a lost duel. TODO[Is a "Tackle (attempt)" a DribbledPast or Tackle?]
    • Wyscout: TODO[???]
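
The incompatibility flagged for Duel[Ground, Tackle] can be shown with two small predicates. This is a sketch of the rules as summarized above, with hypothetical function and parameter names, not the actual deserializer code.

```python
def statsbomb_tackle_won(team_regains_possession: bool) -> bool:
    """StatsBomb rule as summarized above: WON only if possession is regained."""
    return team_regains_possession


def opta_tackle_won(team_regains_possession: bool, ball_out_of_play: bool) -> bool:
    """Opta rule as summarized above: regaining possession or ball out of play."""
    return team_regains_possession or ball_out_of_play


# A tackle that puts the ball out of play without regaining possession.
same_tackle = {"team_regains_possession": False, "ball_out_of_play": True}
opta_says_won = opta_tackle_won(**same_tackle)
statsbomb_says_won = statsbomb_tackle_won(same_tackle["team_regains_possession"])
```

For this tackle, `opta_says_won` is `True` and `statsbomb_says_won` is `False`, so the same action gets a different DuelResult depending on the provider.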

@DriesDeprest
Contributor

Thanks for reviewing and sharing your insights @probberechts.

On the explicit vs implicit DuelType qualifiers, my preference would go to the explicit suggestion.

Regarding the DuelResult, I agree that your solution of adding the listed qualifiers that would allow calculating the DuelResult will result in more predictable and standardized behaviour across data providers. However, I don't feel comfortable committing to develop this logic, as it indeed seems like quite a lot of work.

Therefore, I would suggest that in the short run I refine our current implementation by also recognizing dribbled past events and, for now, use the providers' outcome labels to determine our result. I'll follow whatever logic we agree upon here.

@JanVanHaaren @koenvo any thoughts on this? I'd like to start implementing this, but want to make sure you guys agree with the plan.


3 participants