How to calculate opponent's pure possession time in Wyscout


I was reading the Wyscout Glossary when I came across the entry for Possession Adjusted (PAdj). I already knew that “per 90 minutes” defensive metrics are misleading and incomplete, but this entry switched on another light in my head. The source of that light was the following sentences:

In the match of AFC Bournemouth - Manchester City (0:1, 2 March 2019) Man City had had 80% of possession (42:37 pure possession time), while Bournemouth only had 20% (10:35 pure possession time).
Aké, who played the whole match for Bournemouth, had 10 interceptions. His PAdj interceptions value would be: 10 / 42.5 * 30 = 7.06


There was the formula. Math doesn't lie. Wyscout's Advanced Search lists both interceptions per 90 and PAdj interceptions, so the only unknown left in the formula is the opponent's possession time. Then it was just a matter of doing the math.

(Interceptions p90) / (Opp Poss Time) * 30 = (PAdj Interceptions)
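As a quick sanity check, the formula can be run in both directions on the glossary example quoted above. This is only a sketch with the quoted numbers; the variable names are my own and are not Wyscout terminology.

```python
# Numbers from the Wyscout glossary example quoted above
interceptions = 10          # Aké's interceptions over the full match
opp_possession_min = 42.5   # Man City's pure possession time, as rounded in the glossary

# Forward: scale the raw count to 30 minutes of opponent possession
padj_interceptions = interceptions / opp_possession_min * 30
print(round(padj_interceptions, 2))   # 7.06

# Reverse: recover the opponent's possession time from the two published numbers
published_padj = 7.06
recovered_min = interceptions * 30 / published_padj
print(round(recovered_min, 2))        # 42.49
```

The recovered value is not exactly 42.5 because the published PAdj figure is rounded to two decimals.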

Wyscout provides PAdj versions of both the Interceptions and Sliding tackles metrics, so the calculation can be done from either one. Some players have “NA” for sliding tackles and others have “NA” for interceptions; in that case, whichever metric is available can be used. Also, since Wyscout shows a maximum of 2 decimals, the results may differ slightly when the process is applied separately to tackles and to interceptions. So I created a function that takes each case into account.

import math

def _usable(value):
    # None, NaN (how pandas reads an empty or "NA" cell) and 0 carry no information
    return value is not None and not math.isnan(value) and value != 0

def opp_poss_time(raw_inter=None, padj_inter=None, raw_tackle=None, padj_tackle=None):
    # Solve raw / t * 30 = padj for t, separately for each metric
    opp_poss_time_from_inter = (raw_inter * 30 / padj_inter) if _usable(raw_inter) and _usable(padj_inter) else None
    opp_poss_time_from_tackle = (raw_tackle * 30 / padj_tackle) if _usable(raw_tackle) and _usable(padj_tackle) else None

    # Neither metric is usable, so there is nothing to estimate
    if opp_poss_time_from_inter is None and opp_poss_time_from_tackle is None:
        return None

    # If there is only one value, return it; if there are two, return the average
    if opp_poss_time_from_tackle is None:
        return opp_poss_time_from_inter
    if opp_poss_time_from_inter is None:
        return opp_poss_time_from_tackle
    return (opp_poss_time_from_inter + opp_poss_time_from_tackle) / 2

Using the interception and sliding tackle metrics of English Premier League DMCs for the 2023/2024 season, let's calculate the opponents' pure possession time (in minutes per 90) while each player is on the pitch.

import pandas as pd
data = pd.read_excel("data.xlsx")

df = data.copy()
df = df[df["Minutes played"] >= 900]
df['opp_poss_time'] = df.apply(
    lambda row: opp_poss_time(row["Interceptions per 90"], row["PAdj Interceptions"], row["Sliding tackles per 90"], row["PAdj Sliding tackles"]), 
    axis=1
)

df.head()
| | Player | Team within selected timeframe | Position | Minutes played | Defensive duels per 90 | Defensive duels won, % | Sliding tackles per 90 | PAdj Sliding tackles | Interceptions per 90 | PAdj Interceptions | opp_poss_time |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Bruno Guimarães | Newcastle United | DMF, LCMF | 3689 | 6.98 | 61.89 | 0.54 | 0.75 | 3.46 | 4.85 | 21.501031 |
| 1 | D. Rice | Arsenal | DMF, LCMF | 3601 | 4.77 | 66.49 | 0.47 | 0.70 | 3.50 | 5.14 | 20.285436 |
| 2 | Rodri | Manchester City | RDMF, DMF, LDMF | 3263 | 5.57 | 59.90 | 0.08 | 0.14 | 4.03 | 6.89 | 17.345013 |
| 3 | R. Christie | Bournemouth | LDMF, RDMF, LCMF | 3242 | 7.77 | 58.57 | 0.81 | 1.02 | 4.83 | 6.09 | 23.808316 |
| 4 | M. Caicedo | Chelsea | RDMF, DMF, LDMF | 3216 | 6.58 | 56.60 | 0.31 | 0.45 | 4.00 | 5.89 | 20.520091 |
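
The Rodri row above is a good illustration of the two-decimal issue mentioned earlier. Here is a rough sketch of how wide the range of possible possession times is for each metric, assuming Wyscout rounds displayed values to the nearest hundredth (that is my assumption, not something the glossary states). The helper `possession_time_bounds` is hypothetical and only valid for values larger than half a rounding step.

```python
def possession_time_bounds(raw, padj, decimals=2):
    """Range of opponent possession times consistent with values rounded to `decimals` places."""
    half_step = 0.5 * 10 ** -decimals
    # Smallest possible raw value over largest possible PAdj value, and vice versa
    lowest = (raw - half_step) * 30 / (padj + half_step)
    highest = (raw + half_step) * 30 / (padj - half_step)
    return lowest, highest

# Rodri's row: sliding tackles 0.08 / 0.14, interceptions 4.03 / 6.89
tackle_low, tackle_high = possession_time_bounds(0.08, 0.14)
inter_low, inter_high = possession_time_bounds(4.03, 6.89)
print(f"from tackles:       {tackle_low:.2f} to {tackle_high:.2f} minutes")
print(f"from interceptions: {inter_low:.2f} to {inter_high:.2f} minutes")
```

For small values like 0.08, rounding alone leaves a range of a few minutes, while the interception-based estimate is pinned down to well under a tenth of a minute. The simple average used in the function treats both estimates equally.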

Let's also compute a possession-adjusted version of Defensive Duels. But before that, let's look at the ranking by raw defensive duels per 90. After that, we'll sort by PAdj Def Duels.

selected_columns = ["Player", "Team within selected timeframe", "Position", "Minutes played", "Defensive duels per 90", "Defensive duels won, %", "opp_poss_time"]

df.sort_values("Defensive duels per 90", ascending = False).head(10).filter(selected_columns)
| | Player | Team within selected timeframe | Position | Minutes played | Defensive duels per 90 | Defensive duels won, % | opp_poss_time |
|---|---|---|---|---|---|---|---|
| 26 | N. Domínguez | Nottingham Forest | LDMF, RDMF, LCMF | 1651 | 10.58 | 67.01 | 28.226496 |
| 9 | A. Mac Allister | Liverpool | DMF, RCMF, LCMF | 2897 | 10.13 | 57.36 | 18.327602 |
| 14 | R. Yates | Nottingham Forest | RDMF, RCMF | 2309 | 9.86 | 64.03 | 27.584688 |
| 7 | João Palhinha | Fulham | RDMF, LDMF, LCMF | 3022 | 9.83 | 61.52 | 23.755446 |
| 21 | W. Endo | Liverpool | DMF | 1950 | 9.55 | 56.04 | 18.167705 |
| 32 | I. Sangaré | Nottingham Forest | LDMF, RCMF | 1139 | 9.48 | 60.00 | 28.369565 |
| 8 | Vini Souza | Sheffield United | DMF, RCMF | 3019 | 8.85 | 66.67 | 29.744412 |
| 17 | Casemiro | Manchester United | DMF, RDMF, RCB | 2226 | 8.65 | 60.75 | 24.553645 |
| 10 | C. Nørgaard | Brentford | DMF | 2796 | 8.59 | 61.05 | 24.538352 |
| 30 | S. Lukić | Fulham | RDMF, LDMF, RCMF | 1289 | 8.59 | 52.85 | 25.066950 |
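
Now let's apply the same formula to defensive duels, using the estimated possession time:

# Defensive duels per 30 minutes of opponent possession (column-wise, no apply needed)
df["PAdj Def Duels"] = df["Defensive duels per 90"] / df["opp_poss_time"] * 30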

selected_columns = ["Player", "Team within selected timeframe", "Position", "PAdj Def Duels", "Defensive duels per 90", "Defensive duels won, %", "opp_poss_time"]

df.sort_values("PAdj Def Duels", ascending = False).head(10).filter(selected_columns)
| | Player | Team within selected timeframe | Position | PAdj Def Duels | Defensive duels per 90 | Defensive duels won, % | opp_poss_time |
|---|---|---|---|---|---|---|---|
| 9 | A. Mac Allister | Liverpool | DMF, RCMF, LCMF | 16.581547 | 10.13 | 57.36 | 18.327602 |
| 21 | W. Endo | Liverpool | DMF | 15.769741 | 9.55 | 56.04 | 18.167705 |
| 7 | João Palhinha | Fulham | RDMF, LDMF, LCMF | 12.413996 | 9.83 | 61.52 | 23.755446 |
| 33 | S. Amrabat | Manchester United | LDMF, RCMF, DMF | 11.544308 | 8.58 | 53.40 | 22.296703 |
| 15 | Y. Bissouma | Tottenham Hotspur | LDMF, DMF, RDMF | 11.305667 | 6.64 | 61.31 | 17.619482 |
| 26 | N. Domínguez | Nottingham Forest | LDMF, RDMF, LCMF | 11.244754 | 10.58 | 67.01 | 28.226496 |
| 31 | R. Bentancur | Tottenham Hotspur | LDMF, DMF, RDMF | 11.176129 | 6.28 | 61.25 | 16.857357 |
| 14 | R. Yates | Nottingham Forest | RDMF, RCMF | 10.723340 | 9.86 | 64.03 | 27.584688 |
| 17 | Casemiro | Manchester United | DMF, RDMF, RCB | 10.568696 | 8.65 | 60.75 | 24.553645 |
| 10 | C. Nørgaard | Brentford | DMF | 10.501928 | 8.59 | 61.05 | 24.538352 |
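
To see why the order changes, here is a small standalone check that redoes the arithmetic for three players, with the numbers copied by hand from the two tables above. It is a sketch without pandas; the helper name `padj_def_duels` is mine.

```python
# (player, defensive duels per 90, opp_poss_time) copied from the tables above
rows = [
    ("N. Domínguez", 10.58, 28.226496),
    ("A. Mac Allister", 10.13, 18.327602),
    ("João Palhinha", 9.83, 23.755446),
]

def padj_def_duels(duels_per_90, opp_poss_time):
    # Same scaling as the PAdj formula: duels per 30 minutes of opponent possession
    return duels_per_90 / opp_poss_time * 30

# Re-rank the three players by the possession-adjusted value
ranked = sorted(rows, key=lambda r: padj_def_duels(r[1], r[2]), reverse=True)
for player, duels, minutes in ranked:
    print(f"{player}: {padj_def_duels(duels, minutes):.2f}")
# A. Mac Allister: 16.58
# João Palhinha: 12.41
# N. Domínguez: 11.24
```

Domínguez leads the per-90 list, but his opponents have the ball for roughly 28 minutes per 90 against about 18 for Mac Allister's, so he drops behind once the opportunity to defend is taken into account.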