HitResults are not calculated properly at the hit-window edges compared to osu!stable. #11311

Open
abstrakt8 opened this issue Dec 25, 2020 · 33 comments · May be fixed by #26452
Labels: area:replay, compatibility change (changes to be considered in the future which break compatibility with osu!stable scores), ruleset/osu!

Comments

@abstrakt8
Contributor

Describe the bug:

For the OsuRuleset (did not research/test for other rulesets) I have found a bug as described in the title.

Example:

Given OD=10, which means the hit window for Ok (100) is supposed to be ±59.5ms, with anything later being a Meh (50):

In Stable: +59ms => Ok, +60ms => Meh
In Lazer: +59ms => Ok, +60ms => Ok which is incorrect.

This is demonstrated in one score of mine:

Side by side comparison: https://www.youtube.com/watch?v=IQGsfzOEyJY

  1. If you skip to the end you can already see that there is a big accuracy difference.
  2. At timestamp 1:57, in the last stream, there are supposed to be three consecutive 50s, which are
    calculated correctly in Stable but not in Lazer.

I used gosumemory to calculate the hit offsets of this replay and at the
suspicious note (=6th last note of the map) I hit the note with exactly +60ms
offset. (https://gist.github.com/abstrakt-osu/46997fa9d203564ed1f1b1eee09a18ba#file-gosu-json-L894).

Screenshots or videos showing encountered issue:

Side by side comparison: https://www.youtube.com/watch?v=IQGsfzOEyJY

Beatmap: https://osu.ppy.sh/beatmapsets/1010865#osu/2117132
Score (+Replay): https://osu.ppy.sh/scores/osu/3321687146

Stable replay: https://youtu.be/FDhGuZh3P_U
Lazer replay: https://youtu.be/Ositc3vu_bg

osu!lazer version: 2020.1225.0

Logs:

@bdach
Collaborator

bdach commented Dec 25, 2020

Given the following, I'm not surprised that lazer is giving a hit window of 60ms:

    private static readonly DifficultyRange[] osu_ranges =
    {
        new DifficultyRange(HitResult.Great, 80, 50, 20),
        new DifficultyRange(HitResult.Ok, 140, 100, 60),
        new DifficultyRange(HitResult.Meh, 200, 150, 100),
        new DifficultyRange(HitResult.Miss, 400, 400, 400),
    };

    public static double DifficultyRange(double difficulty, double min, double mid, double max)
    {
        if (difficulty > 5)
            return mid + (max - mid) * (difficulty - 5) / 5;
        if (difficulty < 5)
            return mid - (mid - min) * (5 - difficulty) / 5;
        return mid;
    }

Not sure where the 59.5ms figure cited above is coming from.
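
For anyone who wants to check the numbers by hand, here is a minimal Python sketch (not lazer's C# code) that transcribes the DifficultyRange interpolation quoted above. The names `OSU_RANGES` and `difficulty_range` are made up for illustration; the values are copied from the osu_ranges table.

```python
# Values copied from the osu_ranges table above: (min, mid, max),
# i.e. the window in ms at OD 0, OD 5 and OD 10.
OSU_RANGES = {
    "Great": (80, 50, 20),
    "Ok": (140, 100, 60),
    "Meh": (200, 150, 100),
    "Miss": (400, 400, 400),
}


def difficulty_range(difficulty, minimum, mid, maximum):
    """Piecewise-linear interpolation, transcribed from the C# snippet."""
    if difficulty > 5:
        return mid + (maximum - mid) * (difficulty - 5) / 5
    if difficulty < 5:
        return mid - (mid - minimum) * (5 - difficulty) / 5
    return mid


# Print the OD 10 windows for every result.
for name, (lo, mid, hi) in OSU_RANGES.items():
    print(name, difficulty_range(10, lo, mid, hi))
```

At OD 10 this gives 60 ms for Ok, which matches the observation that the formula itself never produces 59.5.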

@abstrakt8
Contributor Author

abstrakt8 commented Dec 25, 2020

> Not sure where the 59.5ms figure cited above is coming from.

Took the value from https://osu.ppy.sh/wiki/en/Beatmap_Editor/Song_Setup#overall-difficulty . When hovering over the beatmap difficulty stats in stable you can also see the hit window for 100 being +-59.5ms. Of course it could be implemented differently internally.

@peppy
Member

peppy commented Dec 25, 2020

this may be an intentional fix, for what it's worth. will need further discussion.

@smoogipoo smoogipoo added the compatibility change Changes to be considered in the future which break compatibility with osu!stable scores. label Mar 31, 2022
@smoogipoo
Contributor

smoogipoo commented Mar 31, 2022

The .5 figure comes from the fact that osu!stable uses a strict < instead of lazer's <=. That is to say, in osu!stable terms, < 60 is equivalent to <= 59.5 (or a timing window of 59.5ms) because the resolution is 1ms. It's also song select's way of showing that this is the case.

Of note: In osu!stable, osu!mania uniformly uses <=, and I believe osu!taiko also uses <= for the normal hit (strongs are still <).

This could be fixed by subtracting 1 from these values, however I propose we keep them as-is and touch upon this in the future. I've labelled this issue as such.
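
A small Python sketch of the point above, under the assumption that stable rounds the hit error to whole milliseconds before comparing (the helper name `rounds_into_window` is illustrative only). It checks that on integer errors a strict "< 60" accepts exactly the same hits as "<= 59", which is the "subtract 1" fix mentioned above, and that a real-valued error passes only if it is below 59.5.

```python
WINDOW = 60  # OD 10 Ok window from the formula, in ms

# On whole-millisecond errors, strict "< 60" accepts the same hits as "<= 59".
assert all((t < WINDOW) == (t <= WINDOW - 1) for t in range(0, 200))


def rounds_into_window(error_ms: float) -> bool:
    # Assumption: the error is rounded to whole ms before the strict comparison.
    return abs(round(error_ms)) < WINDOW


print(rounds_into_window(59.4))  # True: rounds to 59
print(rounds_into_window(59.6))  # False: rounds to 60
```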

@bdach
Collaborator

bdach commented Mar 31, 2022

The compatibility is one thing, but the replay playback errors are another. LegacyScoreEncoder applies a rounding operation to replay frames:

// Rounding because stable could only parse integral values
int time = (int)Math.Round(legacyFrame.Time + offset);

This means that, for instance, with a timing window of 60ms, if you hit a note at +59.6ms, it will be a 300 in the original play but then will change to +60ms in the replay and thus become a 100. In that case maybe frame times should be truncated/floored when being written out to replay?
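
To make the failure mode concrete, here is a hedged Python sketch using the fractional OK windows that come out of the OD formula for OD 5.4 (96.8 ms) and OD 5.6 (95.2 ms). The object time of 500 ms and the function names are hypothetical; the comparison is lazer's `<=` and the replay path simply rounds the frame time, as in the snippet above.

```python
OBJECT_TIME = 500.0  # hypothetical hit object time in ms


def lazer_hit(error_ms, window_ms):
    # lazer's comparison: inclusive, no rounding of the window.
    return abs(error_ms) <= window_ms


def judge_live_and_replay(hit_time, window_ms):
    """Return (inside window during gameplay, inside window after replay rounding)."""
    live = lazer_hit(hit_time - OBJECT_TIME, window_ms)
    replay = lazer_hit(round(hit_time) - OBJECT_TIME, window_ms)
    return live, replay


print(judge_live_and_replay(596.6, 96.8))  # (True, False): +96.6 becomes +97
print(judge_live_and_replay(595.4, 95.2))  # (False, True): +95.4 becomes +95
```

So the rounding can move a hit out of a window or into one, depending on the fractional part of the window.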

@smoogipoo
Contributor

smoogipoo commented Apr 1, 2022

Probably not floored, because it works both ways - floor(-59.6) == -60. But truncation sounds reasonable.

Edit: I guess it doesn't matter since we're not dealing with relative time values... But that means I don't think just flooring is enough to avoid replay playback errors. You'd want to ceil early hits.

@peppy peppy added the priority:2 Moderately important. Relied on by some users or impeding the usability of the game label Apr 17, 2022
@Walavouchey
Member

relevant wiki page https://github.com/ppy/osu/wiki/Gameplay-differences-from-osu!stable#hit-window-edge-calculations-dont-match-stable (based on a quick glance of the discussion here, it might be missing some details for specific hit windows?)

@Walavouchey
Member

Walavouchey commented Jan 28, 2023

In the above wiki pr I've done some investigation about how stable does hit window comparisons. In summary:

| Ruleset | Comparison (stable) | Comparison (lazer) | Comparison (lazer replay) |
|---|---|---|---|
| osu! | abs(round(hit error)) < floor(hit window) | abs(hit error) <= hit window | abs(round(hit error)) <= hit window |
| osu!taiko | abs(round(hit error)) < floor(hit window), except for the miss window which uses <= | abs(hit error) <= hit window | abs(round(hit error)) <= hit window |
| osu!mania | abs(round(hit error)) <= floor(hit window) | abs(hit error) <= hit window | abs(round(hit error)) <= hit window |

  • Hit windows are truncated (cast to integers, i.e. floored) after accounting for OD, while the hit error uses rounded time derived from the audio engine.
  • Lazer sticks with doubles everywhere and doesn't round anything, except for replay storage.
  • DT and HT make the clock tick faster and slower respectively, which scales the hit error and hit windows used in calculations. This may affect how much the hit windows can vary, but I haven't thought it through thoroughly.

A few examples

| Case | Max error (formula) | Max error (lazer) | Error | Max error (lazer replay) | Error | Max error (stable) | Error | Delta (stable - lazer replay) |
|---|---|---|---|---|---|---|---|---|
| OK window osu!, OD 10 | 60 | <= 60.0 | 0 | < 60.5 | +0.5 | < 59.5 | -0.5 | -1 |
| OK window osu!, OD 5.4 | 96.8 | <= 96.8 | 0 | < 96.5 | -0.3 | < 95.5 | -1.3 | -1 |
| OK window osu!, OD 5.6 | 95.2 | <= 95.2 | 0 | < 95.5 | +0.3 | < 94.5 | -0.7 | -1 |
| OK window osu!mania, OD 5.4 | 110.8 | <= 110.8 | 0 | < 110.5 | -0.3 | < 110.5 | -0.3 | 0 |
| OK window osu!mania, OD 5.6 | 110.2 | <= 110.2 | 0 | < 110.5 | +0.3 | < 110.5 | +0.3 | 0 |

Some theoreticals

| Case | Max error (formula) | Max error (lazer replay if ceiled&floored) | Max replay frame error | Max error (lazer replay if floored&ceiled) | Max replay frame error | Max replay frame error (stable) |
|---|---|---|---|---|---|---|
| OK window osu!, OD 10 | 60 | < 61 | 60 | <= 60 | 60 | 59 |
| OK window osu!, OD 5.4 | 96.8 | < 97 | 96 | <= 96 | 96 | 95 |
| OK window osu!, OD 5.6 | 95.2 | < 96 | 95 | <= 95 | 95 | 94 |
| OK window osu!mania, OD 5.4 | 110.8 | < 111 | 110 | <= 110 | 110 | 110 |
| OK window osu!mania, OD 5.6 | 110.2 | < 111 | 110 | <= 110 | 110 | 110 |

  • "ceiled&floored" is smoogi's suggestion. It effectively means floor(abs(hit error)).
  • "floored&ceiled" is the reverse of smoogi's suggestion. It effectively means ceil(abs(hit error)).

Neither of these two gives correct effective hit windows (max error) for any case. The comparisons simply aren't the same.

Of course, true hit timing aside, if we only compare replays between stable and lazer where we only see the ints, "ceiled&floored" would match stable in the case of osu!mania, but not for osu! (as in, you can be off by a whole millisecond in lazer but not in stable). Same for "floored&ceiled".

In short:

  • In stable with <, the max hit error can be up to 1.5 ms smaller
  • In stable with <=, the max hit error can be up to 0.5 ms smaller or larger
  • In lazer, hit windows are accurate during gameplay because everything uses doubles (however, due to rounding in replays the judgement may change)
  • For replays in lazer (where frame times are rounded integers), max hit errors are accurate to stable wherever <= is used, but larger by 1 ms compared to stable wherever < is used. (Assuming the hit windows have the same value in stable and lazer, of course.) The max hit error can be up to 0.5 ms smaller or larger
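
The comparisons summarised above can be written out as a short Python sketch. This is only a transcription of the formulas from the comparison table in this comment, not code taken from either client, and the function names are invented for the example. Python's built-in round() uses the same round-half-to-even tiebreak as C#'s default Math.Round(), although none of the sample values below land on a tie.

```python
import math


def stable_osu(error, window):
    # osu! and most osu!taiko windows: rounded error, floored window, strict <.
    return abs(round(error)) < math.floor(window)


def stable_mania(error, window):
    # osu!mania (and the taiko miss window): same, but inclusive.
    return abs(round(error)) <= math.floor(window)


def lazer_live(error, window):
    # Gameplay in lazer: doubles everywhere, inclusive comparison.
    return abs(error) <= window


def lazer_replay(error, window):
    # Replays in lazer: rounded frame time, unfloored window.
    return abs(round(error)) <= window


# OK window, osu!, OD 10 (60 ms from the formula).
for error in (59.4, 59.6, 60.0, 60.4, 60.6):
    print(error, stable_osu(error, 60), lazer_live(error, 60), lazer_replay(error, 60))
```

The printed rows line up with the first row of the examples table: stable accepts errors below 59.5, live lazer up to 60.0, and a lazer replay below 60.5.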

I haven't been able to verify this in-game though as I don't have the tooling or time to do so. Maybe there's a relevant test scene in lazer, or if not maybe one of those could be cooked up (comparing against stable rounding/truncation)?

> LegacyScoreEncoder applies a rounding operation to replay frames:

Rounding the hit time for replays should be correct, but I'm not entirely sure what that implies. Are these rounded hit times used for replay playback of plays set on lazer/stable or only for .osr exports?

> I believe osu!taiko also uses <= for the normal hit (strongs are still <).

I haven't been able to find anything of the sort in the stable reference code.

In stable, taiko misses are given here, the comparison for which (<=) can be seen here and here (variable name for the "miss window" is a bit misleading). Greats and Ok's are compared with <. There's no difference between regular and large taiko notes from what I can tell.

> You'd want to ceil early hits.

Both stable and lazer use absolute hit error, so it's always comparing two positive values. There's no special case to consider here, especially since hit times are rounded in stable anyway.

    public HitResult ResultFor(double timeOffset)
    {
        timeOffset = Math.Abs(timeOffset);
        for (var result = HitResult.Perfect; result >= HitResult.Miss; --result)
        {
            if (IsHitResultAllowed(result) && timeOffset <= WindowFor(result))
                return result;
        }
        return HitResult.None;
    }

Where timeOffset is the hit error (positive or negative, but made positive above). This is the same in stable (example).
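
As a rough illustration of how this lookup interacts with the edge case from the original report, here is a simplified Python sketch. It assumes the OD 10 osu! windows from the DifficultyRange table quoted earlier in the thread, drops the IsHitResultAllowed check, and models stable with the comparison from the table above; `result_for_lazer` and `result_for_stable` are hypothetical names.

```python
# OD 10 osu! windows (ms), best result first, taken from the osu_ranges table.
WINDOWS = [("Great", 20.0), ("Ok", 60.0), ("Meh", 100.0), ("Miss", 400.0)]


def result_for_lazer(time_offset):
    # Mirrors the ResultFor loop above: absolute offset, inclusive comparison.
    offset = abs(time_offset)
    for name, window in WINDOWS:
        if offset <= window:
            return name
    return "None"


def result_for_stable(time_offset):
    # Assumption: rounded offset, floored window, strict comparison.
    offset = abs(round(time_offset))
    for name, window in WINDOWS:
        if offset < int(window):
            return name
    return "None"


for offset in (59, 60):
    print(offset, result_for_stable(offset), result_for_lazer(offset))
```

With these assumptions +59 ms is an Ok in both, while +60 ms is a Meh under the stable comparison and an Ok under lazer's, which is the behaviour reported at the top of the issue.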

@smoogipoo
Contributor

smoogipoo commented Jan 30, 2023

> Rounding the hit time for replays should be correct, but I'm not entirely sure what that implies

> Both stable and lazer use absolute hit error, so it's always comparing two positive values. There's no special case to consider here, especially since hit times are rounded in stable anyway.

There are two issues presented here: one is the hit error as you're playing the game, and the other is the timing error of replays. Legacy replays only store integer time deltas whereas lazer deals with floating-point time, so lazer rounds the time values to write into the replay.

Suppose you have a hitobject at 500ms. The user hits a circle at -59.6ms delta lazer-time (i.e. 440.4ms). Should this value be rounded up or down for storage in the replay? Conversely, suppose the user hits at +59.6ms: should this value be rounded up or down?
My proposal is to ceil the former and floor the latter. That way, even though there's still up to 1ms of discrepancy, it will always be the correct judgement.
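
A minimal Python sketch of this proposal, assuming the encoder somehow knows the time of the object being hit (which it may not) and that object times are integers. The function name `quantize_towards_object` is invented for the example.

```python
import math


def quantize_towards_object(hit_time, object_time):
    """Ceil early hits and floor late hits, so the absolute error never grows."""
    if hit_time < object_time:
        return math.ceil(hit_time)
    return math.floor(hit_time)


# The two cases from the paragraph above, with the hitobject at 500 ms.
print(quantize_towards_object(440.4, 500))  # 441, i.e. an error of -59 ms
print(quantize_towards_object(559.6, 500))  # 559, i.e. an error of +59 ms
```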

> There's no difference between regular and large taiko notes from what I can tell.

I believe your assessment here, and for the other hit results, is correct.

@Walavouchey
Member

Walavouchey commented Jan 30, 2023

> Suppose you have a hitobject at 500ms. The user hits a circle at -59.6ms delta lazer-time (i.e. 440.4ms). Should this value be rounded up or down for storage in the replay? Conversely, suppose the user hits at +59.6ms: should this value be rounded up or down?

Rounding the value emulates the rounded-integer temporal granularity of stable. With that, the only difference for replays in lazer (with integer time values) is that they're still compared against double-valued, non-floored hit windows (as well as the < vs <= thing). Using ceil and floor (instead of round) would only cause yet another difference between stable and lazer.

This affects the lazer part of the examples I presented I suppose. I've updated them with lazer replay timings and also made it all into a table.

> even though there's still up to 1ms discrepancy it will always be the correct judgement.

Well I thought through the implications of this... (see updated examples section) but I can't convince myself that this would solve anything.

  • Are you supposed to decide whether to floor or ceil a replay frame depending on proximity to (early-side or late-side of) a hit object?
  • You're essentially making hit windows more lenient, but according to the example in OP, it's already too lenient (expected Meh, got Ok). On top of that, a same exact (sub-millisecond) input would be able to give different replay files on lazer compared to stable.
  • Even if you do this, it doesn't make stable-set replays show correct judgements in lazer.

    In Stable: +59ms => Ok, +60ms => Meh
    In Lazer: +59ms => Ok, +60ms => Ok which is incorrect.

@peppy peppy added this to the User acceptance milestone Feb 6, 2023
@peppy peppy added priority:1 Very important. Feels bad without fix. Affects the majority of users. and removed priority:2 Moderately important. Relied on by some users or impeding the usability of the game labels Feb 6, 2023
@peppy peppy modified the milestones: User acceptance, Game balance May 5, 2023
@peppy peppy moved this to Needs implementation in Path to osu!(lazer) ranked play Aug 20, 2023
@peppy peppy removed this from the Game balance milestone Aug 20, 2023
@peppy peppy moved this from Needs implementation to Needs discussion in Path to osu!(lazer) ranked play Aug 20, 2023
@bdach
Collaborator

bdach commented Aug 22, 2023

Reading through this again, I see two options:

  • Follow stable to the T: round/truncate everything as stable does, reintroduce strict/weak equality as stable has it.
  • Something more creative like keeping lazer replay timestamps as doubles for the higher fidelity, but fiddling with hitwindows to preserve accuracy (something like adding/subtracting half a ms from hitwindows so that a "less than 60ms" window from stable ends up being a 59.5ms window in lazer, so 59.4ms still effectively gets judged as 59ms, and 59.6ms gets bumped up to 60ms and demoted to the next hit window).

This one is probably too technical for community discussion so we'll probably have to come to a conclusion internally. @ppy/team-client anyone wants to sound off on this, or is this going to be a matter of "try both and see which works best"?

@Walavouchey
Member

> I haven't been able to verify this in-game though as I don't have the tooling or time to do so. Maybe there's a relevant test scene in lazer, or if not maybe one of those could be cooked up (comparing against stable rounding/truncation)?

is there a test scene for this? it would be invaluable for both outlined options

@smoogipoo
Contributor

I don't think this is something that is a blocker for the path-to-ranked project since it mostly affects (very few) replays and is otherwise unnoticeable in gameplay. We can fix this at any point later on, with likewise minimal effect to gameplay.

@Paturages

Paturages commented Dec 20, 2023

Adding another sample, just in case (same settings as #25973)

Beatmap
Gameplay score
Replay score
Replay file

> since it mostly affects (very few) replays

As high-level osu!mania gameplay is inherently reliant on tight timing windows, I think this is probably bound to happen more often than not for osu!mania replays. I don't think it's a blocker for ranked play though, as long as the gameplay score (not the replay one) makes sense.

@peppy
Member

peppy commented Dec 21, 2023

This does affect both gameplay and replays FWIW

@peppy
Member

peppy commented Jan 3, 2024

Bumping this to p0 until i or someone else investigates.

@peppy peppy added priority:0 Showstopper. Critical to the next release. and removed priority:1 Very important. Feels bad without fix. Affects the majority of users. labels Jan 3, 2024
@Detze
Contributor

Detze commented Jan 9, 2024

As noted, lazer's hit windows are currently too lenient for standard and taiko, and too strict for mania, compared to stable's hit windows. For OD 10 in standard, the OD hit window formula gives 20 ms for the 300 hit window. In reality, the true hit error in stable, as measured by the game's clock, must lie in the interval (-19.5 ms, 19.5 ms) to get a 300, as described by the stable comparison formula. Lazer uses a new, more intuitive formula, which results in the interval [-20 ms, 20 ms]: a 0.5 ms difference and a non-trivial reduction in difficulty. The effect is even more pronounced for fractional ODs, as stable floors the hit window while lazer doesn't, resulting in a difference of up to 1.3-1.5 ms. The disparity disadvantages existing stable scores in standard and taiko, and advantages existing stable scores in mania. This means that if obsoleting stable scores isn't desired, stable's hit windows must be matched in lazer, and thus the intuitive formula sadly cannot be kept as it is.

This also means that 20 ms that is obtained from the OD hit window formula is no longer the "true" hit window - it's actually 19.5 ms. If the player asks for the actual hit window (eg. by hovering over the OD value in song select), it would therefore be best to return 19.5 ms, and not 20 ms. As a side note, it is the "true" hit window value that gets divided by speed rate, not the result of the OD hit window formula (except for mania, of course).

There is a disadvantage to rounding player inputs to integers - it introduces a small amount of additional variance not due to the player's input variance. This play, set on stable, has the UR of 28.84. Had it been played on lazer, it would be expected to have about 28.55 UR. This effect is less pronounced the higher the UR. Thus, it is preferable to use double for hit error.

This also means that downgrading double to int almost always changes the UR of the replay by a slight amount. This can be observed in Paturages's comment above. The UR value changing between gameplay result screen and replay result screen, while only changing slightly, could be confusing. The gameplay UR should be the one considered correct, not the replay one. That's an issue for another thread though.

An idea to resolve the issue would be to find a judgement formula which is equivalent to stable's and works the same regardless of whether the input is rounded (or transformed in another way, as we can modify LegacyScoreEncoder) or not, so that outputting legacy replays works. This is possible, and without rounding the hit error. The formula is abs(hit_error) < floor(hit_window) - 0.5 for standard and taiko (exclusive hit windows), and abs(hit_error) < floor(hit_window + 1) - 0.5 for mania (inclusive hit windows). They are equivalent to stable's for any arguments, except when hit_error is a half-integer. This formula also has the nice property of having just the "true" hit window formula on the right-hand side and just the hit error on the left-hand side.
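
Before the proof, a brute-force sanity check may help. The Python sketch below compares the proposed formulas against stable's comparisons, as transcribed from the comparison table earlier in this thread, on a grid of errors in 0.1 ms steps with half-integers skipped. The function names and the sample windows are chosen for illustration only, and a grid check is of course not a proof.

```python
import math


def stable_exclusive(error, window):
    # osu! and osu!taiko, per the table earlier in the thread.
    return abs(round(error)) < math.floor(window)


def proposed_exclusive(error, window):
    return abs(error) < math.floor(window) - 0.5


def stable_inclusive(error, window):
    # osu!mania, per the same table.
    return abs(round(error)) <= math.floor(window)


def proposed_inclusive(error, window):
    return abs(error) < math.floor(window + 1) - 0.5


def formulas_agree(window):
    # Errors from -120.0 to 120.0 ms in 0.1 ms steps, skipping half-integers.
    for tenths in range(-1200, 1201):
        if tenths % 10 == 5:
            continue
        error = tenths / 10
        if stable_exclusive(error, window) != proposed_exclusive(error, window):
            return False
        if stable_inclusive(error, window) != proposed_inclusive(error, window):
            return False
    return True


print(all(formulas_agree(w) for w in (20, 60, 95.2, 96.8, 110.2, 110.8)))
```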

Proof

For x not a half-integer, and N integer, abs(Math.Round(x)) < N is equivalent to abs(x) < N - 0.5.

Proof: round() matches Math.Round() everywhere except possibly at half-integers, so here they're equivalent.

useful Math.Round() properties for x not half-integer:
Math.Round(x) < N is equivalent to x < N - 0.5
Math.Round(x) > N is equivalent to x > N + 0.5

case I: round(x) > 0 (this means x > 0.5, thus x = abs(x))
	abs(round(x)) < N <=>
	round(x) < N <=>
	x < N - 0.5 <=>
	abs(x) < N - 0.5
	qed

case II: round(x) < 0 (this means x < -0.5, thus -x = abs(x))
	abs(round(x)) < N <=>
	-round(x) < N <=>
	round(x) > -N <=>
	x > -N + 0.5 <=>
	-x < N - 0.5 <=>
	abs(x) < N - 0.5
	qed

case III: round(x) = 0 <=> x in (-0.5, 0.5), so abs(x) < 0.5
	LHS: abs(round(x)) < N <=> 0 < N <=> 1 <= N (N is an integer)

	RHS: abs(x) < N - 0.5 <=> abs(x) + 0.5 < N
		since 0.5 <= abs(x) + 0.5 < 1, this is true iff 1 <= N, which is equivalent to LHS
	qed

Hence, for x not a half-integer, the stable judgement formula for standard and taiko is equivalent to abs(hit_error) < floor(hit_window) - 0.5. For mania and taiko miss hit windows, stable's formula is equivalent to abs(round(hit_error)) < floor(hit_window) + 1, as both sides of the inequality are integers. This is in turn equivalent to abs(hit_error) < floor(hit_window) + 0.5 and to abs(hit_error) < floor(hit_window + 1) - 0.5.

What about the hit_error half-integer case? Stable uses Math.Round(), which uses a to-even tiebreaker method. Of course, the half-integers that aren't close to the "true" hit window edge work as expected for both stable's and the proposed formula. However, in stable, half-integers at the edge can be either judged as in or out, depending on the OD. The proposed formula simply considers them all out, regardless of OD. It's not impossible to support to-even half-integer rounding in the judgement check method, but it's pointlessly complex just to not make hitting accurately harder by less than a nanosecond, and the new formula is more consistent. I would say half-integer hit errors not resulting in the same hit result in gameplay as in stable is a non-issue. It's only meaningful in the context of legacy score encoding (and even then, it's a very rare issue).

Not sure if I'm understanding legacy score encoding correctly, but from looking at the linked code it seems to me that it's currently not working properly. For example, on osu!mania with OD 10, a hit that is 34.4 ms late would be a Good on live lazer, but rounded, it would be a hit error of 34 ms, which, when replayed, would be a Great. This would be fixed with the new proposed formula (by rounding, and with away-from-zero (in terms of hit error) tiebreaker to handle the half-integer case).

Proof

New judgement formula (for standard and taiko):
LHS: abs(hit error) < floor(hit window) - 0.5

Thesis: new lazer replay formula that gives the same results as the above formula is:
RHS: abs(round(hit error)) < floor(hit window) - 0.5
with round() - tiebreaker whatever we want

	RHS can drop the -0.5 because it's two integers (for integers, N < M iff N < M - 0.5)
	abs(round(hit error)) < floor(hit window)
	for not half-integer hit error, this is equivalent to
		abs(hit error) < floor(hit window) - 0.5
		qed
	for half-integer hit error:
		only half-integer exactly at the "true" hit window edge need careful inspection - if the half-integer is (in absolute value) smaller/larger than the exact edge, then it is in/out by a comfortable margin and thus the result doesn't change between the two formulas
		for the new judgement formula, half integers are considered out. thus, in round(), we want to:
			for hit error > 0:
				ceil the hit error
			for hit error < 0:
				floor the hit error
		in total, we want round() to use away-from-zero tiebreaking

The mania and taiko miss hit window proof is the same, with an extra +1 on the RHS.
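
The half-integer case can be illustrated with a short Python sketch. Python's built-in round() breaks ties to even, like C#'s default Math.Round(), so it can stand in for the current behaviour, while `round_half_away_from_zero` is a hypothetical helper for the tiebreak suggested above. The 21 ms window is made up so that the edge falls on 20.5, where the two tiebreaks disagree.

```python
import math


def round_half_away_from_zero(x):
    # Ties go away from zero: 20.5 -> 21, -20.5 -> -21.
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def proposed_exclusive(error, window):
    # Proposed gameplay formula for standard and taiko.
    return abs(error) < math.floor(window) - 0.5


def replay_exclusive(error, window, rounder):
    # Replay formula: rounded error against the floored window.
    return abs(rounder(error)) < math.floor(window)


window = 21.0  # hypothetical window whose "true" edge is 20.5 ms
edge = 20.5

print(proposed_exclusive(edge, window))                            # False
print(replay_exclusive(edge, window, round))                       # True: to-even gives 20
print(replay_exclusive(edge, window, round_half_away_from_zero))   # False
print(replay_exclusive(-edge, window, round_half_away_from_zero))  # False
```

With the away-from-zero tiebreak the replay result agrees with the proposed gameplay formula at the edge; with to-even it does not in this example.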

Unfortunately, from what I can see, LegacyScoreEncoder operates on replay frame time values, not hit errors. We would need to know the time of the object being hit (then, with the new proposed formula, we could find the correct integer to downgrade to). As it is, a 34.4 ms early hit would be x.6 in beatmap time, and would thus round in the opposite direction compared to a 34.4 ms late hit. It therefore doesn't seem possible to round half-integer hit errors in the proper direction with just the replay frame time value (again, it's a very rare case, but once in a blue moon this might create a mismatch between gameplay and legacy replay).

After this issue is resolved, previously set lazer scores might have higher accuracy than they should, if there were hits at the edge (for example, hits in the [19.5 ms, 20 ms] range on OD 10 in standard). Unfortunately, it's either them or all the historical scores from stable that will be inaccurate, so the choice seems clear. This is why I would say this is actually an important issue to get right. Correcting ruleset rules after scores had been set on the leaderboards and pp awarded could be difficult.

I've submitted a PR that aims to resolve the issue using this approach here.

@Detze Detze linked a pull request Jan 9, 2024 that will close this issue
2 tasks
@peppy
Member

peppy commented Jan 10, 2024

> Stable uses Math.Round()

Out of curiosity and concern, how are you checking against stable when the code is not published? Usually we'd have people ask for the relevant extract of code before making statements like this.

@Detze
Contributor

Detze commented Jan 10, 2024

I claimed that based on Walavouchey's investigation with stable reference code, the fact that Math.Round() is the method one would normally use to round in C#, and that indeed that's what was used in LegacyScoreEncoder.

@Detze
Contributor

Detze commented Jan 10, 2024

Your comment made me think more about the rounding, and its relation to legacy score encoding. The part of the formula in Walavouchey's table responsible for the hit error reads simply: round(hit_error). But this, in fact, hides one more operation, since hit error = hit time - object time. Therefore, in stable, rounding could actually be applied in either of two places (or even both, but for stable, that would be equivalent to (II)):

  • (I): hit_error = round(hit_time - object_time), or
  • (II): hit_error = round(hit_time) - object_time.

Reading Walavouchey's table, I assumed (I) was the used formula, and my remarks about legacy score encoding in the previous comment were based on that assumption. But in fact, the used formula could also be (II), and if that's the case, then the legacy score encoding can be rounded in the correct direction with just the replay frame value, as from that formula it can be seen that rounding the time value is correct to downgrade to int in legacy score encoding, as it basically repeats the same operation.

If object_time is an integer, and osu!stable hit object times fortunately are integers, these are equivalent (proof omitted). I was wrong in the "Unfortunately, from what I can see {...}" paragraph. The rounding direction does change, but that's the correct behavior to downgrade a fractional value to a symmetrical interval of integers.
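
A quick Python sketch of the two candidate formulas, for illustration only. The names `formula_i` and `formula_ii` are hypothetical, the sample times reuse the 500 ms hitobject from earlier in the thread, and the hit times are deliberately not half-integers, since tie-breaking is a separate question.

```python
def formula_i(hit_time, object_time):
    # (I): round the difference.
    return round(hit_time - object_time)


def formula_ii(hit_time, object_time):
    # (II): round the hit time, then subtract the object time.
    return round(hit_time) - object_time


# Integer object time: both formulas give the same hit error.
print(formula_i(440.4, 500), formula_ii(440.4, 500))  # -60 -60
print(formula_i(559.6, 500), formula_ii(559.6, 500))  # 60 60

# Fractional object time (possible in lazer): the results differ,
# and (II) is no longer an integer.
print(formula_i(440.9, 500.3), formula_ii(440.9, 500.3))
```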

However, this is the correct way to downgrade only if the judgement formula is equivalent to stable's, and my '34.4 ms late hit' point still stands. For example, in mania, a 16.4 ms off (either way) hit would be Perfect in stable, and in lazer with my PR, but would be a Great in live lazer (if Perfect hit window was 16 ms), while being rounded to 16 ms during legacy score encoding, which would be a Perfect.

Moreover, in lazer, the object time is a double. The two formulas are not equivalent for fractional object times. For double object times, the legacy score encoding will work correctly only if stable followed the (II) formula. Although I would not expect stable to be able to replay lazer's double object time maps anyway.

@Detze
Contributor

Detze commented Jan 10, 2024

I'm a bit confused how it's possible that a replayed osu!mania lazer play doesn't have the exact same result counts? #25973 was closed with a comment pointing to LegacyScoreEncoder code, and I don't quite see how that's relevant. Does immediately replaying an osu!mania lazer play somehow use legacy encoded replays?

@Detze
Contributor

Detze commented Jan 12, 2024

> I'm a bit confused how it's possible that a replayed osu!mania lazer play doesn't have the exact same result counts? #25973 was closed with a comment pointing to LegacyScoreEncoder code, and I don't quite see how that's relevant. Does immediately replaying an osu!mania lazer play somehow use legacy encoded replays?

Reading through Player.cs it seems that legacy rulesets indeed use legacy replays in lazer. That's surprising to me, I thought lazer would have its own new, more accurate replay format.

If anyone was wondering, here's how the results are counted differently in Paturages's mania replays:

Maboroshi: the map is OD 8.3. live lazer's Perfect hit window is 15.77 ms, and Great hit window is 39.1 ms. In stable, they're 16.5 ms and 39.5 ms. In gameplay, hits that are off by x in [15.5, 15.77] ms are Perfect, but in the replay they get rounded to 16 ms and are thus Great. This is why the replay's Perfect count is lower. On stable, the Perfect count would be even higher than in lazer gameplay, and so would the sum of Perfect and Great counts. Neither lazer gameplay's nor replay's result counts are correct from stable's point of view. Counts in lazer with my PR are 2117/418/23/1/0/2, which looks legit.

The Weekend: the map is OD 8; live lazer's Perfect hit window is 16.1 ms, stable's is 16.5 ms. The rest of the hit windows are all 0.5 ms wider in stable because of rounding. In gameplay, hits that are off by x in (16.1, 16.5] are Great, but in the replay they get rounded to 16 ms and are thus Perfect. Similarly, all the other results have 0.5 ms more lenient hit windows in replay and stable, and this is why the replay has higher accuracy than the gameplay across the board. The hit result counts in the replay are actually the correct result counts from stable's point of view, as the judgement formulas are equivalent, and legacy score encoding rounds the same way as stable. I get the same hit results counts in lazer with my PR.

@bdach
Collaborator

bdach commented Apr 25, 2024

I'm deprioritising this from p0. #26452 exists, but it hasn't been reviewed by anyone but me despite being open for months, and judging from references to it elsewhere, we might not be interested in doing anything about it at all. Unless I'm wrong, @ppy/team-client?

8 participants