Enable shadowrealm testing for url api #41985
So the HTML PR for this is still outstanding, right? And there was also talk of defining some kind of IDL construct which I don't think has happened. What is the standardization status of this feature?
That's correct. We have updates to complete that work forthcoming.
Yes, that matches my understanding as well.
As of now the feature is at Stage 3, with requests from Mozilla for more feature testing such as this; otherwise it may be downgraded to Stage 2. For now my goal is to enable as much testing as possible, but limited to APIs that are ...
57a0ec2 to c816d1f
@annevk thanks for the review! Changes made as requested.
c816d1f to e58d9d0
So the problem from my perspective is that we typically land test changes alongside the specification PR change. Landing test changes ahead of the specification being formalized is not something we have precedent for, apart from when tests have a ...
@annevk I understand, and it should be fine to leave the PR open until the spec change work is complete, if that's what you'd prefer. I will follow up here when the time is right.
e58d9d0 to 85c7544
Rebased and added some additional changes.
85c7544 to 05c115c
Assuming you carefully reviewed the before and after results this seems okay. However, I'm not sure what the impact of this is on Interop 2024. This late in the year it might require someone else to double check this doesn't end up impacting the Interop scores in some weird way.
So not approving until we're fully clear on that.
05c115c to b4e0e1e
Starting with tests that don't have additional dependencies.
Some tests required interfaces such as Request, Response, or FormData that are not available in ShadowRealm. These should be split into separate test files that are not run in ShadowRealm scopes.
In url/failure.html there were tests passing invalid URLs to the second parameter of the URL constructor. Split these into their own file so they can be run in workers and ShadowRealms, separately from the other tests in url/failure.html that use the current document's URL.
b4e0e1e to f382c0c
I spent some time digging into this. I've been using wpt.fyi for before and after results. Here are the relevant splits from this PR:
There were also splits of tests that weren't included in the interop label, according to wpt.fyi, and so should have no relevance:
and, of course, the new ... So I'd propose to include the following new tests in the interop stats:
Here are the before and after stats for the tests touched by this PR that are included in the label. Note, I don't know exactly how the interop scores are calculated, but I'm assuming 'percentage of subtests passed' is probably a good measure to see if something's up.
(D.N.E. = did not exist)
I'm not sure why there are no results for Edge on the PR stats, but I've omitted Edge from the above table for that reason.
As you can see, only Chrome is affected in a significant way. This is primarily because of the 2000+ subtests in IdnaTestV2: Chrome passes only a bit over half of them, while Firefox and Safari pass all of them. Without adjusting which tests are covered by the Interop label, Chrome would get an artificially inflated score because IdnaTestV2.window.html no longer exists. With my proposed adjustment, Chrome would get an artificial drop in score without any actual regression causing it, because IdnaTestV2 is now executed in two environments.
Here are some solutions I've thought of to avoid this artificial drop:
@annevk Let me know how you'd prefer to proceed!
Thanks @ptomato. I think for split tests we should only include the any.html and not the any.worker.html. That way we don't increase or decrease the overall number of tests in Interop. Assuming the labeling is done well that should then amount to this being a no-op, which is what we'd want at this point. Which would reduce your list down to:
And we'd continue to include what we include now. @gsnedders @jgraham @foolip please let me know if you have any concerns.
Sounds good to me.
Approving, but please give @jgraham @gsnedders @foolip one week to weigh in before merging this. And if this does turn out to not be a no-op for Interop 2024 it'll have to be reverted.
The score is the sum over all tests of the fraction of subtests passing, divided by the total number of tests. For example, if you have two tests with scores 2/2 and 9/18, then the Interop score will be (1 + 0.5)/2 = 0.75, not 11/20 = 0.55. So I think to land this we also need a PR against wpt-metadata to add the labels to whatever new tests require them, so that it can land right afterwards (I think it will be blocked until this PR is landed due to the missing tests). And it needs to be clear that the interop scores are not going to change once both PRs are landed.
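To make the difference between the two calculations concrete, here is a minimal sketch of the scoring rule as described in the comment above. The function names `interop_score` and `pooled_pass_rate` are hypothetical and chosen only for illustration; this is not the actual Interop scoring code.

```python
from fractions import Fraction


def interop_score(results):
    """Mean of per-test pass fractions; each test counts equally."""
    if not results:
        raise ValueError("need at least one test")
    return sum(Fraction(passed, total) for passed, total in results) / len(results)


def pooled_pass_rate(results):
    """Share of all subtests passed, ignoring test-file boundaries."""
    passed = sum(p for p, _ in results)
    total = sum(t for _, t in results)
    return Fraction(passed, total)


# (subtests passed, subtests total) for the two tests in the example above
results = [(2, 2), (9, 18)]

print(float(interop_score(results)))     # 0.75
print(float(pooled_pass_rate(results)))  # 0.55
```

Because every test file carries the same weight under the first rule, splitting or renaming a file can shift the score even when no subtest result changes.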
Ah, that does change things somewhat, as test failures in a file with few tests have a much larger impact than test failures in a file with many tests. I've done a quick spreadsheet calculating the scores that way, using Anne's proposed list of wpt-metadata changes from #41985 (comment). I'll spare you the details, but here is the result.
Not sure if these changes are small enough to be considered just noise, or this would be a problem? An alternative would be to only replace ...
> Not sure if these changes are small enough to be considered just noise, or this would be a problem?
Given it's not a no-op change to Interop, this needs an Interop Test Change Proposal issue opened before it (and the as-of-yet-unwritten wpt-metadata PR) can land together.
This view exaggerates the differences this PR makes (through the inclusion of is:different), but does show there are meaningful changes. (This shows the proposed Interop results, I believe.)
(FWIW: I'm not opposed to this change being made, it just needs to go through the right process given the Interop impact.)
In web-platform-tests/wpt#41985 we are proposing to rename and split some tests, in order to increase test coverage for URL in ShadowRealm scopes. ShadowRealm is irrelevant to Interop, but we do have to split some files that previously contained a mix of tests both suitable and unsuitable for executing in ShadowRealm scopes. This PR adjusts the `interop-2023-url` label so that the splits in the above-mentioned PR have as little effect as possible on the Interop scores. It also adjusts the browser bug metadata so that the tests referred to are in the correct files.
Thanks for the pointers and for the nice link showing the results; much better than doing it in a spreadsheet. wpt-metadata PR: web-platform-tests/wpt-metadata#6930