Enable shadowrealm testing for url api #41985

rwaldron · 2023-09-14T19:51:23Z

Starting with tests that don't have additional dependencies.

annevk

So the HTML PR for this is still outstanding, right? And there was also talk of defining some kind of IDL construct which I don't think has happened. What is the standardization status of this feature?

url/historical.any.js

rwaldron · 2023-09-15T13:06:16Z

So the HTML PR for this is still outstanding, right?

That's correct. We have updates to complete that work forthcoming.

And there was also talk of defining some kind of IDL construct which I don't think has happened.

Yes, that matches my understanding as well.

What is the standardization status of this feature?

As of now the feature is Stage 3, with requests from Mozilla for more feature testing such this—or be downgraded to Stage 2. For now my goal is to enable as much testing as possible, but limited to APIs that are [Exposed=*].

rwaldron · 2023-09-15T13:07:40Z

@annevk thanks for the review! Changes made as requested

annevk · 2023-09-15T13:24:28Z

So the problem from my perspective is that we typically land test changes alongside the specification PR change. Landing test changes ahead of the specification being formalized is not something we have precedent for, apart from when tests have a .tentative suffix.

rwaldron · 2023-09-15T13:56:33Z

@annevk I understand, and it should be fine to leave the PR open until the spec change work is complete, if that's what you'd prefer. I will follow up here when the time is right.

ptomato · 2024-01-18T21:19:48Z

Rebased and added some additional changes.

annevk

Assuming you carefully reviewed the before and after results this seems okay. However, I'm not sure what the impact of this is on Interop 2024. This late in the year it might require someone else to double check this doesn't end up impacting the Interop scores in some weird way.

So not approving until we're fully clear on that.

url/urlencoded-parser.any.js

url/urlencoded-parser-request-response.any.js

url/url-setters.any.js

Starting with tests that don't have additional dependencies.

Some tests required interfaces such as Request, Response, or FormData that are not available in ShadowRealm. These should be split into separate test files that are not run in ShadowRealm scopes.

In url/failure.html there were tests passing invalid URLs to the second parameter of the URL constructor. Split these into their own file so they can be run in workers and ShadowRealms, separately from the other tests in url/failure.html that use the current document's URL.

ptomato · 2024-10-25T19:55:13Z

Assuming you carefully reviewed the before and after results this seems okay. However, I'm not sure what the impact of this is on Interop 2024. This late in the year it might require someone else to double check this doesn't end up impacting the Interop scores in some weird way.

I spent some time digging into this. I've been using wpt.fyi for before and after results.

Here are the relevant splits from this PR:

urlencoded-parser.any.js → split out urlencoded-parser-request-response.any.js
failure.html → split out url-constructor-base-failure.any.js
IdnaTestV2.window.html → renamed to IdnaTestV2.any.html (and so is additionally executed in Worker global scope)

There were also splits of tests that weren't included in the interop label, according to wpt.fyi, and so should have no relevance:

urlsearchparams-constructor.any.js → split out urlsearchparams-constructor-formdata.any.js
toascii.window.js → toascii.any.js, toascii-elements.window.js

and, of course, the new .shadowrealm.html tests are not labeled.

So I'd propose to include the following new tests in the interop stats:

IdnaTestV2.any.html
IdnaTestV2.worker.html
urlencoded-parser-request-response.any.html
urlencoded-parser-request-response.worker.html
url-constructor-base-failure.any.html
url-constructor-base-failure.worker.html

Here are the before and after stats for the tests touched by this PR that are included in the label. Note, I don't know exactly how the interop scores are calculated, but I'm assuming 'percentage of subtests passed' is probably a good measure to see if something's up.

Test	Chrome before	after	Firefox before	after	Safari before	after
failure.html	1146/1211	742/785	1151/1211	749/785	1211/1211	785/785
url-constructor-base-fai…	D.N.E.	405/427	D.N.E.	403/427	D.N.E.	427/427
" .worker.html	D.N.E.	405/427	D.N.E.	403/427	D.N.E.	427/427
IdnaTestV2.window.html	1320/2032	D.N.E.	2032/2032	D.N.E.	2032/2032	D.N.E.
IdnaTestV2.any.html	D.N.E.	1320/2032	D.N.E.	2032/2032	D.N.E.	2032/2032
IdnaTestV2.worker.html	D.N.E.	1320/2032	D.N.E.	2032/2032	D.N.E.	2032/2032
historical.any.html	8/8	8/8	8/8	8/8	8/8	8/8
" .worker.html	7/7	7/7	7/7	7/7	7/7	7/7
url-constructor…exclude=…	664/704	664/704	701/704	701/704	704/704	704/704
" .worker.html	664/704	664/704	701/704	701/704	704/704	704/704
url-origin.any.html	376/385	376/385	383/385	383/385	385/385	385/385
" .worker.html	376/385	376/385	383/385	383/385	385/385	385/385
url-searchparams.any.html	4/4	4/4	4/4	4/4	4/4	4/4
" .worker.html	4/4	4/4	4/4	4/4	4/4	4/4
url-setters-stripping.an…	248/260	248/260	260/260	260/260	260/260	260/260
" .worker.html	248/260	248/260	260/260	260/260	260/260	260/260
url-setters.any.html?exc…	206/226	206/226	219/226	219/226	222/226	222/226
" .worker.html	206/226	206/226	219/226	219/226	222/226	222/226
url-tojson.any.html	1/1	1/1	1/1	1/1	1/1	1/1
" .worker.html	1/1	1/1	1/1	1/1	1/1	1/1
urlencoded-parser.any.ht…	105/105	36/36	105/105	36/36	105/105	36/36
" .worker.html	105/105	36/36	105/105	36/36	105/105	36/36
urlencoded-parser-reques…	D.N.E.	71/71	D.N.E.	71/71	D.N.E.	71/71
" .worker.html	D.N.E.	71/71	D.N.E.	71/71	D.N.E.	71/71
Subtest Total	5689/6628	3827/4032	6544/6628	3972/4032	6620/6628	4024/4032
" in percentage	86%	95%	99%	99%	100%	100%
With proposed stats update		7419/9092		8984/9092		9084/9092
" in percentage		81%		99%		100%

(D.N.E. = did not exist)

I'm not sure why there are no results for Edge on the PR stats, but I've omitted Edge from the above table for that reason.

As you can see, only Chrome is affected in a significant way. This is primarily because of the 2000+ subtests in IdnaTestV2, they only pass a bit over half while Firefox and Safari pass all of them. Without adjusting what tests are covered by the Interop label, Chrome would get an artificially inflated score because IdnaTestV2.window.html no longer exists. With my proposed adjustment, Chrome would get an artificial drop in score without any actual regression causing it, because IdnaTestV2 is now executed in two environments.

Here are some solutions I've thought of to avoid this artificial drop:

Adopt my proposal above and do nothing else; no idea if this drop is significant in the actual final score!
Adopt my proposal above, except omit IdnaTestV2.worker.html from the stats; this should put Chrome approximately back at the score they previously had.
In f382c0c#diff-eb593412b052468f9080ba5210283245bce8697c5b966f6f05f408e048f5d067, omit dedicatedworker from the meta comment, or drop that hunk of the diff altogether; open a followup PR to be merged after Interop 2024 is over; adopt the rest of my proposal above for the files other than IdnaTestV2.

@annevk Let me know how you'd prefer to proceed!

annevk · 2024-11-05T14:36:04Z

Thanks @ptomato.

I think for split tests we should only include the any.html and not the any.worker.html. That way we don't increase or decrease the overall number of tests in Interop. Assuming the labeling is done well that should then amount to this being a no-op, which is what we'd want at this point. Which would reduce your list down to:

IdnaTestV2.any.html
urlencoded-parser-request-response.any.html
url-constructor-base-failure.any.html

And we'd continue to include what we include now.

@gsnedders @jgraham @foolip please let me know if you have any concerns.

ptomato · 2024-11-07T19:05:24Z

Sounds good to me.

annevk

Approving, but please give @jgraham @gsnedders @foolip one week to weigh in before merging this. And if this does turn out to not be a no-op for Interop 2024 it'll have to be reverted.

jgraham · 2024-11-08T15:14:13Z

The scoring system is sum over all tests of fraction of subtests passing divided by the total number of tests e.g. if you have two tests with scores 2/2 and 9/18 then the Interop score will be (1 + 0.5)/2 = 0.75, not 10/20 = 0.5.

So I think to land this we also need a PR against wpt-metadata to add the labels to whatever new tests require them, so that it can land right afterwards (I think it will be blocked until this PR is landed due to the missing tests). And it needs to be clear that the interop scores are not going to change once both PRs are landed.

ptomato · 2024-11-08T20:35:45Z

Ah, that does change things somewhat, as test failures in a file with few tests have a much larger impact than test failures in a file with many tests.

I've done a quick spreadsheet calculating the scores that way and given Anne's proposed list of wpt-metadata changes from #41985 (comment). I'll spare you the details but here is the result.

	Chrome	Firefox	Safari
Before	95.70%	99.33%	99.79%
After	95.82%	99.19%	99.80%
Difference	+0.12%	−0.15%	+0.02%

Not sure if these changes are small enough to be considered just noise, or this would be a problem?

An alternative would be to only replace IdnaTestV2.window.html with IdnaTestV2.any.html in wpt-metadata. That would give only +0.01% to Firefox and leave the other two scores unchanged to two decimal places. (i.e., hundredths of a percent.) It'd be less accurate IMO because it would actually move existing tests out from under the interop umbrella, but it'd be the most no-op possible given the constraints of the scoring system.

gsnedders

Not sure if these changes are small enough to be considered just noise, or this would be a problem?

Given it's not a no-op change to Interop, this needs an Interop Test Change Proposal issue opened before it (and the as-of-yet-unwritten wpt-metadata PR) can land together.

This view exaggerates the differences this PR makes (through the inclusion of is:different), but does show there are meaningful changes. (This shows the proposed Interop results, I believe.)

(FWIW: I'm not opposed to this change being made, it just needs to go through the right process given the Interop impact.)

In web-platform-tests/wpt#41985 we are proposing to rename and split some tests, in order to increase test coverage for URL in ShadowRealm scopes. ShadowRealm is irrelevant to Interop, but we do have to split some files that previously contained a mix of tests both suitable and unsuitable for executing in ShadowRealm scopes. This PR adjusts the `interop-2023-url` label so that the splits in the abovementioned PR have as little effect as possible on the Interop scores.

In web-platform-tests/wpt#41985 we are proposing to rename and split some tests, in order to increase test coverage for URL in ShadowRealm scopes. ShadowRealm is irrelevant to Interop, but we do have to split some files that previously contained a mix of tests both suitable and unsuitable for executing in ShadowRealm scopes. This PR adjusts the `interop-2023-url` label so that the splits in the abovementioned PR have as little effect as possible on the Interop scores. It also adjusts the browser bug metadata so that the tests referred to are in the correct files.

ptomato · 2024-11-12T01:11:23Z

Thanks for the pointers and for the nice link showing the results; much better than doing it in a spreadsheet.

wpt-metadata PR: web-platform-tests/wpt-metadata#6930
Test change proposal: web-platform-tests/interop#904

wpt-pr-bot added the url label Sep 14, 2023

wpt-pr-bot assigned mikewest Sep 14, 2023

wpt-pr-bot requested review from annevk, domenic, GPHemsley, mikewest and TimothyGu September 14, 2023 19:51

annevk reviewed Sep 15, 2023

View reviewed changes

url/historical.any.js Show resolved Hide resolved

rwaldron force-pushed the rwaldron/shadowrealm-url-tests branch from 57a0ec2 to c816d1f Compare September 15, 2023 13:07

rwaldron requested a review from annevk September 15, 2023 13:07

rwaldron force-pushed the rwaldron/shadowrealm-url-tests branch from c816d1f to e58d9d0 Compare September 15, 2023 13:16

annevk added the status:needs-spec-decision label Sep 15, 2023

This was referenced Sep 15, 2023

Enable shadowrealm testing for abort api #41965

Merged

Enable shadowrealm testing for dom events api #41966

Open

Ms2ger mentioned this pull request Oct 3, 2023

Add shadowrealm global testing to URL suite #33739

Closed

ptomato force-pushed the rwaldron/shadowrealm-url-tests branch from e58d9d0 to 85c7544 Compare January 18, 2024 21:18

rwaldron mentioned this pull request Jan 22, 2024

[Stage 3] Web APIs to be exposed to ShadowRealms tc39/proposal-shadowrealm#393

Open

47 tasks

GPHemsley removed their request for review March 16, 2024 14:25

ptomato force-pushed the rwaldron/shadowrealm-url-tests branch from 85c7544 to 05c115c Compare October 22, 2024 20:01

annevk reviewed Oct 22, 2024

View reviewed changes

url/urlencoded-parser.any.js Outdated Show resolved Hide resolved

url/urlencoded-parser-request-response.any.js Outdated Show resolved Hide resolved

url/url-setters.any.js Show resolved Hide resolved

annevk removed the status:needs-spec-decision label Oct 22, 2024

ptomato force-pushed the rwaldron/shadowrealm-url-tests branch from 05c115c to b4e0e1e Compare October 25, 2024 16:30

rwaldron and others added 3 commits October 25, 2024 09:31

Enable shadowrealm testing for url api

d846da5

Starting with tests that don't have additional dependencies.

Split URL tests into ShadowRealm-applicable and not-applicable files

447ded0

Some tests required interfaces such as Request, Response, or FormData that are not available in ShadowRealm. These should be split into separate test files that are not run in ShadowRealm scopes.

Enable ShadowRealm testing for more URL APIs

f382c0c

ptomato force-pushed the rwaldron/shadowrealm-url-tests branch from b4e0e1e to f382c0c Compare October 25, 2024 16:31

annevk approved these changes Nov 8, 2024

View reviewed changes

gsnedders requested changes Nov 8, 2024

View reviewed changes

This was referenced Nov 12, 2024

Update Interop metadata for proposed test splits web-platform-tests/wpt-metadata#6930

Open

Split some URL tests into separate files to allow for better ShadowRealm coverage web-platform-tests/interop#904

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable shadowrealm testing for url api #41985

Enable shadowrealm testing for url api #41985

rwaldron commented Sep 14, 2023

annevk left a comment

rwaldron commented Sep 15, 2023

rwaldron commented Sep 15, 2023

annevk commented Sep 15, 2023

rwaldron commented Sep 15, 2023

ptomato commented Jan 18, 2024

annevk left a comment

ptomato commented Oct 25, 2024 •

edited

Loading

annevk commented Nov 5, 2024

ptomato commented Nov 7, 2024

annevk left a comment

jgraham commented Nov 8, 2024 •

edited

Loading

ptomato commented Nov 8, 2024

gsnedders left a comment •

edited

Loading

ptomato commented Nov 12, 2024

Enable shadowrealm testing for url api #41985

Are you sure you want to change the base?

Enable shadowrealm testing for url api #41985

Conversation

rwaldron commented Sep 14, 2023

annevk left a comment

Choose a reason for hiding this comment

rwaldron commented Sep 15, 2023

rwaldron commented Sep 15, 2023

annevk commented Sep 15, 2023

rwaldron commented Sep 15, 2023

ptomato commented Jan 18, 2024

annevk left a comment

Choose a reason for hiding this comment

ptomato commented Oct 25, 2024 • edited Loading

annevk commented Nov 5, 2024

ptomato commented Nov 7, 2024

annevk left a comment

Choose a reason for hiding this comment

jgraham commented Nov 8, 2024 • edited Loading

ptomato commented Nov 8, 2024

gsnedders left a comment • edited Loading

Choose a reason for hiding this comment

ptomato commented Nov 12, 2024

ptomato commented Oct 25, 2024 •

edited

Loading

jgraham commented Nov 8, 2024 •

edited

Loading

gsnedders left a comment •

edited

Loading