scamper1 parsing errors #1052

Open
stephen-soltesz opened this issue Feb 7, 2022 · 0 comments
Labels
review/triage Team should review and assign priority

stephen-soltesz commented Feb 7, 2022

After deploying the new alternative ETL pipeline SLIs, we found that the scamper1 datatype would report parse errors after restarting:

Screen Shot 2022-02-07 at 2 24 28 PM

We suspected this may be due to a temporary format error in the early version, but have not yet confirmed this.

After deploying a modified version of the v1 parser for the "traceroute"/paris1 datatype and copying the ndt/traceroute data to ndt/scamper1, the error rate is about 20%.

Screen Shot 2022-02-07 at 2 24 45 PM

The causes of both are currently unknown. Either the parser should classify these files as "invalid" measurements (so that they are excluded from the set of measurements used to calculate the error rate), or the parser should be fixed to recognize these files.
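To illustrate the first option, here is a minimal sketch of an error rate that leaves "invalid" measurements out of the calculation. The outcome labels `"ok"`, `"error"` and `"invalid"` and the function name are hypothetical; this is not how the ETL pipeline SLIs are actually computed.

```python
from collections import Counter


def error_rate(outcomes):
    """Return errors / (ok + error), ignoring "invalid" outcomes.

    `outcomes` is an iterable of hypothetical per-file labels:
    "ok", "error", or "invalid".
    """
    counts = Counter(outcomes)
    # Invalid files count toward neither the numerator nor the denominator.
    counted = counts["ok"] + counts["error"]
    if counted == 0:
        return 0.0
    return counts["error"] / counted
```

With this definition, a batch made up entirely of invalid files reports a rate of 0 instead of 100%.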

Update:
Many files containing only a UUID field have been found in the legacy scamper archive:
{"UUID": "ndt-v97k9_1555519948_0000000000005586"}
For example, see the files under https://pantheon.corp.google.com/storage/browser/_details/archive-measurement-lab/ndt/traceroute/2019/04/21/20190422T003934.994993Z-traceroute-mlab4-lga05-ndt.tgz;tab=live_object?project=measurement-lab

These seem to be triggering the majority of the "invalid traceroute file" errors.
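A check for such files could look roughly like the sketch below. It assumes the whole file content is a single JSON object, as in the example above; `is_uuid_only` is a hypothetical helper, not a function from the parser.

```python
import json


def is_uuid_only(raw):
    """Return True if `raw` decodes to a JSON object whose only key is "UUID"."""
    try:
        doc = json.loads(raw)
    except ValueError:
        # Not valid JSON (or several JSON documents in one file).
        return False
    return isinstance(doc, dict) and set(doc) == {"UUID"}
```

Files that match could then be counted as invalid rather than as parse errors.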

When looking at the error rate of the legacy scamper parser, it often reaches extremely high levels.
Screenshot 2022-02-07 7 07 32 PM

The fix is to filter out the legacy dates in the errors returned by the scamper1 parser.
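One way to sketch such a filter is to read the date from the archive file name and compare it against a cutoff. The helper names are hypothetical, and the cutoff is an assumption that the caller must supply, since this issue does not say where the legacy period ends.

```python
import re
from datetime import datetime

# Matches the timestamp prefix of archive names such as
# 20190422T003934.994993Z-traceroute-mlab4-lga05-ndt.tgz
_ARCHIVE_TS = re.compile(r"(\d{8})T\d{6}")


def archive_date(path):
    """Return the date encoded in an archive file name, or None if absent."""
    m = _ARCHIVE_TS.search(path)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y%m%d").date()


def is_legacy(path, cutoff):
    """Return True if the archive date is strictly before `cutoff`.

    `cutoff` is a datetime.date chosen by the caller (assumed, not from the issue).
    """
    d = archive_date(path)
    return d is not None and d < cutoff
```

Errors from archives where `is_legacy` returns True would then be left out of the reported error count.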
