web-platform-tests/data-migration

WPT Data Migration Scripts

This repository contains scripts that can be used or modified to correct mistakes in the datastore that backs wpt.fyi.

Running a script

First, run gcloud auth application-default login (this assumes you already have access to the wptdashboard and/or wptdashboard-staging projects).

This repo does NOT use Go modules yet, so it is recommended to check out the repo at $GOPATH/src/github.com/web-platform-tests/data-migration. Then run go get -u ./... to fetch all the dependencies.

Finally, you can run most scripts with go run, e.g. go run tagger/master.go --help.

Writing a script

We have a few different categories of scripts.

Datastore-only

This is the most common kind. These scripts make a single scan-check-modify pass over all TestRuns in Datastore, processing runs in parallel. The check and the modification are done atomically in a transaction.

The reusable logic is in processor/. New scripts only need to implement the Runs interface.

Examples can be found in tagger/.
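
The scan-check-modify pass described above can be sketched roughly as follows. This is only an illustration of the pattern, not the code in processor/: the RunProcessor interface, the TestRun fields, and the in-memory memStore (which stands in for Datastore and uses a mutex where a real script would use a transaction) are all hypothetical names made up for this sketch. See tagger/ for how the real Runs interface is implemented.

```go
package main

import (
	"fmt"
	"sync"
)

// TestRun is a hypothetical, pared-down stand-in for a Datastore TestRun entity.
type TestRun struct {
	ID     int64
	Labels []string
}

// RunProcessor is a hypothetical interface for this sketch; it is NOT the
// repository's Runs interface, whose exact methods are defined in processor/.
type RunProcessor interface {
	ShouldProcess(run *TestRun) bool // the "check" step
	Process(run *TestRun)            // the "modify" step
}

// memStore is an in-memory stand-in for Datastore.
type memStore struct {
	mu   sync.Mutex
	runs map[int64]*TestRun
}

// inTransaction gives fn exclusive access to one run. The mutex plays the
// role of a transaction: check and modify cannot interleave with other writers.
func (s *memStore) inTransaction(id int64, fn func(run *TestRun)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if run, ok := s.runs[id]; ok {
		fn(run)
	}
}

// keys returns the IDs of all stored runs (the "scan" step).
func (s *memStore) keys() []int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	ids := make([]int64, 0, len(s.runs))
	for id := range s.runs {
		ids = append(ids, id)
	}
	return ids
}

// migrate scans every run and applies p using a pool of worker goroutines.
// It returns the number of runs that were modified.
func migrate(s *memStore, p RunProcessor, workers int) int {
	if workers < 1 {
		workers = 1
	}
	ids := make(chan int64)
	var wg sync.WaitGroup
	var countMu sync.Mutex
	modified := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				s.inTransaction(id, func(run *TestRun) {
					if p.ShouldProcess(run) {
						p.Process(run)
						countMu.Lock()
						modified++
						countMu.Unlock()
					}
				})
			}
		}()
	}
	for _, id := range s.keys() {
		ids <- id
	}
	close(ids)
	wg.Wait()
	return modified
}

// addLabel is an example processor: it adds a label to runs that lack it.
type addLabel struct{ label string }

func (a addLabel) ShouldProcess(run *TestRun) bool {
	for _, l := range run.Labels {
		if l == a.label {
			return false
		}
	}
	return true
}

func (a addLabel) Process(run *TestRun) {
	run.Labels = append(run.Labels, a.label)
}

func main() {
	store := &memStore{runs: map[int64]*TestRun{
		1: {ID: 1, Labels: []string{"master"}},
		2: {ID: 2},
		3: {ID: 3, Labels: []string{"experimental"}},
	}}
	fmt.Println("modified:", migrate(store, addLabel{"master"}, 4))
}
```

Because the check runs inside the same critical section as the modification, running the sketch a second time changes nothing, which is the property that makes a migration script safe to re-run.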

Storage

The following scripts also download results from GCS, so they are much slower than the Datastore-only scripts.

add_run_info/ - used to backfill product and browser name metadata, as well as switch to a new URL schema.

add_time_start/ - used to backfill the TimeStart metadata for runs done before that information was added.

dedup_runs/ - used to deduplicate runs with the same raw_results_url from before results-processor was idempotent.
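
As a rough illustration of the deduplication idea behind dedup_runs/, the sketch below groups runs by their raw results URL and reports the extra copies. It is not the script's actual logic: the Run struct, the findDuplicates helper, and the rule of keeping the run with the lowest ID are assumptions made for this example, and the URLs are placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// Run is a hypothetical, minimal view of a test run for this sketch.
type Run struct {
	ID            int64
	RawResultsURL string
}

// findDuplicates returns the IDs of runs that share a RawResultsURL with
// another run, keeping the run with the lowest ID in each group (an assumed
// policy). Runs without a URL are never treated as duplicates.
func findDuplicates(runs []Run) []int64 {
	// First pass: remember the lowest ID seen for each URL.
	keep := make(map[string]int64)
	for _, r := range runs {
		if r.RawResultsURL == "" {
			continue
		}
		if id, ok := keep[r.RawResultsURL]; !ok || r.ID < id {
			keep[r.RawResultsURL] = r.ID
		}
	}
	// Second pass: every other run with the same URL is a duplicate.
	var dups []int64
	for _, r := range runs {
		if r.RawResultsURL == "" {
			continue
		}
		if keep[r.RawResultsURL] != r.ID {
			dups = append(dups, r.ID)
		}
	}
	sort.Slice(dups, func(i, j int) bool { return dups[i] < dups[j] })
	return dups
}

func main() {
	runs := []Run{
		{ID: 1, RawResultsURL: "https://example.com/a.json"},
		{ID: 2, RawResultsURL: "https://example.com/a.json"},
		{ID: 3, RawResultsURL: "https://example.com/b.json"},
		{ID: 4, RawResultsURL: "https://example.com/a.json"},
		{ID: 5}, // no URL recorded; left alone
	}
	fmt.Println(findDuplicates(runs)) // prints [2 4]
}
```

A real migration would still need to decide which copy to keep and delete the others inside a transaction; this sketch only identifies the candidates.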

Bigtable

grid/ - an experiment to load all results into Bigtable.

About

Some temporary scripts for ad-hoc data migration/fixup
