get build data in BigQuery #596
Comments
Could we start by adding the data to BigQuery on ingestion? That would let us keep the existing infrastructure and gradually move over to BigQuery-only solutions over time.
I'm not sure that's easier than ditching Postgres for BigQuery, but it seems feasible to me.
Ingestion calls I'm pretty oblivious about how BigQuery works in terms of inserts and bulk inserts but I get a feeling you could just That way you'd be able to try it out very gently. By the way, that frontend that Buildhub2 has is pretty neat because it's able to provide a pretty decent interface without really knowing it's got anything to do with software builds. I never really loved
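For illustration only, the "write to BigQuery on ingestion" idea discussed above could look roughly like the sketch below. It is not Buildhub2's actual code: `BufferedSink`, `ingest_build`, `save_primary`, and the `send` callable are hypothetical names, and the BigQuery call itself is left out so the example runs with the standard library alone. The point is only that the existing store stays authoritative while the second write is buffered and best-effort.

```python
import json
import logging
from dataclasses import dataclass, field
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)

Row = Dict[str, object]


@dataclass
class BufferedSink:
    """Hypothetical helper: buffers rows and hands them to `send` in batches."""

    send: Callable[[List[Row]], None]
    batch_size: int = 500
    _buffer: List[Row] = field(default_factory=list, init=False)

    def add(self, row: Row) -> None:
        self._buffer.append(row)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self._buffer:
            return
        # Send a copy first; rows are dropped from the buffer only after
        # `send` returns, so a failed batch stays queued for the next flush.
        self.send(list(self._buffer))
        self._buffer.clear()


def ingest_build(
    build: Row,
    save_primary: Callable[[Row], None],
    secondary: BufferedSink,
) -> None:
    """Write to the existing store first, then best-effort to the second sink."""
    # Assumption: `save_primary` stands in for the existing Postgres write.
    save_primary(build)
    try:
        # Store the build record as one JSON string column (an assumed layout).
        secondary.add({"build": json.dumps(build, sort_keys=True)})
    except Exception:
        # A failure in the secondary sink must not break existing ingestion.
        logger.exception("secondary sink failed; primary write already done")
```

In a real deployment, `send` might wrap a streaming insert such as `google.cloud.bigquery.Client.insert_rows_json`, but that wiring, the table layout, and the batch size are all assumptions here rather than anything decided in this issue.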
I think it probably makes more sense to do this in Airflow as a separate process, rather than in Buildhub. Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1607229 about that.
Buildhub2 has a Django app that serves the API. The daemon stores all the data in Postgres tables managed by Django and then indexes that for Elasticsearch.
However, we can probably just do all of this in BigQuery. Further, if we had the data in BigQuery, it would be easier to access from Telemetry tools. Buildhub2 is used by several things in Telemetry (Mission Control, probe-scraper, other things?) and that number should continue to grow as they add more dashboards around crash pings and other things.
This issue covers getting the data in BigQuery either by ditching Postgres for BigQuery or ditching Elasticsearch for BigQuery.