Automate updating the data #37

Open
wants to merge 1 commit into
base: master
76 changes: 47 additions & 29 deletions .github/workflows/main.yml
@@ -1,49 +1,67 @@
# This is a basic workflow to help you get started with Actions
name: Scrape and publish

name: CI

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the master branch
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
schedule:
# at minute 1 of every hour, UTC (for once a day just after midnight, use '1 0 * * *')
- cron: '1 * * * *'

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# This workflow contains a single job called "build"
build:
# The type of runner that the job will run on
scrape-and-publish:
runs-on: ubuntu-latest

# Steps represent a sequence of tasks that will be executed as part of the job
steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- uses: actions/checkout@v2
- uses: actions/checkout@v3

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
- name: Set up Python 3.11
uses: actions/setup-python@v3
with:
python-version: "3.7"
python-version: "3.11"

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
sudo apt-get install -y jq

- name: Remember current time
id: time
run: echo "DATE=$(date --utc)" >> $GITHUB_OUTPUT

- name: Remember checksum of the data before update
id: data-before
run: echo "MD5=$(md5sum data.json)" >> $GITHUB_OUTPUT

# Runs a single command using the runners shell
- name: Check repo data.json md5 hash
run: echo "::set-env name=datamd5::$(python $GITHUB_WORKSPACE/data.json | md5sum)"
- name: Scrape data
run: ./scraper.py > data.raw

# Runs a single command using the runners shell
- name: Check gcpinstances.info data.json md5 hash
run: echo "::set-env name=sitemd5::$(curl -s https://gcpinstances.info/data.json | md5sum)"
- name: Prettify JSON
run: jq --sort-keys . data.raw > data.json

- name: Get checksum of the data after update
id: data-after
run: echo "MD5=$(md5sum data.json)" >> $GITHUB_OUTPUT

- name: Update checking timestamp
run: sed -i 's/id="last_check">[^<]*</id="last_check">${{ steps.time.outputs.DATE }}</g' index.html

# Runs a set of commands using the runners shell
- name: Run a multi-line script
run: if ! "$datamd5" "$sitemd5"; then echo "They don't match."; fi
- name: aaa
run: export
- name: Update update timestamp, if updated
run: sed -i 's/id="last_update">[^<]*</id="last_update">${{ steps.time.outputs.DATE }}</g' index.html
if: steps.data-before.outputs.MD5 != steps.data-after.outputs.MD5

- name: Prepare to publish
run: |
mkdir publish
cp index.html publish/
cp data.json publish/

- uses: shallwefootball/s3-upload-action@master
with:
aws_key_id: ${{ secrets.AWS_KEY_ID }}
aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws_bucket: gcpinstances.info
source_dir: publish

- uses: EndBug/add-and-commit@v9
with:
add: 'index.html data.json'
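
The workflow above records a checksum of `data.json` before and after scraping, always refreshes the `last_check` timestamp, and refreshes `last_update` only when the two checksums differ. The following is a minimal Python sketch of that decision logic, not the workflow's actual implementation; the function names `md5_of` and `plan_timestamp_updates` are hypothetical and exist only for illustration.

```python
import hashlib
from datetime import datetime, timezone


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of the given bytes (used only to detect change)."""
    return hashlib.md5(data).hexdigest()


def plan_timestamp_updates(before: bytes, after: bytes, now: str) -> dict:
    """Decide which timestamps to refresh (hypothetical helper).

    "last_check" is always refreshed; "last_update" only when the data changed,
    mirroring the `if:` condition on the second sed step.
    """
    updates = {"last_check": now}
    if md5_of(before) != md5_of(after):
        updates["last_update"] = now
    return updates


now = datetime.now(timezone.utc).strftime("%a %b %d %H:%M:%S UTC %Y")
unchanged = plan_timestamp_updates(b'{"a": 1}', b'{"a": 1}', now)
changed = plan_timestamp_updates(b'{"a": 1}', b'{"a": 2}', now)
print(sorted(unchanged))  # ['last_check']
print(sorted(changed))    # ['last_check', 'last_update']
```

MD5 serves here purely as a cheap change detector, as in the workflow; it is not a security measure.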
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
venv/
Empty file removed automation.py
43,620 changes: 43,619 additions & 1 deletion data.json

Large diffs are not rendered by default.
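
The `Prettify JSON` step runs the scraper output through `jq --sort-keys .`, presumably so that key order and whitespace stay stable between runs and the checksum comparison reacts only to real data changes. A rough Python equivalent, sketched with the standard `json` module, is shown below; `prettify` is a hypothetical name, and the output is not guaranteed to be byte-identical to what `jq` produces (number and non-ASCII formatting can differ).

```python
import json


def prettify(raw: str) -> str:
    """Re-serialize JSON with sorted keys and 2-space indentation (hypothetical helper)."""
    return json.dumps(json.loads(raw), sort_keys=True, indent=2) + "\n"


# Two inputs with the same content but different key order and spacing
first = prettify('{"b": 1, "a": {"d": 2, "c": 3}}')
second = prettify('{"a": {"c": 3, "d": 2},   "b": 1}')
print(first == second)  # True
```

Because both inputs normalize to the same text, a checksum taken after this step would not change when only the key order differs.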

4 changes: 3 additions & 1 deletion index.html
@@ -22,7 +22,9 @@
<h1>GCPinstances.info <small>Easy GCP <b>Compute Engine</b> Instance Comparison (by <a href="https://www.doit.com">DoiT International</a>)</small></h1>


<p class="pull-right label label-info">Last Update: 2023-04-09 15:00:00 UTC</p>
<p class="pull-right label label-info">Last Prices Check: <span id="last_check">Sun Apr 9 16:02:35 UTC 2023</span></p>
<p class="pull-right label label-info">Last Change: <span id="last_update">Sun Apr 9 16:02:35 UTC 2023</span></p>

<ul class="nav nav-tabs">
<li role="presentation" class="active"><a href="/">Compute Engine</a></li>
<!-- li role="presentation" class=""><a href="/rds/">RDS</a></li -->
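
One detail worth checking when the `last_check` and `last_update` spans are rewritten by a regex substitution: a greedy `.*<` runs to the last `<` on the line, which here is the one that opens `</p>`, so the closing `</span>` is swallowed. A pattern such as `[^<]*<` stops at the first `<` instead. The sketch below demonstrates the difference with Python's `re` module as a stand-in for `sed` (both are greedy for these patterns); the sample line and timestamp are illustrative.

```python
import re

line = '<p>Last Prices Check: <span id="last_check">old value</span></p>'
stamp = "Sun Apr 9 16:02:35 UTC 2023"

# Greedy: ".*<" matches up to the LAST "<" on the line
greedy = re.sub(r'id="last_check">.*<', f'id="last_check">{stamp}<', line)

# Bounded: "[^<]*<" stops at the FIRST "<" after the opening tag
bounded = re.sub(r'id="last_check">[^<]*<', f'id="last_check">{stamp}<', line)

print("</span>" in greedy)   # False
print("</span>" in bounded)  # True
```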
1 change: 1 addition & 0 deletions requirements.txt
@@ -0,0 +1 @@
requests==2.28.2
2 changes: 2 additions & 0 deletions scraper.py
100644 → 100755
@@ -1,3 +1,5 @@
#!/usr/bin/env python3

import json
import requests
