This repository has been archived by the owner on Apr 3, 2021. It is now read-only.

[Feat] [Important] Get data from ENSE #159

Open
JorgeMiguelGomes opened this issue Aug 13, 2019 · 12 comments
Labels
enhancement New feature or request

Comments

@JorgeMiguelGomes
Member

Context
ENSE (Entidade Nacional para o Setor Energético) is the official energy-sector entity in Portugal.
As far as we know, they collect information twice a day: in the morning and in the afternoon.
On this website, they publish information about the REPA network.

Problem
ENSE will not share its data in an open format, and the way the information is presented is not user-friendly.

Objective

  1. Create a script that fetches data (scrapes it) from the official website and updates the REPA stations twice a day on the website "Já Não Dá Para Abastecer"
  2. Send updated information, via @vostpt/bot, to Discord Channel ID 608766585309626368
@tiagoad
Member

tiagoad commented Aug 13, 2019

A little input from someone who hasn't been involved in development:

In preliminary testing, the Fusion Tables endpoints return 403 Forbidden, apparently because ENSE has blocked direct download of the table data. The exception is fusiontables.table.get, which only returns the column names.

The "private" viz API seems to work and returns JSONP (SELECT * JSONP URL), although more testing is required, as I suspect Google will detect the bot and start asking for CAPTCHAs.

@JorgeMiguelGomes
Member Author

@tiagoad thanks! Can we switch to English so that everyone can contribute? Thank you.

@OldMetalmind
Member

After some quick research, I found that calling this URL returns the data as a table.

@tiagoad
Member

tiagoad commented Aug 13, 2019

By the way, if Google does block the direct JSONP requests, it would be trivial to write a Puppeteer script that launches a Google Chrome instance on this URL and captures the responses as they come in.
That said, it's a bit resource-intensive, and I have no idea how it would integrate with the rest of the codebase.
I am available to write such a script, and perhaps integrate it with some guidance from the rest of the team.
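
For reference, a minimal sketch of that fallback, assuming the `puppeteer` npm package is installed. The function names and the `marker` substring used to recognise the data request are hypothetical; the real page URL and request pattern would need to be filled in. It observes responses with `page.on('response', ...)` rather than using Puppeteer's request interception.

```javascript
// Decide whether a response is worth capturing. `marker` is a placeholder
// for whatever substring identifies the data request on the real page.
function isDataResponse(url, marker) {
  return url.includes(marker);
}

// Open the page in headless Chrome and collect the bodies of matching responses.
// Requires the `puppeteer` package; it is loaded lazily, only when called.
async function captureResponses(pageUrl, marker) {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    const pending = [];
    page.on('response', (response) => {
      if (!isDataResponse(response.url(), marker)) return;
      // Some responses (redirects, aborted requests) have no readable body.
      pending.push(response.text().catch(() => null));
    });
    // Wait until the network has been idle, so the data request has fired.
    await page.goto(pageUrl, { waitUntil: 'networkidle0' });
    const bodies = await Promise.all(pending);
    return bodies.filter((body) => body !== null);
  } finally {
    await browser.close();
  }
}
```

Launching a full browser for every fetch is the resource cost mentioned above, so this would only make sense if plain HTTP requests start failing.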

@fbsoares

So, one solution would be having a Puppeteer script that produces a JSON file every X minutes with the parsed ENSE data, and then a Console command that reads that JSON file into the database every X+Y minutes.
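
A rough sketch of that file handoff, under some assumptions: the file layout (`fetchedAt`, `stations`) and both function names are made up for illustration, and the consumer is shown in Node.js only to keep the example in one language; the project's console command may well be written in something else. The producer writes to a temporary file and renames it, so that on POSIX file systems the reader never sees a half-written file.

```javascript
const fs = require('fs');

// Producer side: write the snapshot to a temporary file, then rename it
// over the target so the swap happens in a single step.
function writeSnapshot(file, stations) {
  const tmp = `${file}.tmp`;
  const snapshot = { fetchedAt: new Date().toISOString(), stations };
  fs.writeFileSync(tmp, JSON.stringify(snapshot));
  fs.renameSync(tmp, file);
}

// Consumer side: return the snapshot only if it is newer than the last one
// imported; otherwise return null so the import can be skipped.
function readSnapshotIfNewer(file, lastImportedAt) {
  if (!fs.existsSync(file)) return null;
  const snapshot = JSON.parse(fs.readFileSync(file, 'utf8'));
  // ISO 8601 UTC timestamps from toISOString() compare correctly as strings.
  if (lastImportedAt && snapshot.fetchedAt <= lastImportedAt) return null;
  return snapshot;
}
```

The timestamp check means the X+Y schedule does not need to be exact: if the reader runs before a new file is produced, it simply does nothing.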

@tiagoad
Member

tiagoad commented Aug 13, 2019

> So, one solution would be having a Puppeteer script that produces a JSON file every X minutes with the parsed ENSE data, and then a Console command that reads that JSON file into the database every X+Y minutes.

Yes, but only if we find that the direct HTTP download is failing, I think.

@mribeiro

mribeiro commented Aug 13, 2019

Making a request to this URL seems to return a long JSON object wrapped in a JavaScript function call.
Couldn't we just strip the function call and parse the object to get the data, say with Node.js? (I didn't cross-check all the info, but it is what populates the table @OldMetalmind mentioned.)
If the data is supposed to be sent to Discord, then any Node.js script should be able to do this, right?
I can give a hand if you think this is a good solution.
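
A minimal sketch of the stripping step, assuming the response has the usual JSONP shape `someCallback({...});`. The callback name is not assumed, and the sample payload below is invented; the real ENSE response has different fields.

```javascript
// Extract the JSON payload from a JSONP-style response body.
// Takes whatever sits between the first "(" and the last ")".
function parseJsonp(body) {
  const start = body.indexOf('(');
  const end = body.lastIndexOf(')');
  if (start === -1 || end <= start) {
    throw new Error('Response does not look like a JSONP call');
  }
  return JSON.parse(body.slice(start + 1, end));
}

// Hypothetical sample payload, for illustration only.
const sample = 'callback({"columns":["name","status"],"rows":[["Station A","open"]]});';
const data = parseJsonp(sample);
console.log(data.rows.length);
```

Using `JSON.parse` instead of evaluating the response keeps the script from running arbitrary code if the endpoint ever returns something unexpected.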

@JorgeMiguelGomes
Member Author

@Cotemero would @mribeiro also be able to insert data into the database with this method?

@mribeiro

Hi @JorgeMiguelGomes @Cotemero,
I wrote a Node.js script that produces this output. I tested it with a local sample of the data taken from here (kudos to @tiagoad for the link). I'm mapping the column names directly to JSON attributes. I'm sure they aren't the best names, but I can easily map them to something different.

The script adapts to the columns: if one is added, removed, or edited, the output changes accordingly.
I haven't turned it into an Express.js app to expose it as an API, but that would be quick to do.
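
To illustrate the column-driven mapping, here is a sketch that is not taken from the actual script. It assumes the parsed data has a `{ columns, rows }` shape, and the `rename` parameter is a hypothetical way to map column names to something different.

```javascript
// Turn a { columns, rows } table into an array of objects keyed by column name.
// `rename` optionally maps an original column name to a friendlier key.
function tableToObjects(table, rename = {}) {
  const { columns, rows } = table;
  return rows.map((row) =>
    Object.fromEntries(
      // Missing cells become null so every object has the same keys.
      columns.map((name, i) => [rename[name] ?? name, row[i] ?? null])
    )
  );
}

// Hypothetical table, for illustration only.
const table = {
  columns: ['Posto', 'Gasolina'],
  rows: [['Station A', 'yes'], ['Station B', 'no']],
};
console.log(tableToObjects(table, { Posto: 'station' }));
```

Because the keys come from `columns` at run time, a new or renamed column shows up in the output without any code change.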

@tomahock
Collaborator

@mribeiro can you provide the script? I think that will be enough for what we need!

@mribeiro

mribeiro commented Aug 14, 2019

@tomahock Everything is available at https://github.com/mribeiro/ense_parser!

Shall I move this to in progress?

@miguelantoniosantos
Member

miguelantoniosantos commented Aug 14, 2019

Initial implementation in #185 and #187.


8 participants