This project is centered around parsing various datasets, including UK government data on property sales, police reporting data, and postcode data. The goal is to use geographical information to establish connections between postcodes based on their latitude and longitude.
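One common way to connect postcodes by latitude and longitude is the great-circle (haversine) distance. The sketch below is illustrative only: `haversineKm`, `nearbyPostcodes`, and the `{ postcode, lat, lng }` record shape are hypothetical names, not taken from this repository.

```javascript
// Mean Earth radius in kilometres (a common approximation).
const EARTH_RADIUS_KM = 6371;

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance between two { lat, lng } points, in kilometres.
function haversineKm(a, b) {
  const dLat = toRadians(b.lat - a.lat);
  const dLng = toRadians(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Hypothetical helper: all candidate postcodes within radiusKm of the origin,
// excluding the origin itself.
function nearbyPostcodes(origin, candidates, radiusKm) {
  return candidates.filter(
    (c) => c.postcode !== origin.postcode && haversineKm(origin, c) <= radiusKm
  );
}
```

A linear scan like this is fine for small sets; for a full postcode dataset a spatial index or a database-side distance query would likely be needed.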
The primary objective is to develop a scalable GraphQL backend capable of swiftly delivering requested results. This endeavor seeks to illuminate intricate aspects of GraphQL use, addressing challenges like the N+1 problem and scaling scenarios where more than one database is required for both write and read nodes.
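To illustrate the N+1 problem: if a list resolver returns N rows and each row's field resolver issues its own query, the server runs N+1 queries. A usual remedy is to batch the lookups made in the same tick into one query. The following is a minimal sketch of that idea, not this project's implementation; `createBatchLoader` and `batchFn` are hypothetical names, and `batchFn` is assumed to return results in the same order as the keys it receives.

```javascript
// Collects keys requested during the current tick and resolves them
// with a single call to batchFn(keys).
function createBatchLoader(batchFn) {
  let pending = []; // entries of the form { key, resolve, reject }
  let scheduled = false;

  async function flush() {
    const batch = pending;
    pending = [];
    scheduled = false;
    try {
      const results = await batchFn(batch.map((p) => p.key));
      batch.forEach((p, i) => p.resolve(results[i]));
    } catch (err) {
      batch.forEach((p) => p.reject(err));
    }
  }

  return function load(key) {
    return new Promise((resolve, reject) => {
      pending.push({ key, resolve, reject });
      if (!scheduled) {
        scheduled = true;
        queueMicrotask(flush); // run once, after the current synchronous code
      }
    });
  };
}
```

In a resolver, each field would call `load(id)` instead of querying directly, so N sibling fields share one database round trip. Production code would more likely rely on an established batching library and add per-request caching.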
Key features of the project include a robust automated Quality Assurance (QA) system that uses anonymized data seeding for comprehensive QA testing. The project also explores the flexibility of JavaScript, pushing the boundaries of the language. Notably, it examines the limit on the number of fields in a default V8 object, which is capped at roughly 8.4 million, and shows that the Map data structure can hold more entries.
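As a small illustration of the object-versus-Map point, the sketch below builds a lookup index with `Map` rather than a plain object. The function name `indexByPostcode` and the row shape are hypothetical and only serve the example.

```javascript
// Index rows by postcode using a Map instead of a plain object,
// so a very large number of keys does not run into object field limits.
function indexByPostcode(rows) {
  const index = new Map();
  for (const row of rows) {
    index.set(row.postcode, row);
  }
  return index;
}
```

Lookups then use `index.get(postcode)` and `index.has(postcode)` instead of property access.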
Additionally, the project incorporates a queue system to make data processing more efficient. In essence, the project serves as a practical demonstration of several advanced aspects of software development.
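The README does not describe how the queue works, so the following is only a generic sketch of one possible shape: an in-memory queue that processes items with a bounded number of concurrent workers. `createQueue`, `worker`, and `concurrency` are hypothetical names.

```javascript
// Runs worker(item) for each pushed item, with at most `concurrency`
// items being processed at the same time.
function createQueue(worker, concurrency = 2) {
  const tasks = [];
  let active = 0;

  function next() {
    while (active < concurrency && tasks.length > 0) {
      const { item, resolve, reject } = tasks.shift();
      active += 1;
      Promise.resolve()
        .then(() => worker(item))
        .then(resolve, reject)
        .finally(() => {
          active -= 1;
          next(); // pick up the next waiting task, if any
        });
    }
  }

  return {
    // Returns a promise that settles with the worker's result for this item.
    push(item) {
      return new Promise((resolve, reject) => {
        tasks.push({ item, resolve, reject });
        next();
      });
    },
  };
}
```

Bounding concurrency this way keeps a large import from opening too many database connections or holding too many parsed rows in memory at once.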
- GraphQL live demo [currently unavailable]
- Web Application example of how data can be consumed
- Web Application live demo [currently unavailable]
If you're using `make` commands, Docker and docker-compose are required; local Node.js with npm is optional.
- node.js v18+
- npm v5+ or yarn
- optional make (comes out of the box in Unix environments)
- optional docker v18.09+
- optional sqlite3 v3+ for 'integration' tests only
- with `make` commands no additional steps are required; otherwise you need to execute `$ npm i`
`$ make test` or `$ npm test`
- optional 'jest' CLI params, examples:
  - to collect coverage: `$ npm test -- --coverage`; the report will be located in the ./coverage directory
  - to execute tests only in a specific file: `$ npm test src/graphql/user.test.js`
- database configuration is located in the file src/orm-config.js
- to get the database schema up to date: `$ npm run sql db:migrate`; you can also create a database via the ORM: `$ npm run sql db:create`
- to seed the database with 'test' data: `$ npm run sql db:seed:all`
`$ make` or `$ npm start`
`$ make serve` (there is no npm equivalent)
- if you only need to generate static assets: `$ make build` or `$ npm run build`
- generated assets will be located in the ./build directory
make PORT=18081
- heroku -> current production; contains production-specific changes and triggers a production deployment on every push
- master -> most up-to-date production-ready code; all pull requests into this branch must pass the mandatory check 'ci/circleci: jest'
- feature branches -> merged into the master branch when they are ready and the mandatory checks have passed
- CI executes tests in an isolated environment
Variable | Default Value | Type | Description |
---|---|---|---|
PORT | 8081 | number | The port on which the application will be available. |
SSL_KEY | | string | The absolute path to the SSL key (e.g., /home/ubuntu/private.key). |
SSL_CERT | | string | The absolute path to the SSL certificate (e.g., /home/ubuntu/certificate.crt). |
*** | *** | *** | If a replica's config is specified, non-replica connections are used only for writes. |
DB_HOSTNAME | 127.0.0.1 | string | The host on which the database can be reached. |
DB_USERNAME | root | string | The username for connecting to the database. |
DB_PASSWORD | password | string | The password for the database user. |
DB_PORT | 3306 | number | The port on which the database can be reached. |
DB_NAME | explore | string | The name of the database schema. |
DB_DIALECT | mysql | string | The database dialect, one of mysql / sqlite / postgres. |
DB_REPLICA_HOSTNAME | 127.0.0.1 | string | The host of the database replica for read-only operations. |
DB_REPLICA_USERNAME | root | string | The username for connecting to the database replica for read-only operations. |
DB_REPLICA_PASSWORD | password | string | The password for the user connecting to the database replica for read-only operations. |
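The read/write split described in the table could be wired up roughly as follows. This is a sketch under assumptions, not the contents of src/orm-config.js: it assumes the ORM is Sequelize (whose options accept a `replication: { read, write }` object), that replica mode is enabled only when `DB_REPLICA_HOSTNAME` is set, and `buildDbConfig` is a hypothetical name.

```javascript
// Builds a Sequelize-style options object from environment variables.
// Defaults mirror the table above.
function buildDbConfig(env = process.env) {
  const write = {
    host: env.DB_HOSTNAME || '127.0.0.1',
    username: env.DB_USERNAME || 'root',
    password: env.DB_PASSWORD || 'password',
  };
  const base = {
    database: env.DB_NAME || 'explore',
    port: Number(env.DB_PORT || 3306),
    dialect: env.DB_DIALECT || 'mysql',
  };

  // Assumption: no replica host means a single connection for reads and writes.
  if (!env.DB_REPLICA_HOSTNAME) {
    return { ...base, ...write };
  }

  // With a replica configured, the primary connection is used only for writes.
  const read = [
    {
      host: env.DB_REPLICA_HOSTNAME,
      username: env.DB_REPLICA_USERNAME || 'root',
      password: env.DB_REPLICA_PASSWORD || 'password',
    },
  ];
  return { ...base, replication: { read, write } };
}
```

For example, `buildDbConfig({ DB_REPLICA_HOSTNAME: '10.0.0.2' })` yields a config whose `replication.read` list points at the replica while `replication.write` keeps the primary defaults.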
NPM command | corresponding JS file |
---|---|
parse:postcodes | src/parse:postcodes |
parse:postcodes:lsoa | src/parse:postcodes:lsoa |
parse:incidents | src/parse:markers:and:incidents |
parse:properties | src/parse:markers:and:properties |
parse:areas | src/parse:areas |
parse:timelines | src/parse:timelines |
example: npm run parse:postcodes -- --file=/media/data/postcodes.csv
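Arguments placed after `--` are passed by npm to the underlying script. How the parse scripts actually read `--file` is not shown here; a minimal way to pick it up from `process.argv` might look like this, where `getFileArg` is a hypothetical helper.

```javascript
// Returns the value of a `--file=<path>` argument, or undefined if absent.
function getFileArg(argv = process.argv.slice(2)) {
  const prefix = '--file=';
  const arg = argv.find((a) => a.startsWith(prefix));
  return arg ? arg.slice(prefix.length) : undefined;
}
```

With the example command above, `getFileArg()` would return `/media/data/postcodes.csv`.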
database | version | adapter | main purpose |
---|---|---|---|
MySQL | 8 | mysql2 | production |
PostgreSQL | 11 | pg | production |
SQLite | 4 | sqlite3 | QA Automation & CI |
- if you use MySQL 5.7+, you need to make sure it can work with the `mysql_native_password` authentication plugin
- PostgreSQL and SQLite are only partially supported because some of the queries are not fully engine-agnostic; for example, some functions do not exist in SQLite