
4. Dashboard metrics data loading

carolinemwood edited this page May 6, 2024 · 1 revision

This document outlines the process of extracting from Oracle (THE schema) and loading it into Postgres (SILVA schema) for the Dashboard page on the SILVA application.

TL;DR

Long version

Cron job

The SILVA backend REST API is built on Spring Boot and uses its built-in scheduling support. A cron expression schedules the job to run at 11 PM, Monday to Friday.

@Component
public class JobName {

  private final DashboardExtractionService dashboardExtractionService;

  public JobName(DashboardExtractionService dashboardExtractionService) {
    this.dashboardExtractionService = dashboardExtractionService;
  }

  @Scheduled(cron = "0 0 23 * * MON-FRI")
  public void extractDataForTheDashboard() {
    dashboardExtractionService.extractDataForTheDashboard();
  }
}
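To make the cron expression concrete, here is a plain-Java sketch (not Spring's scheduler, just an illustration) of the rule `0 0 23 * * MON-FRI` encodes: the next run is the next weekday at 23:00.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.LocalTime;

public class CronDemo {

  // Next 23:00 on a weekday strictly after 'from' -- mirrors "0 0 23 * * MON-FRI"
  static LocalDateTime nextRun(LocalDateTime from) {
    LocalDateTime candidate = from.toLocalDate().atTime(LocalTime.of(23, 0));
    if (!candidate.isAfter(from)) {
      candidate = candidate.plusDays(1);
    }
    while (candidate.getDayOfWeek() == DayOfWeek.SATURDAY
        || candidate.getDayOfWeek() == DayOfWeek.SUNDAY) {
      candidate = candidate.plusDays(1);
    }
    return candidate;
  }

  public static void main(String[] args) {
    // Friday 2024-05-03 10:00 -> same day at 23:00
    System.out.println(nextRun(LocalDateTime.of(2024, 5, 3, 10, 0)));
    // Friday 2024-05-03 23:30 -> Monday 2024-05-06 23:00 (weekend skipped)
    System.out.println(nextRun(LocalDateTime.of(2024, 5, 3, 23, 30)));
  }
}
```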


Data sources

  • Source: Oracle database with THE schema. JDBC connection over encrypted listeners.
  • Destination: Postgres database created for SILVA modernization, hosted on OpenShift.

ℹ️ No tuning or improvements were made in terms of connection pooling, temporary data, or indexes. A single thread in a single process was able to get it done.
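A setup like this usually means two datasources in one Spring Boot application. The following is a hypothetical `application.yml` sketch, not the project's actual configuration: the `spring.datasource` keys are standard Spring Boot properties, while the hosts, service names, and the custom `oracle` prefix (which would be bound to a second `DataSource` bean) are assumptions. The `tcps` protocol reflects the encrypted Oracle listener mentioned above.

```yaml
# Postgres (SILVA schema) -- the default Spring Boot datasource
spring:
  datasource:
    url: jdbc:postgresql://silva-db.example:5432/silva
    username: ${POSTGRES_USER}
    password: ${POSTGRES_PASSWORD}

# Oracle (THE schema) -- custom prefix, bound to a second DataSource bean
oracle:
  datasource:
    url: jdbc:oracle:thin:@tcps://oracle-host.example:1543/SERVICE_NAME
    username: ${ORACLE_USER}
    password: ${ORACLE_PASSWORD}
```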

Tables Read on Oracle

Here are all the Oracle tables read in this process. By default, it collects data from the last 24 months.

  • THE.OPENING: Opening table.
SELECT op.OPENING_ID AS openingId
  ,op.OPENING_STATUS_CODE AS openingStatusCode
  ,op.ENTRY_USERID AS entryUserId
  ,op.ENTRY_TIMESTAMP AS entryTimestamp
  ,op.UPDATE_TIMESTAMP AS updateTimestamp
  ,op.ADMIN_DISTRICT_NO AS adminDistrictNo
  ,op.RESULTS_SUBMISSION_ID AS resultsSubmissionId
  ,GREATEST(op.ENTRY_TIMESTAMP,op.UPDATE_TIMESTAMP) AS actionTimestamp
FROM THE.OPENING op
  WHERE op.ENTRY_TIMESTAMP >= ADD_MONTHS(SYSDATE, - ?1)
  OR op.UPDATE_TIMESTAMP >= ADD_MONTHS(SYSDATE, - ?1)
ORDER BY actionTimestamp DESC;
  • THE.RESULTS_ELECTRONIC_SUBMISSION: Results electronic submissions, used for the free-growing milestones and the clientNumber filter.

ℹ️ Note that here, only submission IDs present in the previous query result will be fetched.

SELECT res.RESULTS_SUBMISSION_ID AS resultsSubmissionId
  ,res.CLIENT_NUMBER AS clientNumber
FROM THE.RESULTS_ELECTRONIC_SUBMISSION res
WHERE res.RESULTS_SUBMISSION_ID IN (?1);
  • THE.RESULTS_AUDIT_EVENT: Results Audit Events table, required for tracking opening changes and history.
SELECT ra.RESULTS_AUDIT_ACTION_CODE AS resultsAuditActionCode
  ,ra.ACTION_DATE AS actionDate
  ,ra.ENTRY_TIMESTAMP AS entryTimestamp
  ,ra.ENTRY_USERID AS entryUserid
  ,ra.OPENING_ID AS openingId
  ,GREATEST(ra.ENTRY_TIMESTAMP,ra.ACTION_DATE) AS actionTimestamp
FROM THE.RESULTS_AUDIT_EVENT ra
WHERE ra.ENTRY_TIMESTAMP >= ADD_MONTHS(SYSDATE, - ?1)
  OR ra.ACTION_DATE >= ADD_MONTHS(SYSDATE, - ?1)
ORDER BY actionTimestamp DESC;
  • THE.STOCKING_EVENT_HISTORY: Stocking Event History table, required for the history feature.
SELECT seh.RESULTS_AUDIT_ACTION_CODE AS resultsAuditActionCode
  ,seh.ENTRY_USERID AS entryUserid
  ,seh.OPENING_ID AS openingId
  ,seh.ENTRY_TIMESTAMP AS entryTimestamp
  ,seh.AMEND_EVENT_TIMESTAMP AS amendEventTimestamp
  ,GREATEST(seh.ENTRY_TIMESTAMP,seh.AMEND_EVENT_TIMESTAMP) AS actionTimestamp
FROM THE.STOCKING_EVENT_HISTORY seh
  WHERE seh.ENTRY_TIMESTAMP >= ADD_MONTHS(SYSDATE, - ?1)
  OR seh.AMEND_EVENT_TIMESTAMP  >= ADD_MONTHS(SYSDATE, - ?1)
ORDER BY actionTimestamp DESC;
  • THE.ORG_UNIT: Organization Units table, providing district names.
SELECT ou.ORG_UNIT_NO AS orgUnitNo
  ,ou.ORG_UNIT_CODE AS orgUnitCode
  ,ou.ORG_UNIT_NAME AS orgUnitName
FROM THE.ORG_UNIT ou
WHERE ou.ORG_UNIT_NO IN (?1);

ℹ️ Note that here, only Org Units present in the openings query result will be fetched.

  • THE.RESULTS_AUDIT_ACTION_CODE: Descriptions of user's actions.
SELECT raac.RESULTS_AUDIT_ACTION_CODE AS resultsAuditActionCode
  ,raac.DESCRIPTION AS description
  FROM THE.RESULTS_AUDIT_ACTION_CODE raac 
WHERE raac.RESULTS_AUDIT_ACTION_CODE IN (?1);

ℹ️ Note that here, only action codes present in both audit and stocking events queries results will be fetched.
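The filtering and ordering shared by the queries above can be restated in plain Java. This is only a sketch of the window logic, not the extraction code itself: `ADD_MONTHS(SYSDATE, -months)` becomes `now.minusMonths(months)`, and `GREATEST(...)` becomes the larger of the two timestamps.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;

public class OpeningFilterDemo {

  // One row of the THE.OPENING extraction (only the fields the filter uses)
  record OpeningRow(long openingId, LocalDateTime entryTimestamp, LocalDateTime updateTimestamp) {
    // GREATEST(ENTRY_TIMESTAMP, UPDATE_TIMESTAMP)
    LocalDateTime actionTimestamp() {
      return entryTimestamp.isAfter(updateTimestamp) ? entryTimestamp : updateTimestamp;
    }
  }

  // Mirrors the WHERE clause and ORDER BY: keep rows whose entry OR update
  // falls inside the last `months` months, most recent action first.
  static List<OpeningRow> recentFirst(List<OpeningRow> rows, int months, LocalDateTime now) {
    LocalDateTime cutoff = now.minusMonths(months); // ADD_MONTHS(SYSDATE, -months)
    return rows.stream()
        .filter(r -> !r.entryTimestamp().isBefore(cutoff) || !r.updateTimestamp().isBefore(cutoff))
        .sorted(Comparator.comparing(OpeningRow::actionTimestamp).reversed())
        .toList();
  }

  public static void main(String[] args) {
    LocalDateTime now = LocalDateTime.of(2024, 5, 1, 0, 0);
    List<OpeningRow> rows = List.of(
        new OpeningRow(1L, now.minusMonths(1), now.minusMonths(30)),
        new OpeningRow(2L, now.minusMonths(30), now.minusMonths(30)));
    System.out.println(recentFirst(rows, 24, now).size()); // 1 -- opening 2 is outside the window
  }
}
```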

Tables Loaded into Postgres

For data loading, the process works differently, using Hibernate ORM and JPA. Hibernate provides object-relational mapping for Java classes: developers write Java code and Hibernate turns it into SQL. JPA used to stand for Java Persistence API; it has since been renamed to the Jakarta Persistence API under the Jakarta EE project.

That means there's no hand-written SQL for data loading. Simply calling Java-defined classes and interfaces does the trick.

Code samples:

// Table definition
@Entity
@Table(name = "openings_last_year")
public class OpeningsLastYearEntity {
  @Id
  @Column(name = "opening_id")
  private Long openingId;

  /* ... more columns ...*/
}

// Interface for database access - no SQL code required, it'll be generated
public interface OpeningsLastYearRepository extends JpaRepository<OpeningsLastYearEntity, Long> {

  List<OpeningsLastYearEntity> findAllByOpeningIdIn(List<Long> openingIdList);
}

// Then we can do
openingsLastYearRepository.deleteAll();
openingsLastYearRepository.saveAllAndFlush(openingsLastYearMap.values());
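A note on why a map's `values()` is saved rather than a raw list: keying rows by opening ID deduplicates them before the insert, so each primary key is written once. This is a trimmed-down sketch of that idea, assuming the map is keyed by opening ID; `Row` is a hypothetical stand-in for `OpeningsLastYearEntity`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupeDemo {

  // Hypothetical stand-in for OpeningsLastYearEntity
  record Row(long openingId, String status) {}

  // Keying rows by opening ID before saveAllAndFlush means each primary key is
  // inserted once; a later row for the same opening overwrites the earlier one.
  static Map<Long, Row> byOpeningId(List<Row> rows) {
    Map<Long, Row> map = new LinkedHashMap<>();
    for (Row r : rows) {
      map.put(r.openingId(), r);
    }
    return map;
  }

  public static void main(String[] args) {
    Map<Long, Row> map = byOpeningId(
        List.of(new Row(1L, "APP"), new Row(1L, "FG"), new Row(2L, "APP")));
    System.out.println(map.size()); // 2 -- the duplicate opening 1 collapsed
  }
}
```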


Monitoring

  • The table ORACLE_EXTRACTION_LOGS stores all the log messages, including warnings and data inconsistencies. Only logs from the last extraction will be stored.
  • API endpoints allow requesting logs or manually triggering the job.

[screenshot: 2024-04-18_18-14]

Here's how they look in a request response:

[screenshot: 2024-04-18_18-14_1]

Manual trigger access is restricted to authorized team members and managers.

Notes

This process is still a work in progress and has not yet been run in production. Updates can be expected soon.