12. Data migration from TFRS to LCFS

Prashanth edited this page Nov 26, 2024 · 1 revision

Overview

The etl/data-migration.sh script automates the migration of data from TFRS to LCFS using Apache NiFi processors. Its responsibilities include:

  • Updating database connections in controller services.
  • Executing processors sequentially or on-demand.
  • Establishing port forwarding so that both the LCFS and TFRS databases are reachable.
  • Providing robust logging and error handling.

Prerequisites

Tools and Environment:

  • Apache NiFi: Ensure your NiFi instance is running and accessible.
  • OpenShift: Ensure you are logged in to OpenShift in your terminal using your login token.
  • Command-Line Tools:
    • jq for JSON parsing.
    • oc for OpenShift commands.
    • lsof to check active ports.
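
The prerequisites above can be verified before running the migration. The following is a minimal sketch, not part of data-migration.sh; the helper name missing_tools is hypothetical and only illustrates one way to confirm that jq, oc, and lsof are on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from data-migration.sh):
# prints the names of any tools that are not found on the PATH,
# separated by spaces. Prints an empty line when nothing is missing.
missing_tools() {
    local missing=""
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="${missing:+$missing }$tool"
        fi
    done
    printf '%s\n' "$missing"
}

# Example usage:
#   missing=$(missing_tools jq oc lsof)
#   [ -z "$missing" ] || echo "Missing tools: $missing" >&2
#   oc whoami    # exits non-zero when there is no active OpenShift session
```

Running oc whoami afterwards is a quick way to confirm that the OpenShift login is still valid.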

Usage

Script Invocation:

./etl/data-migration.sh [environment] [--debug|--verbose]

Parameters:

  1. Environment:
    • dev, test, or prod.
    • Determines the namespace and environment from which data is migrated.
  2. Flags:
    • --debug: Enables the most detailed logging.
    • --verbose: Enables verbose logging for troubleshooting.
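
The script's own argument handling is not reproduced on this page. As an illustration only, the environment parameter could be validated along these lines; validate_env is a hypothetical name, and the usage message simply mirrors the invocation shown above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: accept only the three documented environments.
# Returns 0 for dev, test, or prod; otherwise prints usage and returns 1.
validate_env() {
    case "$1" in
        dev|test|prod)
            return 0
            ;;
        *)
            echo "Usage: ./etl/data-migration.sh [dev|test|prod] [--debug|--verbose]" >&2
            return 1
            ;;
    esac
}

# Example usage:
#   validate_env "$1" || exit 1
```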

Steps to Get a Processor ID in Apache NiFi

  1. Access Apache NiFi: Open the NiFi UI in your browser (e.g., http://localhost:8080/nifi).

  2. Select the Processor:

    • Locate the processor in your NiFi flow.
    • Right-click on the processor and select Configure.
  3. Find the Processor ID:

    • Go to the Settings tab in the configuration modal.
    • Look for the Processor ID at the top.
    • Copy the Processor ID.
  4. Example Image: The image below shows where to find the Processor ID:

    [Image: Processor ID in NiFi]
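
Once copied, a Processor ID can be sanity-checked from the command line through the NiFi REST API. This is a rough sketch that assumes an unsecured NiFi instance at http://localhost:8080 (the example URL above); the variable NIFI_BASE_URL and both function names are illustrative and not taken from the script. A secured instance would also need authentication.

```shell
#!/usr/bin/env bash
# Assumption: NiFi's REST API is served under /nifi-api on the local instance.
NIFI_BASE_URL="${NIFI_BASE_URL:-http://localhost:8080/nifi-api}"

# Builds the REST URL for a single processor.
processor_url() {
    printf '%s/processors/%s\n' "$NIFI_BASE_URL" "$1"
}

# Prints the processor's id, name, and state as tab-separated values.
# Requires curl and jq; curl -f makes the call fail on HTTP errors,
# for example when the ID does not exist.
show_processor() {
    curl -sf "$(processor_url "$1")" \
        | jq -r '[.id, .component.name, .component.state] | @tsv'
}

# Example usage:
#   show_processor "328e2539-0192-1000-0000-00007a4304c1"
```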


Script Details

Main Features:

  1. Port Forwarding:

    • Automatically checks if the required port is already forwarded.
    • Establishes a connection if necessary.
  2. Processor Execution:

    • Triggers processors to run and monitors their execution status.
    • Handles failures and retries.
  3. Controller Service Updates:

    • Updates database connections dynamically in NiFi controller services.
    • Ensures smooth integration with external systems.
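
To make the port-forwarding behaviour concrete, here is a hedged sketch of the check-then-forward pattern described above. It is not the script's actual code: the function names, the namespace, and the target resource are placeholders, and the log wording only approximates the messages the script prints.

```shell
#!/usr/bin/env bash
# Returns 0 if a process is already listening on the given local TCP port.
port_in_use() {
    lsof -ti "tcp:$1" -sTCP:LISTEN >/dev/null 2>&1
}

# Hypothetical sketch: forward local_port to remote_port on an OpenShift
# resource unless the local port is already taken.
# Arguments: namespace, target (e.g. a pod or svc/<name>), local port, remote port.
ensure_port_forward() {
    local namespace="$1" target="$2" local_port="$3" remote_port="$4"
    if port_in_use "$local_port"; then
        echo "Port $local_port is already forwarded. Skipping..."
        return 0
    fi
    # Run the forward in the background so the caller can continue.
    oc -n "$namespace" port-forward "$target" "$local_port:$remote_port" >/dev/null 2>&1 &
    echo "Port forwarding started: $local_port -> $remote_port (PID $!)"
}

# Example usage (namespace and target are placeholders):
#   ensure_port_forward "my-namespace" "svc/my-database" 5435 5432
```

Note that a port already in use is not proof that it is forwarded to the right database, which is why the Troubleshooting section suggests inspecting it with lsof.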

File Structure:

  • data-migration.sh: Main script file.

How to Update the Script with New Processor IDs

  1. At the top of the script, add the new processor ID alongside the existing ones:

    readonly ORGANIZATION_PROCESSOR="328e2539-0192-1000-0000-00007a4304c1"
    readonly USER_PROCESSOR="e6c63130-3eac-1b13-a947-ee0103275138"
    readonly TRANSFER_PROCESSOR="b9d73248-1438-1418-a736-cc94c8c21e70"
  2. Navigate to the end of the script and add the new processor to the execution list as well:

    # Expand these processors as needed
    execute_processor "$ORGANIZATION_PROCESSOR" "$env"
    execute_processor "$USER_PROCESSOR" "$env"
    execute_processor "$TRANSFER_PROCESSOR" "$env"

Troubleshooting

Common Issues:

  1. Port Already in Use:

    • Error: Port is already forwarded. Skipping...
    • Solution: Confirm that the existing port-forward points to the correct database. Use lsof to inspect active ports.
    • Example: lsof -ti :5435 | xargs kill -9 terminates any process using port 5435.
  2. Processor ID Not Found:

    • Error: Processor ID does not exist.
    • Solution: Verify the Processor ID in the NiFi UI.
  3. NiFi API Errors:

    • Error: HTTP Code 400 or 500
    • Solution: Ensure that:
      • NiFi is running and accessible.
      • The API credentials and URLs are correct.
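
When diagnosing NiFi API errors, it can help to probe the API directly and interpret the status code. The sketch below assumes an unsecured NiFi at http://localhost:8080; the function names and the suggested hints are illustrative, not output produced by data-migration.sh.

```shell
#!/usr/bin/env bash
# Maps an HTTP status code to a short troubleshooting hint.
# curl reports 000 when it could not connect at all.
classify_http_code() {
    case "$1" in
        2??) echo "ok" ;;
        4??) echo "client error: check the URL, processor ID, and credentials" ;;
        5??) echo "server error: check that NiFi is healthy and inspect its logs" ;;
        000) echo "no response: NiFi is not reachable" ;;
        *)   echo "unexpected status: $1" ;;
    esac
}

# Probes the NiFi API and prints only the HTTP status code.
# Assumption: the base URL below matches your NiFi instance.
probe_nifi() {
    local base_url="${1:-http://localhost:8080/nifi-api}"
    curl -s -o /dev/null -w '%{http_code}' "$base_url/flow/about"
}

# Example usage:
#   classify_http_code "$(probe_nifi)"
```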

Example Execution

Command:

./etl/data-migration.sh dev --verbose

Expected Output:

[INFO][2024-11-25 10:00:00] Starting NiFi Processor Management Script
[INFO][2024-11-25 10:00:01] Updating NiFi connection for TFRS in dev
[INFO][2024-11-25 10:00:02] Port forwarding for TFRS: 5435 -> 5432
[INFO][2024-11-25 10:00:03] Triggering processor ORGANIZATION_PROCESSOR
[INFO][2024-11-25 10:00:04] All processors executed successfully.