11. Backend Logging Guide

Hamed Valiollahi edited this page Nov 4, 2024 · 3 revisions

This document provides comprehensive guidelines for developers on how to effectively use our standardized logging system. By following this guide, you will ensure consistent logging practices across our backend services, facilitating easier debugging, monitoring, and analysis.


Table of Contents

  1. Introduction
  2. Logging Modes
  3. Logging Practices
  4. Sample Logs
  5. Logging Levels
  6. Correlation ID Usage
  7. Best Practices and Guidelines
  8. Using Kibana for Log Analysis
  9. FAQs

Introduction

Effective logging is crucial for diagnosing issues, monitoring application behavior, and gaining insights into system performance. This guide outlines our standardized approach to backend logging, helping you produce logs that are both developer-friendly and suitable for production analysis.

Our logging system lets you focus on logging the information that matters; metadata such as source location and correlation IDs is added automatically by our logging infrastructure, so you never need to attach it yourself.


Logging Modes

Our logging system operates in two distinct modes, catering to different stages of the development lifecycle:

Development Mode

  • Purpose: To provide developers with readable and highlighted logs during local development.
  • Features:
    • Human-readable format.
    • Syntax highlighting for better readability.
    • Indented and structured output for complex data types.

Non-Development Mode

  • Purpose: To produce logs suitable for production environments, optimized for log aggregation and analysis tools like Kibana.
  • Features:
    • JSON-formatted logs for easy parsing.
    • Compact representation of data.
    • Designed for ingestion into centralized logging systems.

Logging Practices

What to Log

As a developer, your responsibility is to log important information that aids in understanding the application's behavior and diagnosing issues. This includes:

  • Significant Events: Actions like "Report generated", "Report signed".
  • Important Data Points: IDs, counts, or statuses that are crucial for tracking operations.
  • Errors and Exceptions: Any caught exceptions or error conditions that need attention.
  • Process Milestones: Steps in a long-running process to indicate progress.

How to Log

  • Use the Logger: Utilize the provided logger object in your code to log messages.
  • Log Important Information: Include relevant data by passing key-value pairs.
  • No Need to Add Metadata: Do not manually add source information, correlation IDs, or other metadata—they are added automatically by our logging infrastructure.

Example Usage:

logger.info("Fuel supply created for compliance report", fuel_supply_id=1, compliance_report_id=1)

In this example, you are logging an informational message indicating that a fuel supply has been successfully created, along with the fuel_supply_id and compliance_report_id.


Sample Logs

Development Mode Logs

Example of logs generated in development mode:

Note that the following will appear highlighted in the terminal.

2024-11-03 18:50:23 [INFO] Getting fuel supply list for compliance report
2024-11-03 18:50:23   compliance_report_id=123
2024-11-03 18:50:23   correlation_id='177b381a-ca37-484d-a3b9-bbb16061775a'
2024-11-03 18:50:23   level='info'
2024-11-03 18:50:23   filename='reports/views.py'
2024-11-03 18:50:23   func_name='get_fuel_supply_list'
2024-11-03 18:50:23   lineno=101

Example of an API call success log:

2024-11-03 19:19:04 [INFO] API call successful
2024-11-03 19:19:04   user_id=***
2024-11-03 19:19:04   request_payload={'page': 1, 'size': 10, ...}
2024-11-03 19:19:04   response_data=UsersSchema(pagination=..., users=[...])
2024-11-03 19:19:04   source_info={'filename': 'user/views.py', 'func_name': 'get_users', 'lineno': 76}
2024-11-03 19:19:04   correlation_id='****'
2024-11-03 19:19:04   level='info'
2024-11-03 19:19:04   filename='decorators.py'
2024-11-03 19:19:04   func_name='wrapper'
2024-11-03 19:19:04   lineno=176

Notes:

  • Information like correlation_id, log level, and source information (filename, func_name, lineno) is added automatically.

Non-Development Mode Logs

Example of logs generated in non-development mode:

{
  "compliance_report_id": 123,
  "event": "Getting fuel supply list for compliance report",
  "correlation_id": "816dbbdf-11fe-4df5-8dc8-754c07610742",
  "level": "info",
  "filename": "reports/views.py",
  "func_name": "get_fuel_supply_list",
  "lineno": 101
}

Notes:

  • The log is in JSON format, suitable for ingestion into log analysis tools like Kibana.

Logging Levels

It's important to use the appropriate logging levels to indicate the severity and nature of the events being logged. Below are the logging levels available:

  • logger.debug(...) – Detailed information, typically of interest only when diagnosing problems.
  • logger.info(...) – Confirmation that things are working as expected.
  • logger.warning(...) – An indication that something unexpected happened, or that a problem may arise in the near future (e.g., low disk space); the software is still working as expected.
  • logger.error(...) – Due to a more serious problem, the software has not been able to perform some function.
  • logger.critical(...) – A serious error, indicating that the program itself may be unable to continue running.

Example Usage:

logger.debug("Debug message", attribute1=value1, attribute2=value2)
logger.info("Info message", attribute1=value1, attribute2=value2)
logger.warning("Warning message", attribute1=value1, attribute2=value2)
logger.error("Error message", attribute1=value1, attribute2=value2)
logger.critical("Critical message", attribute1=value1, attribute2=value2)

Correlation ID Usage

What is a Correlation ID?

A correlation ID is a unique identifier assigned to each request or operation, which allows you to trace and correlate log entries across different services or components.

Why is it Helpful?

  • Traceability: Helps in tracking the flow of a request through different layers of the application.
  • Debugging: Makes it easier to find all logs related to a specific operation.
  • Support: Facilitates customer support by allowing teams to quickly locate relevant logs based on a correlation ID.

Future Integration with Front-End

In the future, we aim to have the same correlation ID passed from the front-end to the back-end. This will allow for end-to-end tracing of user actions, greatly enhancing our ability to diagnose issues and understand user behavior.

How It Works:

  • The correlation ID is automatically generated and attached to each request.
  • Developers do not need to manually include it in log statements.
  • When reviewing logs, you can filter by correlation ID to see all related log entries.

Best Practices and Guidelines

Consistency

  • Use Appropriate Log Levels: Choose the correct logging level to reflect the severity of the event.

Security and Privacy

  • Avoid Logging Sensitive Data: Do not log personally identifiable information (PII) or sensitive data such as passwords, tokens, or confidential information.

  • Automatic Data Masking: Our logging system uses the censor_sensitive_data_processor function (found in backend/lcfs/logging_config.py) to automatically mask sensitive fields in logs.

    def censor_sensitive_data_processor(_, __, event_dict):
        """Mask sensitive information in logs."""
        sensitive_keys = {'password', 'token', 'secret_key', 'authorization', 'api_key'}
        for key in sensitive_keys:
            if key in event_dict:
                event_dict[key] = '***'
        return event_dict
  • Best Practice: Even with automatic masking, always refrain from logging sensitive information to ensure data privacy and security.
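
To see the masking in action, the processor can be called directly. The snippet below reproduces the function and feeds it a hypothetical event dict, the same shape structlog passes through its processor chain:

```python
def censor_sensitive_data_processor(_, __, event_dict):
    """Mask sensitive information in logs."""
    sensitive_keys = {'password', 'token', 'secret_key', 'authorization', 'api_key'}
    for key in sensitive_keys:
        if key in event_dict:
            event_dict[key] = '***'
    return event_dict

# Simulate an event dict on its way through the processor chain
event = {"event": "Login attempt", "username": "jdoe", "password": "hunter2"}
masked = censor_sensitive_data_processor(None, None, event)
print(masked)  # → {'event': 'Login attempt', 'username': 'jdoe', 'password': '***'}
```

Note that only exact key matches are masked, which is one more reason not to rely on the processor as your sole line of defence.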

Performance Considerations

  • Avoid Excessive Logging: Log only necessary information to prevent log bloat and performance degradation.
  • Efficient Logging: Be mindful of the overhead introduced by logging, especially in loops or high-throughput areas.
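
One common pattern for hot paths is to guard an expensive debug payload behind a level check. The sketch below uses the standard library's isEnabledFor for illustration; whether this applies directly depends on how our structlog pipeline filters levels, so treat it as a pattern rather than our exact API.

```python
import logging

log = logging.getLogger(__name__)

def expensive_summary(items):
    """Stand-in for a costly serialization you would not want in a hot loop."""
    return ",".join(sorted(str(i) for i in items))

def process(items):
    # Guard: build the expensive payload only if DEBUG is actually enabled
    if log.isEnabledFor(logging.DEBUG):
        log.debug("processing items: %s", expensive_summary(items))
    return len(items)
```

At INFO level and above, the guard skips `expensive_summary` entirely, so the cost of the disabled log line is a single level check.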

Using Kibana for Log Analysis

Accessing Kibana

  1. Open Kibana: Access the Kibana dashboard via the OpenShift Platform.

  2. Log In: Use your IDIR credentials.

  3. Set Up Your Index (First-Time Users):

    • Index Pattern: app-*
    • Timestamp Field Name: @timestamp

After setup, go to the Discover page to view logs.

Searching and Filtering Logs

  • Basic Search: Filter logs by typing keywords or phrases.
    • Example:
      kubernetes.namespace_name:"YOUR_PROJECT_NAME" AND kubernetes.container_name.raw:"lcfs-backend"
      
    • Replace "YOUR_PROJECT_NAME" with your project name.
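
    • Once you have a correlation ID (for example, from an error report), you can pull up every related entry with a query along these lines (the field name is assumed to match our JSON logs):

      correlation_id:"816dbbdf-11fe-4df5-8dc8-754c07610742"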

Tips

  • Customize Columns: On the Discover page, adjust columns for key information.
  • Save Searches: Save frequently used queries.

This should help you navigate and analyze logs effectively in Kibana!


FAQs

Q1: Do I need to include source information like filename and line number in my log statements?

No, you do not need to include source information manually. The logging system automatically appends source information to your logs.

Q2: How do I include additional context in my logs?

Simply pass key-value pairs as additional arguments to your log statement. For example:

logger.info("Processing request", request_id=request.id, ...)

Q3: Do I need to manage the correlation ID?

No, the correlation ID is managed by the logging infrastructure. It is automatically attached to each log entry and can be used to trace requests.

Q4: Should I log exceptions and stack traces?

Yes. When handling exceptions, log them at an appropriate level (e.g., logger.error) and include the exception details; passing exc_info=True also attaches the stack trace:

try:
    process_request()  # the operation that may fail
except Exception as e:
    logger.error("An error occurred while processing the request", error=str(e), exc_info=True)

Q5: What logging levels are available, and when should I use them?

Refer to the Logging Levels section for details on each logging level and appropriate usage scenarios.