feat: Add middleware to log all post requests. #651

Merged: 3 commits into master on Oct 11, 2024

Conversation

Contributor

@Ashutosh619-sudo commented Oct 8, 2024

Description

Please add PR description here, add screenshots if needed

Clickup

https://app.clickup.com/1864988/v/l/6-901603904304-1

Summary by CodeRabbit

  • New Features

    • Introduced enhanced logging for incoming POST and PUT requests to improve monitoring and debugging.
  • Chores

    • Updated middleware settings to include the new logging middleware in the request handling pipeline.


coderabbitai bot commented Oct 8, 2024

Walkthrough

The changes introduce a new middleware class, LogPostRequestMiddleware, to the fyle_netsuite_api project. This middleware logs the body of incoming POST and PUT requests, including the request path and body if the body is successfully decoded and parsed as JSON. Additionally, the middleware is added to the Django settings, enhancing the request handling pipeline for logging purposes. The existing ErrorHandlerMiddleware remains unchanged.

Changes

File Path | Change Summary
fyle_netsuite_api/logging_middleware.py | Added a new middleware class LogPostRequestMiddleware to log POST and PUT request bodies.
fyle_netsuite_api/settings.py | Added LogPostRequestMiddleware to the MIDDLEWARE list in Django settings for enhanced logging.

Poem

In the meadow, requests do hop,
Logging each POST, we won’t stop.
With paths and bodies, all in sight,
Our middleware shines, oh what a delight!
🐇✨



@github-actions bot added the size/XS Extra Small PR label Oct 8, 2024

github-actions bot commented Oct 8, 2024

Tests: 223 | Skipped: 0 💤 | Failures: 0 ❌ | Errors: 0 🔥 | Time: 26.015s ⏱️


@coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
fyle_netsuite_api/settings.py (1)

69-69: Approve with suggestions: Consider performance and security implications

The addition of LogPostRequestMiddleware aligns with the PR objective to log all POST requests. However, please consider the following points:

  1. Performance: Logging all POST requests, especially those with large payloads, might have performance implications. Consider implementing sampling or size limits for logged requests.

  2. Security: Logging request bodies may inadvertently capture sensitive information. Ensure that sensitive data (e.g., passwords, tokens) is properly masked or excluded from logs.

  3. Configurability: Consider adding configuration options to:

    • Enable/disable this middleware based on the environment (e.g., dev, staging, production).
    • Specify which paths or endpoints to log.
    • Set a maximum size for logged request bodies.

To implement these suggestions, you could add configuration variables in the settings file:

# Log POST request middleware settings
LOG_POST_REQUESTS = True  # Enable/disable the middleware
LOG_POST_REQUESTS_PATHS = ['/api/v1/*']  # Paths to log
LOG_POST_REQUESTS_MAX_BODY_SIZE = 1024  # Maximum body size to log (in bytes)

Then modify the middleware to respect these settings.
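
A minimal sketch of how the middleware might consult these settings is shown below. The LOG_POST_REQUESTS* names are the proposed settings from the snippet above (they do not exist in the codebase yet), and the fnmatch-based path matching is an illustrative assumption:

# Sketch only: assumes the LOG_POST_REQUESTS* settings proposed above are defined.
import fnmatch
import logging

from django.conf import settings

logger = logging.getLogger(__name__)


class LogPostRequestMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        enabled = getattr(settings, 'LOG_POST_REQUESTS', False)
        paths = getattr(settings, 'LOG_POST_REQUESTS_PATHS', ['*'])  # log all paths unless restricted
        max_size = getattr(settings, 'LOG_POST_REQUESTS_MAX_BODY_SIZE', 1024)

        if enabled and request.method in ['POST', 'PUT'] and any(
            fnmatch.fnmatch(request.path, pattern) for pattern in paths
        ):
            try:
                # Truncate to the configured size before decoding; replace undecodable bytes.
                body = request.body[:max_size].decode('utf-8', errors='replace')
                logger.info('%s request to %s: %s', request.method, request.path, body)
            except Exception as e:
                logger.warning('Could not log request body for %s - %s', request.path, e)

        return self.get_response(request)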

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed between commits 4bc00e8 and 6336c74.

📒 Files selected for processing (2)
  • fyle_netsuite_api/logging_middleware.py (2 hunks)
  • fyle_netsuite_api/settings.py (1 hunks)
🧰 Additional context used
🔇 Additional comments (2)
fyle_netsuite_api/logging_middleware.py (2)

61-75: Verify middleware configuration in Django settings

The new LogPostRequestMiddleware has been added to the file, but we need to ensure it's properly configured in the Django settings to be active.

Let's check if the middleware is correctly added to the MIDDLEWARE setting:

#!/bin/bash
# Search for MIDDLEWARE setting in Django settings files
rg -t py 'MIDDLEWARE\s*=' -A 10

Make sure 'fyle_netsuite_api.logging_middleware.LogPostRequestMiddleware' is added to the MIDDLEWARE list in the appropriate settings file(s).
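
For reference, the registration in settings.py would look roughly like this; the neighbouring entries are generic Django defaults shown only for context, not the project's actual list:

# fyle_netsuite_api/settings.py (illustrative excerpt)
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    # ... the project's other middleware ...
    'fyle_netsuite_api.logging_middleware.LogPostRequestMiddleware',
]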


61-75: ⚠️ Potential issue

Consider security and performance implications of logging request bodies

While logging request bodies can be useful for debugging, it raises several concerns:

  1. Security: Logging all POST and PUT requests may inadvertently log sensitive data (e.g., passwords, personal information). Consider implementing a mechanism to filter out sensitive fields before logging.

  2. Performance: Logging large request bodies without size limits could impact performance. Consider implementing a size limit for logged bodies.

  3. Handling different content types: The current implementation assumes JSON data. Consider handling other content types, especially multipart form data for file uploads.

  4. Error handling: The warning message when decoding fails could be more informative for debugging purposes.

Here's a suggested improvement:

import json
import logging

logger = logging.getLogger(__name__)  # module-level logger; the project may prefer its custom get_logger()

class LogPostRequestMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        self.max_body_length = 1000  # Adjust as needed

    def __call__(self, request):
        if request.method in ['POST', 'PUT']:
            content_type = request.content_type.lower()
            if 'application/json' in content_type:
                self._log_json_body(request)
            elif 'multipart/form-data' in content_type:
                self._log_multipart_body(request)
            else:
                logger.info(f"POST/PUT request to {request.path} with content-type: {content_type}")

        response = self.get_response(request)
        return response

    def _log_json_body(self, request):
        try:
            body_unicode = request.body.decode('utf-8')
            request_body = json.loads(body_unicode)
            self._log_request_body(request.path, request_body)
        except (json.JSONDecodeError, UnicodeDecodeError) as e:
            logger.warning(f"Failed to decode JSON request body for {request.path}. Error: {str(e)}")

    def _log_multipart_body(self, request):
        body = {key: request.POST.get(key) for key in request.POST.keys()}
        files = {key: f"{request.FILES[key].name} ({request.FILES[key].size} bytes)" for key in request.FILES.keys()}
        body.update(files)
        self._log_request_body(request.path, body)

    def _log_request_body(self, path, body):
        body_str = str(body)
        if len(body_str) > self.max_body_length:
            body_str = body_str[:self.max_body_length] + "... (truncated)"
        logger.info(f"POST/PUT request to {path}: {body_str}")

This implementation addresses the mentioned issues by:

  1. Filtering sensitive data (you'll need to implement the actual filtering logic; see the sketch after this list).
  2. Limiting the logged body size.
  3. Handling both JSON and multipart form data.
  4. Providing more context in error messages.
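
One possible shape for that filtering step is sketched below; the field names and recursive handling are illustrative assumptions, not taken from the project:

# Illustrative sketch: recursively masks keys that look sensitive before logging.
SENSITIVE_KEYS = {'password', 'secret', 'token', 'api_key', 'access_token'}  # example set only

def mask_sensitive_data(data):
    if isinstance(data, dict):
        return {
            key: '****' if key.lower() in SENSITIVE_KEYS else mask_sensitive_data(value)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [mask_sensitive_data(item) for item in data]
    return data

With this in place, _log_json_body could pass mask_sensitive_data(request_body) to _log_request_body instead of the raw parsed body.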

To ensure we're not logging sensitive data, let's check for any hardcoded sensitive field names in the codebase:

✅ Verification successful

No sensitive data is being logged by LogPostRequestMiddleware

After reviewing the code and searching for common sensitive fields, no instances of password, secret, token, key, api_key, access_token, credit_card, or ssn were found in the logging statements of LogPostRequestMiddleware.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for common sensitive field names in Python files
rg -t py '(password|secret|token|key|api_key|access_token|credit_card|ssn)' --ignore-case

Length of output: 2109


Script:

#!/bin/bash
# Corrected search for common sensitive field names in Python files
rg -t py '(password|secret|token|key|api_key|access_token|credit_card|ssn)' --ignore-case

Length of output: 115838

            try:
                body_unicode = request.body.decode('utf-8')
                request_body = json.loads(body_unicode)
                logger.info("POST request to %s: %s", request.path, request_body)

🛠️ Refactor suggestion

Consider using the custom logger for consistency

The new middleware is using the root logger, while there's a custom get_logger() function available in this file. For consistency and to benefit from the worker ID logging, consider using the custom logger.

Replace the use of logger with the custom logger:

-logger.info("POST request to %s: %s", request.path, request_body)
+get_logger().info("POST request to %s: %s", request.path, request_body)

-logger.warning("Failed to decode POST request body for %s", request.path)
+get_logger().warning("Failed to decode POST request body for %s", request.path)

This change will ensure that the worker ID is included in the log messages from the new middleware, maintaining consistency with other parts of the application.

Also applies to: 72-72
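
The get_logger() helper itself is not shown in this thread; purely to illustrate the kind of worker-aware logger being referred to, a hypothetical version might look like the following (the worker_id field and use of the process id are assumptions, not the project's implementation):

# Hypothetical sketch - not the project's actual get_logger(); shown only to
# illustrate a logger that carries a worker identifier with every record.
import logging
import os


def get_logger():
    logger = logging.getLogger(__name__)
    # LoggerAdapter injects extra context; a log format containing
    # %(worker_id)s would render it in each line.
    return logging.LoggerAdapter(logger, {'worker_id': os.getpid()})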

                request_body = json.loads(body_unicode)
                logger.info("POST request to %s: %s", request.path, request_body)
            except (json.JSONDecodeError, UnicodeDecodeError):
                logger.warning("Failed to decode POST request body for %s", request.path)
Contributor


Let's also add a general exception handler to be safe; just do logger.info there.

@github-actions bot added size/S Small PR and removed size/XS Extra Small PR labels Oct 9, 2024

github-actions bot commented Oct 9, 2024

Tests: 223 | Skipped: 0 💤 | Failures: 0 ❌ | Errors: 0 🔥 | Time: 27.052s ⏱️


@coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed between commits 6336c74 and bfee7cf.

📒 Files selected for processing (1)
  • fyle_netsuite_api/logging_middleware.py (2 hunks)
🧰 Additional context used
🔇 Additional comments (1)
fyle_netsuite_api/logging_middleware.py (1)

61-77: Verify middleware configuration in Django settings

The new LogPostRequestMiddleware has been added to the file, but we need to ensure it's properly configured in the Django settings.

Let's check if the middleware has been added to the MIDDLEWARE setting:

Please make sure the LogPostRequestMiddleware is added to the MIDDLEWARE setting in the appropriate Django settings file if it hasn't been done already.
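
Beyond the settings entry, one quick way to confirm the middleware actually fires is a small test with Django's test client. This is a sketch only, assuming the middleware is enabled in the test settings and using a placeholder URL:

# Sketch: checks that a JSON POST body shows up in the logs.
# Assumes LogPostRequestMiddleware is enabled in the test settings; the URL is a placeholder.
import json

from django.test import Client, TestCase


class LogPostRequestMiddlewareTest(TestCase):
    def test_post_body_is_logged(self):
        client = Client()
        with self.assertLogs(level='INFO') as captured:
            client.post(
                '/placeholder/endpoint/',  # any route works; the middleware logs before URL resolution
                data=json.dumps({'hello': 'world'}),
                content_type='application/json',
            )
        self.assertTrue(any('hello' in line for line in captured.output))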

Comment on lines +61 to +77
class LogPostRequestMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if request.method in ['POST', 'PUT'] :
            try:
                body_unicode = request.body.decode('utf-8')
                request_body = json.loads(body_unicode)
                logger.info("POST request to %s: %s", request.path, request_body)
            except (json.JSONDecodeError, UnicodeDecodeError):
                logger.warning("Failed to decode POST request body for %s", request.path)
            except Exception as e:
                logger.info('Something went wrong when logging post call - %s', e)

        response = self.get_response(request)
        return response

💡 Codebase verification

🛠️ Refactor suggestion

Implement Data Masking or Filtering for Sensitive Information in Logs

The current implementation logs all POST and PUT request bodies without masking or filtering sensitive information. This can lead to exposure of sensitive data such as PII.

Recommended Actions:

  1. Implement Data Masking:
    • Use the custom get_logger() function to ensure consistency.
    • Introduce a mechanism to identify and mask sensitive fields before logging. For example:
      class LogPostRequestMiddleware:
          def __init__(self, get_response):
              self.get_response = get_response
              self.logger = get_logger()
      
          def __call__(self, request):
              if request.method in ['POST', 'PUT']:
                  try:
                      body_unicode = request.body.decode('utf-8')
                      request_body = json.loads(body_unicode)
                      masked_body = mask_sensitive_data(request_body)
                      self.logger.info("POST request to %s: %s", request.path, masked_body)
                  except (json.JSONDecodeError, UnicodeDecodeError):
                      self.logger.warning("Failed to decode POST request body for %s", request.path)
                  except Exception as e:
                      self.logger.error('Unexpected error when logging post call - %s', str(e), exc_info=True)
      
              response = self.get_response(request)
              return response
      
      def mask_sensitive_data(data):
          sensitive_fields = ['password', 'ssn', 'credit_card']
          for field in sensitive_fields:
              if field in data:
                  data[field] = '****'
          return data
  2. Review Log Levels:
    • Change logger.info to logger.error for unexpected exceptions to improve issue visibility.
  3. Audit Logging Practices:
    • Regularly review logs to ensure no sensitive information is inadvertently captured.
🔗 Analysis chain

Implement custom logger and review exception handling

  1. As suggested in the previous review, please use the custom get_logger() function instead of the root logger for consistency and to benefit from worker ID logging.

  2. The general exception handler has been added as requested, but consider using logger.error instead of logger.info for unexpected exceptions. This will make it easier to identify and troubleshoot issues.

  3. Be cautious about logging all POST and PUT request bodies, as they may contain sensitive information. Consider implementing a mechanism to filter out or mask sensitive data before logging.

Here's a suggested implementation addressing points 1 and 2:

class LogPostRequestMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        self.logger = get_logger()

    def __call__(self, request):
        if request.method in ['POST', 'PUT']:
            try:
                body_unicode = request.body.decode('utf-8')
                request_body = json.loads(body_unicode)
                self.logger.info("POST request to %s: %s", request.path, request_body)
            except (json.JSONDecodeError, UnicodeDecodeError):
                self.logger.warning("Failed to decode POST request body for %s", request.path)
            except Exception as e:
                self.logger.error('Unexpected error when logging post call - %s', str(e), exc_info=True)

        response = self.get_response(request)
        return response

To address point 3, let's check if there are any existing mechanisms for handling sensitive data in the project:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for existing data masking or filtering mechanisms
rg -i "mask|filter|sensitive|pii" --type python

Length of output: 152


Script:

#!/bin/bash
# Search for existing data masking or filtering mechanisms
rg -i "mask|filter|sensitive|pii" --type py

Length of output: 101832


Tests: 223 | Skipped: 0 💤 | Failures: 0 ❌ | Errors: 0 🔥 | Time: 24.730s ⏱️

@Ashutosh619-sudo merged commit 5e23d08 into master Oct 11, 2024
3 checks passed