
fix: add unique constraint in task_logs for expense_group, add interval for next run #690

Merged
merged 1 commit on Nov 15, 2024

Conversation

Hrishabh17
Contributor

@Hrishabh17 Hrishabh17 commented Nov 14, 2024

Description

add unique constraint in task_logs for expense_group, add interval for next run

Clickup

http://app.clickup.com/

Summary by CodeRabbit

  • New Features

    • Updated the expense_group field in the TaskLog model to ensure each log can only reference a unique expense group.
    • Enhanced scheduling logic to accurately calculate the next run time based on specified intervals.
  • Bug Fixes

    • Improved the handling of nullable fields in the task log model.
  • Documentation

    • Added help text for the expense_group field to clarify its purpose.

@github-actions github-actions bot added the size/S Small PR label Nov 14, 2024

Tests: 261 · Skipped: 0 💤 · Failures: 0 ❌ · Errors: 0 🔥 · Time: 55.855s ⏱️


codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.47%. Comparing base (94f4a67) to head (75cbaaf).
Report is 22 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #690      +/-   ##
==========================================
- Coverage   94.70%   94.47%   -0.24%     
==========================================
  Files          61       61              
  Lines        4801     4924     +123     
==========================================
+ Hits         4547     4652     +105     
- Misses        254      272      +18     
Files with missing lines    Coverage             Δ
apps/tasks/models.py        100.00% <100.00%>    (ø)
apps/workspaces/tasks.py    94.61% <100.00%>     (+0.17%) ⬆️

... and 2 files with indirect coverage changes


coderabbitai bot commented Nov 14, 2024

Walkthrough

This pull request includes modifications to the expense_group field in the tasklog model, enforcing a unique constraint and allowing null values. Additionally, it updates the schedule_sync function to calculate the next_run time based on a specified interval in hours. The import statements in the relevant files have also been adjusted to accommodate these changes.

Changes

  • apps/tasks/migrations/0012_alter_tasklog_expense_group.py: Modified expense_group field in tasklog model: added help_text, set null=True, unique=True, and on_delete=PROTECT.
  • apps/tasks/models.py: Updated expense_group field in TaskLog model to enforce unique constraint.
  • apps/workspaces/tasks.py: Modified schedule_sync function to calculate next_run using timedelta; updated import statement to include timedelta.
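
The scheduling change amounts to offsetting the first run by the configured interval instead of firing immediately. A minimal sketch of that calculation (the function name is illustrative, not from the PR):

```python
from datetime import datetime, timedelta

def next_run_time(hours: int, now: datetime) -> datetime:
    """The first run is pushed one full interval into the future,
    rather than firing as soon as the schedule is saved."""
    return now + timedelta(hours=hours)

# A schedule saved at noon with a 6-hour interval first fires at 18:00.
assert next_run_time(6, datetime(2024, 11, 14, 12, 0)) == datetime(2024, 11, 14, 18, 0)
```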

Possibly related PRs

  • fix duplicated expense_groups #653: The changes in this PR involve modifications to the expense_group handling in the apps/fyle/models.py file, which is directly related to the expense_group field in the tasklog model being modified in the main PR.

Suggested labels

deploy

Suggested reviewers

  • ashwin1111





@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
apps/tasks/models.py (1)

23-23: A unique ForeignKey already carries an index; consider OneToOneField instead.

Setting unique=True creates a unique index on the column (and ForeignKey columns are indexed by default), so adding db_index=True would be redundant. Django raises warning fields.W342 for a unique ForeignKey and recommends expressing the relationship as a OneToOneField:

-    expense_group = models.ForeignKey(ExpenseGroup, on_delete=models.PROTECT, null=True, help_text='Reference to Expense group', unique=True)
+    expense_group = models.OneToOneField(ExpenseGroup, on_delete=models.PROTECT, null=True, help_text='Reference to Expense group')
apps/workspaces/tasks.py (1)

68-76: LGTM! Consider timezone handling.

The changes correctly implement future scheduling by adding the interval to the current time. This ensures that tasks are scheduled for future execution rather than immediate runs.

Consider using timezone.now() instead of datetime.now() to ensure consistent timezone handling across the application:

-                'next_run': datetime.now() + timedelta(hours=hours),
+                'next_run': timezone.now() + timedelta(hours=hours),

This would require adding:

from django.utils import timezone
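
The naive-vs-aware distinction behind this suggestion can be shown with the standard library alone, no Django required:

```python
from datetime import datetime, timedelta, timezone

# datetime.now() is naive (no tzinfo); comparing a naive timestamp with
# an aware one raises TypeError, which is why mixing the two across a
# codebase causes subtle scheduling bugs.
naive = datetime.now()
aware = datetime.now(timezone.utc)

assert naive.tzinfo is None
assert aware.tzinfo is not None

# The interval arithmetic itself works the same on either kind:
next_run = aware + timedelta(hours=6)
assert next_run - aware == timedelta(hours=6)
```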
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 331ea54 and 75cbaaf.

📒 Files selected for processing (3)
  • apps/tasks/migrations/0012_alter_tasklog_expense_group.py (1 hunks)
  • apps/tasks/models.py (1 hunks)
  • apps/workspaces/tasks.py (2 hunks)
🔇 Additional comments (3)
apps/tasks/migrations/0012_alter_tasklog_expense_group.py (2)

9-12: LGTM: Dependencies are correctly specified

The migration dependencies are properly ordered, ensuring this runs after both the required fyle and tasks migrations.


15-19: Verify existing data before applying the unique constraint

The addition of the unique constraint could fail if duplicate expense_group references exist in the current data. Additionally, making the field nullable (null=True) is a significant change that requires careful consideration.

Run this script to check for potential migration issues:

Recommendations:

  1. Before deploying, create a data migration to identify and handle any duplicate entries
  2. Update application logic to handle nullable expense_group if this wasn't allowed before
  3. Consider adding database-level validation for when null values are acceptable

The use of PROTECT for on_delete is a good choice as it prevents accidental deletion of referenced expense groups. However, ensure that your application includes proper error handling for cases where deletion is attempted.
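
Recommendation 1 above needs a policy for which duplicate survives; keeping the newest log per expense group is a common choice. The selection logic, sketched framework-free (the real fix would be a Django data migration; all names here are illustrative):

```python
def rows_to_delete(task_logs):
    """task_logs: (id, expense_group_id) pairs, ordered oldest-first.
    Return the ids of every log except the newest one per expense group.
    Rows with a NULL expense group are left alone: a nullable unique
    column permits any number of NULLs."""
    newest = {}
    for log_id, group_id in task_logs:
        if group_id is not None:
            newest[group_id] = log_id  # later (newer) rows overwrite earlier ones
    keep = set(newest.values())
    return [log_id for log_id, group_id in task_logs
            if group_id is not None and log_id not in keep]
```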

apps/tasks/models.py (1)

23-23: Verify existing data before applying unique constraint.

The addition of a unique constraint to the expense_group field enforces a 1:1 relationship between TaskLog and ExpenseGroup. While this aligns with the PR objectives, we should ensure there are no existing duplicate references that could cause migration failures.

✅ Verification successful

Let me gather more information about the model and its relationships to better understand the context.


Let me try a different approach to gather information about the model structure.


Based on the extensive codebase analysis, I can now provide a final response regarding the unique constraint addition to the expense_group field in TaskLog model.

The unique constraint on expense_group field is appropriate and safe to add

The unique constraint ensures a one-to-one relationship between TaskLog and ExpenseGroup, which aligns with the existing architecture where:

  1. Each task log is created for tracking a single expense group's export operation
  2. The expense_group field is already used as a unique lookup key across the codebase
  3. The field is nullable, so multiple NULL values are still allowed
  4. All task creation flows already handle the expense_group assignment uniquely
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for potential duplicate expense_group references that might break the migration
echo "Checking for duplicate expense_group references in task_logs..."
cat << 'EOF' > check_duplicates.sql
SELECT expense_group_id, COUNT(*) as count
FROM task_logs
WHERE expense_group_id IS NOT NULL
GROUP BY expense_group_id
HAVING COUNT(*) > 1;
EOF

# Output the SQL to be run manually before applying the migration
cat check_duplicates.sql

Length of output: 309


Script:

#!/bin/bash
# Check the TaskLog model definition and its relationships
ast-grep --pattern 'class TaskLog($$$):'

# Check if there are any existing migrations related to task_logs and expense_group
fd migration apps/tasks/migrations/ --exec cat {}

Length of output: 42


Script:

#!/bin/bash
# Search for TaskLog model definition using ripgrep for better context
rg "class TaskLog" -A 10 --type py

# Look for any existing usages of expense_group in the codebase
rg "expense_group" --type py -C 2

Length of output: 337669


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
apps/tasks/models.py (1)

23-23: Document the null=True implication with unique constraint

The field allows null values with unique constraint, which means multiple TaskLogs can have null expense_group.

Consider adding a comment to clarify this behavior:

-    expense_group = models.ForeignKey(ExpenseGroup, on_delete=models.PROTECT, null=True, help_text='Reference to Expense group', unique=True)
+    expense_group = models.ForeignKey(
+        ExpenseGroup,
+        on_delete=models.PROTECT,
+        null=True,
+        unique=True,
+        help_text='Reference to Expense group. Unique constraint allows only one TaskLog per ExpenseGroup, though multiple TaskLogs can have null expense_group'
+    )
apps/workspaces/tasks.py (2)

68-76: LGTM with suggestions for improvement

The implementation correctly sets up the schedule with the specified interval. However, consider these improvements:

  1. Keep Schedule.MINUTES, but document why: Django-Q's hourly schedule type (Schedule.HOURLY) fires every hour and takes no multiplier, so an N-hour interval must be expressed in minutes:
-                'minutes': hours * 60,
+                'minutes': hours * 60,  # Django-Q has no N-hourly type; express the interval in minutes
  2. To handle timezone-awareness consistently, prefer timezone.now() over datetime.now():
-                'next_run': datetime.now() + timedelta(hours=hours),
+                'next_run': timezone.now() + timedelta(hours=hours),

68-76: Consider architectural improvements

The schedule_sync function currently handles multiple responsibilities (schedule management and email notification). Consider:

  1. Splitting into separate functions for better modularity:

    • create_or_update_schedule
    • setup_email_notification
  2. Adding error handling for schedule creation/update

  3. Documenting the function's side effects in the docstring:

def schedule_sync(workspace_id: int, schedule_enabled: bool, hours: int, email_added: List, emails_selected: List):
    """
    Set up workspace synchronization schedule and email notifications.
    
    Args:
        workspace_id: Workspace identifier
        schedule_enabled: Whether to enable scheduling
        hours: Interval in hours between syncs
        email_added: New email addresses to add
        emails_selected: List of selected email addresses
    
    Side effects:
        - Creates/updates WorkspaceSchedule
        - Creates/updates Schedule in Django-Q
        - Sets up email notifications
    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 331ea54 and 75cbaaf.

📒 Files selected for processing (3)
  • apps/tasks/migrations/0012_alter_tasklog_expense_group.py (1 hunks)
  • apps/tasks/models.py (1 hunks)
  • apps/workspaces/tasks.py (2 hunks)
🔇 Additional comments (5)
apps/tasks/migrations/0012_alter_tasklog_expense_group.py (2)

14-20: LGTM! Good design choices for the field constraints.

The implementation looks solid with good design choices:

  • Using PROTECT for on_delete prevents accidental data loss
  • Allowing null=True with unique=True is appropriate for optional but unique references
  • Clear help_text improves field documentation

15-19: Verify existing data before applying unique constraint

The addition of a unique constraint to an existing field could fail if duplicate expense_group references already exist in the tasklog table.

Run this script to check for potential conflicts:

Consider adding a data migration step if duplicates are found.

✅ Verification successful

Let me analyze the search results to verify the concern about duplicate expense_group values:

  1. Looking at the code patterns:
  • TaskLog objects are consistently queried with filter(expense_group_id=...) or get(expense_group=...)
  • No instances found of multiple TaskLogs being created for the same expense_group
  • The code shows a pattern of getting or creating single TaskLog per expense_group using get_or_create()
  • The test cases consistently assert single TaskLog per expense_group
  2. Key findings:
  • The code in apps/quickbooks_online/queue.py uses get_or_create() which ensures uniqueness
  • The test cases verify single TaskLog per expense_group through .first() queries
  • No evidence of code creating multiple TaskLogs for same expense_group

No data migration needed for unique constraint

The codebase already maintains a one-to-one relationship between TaskLog and expense_group through application logic. Adding a unique constraint simply formalizes the existing data pattern.

  • The get_or_create() pattern in the code ensures no duplicate TaskLogs exist
  • Test cases verify single TaskLog per expense_group relationship
  • No data migration needed as the uniqueness is already enforced at application level
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for duplicate expense_group values in tasklog table
# Expected results: No duplicate expense_group values should exist

# Search for any SQL queries that might indicate existing duplicates
rg -g "*.py" "TaskLog.*expense_group|tasklog.*expense_group" -A 5

# Look for any test data setup that might create duplicates
rg -g "test_*.py" "TaskLog.*expense_group|tasklog.*expense_group" -A 5

Length of output: 41188

apps/tasks/models.py (2)

23-23: Validate business requirements for unique constraint

The unique constraint implies that only one TaskLog can be associated with an ExpenseGroup. However, the model has different types of tasks (CREATE_BILL, CREATE_CHECK, etc.) and multiple foreign keys (bill, cheque, journal_entry, etc.).

#!/bin/bash
# Description: Analyze TaskLog usage patterns
# Expected results: Understanding of valid task type combinations

# Search for task type definitions and usage
rg "type.*CREATE_" apps/

# Look for patterns where multiple logs might be needed
rg -A 5 "TaskLog.objects.create" apps/

Consider if this design meets all business cases:

  1. Can an ExpenseGroup legitimately need multiple task logs for different operations?
  2. Should the uniqueness be scoped to combination of fields (e.g., expense_group + type)?
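
If the answer to question 2 is yes, the constraint could be widened to the field combination at the database level. Constraint and column names below are illustrative, in the same spirit as the check_duplicates.sql script elsewhere in this review:

```sql
-- Hypothetical: allow one TaskLog per (expense group, operation type)
-- instead of one per expense group overall.
ALTER TABLE task_logs
    ADD CONSTRAINT task_logs_expense_group_type_uniq
    UNIQUE (expense_group_id, type);
```

In Django this would map to a UniqueConstraint entry in the model's Meta.constraints rather than unique=True on the field.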

23-23: ⚠️ Potential issue

Verify existing data before applying unique constraint

Adding a unique constraint to expense_group is a breaking change that requires careful consideration:

  1. This change enforces a one-to-one relationship between TaskLog and ExpenseGroup
  2. The migration will fail if duplicate expense_group entries exist in the database

Please ensure:

  1. Verify there are no duplicate expense_group values in production
  2. Plan a migration strategy if duplicates exist
  3. Update any code that might create multiple logs for the same expense group
✅ Verification successful

Based on the analysis of the shell script results, I can now provide a final response:

Unique constraint on expense_group requires careful migration

The codebase analysis reveals:

  1. TaskLog is used extensively across the application for tracking various export operations (bills, checks, expenses, etc.)
  2. Each TaskLog typically represents a single operation for an expense group
  3. The migration history shows this is a new constraint being added (migration 0012)
  4. The code patterns show that TaskLogs are created one-to-one with expense groups in the task execution flows

The unique constraint appears to be formalizing an existing implicit one-to-one relationship between TaskLog and ExpenseGroup that's already being followed in the codebase. This is evidenced by:

  • Task creation patterns always create one TaskLog per expense group
  • No instances found of multiple TaskLogs being created for the same expense group
  • Existing task scheduling and execution flows maintain this one-to-one relationship
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for duplicate expense_group entries in task_logs table
# Expected results: No duplicate expense_group values to ensure safe migration

# Search for any Django model queries or tests that create multiple TaskLogs for same ExpenseGroup
rg -l "TaskLog.*create.*expense_group" 

# Search for any existing migrations to understand the history of task_logs table
fd -g "*.py" -p "migrations" apps/tasks/

Length of output: 667


Script:

#!/bin/bash
# Let's try a more comprehensive search for TaskLog usage patterns

# Search for TaskLog creation patterns
rg -A 5 "TaskLog\.objects\.create" 

# Search for any code that might create TaskLogs
rg -A 5 "new_task_log|task_log.*save|TaskLog\(" 

# Look for test files that might create TaskLogs
fd -g "*test*.py" | xargs rg -l "TaskLog"

# Check migrations for expense_group field history
fd -g "*_*.py" apps/tasks/migrations/ | xargs cat

Length of output: 35459

apps/workspaces/tasks.py (1)

Line range hint 99-106: Verify handling of TaskLog unique constraint

The run_sync_schedule function creates TaskLog entries. With the new unique constraint on expense_group, ensure that potential constraint violations are handled appropriately.

✅ Verification successful

Based on the model definition and usage patterns found in the codebase, I can now provide a final response:

TaskLog unique constraint is properly handled

The TaskLog model has a unique constraint on the expense_group field, and the codebase handles this appropriately through:

  1. Using update_or_create() for most TaskLog creations, which handles uniqueness conflicts
  2. Using get_or_create() in queue operations, preventing duplicate entries
  3. Proper transaction handling in task operations

The unique constraint won't cause issues because:

  • Each expense group can only have one task log entry
  • Task creation methods use atomic operations that handle uniqueness
  • The run_sync_schedule function uses update_or_create() which safely handles the constraint
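
The update_or_create() behaviour this verification leans on can be mimicked with a plain dict to see why the constraint holds. A sketch, not Django's implementation:

```python
def update_or_create(store: dict, key, defaults: dict):
    """Dict-backed analogue of Django's update_or_create():
    at most one entry per key; on conflict the existing entry is
    updated in place instead of a duplicate being inserted."""
    created = key not in store
    if created:
        store[key] = dict(defaults)
    else:
        store[key].update(defaults)
    return store[key], created
```

Calling it twice with the same key updates the existing entry rather than inserting a second one, which is why these flows never trip the unique constraint.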
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check TaskLog model and its usage

# Check TaskLog model definition for unique constraint
rg -A 5 "class TaskLog"

# Check all TaskLog creation/update patterns
ast-grep --pattern 'TaskLog.objects.$_($$)'

Length of output: 586


Script:

#!/bin/bash
# Let's check the TaskLog model definition more thoroughly and look for unique constraints
cat apps/tasks/models.py

# Also search for any TaskLog creation patterns using ripgrep
rg "TaskLog.objects" -B 2 -A 2

Length of output: 56514


Script:

#!/bin/bash
# Let's check how TaskLog entries are created in run_sync_schedule
rg -B 5 -A 5 "run_sync_schedule"

# Also check for any exception handling around TaskLog creation
rg -B 3 -A 3 "try:.*TaskLog"

Length of output: 6068

@Hrishabh17 Hrishabh17 merged commit d1d12fd into master Nov 15, 2024
6 of 7 checks passed
Hrishabh17 added a commit that referenced this pull request Nov 19, 2024
Hrishabh17 added a commit that referenced this pull request Nov 19, 2024
Hrishabh17 added a commit that referenced this pull request Nov 19, 2024
…oup, add interval for next run (#690)" (#693)

* Revert "fix: add unique constraint in task_logs for expense_group, add interval for next run (#690)"

This reverts commit d1d12fd.

* add the time interval
Hrishabh17 added a commit that referenced this pull request Jan 3, 2025
Hrishabh17 added a commit that referenced this pull request Jan 3, 2025
* Revert "fix: add unique constraint in task_logs for expense_group, add interval for next run (#690)"

This reverts commit d1d12fd.

* fix: add unique constraint in task_log
Hrishabh17 added a commit that referenced this pull request Jan 3, 2025
* Revert "fix: add unique constraint in task_logs for expense_group, add interval for next run (#690)"

This reverts commit d1d12fd.

* fix: add unique constraint in task_log