Support all tar archives in create and extract #166

Merged on Jan 9, 2025 (2 commits)

@hagenw hagenw commented Jan 7, 2025

This adds support for creating and extracting all tar archives, not only tar.gz files.

In addition, audeer.create_archive() and audeer.extract_archive() were slightly refactored to improve the code quality.

Updated docstring of audeer.create_archive():

[image: rendered docstring]

Updated docstring of audeer.extract_archive():

[image: rendered docstring]

Updated docstring of audeer.extract_archives():

[image: rendered docstring]

Summary by Sourcery

Extend archive creation and extraction to support all tar formats, including tar, tar.gz, tar.bz2, and tar.xz.

New Features:

  • Support for creating and extracting all tar archive formats, including .tar, .tar.gz, .tar.bz2, and .tar.xz, has been added.

Tests:

  • Added tests for the new tar archive formats
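
The tests themselves are not shown in this thread. As a rough illustration of what a round-trip check over the four formats could look like, here is a minimal sketch that uses only the standard library `tarfile` module. The names `FORMATS`, `EXTRACT_KWARGS`, and `round_trip` are hypothetical and are not taken from tests/test_io.py.

```python
import os
import tarfile
import tempfile

# Extension -> tarfile write mode (standard library modes).
# Hypothetical table; the project's own test parametrization is not shown here.
FORMATS = {"tar": "w", "tar.gz": "w:gz", "tar.bz2": "w:bz2", "tar.xz": "w:xz"}

# Use the safer "data" extraction filter where the running Python supports it.
EXTRACT_KWARGS = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}


def round_trip(extension, mode, content="hello"):
    """Create an archive in the given format, extract it, return the file text."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "file.txt")
        with open(source, "w") as fp:
            fp.write(content)
        archive = os.path.join(tmp, f"archive.{extension}")
        with tarfile.open(archive, mode) as tf:
            tf.add(source, arcname="file.txt")
        destination = os.path.join(tmp, "extracted")
        # Mode "r" lets tarfile detect the compression on its own.
        with tarfile.open(archive, "r") as tf:
            tf.extractall(destination, **EXTRACT_KWARGS)
        with open(os.path.join(destination, "file.txt")) as fp:
            return fp.read()


for extension, mode in FORMATS.items():
    assert round_trip(extension, mode) == "hello"
```

In the project the same loop would presumably go through audeer.create_archive() and audeer.extract_archive() rather than calling `tarfile` directly.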

sourcery-ai bot commented Jan 7, 2025

Reviewer's Guide by Sourcery

This pull request extends the support for TAR archives in the create_archive and extract_archive functions to include all supported TAR formats, not just TAR.GZ. It also includes minor refactoring to improve code quality and updates the docstrings of related functions.

Sequence diagram for improved archive creation process

sequenceDiagram
    participant Client
    participant CreateArchive
    participant ArchiveHandlers
    participant FileSystem

    Client->>CreateArchive: create_archive(root, files, archive)
    CreateArchive->>ArchiveHandlers: Get handler for extension
    alt zip format
        ArchiveHandlers-->>CreateArchive: Return ZIP handler
        CreateArchive->>FileSystem: Write files using ZIP_DEFLATED
    else tar format
        ArchiveHandlers-->>CreateArchive: Return appropriate TAR handler
        Note right of ArchiveHandlers: Supports tar, tar.gz, tar.bz2, tar.xz
        CreateArchive->>FileSystem: Write files using selected compression
    end
    CreateArchive-->>Client: Archive created
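
The creation flow in the diagram above can be sketched in Python roughly as follows. This is not the code from audeer/core/io.py, which is not shown in this thread; `TAR_MODES` and `create_archive_sketch` are hypothetical names, and only standard library `zipfile` and `tarfile` calls are used.

```python
import os
import tarfile
import zipfile

# Hypothetical mapping from file extension to tarfile write mode.
TAR_MODES = {
    ".tar": "w",
    ".tar.gz": "w:gz",
    ".tar.bz2": "w:bz2",
    ".tar.xz": "w:xz",
}


def create_archive_sketch(root, files, archive):
    """Write ``files`` (paths relative to ``root``) into ``archive``."""
    if archive.endswith(".zip"):
        # ZIP branch: compress members with ZIP_DEFLATED.
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for file in files:
                zf.write(os.path.join(root, file), arcname=file)
        return
    # TAR branch: pick the write mode that matches the extension.
    for extension, mode in TAR_MODES.items():
        if archive.endswith(extension):
            with tarfile.open(archive, mode) as tf:
                for file in files:
                    tf.add(os.path.join(root, file), arcname=file)
            return
    raise RuntimeError(f"Unsupported archive extension: {archive}")
```

Keeping the extension-to-mode table in one dictionary means a further format would only need one more entry, which seems to be the point of the refactoring described in this guide.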

Sequence diagram for enhanced archive extraction process

sequenceDiagram
    participant Client
    participant ExtractArchive
    participant Archive
    participant FileSystem

    Client->>ExtractArchive: extract_archive(archive, destination)
    alt zip format
        ExtractArchive->>Archive: Open as ZIP
        Archive->>FileSystem: Extract members
    else tar format
        ExtractArchive->>Archive: Check if tarfile
        Archive->>FileSystem: Extract members with appropriate handler
    end
    ExtractArchive-->>Client: Return extracted files
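
A comparable sketch for the extraction flow, again assuming nothing about the actual implementation beyond what the diagram states. `extract_archive_sketch` is a hypothetical name; `tarfile.is_tarfile()` and `tarfile.open()` in mode "r" are standard library calls that detect the compression themselves.

```python
import tarfile
import zipfile


def extract_archive_sketch(archive, destination):
    """Extract ``archive`` into ``destination`` and return the member files."""
    if archive.endswith(".zip"):
        with zipfile.ZipFile(archive, "r") as zf:
            # Skip directory entries, which end with a slash.
            members = [name for name in zf.namelist() if not name.endswith("/")]
            zf.extractall(destination)
    elif tarfile.is_tarfile(archive):
        # Mode "r" handles plain, gzip, bzip2, and xz compressed tar files.
        with tarfile.open(archive, "r") as tf:
            members = [m.name for m in tf.getmembers() if m.isfile()]
            # Use the safer "data" filter where the running Python supports it.
            if hasattr(tarfile, "data_filter"):
                tf.extractall(destination, filter="data")
            else:
                tf.extractall(destination)
    else:
        raise RuntimeError(f"Not a ZIP or TAR archive: {archive}")
    return members
```

Checking the file content with `tarfile.is_tarfile()` instead of matching on the extension is what allows one branch to cover every tar variant.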

Class diagram for archive handlers structure

classDiagram
    class ArchiveHandlers {
        +Dict handlers
        +handle_zip()
        +handle_tar()
        +handle_tar_gz()
        +handle_tar_bz2()
        +handle_tar_xz()
    }

    class CreateArchive {
        +create_archive(root, files, archive)
    }

    class ExtractArchive {
        +extract_archive(archive, destination)
        +extract_zip(archive)
        +extract_tar(archive)
    }

    CreateArchive --> ArchiveHandlers
    ExtractArchive --> ArchiveHandlers
    note for ArchiveHandlers "New unified handler system"
    note for ExtractArchive "Refactored with separate extraction methods"

File-Level Changes

Change: Added support for all TAR archive formats
Details:
  • Modified create_archive to handle '.tar', '.tar.gz', '.tar.bz2', and '.tar.xz' formats.
  • Updated extract_archive to handle all TAR formats using tarfile.is_tarfile().
  • Added tests for the new TAR formats in test_io.py.
  • Updated docstrings to reflect the broader TAR support.
Files: audeer/core/io.py, tests/test_io.py

Change: Refactored archive handling logic
Details:
  • Simplified the archive creation and extraction logic by using a dictionary to map file extensions to archive handlers.
  • Improved code readability and maintainability.
Files: audeer/core/io.py

Change: Updated docstrings
Details:
  • Clarified the supported archive formats in the docstrings of create_archive, extract_archive, and extract_archives.
  • Included examples of usage for the different archive types.
Files: audeer/core/io.py

@sourcery-ai sourcery-ai bot left a comment

Hey @hagenw - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟡 Testing: 1 issue found
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

tests/test_io.py: review thread resolved

codecov bot commented Jan 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (4592c91) to head (9fddb1d).
Report is 1 commit behind head on main.

Additional details and impacted files
Files with missing lines Coverage Δ
audeer/core/io.py 100.0% <100.0%> (ø)

@hagenw hagenw requested a review from ChristianGeng January 7, 2025 11:58

@ChristianGeng ChristianGeng left a comment

Individual issues were discussed in separate threads.
As these are resolved, I think it is safe to proceed and approve.

@hagenw hagenw merged commit 1ce71e4 into main Jan 9, 2025
22 checks passed
@hagenw hagenw deleted the support-all-tar-archives branch January 9, 2025 14:02