Add detection for null/empty files #425

Open
atruskie opened this issue Oct 8, 2024 · 2 comments
Labels
area: fixes enhancement New feature or request good first issue Good for newcomers

Comments

@atruskie
Member

atruskie commented Oct 8, 2024

Sometimes sensors produce files that have space allocated on disk but are otherwise empty. We should add a detector for those.

This problem seems to be distinct from OE004.
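
As a minimal sketch of what such a detector could look like, assuming "null" means every byte in the file is `0x00` and "empty" means a zero-length file: the function name `classify_file` and its return labels are hypothetical and are not part of this project's API. It scans the whole file in chunks, so a file with a valid header followed by nothing but zeros is still reported as containing data.

```python
import os

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time so large recordings are not loaded whole


def classify_file(path, chunk_size=CHUNK_SIZE):
    """Classify a file as 'empty', 'null', or 'data'.

    'empty' -> the file is zero bytes long
    'null'  -> the file has a non-zero size but every byte is 0x00
    'data'  -> at least one non-zero byte was found
    (Hypothetical labels chosen for illustration only.)
    """
    if os.path.getsize(path) == 0:
        return "empty"

    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            # bytes.count(0) counts the 0x00 bytes in the chunk
            if chunk.count(0) != len(chunk):
                return "data"  # stop early at the first non-zero byte

    return "null"
```

Stopping at the first non-zero byte keeps the check cheap for healthy files; only files that really are all zeros get read in full.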

corrupt_to_investigate.zip

@atruskie atruskie added enhancement New feature or request good first issue Good for newcomers area: fixes labels Oct 8, 2024
@Adityapandeya

import math
from collections import Counter

def file_is_empty(filepath):
    """Return True if the file has no bytes, or if its first 1024 bytes are all identical."""
    try:
        with open(filepath, 'rb') as file:
            data = file.read(1024)  # Sample only the first 1024 bytes
    except OSError as e:
        print(f"Error reading {filepath}: {e}")
        return False
    # No bytes at all means a zero-length file; a single distinct byte value
    # (for example all 0x00) means the sample carries no information.
    return len(set(data)) <= 1

def calculate_entropy(filepath):
    """Return the Shannon entropy in bits per byte (0 to 8), or None on a read error."""
    counts = Counter()
    try:
        with open(filepath, 'rb') as file:
            # Count byte values in 1 MiB chunks so large files are not loaded whole
            while chunk := file.read(1 << 20):
                counts.update(chunk)
    except OSError as e:
        print(f"Error reading {filepath}: {e}")
        return None
    total = sum(counts.values())
    if total == 0:
        return 0  # Zero-length file
    entropy = 0
    for x in counts.values():
        p_x = x / total
        entropy -= p_x * math.log2(p_x)
    return entropy

# Example usage
path = '/path/to/suspected/file'
print("Empty:", file_is_empty(path))
print("Entropy:", calculate_entropy(path))

@atruskie
Member Author

atruskie commented Oct 8, 2024

@Adityapandeya I'm not sure what the point of your comment was, but without further context it was not helpful.
