random access uncompressed unencrypted ZipExtFile #128131

Open
vvb2060 opened this issue Dec 20, 2024 · 3 comments
Labels
stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@vvb2060

vvb2060 commented Dec 20, 2024

Feature or enhancement

Proposal:

        # Fast seek uncompressed unencrypted file
        elif self._compress_type == ZIP_STORED and self._decrypter is None and read_offset > 0:
            # disable CRC checking after first seeking - it would be invalid
            self._expected_crc = None
            # seek actual file taking already buffered data into account
            read_offset -= len(self._readbuffer) - self._offset
            self._fileobj.seek(read_offset, os.SEEK_CUR)
            self._left -= read_offset
            read_offset = 0
            # flush read buffer
            self._readbuffer = b''
            self._offset = 0
        elif read_offset < 0:
            # Position is before the current position. Reset the ZipExtFile

When read_offset < 0, the ZipExtFile is reset and read again from the beginning. I think the read_offset > 0 condition on the fast-seek branch is unnecessary.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

@danifus
Contributor

danifus commented Dec 22, 2024

There is a bug in zipfile._SharedFile.seek() that affects concurrent reads of uncompressed, unencrypted files. Its fix should be merged along with this change: #127856

@jaraco
Member

jaraco commented Dec 26, 2024

Can you revise the original post to clarify what’s going on here? I read it, but I don’t understand: what is wrong with the current behavior? Under what conditions do the problems occur and thus who is affected? What do you expect instead? Just articulate as much as you can so it’s clear from the problem description what the proposed improvement is.

@vvb2060
Author

vvb2060 commented Dec 31, 2024

#27737 does not really support random access. Currently, zipfile can only read an uncompressed, unencrypted ZipExtFile sequentially (forward seeks only).
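
For anyone who wants to observe this, here is a minimal sketch that may help. It builds a ZIP_STORED archive in memory, seeks forward and then backward inside one member, and counts how many bytes zipfile pulls from the underlying file for each step. `CountingBytesIO`, `build_stored_archive`, and `demo` are hypothetical helpers written for this illustration; they are not part of zipfile. The byte counts depend on the Python version, so none are shown as expected output.

```python
import io
import zipfile


class CountingBytesIO(io.BytesIO):
    """BytesIO that counts bytes returned by read() (illustrative helper)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.bytes_read = 0

    def read(self, size=-1):
        data = super().read(size)
        self.bytes_read += len(data)
        return data


def build_stored_archive(payload, name="data.bin"):
    """Write one uncompressed, unencrypted member into an in-memory archive."""
    raw = CountingBytesIO()
    with zipfile.ZipFile(raw, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    raw.seek(0)
    return raw


def demo():
    payload = bytes(range(256)) * 4096  # 1 MiB of known content
    raw = build_stored_archive(payload)
    with zipfile.ZipFile(raw) as zf, zf.open("data.bin") as member:
        # Forward seek close to the end of the member, then a small read.
        raw.bytes_read = 0
        member.seek(len(payload) - 16)
        tail = member.read(16)
        forward_cost = raw.bytes_read

        # Backward seek: the case this issue is about.
        raw.bytes_read = 0
        member.seek(1000)
        chunk = member.read(16)
        backward_cost = raw.bytes_read
    return payload, tail, chunk, forward_cost, backward_cost


if __name__ == "__main__":
    payload, tail, chunk, forward_cost, backward_cost = demo()
    print("forward seek + read pulled", forward_cost, "bytes from the file")
    print("backward seek + read pulled", backward_cost, "bytes from the file")
```

The data returned is correct in both directions. The difference is only in how much of the underlying file has to be read to get there, which is what the counter makes visible.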
