Improve support for safe symlink extraction #763
We can rewrite symlinks to ensure they are always relative and remain within the extraction directory.
Explicitly use extract_root in output path instead of ./ to avoid issues with symlinks within directories.
@AndrewFasano from within the unblob directory, you can do pre-commit can be installed with It will modify the code and you can create fixups for existing commits depending on the part it touches.
Ruff seems to be quite unhappy with my use of
I think there is some confusion on how
Maybe there was some confusion in some handlers on the argument order - or names, but these tests show the intention, and they should keep working after modifications.
FileSystem.create_symlink(src, dst) is supposed to be following the unix command line order and naming
Except these are named differently in the
So
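For reference, Python's own `os.symlink(src, dst)` uses the same order as `ln -s TARGET LINK_NAME`: the first argument is what the link points to, the second is where the link is created. Below is a small stdlib-only sketch of that order on a POSIX system. It does not use unblob's `FileSystem` class, and the function name `symlink_order_demo` is made up for illustration.

```python
import os
import tempfile
from pathlib import Path


def symlink_order_demo() -> tuple[str, bytes]:
    """Create sbin/shell -> ../bin/sh and report what the link stores and reads."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "bin").mkdir()
        (root / "sbin").mkdir()
        (root / "bin" / "sh").write_bytes(b"posix shell")

        link = root / "sbin" / "shell"
        # Equivalent of: ln -s ../bin/sh sbin/shell
        # src is the link target (stored verbatim), dst is the new link's path.
        os.symlink("../bin/sh", link)

        # readlink returns the stored target; read_bytes follows the link.
        return os.readlink(link), link.read_bytes()
```

The relative target is interpreted relative to the directory containing the link, not relative to the current working directory.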
Thanks for the info @e3krisztian. These names are definitely confusing and now that I cleaned up my changes, I can see that you're right - the order was fine before. Sorry about that! There might still be value in revising the change from bbe18c6 to add the I think the other changes are still relevant though!
@AndrewFasano this is a valued input and contribution for You might also be correct with some hidden argument swapping somewhere still. The below symlinks are intentionally marked as problematic in the current code (
# 3) Symlink with extra parent directories that would still be valid
ln -s ../../../bin/busybox "$WORKDIR/sbin/symlink_extra_up_to_busybox"
# 7) Circular symlink (A -> B, B -> A)
ln -s symlink_circular_b "$WORKDIR/bin/symlink_circular_a"
ln -s symlink_circular_a "$WORKDIR/bin/symlink_circular_b"
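To illustrate why these two cases are awkward, here is a stdlib-only sketch (not unblob code; both helper names are hypothetical). On a real root filesystem, `..` at `/` stays at `/`, so extra parent components are harmless there; `posixpath.normpath` models this for absolute paths. A circular pair, on the other hand, is a perfectly valid pair of links that never resolves to a file.

```python
import os
import posixpath
import tempfile
from pathlib import Path


def resolve_as_if_rooted(link_path: str, target: str) -> str:
    """Resolve target relative to link_path, treating the archive root as '/'.

    Hypothetical helper: link_path is the link's location expressed as an
    absolute path with respect to the archive root.
    """
    joined = posixpath.join(posixpath.dirname(link_path), target)
    # normpath drops ".." components that would climb above "/".
    return posixpath.normpath(joined)


def circular_pair_exists() -> bool:
    """Create A -> B and B -> A, then report whether A resolves to anything."""
    with tempfile.TemporaryDirectory() as tmp:
        bin_dir = Path(tmp) / "bin"
        bin_dir.mkdir()
        os.symlink("symlink_circular_b", bin_dir / "symlink_circular_a")
        os.symlink("symlink_circular_a", bin_dir / "symlink_circular_b")
        # os.path.exists follows links and returns False when resolution fails.
        return os.path.exists(bin_dir / "symlink_circular_a")
```

Under these assumptions, `resolve_as_if_rooted("/sbin/symlink_extra_up_to_busybox", "../../../bin/busybox")` yields `/bin/busybox`, and `circular_pair_exists()` returns `False` even though both links were created successfully.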
Cherry-picking these commits over I have tried to reproduce the problems locally based on the description given with the tests below applied to
diff --git a/tests/test_file_utils.py b/tests/test_file_utils.py
index 9a20da74..1fc94138 100644
--- a/tests/test_file_utils.py
+++ b/tests/test_file_utils.py
@@ -25,7 +25,7 @@ from unblob.file_utils import (
     round_down,
     round_up,
 )
-from unblob.report import PathTraversalProblem
+from unblob.report import LinkExtractionProblem, PathTraversalProblem
 
 
 @pytest.mark.parametrize(
@@ -503,6 +503,30 @@ class TestFileSystem:
         assert os.readlink(output_path) == "target file"
         assert sandbox.problems == []
 
+    def test_create_symlink_target_inside_sandbox(self, sandbox: FileSystem):
+        # ./sbin/shell -> ../bin/sh
+        sandbox.mkdir(Path("bin"))
+        sandbox.write_bytes(Path("bin/sh"), b"posix shell")
+        sandbox.mkdir(Path("sbin"))
+        sandbox.create_symlink(Path("../bin/sh"), Path("sbin/shell"))
+
+        output_path = sandbox.root / "sbin/shell"
+        assert output_path.read_bytes() == b"posix shell"
+        assert output_path.exists()
+        assert os.readlink(output_path) == "../bin/sh"
+        assert sandbox.problems == []
+
+    def test_create_symlink_target_outside_sandbox(self, sandbox: FileSystem):
+        # /shell -> ../bin/sh
+        sandbox.mkdir(Path("bin"))
+        sandbox.write_bytes(Path("bin/sh"), b"posix shell")
+        sandbox.create_symlink(Path("../bin/sh"), Path("/shell"))
+
+        assert any(p for p in sandbox.problems if isinstance(p, LinkExtractionProblem))
+        output_path = sandbox.root / "shell"
+        assert not output_path.exists()
+        assert not output_path.is_symlink()
+
     def test_create_symlink_absolute_paths(self, sandbox: FileSystem):
         sandbox.write_bytes(Path("target file"), b"test content")
         sandbox.create_symlink(Path("/target file"), Path("/symlink"))
@e3krisztian I can try updating those commits to work on main and try digging up a filesystem where I was seeing the issue that was supposed to address. At a minimum bbe18c6 would need to be updated to swap src/dst since this PR has them swapped.
This PR aims to fix a few issues around symlink extraction discussed in #761. I don't think this PR is perfect, but I hope it's an improvement over the current state of things.
Now in #768
954c1cd rewrites the logic to sanitize symlinks to be relative and kept within the extraction directory. This is done using the `os` module instead of `Pathlib` as Pathlib.resolve would fail if a symlink target was missing (which doesn't prevent us from safely converting it to a relative link). With this change I no longer see false positives around MaliciousSymlinks; instead, symlinks are created safely within the extraction directory. If a relative symlink originally tried accessing a directory above its own root (e.g., `./bin/sh -> ../../../../../bin/bash`), we update the link so it remains within the extraction directory.
This may have just been an artifact of swapping src/dst?
bbe18c6 and 76c29fe change how the `.dst` field of a symlink is calculated in file_utils and in _safe_tarfile. Previously it was made by combining the extraction root with the symlink destination. This would lose critical information about the path of the symlink source. For example, a symlink at `./sbin/shell -> ../bin/sh` is safely within the extraction directory while a symlink at `/shell -> ../bin/sh` is trying to go up too high.
Now in #770
fc60755 fixes a bug where tarfile absolute symlinks would be improperly dropped, which I observed with a system that had `/var/log -> /tmp`. 0e6e026 fixes a bug with relative symlinks that I observed on a system that had `/var/tmp -> ../../tmp`.
56617a3 and 2ce66d6 are trying to fix a mix-up between symlink source and destination when calling `create_symlink`. I fixed the CPIO extractor but it looks like things may be backwards in other extractors as well.
To test these changes, I created a CPIO archive with the following script:
If I extract this with the head of unblob (d0f3086) and run `find ../test/test_archive.cpio_extract/ -type f,l -exec ls -al {} \;` I get:
After applying the changes in this PR I get the following result with two additional (and expected) files extracted: `symlink_circular_b` and `symlink_extra_up_to_busybox`. All the files are still contained within the extraction directory.
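A minimal sketch of the kind of rewriting described above for 954c1cd, using only lexical `os.path`-style operations so that a missing target does not matter. This is not the PR's code: the function name `sanitize_symlink_target` and the choice to clamp `..` at the extraction root (rather than record a problem, as the test earlier in the thread expects for `/shell -> ../bin/sh`) are assumptions made for illustration.

```python
import posixpath


def sanitize_symlink_target(link_path: str, target: str) -> str:
    """Return a relative target that stays inside the extraction root.

    link_path: location of the symlink inside the archive, e.g. "sbin/shell".
    target: the link target as stored in the archive (absolute or relative).
    Purely lexical: nothing is looked up on disk, so the target may be missing.
    """
    # View the link location as absolute, with the extraction root as "/".
    link_abs = posixpath.normpath(posixpath.join("/", link_path))
    link_dir = posixpath.dirname(link_abs)
    # join() discards link_dir when target is absolute; normpath then
    # collapses ".." and clamps anything that would climb above "/".
    target_abs = posixpath.normpath(posixpath.join(link_dir, target))
    # Re-express the clamped target relative to the link's own directory.
    return posixpath.relpath(target_abs, start=link_dir)
```

With this sketch, the examples from the description behave as follows: `("var/log", "/tmp")` and `("var/tmp", "../../tmp")` both become `../tmp`, `("sbin/shell", "../bin/sh")` is left as `../bin/sh`, and `("bin/sh", "../../../../../bin/bash")` becomes `bash`. Because the result depends on `link_path`, the location of the link has to be known when the target is rewritten.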