-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict Checks for Open/MMAPed Files - CLI & Checksum #1148
base: criu-dev
Are you sure you want to change the base?
Restrict Checks for Open/MMAPed Files - CLI & Checksum #1148
Conversation
When saddr.ss_family is AF_INET6 we should cast &saddr to (struct sockaddr_in6 *). Signed-off-by: Radostin Stoyanov <[email protected]> Signed-off-by: Andrei Vagin <[email protected]>
When CRIU calls the ip tool on restore, it passes the fd of remote socket by replacing the STDIN before execvp. The stdin is used by the ip tool to receive input. However, the ip tool calls ftell(stdin) which fails with "Illegal seek" since UNIX sockets do not support file positioning operations. To resolve this issue, read the received content from the UNIX socket and store it into temporary file, then replace STDIN with the fd of this tmp file. # python test/zdtm.py run -t zdtm/static/env00 --remote -f ns === Run 1/1 ================ zdtm/static/env00 ========================= Run zdtm/static/env00 in ns ========================== Start test ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST Adding image cache Adding image proxy Run criu dump Run criu restore =[log]=> dump/zdtm/static/env00/31/1/restore.log ------------------------ grep Error ------------------------ RTNETLINK answers: File exists (00.229895) 1: do_open_remote_image RDONLY path=route-9.img snapshot_id=dump/zdtm/static/env00/31/1 (00.230316) 1: Running ip route restore Failed to restore: ftell: Illegal seek (00.232757) 1: Error (criu/util.c:712): exited, status=255 (00.232777) 1: Error (criu/net.c:1479): IP tool failed on route restore (00.232803) 1: Error (criu/net.c:2153): Can't create net_ns (00.255091) Error (criu/cr-restore.c:1177): 105 killed by signal 9: Killed (00.255307) Error (criu/mount.c:2960): mnt: Can't remove the directory /tmp/.criu.mntns.dTd7ak: No such file or directory (00.255339) Error (criu/cr-restore.c:2119): Restoring FAILED. ------------------------ ERROR OVER ------------------------ ################# Test zdtm/static/env00 FAIL at CRIU restore ################## ##################################### FAIL ##################################### Fixes checkpoint-restore#311 Signed-off-by: Radostin Stoyanov <[email protected]> Signed-off-by: Andrei Vagin <[email protected]>
Instead of erroring, we should loop until we get the desired number of bytes written, like regular I/O loops. Signed-off-by: Nicolas Viennot <[email protected]>
This adds the ability to stream images with criu-image-streamer The workflow is the following: 1) criu-image-streamer is started, and starts listening on a UNIX socket. 2) CRIU is started. img_streamer_init() is invoked, which connects to the socket. During dump/restore operations, instead of using local disk to open an image file, img_streamer_open() is called to provide a UNIX pipe that is sent over the UNIX socket. 3) Once the operation is done, img_streamer_finish() is called, and the UNIX socket is disconnected. criu-image-streamer can be found at: https://github.com/checkpoint-restore/criu-image-streamer Signed-off-by: Nicolas Viennot <[email protected]>
One can pass --stream to zdtm.py for testing criu with image streaming. criu-image-streamer should be installed in ../criu-image-streamer relative to the criu project directory. But any path will do providing that criu-image-streamer can be found in the PATH env. Added a few tests to run on travis-ci to make sure streaming works. We run test that are likely to fail. However, it would be good to once in a while run all tests with `--stream -a`. Signed-off-by: Nicolas Viennot <[email protected]>
So, here's the enhanced version of the first try. Changes are: 1. The wrapper name is criu-ns instead of crns.py 2. The CLI is absolutely the same as for criu, since the script re-execl-s criu binary. E.g. scripts/criu-ns dump -t 1234 ... just works 3. Caller doesn't need to care about substituting CLI options, instead, the scripts analyzes the command line and a) replaces -t|--tree argument with virtual pid __if__ the target task lives in another pidns b) keeps the current cwd (and root) __if__ switches to another mntns. A limitation applies here -- cwd path should be the same in target ns, no "smart path mapping" is performed. So this script is for now only useful for mntns clones (which is our main goal at the moment). Signed-off-by: Pavel Emelyanov <[email protected]> Looks-good-to: Andrey Vagin <[email protected]>
Acked-by: Mike Rapoport <[email protected]> Signed-off-by: Pavel Emelyanov <[email protected]> Signed-off-by: Andrei Vagin <[email protected]>
In Py2 `range` returns a list and `xrange` creates a sequence object that evaluates lazily. In Py3 `range` is equivalent to `xrange` in Py2. Signed-off-by: Radostin Stoyanov <[email protected]> Signed-off-by: Andrei Vagin <[email protected]>
The print_data() function was part of the deprecated (and removed) 'show' action, and it was moved in util.c with the following commit: a501b48 The 'show' action has been deprecated since 1.6, let's finally drop it. The print_data() routine is kept for yet another (to be deprecated too) feature called 'criu exec'. The criu exec feature was removed with: 909590a Remove criu exec code It's now obsoleted by compel library. Maybe-TODO: Add compel tool exec action? Therefore, now we can drop print_data() as well. Signed-off-by: Radostin Stoyanov <[email protected]>
Signed-off-by: Radostin Stoyanov <[email protected]>
Signed-off-by: Radostin Stoyanov <[email protected]>
class ctypes.c_char_p Represents the C char * datatype when it points to a zero- terminated string. For a general character pointer that may also point to binary data, POINTER(c_char) must be used. The constructor accepts an integer address, or a bytes object. https://docs.python.org/3/library/ctypes.html#ctypes.c_char_p Signed-off-by: Radostin Stoyanov <[email protected]>
criu-3.12/criu/net.c:2043: overwrite_var: Overwriting "img" in "img = open_image_at(-1, CR_FD_IP6TABLES, 0UL, pid)" leaks the storage that "img" points to. Signed-off-by: Adrian Reber <[email protected]>
…braries This patch only adds the support but does not enable it for building. Signed-off-by: Guoyun Sun <[email protected]>
Signed-off-by: Guoyun Sun <[email protected]>
Signed-off-by: Guoyun Sun <[email protected]>
Signed-off-by: Guoyun Sun <[email protected]>
Signed-off-by: Guoyun Sun <[email protected]>
Signed-off-by: Guoyun Sun <[email protected]>
Oddly, one of the test had a typo which should be fatal. Signed-off-by: Nicolas Viennot <[email protected]>
We permanently have issues like this: ./test/jenkins/criu-iter.sh: 3: source: not found It looks like a good idea to use one shell to run our jenkins scripts. Signed-off-by: Andrei Vagin <[email protected]>
On MIPS CPUs with VIPT caches also has aliasing issues, just like ARMv6. To overcome this issue, page coloring 0x40000 align for shared mappings was introduced (SHMLBA) in kernel. https://github.com/torvalds/linux/blob/master/arch/mips/include/asm/shmparam.h Related to this, zdtm test suites ipc.c shm.c shm-unaligned.c and shm-mp.c are passed. Signed-off-by: Guoyun Sun <[email protected]>
k_rtsigset_t is 16Bytes in mips architecture but not 8Bytes. so blk_sigset_extended be added in TaskCoreEntry and ThreadCoreEntry for dumping extern 8Bytes data in parasite-syscall.c, restore extern 8Bytes data in cr-restore.c Signed-off-by: Guoyun Sun <[email protected]>
Signed-off-by: Joshua Abraham <[email protected]>
A similar one is already printed in check_options(). Before this patch: > $ ./criu/criu -vvvvvv --deprecated --log-file=/dev/stdout xxx > (00.000000) Turn deprecated stuff ON > ... > (00.029680) DEPRECATED ON > (00.029687) Error (criu/crtools.c:284): unknown command: xxx Signed-off-by: Kir Kolyshkin <[email protected]>
A few tests were still running on xenial because at some point they were hanging. This switches now all tests to bionic except one docker test which still uses xenial to test with overlayfs. Signed-off-by: Adrian Reber <[email protected]>
This moves the cross compilation tests to github actions, to slightly reduce the number of Travis tests and run them in parallel on github actions. Signed-off-by: Adrian Reber <[email protected]>
The message "Overwriting RPC settings with values from <filename>" is misleading, giving the impression that file is being read and consumed. It really puzzled me, since <filename> didn't exist. What it needs to say is "Would overwrite", i.e. if a file with such name is present, it would be used. Also, add actual "Parsing file ..." so it will be clear which files are being used. Signed-off-by: Kir Kolyshkin <[email protected]>
While working on runc checkpointing, I incorrectly closed status_fd prematurely, and received an error from CRIU, but it was non-descriptive. Do print the error from open(). Signed-off-by: Kir Kolyshkin <[email protected]>
The orphan pts master option was introduced with commit [1] to enable checkpoint/restore of containers with a pty pair used as a console. [1] checkpoint-restore@6afe523 Signed-off-by: Radostin Stoyanov <[email protected]>
This commit defines constants and includes necessary headers to c/r BPF maps Source files modified: * magic.h - Defining BPFMAP_FILE_MAGIC and BPFMAP_DATA_MAGIC * image-desc.h - Defining CR_FD_BPFMAP_FILE and CR_FD_BPFMAP_DATA * image-desc.c - Create new entries for bpfmap-file and bpfmap-data in CRIU's file descriptor set * protobuf-desc.h - Defining PB_BPFMAP_FILE and PB_BPFMAP_DATA * protobuf-desc.c - Including headers for BPF map protobuf images Signed-off-by: Abhishek Vijeev <[email protected]>
Source files modified: * Makefile.config - Checks whether libbpf is installed on the system. If so, we add -lbpf to LIBS_FEATURES, -DCONFIG_HAS_LIBBPF to FEATURE_DEFINES and set CONFIG_HAS_LIBBPF. This allows us to check for the presence of libbpf before compiling or executing BPF c/r code and ZDTM tests. * Makefile - Set CONFIG_HAS_LIBBPF to clean all files. Signed-off-by: Abhishek Vijeev <[email protected]>
This commit enables CRIU to: (a) identify an anonymous inode as being a BPF map (b) parse information about BPF maps from procfs Source files modified: * files.c - Checks anonymous inodes to see whether they are BPF maps. If so, sets struct fdtype_ops *ops to a structure that knows how to dump BPF maps * proc_parse.c - Function parse_fdinfo_pid_s() now checks whether the current file being processed is a BPF map. If so, it calls a newly defined function parse_bpfmap() which knows how to parse information about BPF maps from procfs Signed-off-by: Abhishek Vijeev <[email protected]>
This commit enables CRIU to dump meta-data about BPF maps files by prividing the structures and functions needed by other parts of the code-base. Source files added: * bpfmap.c - defines new structures and functions: (a) struct fdtype_ops bpfmap_dump_ops: sets up the function handler to dump BPF maps (b) is_bpfmap_link(): checks whether an anonymous inode is a BPF map file (c) dump_one_bpfmap(): parses information for a BPF map file from procfs and dumps it * include/bpfmap.h - structure and function declarations Source files modified: * Makefile.crtools - generates build artifacts for bpfmap.c Signed-off-by: Abhishek Vijeev <[email protected]>
This commit enables CRIU to dump data(key-value) pairs stored in BPF maps Source files modified: * bpfmap.c - Function dump_one_bpfmap_data() reads the map's keys and values into two buffers using bpf_map_lookup_batch() and then writes them out to a protobuf image along with the number of key-value pairs read - Function dump_one_bpfmap() now dumps the data as well before returning * include/bpfmap.h - Includes headers and declares functions needed to dump BPF map data Signed-off-by: Abhishek Vijeev <[email protected]>
This commit enables CRIT to decode the contents of a protobuf image that stores information related to BPF map Signed-off-by: Abhishek Vijeev <[email protected]>
This commit enables CRIU to restore a process' BPF map file descriptors. Source files modified: * bpfmap.c - Structure and function definitions needed to: (a) collect a BPF map's information from its protobuf image (b) create and open a BPF map with the same parameters as when it was dumped (c) add the newly opened BPF map to the process' file descriptor list * include/bpfmap.h - Structure declarations for restoring BPF maps * files.c - Collects a BPF map's file entry during the restoration phase Signed-off-by: Abhishek Vijeev <[email protected]>
This commit restores the data of BPF maps. A hash table (indexed by the map's id) is used to store data objects for multiple BPF map files that a process may have opened. Collisions are resolved with chaining using a linked list. Source files modified: * bpfmap.c - Structure and function definitions needed to: (a) collect the protobuf image containing BPF map data (b) read the BPF map's data from the image and store it in the hash table (c) restore the map's data using bpf_map_update_batch() * include/bpfmap.h - Defines the size of the hash table and maks to be used while indexing into it - Structure and function declarations that are used while restoring BPF map data * cr-restore.c - Collects the protobuf image containing BPF map data during the restoration phase Signed-off-by: Abhishek Vijeev <[email protected]>
This commit adds ZDTM tests for c/r of processes with BPF maps as open files Source files added: * zdtm/static/bpf_hash.c - Tests for c/r of the data and meta-data of BPF map type BPF_MAP_TYPE_HASH * zdtm/static/bpf_array.c - Tests for c/r of the data and meta-data of BPF map type BPF_MAP_TYPE_ARRAY Source files modified: * zdtm/static/Makefile - Generating build artifacts for BPF tests Signed-off-by: Abhishek Vijeev <[email protected]>
Source files modified: * travis/vagrant.sh - Adding libbpf-devel Signed-off-by: Abhishek Vijeev <[email protected]>
This change fixes the error: error: 'security_context_t' is deprecated [-Werror=deprecated-declarations] Source files modified: * lsm.c * net.c Please refer to: SELinuxProject/selinux@9eb9c9327 Signed-off-by: Abhishek Vijeev <[email protected]>
Since commit cdd08cd ("uffd: use userns_call() to execute ioctl(UFFDIO_API)") UFFD_API ioctl() is wrapped with userns_call() and this allows runing lazy-pages tests on recent kernels in uns. Restore testing of lazy-pages in uns in travis. Signed-off-by: Mike Rapoport <[email protected]>
The docker hub container registry is not updated as fast as Fedora's registry at registry.fedoraproject.org. Fedora's registry gets a new image whenever there is a new version of rawhide, docker hub's rawhide image can take a couple of weeks because the process is not automated. Especially when Fedora branches of a new release we see lot's of errors in CRIU's Fedora rawhide based Travis runs. Switch to Fedora's registry to always have the newest rawhide images for our tests. Signed-off-by: Adrian Reber <[email protected]>
file_validation_chksm_config and file_validation_chksm_parameter fields added to cr_options structure in criu/include/cr_options.h along with the constants: FILE_VALIDATION_CHKSM FILE_VALIDATION_CHKSM_FULL FILE_VALIDATION_CHKSM_FIRST FILE_VALIDATION_CHKSM_PERIOD FILE_VALIDATION_CHKSM_CONFIG_DEFAULT (Equal to FILE_VALIDATION_CHKSM_FIRST) FILE_VALIDATION_CHKSM_PARAM_DEFAULT (Equal to 1024) Usage and description information is yet to be added. Usage: --file-validation="checksum" --checksum-parameter=[N] (To calculate checksum with the first N bytes of the file, at most 10MB) --file-validation="checksum-full" (To calculate checksum with the entire file, at most 10MB) --file-validation="checksum-period" --checksum-parameter=[N] (To calculate checksum with every Nth byte of the file) Signed-off-by: Ajay Bharadwaj <[email protected]>
This adds functions to find, store and compare with the stored checksum. The file is mapped 10MB at a time. Signed-off-by: Ajay Bharadwaj <[email protected]>
This adds the flag 'crfail-restore' which is used to indicate that the test is expected to fail during restore. Signed-off-by: Ajay Bharadwaj <[email protected]>
This adds 4 tests (2 each for 32 bit and 64 bit ELF files): 1. ELF file has only 1 PT_NOTE header 2. ELF file has 2 PT_NOTE headers and the build-ID is present in the 2nd header The build-ID is altered after dump and before restore, causing the restore to fail in all 4 tests. Signed-off-by: Ajay Bharadwaj <[email protected]>
This adds 6 tests: 1. The last byte of the file is altered after dump and before restore, causing restore to fail (checksum-full) 2. checksum, checksum-parameter = 10485800 (First 10485800 bytes are checksummed) 3. checksum-period, checksum-parameter = 10485800 (Every 10485800th byte is checksummed) 4. checksum, checksum-parameter = 10485700 (First 10485700 bytes are checksummed) 5. checksum-period, checksum-parameter = 10485700 (Every 10485700th byte is checksummed) 6. The last byte of a file of size 10485800 bytes is altered after dump and before restore, causing restore to fail checksum, checksum-parameter = 10485800 (First 10485800 bytes are checksummed) Signed-off-by: Ajay Bharadwaj <[email protected]>
This moves the file validation helper function declarations to files-validation.h and the function definitions to files-validation.c. It also updates Makefile.crtools. Signed-off-by: Ajay Bharadwaj <[email protected]>
ddb47e6
to
4a8cb20
Compare
A friendly reminder that this PR had no activity for 30 days. |
f392ea1
to
4d137b8
Compare
If I understand correctly, this change is not merged in. Anyone knows why? CRIU's wiki has it though... https://criu.org/index.php?title=Validate_files_on_restore |
@xiongzubiao @avagin @checkpoint-restore/maintainers The PR has unresolved conversation #1148 (comment) about "how fast is the algorithm?", it seem to only work fast for files with BUILDID, checksumming is not so fast obviously. So it seems we need to update wiki that this feature is not available. |
This adds CLI options and the checksum file validation method in addition to the previous build-ID and file size checks as part of GSoC 2020.
CRIU uses the build-ID validation method by default whenever possible and if that cannot be done, it will try and use the checksum method with the first 1 MB of the file. If this fails as well, it will use only the file size check.
To use the checksum method first and make the build-ID method the fallback:
--file-validation="chksm" --chksm-parameter=[N]
(To calculate checksum with the first N bytes of the file, at most 10 MB)
--file-validation="chksm-full"
(To calculate checksum with the entire file, at most 10 MB)
--file-validation="chksm-period" --chksm-parameter=[N]
(To calculate checksum with every Nth byte of the file)
To explicitly use only the file size check all the time, the following command line option can be used: --file-validation="filesize"
In other words, --file-validation="buildid" will not make a difference as this is the default method
Using the time command to measure the time taken for 2 tests.
zdtm/transition/shmem:
file size: 3.782s
build-id: 4.153s
chksum (Parameter = 1024): 4.465s
chksum-full: 4.722s
chksum-period (Parameter = 1024): 4.498s
zdtm/static/maps04:
file size: 35.317s
build-id: 35.720s
chksum (Parameter = 1024): 35.919s
chksum-full: 36.679s
chksum-period (Parameter = 1024): 36.476s
Mentors: @avagin @0x7f454c46
This also adds ZDTM tests for build-ID and checksum validation methods.
Link to the previous PR:
#1131
Changes from the previous PR:
rebase on the current criu-dev
cosmetic changes
other minor changes