forked from ananjaser1211/Apollo
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement IncrementalFS #4
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Fully working incremental fs filesystem Signed-off-by: Eugene Zemtsov <ezemtsov@google.com> Signed-off-by: Paul Lawrence <paullawrence@google.com> Bug: 133435829 Change-Id: I14741a61ce7891a0f9054e70f026917712cbef78
Fixed incfs_test build errors Fixed Kconfig errors Readded .gitignore Test: With just enabling CONFIG_INCREMENTAL_FS, both defconfig and cuttlefish_defconfig build and incfs_test runs and passes Bug: 133435829 Change-Id: Id3247ffcc63a095f66dcedf554017a06c5a9ce4a Signed-off-by: Paul Lawrence <paullawrence@google.com>
Removed WARN_ONs Removed compatibilty code Fixed tab issue Bug: 133435829 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I8a9e9ead48a65fd09c2d01d22f65d9a352f118e2
Fix all sparse errors in fs/incfs except fs/incfs/integrity.c:192:9: warning: Variable length array is used Test: incfs_test passes Bug: 133435829 Change-Id: I9c2e26e4e1a06a894977f11a3c8559b968dd115e Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I89e1dc6020e596fb36694f8646f78b98f7ad4a7f Bug: 133435829 Signed-off-by: Paul Lawrence <paullawrence@google.com>
- added chmod() to +0222 to make all backing files and dirs writable. vold/system_server have a umask that clears those flags, making incfs unusable Signed-off-by: Yurii Zubrytskyi <zyy@google.com> Signed-off-by: Paul Lawrence <paullawrence@google.com> Bug: 133435829 Change-Id: Id9258401570cc2cc7cd5735aace89d379a9b043d
Don't call dput on error code Change-Id: Ie63645c9ed67fa231829917ae8ca154e049b4921 Signed-off-by: Paul Lawrence <paullawrence@google.com> Bug: 133435829
Test: incfs_test passes Signed-off-by: Paul Lawrence <paullawrence@google.com> Bug: 133435829 Change-Id: I824286b77f665d2409c5e88930057a97da82ce82
To make selinux work, add xattr support. This is a bit clunky - it seems like it would be better for the log and pending read functionality to be ioctls rather than this mixture of real and virtual files. Bug: 133435829 Change-Id: I56579fabe2ae7efb88f0344553948dc9573299aa Signed-off-by: Paul Lawrence <paullawrence@google.com>
They provide no value and simply duplicate a check in copy_from/to_user Test: incfs_test passes Bug: 138149732 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Icc6054a2d6a495c9a03cd1507dda1ab8ca0b0dc4
Filling blocks is not equivalent to writing a file, since they are constrained by the root hash. selinux policy may wish to treat them differently, for instance. Test: incfs_test passes Bug: 138149732 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Ic369b84b92547b1cfefe422bd881c4e466090aed
Test: incfs_test passes Bug: 133435829 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I4e6fbd0938f00e7e6883ce1a26cbfd38fdcaa9a5
Test: selftests pass Bug: 133435829 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Ia7e69b1b0176202da4b418ea815b370cbdacd5c2
Provide a securable way to open a file for filling Test: incfs_test passes Bug: 138149732 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Ib4b6fd839ad30ce08e31121d19e2c0d7066d302f
Test: incfs_test passes Bug: 151240628 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I627e683b562329fd57aedc8393e22449ff09ee1f (cherry picked from commit 06c715e275dc65e720759cfeacb4120289e2a306)
Test: incfs_test passes Bug: 151240628 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I66d0ba1911adc5d68ed404585222e6a81a7eb94f (cherry picked from commit 8d963bb24076b60cb2de0f2d49deaff1d52e8270)
Without these, you can't unmount a volume on which incfs was mounted and the tests run. Also incfs_tests would fail sporadically without the fix to test_inode Test: Run incfs_test and unmount underlying volume 1000 times Bug: 152636070 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I88f11f5d4269c22d9073e5eb671d0c7cc4629f6c (cherry picked from commit c062bc8e769f0f6e47cc41fb9b7ab30e3e7f2689)
When returning incomplete results index_out has to be usable to call the function again and resume from the same location. This means that if the output buffer was too small the function needs to check for that when encountering the _beginning_ of a next output range, not the end of it. Otherwise resuming from the end of the range that didn't fit into the buffer would cause the call to never return that range + Make the backing file header flags update thread safe Bug: 152691988 Test: libincfs-test, incfs_test passes Signed-off-by: Yurii Zubrytskyi <zyy@google.com> Change-Id: I351156beba0b74e1942a39117279d3fcdb5e0c78 Signed-off-by: Paul Lawrence <paullawrence@google.com>
When read log is 0 sized, we still need to init the wait queue to avoid kernel panics if someone does decide to poll on the read log. Test: Added test for this condition, incfs_test crashes With fix, incfs_test doesn't crash Signed-off-by: Paul Lawrence <paullawrence@google.com> Bug: 152909243 Change-Id: Ic3250523bb7ddb1839f8e95852c17103e5ffb782
Since INCFS_IOC_GET_FILLED_BLOCKS potentially leaks information about usage patterns, and is only useful to someone filling the file, best protect it in the same way as INCFS_IOC_FILL_BLOCKS. Add useful field data_block_out as well Test: incfs_test passes Bug: 152983639 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I126a8cf711e56592479093e9aadbfd0e7f700752
Bug: 153017385 Test: incfs_test passes Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I13f3a3c91d746d725e0e21b1e2bcfe0a64a13716
This change was never cherry-picked to other branches, is just syntactic sugar, and clashes with another change. Revert, and then apply cleanly over all branches as needed. This reverts commit 2664a43. Change-Id: I51dca9cb046cc15302c22e8f73e58dcf4849f84d Signed-off-by: Paul Lawrence <paullawrence@google.com>
…sh blocks Bug: 153210803 Test: manual Change-Id: Iafc888dbe906cd37e5b28dc2814f52aace175c0f Signed-off-by: Yurii Zubrytskyi <zyy@google.com>
Found by sparse Bug: 153174547 Test: make C=2 fs/incfs/incrementalfs.ko no errors, incfs_test pass Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I9ff4f4f35975fe09936724488b96cd8bdeeb719e
This led to a 20x speed improvement on QEMU. 512 is somewhat arbitrary - most of the gains are already there reading 64 records at a time, but since the record size is 10 bytes, 512 is just over a page and seems a good choice. Bug: 153170997 Test: incfs_test passes. Adding logging to incfs_get_filled_blocks to measure performance shows a 20x improvement Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Ifb2da77cfd8c9d653c7047ba1eb7f39d795fa1c2
Read log buffer can have multiple threads doing any of these operations simultaneously: - Polling for changes - Reading log records - Adding new log records - Updating log buffer size, or enabling/disabling it completely As we don't control the userspace, and it turns out that they all currently originate from different processes, code needs to be safe against parallel access to a read buffer and a request for reallocating it. This CL add an r/w spinlock to protect the buffer and its size. Each remount takes the write lock, while everything else takes a read lock. Remount makes sure it doesn't take too long by preallocating and precalculating all updates, while other operations don't care much about their critical section size - they all can still run together. Bug: 152633648 Test: manual remount + reading Signed-off-by: Yurii Zubrytskyi <zyy@google.com> Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I7271b4cb89f1ae2cbee6e5b073758f344c4ba66a
Bug: 153560805 Test: incfs_test passes on qemu and Pixel 4 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I1b55341e4e4247a74f3f539b9d190fef0ca409b8
incfs only syncs at createfile time. This was making createfile take a very long time. It also appears to offer little actual value - whether we flush or no, if the device crashes the header will be partial. Bug: 15356649 Test: incfs_test passes, createfile takes less than half the time Change-Id: I8f1fa138226868ebfb4a6a41254444af453070c8 Signed-off-by: Paul Lawrence <paullawrence@google.com> (cherry picked from commit 4cc78c93ada6d0d8744c5b1ae12fddb50ea6a620)
Bug: 154342202 Test: incfs_test passes Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Ibcc641dd92596018c9f10b5bc7bd0db2642a80c7 (cherry picked from commit b6b4a3a404ccd9c62347e27c4fc7883d776c2cbb)
As was, chmod would change the cached inode's mode, which would persist until the inode was uncached. Fix to change mode of backing file, but make sure mount files are read only, backing files are always writeable. Test: App no longer fails with incfs errors Bug: 154972299 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I40517331f24329484387c6b880f1517f887b29f6 (cherry picked from commit fe4fae35fe307a15cacc5e6693a98bf5140e643b)
Test: incfs_test passes Bug: 155590527 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Iaecfcd40e8c089d11b34c7aff2090fbfe0c36219 (cherry picked from commit 3e4fa206ce8ae4d3141af7514cb5d3d813cd4290)
With a verified file (use incfs_perf to create a verified file), throughput measured using dd after dropping caches increases from 200M/s to 290M/s Test: incfs_test passes Bug: 155996534 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I7abb5ad92e4167f82f3452acc9db322fec8307dd
If an incfs file is created, then the file system is sync'd,on opening the incfs file inode_set reads the size from the backing file from within iget5_locked, causing this error. Test: incfs_test passes, this no longer occurs Bug: 156413528 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I8939c4afa514d39d251c044d7680cfc69272669e
Waking up the waiters accounts for 80+% of the total logging time, and about 40% of overall read_single_page() with no signature verification. By throttling it to once every 16ms we get back all read performance, reduce the waiter's CPU usage and still leave it enough time to pull the logs out. Bug: 155996534 Test: adb install megacity.apk & dd from the installed apk Signed-off-by: Yurii Zubrytskyi <zyy@google.com> Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I4a118dc226d7ca318cf099ba3e239f0120bb23c2
This reverts commit ab185e45f637ba0b0239268f1130890c8837981d. This change used the PageChecked flag to mark the Merkle tree as checked. However, f2fs uses this internally. This caused file system hangs on devices after installs. Test: incfs_test passes, installs no longer hang Bug: 157589629 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I980a700d65eb4f4a77434715d61dda4b8e80658c
Test: incfs_test passes Bug: 158242405 Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Ib53e867fb2681489f720f6255354c1bce1d33997 (cherry picked from commit 60bc6eaf985358638af7f0602a7aece907ae8207)
Bug: 155996534 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Ic508e6fa07c90decb29e07647dd3b0fc4d243ce8 (cherry picked from commit 21e6d932da379416bab8ea8d78f14525e7bf564d)
Incremental fs appears to not depend on pkcs7 anymore. Bug: 151584760 Signed-off-by: Daniel Mentz <danielmentz@google.com> Change-Id: I809b4b5651d84ca70fd8bf837765e33df8547418 Signed-off-by: Paul Lawrence <paullawrence@google.com> (cherry picked from commit 83c1d9116ec0d695b02f36cda51f305257718e09)
Incfs's matic is bigger than 32-bit, but super block structure's s_magic is unsigned long which is 32-bit in ARM 32-bit platform. Do the cast for magic. Bug: 159772865 Signed-off-by: Peng Zhou <Peng.Zhou@mediatek.com> Signed-off-by: mtk81325 <peng.zhou@mediatek.com> Change-Id: Iae4f38774440c7d6ae44529d4f0f8ebb2ec5dacc
It's still magic number issue which cannot be compatible with arm-32 platform, although we try to fix it in Iae4f3877444 ("ANDROID: Incremental fs: magic number compatible 32-bit"), there is still incompatible scenario, such as: get_incfs_node(), it will return NULL then kernel exception will be trigger because of NULL pointer access. (inode_set() -> get_incfs_node(), then used node->xxx directly) We change magic number directly, otherwise, we must fix above issues one by one. Bug: 159772865 Fixes: Iae4f3877444("ANDROID: Incremental fs: magic number compatible 32-bit") Signed-off-by: Peng Zhou <Peng.Zhou@mediatek.com> Signed-off-by: mtk81325 <peng.zhou@mediatek.com> Change-Id: I71f279c1bb55ea296ab33a47644f30df4a9f60a6 Signed-off-by: Paul Lawrence <paullawrence@google.com>
…WRITE_ONCE READ/WRITE_ONCE are for atomic data types, not for structures. Fix this up by doing a memcpy to make it explicit just how messy this copy is... This fixes a build error on 5.8-rc1, as things are more strict, odds are it's also wrong in other kernel versions as well... Cc: Daniel Mentz <danielmentz@google.com> Cc: Paul Lawrence <paullawrence@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I7ecd3d05bd94c936dd5e69c63028458786f37a78
Use same selinux scheme as incfs v2 Fix memory leak Bug: 174692664 Test: incfs_test passes Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I6058ddad9d43ba01b2eabd7d3c576f2cc9b42292
…failure Syz{bot,kaller} reports[0]: BUG: Dentry ffff888119d8a000{i=0,n=.index} still in use (1) [unmount of ramfs ramfs] ------------[ cut here ]------------ WARNING: CPU: 0 PID: 367 at fs/dcache.c:1616 umount_check+0x18d/0x1d0 fs/dcache.c:1607 Modules linked in: CPU: 0 PID: 367 Comm: syz-executor388 Not tainted 5.10.75-syzkaller-01082-g234d53d2bb60 #0 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:umount_check+0x18d/0x1d0 fs/dcache.c:1607 Code: 8b 0b 49 81 c6 f8 03 00 00 48 c7 c7 00 40 2e 85 4c 89 e6 48 8b 55 d0 4c 89 e1 45 89 f8 31 c0 41 56 e8 ae d9 9e ff 48 83 c4 08 <0f> 0b e9 f1 fe ff ff 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c c9 fe ff RSP: 0018:ffffc9000096f770 EFLAGS: 00010292 RAX: 0000000000000055 RBX: ffffffff866af200 RCX: 1ad6b89836e5b500 RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000 RBP: ffffc9000096f7a0 R08: ffffffff81545368 R09: 0000000000000003 R10: fffff5200012de41 R11: 0000000000000004 R12: ffff888119d8a000 R13: dffffc0000000000 R14: ffff88811d7373f8 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f01b7bddb68 CR3: 000000010c4f0000 CR4: 00000000003506b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: d_walk+0x309/0x540 fs/dcache.c:1326 do_one_tree fs/dcache.c:1623 [inline] shrink_dcache_for_umount+0x8e/0x1b0 fs/dcache.c:1639 generic_shutdown_super+0x66/0x2c0 fs/super.c:447 kill_anon_super fs/super.c:1108 [inline] kill_litter_super+0x75/0xa0 fs/super.c:1117 ramfs_kill_sb+0x44/0x50 fs/ramfs/inode.c:270 deactivate_locked_super+0xb0/0x100 fs/super.c:335 deactivate_super+0xa5/0xd0 fs/super.c:366 cleanup_mnt+0x45f/0x510 fs/namespace.c:1118 __cleanup_mnt+0x19/0x20 fs/namespace.c:1125 task_work_run+0x147/0x1b0 kernel/task_work.c:154 exit_task_work include/linux/task_work.h:30 [inline] do_exit+0x70e/0x23a0 kernel/exit.c:813 do_group_exit+0x16a/0x2d0 kernel/exit.c:910 get_signal+0x133e/0x1f80 kernel/signal.c:2790 arch_do_signal+0x8d/0x620 arch/x86/kernel/signal.c:805 exit_to_user_mode_loop kernel/entry/common.c:161 [inline] exit_to_user_mode_prepare+0xaa/0xe0 kernel/entry/common.c:191 syscall_exit_to_user_mode+0x24/0x40 kernel/entry/common.c:266 do_syscall_64+0x3d/0x70 arch/x86/entry/common.c:56 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f01b7b884f9 Code: Unable to access opcode bytes at RIP 0x7f01b7b884cf. RSP: 002b:00007f01b7b19308 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffe00 RBX: 00007f01b7c103f8 RCX: 00007f Which was due to a missing dput() before returning from a vfs_mkdir() failure. Bug: 203827798 Link: [0] https://syzkaller.appspot.com/bug?extid=81b5ca9b2848f4dad8fa Reported-by: syzbot+81b5ca9b2848f4dad8fa@syzkaller.appspotmail.com Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Iaef9aa0aecc964645aaca5fe8d79388ae28527bd
Syzbot recently found a number of issues related to incremental-fs (see bug numbers below). All have to do with the fact that incr-fs allows mounts of the same source and target multiple times. The correct behavior for a file system is to allow only one such mount, and then every subsequent attempt should fail with a -EBUSY error code. In case of the issues listed below the common pattern is that the reproducer calls: mount("./file0", "./file0", "incremental-fs", 0, NULL) many times and then invokes a file operation like chmod, setxattr, or open on the ./file0. This causes a recursive call for all the mounted instances, which eventually causes a stack overflow and a kernel crash: BUG: stack guard page was hit at ffffc90000c0fff8 kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN The reason why many mounts with the same source and target are possible is because the incfs_mount_fs() as it is allocates a new super_block for every call, regardless of whether a given mount already exists or not. This happens every time the sget() function is called with a test param equal to NULL. The correct behavior for an FS mount implementation is to call appropriate mount vfs call for it's type, i.e. mount_bdev() for a block device backed FS, mount_single() for a pseudo file system, like sysfs that is mounted in a single, well know location, or mount_nodev() for other special purpose FS like overlayfs. In case of incremental-fs the open coded mount logic doesn't check for abusive mount attempts such as overlays. To fix this issue the logic needs to be changed to pass a proper test function to sget() call, which then checks if a super_block for a mount instance has already been allocated and also allows the VFS to properly verify invalid mount attempts. Bug: 211066171 Bug: 213140206 Bug: 213215835 Bug: 211914587 Bug: 211213635 Bug: 213137376 Bug: 211161296 Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Change-Id: I66cfc3f1b5aaffb32b0845b2dad3ff26fe952e27
Cleanup incremental-fs left overs on umount, otherwise incr-fs will complain as below: BUG: Dentry {i=47a,n=.incomplete} still in use [unmount of incremental-fs] This requires vfs_rmdir() of the special index dir. Since set_anon_super() was used in incfs_mount_fs() the incfs_kill_sb() should use kill_anon_super() instead of generic_shutdown_super() otherwise it will leak the pseudo dev_t that set_anon_super() allocates. Bug: 211066171 Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Change-Id: I7ea54db63513fc130e1997cbf79121015ee12405
Syzbot recently found a number of issues related to incremental-fs (see bug numbers below). All have to do with the fact that incr-fs allows mounts of the same source and target multiple times. This is a design decision and the user space component "Data Loader" expects this to work for app re-install use case. The mounting depth needs to be controlled, however, and only allowed to be two levels deep. In case of more than two mount attempts the driver needs to return an error. In case of the issues listed below the common pattern is that the reproducer calls: mount("./file0", "./file0", "incremental-fs", 0, NULL) many times and then invokes a file operation like chmod, setxattr, or open on the ./file0. This causes a recursive call for all the mounted instances, which eventually causes a stack overflow and a kernel crash: BUG: stack guard page was hit at ffffc90000c0fff8 kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN This change also cleans up the mount error path to properly clean allocated resources and call deactivate_locked_super(), which causes the incfs_kill_sb() to be called, where the sb is freed. Bug: 211066171 Bug: 213140206 Bug: 213215835 Bug: 211914587 Bug: 211213635 Bug: 213137376 Bug: 211161296 Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Change-Id: I08d9b545a2715423296bf4beb67bdbbed78d1be1
Since evicting inodes triggers writes to the backing file, which uses the mi_owner field from the mount_info struct, make sure inodes are evicted before we free the mount_info data Test: incfs_test Bug: 270117845 Change-Id: I673b2e0e04b5adc3998caf6f22443598a30338af Signed-off-by: Paul Lawrence <paullawrence@google.com> (cherry picked from commit 7899985277527b29c47929a6d6a89c5c89b406ad) (cherry picked from commit faf3626b8e34df3dfff3a99e6582a9abd24410ce) Signed-off-by: Lee Jones <joneslee@google.com>
fb4fa22
to
49b31ff
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I'm EOL'ing my kernel until further notice, fuck school mate. Gl with kernel crap and other.