ZFS has a 2GB maximum file size on 32-bit hosts #136
Comments
On 64-bit kernels, Linux automatically sets O_LARGEFILE for you at open time, which allows you to create files larger than 2GB. On 32-bit kernels this doesn't happen. Can you try explicitly setting O_LARGEFILE and see if that solves the issue?

    SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, int, mode)
    {
            long ret;

            if (force_o_largefile())
                    flags |= O_LARGEFILE;

            ret = do_sys_open(AT_FDCWD, filename, flags, mode);
            /* avoid REGPARM breakage on x86: */
            asmlinkage_protect(3, ret, filename, flags, mode);
            return ret;
    }
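The suggestion above can be tried from user space with a small helper. This is a minimal sketch, assuming a Linux/glibc build; `open_with_largefile` is a hypothetical name and does not come from the ZFS or kernel sources.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes O_LARGEFILE in <fcntl.h> on glibc */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fallback for build environments that do not expose the flag.
 * Defining it as 0 only makes sense where off_t is already 64 bits;
 * on a 32-bit host the real flag is the whole point of the test.
 */
#ifndef O_LARGEFILE
#define O_LARGEFILE 0
#endif

/*
 * Hypothetical helper: open (or create) a file for writing with
 * O_LARGEFILE requested explicitly, rather than relying on the kernel
 * to add it. Returns a file descriptor, or -1 with errno set.
 */
int open_with_largefile(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_LARGEFILE, 0644);
}
```

On 32-bit glibc, compiling with `-D_FILE_OFFSET_BITS=64` should have the same effect, since `open()` is then mapped to its 64-bit variant.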
The real root cause here is MAXOFFSET_T, which is defined as 0x7fffffffl (2GiB) on 32-bit systems. This limit is enforced by ZFS at the top of zfs_write() with no regard for the Linux flag O_LARGEFILE, which dd is passing correctly. So this really should be a limit on 32-bit OpenSolaris systems as well. I don't think there is any harm in relaxing this limit when O_LARGEFILE is set, but since this is a 32-bit issue I'm going to hold off on this. Making sure this is safe would require a fair bit of code inspection and testing. That can wait until the 64-bit version is working well.
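To make the effect of that limit concrete, here is a minimal sketch of an offset check of the kind described. It is not the zfs_write() code: `sketch_write_limit` and `SKETCH_MAXOFFSET_32` are hypothetical names, and the clamping behavior is an assumption chosen to be consistent with the dd output in the report (2047 full 1MiB records plus one short write, 2147483647 bytes in total).

```c
#include <errno.h>
#include <stdint.h>

/* Assumed value from the discussion: MAXOFFSET_T on a 32-bit build. */
#define SKETCH_MAXOFFSET_32 INT64_C(0x7fffffff)

/*
 * Hypothetical stand-in for a limit check at the top of a write path.
 * A write starting at or beyond the limit is rejected with EFBIG; a
 * write that would cross the limit is shortened to end at the limit.
 * On success, returns 0 and stores the permitted byte count in *allowed.
 */
int sketch_write_limit(int64_t offset, int64_t len, int64_t limit,
    int64_t *allowed)
{
    if (offset < 0 || len < 0)
        return EINVAL;
    if (offset >= limit)
        return EFBIG;
    *allowed = (len > limit - offset) ? limit - offset : len;
    return 0;
}
```

Under these assumptions, the 2048th 1MiB write starts at offset 2047MiB and is shortened to 1048575 bytes, and the next write starts exactly at the limit and fails with EFBIG ("File too large").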
Perhaps this is a barnacle from an older OpenSolaris release. The last OpenSolaris release does not have this limit:
$ uname -a
$ isainfo -k -v
$ dd if=/dev/zero of=zero bs=1M
I moved a ZFS filesystem from a 64-bit system into a 32-bit system and large files were readable past 2GB and sizes were properly reported.

Consider this pathology: A large database is migrated onto ZFS for Linux. The system has a bias for read queries, so the system runs properly for a long while, but faults when the first write goes past 2GB. This would be frustrating to diagnose and difficult to reverse.

Regarding the ifdef for MAXOFFSET_T in libspl/include/sys/param.h: The Solaris Porting Guide says that _LP64 implies 64-bit pointers, but not necessarily a 64-bit word size. Discussion regarding ILP32/ILP64 on the GCC help list agrees that using it to check word size can get unexpected results. I recall that _LP64 has been the system default since Solaris 9, so maybe it should be a compile error if it is undefined on Linux.

The onnv-gate defines the maximum offset using long long as an alternative like this:
#define MAXBSIZE 8192
Interesting, well I'm all for getting these constants defined correctly and you make a good case. Your explanation also nicely explains why this isn't an issue on 32-bit Solaris boxes as I'd assumed.

To test the fix you're going to need to update the defines in two places. The kernel modules pick up this define from the spl in include/sys/sysmacro.h. Around line 69 MAXOFFSET_T is defined and it is simply based on the definition of _LP64. If you want to factor in the LONGLONG_TYPE define to set MAXOFFSET_T more precisely you'll need to make sure that also gets defined properly. Currently it will never be set because I'm using a stripped down include/sys/isa_defs.h file. You'll need to define it as appropriate for the available architectures (x86, x86_64, powerpc).

You will then need to do something very similar in the zfs libspl/include/sys/param.h header. This header is used for the user space build of zfs for things like ztest and the command line utilities. After your change a little testing should show if things are working as expected.
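The two-header change described above could take roughly the following shape. This is only a sketch under stated assumptions: `SKETCH_LONGLONG_TYPE` and `SKETCH_MAXOFFSET_T` are hypothetical stand-ins for the macros that isa_defs.h and sysmacro.h / param.h would provide, and the text is not copied from the SPL or onnv-gate headers.

```c
/*
 * Hypothetical stand-in for the long long capability macro that a
 * complete isa_defs.h would set per architecture. Here it is simply
 * assumed for GCC and Clang, which support long long on x86,
 * x86_64, and powerpc.
 */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_LONGLONG_TYPE 1
#endif

/*
 * Pick the maximum file offset. A 64-bit long is used when _LP64 is
 * defined; otherwise long long is used if available; only a build
 * with neither falls back to the 2GiB limit.
 */
#if defined(_LP64)
#define SKETCH_MAXOFFSET_T 0x7fffffffffffffffL
#elif defined(SKETCH_LONGLONG_TYPE)
#define SKETCH_MAXOFFSET_T 0x7fffffffffffffffLL
#else
#define SKETCH_MAXOFFSET_T 0x7fffffffL
#endif

/* Small accessor so the chosen value can be inspected at run time. */
long long sketch_max_offset(void)
{
    return SKETCH_MAXOFFSET_T;
}
```

The same selection logic would need to appear in both the kernel-side and user space headers so that the modules and tools like ztest agree on the limit.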
Giving a name to this enum makes it discoverable from debugging tools like DRGN and SDB. For example, with the name proposed on this patch we can iterate over these values in DRGN:
```
>>> prog.type('enum kmc_bit').enumerators
(('KMC_BIT_NOTOUCH', 0), ('KMC_BIT_NODEBUG', 1), ('KMC_BIT_NOMAGAZINE', 2), ('KMC_BIT_NOHASH', 3), ('KMC_BIT_QCACHE', 4), ('KMC_BIT_KMEM', 5), ('KMC_BIT_VMEM', 6), ('KMC_BIT_SLAB', 7), ...
```
This enables SDB to easily pretty-print the flags of the spl_kmem_caches in the system like this:
```
> spl_kmem_caches -o "name,flags,total_memory"
name                     flags                   total_memory
------------------------ ----------------------- ------------
abd_t                    KMC_NOMAGAZINE|KMC_SLAB        4.5MB
arc_buf_hdr_t_full       KMC_NOMAGAZINE|KMC_SLAB       12.3MB
... <cropped> ...
ddt_cache                KMC_VMEM                     583.7KB
ddt_entry_cache          KMC_NOMAGAZINE|KMC_SLAB         0.0B
... <cropped> ...
zio_buf_1048576          KMC_NODEBUG|KMC_VMEM            0.0B
... <cropped> ...
```
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes openzfs#9478
The native ZFS filesystem for Linux has a 2GB file size limit on 32-bit hosts.
For example:
$ uname -a
Linux ubuntu-virtual-machine 2.6.35-27-generic #48-Ubuntu SMP Tue Feb 22 20:25:29 UTC 2011 i686 GNU/Linux
$ dd if=/dev/zero of=/tank/zero bs=1M
dd: writing `/tank/zero': File too large
2048+0 records in
2047+0 records out
2147483647 bytes (2.1 GB) copied, 16.5788 s, 130 MB/s