ZFS has a 2GB maximum file size on 32-bit hosts #136

Closed
dajhorn opened this issue Mar 2, 2011 · 5 comments
Labels
Type: Feature Feature request or new feature

@dajhorn
Contributor

dajhorn commented Mar 2, 2011

The native ZFS filesystem for Linux has a 2GB file size limit on 32-bit hosts.

For example:

$ uname -a
Linux ubuntu-virtual-machine 2.6.35-27-generic #48-Ubuntu SMP Tue Feb 22 20:25:29 UTC 2011 i686 GNU/Linux

$ dd if=/dev/zero of=/tank/zero bs=1M
dd: writing `/tank/zero': File too large
2048+0 records in
2047+0 records out
2147483647 bytes (2.1 GB) copied, 16.5788 s, 130 MB/s

@behlendorf
Contributor

On 64-bit kernels, Linux automatically sets O_LARGEFILE for you at open time, which allows you to create >2GB files. On 32-bit kernels this doesn't happen. Can you try explicitly setting O_LARGEFILE and see if that solves the issue?

SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, int, mode)
{
        long ret;

        if (force_o_largefile())
                flags |= O_LARGEFILE;

        ret = do_sys_open(AT_FDCWD, filename, flags, mode);
        /* avoid REGPARM breakage on x86: */
        asmlinkage_protect(3, ret, filename, flags, mode);
        return ret;
}

@behlendorf
Contributor

The real root cause here is MAXOFFSET_T, which is defined as 0x7fffffffl (2GiB) on 32-bit systems. This limit is enforced by zfs at the top of zfs_write() with no regard for the Linux flag O_LARGEFILE, which dd is passing correctly. So this really should be a limit on 32-bit OpenSolaris systems as well.

I don't think there is any harm in relaxing this limit when O_LARGEFILE is set but since this is a 32-bit issue I'm going to hold off on this. Making sure this is safe would require a fair bit of code inspection and testing. That can wait until the 64-bit version is working well.
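
To make the reported behaviour concrete, here is a minimal sketch of how an offset limit of this kind produces the dd output above (2047 full records plus one short write, then "File too large"). This is a simplified model, not the actual zfs_write() code; the function name `clamp_write` and the macro `LIMIT_32BIT` are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Value of MAXOFFSET_T on a 32-bit build, as quoted in this thread. */
#define LIMIT_32BIT 0x7fffffffLL

/*
 * Hypothetical model of a per-file offset limit:
 * returns the number of bytes the write may proceed with,
 * or -EFBIG when the starting offset is already at the limit.
 */
int64_t clamp_write(int64_t offset, int64_t len, int64_t limit)
{
        if (offset >= limit)
                return -EFBIG;          /* dd reports "File too large" */
        if (len > limit - offset)
                len = limit - offset;   /* short write up to the limit */
        return len;
}
```

With bs=1M, the 2048th record starts at offset 2047 MiB and is truncated to one byte short of 1 MiB, which matches the 2147483647 bytes that dd reported; the next write starts at the limit and fails.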

@dajhorn
Contributor Author

dajhorn commented Mar 11, 2011

Perhaps this is a barnacle from an older OpenSolaris release. The last OpenSolaris release does not have this limit:

$ uname -a
SunOS opensolaris 5.11 snv_134 i86pc i386 i86pc Solaris

$ isainfo -k -v
32-bit i386 kernel modules

$ dd if=/dev/zero of=zero bs=1M
dd: writing `zero': No space left on device
11446+0 records in
11445+0 records out
12001607680 bytes (12 GB) copied, 217.241 s, 55.2 MB/s

@dajhorn
Contributor Author

dajhorn commented Mar 11, 2011

I moved a ZFS filesystem from a 64-bit system into a 32-bit system and large files were readable past 2GB and sizes were properly reported.

Consider this pathology: A large database is migrated onto ZFS for Linux. The system has a bias for read queries, so the system runs properly for a long while, but faults when the first write goes past 2GB. This would be frustrating to diagnose and difficult to reverse.

Regarding the ifdef for MAXOFFSET_T in libspl/include/sys/param.h:

The Solaris Porting Guide says that _LP64 implies 64-bit pointers, but not necessarily a 64-bit word size. Discussion regarding ILP32/ILP64 on the GCC help list agrees that using it to check word size can get unexpected results.

I recall that _LP64 has been the system default since Solaris 9, so maybe it should be a compile error if it is undefined on Linux.
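
As a quick way to see what a given toolchain actually does, a small probe along these lines can be compiled on each target. It is only a sketch: the function names are hypothetical, and it relies on GCC and Clang predefining _LP64 on LP64 targets.

```c
#include <stddef.h>

/* Returns 1 when the compiler predefines _LP64, otherwise 0. */
int built_with_lp64(void)
{
#ifdef _LP64
        return 1;
#else
        return 0;
#endif
}

/* Width of long and of a data pointer, in bytes, for this build. */
size_t long_bytes(void)    { return sizeof(long); }
size_t pointer_bytes(void) { return sizeof(void *); }
```

On a 32-bit x86 Linux build one would expect built_with_lp64() to return 0 with both sizes equal to 4, which is the case where MAXOFFSET_T falls back to the smaller value.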

The onnv-gate defines the maximum offset using long-long as an alternative like this:

#define MAXBSIZE 8192
#define DEV_BSIZE 512
#define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */
#define MAXFRAG 8
#ifdef _SYSCALL32
#define MAXOFF32_T 0x7fffffff
#endif
#ifdef _LP64
#define MAXOFF_T 0x7fffffffffffffffl
#define MAXOFFSET_T 0x7fffffffffffffffl
#else
#define MAXOFF_T 0x7fffffffl
#ifdef LONGLONG_TYPE
#define MAXOFFSET_T 0x7fffffffffffffffLL
#else
#define MAXOFFSET_T 0x7fffffff
#endif
#endif /* _LP64 */
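
The effect of the long-long fallback can be checked in isolation with a sketch like the following. The `SKETCH_` prefix and the helper functions are hypothetical names, not part of onnv-gate or SPL, and the sketch assumes a C11 compiler with a 64-bit `long long`.

```c
#include <limits.h>

/* Hypothetical stand-in for MAXOFFSET_T; not the real header. */
#ifdef _LP64
#define SKETCH_MAXOFFSET_T 0x7fffffffffffffffL   /* long is 64 bits wide */
#else
#define SKETCH_MAXOFFSET_T 0x7fffffffffffffffLL  /* fall back to long long */
#endif

/* Fail the build early if long long cannot hold a 64-bit offset. */
_Static_assert(sizeof(long long) >= 8, "long long must be at least 64 bits");

/* Largest offset this sketch would allow, independent of pointer width. */
long long sketch_max_offset(void)
{
        return SKETCH_MAXOFFSET_T;
}

/* Returns 1 when the limit is above the 2GiB boundary seen in this issue. */
int exceeds_2gib(void)
{
        return sketch_max_offset() > 0x7fffffffLL;
}
```

Either branch yields the same 64-bit maximum, so a 32-bit build would no longer stop at 0x7fffffff as long as the long-long branch is actually selected.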

@behlendorf
Contributor

Interesting, well I'm all for getting these constants defined correctly and you make a good case. Your explanation also nicely explains why this isn't an issue on 32-bit Solaris boxes as I'd assumed. To test the fix you're going to need to update the defines in two places.

The kernel modules pick up this define from the spl in include/sys/sysmacro.h. Around line 69 MAXOFFSET_T is defined and it is simply based on the definition of _LP64. If you want to factor in the LONGLONG_TYPE define to set MAXOFFSET_T more precisely you'll need to make sure that also gets defined properly. Currently it will never be set because I'm using a stripped down include/sys/isa_defs.h file. You'll need to define it as appropriate for the available architectures (x86, x86_64, powerpc).

You will then need to do something very similar in the zfs libspl/include/sys/param.h header. This header is used for the user space build of zfs for things like ztest and the command line utilities.

After your change a little testing should show if things are working as expected.

ahrens pushed a commit to ahrens/zfs that referenced this issue Apr 6, 2020
Giving a name to this enum makes it discoverable from
debugging tools like DRGN and SDB. For example, with
the name proposed on this patch we can iterate over
these values in DRGN:
```
>>> prog.type('enum kmc_bit').enumerators
(('KMC_BIT_NOTOUCH', 0), ('KMC_BIT_NODEBUG', 1),
('KMC_BIT_NOMAGAZINE', 2), ('KMC_BIT_NOHASH', 3),
('KMC_BIT_QCACHE', 4), ('KMC_BIT_KMEM', 5),
('KMC_BIT_VMEM', 6), ('KMC_BIT_SLAB', 7),
...
```
This enables SDB to easily pretty-print the flags of
the spl_kmem_caches in the system like this:
```
> spl_kmem_caches -o "name,flags,total_memory"
name                                       flags total_memory
------------------------ ----------------------- ------------
abd_t                    KMC_NOMAGAZINE|KMC_SLAB        4.5MB
arc_buf_hdr_t_full       KMC_NOMAGAZINE|KMC_SLAB       12.3MB
... <cropped> ...
ddt_cache                               KMC_VMEM      583.7KB
ddt_entry_cache          KMC_NOMAGAZINE|KMC_SLAB         0.0B
... <cropped> ...
zio_buf_1048576             KMC_NODEBUG|KMC_VMEM         0.0B
... <cropped> ...
```

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes openzfs#9478