Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZFS features support #778

Closed
mmatuska opened this issue Jun 7, 2012 · 25 comments
Closed

ZFS features support #778

mmatuska opened this issue Jun 7, 2012 · 25 comments
Labels
Type: Feature Feature request or new feature
Milestone

Comments

@mmatuska
Copy link
Contributor

mmatuska commented Jun 7, 2012

Illumos has recently introduced a new concept called ZFS features support. This way new functionality can be added to ZFS pools independently providing or not providing backward compatibility. I am integrating this support into FreeBSD.

You can read more about the concept here:
http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011568.html

Illumos changesets:
https://hg.openindiana.org/upstream/illumos/illumos-gate/rev/2889e2596bd6
https://hg.openindiana.org/upstream/illumos/illumos-gate/rev/1949b688d5fb

@behlendorf
Copy link
Contributor

@mmatuska Thanks for the references, yes I've been following the feature flag work done by Delphix for a while now. We absolutely plan to integrate those changes since we already have a few changes in out tree which probably warrant the use of a feature flag.

@Vilbrekin
Copy link

Hi,
I do second this feature request, as it's currently impossible to import pools created on the latest version of OI.
Rgds,
V

@behlendorf
Copy link
Contributor

Presumably the OI pools fail to import because they claim zpool/zfs version 1000?

@Vilbrekin
Copy link

Yep, 5000 if I'm correct. Would be really nice to have a linux alternative.

@behlendorf
Copy link
Contributor

Yup, we'll get there. In the meanwhile I'd suggest explicitly creating pools with version 28 if you want to use them on Linux. Incidentally, your OI pools may never work on Solaris since they might not adopt any of the feature flag changes.

@mmatuska
Copy link
Contributor Author

Pull request #908 is one of the last milestones before feature flags support.
With that code committed, the feature flag diffs have only very few conflicts.

@behlendorf
Copy link
Contributor

@mmatuska Alright, #908 has been merged

@csiden
Copy link

csiden commented Sep 26, 2012

You probably also eventually want:
https://hg.openindiana.org/upstream/illumos/illumos-gate/rev/2aba784c276b

Without it you need to manually enable individual features through their properties, zpool upgrade won't do it. It also adds more detail to the zpool status and zpool import output.

@dechamps
Copy link
Contributor

This is blocking #939.

@ryao
Copy link
Contributor

ryao commented Sep 30, 2012

@dechamps It doesn't appear that this is blocking #939. It is true that the diffs themselves depend on this, but nothing prevents them from being reworked to permit their inclusion ahead of this.

@behlendorf Will pool version 5000 be used by default when creating new pools or will users be required to explicitly request feature flag support as we do with the xattr extension? As you stated, this will break Solaris compatibility. My plan for Gentoo is to recommend the use of pool version 28 unless people have an explicit need for extensions.

@behlendorf
Copy link
Contributor

@ryao My inclination is to go slow initially and leave the default pool version at 28. As we gain more confidence in the implementation we can consider making it the default. We're also going to need to update grub to understand feature flags.

@ryao
Copy link
Contributor

ryao commented Dec 14, 2012

Illumos GRUB already supports feature flags. The ZFS code in Illumos GRUB and GRUB2 is fairly similar, so it should be possible to port feature flag support from Illumos GRUB to GRUB2.

I do not plan to be the one to do that port. Most Gentoo Linux users prefer GRUB 0.97 while most Gentoo FreeBSD users prefer FreeBSD's bootloader. Anything that I do in the area of bootloaders is generally focused on one of those two. I am gradually developing port of Illumos GRUB to Gentoo Linux in ryao/grub-illumos. After that is done, I want to modify FreeBSD's bootloader to boot Gentoo Linux.

@lundman
Copy link
Contributor

lundman commented Dec 14, 2012

I did hear a rumour that FBSD's grub can also boot more advanced vdev setups. Ie, not only single and mirrors, but raidz? That would be desirable to port as well.

Has IllumOS discussed more broadly how the feature-flags would work, for example with possible future crypto support? Their own specs say all features should be enabled by default, which seems undesirable in this (and possibly other) case. Especially since encryption is optional on each ZFS "filesystem" and might not be in use at all in the pool. But having it enabled stops importing pools on other systems.

@csiden
Copy link

csiden commented Dec 14, 2012

@behlendorf @ryao Is the intention to eventually switch to enabling all features by default (on new pools) once you're more comfortable with feature flags? The implementation in illumos is the way it is because the average user won't want to worry about every available feature flag, these won't all be exciting things like encryption support or new compression algorithms, for example feature@empty_bpobj is just a minor modification to the way we create snapshots that wastes less space/memory. Ideally we wouldn't expose feature@empty_bpobj to the user at all, but we need to in case the user cares about the portability of the pool. The average user though won't care if their pool will work on FreeBSD or illumos and just wants the newest functionality to work for them. The one piece of compatibility such a user would care about is "I just upgraded to a newer Linux/Gentoo/ZFS release and it's not working for me, I want to go back to the older release that doesn't support all the same ZFS feature flags", but that case is taken care of since new features are not automatically enabled during software upgrades. The user has to run 'zpool upgrade mypool' to enable all new features (but the user doesn't need to worry what the features are), which they can do once they are satisfied they won't want to downgrade.

@lundman I wrote this mostly in response to your question, although your answer got buried near the bottom:

Feature flags are meant as a replacement for the old zpool disk-format version number (the one that went from 1-28 and Oracle has continued in to the 30's) and should work in a very similar way. So for comparison:

With the version number, when new pools are created they automatically have the latest supported version number, unless the user specifies a lower version (zpool create -o version=20 ...). With feature flags, when a new pool is created all features are enabled, unless a pre-feature flags version number is selected (zpool create -o version=20 ...) or the feature is explicitly marked as disabled (zpool create -o feature@async_destroy=disabled ...).

With the version number, If a pool was created with an older software version and the software has since been upgraded to support a newer version number, the pool still retains the old version number and attempts to use features of ZFS on a pool whose on-disk version does not support them results in some ENOTSUP-like error. For example trying to do:

$ zpool create -o version=20 mypool
$ zfs create -o dedup=on mypool/myfs

You get an error:

cannot create 'mypool/myfs': pool must be upgraded to set this property or value

Because, while the software supports dedup, we could not write the dedup table to disk and still be compatible with on-disk versions prior to 21. Similarly with feature flags, if you upgrade the software to support a new feature that was not supported when a pool was first created that feature will remain disabled. If encryption were implemented in the future it would be done similarly to dedup, there would be a pool-wide feature flag to indicate that the on-disk format has changed, and a per-filesystem property to actually enable encryption on data, if you tried to set the filesystem property before the feature flag was enabled on the pool you would get a similar error to the one we got above for dedup.

Total side note:
There are also some big differences between the old version number and feature flags (other than the obvious one were version numbers imply a dependency between on-disk format changes even if there isn't one, while feature flags allows on-disk changes to be developed, ported, and enabled independently of each other). Mostly the differences deal with making feature flags more flexible than version numbers, pretty much everything described in https://github.com/illumos/illumos-gate/blob/master/usr/src/man/man5/zpool-features.5 isn't supported by version numbers.

@lundman
Copy link
Contributor

lundman commented Dec 14, 2012

@csiden I must have missed your reply earlier, my apologies.

I was more concerned that, in the future, if you create a default pool, be it on FBSD, IllumOS or ZOL, then most combinations are ok, but should ZOL have crypto support, FBSD and IllumOS can no longer import the pools. Even though the user may never have used crypto, or care to use it ever.

It would seem a better solution that some options, like that of crypto is set to "available", and only once the first ZFS is created with encryption=on, is it changed to "in use". Then default pools can be imported, until such a time that the user chooses to use an option that makes it incompatible.

@csiden
Copy link

csiden commented Dec 14, 2012

@lundman What you described is supported by the framework. There are actually 3 states a feature can be in: disabled, enabled, and active. 'enabled' actually just means "the administrator has indicated that ZFS is allowed to make this on-disk format change". 'active' means the format change has been made. A system that does not support feature@a can still import a pool that has that feature 'enabled' as long as it is not 'active'. In fact if the on-disk format is (very carefully and perfectly) undone (e.g. all file systems with crypto enabled are destroyed and any other metadata is cleaned up) the framework supports moving the feature back to the 'enabled' state.

Of course, how easy it is to move a feature between the 'enabled' and 'active' depends on the on-disk format change being made. For example, the on disk-format change for feature@async_destroy was to add a new piece of metadata containing a list of file systems that are being asynchronously destroyed. While we have file systems on this list feature@async_destroy=active, but once the list has been emptied we destroy all traces of this new piece of metadata and feature@async_destroy=enabled (in fact at this point we could support setting the feature back to disabled, that just hasn't been implemented).

feature@empty_bpobj on the other hand changes the way all snapshots and file systems are represented, so the first time you create a new snapshot or file system after enabling the feature it becomes 'active' and does not return to the 'enabled' state unless you destroy every file system and snapshot created after the feature became active (so basically it will be 'active' forever).

See https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/fs/zfs/zfeature.c for a giant comment describing how this is implemented.

@lundman
Copy link
Contributor

lundman commented Dec 14, 2012

Ah excellent. I guess it came down to my misunderstanding of "enabled" in this case, as to me it implied it was already on, therefor not importable.

Cheers

@behlendorf
Copy link
Contributor

@csiden Yes, I absolutely want to make feature flags the default behavior. But before doing that I want to get comfortable with it and see if there are going to be any problems. My plan is to merge this code in to our next tag (rc13) but leave the default version set to v28. Then the most interested members of the ZoL community can kick the tires a bit and verify the code works as expected. If no major issues come up and we're able to resolve any minor ones we'll make it the default in our next tag (rc14).

Related to this I've been porting the patches over and have found a few issues. Well, more accurately, gcc found several issues which I've fixed. These are aside from the changes needed to make it C90 compliant for the Linux kernel and to silence all the various gcc warnings we leave enabled. You'll probably want to make these fixes as well.

  • The enum spa_feature { ... } spa_feature_t definition is missing its typedef keyword.
  • In zpool_prop_get_feature() the first ASSERT() should check zpool_prop_unsupported() not zfs_prop_unsupported(). I don't see how it could build currently on Illumos with debugging enabled.
  • The new fnvpair_value_uint8_t() and fnvlist_lookup_uint8_t() wrapper functions shouldn't have that trailing _t in their names.

@ryao
Copy link
Contributor

ryao commented Dec 15, 2012

@lundman FreeBSD does not use GRUB. It uses its own bootloader. It does have GRUB available in its ports tree, but that requires manual installation.

@csiden My plan for Gentoo is to keep the default as v28 to maintain compatibility with other platforms for the foreseeable future. Users that require features available through feature flags will be able to upgrade their pools to make use of them.

@lundman
Copy link
Contributor

lundman commented Dec 17, 2012

I tested creating my own feature, and can confirm that it does indeed work as advertised, so I no longer have any concerns about features:

# zpool create -f mypool ~/src/pool-image.bin
# zpool get feature@roger
mypool  feature@roger  enabled                local
# zfs create -o roger=on mypool/BOOM
# zpool get feature@roger
mypool  feature@roger  active                 local
# zfs destroy mypool/BOOM
# zpool get feature@roger
mypool  feature@roger  enabled                local

Where I call spa_feature_incr() and spa_feature_decr() in dsl_dataset_create, and dsl_dataset_destroy, respectively.

Creating three pools, A, B and C. A is just created and exported. B has roger set, and C has roger set then destroyed.

ZOL feature-flags agree at import:

   pool: poolA
     id: 13160178373042122073
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.

   pool: poolB
     id: 6090834749004568854
  state: UNAVAIL
status: The pool can only be accessed in read-only mode on this system. It
        cannot be accessed in read-write mode because it uses the following
        feature(s) not supported on this system:
        net.lundman:roger

   pool: poolC
     id: 10054371123916758958
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.

However, the changes involved does make this change:

    DMU_OT_BPOBJ_SUBOBJ,        /* UINT64 */

+      DMU_OT_DSL_KEYCHAIN,            /* ZAP */
    /*
     * Do not allocate new object types here. Doing so makes the on-disk
     * format incompatible with any other format that uses the same object
     * type number.

Do I have to find a new compatible way to support "DMU_OT_DSL_KEYCHAIN", or does it talk about "types" only?

@behlendorf
Copy link
Contributor

@lundman The compatible way going forward to avoid is to use the DMU_OTN_ types as described by the comment. So in your case you'll want to use DMU_OTN_ZAP_DATA or DMU_OTN_ZAP_METADATA.

@lundman
Copy link
Contributor

lundman commented Dec 19, 2012

Hmm nope, its still Greek to me. So I understand that not defining "DMU_OT_DSL_KEYCHAIN" is the better way (although, I do notice I can import such a pool into regular ZOL anyway?) and I should use DMU_OTN_ZAP_*

but well, how? I don't think we have any examples on how to use DMU_OTN_ZAP_*, does it go into the same structure in dmu.h, or is it more of a dynamic list maintained by (as yet written) source ?

Is that what it does with this code?

        spa->spa_feat_for_read_obj = zap_create_link(spa->spa_meta_objset,
            DMU_OTN_ZAP_METADATA, DMU_POOL_DIRECTORY_OBJECT,
            DMU_POOL_FEATURES_FOR_READ, tx);

@csiden
Copy link

csiden commented Dec 19, 2012

@lundman When you create the object you should use DMU_OTN_ZAP_METADATA instead of DMU_OT_DSL_KEYCHAIN. Basically we shouldn't create new dmu object types and all future objects should use one of the generic DMU_OTN_* types. This was done because otherwise every implementation of ZFS would need to agree on what each number means, and if two implementations used the same number for two different types they would be incompatible forever. We considered a solution similar to the one we're working on for checksum and compression algorithm numbers https://www.illumos.org/issues/2761 but the use of object type numbers is much more wide-spread throughout the code base, making them much harder to indirect. Luckily it turns out object type numbers are really only used for:

  1. Debugging (i.e. "what type is object 23")
  2. Determining which byteswap function to use
  3. Determining if the block is data or metadata, which determines how important it is to have multiple copies of it
  4. There are a few places in the code that look at the object type for other reasons, but with the exception of a few core ZFS routines they could have gotten the same information through other means.

So basically we're giving up 1 and 4 on new objects so that nobody needs to coordinate on what type numbers they are using.

@behlendorf Thanks for the tips about the warnings, I'll look in to what's going on there.

@behlendorf
Copy link
Contributor

Merged. After a lot of thought I've opted to leave the default pool version as 5000.

c1cdd99 Merge branch 'feature-flags'
1eb5bfa Illumos #3145, #3212
753c383 Illumos #3104: eliminate empty bpobjs
9157970 Fix __zio_execute() asynchronous dispatch
ea0b253 Illumos #3349: zpool upgrade -V bumps the on disk version number
29809a6 Illumos #3086: unnecessarily setting DS_FLAG_INCONSISTENT on async
b9b24bb Illumos #2762: zpool command should have better support for feature flag
3bc7e0f Illumos #3090 and #3102
5ac0c30 Revert "Temporarily disable the reguid test."
9ae529e Illumos #2619 and #2747

@imp
Copy link
Contributor

imp commented Jan 8, 2013

Oh, happy day !

pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023
Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.9.0 to 0.10.0.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](jeromefroe/lru-rs@0.9.0...0.10.0)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Feature Feature request or new feature
Projects
None yet
Development

No branches or pull requests

8 participants