Flexible Volgen (Old)
NOTE: The information on this page may be outdated. This is archived for historical reference.
Volgen is the part of GlusterD that generates the volume graphs for a GlusterFS volume and other related services like the Gluster-NFS server, self-heal daemon etc.
The volgen code in GlusterD is hard-coded in the source, in the form of tables and special functions for services. To add a new translator to the GlusterFS graph, a developer has to modify GlusterD code. This is not particularly easy, as the current code is hard to understand.
In addition, GlusterD has a hard-coded volume options table, which is used by volgen to set volume options on the right translators and by the 'volume set' command. As a result, a developer must modify GlusterD code even to add a simple new option to a translator.
The volgen in GD2 needs the following characteristics:
- No code changes must be required in GlusterD to add a new translator into the volume graph
- Provide a way to set dependencies between translators, i.e. to set up the ordering of translators
- The volume graph structure should be easily modifiable
- (additionally) Adding a new option to a translator shouldn't require changes to GlusterD
These solutions were discussed among a bunch of GlusterFS developers who were gathered in Brno during DevConf.cz 2016. Details were originally captured in this etherpad.
- Init system style
- Systemd-units style
- Template volfile
- Filters
GD2 will provide a directory into which translators can drop their config files. The config files are text files which contain information about the translators. The files will be prefixed with numbers, which define the order of the translators in the graph. GD2 will read this directory, sort the file list and use that order as the volume graph order.
- New translators can easily drop in their config files
- Translator options can be added in the config files
- Deciding on proper prefixes will not be an easy task
- Not suitable for building asymmetric graphs.
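The ordering step of the init-style approach can be sketched as follows. This is only an illustration: the `NN-name.conf` naming scheme and the `graph_order` helper are assumptions made for the example, not something GD2 defines.

```python
import re

# Hypothetical naming scheme: "<number>-<translator>.conf".
PREFIXED = re.compile(r"^(\d+)-(.+)\.conf$")


def graph_order(filenames):
    """Return translator names sorted by the numeric prefix of their config file."""
    entries = []
    for fname in filenames:
        match = PREFIXED.match(fname)
        if match is None:
            continue  # ignore files that do not follow the assumed scheme
        entries.append((int(match.group(1)), match.group(2)))
    # Compare prefixes as integers so that "9-" sorts before "10-".
    return [name for _, name in sorted(entries)]


files = ["50-dht.conf", "10-fuse.conf", "70-afr.conf", "README", "30-io-cache.conf"]
print(graph_order(files))
```

The sketch also shows why choosing prefixes is awkward: a translator that must sit between two existing ones needs a free number between their prefixes, or the existing files have to be renumbered.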
Similar to the above init style, but instead of using filename prefixes to specify ordering, the config files will contain systemd-style requirements, i.e. Before and After. GD2 will read these config files and build a graph.
- Same as init style
- Need to write a robust dependency solving graph builder
- Not suitable for building asymmetric graphs
A template volfile just gives the order of the translators. GD2 will use this template to generate actual volfiles.
- Easy to implement. Lots of text template packages available which will make this easier.
- New translators will need to modify the template, which is relatively more complex than adding a new config file
- Will require an alternate method to specify translator options
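A rough sketch of the template idea, assuming the template is nothing more than a list of translator types ordered from the top of the graph down. The `TEMPLATE` contents, the naming of the generated volumes and the `render_volfile` helper are all hypothetical, and only a linear (one child per translator) graph is handled.

```python
# Hypothetical template: translator types listed from the top of the graph down.
TEMPLATE = ["performance/io-cache", "cluster/distribute", "protocol/client"]


def render_volfile(volname, template):
    """Render a linear volfile; each translator's subvolume is the entry below it."""
    names = [f"{volname}-{xl.split('/')[-1]}" for xl in template]
    blocks = []
    # Emit children before the parents that reference them,
    # so walk the template from the bottom up.
    for i in reversed(range(len(template))):
        lines = [f"volume {names[i]}", f"    type {template[i]}"]
        if i + 1 < len(template):
            lines.append(f"    subvolumes {names[i + 1]}")
        lines.append("end-volume")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


print(render_volfile("test", TEMPLATE))
```

Note that the sketch has no place to put translator options, which matches the last drawback listed above.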
A filter is a script/executable which given a generated volfile and other volume information, will return a modified volfile. GD2 will generate a volfile based on some internal logic, and pass it through the filters, one after another. The final returned volfile will be used.
- Already done for the HekaFS project. Python libraries exist to do this.
- New translators can just drop in their filters
- (Possibly) Harder to use filters to add new cluster translators (translators with more than 1 child)
- Still doesn't solve the problem of hardcoded graph in GD2.
- Need an alternative method for translator options
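The filter chain can be sketched like this. In the design a filter is a script or executable; here filters are modelled as plain Python callables purely to keep the example self-contained, and `set_option`, `apply_filters` and the sample volfile are hypothetical.

```python
def set_option(xl_type, key, value):
    """Build a filter that adds `option <key> <value>` to translators of xl_type."""
    def _filter(volfile, volinfo):
        # volinfo is unused here but kept to match the filter interface:
        # a filter receives the volfile plus other volume information.
        out = []
        for line in volfile.splitlines():
            out.append(line)
            # Insert the option right after the matching "type" line.
            if line.strip() == f"type {xl_type}":
                out.append(f"    option {key} {value}")
        return "\n".join(out) + "\n"
    return _filter


def apply_filters(volfile, volinfo, filters):
    """Pass the volfile through each filter in turn; the last result is used."""
    for flt in filters:
        volfile = flt(volfile, volinfo)
    return volfile


# Illustrative input; a real volfile would come from GD2's internal logic.
base = (
    "volume test-io-cache\n"
    "    type performance/io-cache\n"
    "    subvolumes test-dht\n"
    "end-volume\n"
)
filters = [set_option("performance/io-cache", "cache-size", "64MB")]
final = apply_filters(base, {"name": "test"}, filters)
print(final)
```

Because every filter sees only the output of the previous one, the order in which filters run still has to be decided somewhere, and the starting graph remains hard-coded.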
In the initial discussion, we agreed that the SysVinit-style approach was a flexible and easy way to solve the problem. However, after later discussions with Csaba, we concluded that ordering based on prefixes is not reliable. The systemd-units style solution is a better choice, even though it is more complex to implement.
An example of a possible config file:
    name: dht
    default: yes
    options:
      - option: dht.subvols-per-directory
        type: int
        max:
        min:
        description:
        default:
      - option: dht.min-free-disk
        type: percent or size
        max:
        min:
        description:
        default:
    childcount: "distribute"
    before: afr stripe ec
    after: performance fuse
- name: The name of the translator. This is the base key on which GD2 operates.
- default: GD2 will search the volume options for <name>.enabled to decide whether this translator should be enabled. If the option is not present, the default value is used.
- options: A list of translator options. The options should be prefixed with name. GD2 will provide simple validation for a fixed set of option types.
- childcount: The number of children this translator expects. This will be a key that GD2 can look up in the volume info to find the number of children, as this differs from volume to volume.
- before: The translators that should not be parents of this translator
- after: The translators that should only be parents to this translator
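How GD2 might consume the `default` and `childcount` fields can be sketched as below. The dictionary shapes, the accepted truthy spellings and the helper names `is_enabled` and `child_count` are assumptions made for the example.

```python
# Assumed spellings of "enabled"; the design does not list the accepted values.
TRUTHY = {"yes", "on", "true", "1"}


def is_enabled(xl_conf, volume_options):
    """Look up <name>.enabled in the volume options, else use the config default."""
    key = f"{xl_conf['name']}.enabled"
    value = volume_options.get(key, xl_conf.get("default", "no"))
    return str(value).lower() in TRUTHY


def child_count(xl_conf, volinfo):
    """Resolve the childcount key against the volume info of one volume."""
    return int(volinfo[xl_conf["childcount"]])


# A parsed config file, reduced to the fields used here.
dht = {"name": "dht", "default": "yes", "childcount": "distribute"}

print(is_enabled(dht, {}))                      # falls back to the default
print(is_enabled(dht, {"dht.enabled": "no"}))   # the volume option overrides it
print(child_count(dht, {"distribute": 3}))
```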
- New translators will create their own config files and drop them in the directory specified by GD2
- GD2 will read in all the config files, solve dependencies and form a template graph
- Using the template graph, and the volume info of the volume, GD2 will generate the actual volume graph.
- GD2 will write out this graph to a volfile
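The dependency-solving step can be sketched with a topological sort over the `before` and `after` fields. This is a minimal sketch, not the GD2 implementation: it uses Python's standard `graphlib` module (Python 3.9 or later), treats the fields as lists of translator names, ignores names that have no config file, and produces only a linear order rather than a full graph. The helper `template_order` is hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def template_order(configs):
    """Order translators from the top of the graph down using before/after."""
    known = {c["name"] for c in configs}
    sorter = TopologicalSorter()
    for c in configs:
        name = c["name"]
        sorter.add(name)
        # "after": these translators are parents, so they must come first.
        for parent in c.get("after", []):
            if parent in known:
                sorter.add(name, parent)
        # "before": this translator must come ahead of these ones.
        for child in c.get("before", []):
            if child in known:
                sorter.add(child, name)
    # static_order() raises CycleError when the requirements contradict each other.
    return list(sorter.static_order())


configs = [
    {"name": "dht", "after": ["io-cache"], "before": ["afr"]},
    {"name": "afr"},
    {"name": "io-cache", "after": ["fuse"]},
    {"name": "fuse"},
]
print(template_order(configs))

try:
    template_order([{"name": "a", "before": ["b"]}, {"name": "b", "before": ["a"]}])
except CycleError:
    print("conflicting before/after requirements")
```

When the requirements do not pin down a single order, any order that satisfies them may be returned, so a real solver would need a tie-breaking rule to keep generated volfiles stable.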
GD2 will only generate the volume graph for bricks and fuse client. Most of the other services like the Gluster-NFS server, self-heal daemon, snapshot daemon etc. are just combinations of the basic brick and client graphs.
Generation of these graphs can be implemented separately from GD2. Some possible approaches to this are:
- The management of these service daemons could be handled as micro-services, to which GD2 will send notifications when it does a graph change, and the service will take care of regenerating its volfile.
- We could use the filter method to generate volfiles for these services.
When GD2 forms the template graph, it will also create a table of volume options which is a combination of all the options read from the config files. This table will be used to assign the correct options to the correct translator.
Requiring every option to be prefixed with its translator's name should ensure that no option is duplicated.
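A sketch of how such a table could be assembled from the parsed config files, assuming each config is a dictionary with `name` and `options` keys. The helper `build_option_table` and the `afr.example-option` entry are hypothetical.

```python
def build_option_table(configs):
    """Map each option name to the translator that owns it."""
    table = {}
    for conf in configs:
        prefix = conf["name"] + "."
        for opt in conf.get("options", []):
            key = opt["option"]
            # Enforce the naming rule: options carry their translator's prefix.
            if not key.startswith(prefix):
                raise ValueError(f"{key} is not prefixed with {prefix}")
            # A duplicate can then only come from within the same translator.
            if key in table:
                raise ValueError(f"{key} is defined more than once")
            table[key] = conf["name"]
    return table


configs = [
    {"name": "dht", "options": [{"option": "dht.min-free-disk"}]},
    {"name": "afr", "options": [{"option": "afr.example-option"}]},
]
table = build_option_table(configs)
print(table["dht.min-free-disk"])
```

With this table, a 'volume set' style request for a given key can be routed to the owning translator without any per-option code in GD2.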
A prototype implementation of this design has been started at kshlm/glusterd2-volgen. The implementation may have additions to and/or divergence from the design. This wiki will be kept updated with the changes to the design as quickly as possible.