From 46c6b6eda469e0f61116cd9c27b714b142919178 Mon Sep 17 00:00:00 2001 From: Mikhail Uvarov Date: Mon, 25 Sep 2023 19:18:52 +0200 Subject: [PATCH] =?UTF-8?q?=F0=9F=93=96=20Document=20CETS=20as=20an=20alte?= =?UTF-8?q?rnative=20to=20Mnesia?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add information to User’s Guide / Tutorials how to set up MIM with RDBMS + CETS. Update all docs relying on Mnesia configuration and commands like running_db_nodes to also show the CETS alternatives. Uses tabs in documentation. --- .../database-backends-configuration.md | 27 ++- doc/configuration/general.md | 3 +- ...uster-configuration-and-node-management.md | 206 +++++++++++++----- .../MongooseIM-metrics.md | 21 ++ doc/tutorials/CETS-configure.md | 36 +++ mkdocs.yml | 1 + priv/graphql/schemas/admin/cets.gql | 2 +- 7 files changed, 226 insertions(+), 70 deletions(-) create mode 100644 doc/tutorials/CETS-configure.md diff --git a/doc/configuration/database-backends-configuration.md b/doc/configuration/database-backends-configuration.md index 7a2c81617bd..3659126b7ac 100644 --- a/doc/configuration/database-backends-configuration.md +++ b/doc/configuration/database-backends-configuration.md @@ -21,14 +21,16 @@ Subsequent sections go into more depth on each database: what they are suitable Transient data: -* Mnesia - we highly recommend Mnesia (a highly available and distributed database) over Redis for storing **transient** data. - Being an Erlang-based database, it's the default persistence option for most modules in MongooseIM. - - !!! Warning - We **strongly recommend** keeping **persistent** data in an external DB (RDBMS) for production. - Mnesia is not suitable for the volumes of **persistent** data which some modules may require. - Sooner or later a migration will be needed which may be painful. - It is possible to store all data in Mnesia, but only for testing purposes, not for any serious deployments. 
+* CETS - a library that synchronises records in ETS tables between nodes. + A new option for sharing live data across the MongooseIM cluster. + We recommend this backend for transient data. + It requires a configured RDBMS database, because CETS uses an external database to discover the nodes in the cluster. + See the CETS config example in [tutorials](../tutorials/CETS-configure.md). + +* Mnesia - a built-in Erlang database. + Mnesia is fine for a cluster of fixed size with reliable networking between nodes and with nodes that are rarely restarted. + There are some issues when nodes restart or join the cluster, so we recommend using CETS instead. + Mnesia is still the default backend for modules, for config compatibility reasons. * Redis - A fantastic choice for storing live data. It's highly scalable and it can be easily shared by multiple MongooseIM nodes. @@ -38,6 +40,12 @@ Transient data: Persistent Data: + !!! Warning + We **strongly recommend** keeping **persistent** data in an external DB (RDBMS) for production. + Mnesia is not suitable for the volumes of **persistent** data which some modules may require. + Sooner or later a migration will be needed which may be painful. + It is possible to store all data in Mnesia, but only for testing purposes, not for any serious deployments. + * RDBMS - MongooseIM has a strong backend support for relational databases. Reliable and battle proven, they are a great choice for regular MongooseIM use cases and features like `privacy lists`, `vcards`, `roster`, `private storage`, `last activity` and `message archive`. Never loose your data. @@ -47,12 +55,13 @@ Persistent Data: * ElasticSearch - Only for MAM (Message Archive Management). +* Mnesia - some backends can store data in Mnesia, but this is not recommended.
+ User Data: * LDAP - Used for: users, shared rosters, vCards - ## RDBMS ### MySQL diff --git a/doc/configuration/general.md b/doc/configuration/general.md index cd712fb3aff..6e60fa9b83d 100644 --- a/doc/configuration/general.md +++ b/doc/configuration/general.md @@ -147,7 +147,8 @@ These options can be used to configure the way MongooseIM manages user sessions. * **Example:** `sm_backend = "redis"` Backend for storing user session data. All nodes in a cluster must have access to a complete session database. -Mnesia is sufficient in most cases, use Redis only in large deployments when you notice issues with the mnesia backend. Requires a redis pool with the `default` tag defined in the `outgoing_pools` section. +CETS is a new backend that requires a configured RDBMS to work properly. +Mnesia is a legacy backend that is sufficient in most cases; use Redis only in large deployments when you notice issues with the Mnesia backend. Redis requires a redis pool with the `default` tag defined in the `outgoing_pools` section. See the section about [redis connection setup](./outgoing-connections.md#redis-specific-options) for more information. !!! Warning diff --git a/doc/operation-and-maintenance/Cluster-configuration-and-node-management.md b/doc/operation-and-maintenance/Cluster-configuration-and-node-management.md index 89bae04b589..2d139de94c4 100644 --- a/doc/operation-and-maintenance/Cluster-configuration-and-node-management.md +++ b/doc/operation-and-maintenance/Cluster-configuration-and-node-management.md @@ -64,98 +64,186 @@ Checklist: - the same cookie across all nodes (`vm.args` `-setcookie` parameter) - each node should be able to ping other nodes using its sname (ex. `net_adm:ping('mongoose@localhost')`) +- an RDBMS backend is configured, so that CETS can discover nodes ### Initial node -There is no action required on the initial node. +=== "CETS" -Just start MongooseIM using `mongooseim start` or `mongooseim live`. + Clustering is automatic. There is no difference between nodes.
+ +=== "Mnesia" + + There is no action required on the initial node. + + Just start MongooseIM using `mongooseim start` or `mongooseim live`. ### New node - joining cluster +=== "CETS" -```bash -mongooseimctl start -mongooseimctl started #waits until MongooseIM starts -mongooseimctl join_cluster ClusterMember -``` + Clustering is automatic. -`ClusterMember` is the name of a running node set in `vm.args` file, for example `mongooseim@localhost`. -This node has to be part of the cluster we'd like to join. +=== "Mnesia" -First, MongooseIM will display a warning and a question if the operation should proceed: + ```bash + mongooseimctl start + mongooseimctl started #waits until MongooseIM starts + mongooseimctl join_cluster ClusterMember + ``` -```text -Warning. This will drop all current connections and will discard all persistent data from Mnesia. Do you want to continue? (yes/no) -``` + `ClusterMember` is the name of a running node set in `vm.args` file, for example `mongooseim@localhost`. + This node has to be part of the cluster we'd like to join. -If you type `yes` MongooseIM will start joining the cluster. -Successful output may look like the following: + First, MongooseIM will display a warning and a question if the operation should proceed: -```text -You have successfully joined the node mongooseim2@localhost to the cluster with node member mongooseim@localhost -``` + ```text + Warning. This will drop all current connections and will discard all persistent data from Mnesia. Do you want to continue? (yes/no) + ``` -In order to skip the question you can add option `-f` which will perform the action -without displaying the warning and waiting for the confirmation. + If you type `yes` MongooseIM will start joining the cluster. 
+ Successful output may look like the following: + + ```text + You have successfully joined the node mongooseim2@localhost to the cluster with node member mongooseim@localhost + ``` + + In order to skip the question you can add option `-f` which will perform the action + without displaying the warning and waiting for the confirmation. ### Leaving cluster -To leave a running node from the cluster, call: +=== "CETS" -```bash -mongooseimctl leave_cluster -``` + Stopping the node is enough to leave the cluster. + To prevent the node from joining the cluster again, specify a different `cluster_name` + option in the CETS backend configuration. Using a different Erlang cookie is a good idea too. -It only makes sense to use it if the node is the part of a cluster, e.g `join_cluster` was called from that node before. +=== "Mnesia" -Similarly to `join_cluster` a warning and a question will be displayed unless the option `-f` is added to the command. + To leave a running node from the cluster, call: -The successful output from the above command may look like the following: + ```bash + mongooseimctl leave_cluster + ``` -```text -The node mongooseim2@localhost has successfully left the cluster -``` + It only makes sense to use it if the node is the part of a cluster, e.g `join_cluster` was called from that node before. -### Removing a node from the cluster + Similarly to `join_cluster` a warning and a question will be displayed unless the option `-f` is added to the command. -To remove another node from the cluster, call the following command from one of the cluster members: + The successful output from the above command may look like the following: -```bash -mongooseimctl remove_from_cluster RemoteNodeName -``` + ```text + The node mongooseim2@localhost has successfully left the cluster + ``` -where `RemoteNodeName` is a name of the node that we'd like to remove from our cluster.
-This command could be useful when the node is dead and not responding and we'd like to remove it remotely. -The successful output from the above command may look like the following: +### Removing a node from the cluster -```text -The node mongooseim2@localhost has been removed from the cluster -``` +=== "CETS" -### Cluster status + A stopped node is automatically removed from the node discovery table in the RDBMS database after some time. + This is needed so that other nodes do not try to connect to the stopped node. -You can use the following commands on any of the running nodes to examine the cluster -or to see if a newly added node is properly clustered: +=== "Mnesia" -```bash -mongooseimctl mnesia info | grep "running db nodes" -``` + To remove another node from the cluster, call the following command from one of the cluster members: -This command shows all running nodes. -A healthy cluster should contain all nodes here. -For example: -```bash -running db nodes = [mongooseim@node1, mongooseim@node2] -``` -To see stopped or misbehaving nodes following command can be useful: + ```bash + mongooseimctl remove_from_cluster RemoteNodeName + ``` -```bash -mongooseimctl mnesia info | grep "stopped db nodes" -``` + where `RemoteNodeName` is a name of the node that we'd like to remove from our cluster. + This command could be useful when the node is dead and not responding and we'd like to remove it remotely. + The successful output from the above command may look like the following: + + ```text + The node mongooseim2@localhost has been removed from the cluster + ``` + +### Cluster status -This command shows which nodes are considered stopped. -This does not necessarily indicate that they are down but might be a symptom of a communication problem.
+=== "CETS" + + Run the command: + + ```bash + mongooseimctl cets systemInfo + ``` + + `joinedNodes` should contain a list of properly joined nodes: + + ```json + "joinedNodes" : [ + "mongooseim@node1", + "mongooseim@node2" + ] + ``` + + It should generally be equal to the list of `discoveredNodes`. + + If the two lists differ, there may be configuration or networking issues. + You can check the `unavailableNodes`, `remoteNodesWithUnknownTables`, + `remoteNodesWithMissingTables` lists for more information (generally, these lists should be empty). + + You can read the description for other fields of `systemInfo` in the GraphQL schema file. + + For a properly configured two-node cluster, the metrics look similar to this: + + ```json + mongooseimctl metric getMetrics --name '["global", "cets", "system"]' + { + "data" : { + "metric" : { + "getMetrics" : [ + { + "unavailable_nodes" : 0, + "type" : "cets_system", + "remote_unknown_tables" : 0, + "remote_nodes_without_disco" : 0, + "remote_nodes_with_unknown_tables" : 0, + "remote_nodes_with_missing_tables" : 0, + "remote_missing_tables" : 0, + "name" : [ + "global", + "cets", + "system" + ], + "joined_nodes" : 2, + "discovery_works" : 1, + "discovered_nodes" : 2, + "conflict_tables" : 0, + "conflict_nodes" : 0, + "available_nodes" : 2 + } + ] + } + } + } + ``` + +=== "Mnesia" + + You can use the following commands on any of the running nodes to examine the cluster + or to see if a newly added node is properly clustered: + + ```bash + mongooseimctl mnesia info | grep "running db nodes" + ``` + + This command shows all running nodes. + A healthy cluster should contain all nodes here. + For example: + ```bash + running db nodes = [mongooseim@node1, mongooseim@node2] + ``` + To see stopped or misbehaving nodes following command can be useful: + + ```bash + mongooseimctl mnesia info | grep "stopped db nodes" + ``` + + This command shows which nodes are considered stopped.
+ This does not necessarily indicate that they are down but might be a symptom of a communication problem. ## Load Balancing diff --git a/doc/operation-and-maintenance/MongooseIM-metrics.md b/doc/operation-and-maintenance/MongooseIM-metrics.md index 639246c4371..e2b6159ac5b 100644 --- a/doc/operation-and-maintenance/MongooseIM-metrics.md +++ b/doc/operation-and-maintenance/MongooseIM-metrics.md @@ -179,6 +179,27 @@ Metrics specific to an extension, e.g. Message Archive Management, are described | `[global, data, dist]` | proplist | Network stats for an Erlang distributed communication. A proplist with values: `recv_oct`, `recv_cnt`, `recv_max`, `send_oct`, `send_max`, `send_cnt`, `send_pend`, `connections`. | | `[global, data, rdbms, PoolName]` | proplist | For every RDBMS pool defined, an instance of this metric is available. It is a proplist with values `workers`, `recv_oct`, `recv_cnt`, `recv_max`, `send_oct`, `send_max`, `send_cnt`, `send_pend`. | +### CETS system metrics + +| Metric name | Type | Description | +| ----------- | ---- | ----------- | +| `[global, cets, system]` | proplist | A proplist with a list of stats, described below. | + +| Stat Name | Description | +| ----------- | ----------- | +| `available_nodes` | Available nodes (nodes that are connected to us and have the CETS disco process started). | +| `unavailable_nodes` | Unavailable nodes (nodes that do not respond to our pings). | +| `joined_nodes` | Joined nodes (nodes that have our local tables running). | +| `discovered_nodes` | Discovered nodes (nodes that are extracted from the discovery backend). | +| `remote_nodes_without_disco` | Nodes with stopped CETS discovery. | +| `remote_nodes_with_unknown_tables` | Nodes that have more tables registered than the local node. | +| `remote_unknown_tables` | Unknown remote tables. | +| `remote_nodes_with_missing_tables` | Nodes that are available, but do not host some of our local tables.
| +| `remote_missing_tables` | Missing remote tables. | +| `conflict_nodes` | Nodes that replicate at least one of our local tables to a different list of nodes. | +| `conflict_tables` | Tables that have conflicting replication destinations. | +| `discovery_works` | Returns 1 if the last discovery attempt was successful (otherwise returns 0). | + ### VM metrics | Metric name | Type | Description | diff --git a/doc/tutorials/CETS-configure.md b/doc/tutorials/CETS-configure.md new file mode 100644 index 00000000000..bfeffa66358 --- /dev/null +++ b/doc/tutorials/CETS-configure.md @@ -0,0 +1,36 @@ +## CETS Config Example + +[CETS](https://github.com/esl/cets/) is a library that replicates in-memory data +across the MongooseIM cluster. It can be used to store a list of online XMPP sessions, a list +of outgoing S2S connections, stream management session IDs, and a list of online MUC rooms. + +If you want to use CETS instead of Mnesia, ensure that these options are set: + +```toml +[general] + sm_backend = "cets" + component_backend = "cets" + s2s_backend = "cets" + +[internal_databases.cets] + +# The list of modules that use CETS +# You should enable only modules that you use +[modules.mod_stream_management] + backend = "cets" + +[modules.mod_bosh] + backend = "cets" + +[modules.mod_muc] + online_backend = "cets" + +[modules.mod_jingle_sip] + backend = "cets" +``` + +Ensure that `outgoing_pools` is configured with an RDBMS pool, so that CETS can get the list of MongooseIM nodes that use the same +relational database and cluster them together.
+ +The preferred way to install MongooseIM is with [Helm Charts](https://github.com/esl/MongooseHelm/) on Kubernetes, which allow you +to set `volatileDatabase` to `cets`; the values are then applied using Helm's templates. diff --git a/mkdocs.yml b/mkdocs.yml index 35a43c3e7b8..65204b15412 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -53,6 +53,7 @@ nav: - Tutorials: - 'How to Build MongooseIM from source code': 'tutorials/How-to-build.md' - 'How to build and run MongooseIM docker image': 'tutorials/Docker-build.md' + - 'How to configure MongooseIM to use CETS instead of Mnesia': 'tutorials/CETS-configure.md' - 'How to Set up Push Notifications': 'tutorials/push-notifications/Push-notifications.md' - 'How to Set up Push Notifications on the client side': 'tutorials/push-notifications/Push-notifications-client-side.md' - 'How to Set up MongoosePush': 'tutorials/push-notifications/MongoosePush-setup.md' diff --git a/priv/graphql/schemas/admin/cets.gql b/priv/graphql/schemas/admin/cets.gql index c2fea7bb91b..8f4f3b86acd 100644 --- a/priv/graphql/schemas/admin/cets.gql +++ b/priv/graphql/schemas/admin/cets.gql @@ -26,7 +26,7 @@ type CETSSystemInfo { unavailableNodes: [String] "Joined nodes (nodes that have our local tables running)" joinedNodes: [String] - "Discovered nodes (nodes that are extracted from the discovery backend)." + "Discovered nodes (nodes that are extracted from the discovery backend)" discoveredNodes: [String] "Nodes with stopped CETS discovery" remoteNodesWithoutDisco: [String]
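
The cluster status check that this patch documents for CETS (joined nodes should match discovered nodes, and the problem lists should generally be empty) can be sketched as a small Python helper that reads the `getMetrics` JSON shown in the patch. This is only an illustration under stated assumptions: the function name `cets_problems` is hypothetical, the command output is assumed to be plain JSON in the shape shown in the example, and the mapping from the `systemInfo` list names to the metric counters is an interpretation, not something MongooseIM provides.

```python
import json

# Counters expected to be zero on a healthy cluster. Mapping the systemInfo
# lists named in the guide to these metric names is an assumption.
EXPECTED_ZERO = (
    "unavailable_nodes",
    "remote_nodes_with_unknown_tables",
    "remote_nodes_with_missing_tables",
)


def cets_problems(metrics_json):
    """Return a list of problem descriptions; an empty list means none were found."""
    # Envelope shape taken from the example output in the patch.
    entries = json.loads(metrics_json)["data"]["metric"]["getMetrics"]
    cets_entries = [e for e in entries if e.get("type") == "cets_system"]
    if not cets_entries:
        return ["no cets_system metric found"]

    problems = []
    for entry in cets_entries:
        joined = entry["joined_nodes"]
        discovered = entry["discovered_nodes"]
        # The guide says joined nodes should generally equal discovered nodes.
        if joined != discovered:
            problems.append(
                f"joined_nodes ({joined}) != discovered_nodes ({discovered})"
            )
        # discovery_works is 1 when the last discovery attempt succeeded.
        if entry["discovery_works"] != 1:
            problems.append("last discovery attempt failed")
        for name in EXPECTED_ZERO:
            if entry[name] != 0:
                problems.append(f"{name} = {entry[name]}")
    return problems
```

One possible use is to save the output of `mongooseimctl metric getMetrics --name '["global", "cets", "system"]'` to a file, read it as a string, and pass it to `cets_problems`; any non-empty result would then point at the counters worth inspecting with `mongooseimctl cets systemInfo`.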