Commit

📖 Document CETS as an alternative to Mnesia
Add information to User’s Guide / Tutorials how to set up MIM with RDBMS + CETS.
Update all docs relying on Mnesia configuration and commands like running_db_nodes to also show the CETS alternatives.
Uses tabs in documentation.
arcusfelis committed Sep 25, 2023
1 parent dd13079 commit 46c6b6e
Showing 7 changed files with 226 additions and 70 deletions.
27 changes: 18 additions & 9 deletions doc/configuration/database-backends-configuration.md
@@ -21,14 +21,16 @@ Subsequent sections go into more depth on each database: what they are suitable

Transient data:

* Mnesia - we highly recommend Mnesia (a highly available and distributed database) over Redis for storing **transient** data.
Being an Erlang-based database, it's the default persistence option for most modules in MongooseIM.

!!! Warning
We **strongly recommend** keeping **persistent** data in an external DB (RDBMS) for production.
Mnesia is not suitable for the volumes of **persistent** data which some modules may require.
Sooner or later a migration will be needed which may be painful.
It is possible to store all data in Mnesia, but only for testing purposes, not for any serious deployments.
* CETS - a library to synchronise records from the ETS tables between nodes.
A new choice for sharing live data across the MongooseIM cluster.
We recommend using this backend for transient data.
This backend requires a configured RDBMS database, because an external database is used to discover nodes in the cluster.
See the CETS config example in the [tutorials](../tutorials/CETS-configure.md).

* Mnesia - a built-in Erlang database.
Mnesia is fine for a cluster of a fixed size with reliable networking between nodes and with nodes that are rarely restarted.
There are some issues when nodes restart or join the cluster, so we recommend using CETS instead.
Mnesia is still the default backend for modules for config compatibility reasons.

* Redis - A fantastic choice for storing live data.
It's highly scalable and it can be easily shared by multiple MongooseIM nodes.
@@ -38,6 +40,12 @@ Transient data:

Persistent Data:

!!! Warning
We **strongly recommend** keeping **persistent** data in an external DB (RDBMS) for production.
Mnesia is not suitable for the volumes of **persistent** data which some modules may require.
Sooner or later a migration will be needed which may be painful.
It is possible to store all data in Mnesia, but only for testing purposes, not for any serious deployments.

* RDBMS - MongooseIM has a strong backend support for relational databases.
Reliable and battle proven, they are a great choice for regular MongooseIM use cases and features like `privacy lists`, `vcards`, `roster`, `private storage`, `last activity` and `message archive`.
Never lose your data.
@@ -47,12 +55,13 @@ Persistent Data:

* ElasticSearch - Only for MAM (Message Archive Management).

* Mnesia - some backends support Mnesia to store data, but it is not recommended.


User Data:

* LDAP - Used for: users, shared rosters, vCards


## RDBMS

### MySQL
3 changes: 2 additions & 1 deletion doc/configuration/general.md
@@ -147,7 +147,8 @@ These options can be used to configure the way MongooseIM manages user sessions.
* **Example:** `sm_backend = "redis"`

Backend for storing user session data. All nodes in a cluster must have access to a complete session database.
Mnesia is sufficient in most cases, use Redis only in large deployments when you notice issues with the mnesia backend. Requires a redis pool with the `default` tag defined in the `outgoing_pools` section.
CETS is a new backend that requires RDBMS to be configured in order to work properly.
Mnesia is a legacy backend, sufficient in most cases. Use Redis only in large deployments when you notice issues with the Mnesia backend. Redis requires a pool with the `default` tag defined in the `outgoing_pools` section.
See the section about [redis connection setup](./outgoing-connections.md#redis-specific-options) for more information.

!!! Warning
@@ -64,98 +64,186 @@ Checklist:
- the same cookie across all nodes (`vm.args` `-setcookie` parameter)
- each node should be able to ping other nodes using its sname
(ex. `net_adm:ping('mongoose@localhost')`)
- the RDBMS backend is configured, so that CETS can discover nodes

### Initial node

There is no action required on the initial node.
=== "CETS"

Just start MongooseIM using `mongooseim start` or `mongooseim live`.
Clustering is automatic. There is no difference between nodes.

=== "Mnesia"

There is no action required on the initial node.

Just start MongooseIM using `mongooseim start` or `mongooseim live`.

### New node - joining cluster

=== "CETS"

```bash
mongooseimctl start
mongooseimctl started #waits until MongooseIM starts
mongooseimctl join_cluster ClusterMember
```
Clustering is automatic.

`ClusterMember` is the name of a running node set in `vm.args` file, for example `mongooseim@localhost`.
This node has to be part of the cluster we'd like to join.
=== "Mnesia"

First, MongooseIM will display a warning and a question if the operation should proceed:
```bash
mongooseimctl start
mongooseimctl started #waits until MongooseIM starts
mongooseimctl join_cluster ClusterMember
```

```text
Warning. This will drop all current connections and will discard all persistent data from Mnesia. Do you want to continue? (yes/no)
```
`ClusterMember` is the name of a running node set in `vm.args` file, for example `mongooseim@localhost`.
This node has to be part of the cluster we'd like to join.

If you type `yes` MongooseIM will start joining the cluster.
Successful output may look like the following:
First, MongooseIM will display a warning and a question if the operation should proceed:

```text
You have successfully joined the node mongooseim2@localhost to the cluster with node member mongooseim@localhost
```
```text
Warning. This will drop all current connections and will discard all persistent data from Mnesia. Do you want to continue? (yes/no)
```

In order to skip the question you can add option `-f` which will perform the action
without displaying the warning and waiting for the confirmation.
If you type `yes` MongooseIM will start joining the cluster.
Successful output may look like the following:

```text
You have successfully joined the node mongooseim2@localhost to the cluster with node member mongooseim@localhost
```

In order to skip the question you can add option `-f` which will perform the action
without displaying the warning and waiting for the confirmation.

### Leaving cluster

To leave a running node from the cluster, call:
=== "CETS"

```bash
mongooseimctl leave_cluster
```
Stopping the node is enough to leave the cluster.
If you want to prevent the node from joining the cluster again, you have to specify a different `cluster_name`
option in the CETS backend configuration. A different Erlang cookie is a good idea too.

It only makes sense to use it if the node is part of a cluster, e.g. `join_cluster` was called from that node before.
=== "Mnesia"

Similarly to `join_cluster` a warning and a question will be displayed unless the option `-f` is added to the command.
To leave a running node from the cluster, call:

The successful output from the above command may look like the following:
```bash
mongooseimctl leave_cluster
```

```text
The node mongooseim2@localhost has successfully left the cluster
```
It only makes sense to use it if the node is part of a cluster, e.g. `join_cluster` was called from that node before.

### Removing a node from the cluster
Similarly to `join_cluster` a warning and a question will be displayed unless the option `-f` is added to the command.

To remove another node from the cluster, call the following command from one of the cluster members:
The successful output from the above command may look like the following:

```bash
mongooseimctl remove_from_cluster RemoteNodeName
```
```text
The node mongooseim2@localhost has successfully left the cluster
```

where `RemoteNodeName` is a name of the node that we'd like to remove from our cluster.
This command could be useful when the node is dead and not responding and we'd like to remove it remotely.
The successful output from the above command may look like the following:
### Removing a node from the cluster

```text
The node mongooseim2@localhost has been removed from the cluster
```
=== "CETS"

### Cluster status
A stopped node is automatically removed from the node discovery table in the RDBMS database after some time.
This is needed so that other nodes do not try to connect to the stopped node.

You can use the following commands on any of the running nodes to examine the cluster
or to see if a newly added node is properly clustered:
=== "Mnesia"

```bash
mongooseimctl mnesia info | grep "running db nodes"
```
To remove another node from the cluster, call the following command from one of the cluster members:

This command shows all running nodes.
A healthy cluster should contain all nodes here.
For example:
```bash
running db nodes = [mongooseim@node1, mongooseim@node2]
```
To see stopped or misbehaving nodes, the following command can be useful:
```bash
mongooseimctl remove_from_cluster RemoteNodeName
```

```bash
mongooseimctl mnesia info | grep "stopped db nodes"
```
where `RemoteNodeName` is a name of the node that we'd like to remove from our cluster.
This command could be useful when the node is dead and not responding and we'd like to remove it remotely.
The successful output from the above command may look like the following:

```text
The node mongooseim2@localhost has been removed from the cluster
```

### Cluster status

This command shows which nodes are considered stopped.
This does not necessarily indicate that they are down but might be a symptom of a communication problem.
=== "CETS"

Run the command:

```bash
mongooseimctl cets systemInfo
```

`joinedNodes` should contain a list of properly joined nodes:

```json
"joinedNodes" : [
"mongooseim@node1",
"mongooseim@node2"
]
```

It should generally be equal to the list of `discoveredNodes`.

If it is not equal, you could have some configuration or networking issues.
You can check `unavailableNodes`, `remoteNodesWithUnknownTables`,
`remoteNodesWithMissingTables` lists for more information (generally, these lists should be empty).

You can read the description for other fields of `systemInfo` in the GraphQL schema file.
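
The comparison described above can be automated. The following is a minimal Python sketch and not part of MongooseIM: `find_cluster_problems` is a hypothetical helper that takes the `systemInfo` object already decoded into a dict. How that object is nested inside the full command response is not shown here, so extracting it is left to the caller.

```python
from typing import Dict, List

# Lists that should generally be empty, according to the text above.
PROBLEM_FIELDS = (
    "unavailableNodes",
    "remoteNodesWithUnknownTables",
    "remoteNodesWithMissingTables",
)


def find_cluster_problems(system_info: Dict[str, List[str]]) -> List[str]:
    """Return human-readable problems found in a decoded systemInfo object."""
    problems = []
    joined = set(system_info.get("joinedNodes") or [])
    discovered = set(system_info.get("discoveredNodes") or [])

    # joinedNodes should generally be equal to discoveredNodes.
    not_joined = sorted(discovered - joined)
    if not_joined:
        problems.append("discovered but not joined: " + ", ".join(not_joined))
    not_discovered = sorted(joined - discovered)
    if not_discovered:
        problems.append("joined but not discovered: " + ", ".join(not_discovered))

    # Any entry in these lists hints at a configuration or networking issue.
    for field in PROBLEM_FIELDS:
        nodes = system_info.get(field) or []
        if nodes:
            problems.append(field + " is not empty: " + ", ".join(nodes))
    return problems
```

An empty result means that the lists agree and no problem list has entries. It does not replace reading the other `systemInfo` fields.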

For a properly configured two-node cluster, the metrics would look something like this:

```json
mongooseimctl metric getMetrics --name '["global", "cets", "system"]'
{
"data" : {
"metric" : {
"getMetrics" : [
{
"unavailable_nodes" : 0,
"type" : "cets_system",
"remote_unknown_tables" : 0,
"remote_nodes_without_disco" : 0,
"remote_nodes_with_unknown_tables" : 0,
"remote_nodes_with_missing_tables" : 0,
"remote_missing_tables" : 0,
"name" : [
"global",
"cets",
"system"
],
"joined_nodes" : 2,
"discovery_works" : 1,
"discovered_nodes" : 2,
"conflict_tables" : 0,
"conflict_nodes" : 0,
"available_nodes" : 2
}
]
}
}
}
```

=== "Mnesia"

You can use the following commands on any of the running nodes to examine the cluster
or to see if a newly added node is properly clustered:

```bash
mongooseimctl mnesia info | grep "running db nodes"
```

This command shows all running nodes.
A healthy cluster should contain all nodes here.
For example:
```bash
running db nodes = [mongooseim@node1, mongooseim@node2]
```
To see stopped or misbehaving nodes, the following command can be useful:

```bash
mongooseimctl mnesia info | grep "stopped db nodes"
```

This command shows which nodes are considered stopped.
This does not necessarily indicate that they are down but might be a symptom of a communication problem.
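
If the output needs to be checked by a script, the node list can be extracted from the text. This is a rough Python sketch, assuming the output contains lines shaped like the `running db nodes = [...]` example above; `parse_db_nodes` is a hypothetical helper, not a MongooseIM tool.

```python
import re
from typing import List

# Matches lines such as: running db nodes = [mongooseim@node1, mongooseim@node2]
_DB_NODES = re.compile(r"^\s*(running|stopped) db nodes\s*=\s*\[(.*)\]\s*$")


def parse_db_nodes(mnesia_info: str, kind: str = "running") -> List[str]:
    """Return the node names listed on the '<kind> db nodes' line, if any."""
    for line in mnesia_info.splitlines():
        match = _DB_NODES.match(line)
        if match and match.group(1) == kind:
            # Erlang may print node names as quoted atoms; drop the quotes.
            return [
                node.strip().strip("'")
                for node in match.group(2).split(",")
                if node.strip()
            ]
    return []
```

Comparing the parsed list against the expected cluster members is then a plain list or set comparison.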

## Load Balancing

21 changes: 21 additions & 0 deletions doc/operation-and-maintenance/MongooseIM-metrics.md
@@ -179,6 +179,27 @@ Metrics specific to an extension, e.g. Message Archive Management, are described
| `[global, data, dist]` | proplist | Network stats for an Erlang distributed communication. A proplist with values: `recv_oct`, `recv_cnt`, `recv_max`, `send_oct`, `send_max`, `send_cnt`, `send_pend`, `connections`. |
| `[global, data, rdbms, PoolName]` | proplist | For every RDBMS pool defined, an instance of this metric is available. It is a proplist with values `workers`, `recv_oct`, `recv_cnt`, `recv_max`, `send_oct`, `send_max`, `send_cnt`, `send_pend`. |

### CETS system metrics

| Metric name | Type | Description |
| ----------- | ---- | ----------- |
| `[global, cets, system]` | proplist | A proplist with a list of stats. Description is below. |

| Stat Name | Description |
| ----------- | ----------- |
| `available_nodes` | Available nodes (nodes that are connected to us and have the CETS disco process started). |
| `unavailable_nodes` | Unavailable nodes (nodes that do not respond to our pings). |
| `joined_nodes` | Joined nodes (nodes that have our local tables running). |
| `discovered_nodes` | Discovered nodes (nodes that are extracted from the discovery backend). |
| `remote_nodes_without_disco` | Nodes with stopped CETS discovery. |
| `remote_nodes_with_unknown_tables` | Nodes that have more tables registered than the local node. |
| `remote_unknown_tables` | Unknown remote tables. |
| `remote_nodes_with_missing_tables` | Nodes that are available, but do not host some of our local tables. |
| `remote_missing_tables` | Missing remote tables. |
| `conflict_nodes` | Nodes that replicate at least one of our local tables to a different list of nodes. |
| `conflict_tables` | Tables that have conflicting replication destinations. |
| `discovery_works` | Returns 1 if the last discovery attempt is successful (otherwise returns 0). |
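
As an illustration of how these stats might be used for alerting, here is a minimal Python sketch. It assumes the stats have been decoded into a dict of stat name to integer, and it treats every problem counter as expected to be zero, which is an assumption based on the healthy-cluster example rather than a documented rule. `check_cets_metrics` is a hypothetical helper.

```python
from typing import Dict, List

# Counters assumed to be 0 in a healthy cluster (an assumption, see above).
ZERO_WHEN_HEALTHY = (
    "unavailable_nodes",
    "remote_nodes_without_disco",
    "remote_nodes_with_unknown_tables",
    "remote_unknown_tables",
    "remote_nodes_with_missing_tables",
    "remote_missing_tables",
    "conflict_nodes",
    "conflict_tables",
)


def check_cets_metrics(stats: Dict[str, int]) -> List[str]:
    """Return warnings for a decoded `[global, cets, system]` stats dict."""
    warnings = []
    if stats.get("discovery_works") != 1:
        warnings.append("discovery_works is not 1")
    if stats.get("joined_nodes") != stats.get("discovered_nodes"):
        warnings.append("joined_nodes differs from discovered_nodes")
    for name in ZERO_WHEN_HEALTHY:
        value = stats.get(name, 0)
        if value != 0:
            warnings.append(f"{name} is {value}, expected 0")
    return warnings
```

A monitoring job could call this on each scrape and raise an alert whenever the returned list is not empty.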

### VM metrics

| Metric name | Type | Description |
36 changes: 36 additions & 0 deletions doc/tutorials/CETS-configure.md
@@ -0,0 +1,36 @@
## CETS Config Example

[CETS](https://github.com/esl/cets/) is a library that replicates in-memory data
across the MongooseIM cluster. It can be used to store a list of online XMPP sessions, a list
of outgoing S2S connections, stream management session IDs, and a list of online MUC rooms.

If you want to use CETS instead of Mnesia, ensure that these options are set:

```ini
[general]
sm_backend = "cets"
component_backend = "cets"
s2s_backend = "cets"

[internal_databases.cets]

# The list of modules that use CETS
# You should enable only modules that you use
[modules.mod_stream_management]
backend = "cets"

[modules.mod_bosh]
backend = "cets"

[modules.mod_muc]
online_backend = "cets"

[modules.mod_jingle_sip]
backend = "cets"
```

Ensure that `outgoing_pools` is configured with an RDBMS pool, so that CETS can get a list of the MongooseIM nodes which use the same
relational database and cluster them together.

The preferred way to install MongooseIM is with [Helm Charts](https://github.com/esl/MongooseHelm/) on Kubernetes, which allow you
to set `volatileDatabase` to `cets`; the values are then applied using Helm's templates.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -53,6 +53,7 @@ nav:
- Tutorials:
- 'How to Build MongooseIM from source code': 'tutorials/How-to-build.md'
- 'How to build and run MongooseIM docker image': 'tutorials/Docker-build.md'
- 'How to configure MongooseIM to use CETS instead of Mnesia': 'tutorials/CETS-configure.md'
- 'How to Set up Push Notifications': 'tutorials/push-notifications/Push-notifications.md'
- 'How to Set up Push Notifications on the client side': 'tutorials/push-notifications/Push-notifications-client-side.md'
- 'How to Set up MongoosePush': 'tutorials/push-notifications/MongoosePush-setup.md'
2 changes: 1 addition & 1 deletion priv/graphql/schemas/admin/cets.gql
@@ -26,7 +26,7 @@ type CETSSystemInfo {
unavailableNodes: [String]
"Joined nodes (nodes that have our local tables running)"
joinedNodes: [String]
"Discovered nodes (nodes that are extracted from the discovery backend)."
"Discovered nodes (nodes that are extracted from the discovery backend)"
discoveredNodes: [String]
"Nodes with stopped CETS discovery"
remoteNodesWithoutDisco: [String]