data partition

1. Budgeting the data partition

Available

CCGX: 28MB or so
Venus GX 1st version: 100MB
Cerbo GX: 512MB

Usage

Log files: We have around 40 processes that always run and log. As of v2.23, the maximum space taken by the log files of a single process has been reduced to 4 files of 25kB each, i.e. 100kB. In total, for all 40 processes, this amounts to 4,000kB. Details of the change are in commit 9ce14ef1e, which was backported to v2.23.
Firmware & settings file cache (mqtt-rpc)
Settings
Factory installed files (negligible from a size point of view)
VRM Logger backlog
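The log file figure in the list above can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted on this page; the variable names are made up for illustration, and treating 1MB as 1000kB is an assumption.

```python
# Worst-case log usage on the data partition, using the figures quoted above.
PROCESSES = 40          # always-running processes that log
FILES_PER_PROCESS = 4   # rotated log files kept per process
FILE_SIZE_KB = 25       # maximum size of a single log file

per_process_kb = FILES_PER_PROCESS * FILE_SIZE_KB   # 100 kB
total_kb = PROCESSES * per_process_kb               # 4000 kB

# Partition sizes from the "Available" list, converted with 1MB = 1000kB
# (an assumption; the CCGX figure is approximate to begin with).
available_kb = {
    "CCGX": 28 * 1000,
    "Venus GX 1st version": 100 * 1000,
    "Cerbo GX": 512 * 1000,
}

for device, size_kb in available_kb.items():
    share = 100.0 * total_kb / size_kb
    print(f"{device}: logs use at most {total_kb} kB of {size_kb} kB ({share:.1f}%)")
```

On the smallest partition the logs remain a noticeable share of the budget, which is why the per-process limit matters most there.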

2. Factory installed files on the data partition

# cat /data/venus/installer-version 
v2.11
Victron Energy

# cat /data/venus/serial-number     
HQ1825ZUT5T

# cat /data/venus/wpa-psk       
gt5nyede

# cat /data/venus/part-number
BPP900400100

In the same folder there is also one other file, which is auto generated:

# cat /data/venus/unique-id  
985dxxxxx3a1

3. Writing files to data

Filesystems in Linux are typically asynchronous: data is reported as written once it is in the page cache, even though it is not yet on the storage itself. For user settings, or keys which are generated once and then distributed, it is important that both the data and the metadata are on the disk before the data is used. See Ensuring data reaches disk. The steps are:

create a new temp file (on the same file system!)
write data to the temp file
fsync() the temp file
rename the temp file to the appropriate name
fsync() the containing directory

If the fsync on the data is omitted, the file metadata might point to an invalid area after a power cycle, and the file can end up as a zero-length file on ubifs or a zero-filled file on ext4. If the fsync on the directory is omitted, the file might still have the temp name after a power cycle. The directory fsync may be omitted if the application looks for the temp file and can verify that it was written completely.
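The five steps above could look roughly as follows in Python. This is a minimal sketch, not the code used by any Venus component; the function name `write_file_durably` and the `.tmp-` prefix are hypothetical.

```python
import os
import tempfile


def write_file_durably(path, data):
    """Replace `path` with `data` (bytes) so it survives a power cycle."""
    directory = os.path.dirname(os.path.abspath(path))

    # 1. Create the temp file in the target directory, so it is guaranteed
    #    to be on the same file system as the final file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            # 2. Write the data to the temp file.
            tmp.write(data)
            tmp.flush()
            # 3. fsync() the temp file: the data is on storage after this.
            os.fsync(tmp.fileno())
        # 4. Rename the temp file to the final name (atomic on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Writing or renaming failed: do not leave the temp file behind.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # 5. fsync() the containing directory, so the rename itself is on storage.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that `tempfile.mkstemp()` creates the file with mode 0600, so a caller that needs other permissions would have to set them before the rename.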

4. Handling failures related to the data partition

In v2.30, various improvements have been added.

vrmlogger reads /run/data-partition-state, translates its content to a number, and sends it to VRM on boot and thereafter only when it differs from the previously submitted value.

In case the data partition is not mounted (state == failed or state == failed-to-mount), an init script stops vrmlogger, since it can't run without the data partition anyway, and uses curl to send dps to VRM itself.

VRM normally checks all data against a device authorisation token, but it always accepts dps transmissions.

Note that curl sends it as a DPS-TRANSMISSION (c=100), which causes it to be stored in the events table. vrmlogger sends it as a normal data transmission, in which case it is not stored in the events table but in the normal databases.

In VRM, this status is saved as dataAttribute dps. Its different values are:

State	Description
0 - fine
1 - failed-once	Set on device reboot: a run-time read-only remount occurred and was stored in the u-boot var `data-failed-count`, and a second failure was not detected.
2 - recovered	This follows a 'failed-once', after 24 hours without failure.
3 - failed	Set on device reboot after a second run-time read-only remount, based on the u-boot var `data-failed-count`.
4 - failed-to-mount	Set if `/data` wasn't even mounted at boot. A tmpfs is then mounted for `/var/log`.
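The translation from state to number, and the "send only when changed" behaviour, can be sketched as below. This is an illustration and not the vrmlogger source: it assumes that /run/data-partition-state contains the state name from the table as plain text, and the names `DPS_VALUES`, `read_dps` and `DpsReporter` are hypothetical.

```python
# Numeric dps values, taken from the table above.
DPS_VALUES = {
    "fine": 0,
    "failed-once": 1,
    "recovered": 2,
    "failed": 3,
    "failed-to-mount": 4,
}


def read_dps(path="/run/data-partition-state"):
    """Return the numeric dps for the state stored in `path`.

    Returns None when the file is unreadable or holds an unknown state.
    Assumption: the file contains just the state name, e.g. "failed-once".
    """
    try:
        with open(path) as f:
            state = f.read().strip()
    except OSError:
        return None
    return DPS_VALUES.get(state)


class DpsReporter:
    """Calls `send` for the first value and afterwards only on a change."""

    def __init__(self, send):
        self._send = send   # callable that takes the numeric dps value
        self._last = None   # nothing submitted yet

    def update(self, value):
        if value is None or value == self._last:
            return False    # unknown or unchanged: nothing to send
        self._send(value)
        self._last = value
        return True
```

A caller would create one `DpsReporter` at start-up and pass it `read_dps()` periodically; the first known value is always sent, which matches the "on boot" transmission.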

Primary reporting is done with report-data-failure.sh, where it ends up in the eventLog MySQL table. vrmlogger also reports the state from /run/data-partition-state, but failed and failed-to-mount are not (reliably) sent by vrmlogger, because vrmlogger won't operate properly with a malfunctioning /data. See report-data-failure.sh.

The test-data-partition.sh script contains more explanation of how the conclusions are reached.

To analyse the status in the field, there is a Grafana dashboard.

Note: there is no authoritative status of which Venus/GX Devices are broken. As it stands, the curl script reports 'broken' events but no recovery, and vrmlogger doesn't report 'failed' and 'failed-to-mount'. So, it's half here, half there. Changes in VRM logger are underway to be able to handle /data getting read-only, after which we no longer need the curl reporting.

5. Used data files

vrmlogger

db/vrmlogger-backlog.sqlite3

sqlite3 makes sure, in unixSync, that both the data and the metadata are on the disk.

vebus

var/lib/mk2-dbus/mkxport.settings

Does an fsync on the data and a sync after the rename.

serial starter

var/lib/serial-starter/ttyUSB0
var/lib/serial-starter/ttyUSB1
var/lib/serial-starter/ttyUSB2
var/lib/serial-starter/ttyUSB3
var/lib/serial-starter/ttyUSB4
var/lib/serial-starter/ttyO0
var/lib/serial-starter/ttyO1
var/lib/serial-starter/ttyO2

Not important to be on disk. Files will be recreated when missing.

Connman / glib

var/lib/connman/settings
var/lib/connman/ethernet_7c38665aa305_cable/data
var/lib/connman/ethernet_7c38665aa305_cable/settings

Connman uses the glib function g_file_set_contents(), which only fsyncs the data when an existing file gets replaced. This can lead to zero-length or zero-filled files when the file didn't exist yet. It also doesn't fsync the directory / metadata after the rename, because of concerns about performance on spinning disks etc. The glib in Venus is patched to make sure the connman settings hit the disk directly.

Qt4

home/root/Settings/Trolltech.conf

Fine, not an important file.

VNC

conf/vncpassword.txt
conf/vrm_auth_token.txt
home/vnctunnel/.ssh/id_rsa
home/vnctunnel/.ssh/authorized_keys
home/vnctunnel/.ssh/id_rsa.pub

localsettings

conf/settings.xml

It didn't fsync the rename, but used to look at the tmp file as well. For consistency, it was changed to flush the rename as well.

MQTT

conf/mqtt_password.txt
conf/mosquitto.d/vrm_bridge.conf
keys/mosquitto.crt
keys/mosquitto.key

Fine, will be recreated by start-mosquitto when corrupt.

opensshd

keys/ssh_host_dsa_key
keys/ssh_host_rsa_key.pub
keys/ssh_host_ecdsa_key.pub
keys/ssh_host_rsa_key
keys/ssh_host_dsa_key.pub
keys/ssh_host_ecdsa_key

Fine, files will be regenerated when invalid.

boot

venus/unique-id
etc/timestamp
var/lib/random-seed

production

venus/part-number
venus/installer-version
venus/serial-number
