Monitor deep learning model training and hardware usage from mobile.

🔥 Features

Monitor running experiments from mobile phone or laptop
Monitor hardware usage on any computer with a single command
Integrate with just 2 lines of code (see examples below)
Keeps track of experiments, including information like git commit, configurations and hyper-parameters
API for custom visualizations
Pretty logs of training progress
Open source!

Hosting the experiments server

Prerequisites

To install MongoDB, refer to the official MongoDB documentation.

Installation

Install the package using pip:

pip install labml-app

Starting the server

# Start the server on the default port (5005)
labml app-server

# To start the server on a different port, use the following command
labml app-server --port PORT

Optional: to set up and configure Nginx on your server, please refer to this.

You can access the user interface either by visiting http://localhost:{port} or, if configured on a separate machine, by navigating to http://{server-ip}:{port}.

Monitor Experiments

Installation

Install the package using pip:

pip install labml

Create a file named .labml.yaml at the top level of your project folder, and add the following line to the file:

app_url: http://localhost:{port}/api/v1/default

# If the server is on a different machine, use the following line instead:
app_url: http://{server-ip}:{port}/api/v1/default
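As a minimal sketch, the same file can be generated from a script. The helper names `labml_config` and `write_labml_config` are hypothetical and not part of labml; the default port 5005 and the URL path are taken from the instructions above.

```python
from pathlib import Path

DEFAULT_PORT = 5005  # default app-server port mentioned above


def labml_config(host: str = "localhost", port: int = DEFAULT_PORT) -> str:
    """Return the contents of a .labml.yaml pointing at the given server.

    Hypothetical helper for illustration; not part of the labml package.
    """
    return f"app_url: http://{host}:{port}/api/v1/default\n"


def write_labml_config(project_dir: str, host: str = "localhost",
                       port: int = DEFAULT_PORT) -> Path:
    """Write .labml.yaml at the top level of project_dir and return its path."""
    path = Path(project_dir) / ".labml.yaml"
    path.write_text(labml_config(host, port))
    return path
```

For example, `write_labml_config(".", host="192.168.1.10")` would point the current project at a server on another machine; the IP address here is only a placeholder.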

PyTorch example

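from labml import tracker, experiment

# `conf` is a dictionary of configurations/hyper-parameters for the run,
# and `train()` stands for your own training step.
with experiment.record(name='sample', exp_conf=conf):
    for i in range(50):
        loss, accuracy = train()
        # Log the metrics for step i
        tracker.save(i, {'loss': loss, 'accuracy': accuracy})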
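The snippet above assumes that `conf` and `train()` already exist. A minimal self-contained sketch might look like the following, with a stand-in training step that returns synthetic numbers instead of training a real model. `fake_train`, `run`, and the values in `conf` are placeholders for illustration; only `experiment.record` and `tracker.save` come from the example above.

```python
import math
from typing import Tuple

# Placeholder hyper-parameters; replace with your own configuration.
conf = {'batch_size': 20, 'learning_rate': 1e-3}


def fake_train(step: int) -> Tuple[float, float]:
    """Stand-in for a real training step: returns (loss, accuracy)."""
    loss = math.exp(-step / 10)  # decays from 1.0 towards 0
    accuracy = 1.0 - loss        # rises from 0 towards 1.0
    return loss, accuracy


def run(steps: int = 50) -> None:
    """Record a short experiment; requires labml to be installed."""
    # Imported lazily so that fake_train can be used without labml.
    from labml import tracker, experiment

    with experiment.record(name='sample', exp_conf=conf):
        for i in range(steps):
            loss, accuracy = fake_train(i)
            tracker.save(i, {'loss': loss, 'accuracy': accuracy})
```

Calling `run()` with the server started and `.labml.yaml` in place should make the run show up in the user interface.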

Distributed training example

from labml import tracker, experiment

uuid = experiment.generate_uuid()  # use the same UUID on every machine
experiment.create(uuid=uuid,
                  name='distributed training sample',
                  distributed_rank=0,
                  distributed_world_size=8,
                  )
with experiment.start():
    for i in range(50):
        loss, accuracy = train()
        tracker.save(i, {'loss': loss, 'accuracy': accuracy})
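In the example above, `distributed_rank` and `distributed_world_size` are hard-coded. The following is a sketch of deriving them per process. It assumes the launcher exports `RANK` and `WORLD_SIZE` (as `torchrun` does) and that the shared UUID is passed through a hypothetical `LABML_RUN_UUID` environment variable. That variable name and both helper functions are illustrative and not part of labml.

```python
import os
from typing import Dict, Mapping, Union


def distributed_settings(env: Mapping[str, str]) -> Dict[str, Union[str, int]]:
    """Build keyword arguments for experiment.create from environment variables.

    Assumptions: RANK and WORLD_SIZE follow the torchrun convention, and
    LABML_RUN_UUID is a hypothetical variable carrying the shared run UUID.
    """
    return {
        'uuid': env['LABML_RUN_UUID'],
        'distributed_rank': int(env['RANK']),
        'distributed_world_size': int(env['WORLD_SIZE']),
    }


def start_distributed(name: str = 'distributed training sample'):
    """Create and start the experiment on this process; requires labml."""
    # Imported lazily so that distributed_settings works without labml.
    from labml import experiment

    experiment.create(name=name, **distributed_settings(os.environ))
    return experiment.start()
```

One way to use this is to generate the UUID once with `experiment.generate_uuid()`, export it to every process, and then wrap the training loop in `with start_distributed():`.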

📚 Documentation

Guides

🖥 Screenshots

Formatted training loop output

Custom visualizations based on TensorBoard logs

Monitoring hardware usage

# Install packages and dependencies
pip install labml psutil py3nvml

# Start monitoring
labml monitor

Citing

If you use LabML for academic research, please cite the library using the following BibTeX entry.

@misc{labml,
 author = {Varuna Jayasiri and Nipun Wijerathne and Adithya Narasinghe and Lakshith Nishshanke},
 title = {labml.ai: A library to organize machine learning experiments},
 year = {2020},
 url = {https://labml.ai/},
}
