labmlai/labml

Monitor deep learning model training and hardware usage from mobile.

🔥 Features

  • Monitor running experiments from a mobile phone or laptop
  • Monitor hardware usage on any computer with a single command
  • Integrate with just 2 lines of code (see examples below)
  • Keep track of experiments, including information such as git commit, configurations, and hyper-parameters
  • API for custom visualizations
  • Pretty logs of training progress
  • Open source!

Hosting the experiments server

Prerequisites

The experiments server requires MongoDB. To install it, refer to the official MongoDB documentation.

Installation

Install the package using pip:

pip install labml-app

Starting the server

# Start the server on the default port (5005)
labml app-server

# To start the server on a different port, use the following command
labml app-server --port PORT

Optional: to set up and configure Nginx on your server, please refer to this.

You can access the user interface either by visiting http://localhost:{port} or, if configured on a separate machine, by navigating to http://{server-ip}:{port}.
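
Before opening the interface, it can help to confirm that something is listening on the server port. The following is a minimal sketch using only the Python standard library; the helper name `server_is_reachable` is hypothetical and not part of labml, and the default port 5005 is taken from the command above. It only checks that a TCP connection can be opened, not that the process behind the port is the labml server.

```python
import socket


def server_is_reachable(host: str = "localhost", port: int = 5005,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper (not part of labml). It only checks that the port
    accepts connections, not which program is listening on it.
    """
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For a server on another machine, call it as `server_is_reachable("{server-ip}", port)` with the values you used when starting the server.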

Monitor Experiments

Installation

  1. Install the package using pip.
pip install labml
  2. Create a file named .labml.yaml at the top level of your project folder and add the following line to it:
app_url: http://localhost:{port}/api/v1/default

# If the server runs on a different machine than your project, use this line instead:
app_url: http://{server-ip}:{port}/api/v1/default
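
If you prefer to generate the file from a script, the sketch below builds the same `app_url` value and writes it to `.labml.yaml`. The function names `build_app_url` and `write_labml_config` are hypothetical helpers, not part of labml; the URL pattern and the default port 5005 come from the instructions above.

```python
from pathlib import Path


def build_app_url(host: str = "localhost", port: int = 5005) -> str:
    """Compose the app_url in the form shown above (hypothetical helper)."""
    return f"http://{host}:{port}/api/v1/default"


def write_labml_config(project_dir: str, host: str = "localhost",
                       port: int = 5005) -> Path:
    """Write a .labml.yaml containing only the app_url line.

    Note: this overwrites any existing .labml.yaml in project_dir.
    """
    path = Path(project_dir) / ".labml.yaml"
    path.write_text(f"app_url: {build_app_url(host, port)}\n", encoding="utf-8")
    return path
```

For example, `write_labml_config(".", host="{server-ip}", port=5005)` would write the second variant for a server on another machine.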

PyTorch example

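from labml import tracker, experiment

# `conf` is a dictionary of your configurations and hyper-parameters;
# `train()` stands for your own training step.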
with experiment.record(name='sample', exp_conf=conf):
    for i in range(50):
        loss, accuracy = train()
        tracker.save(i, {'loss': loss, 'accuracy': accuracy})
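
The example above leaves `conf` and `train()` undefined. To try it without a real model, you could use stand-ins like the ones below. This is only a sketch: the hyper-parameter names and the synthetic loss curve are invented for illustration, and `train` here takes a step index and a random generator, unlike the argument-less `train()` above. The labml calls are the same ones used in the example.

```python
import math
import random
from typing import Tuple

# Hypothetical hyper-parameters; any dictionary of values you want recorded.
conf = {"learning_rate": 1e-3, "batch_size": 32}


def train(step: int, rng: random.Random) -> Tuple[float, float]:
    """Stand-in for a real training step.

    Returns a synthetic (loss, accuracy) pair: loss decays and accuracy
    rises with the step index, plus a small amount of noise.
    """
    noise = rng.uniform(-0.01, 0.01)
    loss = math.exp(-step / 20) + abs(noise)
    accuracy = min(1.0, max(0.0, 1.0 - math.exp(-step / 10) + noise))
    return loss, accuracy


def main() -> None:
    # Imported here so the stand-ins above can be used without labml installed.
    from labml import tracker, experiment

    rng = random.Random(0)  # fixed seed so runs are repeatable
    with experiment.record(name='sample', exp_conf=conf):
        for i in range(50):
            loss, accuracy = train(i, rng)
            tracker.save(i, {'loss': loss, 'accuracy': accuracy})
```

Calling `main()` requires labml to be installed and, if `app_url` is set in .labml.yaml, a reachable server.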

Distributed training example

from labml import tracker, experiment

uuid = experiment.generate_uuid()  # every process must use this same UUID
experiment.create(uuid=uuid,
                  name='distributed training sample',
                  distributed_rank=0,  # rank of the current process
                  distributed_world_size=8,  # total number of processes
                  )
with experiment.start():
    for i in range(50):
        loss, accuracy = train()
        tracker.save(i, {'loss': loss, 'accuracy': accuracy})
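
In practice each process needs its own rank and the same UUID. One possible way to arrange this, sketched below, is to read the values from environment variables. `RANK` and `WORLD_SIZE` are the variables that `torchrun` sets for each process; `LABML_RUN_UUID` is a hypothetical variable name that your launcher would have to export after generating the UUID once, and the helpers `distributed_settings` and `create_experiment` are likewise not part of labml.

```python
import os
from typing import Dict, Mapping, Union


def distributed_settings(
        env: Mapping[str, str] = os.environ) -> Dict[str, Union[str, int]]:
    """Collect keyword arguments for experiment.create from the environment.

    LABML_RUN_UUID is a hypothetical variable; RANK and WORLD_SIZE follow
    the torchrun convention and default to a single-process run.
    """
    return {
        "uuid": env["LABML_RUN_UUID"],  # must be identical on every machine
        "distributed_rank": int(env.get("RANK", "0")),
        "distributed_world_size": int(env.get("WORLD_SIZE", "1")),
    }


def create_experiment() -> None:
    # Imported here so distributed_settings can be used without labml installed.
    from labml import experiment

    experiment.create(name='distributed training sample',
                      **distributed_settings())
```

After `create_experiment()`, the `with experiment.start():` loop from the example above can run unchanged in every process.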

📚 Documentation

Guides

🖥 Screenshots

Formatted training loop output

Sample Logs

Custom visualizations based on Tensorboard logs

Analytics

Monitor hardware usage

# Install packages and dependencies
pip install labml psutil py3nvml

# Start monitoring
labml monitor

Citing

If you use LabML for academic research, please cite the library using the following BibTeX entry.

@misc{labml,
 author = {Varuna Jayasiri and Nipun Wijerathne and Adithya Narasinghe and Lakshith Nishshanke},
 title = {labml.ai: A library to organize machine learning experiments},
 year = {2020},
 url = {https://labml.ai/},
}