lehenbauer · May 24, 2023 · May 26, 2023
Showing with 233 additions and 42 deletions.

+41 −0 readme.md

+6 −5 tools/api/README.md

+62 −37 tools/api/lalalai_splitter.py

+23 −0 tools/cpp-example/CManeLists.txt

+79 −0 tools/cpp-example/lalalai_upload.cpp

+22 −0 tools/nodejs-example/lalalai-upload.js
diff --git a/readme.md b/readme.md
@@ -0,0 +1,41 @@
+# [LALAL.AI](https://www.lalal.ai/)
+
+Extract vocals, accompaniment, and various instruments from any audio or video.
+High-quality stem splitting based on the world's #1 AI-powered technology.
+
+### About
+We are a team of specialists in the fields of artificial intelligence, machine learning, mathematical optimization, and digital signal processing. **Our goal is to make working with audio and video easier** for musicians, sound producers, music engineers, video bloggers, streamers, transcribers, translators, journalists, and many other professionals and creatives.
+
+In 2020, we developed a unique neural network called **Rocknet** using 20TB of training data to extract instrumentals and voice tracks from songs. In 2021, we created Cassiopeia, a next-generation solution superior to Rocknet that allowed improved splitting results with significantly fewer audio artifacts.
+
+Starting as a 2-stem splitter, LALAL.AI has grown significantly during 2021. In addition to **vocal and instrumental**, the service was enhanced with the capability to extract musical instruments – **drums, bass, acoustic guitar, electric guitar, piano, and synthesizer**. As a result of this upgrade, LALAL.AI became the [world’s first 8-stem splitter](https://www.lalal.ai/blog/lalal-ai-adds-the-8th-stem-for-separation-synthesizer/). In the same year, we also presented [business solutions](https://www.lalal.ai/business-solutions/), enabling owners of sites, services and applications to integrate our stem-splitting technology into their environments via API.
+
+Only available in English prior to 2021, LALAL.AI was translated into 7 other languages – Chinese, French, German, Italian, Japanese, Korean, and Spanish. Furthermore, we added new payment methods to make LALAL.AI easier to acquire and more accessible to people worldwide.
+
+In 2022, we created and released [Phoenix](https://www.lalal.ai/blog/phoenix-neural-network-vocal-separation/), a state-of-the-art audio source separation technology. In terms of stem-splitting accuracy, it surpassed not only our previous neural networks but also all other solutions on the market.
+
+Although Phoenix exclusively handled vocal/instrumental isolation at first, its powerful technology allowed us to continually introduce new stems on a regular basis. Throughout the year we trained Phoenix to extract all musical instruments that Cassiopeia supported.
+
+We also added two brand new stems, wind and string instruments, which no other service offered. With that update, LALAL.AI broke the record again and became the [world’s first 10-stem splitter](https://www.lalal.ai/blog/wind-string-instruments/).
+
+LALAL.AI’s innovative technologies are used not only for stem splitting. In July of 2022, we introduced [Voice Cleaner](https://www.lalal.ai/blog/voice-cleaner/), a noise cancellation solution that removes background music, mic rumble, vocal plosives, and many other types of extraneous noises from video and audio recordings.
+
+At the end of 2022, we created a [desktop version of LALAL.AI](https://www.lalal.ai/blog/lalalai-desktop-app/). The application enabled users to split audio and videos into stems in one convenient place on their Windows, macOS and Linux computers.
+
+In the two years since LALAL.AI was created, the project has grown tremendously, as has our workforce. Since the Rocknet neural network launch in 2020, the LALAL.AI team has doubled in size. We work hard to create unique high-quality solutions and we always have a lot of ideas and developments in store. Keep your eyes peeled for new possibilities and improvements!
+
+### Legal Entity
+OmniSale GmbH
+Rigistrasse 3, 6300, Zug, Switzerland.
+
+### Examples of API usage
+* [Python tool](/tools/api/)
+* [Node.js upload example](/tools/nodejs-example/)
+* [C++ upload example](/tools/cpp-example/)
+
+### Forks and third-party tools
+
+* Modified Python tool for extracting multiple stems from a single upload: https://github.com/lehenbauer/lalalai (by @lehenbauer)
+* GUI frontend for the Python script, currently for Mac only: https://github.com/lehenbauer/unmixer (by @lehenbauer)
+
+
diff --git a/tools/api/README.md b/tools/api/README.md
@@ -15,9 +15,9 @@ lalalai_splitter.py is an example of interacting with the LALAL.AI API as descri
                                  'piano', 'electric_guitar', 
                                 'acoustic_guitar', 'synthesizer', 'voice',
                                 'strings', 'wind']
-                            Stem option:
-                            vocals, drum, bass, voice, electric_guitar, acoustic_guitar, synthesizer, strings, wind.
-                            Stems voice, strings, wind are not supported by Cassiopeia
+                            Stem selection option.
+                            Note: the stems "vocals" and "voice" are supported by the fourth
+                            generation of the neural network, "Orion" (see also the --splitter option)
                       --filter <post-processing filter> \
                             default: 1
                             choices:
@@ -26,6 +26,7 @@ lalalai_splitter.py is an example of interacting with the LALAL.AI API as descri
                       --splitter <splitter type>
                             default: 'phoenix'
                             choices: 
-                                ['phoenix', 'cassiopeia']
-                            The type of neural network used to split audio
+                                ['phoenix', 'orion']
+                            Neural network selection option. Currently, the "Orion" neural
+                            network supports only the stems "vocals" and "voice".
 ```
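
The interaction between `--stem` and `--splitter` described in the help text above can be sketched as a lookup table. This is a minimal sketch, not the script's actual code: the stem tuples are copied from `lalalai_splitter.py` in this same change (its Orion tuple is broader than the two stems the help text mentions, so treat the lists as illustrative), and `SUPPORTED_STEMS` and `check_stem` are hypothetical names.

```python
# Stem tuples as they appear in lalalai_splitter.py in this change;
# the live API may accept a different set.
SUPPORTED_STEMS = {
    'orion': ('vocals', 'voice', 'drum', 'piano', 'bass',
              'electric_guitar', 'acoustic_guitar'),
    'phoenix': ('vocals', 'voice', 'drum', 'piano', 'bass',
                'electric_guitar', 'acoustic_guitar',
                'synthesizer', 'strings', 'wind'),
}


def check_stem(splitter, stem):
    """Raise ValueError if the splitter is unknown or does not list the stem."""
    if splitter not in SUPPORTED_STEMS:
        raise ValueError(f'Unknown splitter: {splitter}')
    if stem not in SUPPORTED_STEMS[splitter]:
        raise ValueError(
            f'Stem "{stem}" is not listed for "{splitter}"; '
            f'choose one of {SUPPORTED_STEMS[splitter]}'
        )


if __name__ == '__main__':
    check_stem('phoenix', 'wind')      # accepted: listed for Phoenix
    try:
        check_stem('orion', 'wind')    # rejected: not in the Orion tuple
    except ValueError as err:
        print(err)
```

Keeping the table in one dictionary means a new splitter or stem only needs a data change, not another `if` branch.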
diff --git a/tools/api/lalalai_splitter.py b/tools/api/lalalai_splitter.py
@@ -1,17 +1,17 @@
 #!/usr/bin/python3
 
 # Copyright (c) 2021 LALAL.AI
-# 
+#
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
 # in the Software without restriction, including without limitation the rights
 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the Software is
 # furnished to do so, subject to the following conditions:
-# 
+#
 # The above copyright notice and this permission notice shall be included in all
 # copies or substantial portions of the Software.
-# 
+#
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
@@ -26,14 +26,16 @@
 import os
 import sys
 import time
-from argparse import ArgumentParser
+from argparse import ArgumentParser, ArgumentDefaultsHelpFormatter, SUPPRESS
 from urllib.parse import quote, unquote, urlencode
 from urllib.request import urlopen, Request
 
 
-CURRENT_DIR_PATH = os.path.dirname(os.path.realpath(__file__))
 URL_API = "https://www.lalal.ai/api/"
 
+_orion_stems = ('vocals', 'voice', 'drum', 'piano', 'bass', 'electric_guitar', 'acoustic_guitar')
+_phoenix_stems = ('vocals', 'voice', 'drum', 'piano', 'bass', 'electric_guitar', 'acoustic_guitar', 'synthesizer', 'strings', 'wind')
+
 
 def update_percent(pct):
     pct = str(pct)
@@ -71,12 +73,22 @@ def upload_file(file_path, license):
                 raise RuntimeError(upload_result["error"])
 
 
-def split_file(file_id, license, stem, filter_type, splitter):
+def split_file(file_id, license, stem, splitter, enhanced_processing, noise_cancelling):
     url_for_split = URL_API + "split/"
     headers = {
         "Authorization": f"license {license}",
     }
-    query_args = {'id': file_id, 'stem': stem, 'filter': filter_type, 'splitter': splitter}
+    query_args = {
+        'id': file_id,
+        'stem': stem,
+        'splitter': splitter
+    }
+
+    if enhanced_processing is not None:
+        query_args['enhanced_processing_enabled'] = enhanced_processing
+    if noise_cancelling is not None:
+        query_args['noise_cancelling_level'] = noise_cancelling
+
     encoded_args = urlencode(query_args).encode('utf-8')
     request = Request(url_for_split, encoded_args, headers=headers)
     with urlopen(request) as response:
@@ -101,22 +113,23 @@ def check_file(file_id):
 
         task_state = check_result["task"]["state"]
 
-        if task_state == "error":
+        if task_state == "success":
+            update_percent("Progress: 100%\n")
+            return check_result["split"]
+
+        elif task_state == "error":
             raise RuntimeError(check_result["task"]["error"])
 
-        if task_state == "progress":
+        elif task_state == "progress":
             progress = int(check_result["task"]["progress"])
             if progress == 0 and not is_queueup:
                 print("Queue up...")
                 is_queueup = True
             elif progress > 0:
                 update_percent(f"Progress: {progress}%")
 
-        if task_state == "success":
-            update_percent("Progress: 100%\n")
-            stem_track_url = check_result["split"]["stem_track"]
-            back_track_url = check_result["split"]["back_track"]
-            return stem_track_url, back_track_url
+        else:
+            raise NotImplementedError('Unknown track state', task_state)
 
         time.sleep(15)
 
@@ -144,52 +157,64 @@ def download_file(url_for_download, output_path):
     return file_path
 
 
-def batch_process_for_file(license, input_path, output_path, stem, filter_type, splitter):
+def batch_process_for_file(license, input_path, output_path, stem, splitter, enhanced_processing, noise_cancelling):
     try:
         print(f'Uploading the file "{input_path}"...')
         file_id = upload_file(file_path=input_path, license=license)
         print(f'The file "{input_path}" has been successfully uploaded (file id: {file_id})')
 
         print(f'Processing the file "{input_path}"...')
-        split_file(file_id, license, stem, filter_type, splitter)
-        stem_track_url, back_track_url = check_file(file_id)
-
-        print(f'Downloading the stem track file "{stem_track_url}"...')
-        downloaded_file = download_file(stem_track_url, output_path)
-        print(f'The stem track file has been downloaded to "{downloaded_file}"')
+        split_file(file_id, license, stem, splitter, enhanced_processing, noise_cancelling)
+        split_result = check_file(file_id)
 
-        print(f'Downloading the back track file "{back_track_url}"...')
-        downloaded_file = download_file(back_track_url, output_path)
-        print(f'The back track file has been downloaded to "{downloaded_file}"')
+        for url in (split_result['stem_track'], split_result['back_track']):
+            print(f'Downloading the track file "{url}"...')
+            downloaded_file = download_file(url, output_path)
+            print(f'The track file has been downloaded to "{downloaded_file}"')
 
         print(f'The file "{input_path}" has been successfully split')
     except Exception as err:
         print(f'Cannot process the file "{input_path}": {err}')
 
 
-def batch_process(license, input_path, output_path, stem, filter_type, splitter):
+def batch_process(license, input_path, output_path, stem, splitter, enhanced_processing, noise_cancelling):
     if os.path.isfile(input_path):
-        batch_process_for_file(license, input_path, output_path, stem, filter_type, splitter)
+        batch_process_for_file(license, input_path, output_path, stem, splitter, enhanced_processing, noise_cancelling)
     else:
         for path in os.listdir(input_path):
-            path = os.path.join(input_path, path)
-            if os.path.isfile(path):
-                batch_process_for_file(license, path, output_path, stem, filter_type, splitter)
+            full_path = os.path.join(input_path, path)
+            if os.path.isfile(full_path):
+                batch_process_for_file(license, full_path, output_path, stem, splitter, enhanced_processing, noise_cancelling)
+
+
+def _validate_stem(args):
+    if args.splitter == 'orion' and args.stem not in _orion_stems:
+        raise ValueError(f'Invalid stem option: {args.stem}. Should be one of {_orion_stems}')
+    if args.splitter == 'phoenix' and args.stem not in _phoenix_stems:
+        raise ValueError(f'Invalid stem option: {args.stem}. Should be one of {_phoenix_stems}')
 
 
 def main():
-    parser = ArgumentParser(description='Lalalai splitter')
-    parser.add_argument('--license', type=str, required=True, help='License key')
-    parser.add_argument('--input', type=str, required=True, help='Input directory or a file')
-    parser.add_argument('--output', type=str, default=CURRENT_DIR_PATH, help='Output directory')
-    parser.add_argument('--stem', type=str, default='vocals', choices=['vocals', 'drum', 'bass', 'piano', 'electric_guitar', 'acoustic_guitar', 'synthesizer', 'voice', 'strings', 'wind'], help='Stem option. Stems "voice", "strings", "wind" are not supported by Cassiopeia')
-    parser.add_argument('--filter', type=int, default=1, choices=[0, 1, 2], help='0 (mild), 1 (normal), 2 (aggressive)')
-    parser.add_argument('--splitter', type=str, default='phoenix', choices=['phoenix', 'cassiopeia'], help='The type of neural network used to split audio')
+    parser = ArgumentParser(description='Lalalai splitter', formatter_class=ArgumentDefaultsHelpFormatter)
+    parser.add_argument('--license', required=True, type=str, default=SUPPRESS, help='license key')
+    parser.add_argument('--input', required=True, type=str, default=SUPPRESS, help='input directory or a file')
+    parser.add_argument('--output', type=str, default=os.path.dirname(os.path.realpath(__file__)), help='output directory')
+    parser.add_argument('--splitter', type=str, default='orion', choices=['phoenix', 'orion'], help='the type of neural network used to split audio. automatically selects most efficient splitter if no value provided')
+    parser.add_argument('--stem', type=str, default='vocals', help=f'orion stems: {_orion_stems}; phoenix stems: {_phoenix_stems}')
+    parser.add_argument('--enhanced-processing', type=lambda value: value.lower() in ('true', '1', 'yes'), default=False, choices=[True, False], help='enable enhanced processing (all stems except "voice")')
+    parser.add_argument('--noise-cancelling', type=int, default=1, choices=[0, 1, 2], help='noise cancelling level for "voice" stem: (0: mild, 1: normal, 2: aggressive)')
 
     args = parser.parse_args()
 
+    _validate_stem(args)
+
+    if args.stem == 'voice':
+        args.enhanced_processing = None
+    else:
+        args.noise_cancelling = None
+
     os.makedirs(args.output, exist_ok=True)
-    batch_process(args.license, args.input, args.output, args.stem, args.filter, args.splitter)
+    batch_process(args.license, args.input, args.output, args.stem, args.splitter, args.enhanced_processing, args.noise_cancelling)
 
 
 if __name__ == '__main__':
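
The state handling in `check_file` can be exercised without network access by injecting the function that fetches the status. This is a minimal sketch under stated assumptions: `poll_until_done`, `fetch_status`, `interval` and `sleep` are hypothetical names, and the response shape (`task.state`, `task.progress`, `task.error`, `split`) is taken from the fields the script reads above, not from API documentation.

```python
import time


def poll_until_done(fetch_status, interval=15.0, sleep=time.sleep):
    """Call fetch_status() until the task succeeds or fails.

    fetch_status is a hypothetical callable returning a dict shaped like
    the per-file result that check_file reads: {"task": {...}, "split": {...}}.
    """
    while True:
        result = fetch_status()
        task = result["task"]
        state = task["state"]

        if state == "success":
            return result["split"]
        if state == "error":
            raise RuntimeError(task["error"])
        if state != "progress":
            raise NotImplementedError(f"Unknown task state: {state}")

        print(f"Progress: {int(task['progress'])}%")
        sleep(interval)


if __name__ == "__main__":
    # Canned responses stand in for the real status endpoint.
    responses = iter([
        {"task": {"state": "progress", "progress": 40}},
        {"task": {"state": "success"},
         "split": {"stem_track": "stem-url", "back_track": "back-url"}},
    ])
    split = poll_until_done(lambda: next(responses), sleep=lambda _: None)
    print(split["stem_track"], split["back_track"])
```

Passing `sleep` as a parameter keeps the 15-second wait out of tests while leaving the default behaviour unchanged.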

diff --git a/tools/cpp-example/CManeLists.txt b/tools/cpp-example/CManeLists.txt
@@ -0,0 +1,23 @@
+cmake_minimum_required(VERSION 3.8)
+project(lalalai_upload VERSION 0.1.0)
+
+include(CTest)
+enable_testing()
+
+find_package(CURL REQUIRED)
+
+
+add_executable(lalalai_upload lalalai_upload.cpp)
+
+if(CURL_FOUND)
+    target_include_directories(lalalai_upload PRIVATE ${CURL_INCLUDE_DIRS})
+    target_link_libraries(lalalai_upload ${CURL_LIBRARIES})
+else()
+    message(FATAL_ERROR "CURL library not found")
+endif()
+
+target_compile_features(lalalai_upload PRIVATE cxx_std_17)
+
+set(CPACK_PROJECT_NAME ${PROJECT_NAME})
+set(CPACK_PROJECT_VERSION ${PROJECT_VERSION})
+include(CPack)
diff --git a/tools/cpp-example/lalalai_upload.cpp b/tools/cpp-example/lalalai_upload.cpp
@@ -0,0 +1,79 @@
+#include <iostream>
+#include <fstream>
+#include <sstream>
+#include <string>
+#include <curl/curl.h>
+
+size_t WriteCallback(char *contents, size_t size, size_t nmemb, void *userp)
+{
+    ((std::string *)userp)->append((char *)contents, size * nmemb);
+    return size * nmemb;
+}
+
+void uploadData(std::string data)
+{
+    CURL *curl;
+    CURLcode res;
+
+    curl_global_init(CURL_GLOBAL_DEFAULT);
+    curl = curl_easy_init();
+
+    if (curl)
+    {
+        curl_easy_setopt(curl, CURLOPT_URL, "https://www.lalal.ai/api/upload/");
+        curl_easy_setopt(curl, CURLOPT_POST, 1L);
+
+        struct curl_slist *list = NULL;
+        list = curl_slist_append(list, "Content-Disposition: attachment; filename=file.mp3");
+        list = curl_slist_append(list, "Authorization: license <PASTE LICENSE HERE>"); // TODO: PASTE LICENSE HERE
+        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, list);
+
+        curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE_LARGE, static_cast<curl_off_t>(data.size()));
+        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, data.c_str());
+
+        std::string readBuffer;
+        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
+        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
+
+        res = curl_easy_perform(curl);
+
+        std::cout << "[" << readBuffer << "]" << std::endl;
+
+        if (res != CURLE_OK)
+            std::cerr << "curl_easy_perform() failed: " << curl_easy_strerror(res) << std::endl;
+
+        curl_easy_cleanup(curl);
+        curl_slist_free_all(list);
+    }
+
+    curl_global_cleanup();
+}
+
+int main()
+{
+    const std::string fileName = "~/file.mp3"; // TODO: PASTE FILENAME HERE (use a full path: std::ifstream does not expand '~')
+
+    std::ifstream inFile;
+    inFile.open(fileName, std::ios::binary);
+
+    if (!inFile)
+    {
+        std::cerr << "Error: Unable to open file " << fileName << std::endl;
+        return 1;
+    }
+
+    std::stringstream strStream;
+    strStream << inFile.rdbuf();
+    auto str = strStream.str();
+
+    try
+    {
+        uploadData(std::move(str));
+    }
+    catch (const std::exception &e)
+    {
+        std::cerr << "Error: " << e.what() << '\n';
+    }
+
+    return 0;
+}
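
For comparison, the same upload request can be sketched in Python with only the standard library. The sketch builds the request without sending it. `build_upload_request` is a hypothetical helper, and percent-encoding the file name in `Content-Disposition` is an assumption rather than a documented API requirement; the URL and header names are the ones used in the C++ example above.

```python
from urllib.parse import quote
from urllib.request import Request

UPLOAD_URL = "https://www.lalal.ai/api/upload/"


def build_upload_request(data, file_name, license_key):
    """Return a POST request carrying the raw file bytes as the body."""
    headers = {
        # Assumption: percent-encode the name so spaces and non-ASCII survive.
        "Content-Disposition": f"attachment; filename={quote(file_name)}",
        "Authorization": f"license {license_key}",
    }
    return Request(UPLOAD_URL, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_upload_request(b"\x00\x01", "file.mp3", "<PASTE LICENSE HERE>")
    print(request.get_method(), request.full_url)
    print(request.get_header("Authorization"))
    # To send it: urllib.request.urlopen(request), then read the response body.
```

Separating request construction from sending makes the headers easy to inspect before any licensed call is made.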
diff --git a/tools/nodejs-example/lalalai-upload.js b/tools/nodejs-example/lalalai-upload.js
@@ -0,0 +1,22 @@
+const axios = require('axios');
+const fs = require('fs');
+
+const fileName = '~/file.mp3'; // TODO: replace with a full path; Node's fs does not expand '~'
+
+try {
+    const data = fs.readFileSync(fileName);
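+    // axios.post returns a promise, so a failed request surfaces as a
+    // rejection that a surrounding try/catch cannot see.
+    // Handle it with .catch() on the promise chain instead.
+    axios.post('https://www.lalal.ai/api/upload/', data, {
+        headers: {
+            'Content-Disposition': 'attachment; filename=file.mp3',
+            'Authorization': 'license <PASTE LICENSE HERE>'
+        }
+    })
+        .then(res => console.log('Result:', res.data))
+        .catch(error => console.error('Request failed:', error.message));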
+} catch (error) {
+    console.error('Error:', error);
+}
+
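
Both upload examples use `~/file.mp3` as a placeholder path. `std::ifstream` and `fs.readFileSync` hand the path to the operating system unchanged, and `~` is expanded only by the shell, so the placeholder has to be resolved explicitly before the file is opened. A minimal Python sketch of that step, with `resolve_user_path` as a hypothetical helper name:

```python
import os
from pathlib import Path


def resolve_user_path(raw_path):
    """Expand a leading '~' and return an absolute path."""
    return Path(os.path.expanduser(raw_path)).resolve()


if __name__ == "__main__":
    path = resolve_user_path("~/file.mp3")
    print(path, "exists" if path.is_file() else "does not exist")
```

Checking `is_file()` before uploading gives a clearer error than a failed read deep inside the request code.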