[Discussion] How can you extend the length beyond 10 minutes? #114
Comments
I would love to, but I cannot locate anything relevant, which was the reason for the ticket. Can you link to the part of the text that explains what needs changing? |
When you use the separate command, add `-d 9000` (replace 9000 with the duration in seconds you need). |
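For reference, the same limit can be raised from the Python API as well. This is only a minimal sketch (the file names and the 2stems model are placeholders); in the Spleeter releases I've used, `Separator.separate_to_file` takes a `duration` argument that defaults to 600 seconds:

```python
from spleeter.separator import Separator

# Raise the processed duration from the default 600 s to 9000 s (2.5 hours).
separator = Separator("spleeter:2stems")
separator.separate_to_file("song.wav", "output/", duration=9000)
```

As noted further down the thread, very long durations still mean the whole stretch is held in memory, so RAM becomes the practical limit.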
Awesome, @greatfinders, thank you! There are no errors or warnings showing, and I cannot find any logs, but when I set too large a duration it simply doesn't write any output. So setting … |
I didn't really try it. No result means the task was killed for lack of resources (it requires a lot of RAM and CPU). From what I remember (read the FAQ, it's written there), you can only set a maximum duration limit; they will add a start offset later. So if you get no result, just split the input into 600-second files, run it on each, and then fuse them back together. |
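A rough sketch of that split-then-fuse workflow, driving ffmpeg and the Spleeter CLI from Python. The file names and chunk naming are placeholders, the exact `spleeter separate` syntax varies a little between versions (older releases take the input via `-i`), and it assumes Spleeter's default output layout of `<output dir>/<input name>/vocals.wav`:

```python
import glob
import os
import subprocess

# 1. Split the source into 600-second WAV chunks (stream copy, no re-encode).
subprocess.run(["ffmpeg", "-i", "input.wav", "-f", "segment",
                "-segment_time", "600", "-c", "copy", "chunk_%03d.wav"], check=True)

# 2. Separate each chunk; every chunk stays under the default duration limit.
chunks = sorted(glob.glob("chunk_*.wav"))
for chunk in chunks:
    subprocess.run(["spleeter", "separate", "-p", "spleeter:2stems",
                    "-o", "separated", chunk], check=True)

# 3. Fuse the per-chunk vocal stems back into one file via ffmpeg's concat demuxer.
with open("vocals_list.txt", "w") as f:
    for chunk in chunks:
        name = os.path.splitext(chunk)[0]  # e.g. chunk_000
        f.write(f"file 'separated/{name}/vocals.wav'\n")
subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0", "-i", "vocals_list.txt",
                "-c", "copy", "vocals_full.wav"], check=True)
```

The drawback, discussed below, is that a plain cut-and-concatenate like this can leave audible artifacts at each 600-second boundary.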
Huh. That appears to be true, but it doesn't make sense. Last night I could use … I don't suppose setting up a swap file would necessarily help. I guess I'll have to split the files up into 15-minute chunks before processing. |
FYI, everything's good now. I added a 16 GiB swap file and was able to process longer files. Trying an hour-long file, it appears to have written a 29-minute file, which makes me happy, as at least I know how far it made it and can expand the file accordingly. |
Why can't Spleeter split a large file into 10-minute segments, process them separately, then merge them back together into a single output file? It would solve the extended-length issue without resorting to using up all the memory or to potentially hacky ways of manipulating the swap file size (which can't be done on macOS anyway). |
I'd actually rather not have that happen automatically. Each spleet has
a noise spike at the start and end, and I'd rather not have to search for
double-spikes in any file longer than 10 minutes.
I was actually able to process roughly an hour (the length is dependent on
content) by installing `zram`. When I upgraded the CPU I also upgraded the RAM
by 50%, and now swap is barely touched on an hour-long file (tens of megs).
|
I get that. But maybe spleeter should remove the noise spikes at the starts instead? It seems unreasonable to force people to mess with or upgrade their RAM just for it to work on longer files. |
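One possible workaround on the user side (not something Spleeter does itself) is to merge the per-chunk stems with a short crossfade, so the start/end spikes are masked rather than butted up against each other. A sketch with pydub, using hypothetical chunk paths:

```python
from pydub import AudioSegment

# Hypothetical per-chunk vocal stems, listed in playback order.
paths = ["separated/chunk_000/vocals.wav", "separated/chunk_001/vocals.wav"]

merged = AudioSegment.from_file(paths[0])
for path in paths[1:]:
    # A ~50 ms crossfade at each splice point smooths over boundary spikes
    # (it also shortens the result by 50 ms per joint).
    merged = merged.append(AudioSegment.from_file(path), crossfade=50)
merged.export("vocals_full.wav", format="wav")
```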
Oh, yeah, sorry. I've never purchased anything Apple. This would be using
the Docker container on openSUSE Linux.
…On Mon, Apr 26, 2021, 09:29 redbar0n wrote:
How did you get to use zram on macOS anyway? I thought it was just for
Linux.
|
Seems like this bash script might help: https://github.com/amo13/spleeter-wrapper. Although it should ideally be a part of Spleeter itself. |
I recently contributed to the aforementioned bash script, which might help manage both RAM and HDD space concerns in general: https://github.com/amo13/spleeter-wrapper |
This worked for me. I do however think the default should be greater and should scale automatically based on CPU and RAM availability ... |
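For what it's worth, a tiny sketch of the kind of auto-scaling being suggested. The seconds-per-GiB ratio is a made-up placeholder that would need calibrating per machine and model, and the `os.sysconf` keys used here are Unix-only:

```python
import os

def pick_duration(seconds_per_gib: float = 300.0, floor: int = 600) -> int:
    """Guess a -d value from currently available physical RAM."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    avail_pages = os.sysconf("SC_AVPHYS_PAGES")
    avail_gib = page_size * avail_pages / 2**30
    return max(floor, int(avail_gib * seconds_per_gib))

print(pick_duration())  # e.g. feed this into `spleeter separate -d ...`
```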
While using the GPU to handle a large .wav audio file (4 hours long, ~3 GB), the default `-d 600` ends up Out Of Memory. GPU: RTX 3060 Laptop (6 GB VRAM).
1. So I set the step to 60, using auditok (https://github.com/amsehili/auditok) to separate the file into small 60-second .wav files: 0-60.wav, 60-120.wav, ...
2. Pass them to spleeter -> 0-60_vocals.wav, 60-120_vocals.wav, ...
3. Use ffmpeg to combine the multiple wav files into one vocals.wav.
4. Hope this helps.
Here is the code that handles multiple audio files:

import os
import auditok
# NOTE: `logger`, `clip_path`, `get_time_point_name`, and the `self.*` members come from the surrounding class.

speech_file_paths = []
audio_region_paths = []
self.step_seconds = 60  # separate the file into 60-second segments
seconds = 0
file_index = 0
audio_region = None
all_audio_length = 0.0
while True:
    # NOTE: Read the audio file. If the audio file is too short, merge it with the next audio file (if there is one).
    # NOTE: If audio_region is the remaining part of the previous region, there is no need to read the file again.
    if audio_region is None:
        logger.warning(f"Reading file {file_index+1}: {self.audio_file_paths[file_index]}")
        audio_region = auditok.load(self.audio_file_paths[file_index])
        file_index += 1
        all_audio_length += audio_region.duration
    # NOTE: If the audio file is too short, and there is another audio file, merge it with the next one.
    while audio_region.duration < self.step_seconds and file_index < len(self.audio_file_paths):
        logger.warning(f"Reading file {file_index+1}: {self.audio_file_paths[file_index]}")
        next_audio_region = auditok.load(self.audio_file_paths[file_index])
        all_audio_length += next_audio_region.duration
        audio_region += next_audio_region
        file_index += 1
    current_file_seconds = 0
    while True:
        audio_region_path = os.path.join(clip_path, f"{get_time_point_name(seconds)}-{get_time_point_name(seconds+self.step_seconds)}.wav")
        speech_file_path = os.path.join(clip_path, f"{get_time_point_name(seconds)}-{get_time_point_name(seconds+self.step_seconds)}_vocals.wav")
        # NOTE: slicing a region by seconds is a feature of [auditok](https://github.com/amsehili/auditok)
        clip_region: auditok.AudioRegion = audio_region.seconds[current_file_seconds:current_file_seconds+self.step_seconds]
        is_end_of_clip = clip_region.duration < self.step_seconds
        if os.path.exists(audio_region_path):
            logger.debug(f"Audio segment already exists: {audio_region_path}")
            seconds += self.step_seconds
            current_file_seconds += self.step_seconds
            continue
        if not is_end_of_clip:
            audio_region_paths.append(audio_region_path)
            # NOTE: here I save the file
            clip_region.save(audio_region_path)
            logger.debug(f"Split audio into {self.step_seconds}-second segments: {audio_region_path}")
            speech_file_paths.append(speech_file_path)
            seconds += self.step_seconds
            current_file_seconds += self.step_seconds
        else:
            # NOTE: If this is the last (partial) clip, carry it over and merge it with the next audio file.
            audio_region = clip_region
            break
    # NOTE: If all audio files have been read, exit the loop.
    if file_index == len(self.audio_file_paths):
        # NOTE: If there is a remaining part, save it.
        if audio_region.duration > 0:
            audio_region_paths.append(audio_region_path)
            audio_region.save(audio_region_path)
            logger.debug(f"Split audio into {self.step_seconds}-second segments: {audio_region_path}")
            speech_file_paths.append(speech_file_path)
        break
logger.warning(f"Start separating vocals. Total audio length: {get_time_point_name(all_audio_length)}")
self.separate_speech_from_audio(input_paths=audio_region_paths, output_folder=clip_path)
logger.warning(f"Separation completed. Total audio length: {get_time_point_name(all_audio_length)}")
# NOTE: Merge the separated audio clips into one complete file.
self.merge_speeches(audio_files=speech_file_paths)
|
I'd solved this with a combination of `zram` and 48GiB RAM.
|
I'm working with live recordings, and it looks promising so far, but I have not worked out what file needs tweaking to read beyond 600.0 seconds.
I'd hate to have to break everything into a bunch of 10-minute segments, process them, and merge them back together.