Can't open this bbx file (attached) #10

Open
limonspb opened this issue Aug 26, 2024 · 12 comments

limonspb commented Aug 26, 2024

Hello @atomgomba, could you please help me open this bbx file?
I really want to play with the data from Python.
File (remove the .txt from the file extension):
LOG00031.BFL.txt

Here is what I'm getting: [screenshot]

The Betaflight Blackbox Explorer from https://master.dev.blackbox.betaflight.com/ opens it with no problem.

Thanks!

@limonspb limonspb changed the title Can't open bbx file Can't open this bbx file (attached) Aug 27, 2024
@atomgomba atomgomba self-assigned this Aug 27, 2024
atomgomba (Owner) commented Aug 27, 2024

@limonspb The problem is that the headers are normally written as one header+value per line. But as you can see if you open the file with, for instance, vim, the last header (gps_rescu ?) does not end with a line feed. Nonetheless, if BB Explorer can digest it, I'm sure I can make orangebox do it too. I'm only wondering why the gps_rescu header is different. Looking at the log file in vim, it seems that header wasn't completely written to the file and is truncated.

EDIT: Yeah, so it looks like the header part of the log is truncated; it wasn't completely written. Could be a Betaflight bug?

limonspb (Author) commented:

> @limonspb The problem is that the headers are normally written as one header+value per line. […] Could be a Betaflight bug?

Could be a bug, yeah... Is there a limitation on the string size in the header?

atomgomba (Owner) commented:

> Could be a bug, yeah... Is there a limitation on the string size in the header?

On the Python side the maximum size of a string is limited only by the available memory.

atomgomba (Owner) commented:

@limonspb I have a plan to add an option that allows parsing of incomplete headers and just prints a warning in that case. But it seems like the Python ecosystem has evolved and my distribution method is deprecated, so doing a new release will take some time. Until then, maybe you could have a look at why the headers aren't completely written in the log file.
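
For illustration, the option could work roughly like the sketch below; this is a minimal example, not orangebox's actual code, and the "H name:value" header layout here is simplified:

import warnings

def parse_headers(data: bytes) -> dict:
    # Headers are "H name:value" lines; tolerate a truncated last line
    # (one that was cut off mid-write and has no trailing line feed).
    headers = {}
    for line in data.split(b"\n"):
        if not line.startswith(b"H "):
            continue
        try:
            name, value = line[2:].decode("ascii", "replace").split(":", 1)
        except ValueError:
            # Incomplete header, e.g. a name with no ":value" part
            warnings.warn("Ignoring incomplete header line: %r" % line)
            continue
        headers[name] = value
    return headers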

atomgomba (Owner) commented:

@limonspb Just a heads up: as you can see from the latest commits, I have started to prepare the new release with the necessary changes. I just need some more time for testing and touch-ups (updating the docs if necessary, etc.).

atomgomba (Owner) commented:

The current master version is bugged, please don't use it.

limonspb (Author) commented Sep 12, 2024

> The current master version is bugged, please don't use it.

I'm currently using CSV exported from BBX, but I would love to eventually use BBX files directly. Please let me know when I can try it.

atomgomba (Owner) commented:

I think it now works correctly. Could you please install and test the master version? You can use the following command:

pip install git+https://github.com/atomgomba/orangebox --break-system-packages

The last option (--break-system-packages) can probably be omitted; I think it is only needed on some Linux systems.
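
If you'd rather avoid that flag altogether, installing into a virtual environment should also work (standard venv workflow, nothing orangebox-specific):

python -m venv venv
. venv/bin/activate
pip install git+https://github.com/atomgomba/orangebox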

limonspb (Author) commented Sep 18, 2024

> I think it now works correctly. Could you please install and test the master version? […]

So the bbx problem above seems to be a bug in 4.5 that is no longer present in 4.6.
But it might still be useful to handle unfinished headers without crashing.

I installed as you suggested with pip install git+https://github.com/atomgomba/orangebox --break-system-packages.

The other issue I found is that reading frames, especially in debug mode, is noticeably slower than reading frames from the exported CSV file. I don't know if it's possible to speed it up; I'm not a Python expert, I just thought that dealing with binary could be faster than parsing strings. What do you think? Here is how I read from the BBX-exported CSV:

    import numpy as np
    
    file_path = .......
    identifier = "loopIteration"
    # Find the row where the header starts
    # (find_header_row is a small helper, not shown here, that returns the
    # 0-based index of the first line containing the identifier)
    header_row = find_header_row(file_path, identifier)
    
    if header_row is None:
        raise ValueError(f"Header row with identifier '{identifier}' not found in the file.")
    
    # Read the header row to get the column names
    column_names = np.genfromtxt(file_path, delimiter=',', skip_header=header_row, max_rows=1, dtype=str)
    # Strip quotes from column names
    column_names = [name.strip('"').strip("'") for name in column_names]

    data_list = []
    with open(file_path, 'r') as f:
        total_lines = sum(1 for _ in f) - (header_row + 1)
    
    with open(file_path, 'r') as file:
        for i, line in enumerate(file):
            if i <= header_row:
                continue
            data_list.append(np.array(line.strip().split(','), dtype=float))
            
            # Print progress
            if i % 1000 == 0 or i == total_lines + header_row:
                print(f"Reading file: {i - header_row} / {total_lines} rows")
    
    # Convert list of arrays to a numpy array
    data = np.vstack(data_list)

    # Create a dictionary where the keys are column names and the values are arrays of floats
    data_dict = {column_names[i]: data[:, i] for i in range(len(column_names))}

This takes ~3-4 seconds for 150k rows of CSV in debug mode.
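
(As an aside, the row-by-row loop above can probably be replaced with a single vectorized read, which tends to be faster; a sketch assuming the same CSV layout and the same header_row and column_names as above:)

import numpy as np

# Parse the whole numeric block in one call instead of a per-line Python loop
data = np.loadtxt(file_path, delimiter=',', skiprows=header_row + 1)
data_dict = {name: data[:, i] for i, name in enumerate(column_names)}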

Here is how I read from binary with orangebox (takes ~40 seconds in debug mode), no processing, just reading:

from orangebox import Parser

# Load a file
parser = Parser.load("logs/btfl_001_good_GPS.bbl")
# or optionally select a log by index (1 is the default)
# parser = Parser.load("btfl_all.bbl", 1)

# Print headers
print("headers:", parser.headers)

# Print the names of fields
print("field names:", parser.field_names)

# Select a specific log within the file by index
print("log count:", parser.reader.log_count)
parser.set_log_index(1)

# Print field values frame by frame
i = 0
total = 0
for frame in parser.frames():
    #print("first frame:", frame.data)
    i = i + 1
    total = total + 1
    if i > 1000:
        i = 0
        print(total)

    #break

print("done reading:", frame.data)

# Complete list of events only available once all frames have been parsed
print("events:", parser.events)

# Selecting another log changes the header and frame data produced by the Parser
# and also clears any previous results and state
parser.set_log_index(1)
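
(For completeness, a small sketch of collecting the frames into the same dict-of-arrays structure as the CSV path. It assumes frame.data lines up with parser.field_names, as the example above suggests, and that every frame has the same number of numeric fields:)

import numpy as np
from orangebox import Parser

parser = Parser.load("logs/btfl_001_good_GPS.bbl")
field_names = parser.field_names
# Collect every frame's values; assumes uniform row length and numeric values
rows = [frame.data for frame in parser.frames()]
data = np.array(rows, dtype=float)
data_dict = {name: data[:, i] for i, name in enumerate(field_names)}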

atomgomba (Owner) commented:

It feels natural to me that decoding compressed binary data can be much slower than reading a text file line by line. In debug mode the log will contain more data and more frames, which makes the process even slower. You can use the bb2csv utility provided in the package to convert a blackbox file to CSV.

limonspb (Author) commented Sep 18, 2024

> It feels natural to me that decoding compressed binary data can be much slower than reading a text file line by line. In debug mode the log will contain more data and more frames, which makes the process even slower. […]

Sorry, when I say "debug" I mean running Python with a debugger attached. Without the debugger both get faster, but the ratio is not as big.
Hmm, I thought it's not actually compressed data, just a binary representation that doesn't require parsing strings into numbers.

atomgomba (Owner) commented Sep 18, 2024

It's using different kinds of variable-length encodings where the number of bytes written depends on the magnitude of the value, so for example 0 and 2000 result in byte representations of different lengths. The way data is persisted in Betaflight logs is very efficient, because you only have so much time to persist data within a single loop cycle, so it has to be efficient to avoid skipping frames (although lost or corrupted frames are a thing). Writing to storage is the slowest part, and the fewer bytes written the better.
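
To illustrate the general idea (this is just a plain unsigned variable-byte scheme, not Betaflight's exact encoding):

def encode_uvarint(value: int) -> bytes:
    # 7 data bits per byte; the high bit marks "more bytes follow"
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

print(len(encode_uvarint(0)))     # 1 byte
print(len(encode_uvarint(2000)))  # 2 bytes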

Funnily enough, the reason for debug mode being slower could be much the same as using debug mode in Betaflight. When the compiler is in debug mode it adds extra information to the resulting artifact, for instance matching stack frames with line numbers in the source code. At the moment I'm not sure how I would approach making the current Python code faster. I have an abandoned project, a blackbox decoder implementation written in Rust; at least it could be something more maintainable than the current C implementation. Now I have a new laptop that can do everything I want, so maybe I will return to that old project...
