Format types #307

scott-griffiths · 2024-01-02T09:59:16Z

scott-griffiths
Jan 2, 2024
Maintainer

So there is a plan to make pack a regular method, so that each class can use it to create a new instance (rather than the current, slightly strange, arrangement where it's a module-level function that creates a BitStream - I'm not sure why I did that).

The simple way would be to copy the current pack, but the first parameter is called fmt and that brought me back to the bit format ideas that have been circulating for various different reasons, and so I thought it a good idea to write some stuff down to see what makes sense.

The very simplest Format would be a list of Dtypes. For example [Dtype('uint12'), Dtype('bool'), Dtype('float16')]. This could then be used to construct a bitstring by combining it with the correct number of inputs. It could also be used to unpack that bitstring. This is quite possibly 80% of the cases already, but I would like to make it more flexible.
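As a rough illustration of the list-of-Dtypes idea, here is a minimal pure-Python sketch. It is not the bitstring API: `ToyDtype`, `pack` and `unpack` are hypothetical stand-ins, bitstrings are modelled as plain strings of '0' and '1', and only unsigned integers and bools are handled (float16 is left out to keep it short).

```python
from typing import NamedTuple


class ToyDtype(NamedTuple):
    """Hypothetical stand-in for Dtype: a kind plus a length in bits."""
    kind: str    # 'uint' or 'bool' only, for illustration
    length: int


def pack(fmt, *values):
    """Build a '0'/'1' string from one value per dtype in fmt."""
    if len(values) != len(fmt):
        raise ValueError(f"expected {len(fmt)} values, got {len(values)}")
    bits = ""
    for dtype, value in zip(fmt, values):
        as_int = int(value)
        if not 0 <= as_int < (1 << dtype.length):
            raise ValueError(f"{value!r} does not fit in {dtype.length} bits")
        bits += format(as_int, f"0{dtype.length}b")
    return bits


def unpack(fmt, bits):
    """Read one value per dtype back out of the '0'/'1' string."""
    values, pos = [], 0
    for dtype in fmt:
        chunk = bits[pos:pos + dtype.length]
        if len(chunk) < dtype.length:
            raise ValueError("not enough bits to unpack")
        raw = int(chunk, 2)
        values.append(bool(raw) if dtype.kind == "bool" else raw)
        pos += dtype.length
    return values


fmt = [ToyDtype("uint", 12), ToyDtype("bool", 1)]
packed = pack(fmt, 352, True)
print(packed, unpack(fmt, packed))
```

The same list drives both directions, which is the property the simplest Format would need.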

The next thing to add is bit literals. So basically any fixed set of bits, no matter how they are created. Any adjacent ones should be concatenated. The idea is that as much of the parsing as possible is done when the Format is created, so it can be efficiently reused. This is all fine for constructing bitstrings, but raises some questions when unpacking them again:

f = Format('0x47, u16')
b = Bits.pack(f, 6)
b.unpack(f)  # -> [6]
b2 = Bits('0x48, u16=6')
b2.unpack(f)  # -> ???

In the second case it wants to unpack the literal 0x47 but it has the literal 0x48. It's not actually wanting to return the literal. I guess it should just raise an exception? Is ValueError appropriate here?
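One way the "just raise an exception" option could look, as a sketch only. This is not how bitstring behaves today; the format here is a hypothetical list where a `str` item is a literal run of bits and an `int` item is the width of an unsigned field.

```python
def unpack(fmt, bits):
    """Unpack fields from a '0'/'1' string, verifying literals along the way.

    Literals are checked but never returned; a mismatch raises ValueError.
    """
    values, pos = [], 0
    for item in fmt:
        if isinstance(item, str):
            found = bits[pos:pos + len(item)]
            if found != item:
                raise ValueError(
                    f"expected literal {item} at bit {pos}, found {found}"
                )
            pos += len(item)
        else:
            chunk = bits[pos:pos + item]
            if len(chunk) < item:
                raise ValueError("not enough bits to unpack")
            values.append(int(chunk, 2))
            pos += item
    return values


header = [format(0x47, "08b"), 16]               # like '0x47, u16'
good = format(0x47, "08b") + format(6, "016b")   # like '0x47, u16=6'
bad = format(0x48, "08b") + format(6, "016b")    # like '0x48, u16=6'

print(unpack(header, good))
try:
    unpack(header, bad)
except ValueError as err:
    print("mismatch:", err)
```

ValueError seems a reasonable fit since the input has the right type but the wrong content, though a dedicated exception subclass would be another option.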

Constructing Format instances: Well we can just use a str. But it would be perverse to not allow a bitstring or a Dtype. And come to think of it another Format...

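from __future__ import annotations
from typing import Iterable
from bitstring import Bits, Dtype

class Format:
    def __init__(self, fmt: str | Format | Dtype | Bits | Iterable[str | Format | Dtype | Bits], name: str = '') -> None:
        # A single item is treated as a one-element list.
        if isinstance(fmt, (str, Format, Dtype, Bits)):
            fmt = [fmt]
        for f in fmt:
            # Flatten out into List[Dtype | BitStore]
            ...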

So internally we're storing a list of Dtype and BitStore, with no adjacent BitStores. A pack can then just expect a number of inputs equal to the number of Dtypes.
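A sketch of what that flattening step might look like, under some loud assumptions: `Field` and `Literal` are hypothetical stand-ins for Dtype and BitStore, tokens are split on commas, and only `0x` and `0b` literals are recognised. The point is just the merge rule for adjacent literals.

```python
from typing import NamedTuple


class Field(NamedTuple):
    """Hypothetical stand-in for a Dtype; keeps the raw token only."""
    token: str


class Literal(NamedTuple):
    """Hypothetical stand-in for a BitStore; bits held as a '0'/'1' string."""
    bits: str


def parse_token(token):
    token = token.strip()
    if token.startswith("0x"):
        digits = token[2:]
        return Literal(format(int(digits, 16), f"0{4 * len(digits)}b"))
    if token.startswith("0b"):
        return Literal(token[2:])
    return Field(token)


def flatten(fmt):
    """Flatten nested input into a list of Field/Literal with no adjacent Literals."""
    if isinstance(fmt, (str, Field, Literal)):
        fmt = [fmt]
    flat = []
    for item in fmt:
        if isinstance(item, str):
            parts = [parse_token(t) for t in item.split(",")]
        elif isinstance(item, (Field, Literal)):
            parts = [item]
        else:
            parts = flatten(item)  # nested iterable of the same kinds
        for part in parts:
            if isinstance(part, Literal) and flat and isinstance(flat[-1], Literal):
                flat[-1] = Literal(flat[-1].bits + part.bits)  # merge neighbours
            else:
                flat.append(part)
    return flat


flat = flatten(["0x4", "0x7, u16", ["bool", "0b1"]])
n_inputs = sum(isinstance(p, Field) for p in flat)
print(flat, n_inputs)
```

Once flattened, the number of inputs a pack expects is just the count of non-literal entries.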

I think that it could be important that a Dtype or a Bits is also considered a valid Format. I've added a name field to the Format so that they can be referred to in other places. Exactly how I'm not sure - I don't want to reuse the = sign.

Let's say we had f = Format('0x47, u16, u16', name='header'). We can then use b = Bits.pack(f, 352, 288) without referencing the name. Let's say we want to name the fields as width and height: w = Format('u16', name='width') etc. The original format could then become f = Format(['0x47', w, h], name='header'). We could further make a = Format([w, h], name='area') so that f = Format(['0x47', a], name='header'). As an aside, what would be a good way to print this?

>>> print(f)
Format: header
    Bits: 0x47
    Format: area
        Format: width
            Dtype: u16
        Format: height
            Dtype: u16
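For what it's worth, that indented layout falls out of a very small recursive walk. A sketch, with the tree modelled as hypothetical `(kind, label, children)` tuples rather than real Format objects:

```python
def render(node, indent=0):
    """Return the lines for node and its children, four spaces per level."""
    kind, label, children = node
    lines = [f"{'    ' * indent}{kind}: {label}"]
    for child in children:
        lines.extend(render(child, indent + 1))
    return lines


width = ("Format", "width", [("Dtype", "u16", [])])
height = ("Format", "height", [("Dtype", "u16", [])])
area = ("Format", "area", [width, height])
header = ("Format", "header", [("Bits", "0x47", []), area])

text = "\n".join(render(header))
print(text)
```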

Okay, so how do we use the names in Format definitions?

b1 = Bits.pack(f, width=352, height=288)
b2 = Bits.pack(f, area=(352, 288))
b3 = Bits.pack(f, 352, 288)

b3 should work simply enough. As f is parsed it looks for Dtypes and fills them in from *values.
b1 sees Formats and looks up the name in **kwargs. If it finds it then it uses it, otherwise it takes the next item from *values.
For b2 it sees Format(..., name='area') and can see that kwargs['area'] returns (352, 288). It will then essentially be tmp = Bits.pack(a, (352, 288)), which won't work as the tuple is passed as a single value rather than as two...
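One possible way round the b2 problem is to spread a tuple supplied for a named Format across that Format's children. This is only a sketch of that rule, not a decision: `Node`, `resolve` and `gather` are hypothetical names, literals are left out because they take no value, and the result is just the flat list of values that would then be packed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical format node; an empty children list means a single-value field."""
    name: str = ""
    children: list = field(default_factory=list)


def resolve(node, positional, named):
    """Return the flat list of values for node, preferring a name match."""
    if node.name and node.name in named:
        supplied = named[node.name]
        if not node.children:
            return [supplied]
        # Named sub-format: spread the supplied tuple over its children.
        if not isinstance(supplied, (tuple, list)):
            supplied = (supplied,)
        inner = iter(supplied)
        values = []
        for child in node.children:
            values.extend(resolve(child, inner, {}))
        if list(inner):
            raise ValueError(f"too many values for {node.name!r}")
        return values
    if node.children:
        values = []
        for child in node.children:
            values.extend(resolve(child, positional, named))
        return values
    try:
        return [next(positional)]  # fall back to the next item of *values
    except StopIteration:
        raise ValueError(f"no value supplied for {node.name!r}") from None


def gather(fmt, *values, **named):
    positional = iter(values)
    flat = resolve(fmt, positional, named)
    if list(positional):
        raise ValueError("too many positional values")
    return flat


w, h = Node("width"), Node("height")
a = Node("area", [w, h])
f = Node("header", [a])

print(gather(f, width=352, height=288))
print(gather(f, area=(352, 288)))
print(gather(f, 352, 288))
```

The cost of this rule is that a field whose own value is legitimately a tuple can no longer be told apart from a spread, so it would need some thought before being adopted.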

Is there a better way to create these formats? Instead of w = Format('u16', name='width') we could say w = Format('width::u16') for example. I don't want to use an = as that's already in use to give a value. I can't use just : as that's a separator between the dtype name and its length. Could use () or [] or <> around the name, or -> after it. The <width>u16 does look a bit regex-like, I guess.
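To check that the candidates don't collide with the existing : separator, here is a small parsing sketch for two of them, `name::dtype` and `<name>dtype`. The patterns and `parse_named` are hypothetical and deliberately narrow (a dtype is a word with an optional `:length`); nothing here is existing bitstring syntax.

```python
import re

# A dtype token such as 'u16' or 'uint:12' (assumed shape, for illustration).
_DTYPE = r"(?P<dtype>[A-Za-z]\w*(?::\d+)?)"
_NAME = r"(?P<name>[A-Za-z_]\w*)"

DOUBLE_COLON = re.compile(rf"^\s*(?:{_NAME}::)?{_DTYPE}\s*$")
ANGLE = re.compile(rf"^\s*(?:<{_NAME}>)?{_DTYPE}\s*$")


def parse_named(token):
    """Return (name or None, dtype) for either candidate syntax."""
    for pattern in (DOUBLE_COLON, ANGLE):
        match = pattern.match(token)
        if match:
            return match.group("name"), match.group("dtype")
    raise ValueError(f"cannot parse {token!r}")


for token in ("width::u16", "<width>u16", "uint:12", "width::uint:12"):
    print(token, "->", parse_named(token))
```

Both forms stay unambiguous next to a single : because the name separator is never followed directly by a digit-only length.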

scott-griffiths · 2024-08-23T08:36:52Z

scott-griffiths
Aug 23, 2024
Maintainer Author

This idea has now expanded to an entire new project. See bitformat for much more on this.
