Writing Awkward Arrays of type unknown #822

Closed
jpivarski opened this issue Feb 1, 2023 · 5 comments · Fixed by #870
Labels
bug The problem described is something that must be fixed

Comments

@jpivarski
Member

This is an oversight in the implementation (both Uproot and Awkward). The policy is that unknown should become float64 if it needs to be an array, because

np.array([])

makes an array of np.float64. This issue is to say that Uproot should do that automatically, and there will be an issue on Awkward to say that ak.values_astype(empty_array, np.float64) should turn the EmptyArrays into NumpyArrays with the specified type.
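For reference, the NumPy behavior that this policy mirrors can be checked directly. This is only a minimal sketch; the variable name `empty` is illustrative and not from the issue.

```python
import numpy as np

# An empty NumPy array built without an explicit dtype defaults to float64,
# which is the behavior the "unknown becomes float64" policy follows.
empty = np.array([])
print(empty.dtype)  # float64
```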

Discussed in scikit-hep/awkward#2186

Originally posted by fleble February 1, 2023
Dear experts,

I am writing ROOT files using uproot. One problem I face is when a branch is empty, thus having unknown type, which cannot be written down with uproot.

>>> ak.__version__
1.5.1
>>> uproot.__version__
4.3.7

This script

import awkward as ak
import numpy as np
import uproot

ak_array = ak.Array([[], [], []])
ak_array = ak.values_astype(ak_array, np.float64)

tree = {"branch": ak_array}

with uproot.recreate("test.root") as file:
    file["test"] = tree

terminates with the following error:

TypeError: cannot write Awkward Array type to ROOT file:

   var * unknown

And:

>>> ak_array = ak.Array([[], [], []])
>>> ak.values_astype(ak_array, np.float64)
<Array [[], [], []] type='3 * var * unknown'> 

Is it possible to give a type to an empty awkward array?


After some attempts, I found a trick: concatenating an ak array of the desired type, then filtering out the part that was concatenated.

Clever! But we'll fix it anyway.

@jpivarski jpivarski added the bug (unverified) The problem described would be a bug, but needs to be triaged label Feb 1, 2023
@jpivarski jpivarski transferred this issue from scikit-hep/awkward Feb 1, 2023
@jpivarski jpivarski added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Feb 23, 2023
@veprbl
Contributor

veprbl commented Mar 3, 2023

More generally, how can I specify a type for my array of zero length?

b = ak.ArrayBuilder()
b.begin_list()
for _ in []:
    b.begin_record()
    b.field("x")
    b.begin_list()
    b.null()
    b.end_list()
    b.end_record()
    raise RuntimeError("should not be reached")
b.end_list()
# b.snapshot()["x"] # ValueError: cannot slice EmptyArray by field name
b.snapshot().type
1 * var * unknown

The same principle of building a "full" element and slicing it away applies, but would be nice to have a better way. The docs don't seem to mention anything about manual construction/specification of types.

@jpivarski
Member Author

ArrayBuilder doesn't build pre-defined types. LayoutBuilder does, although that's in C++. I think there will someday be a LayoutBuilder implemented in Numba, with a pure-Python implementation for testing workflows in Numba.

We've talked about having a general-purpose function for making any array conform to any specified type, but in the meantime, there are specialized functions for it, like ak.values_astype.

>>> array = ak.Array([[]])
>>> array
<Array [[]] type='1 * var * unknown'>

>>> ak.values_astype(array, np.float32, including_unknown=True)
<Array [[]] type='1 * var * float32'>

The above would turn all leaf-type nodes (unknown and numerical) to float32. There isn't a good way to do it for one record branch and not another—you'd have to deconstruct it and reconstruct it with ak.unzip and ak.zip.

@agoose77
Collaborator

agoose77 commented Mar 4, 2023

There's also a brief discussion of this in the user-guide, with a suggestion to append a dummy element of the correct type.

@veprbl
Contributor

veprbl commented Mar 4, 2023

So it does not cover the dynamic type case?
