Replies: 2 comments
-
There is nothing wrong with using Python routines to manipulate your JSON data; indeed, it is sometimes preferable. When loading anything from Python objects or JSON, Awkward already has to perform a loop, so there is no harm in doing it yourself, particularly when it's cleaner to do so. Awkward is designed to deal with columnar data, i.e. many rows and few columns, whereas this scheme has many columns and a single row.
Rather than doing all of this work, we can directly use a JSON reading library (such as json):
    import json
    import awkward as ak
    data = json.loads(json_str)
Then we can restructure it. Python has long guaranteed that iterating over the various views of a dict (keys, values) visits the entries in the same order, so the two line up:
    array = ak.with_field(data.values(), data.keys(), "id")
One could also do this by first creating the array from the values, and then adding the "id" field in a second step.
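As a rough sketch of the same restructuring done entirely in plain Python, before any array is built: the JSON string, its keys and the "x"/"y" fields below are made up for illustration, since the example from the question is not shown here. Each top-level key is copied into its own record under "id", which gives a list of same-shaped records of the kind that ak.from_iter accepts.

```python
import json

# Hypothetical input: an object keyed by row id (not the data from the question).
json_str = '{"a": {"x": 1, "y": 2.5}, "b": {"x": 3, "y": 4.5}}'

data = json.loads(json_str)

# items() pairs each key with its own value, so no separate
# keys()/values() alignment is needed here.
records = [{**row, "id": key} for key, row in data.items()]

print(records)
```

The resulting list has one entry per row and a small, fixed set of fields, which is the layout Awkward is designed for.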
-
I thought I'd add just a little more: the one thing that you want to avoid doing is making a record with a large number of fields (more than 1000, say), because different fields are different array buffers, and working with lots of little arrays gives you none of the performance benefit of working with arrays. This should be avoided even as an intermediate step: it's better to iterate over data in Python than to make a million little arrays (which then get iterated over in Python).
Since each of your top-level fields points to structures that are all the same type (in your example), you'll want the same-type data to be in shared fields. The problem is just to turn the dict that you start with into a list (i.e. dropping the top-level field names) before converting it into an Awkward Array. This is what ak.from_iter(json.loads(json_str).values()) does.
(Okay, so some of the values of your top-level fields differ in that they have a "check" subfield, but this will be read in as though "check" were a missing value on all of the other records, and you can work with it in Awkward Array as an option-type.)
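To make the shape concrete, here is a plain-Python sketch of what dropping the top-level names and treating an absent "check" as missing amounts to. It is only an illustration of the idea, not how Awkward implements it, and the JSON string and the "x" field are hypothetical.

```python
import json

# Hypothetical input: only one record carries a "check" subfield.
json_str = '{"a": {"x": 1}, "b": {"x": 2, "check": true}}'

# Drop the top-level field names: keep just the list of records.
rows = list(json.loads(json_str).values())

# Union of field names across all records, in first-seen order.
fields = list(dict.fromkeys(name for row in rows for name in row))

# One shared column per field; None marks a record that lacks the field.
columns = {name: [row.get(name) for row in rows] for name in fields}

print(columns)
```

The number of columns depends on the record type rather than on how many rows there are, which is why the row names should not become fields.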
-
Some JSON files use an Object instead of an Array to store data rows, for example:
from_json() returns a Record:
but what I really want is:
Is there any method that can convert from data to data2 without modifying the original JSON data?