Following Jim's instructions, I was able to merge some parquet files that couldn't normally be merged with an ak.from_parquet -> ak.to_parquet sequence because of memory limitations:
In [8]: folder = "EGamma0_Run2024F-PromptEGMNano-v1_NANOAOD"
In [9]: files = [f"{folder}/{x}" for x in os.listdir(folder) if x.endswith(".parquet")][:3]
In [10]: files
Out[10]:
['EGamma0_Run2024F-PromptEGMNano-v1_NANOAOD/NTuples-part000.parquet',
'EGamma0_Run2024F-PromptEGMNano-v1_NANOAOD/NTuples-part001.parquet',
'EGamma0_Run2024F-PromptEGMNano-v1_NANOAOD/NTuples-part002.parquet']
In [11]: folder
Out[11]: 'EGamma0_Run2024F-PromptEGMNano-v1_NANOAOD'
In [12]: def generate():
    ...:     for f in files:
    ...:         array = ak.from_parquet(f)
    ...:         yield array
    ...:         del array
    ...:
In [13]: ak.to_parquet_row_groups(generate(), f"{folder}.parquet")
Out[13]:
<pyarrow._parquet.FileMetaData object at 0x7fe93ec623b0>
  created_by: parquet-cpp-arrow version 13.0.0
  num_columns: 272
  num_rows: 57038
  num_row_groups: 3
  format_version: 2.6
  serialized_size: 0
I'm just posting this here because perhaps something along these lines could be implemented in hepconvert to make parquet merging a bit smarter.
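In case it helps the discussion, here is a rough sketch of how the session above might be wrapped into reusable functions. The names list_parquet_files, iter_arrays and merge_parquet are hypothetical (they are not part of awkward or hepconvert), and the read parameter is only there so the lazy behaviour can be checked without real data. I'm also assuming every input file has the same schema, which I believe the row-group writer needs.

```python
import os
from typing import Any, Callable, Iterable, Iterator, List, Optional


def list_parquet_files(folder: str) -> List[str]:
    """Return the paths of the .parquet files in `folder`, sorted by name."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.endswith(".parquet")
    )


def iter_arrays(
    files: Iterable[str],
    read: Optional[Callable[[str], Any]] = None,
) -> Iterator[Any]:
    """Lazily yield one array per file, reading each file only when asked.

    `read` defaults to ak.from_parquet; it is a hypothetical hook so that
    another reader can be swapped in.
    """
    if read is None:
        import awkward as ak  # imported lazily; only needed for the default reader

        read = ak.from_parquet
    for path in files:
        # Yield the array directly so that no local variable in this
        # generator keeps the previous file's data alive.
        yield read(path)


def merge_parquet(folder: str, destination: str) -> Any:
    """Write each input file as one row group of `destination`.

    Assumption: all input files share the same schema.
    """
    import awkward as ak

    return ak.to_parquet_row_groups(
        iter_arrays(list_parquet_files(folder)), destination
    )
```

With this, the merge in the session would be roughly merge_parquet(folder, f"{folder}.parquet"), except that it takes every file in the folder instead of only the first three. Peak memory should stay close to the size of the largest single input file, since only one array is read at a time.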