This repository has been archived by the owner on Jun 21, 2022. It is now read-only.
root file, written with uproot, number of events in tree issue #359
This simple example reproduces my problem:
In [1]: import uproot
In [2]: import numpy as np
In [3]: with uproot.recreate("example.root") as f:
   ...:     f["t"] = uproot.newtree({"a": "float32", "b": "float32", "c": "float32", "d": "float32"})
   ...:
   ...:     for i in range(5):
   ...:         f["t"].extend({"a": np.random.normal(0, 1, 1000), "b": np.random.normal(0, 1, 1000),
   ...:                        "c": np.random.normal(0, 1, 1000), "d": np.random.normal(0, 1, 1000)})
   ...:
In [4]: uproot.numentries("example.root", "t")
Out[4]: 20000
However, changing the flush size doesn't seem to do anything here.
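For reference, the entry count that the loop above should produce can be worked out by hand. The following is only a minimal sketch of that arithmetic; the helper name `expected_entries` is made up for illustration and is not part of uproot.

```python
def expected_entries(n_extend_calls, entries_per_call):
    """Entries a tree should hold after equally sized extend calls."""
    return n_extend_calls * entries_per_call


# 5 extend calls, each adding 1000 values per branch (from the session above).
expected = expected_entries(5, 1000)

# Value reported by uproot.numentries in the session above (copied, not recomputed).
observed = 20000

print(expected, observed, observed / expected)
```

The tree should hold 5000 entries, so the reported count is too large by a factor of 4.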
Does this only happen with
This example does reproduce the effect of the flush size.
In [1]: import uproot
In [2]: import numpy as np
In [3]: with uproot.recreate("example.root") as f:
   ...:     f["t"] = uproot.newtree({f"branch_{i}": np.float32 for i in range(50)}, flushsize="10 MB")
   ...:
   ...:     for j in range(5):
   ...:         f["t"].extend({f"branch_{i}": np.random.normal(0, 1, 5000) for i in range(50)})
   ...:
In [4]: uproot.numentries("example.root", "t")
Out[4]: 100000
In [5]: with uproot.recreate("example.root") as f:
   ...:     f["t"] = uproot.newtree({f"branch_{i}": np.float32 for i in range(50)}, flushsize="2 MB")
   ...:
   ...:     for j in range(5):
   ...:         f["t"].extend({f"branch_{i}": np.random.normal(0, 1, 5000) for i in range(50)})
   ...:
In [6]: uproot.numentries("example.root", "t")
Out[6]: 58056
@jpivarski I will give it a try.
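To put the two flush sizes in context, here is a rough estimate of how much raw data each `extend` call hands to the tree in this example. It is a sketch under stated assumptions: 4 bytes per `float32` value, no compression, and no basket or header overhead. The helper `bytes_per_extend` is hypothetical, and how uproot interprets the string "2 MB" internally is not checked here.

```python
FLOAT32_BYTES = 4  # assumption: each value is stored as a 4-byte float


def bytes_per_extend(n_branches, entries_per_call, itemsize=FLOAT32_BYTES):
    """Raw payload of one extend call, ignoring compression and headers."""
    return n_branches * entries_per_call * itemsize


per_call = bytes_per_extend(50, 5000)   # 1,000,000 bytes per extend call
total = 5 * per_call                    # 5,000,000 bytes over the whole loop
expected = 5 * 5000                     # entries that should be in the tree

# Entry counts copied from the session above, keyed by flushsize.
observed_counts = {"10 MB": 100000, "2 MB": 58056}
for flushsize, observed in observed_counts.items():
    print(flushsize, "excess entries:", observed - expected)
```

Under these assumptions the whole loop writes about 5 MB, which stays below a 10 MB flush size but crosses a 2 MB one more than once. In both cases the tree should hold 25000 entries.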
Thanks @jpivarski, with the low-level interface it works (I tend to forget about it 🤷♂️).
In [7]: with uproot.recreate("example.root") as f:
   ...:     f["t"] = uproot.newtree({f"branch_{i}": np.float32 for i in range(50)}, flushsize="2 MB")
   ...:
   ...:     for j in range(5):
   ...:         for i in range(50):
   ...:             f["t"][f"branch_{i}"].newbasket(np.random.normal(0, 1, 5000))
   ...:
In [8]: uproot.numentries("example.root", "t")
Out[8]: 25000
Ultimately, both should work, but we're trying to separate bugs in low-level writing from bugs in the flushing logic. It looks like this one is in the flushing logic.
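One way to keep the two kinds of bug apart is to run each write path against the same expected count and see which one disagrees. The sketch below is not uproot code: `compare_write_paths` is a hypothetical helper, and the two writers are stand-ins that simply replay the counts reported in this thread rather than writing a file.

```python
def compare_write_paths(write_paths, count_entries, expected):
    """Run each write path, read the entry count back, and flag mismatches."""
    report = {}
    for name, write in write_paths.items():
        write()                     # write the file through one code path
        observed = count_entries()  # read the entry count back
        report[name] = (observed, observed == expected)
    return report


# Stand-ins for the real writers: they only record the counts seen above.
state = {"entries": 0}


def via_extend():
    state["entries"] = 58056  # count reported after the extend loop (2 MB)


def via_newbasket():
    state["entries"] = 25000  # count reported after the newbasket loop


report = compare_write_paths(
    {"extend": via_extend, "newbasket": via_newbasket},
    count_entries=lambda: state["entries"],
    expected=25000,
)
print(report)
```

In real use, the two writers would wrap the `extend` and `newbasket` loops shown in this thread, and `count_entries` would call `uproot.numentries("example.root", "t")`.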
Hi,
A similar problem to #345, but this time regarding the number of events in the tree that I write to a file, with the following code:
When I read back the file, there are more events in the tree than I wrote. Trying again with a smaller flushsize decreases the number of events read, but it is still not the same as the number of events that I wrote ...