Expected Behavior
I would like to be able to perform searches on the results of a Tiled-backed catalog. For example:
from tiled.client import from_profile
from databroker.queries import TimeRange

c = from_profile("xyz")  # note I am connecting directly to the mongodb here and not using the tiled server
results = c.search(TimeRange(since="2022-04-10", until="2022-04-12"))
run = results[5]
Current Behavior
Currently, if there are multiple runs in the catalog c with the scan_id 5, then calling:
run = results[5]
first searches the catalog c for all the entries with the scan_id 5, then takes the latest one (which might not lie in the time range we want), and only then checks whether it lies in the time range specified in the original TimeRange search.
This is not the expected order of operations and leads to the error "KeyError: 'No match for scan_id=5'".
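To make the two orders of operations concrete, here is a minimal sketch on toy in-memory data. It is not how databroker or Tiled implement the lookup; the run dictionaries and the function names `lookup_global_first` and `lookup_scoped` are hypothetical, chosen only to mirror the behavior described above.

```python
from datetime import date

# Hypothetical toy data: two runs share scan_id 5, on different days.
RUNS = [
    {"uid": "a1", "scan_id": 5, "time": date(2022, 4, 11)},
    {"uid": "b2", "scan_id": 5, "time": date(2022, 5, 20)},
]


def in_range(run, since, until):
    """True if the run's time lies within [since, until]."""
    return since <= run["time"] <= until


def lookup_global_first(runs, scan_id, since, until):
    """Order reported in this issue: pick the latest match, then check the time range."""
    matches = [r for r in runs if r["scan_id"] == scan_id]
    if matches:
        latest = max(matches, key=lambda r: r["time"])
        if in_range(latest, since, until):
            return latest
    raise KeyError(f"No match for scan_id={scan_id}")


def lookup_scoped(runs, scan_id, since, until):
    """Expected order: restrict to the time range first, then pick the latest match."""
    matches = [
        r for r in runs if in_range(r, since, until) and r["scan_id"] == scan_id
    ]
    if not matches:
        raise KeyError(f"No match for scan_id={scan_id}")
    return max(matches, key=lambda r: r["time"])


since, until = date(2022, 4, 10), date(2022, 4, 12)

# The scoped lookup finds the April run.
print(lookup_scoped(RUNS, 5, since, until)["uid"])

# The global-first lookup picks the May run, which then fails the time check.
try:
    lookup_global_first(RUNS, 5, since, until)
except KeyError as err:
    print("KeyError:", err)
```

In this toy model the only difference between the two functions is whether the time filter is applied before or after choosing the latest duplicate.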
Possible Solution
Naively, I can see that it's possible to achieve what I want by making the ScanID function return the entire list of entries which match the requested scan_ids. This subset of the original catalog is then what gets searched in subsequent searches.
def ScanID(*scan_ids, duplicates="all"):
    # Wrap _ScanID to provide a nice usage for *one or more scan_ids*:
    # >>> ScanID(5)
    # >>> ScanID(5, 6, 7)
    # Placing a varargs parameter (*scan_ids) in the dataclass constructor
    # would cause trouble on the server side and generally feels "wrong"
    # so we have this wrapper function instead.
    return _ScanID(scan_ids=scan_ids, duplicates=duplicates)
I still don't understand why the original catalog and not the results catalog is searched though, and this solution probably breaks other stuff.
It is of course possible to avoid these problems by using uid rather than scan_id.
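The effect of the duplicates setting can be illustrated with a small stand-alone sketch. This is a toy model, not databroker's `_ScanID` implementation: the function `match_scan_ids`, the helper `within`, and the sample runs are hypothetical, and the "latest" policy is assumed from the behavior described above.

```python
def match_scan_ids(runs, scan_ids, duplicates="latest"):
    """Return the runs whose scan_id was requested, under a duplicates policy.

    "latest" keeps only the newest run per scan_id; "all" keeps every match.
    """
    if duplicates not in ("latest", "all"):
        raise ValueError(f"Unknown duplicates policy: {duplicates!r}")
    matches = [r for r in runs if r["scan_id"] in scan_ids]
    if duplicates == "all":
        return matches
    newest = {}
    for run in matches:
        current = newest.get(run["scan_id"])
        if current is None or run["time"] > current["time"]:
            newest[run["scan_id"]] = run
    return list(newest.values())


def within(runs, since, until):
    """Keep runs whose ISO date string lies in [since, until]."""
    return [r for r in runs if since <= r["time"] <= until]


# Hypothetical runs; ISO date strings compare correctly as plain strings.
runs = [
    {"uid": "a1", "scan_id": 5, "time": "2022-04-11"},
    {"uid": "b2", "scan_id": 5, "time": "2022-05-20"},
]

# With "latest", the April run is discarded before the time filter sees it.
latest_then_time = within(match_scan_ids(runs, (5,), "latest"), "2022-04-10", "2022-04-12")

# With "all", both duplicates survive and the time filter selects the April run.
all_then_time = within(match_scan_ids(runs, (5,), "all"), "2022-04-10", "2022-04-12")

print(len(latest_then_time), [r["uid"] for r in all_then_time])
```

Keeping every duplicate defers the choice of which run to return until all other filters have been applied, which is why it sidesteps the error in this model.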
Context
My users and I make heavy use of aggregated searches to find subsets of our database. This is useful when giving users access to parts of the database, for example the period when they ran their investigation, or only the runs they added under their own user (the username is added to the metadata). It's quite common for my users to first search the database using a TimeRange, and then look for scan_ids within the time range of their beamtime.
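The workflow described here relies on searches that compose, with each search narrowing the previous result. A minimal sketch of that idea follows, assuming a hypothetical `ToyCatalog` class and predicate helpers `time_range` and `by_user`; none of these are part of databroker or Tiled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class ToyCatalog:
    """Hypothetical catalog whose searches return narrowed sub-catalogs."""

    runs: Tuple[dict, ...]

    def search(self, predicate):
        # Each search only sees the runs that survived the previous searches.
        return ToyCatalog(tuple(r for r in self.runs if predicate(r)))

    def __getitem__(self, scan_id):
        # Resolve the scan_id within this (sub-)catalog only, newest first.
        matches = [r for r in self.runs if r["scan_id"] == scan_id]
        if not matches:
            raise KeyError(f"No match for scan_id={scan_id}")
        return max(matches, key=lambda r: r["time"])


def time_range(since, until):
    return lambda r: since <= r["time"] <= until


def by_user(username):
    return lambda r: r["username"] == username


catalog = ToyCatalog((
    {"uid": "a1", "scan_id": 5, "time": date(2022, 4, 11), "username": "alice"},
    {"uid": "b2", "scan_id": 5, "time": date(2022, 5, 20), "username": "bob"},
    {"uid": "c3", "scan_id": 6, "time": date(2022, 4, 11), "username": "alice"},
))

# Narrow by beamtime, then by user, then look up a scan_id in what is left.
beamtime = catalog.search(time_range(date(2022, 4, 10), date(2022, 4, 12)))
mine = beamtime.search(by_user("alice"))
run = mine[5]
print(run["uid"])
```

Here the scan_id lookup on `mine` never sees the May run, because the sub-catalog carries only the runs that passed the earlier filters.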
We are currently using databroker backed by intake, and I am looking at using Tiled, which is why I ran some existing scripts against it and found the problem described above.
Your Environment
python 3.8
tiled==0.1.0a79
databroker==2.0.0b12