segment_gatherer skips dataset messages when all_files_are_local is set #160

Open
gerritholl opened this issue Jan 10, 2025 · 0 comments · May be fixed by #161

When all_files_are_local: True is set, the segment gatherer skips all dataset messages.

import datetime
from pytroll_collectors.segments import SegmentGatherer
from posttroll.message import Message
from satpy.utils import debug_on; debug_on()

sg = SegmentGatherer({
    "patterns": {
        "fdhsi": {
            "message_keys": ["repeat_cycle_in_day"],
            "critical_files": None,
            "wanted_files": None,
            "all_files": None,
            "is_critical_set": True,
            "topic": "/set/fci/fdhsi-fd"}},
    "timeliness": 10,
    "time_name": "start_time",
    "all_files_are_local": True})
rawstr="""pytroll://set/fci/fdhsi-fd dataset [email protected] 2025-01-10T13:43:04.219944 v1.01 application/json {"purpose": "DIS", "proc_time": "2025-01-10T13:34:13", "facility_or_tool": "IDPFI", "environment": "OPE", "start_time": "2025-01-10T13:30:00", "end_time": "2025-01-10T13:39:24", "special_compression": "JLS", "disposition_mode": "O", "repeat_cycle_in_day": 82, "platform_name": "MTI1", "count_in_repeat_cycle": 1, "dataset": [{"uri": "/data/pytroll/IN/FCI_FDHSI/W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-FDHSI-FD--CHK-BODY--DIS-NC4E_C_EUMT_20250110133413_IDPFI_OPE_20250110133007_20250110133017_N_JLS_O_0082_0001.nc", "uid": "W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-FDHSI-FD--CHK-BODY--DIS-NC4E_C_EUMT_20250110133413_IDPFI_OPE_20250110133007_20250110133017_N_JLS_O_0082_0001.nc"}, {"uri": "/data/pytroll/IN/FCI_FDHSI/W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-FDHSI-FD--CHK-BODY--DIS-NC4E_C_EUMT_20250110134044_IDPFI_OPE_20250110133819_20250110133856_N_JLS_O_0082_0036.nc", "uid": "W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-FDHSI-FD--CHK-BODY--DIS-NC4E_C_EUMT_20250110134044_IDPFI_OPE_20250110133819_20250110133856_N_JLS_O_0082_0036.nc"}], "sensor": ["fci"]}"""

sg.process(Message(rawstr=rawstr))

With "all_files_are_local": False, this gives:

[DEBUG: 2025-01-10 14:49:36 : segment_gatherer] Adding new slot: 2025-01-10 13:30:00
[INFO: 2025-01-10 14:49:36 : segment_gatherer] Setting timeout to 2025-01-10 13:49:46.877501 for slot 2025-01-10 13:30:00.
[INFO: 2025-01-10 14:49:36 : segment_gatherer] 82 processed

With "all_files_are_local": True, this gives:

[DEBUG: 2025-01-10 14:49:47 : segment_gatherer] No key 'uri' in message.
[DEBUG: 2025-01-10 14:49:47 : segment_gatherer] No parser matching message, skipping.

Why do I have all_files_are_local: True when collecting dataset messages whose scheme has already been removed? Because I have a segment gatherer that collects 1) a dataset message with 40 FDHSI files (scheme already removed by a previous segment gatherer), together with 2) a file message with a single CLM file (scheme still present).

The scheme removal functionality appears to assume that the uri key is present at the top level of the message data:

url_parts = urlparse(message_data['uri'])

For a dataset message, this raises a KeyError, which is handled by:

except KeyError as err:
    logger.debug("No key %s in message.", str(err))
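
The failure mode can be illustrated without the gatherer itself. The following is a minimal sketch, assuming the message data behaves like a plain dict shaped like the payload above. The file names and the helper `top_level_path` are shortened, hypothetical stand-ins and not part of pytroll_collectors. A dataset message carries its URIs inside the `dataset` list, so a top-level lookup raises `KeyError`:

```python
from urllib.parse import urlparse

# Hypothetical, shortened stand-ins for the payloads; not real file names.
file_message_data = {"uri": "ssh://host/data/clm.nc", "uid": "clm.nc"}
dataset_message_data = {
    "dataset": [
        {"uri": "/data/seg_0001.nc", "uid": "seg_0001.nc"},
        {"uri": "/data/seg_0036.nc", "uid": "seg_0036.nc"},
    ]
}


def top_level_path(message_data):
    """Return the path of the top-level URI, mimicking the lookup quoted above."""
    return urlparse(message_data["uri"]).path


# A file message has a top-level "uri", so the lookup succeeds.
print(top_level_path(file_message_data))

# A dataset message has no top-level "uri", so the lookup raises KeyError.
try:
    top_level_path(dataset_message_data)
except KeyError as err:
    print("No key %s in message." % str(err))
```

The second case produces the same kind of "No key 'uri' in message." line seen in the debug log, after which the message is skipped.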

As a solution, the scheme removal functionality could have an internal try/except and leave the message unchanged if it doesn't find a URI, or it could descend into a dataset and remove the scheme from the URI of each component.
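
Both options could be combined in one helper. The following is only a sketch under stated assumptions: `strip_schemes` and `_local_path` are hypothetical names, the message data is assumed to be a plain dict, and this is not the code from the linked pull request. It strips the scheme from a top-level `uri` if present, from each `uri` inside `dataset` if present, and otherwise returns the data unchanged instead of raising:

```python
from urllib.parse import urlparse


def _local_path(uri):
    """Drop scheme and host from a URI, keeping only the path."""
    return urlparse(uri).path


def strip_schemes(message_data):
    """Return a copy of message_data with schemes removed from any URI found.

    Hypothetical helper, not part of pytroll_collectors. Handles a top-level
    "uri" (file message) and per-item "uri" entries in "dataset" (dataset
    message); data with neither is returned unchanged rather than raising.
    """
    result = dict(message_data)
    if "uri" in result:
        result["uri"] = _local_path(result["uri"])
    if "dataset" in result:
        result["dataset"] = [
            {**item, "uri": _local_path(item["uri"])} if "uri" in item else dict(item)
            for item in result["dataset"]
        ]
    return result


# Dataset message: each component URI is reduced to its local path.
print(strip_schemes({"dataset": [{"uri": "ssh://host/data/a.nc", "uid": "a.nc"}]}))
# File message: the top-level URI is reduced to its local path.
print(strip_schemes({"uri": "ssh://host/data/clm.nc", "uid": "clm.nc"}))
```

URIs that are already plain local paths pass through unchanged, so a gatherer mixing pre-stripped dataset messages with unstripped file messages would not need different settings for the two.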

gerritholl linked a pull request on Jan 10, 2025 that will close this issue.