STAC hooks #386

Merged
merged 22 commits into from
Nov 27, 2023
Changes from 21 commits
Commits
64a8fb7
Adding stac hooks for creating collection resource
Nazim-crim Sep 26, 2023
ab9c678
Adding create item rhook to create magpie resource when stac item is …
Nazim-crim Sep 27, 2023
85cd4ab
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
Nazim-crim Sep 27, 2023
5e91536
Adding exception handling and using collection id from request when c…
Nazim-crim Sep 28, 2023
4d8dc44
Adding method to extract thredd path from STAC links and removing unu…
Nazim-crim Sep 28, 2023
19813f0
Changing single quote by double quote and fixing spacing
Nazim-crim Sep 29, 2023
a826050
fixing regex and comments
Nazim-crim Oct 2, 2023
f298434
Adding recursive function to create resource tree, adding rollback of…
Nazim-crim Oct 2, 2023
b054e25
Adding error handling for extract_display_name and fixing other excep…
Nazim-crim Oct 2, 2023
afc39e0
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
Nazim-crim Oct 2, 2023
b63a9d7
Fixing comments
Nazim-crim Oct 2, 2023
b70c0c1
Adding missing quote and cleaning up code
Nazim-crim Oct 3, 2023
de8597b
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
Nazim-crim Oct 25, 2023
6891ca7
Modify create resource to use query filter for direct access to resource
Nazim-crim Oct 27, 2023
c4d6b6a
Fixing regex for collection_id
Nazim-crim Oct 30, 2023
f2bf000
Using full title to have the service name inside display_name
Nazim-crim Oct 30, 2023
5c18430
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
Nazim-crim Nov 1, 2023
b3be1c5
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
Nazim-crim Nov 9, 2023
f2efe27
Fixing import, removing redundant returns and moving get session outs…
Nazim-crim Nov 13, 2023
9b760c1
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
Nazim-crim Nov 13, 2023
3d883ef
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
Nazim-crim Nov 27, 2023
cb740de
Bump version: 1.38.0 → 1.39.0
Nazim-crim Nov 27, 2023
10 changes: 9 additions & 1 deletion CHANGES.md
@@ -15,7 +15,15 @@
[Unreleased](https://github.com/bird-house/birdhouse-deploy/tree/master) (latest)
------------------------------------------------------------------------------------------------------------------

[//]: # (list changes here, using '-' for each new entry, remove this when items are added)
## Changes

- Add a Magpie Webhook to create the Magpie resources corresponding to the STAC-API path elements when a `STAC-API`
  `POST /collections/{collection_id}` or `POST /collections/{collection_id}/items/{item_id}` request completes.
  - When creating the STAC `Item`, the `source` entry in `links` corresponding to a `THREDDS` file on the same instance
    is used to define the Magpie `resource_display_name` corresponding to a file to be mapped later on
    (e.g., a NetCDF `birdhouse/test-data/tc_Anon[...].nc`).
  - Checking that the `source` path is on the same instance is necessary because `STAC` could refer to external assets,
    and we do not want to inject Magpie resources that are not part of the active instance where the hook is running.
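
The same-instance requirement described above could be checked along these lines. This is only a sketch: `is_same_instance`, the `example.org` host names, and the link URLs are hypothetical and not part of this PR, whose hook reads the `title` of the `source` link rather than comparing hosts.

```python
from urllib.parse import urlparse


def is_same_instance(href, instance_host):
    """Return True when the link's host equals the instance host (hypothetical helper)."""
    # urlparse(...).hostname is already lower-cased, so lower-case the other side too.
    return urlparse(href).hostname == instance_host.lower()


# Hypothetical STAC links: one on the running instance, one external.
local_link = {"rel": "source", "href": "https://example.org/thredds/birdhouse/data.nc"}
external_link = {"rel": "source", "href": "https://other.example.com/data.nc"}

local_ok = is_same_instance(local_link["href"], "example.org")
external_ok = is_same_instance(external_link["href"], "example.org")
```

With a guard like this, only `local_link` would be considered for a Magpie resource; `external_link` would be skipped.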

[1.38.0](https://github.com/bird-house/birdhouse-deploy/tree/1.38.0) (2023-11-21)
------------------------------------------------------------------------------------------------------------------
9 changes: 9 additions & 0 deletions birdhouse/components/stac/config/magpie/config.yml.template
@@ -7,6 +7,15 @@ providers:
    c4i: false
    type: api
    sync_type: api
    hooks:
      - type: response
        path: "/stac/collections/?"
        method: POST
        target: /opt/birdhouse/src/magpie/hooks/stac_hooks.py:create_collection_resource
      - type: response
        path: "/stac/collections/[\\w-]+/items/?"
        method: POST
        target: /opt/birdhouse/src/magpie/hooks/stac_hooks.py:create_item_resource

permissions:
  # create a default 'stac' resource under 'stac' service
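
To see which request paths the two hook patterns above accept, the regular expressions can be tried in isolation. The sketch below assumes the pattern must match the whole service-relative path (`re.fullmatch`); how Magpie actually applies the pattern is not shown in this diff, and `hook_matches` is a hypothetical helper. Note that `[\w-]+` rejects a dot in the collection id, whereas the id regex in `stac_hooks.py` (`[0-9a-zA-Z_.-]+?`) allows one.

```python
import re

# Patterns as they appear after YAML unescaping ("\\w" in the file becomes "\w").
COLLECTION_HOOK = r"/stac/collections/?"
ITEM_HOOK = r"/stac/collections/[\w-]+/items/?"


def hook_matches(pattern, path):
    """Assumption: a hook applies only when the full path matches its pattern."""
    return re.fullmatch(pattern, path) is not None


results = {
    "collection": hook_matches(COLLECTION_HOOK, "/stac/collections"),
    "collection_slash": hook_matches(COLLECTION_HOOK, "/stac/collections/"),
    "item": hook_matches(ITEM_HOOK, "/stac/collections/CMIP6_UofT/items"),
    "item_dotted_id": hook_matches(ITEM_HOOK, "/stac/collections/cmip6.v1/items"),
}
```

Under that assumption, the first three paths match and the dotted collection id does not.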
136 changes: 136 additions & 0 deletions birdhouse/components/stac/config/magpie/stac_hooks.py
@@ -0,0 +1,136 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
These hooks will be running within Twitcher, using MagpieAdapter context, applied for STAC requests.

The code below can make use of any package that is installed by Magpie/Twitcher.

.. seealso::
    Documentation about Magpie/Twitcher request/response hooks is available here:
    https://pavics-magpie.readthedocs.io/en/latest/configuration.html#service-hooks
"""

import re
from typing import TYPE_CHECKING, List, Dict

from magpie.api.management.resource import resource_utils as ru
from magpie.api.requests import get_service_matchdict_checked
from magpie.models import Route
from magpie.utils import get_logger
from magpie.db import get_session_from_other
from ziggurat_foundations.models.services.resource import ResourceService

if TYPE_CHECKING:
    from pyramid.response import Response
    from sqlalchemy.orm.session import Session

LOGGER = get_logger("magpie.stac")

def create_collection_resource(response):
    # type: (Response) -> Response
    """
    Create the STAC collection resource.
    """
    request = response.request
    body = request.json
    collection_id = body["id"]
    try:
        display_name = extract_display_name(body["links"])
    except Exception as exc:
        LOGGER.error("Error when extracting display_name from links %s %s", body["links"], str(exc), exc_info=exc)
        return response

    # note: matchdict reference of Twitcher owsproxy view is used, just so happens to be same name as Magpie
    service = get_service_matchdict_checked(request)
    # Get a new session from the request, since the session found in the request is already handled by its own transaction manager.
    session = get_session_from_other(request.db)
    try:
        # Create the resource tree
        create_resource_tree(f"stac/collections/{collection_id}", 0, service.resource_id, session, display_name)
        session.commit()

    except Exception as exc:
        LOGGER.error("Unexpected error while creating the collection %s %s", display_name, str(exc), exc_info=exc)
        session.rollback()

    return response
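
The hook above commits only when the whole resource tree was created and rolls back otherwise, without ever raising to the caller. The same all-or-nothing pattern can be tried with the standard-library `sqlite3` module, used here as a stand-in for the SQLAlchemy session; the `resources` table and `create_all` are hypothetical and exist only for this sketch.

```python
import logging
import sqlite3

LOGGER = logging.getLogger("sketch")


def create_all(conn, names):
    """Insert every name or none of them; errors are logged, never raised."""
    try:
        for name in names:
            if not name:
                raise ValueError("empty resource name")
            conn.execute("INSERT INTO resources (name) VALUES (?)", (name,))
        conn.commit()
    except Exception as exc:
        LOGGER.error("Unexpected error while creating resources: %s", exc)
        # Undo the inserts made before the failure.
        conn.rollback()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (name TEXT)")
conn.commit()

create_all(conn, ["stac", "", "collections"])  # fails on the empty name
count_after_failure = conn.execute("SELECT COUNT(*) FROM resources").fetchone()[0]

create_all(conn, ["stac", "collections"])  # succeeds and is committed
count_after_success = conn.execute("SELECT COUNT(*) FROM resources").fetchone()[0]
```

The first call leaves the table empty even though one row was inserted before the error; the second call persists both rows.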

def create_item_resource(response):
    # type: (Response) -> Response
    """
    Create the STAC item resource.
    """
    request = response.request
    body = request.json
    item_id = body["id"]
    try:
        display_name = extract_display_name(body["links"])
    except Exception as exc:
        LOGGER.error("Error when extracting display_name from links %s %s", body["links"], str(exc), exc_info=exc)
        return response

    # Get the <collection_id> from url -> /collections/{collection_id}/items
    collection_id = re.search(r'(?<=collections/)[0-9a-zA-Z_.-]+?(?=/items)', request.url).group()

    # note: matchdict reference of Twitcher owsproxy view is used, just so happens to be same name as Magpie
    service = get_service_matchdict_checked(request)
    # Get a new session from the request, since the session found in the request is already handled by its own transaction manager.
    session = get_session_from_other(request.db)
    try:
        # Create the resource tree
        create_resource_tree(f"stac/collections/{collection_id}/items/{item_id}", 0, service.resource_id, session, display_name)
        session.commit()

    except Exception as exc:
        LOGGER.error("Unexpected error while creating the item %s %s", display_name, str(exc), exc_info=exc)
        session.rollback()

    return response
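
The collection id lookup in `create_item_resource` can be exercised on its own. The sketch below reuses the regular expression from the hook; the `extract_collection_id` wrapper, its `None` return for a non-matching URL, and the `example.org` URLs are assumptions added for illustration (the hook itself calls `.group()` directly on the search result).

```python
import re

# Same pattern as in the hook: the id sits between "collections/" and "/items".
COLLECTION_ID_PATTERN = re.compile(r"(?<=collections/)[0-9a-zA-Z_.-]+?(?=/items)")


def extract_collection_id(url):
    """Return the collection id found in the URL, or None when there is none."""
    match = COLLECTION_ID_PATTERN.search(url)
    return match.group() if match else None


plain_id = extract_collection_id("https://example.org/stac/collections/CMIP6_UofT/items")
dotted_id = extract_collection_id("https://example.org/stac/collections/cmip6.v1/items")
missing_id = extract_collection_id("https://example.org/stac/collections")
```

Returning `None` instead of calling `.group()` unconditionally avoids an `AttributeError` when the URL does not have the expected shape.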

def extract_display_name(links):
    # type: (List[Dict[str, str]]) -> str
    """
    Extract the THREDDS path from the STAC links.
    """
    display_name = None
    for link in links:
        if link["rel"] == "source":
            # Example of title `thredds:birdhouse/CMIP6`
            display_name = link["title"]
            break
    if not display_name:
        raise ValueError("The display name was not extracted properly")

    return display_name
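
For reference, this is how the lookup behaves on a small `links` list. The function body is a standalone copy of the one above; the `href` values are hypothetical, and the title reuses the example from the code comment.

```python
def extract_display_name(links):
    """Return the title of the first link whose rel is "source"."""
    display_name = None
    for link in links:
        if link["rel"] == "source":
            display_name = link["title"]
            break
    if not display_name:
        raise ValueError("The display name was not extracted properly")
    return display_name


# Hypothetical STAC links; only the "source" entry carries the THREDDS title.
links = [
    {"rel": "self", "href": "https://example.org/stac/collections/c1"},
    {"rel": "source", "href": "https://example.org/thredds/birdhouse/CMIP6", "title": "thredds:birdhouse/CMIP6"},
]
name = extract_display_name(links)

try:
    extract_display_name([{"rel": "self", "href": "https://example.org/"}])
    missing_source_raises = False
except ValueError:
    missing_source_raises = True
```

A request without a `source` link raises `ValueError`, which the two hooks catch and log before returning the response unchanged.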

def create_resource_tree(resource_tree, current_depth, parent_id, session, display_name):
    # type: (str, int, int, Session, str) -> None
    """
    Create the resource tree on Magpie.
    """
    tree = resource_tree.split("/")
    # We are at the max depth of the tree.
    if current_depth > len(tree) - 1:
        return

    resource_name = tree[current_depth]
    query = session.query(ResourceService.model).filter(ResourceService.model.resource_name == resource_name, ResourceService.model.parent_id == parent_id)
    resource = query.first()

    if resource is not None:
        # Since the resource exists, we can use its id to create the next resource.
        parent_id = resource.resource_id
        next_depth = current_depth + 1
        create_resource_tree(resource_tree, next_depth, parent_id, session, display_name)

    # The resource wasn't found in the current depth, we need to create it.
    else:
        # Creating the last resource in the tree, we need to use the display_name.
        if current_depth == len(tree) - 1:
            ru.create_resource(resource_name, display_name, Route.resource_type_name, parent_id, db_session=session)
        else:
            # Creating the resource somewhere in the middle of the tree before using its id.
            node = ru.create_resource(resource_name, None, Route.resource_type_name, parent_id, db_session=session)
            parent_id = node.json["resource"]["resource_id"]
            next_depth = current_depth + 1
            create_resource_tree(resource_tree, next_depth, parent_id, session, display_name)
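
The recursion above can be followed without a database. The sketch below mirrors its logic against a hypothetical in-memory store (`InMemoryResources`) instead of the Magpie session and `ru.create_resource`; the root parent id `0` and the second display name are made up for the example. The control flow is slightly condensed, but the outcome is the same: existing nodes are reused, intermediate nodes get no display name, and only a newly created leaf receives it.

```python
import itertools


class InMemoryResources:
    """Hypothetical stand-in for the Magpie resource table."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.rows = {}  # (parent_id, resource_name) -> row dict

    def find(self, name, parent_id):
        return self.rows.get((parent_id, name))

    def create(self, name, display_name, parent_id):
        row = {"resource_id": next(self._ids), "display_name": display_name}
        self.rows[(parent_id, name)] = row
        return row


def create_resource_tree(path, depth, parent_id, store, display_name):
    parts = path.split("/")
    if depth > len(parts) - 1:
        return  # walked past the last path element
    name = parts[depth]
    row = store.find(name, parent_id)
    if row is None:
        # Only the last element of the path carries the display name.
        is_leaf = depth == len(parts) - 1
        row = store.create(name, display_name if is_leaf else None, parent_id)
    create_resource_tree(path, depth + 1, row["resource_id"], store, display_name)


store = InMemoryResources()
create_resource_tree("stac/collections/c1", 0, 0, store, "thredds:birdhouse/CMIP6")
create_resource_tree("stac/collections/c1/items/i1", 0, 0, store, "thredds:birdhouse/CMIP6/file.nc")
```

After both calls the store holds five nodes: `stac`, `collections` and `c1` are created by the first call and reused by the second, which only adds `items` and `i1`. As in the hook, an existing leaf keeps its original display name.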
@@ -0,0 +1,10 @@
version: "3.4"

services:
  # extend twitcher with MagpieAdapter hooks employed for STAC proxied requests
  twitcher:
    volumes:
      # NOTE: MagpieAdapter hooks are defined within Magpie config, but it is actually Twitcher proxy that runs them
      # target mount location depends on 'MAGPIE_PROVIDERS_CONFIG_PATH' environment variable that is found under `birdhouse/config/twitcher/docker-compose-extra.yml`
      - ./components/stac/config/magpie/config.yml:/opt/birdhouse/src/magpie/config/stac-config.yml:ro
      - ./components/stac/config/magpie/stac_hooks.py:/opt/birdhouse/src/magpie/hooks/stac_hooks.py:ro
@@ -5,6 +5,6 @@ services:
  twitcher:
    volumes:
      # NOTE: MagpieAdapter hooks are defined within Magpie config, but it is actually Twitcher proxy that runs them
      # target mount location depends on main docker-compose 'MAGPIE_PROVIDERS_CONFIG_PATH' environment variable
      # target mount location depends on 'MAGPIE_PROVIDERS_CONFIG_PATH' environment variable that is found under `birdhouse/config/twitcher/docker-compose-extra.yml`
      - ./components/weaver/config/magpie/config.yml:/opt/birdhouse/src/magpie/config/weaver-config.yml:ro
      - ./components/weaver/config/magpie/weaver_hooks.py:/opt/birdhouse/src/magpie/hooks/weaver_hooks.py:ro