update weaver 6.1.0 #489

Open
wants to merge 8 commits into
base: master
23 changes: 22 additions & 1 deletion CHANGES.md
@@ -15,7 +15,28 @@
[Unreleased](https://github.com/bird-house/birdhouse-deploy/tree/master) (latest)
------------------------------------------------------------------------------------------------------------------

[//]: # (list changes here, using '-' for each new entry, remove this when items are added)
## Changes

- Weaver: update `weaver` component default version to [6.1.1](https://github.com/crim-ca/weaver/tree/6.1.1).

### Relevant changes
* Add support of *OGC API - Processes - Part 3: Workflows and Chaining* with *Nested Process* ad-hoc workflow.
* Add support of *OGC API - Processes - Part 3: Workflows and Chaining* with *Remote Collection* (STAC and OGC).
* Add support of *OGC API - Processes - Part 4: Job Management* endpoints for job "pending" creation and execution.
* Add support of *OGC API - Processes - Part 4: Job Management* endpoints for job provenance as *W3C PROV* metadata.
* Multiple alignments and fixes related to the latest *OGC API - Processes - Part 1: Core* definitions regarding handling
of input parameters and headers when submitting jobs to obtain alternate result representations and behavior.
* Add HTML responses by default via web browsers or as requested by `Accept` headers or `f` query parameter.
* Add improved CWL schema validation with `Weaver`-specific definitions where applicable
(see https://github.com/crim-ca/weaver/tree/master/weaver/schemas/cwl).

- Weaver: modifications to `proxy` configurations for `weaver`

* Add `WEAVER_ALT_PREFIX` optional variable that auto-configures `WEAVER_ALT_PREFIX_PROXY_LOCATION`,
which allows setting an alternate endpoint to redirect requests to `weaver`.
It uses `/ogcapi` by default which is a very common expectation from servers supporting OGC standards.
* Use the `TWITCHER_VERIFY_PATH` approach to accelerate authorization checks when accessing `weaver` resources.
* Modify proxy pass definitions and URL prefixes to resolve correctly with HTML resources.

[2.7.1](https://github.com/bird-house/birdhouse-deploy/tree/2.7.1) (2024-12-20)
------------------------------------------------------------------------------------------------------------------
10 changes: 10 additions & 0 deletions birdhouse/components/README.rst
@@ -448,6 +448,16 @@ Customizing the Component
Further ``docker-compose-extra.yml`` could be needed to define
any other ``volumes`` entries where these components would need to be mounted to.

- Optionally, set ``WEAVER_ALT_PREFIX`` to any desired prefix location to use as an alternate alias
for the ``/weaver/`` endpoint. The ``/weaver/`` endpoint will remain available.
The ``WEAVER_ALT_PREFIX`` alias defines an *additional* equivalent location to access the service.
By default ``/ogcapi`` is employed as a common value for this suite of OGC standards.

Note that custom prefix values, if specified, should start with a leading ``/``, and leave out any trailing ``/``.
The prefix can also use multiple levels as desired (e.g.: ``/my/custom/path``).

If the original ``/weaver/`` endpoint is deemed sufficient, and you would rather omit this additional alias
entirely, the ``WEAVER_ALT_PREFIX`` variable should be explicitly set to an empty value.
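
The prefix rules described above can be illustrated with a small shell sketch. The ``check_weaver_alt_prefix`` helper is hypothetical (it is not part of birdhouse-deploy) and only restates the documented conventions: a leading ``/``, no trailing ``/``, any number of path levels, and an empty value to disable the alias.

```shell
# Hypothetical helper, for illustration only: classifies a candidate value
# for WEAVER_ALT_PREFIX according to the rules documented above.
check_weaver_alt_prefix() {
    prefix="$1"
    case "${prefix}" in
        "")  echo "disabled" ;;  # explicitly empty: the alias is omitted
        */)  echo "invalid" ;;   # a trailing '/' must be left out
        /*)  echo "valid" ;;     # leading '/', one or more path levels
        *)   echo "invalid" ;;   # the leading '/' is missing
    esac
}

# Values that could be placed in 'env.local' as:
#   export WEAVER_ALT_PREFIX=/my/custom/path
check_weaver_alt_prefix "/ogcapi"
check_weaver_alt_prefix "/my/custom/path"
check_weaver_alt_prefix "ogcapi/"
check_weaver_alt_prefix ""
```

Whether the deployment itself validates the value this way is not stated in this change, so a malformed prefix may simply produce an unexpected proxy location.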


.. _finch: https://github.com/bird-house/finch
@@ -109,6 +109,13 @@ permissions:
group: anonymous
action: create

# HTML rendering files
- service: ${WEAVER_MANAGER_NAME}
resource: /static
permission: read
group: anonymous
action: create

# Process deployment (write) and listing (read)
# use 'read-match' to allow only listing, and not describe underlying processes (require 'read' on them individually)
- service: ${WEAVER_MANAGER_NAME}
@@ -1,25 +1,51 @@

location = /weaver-auth {
internal;
# note: using 'TWITCHER_VERIFY_PATH' path to avoid performing the request via proxy 'TWITCHER_PROTECTED_PATH'
# This ensures that access is validated for the user, but does not trigger its access/download twice.
# It is also more efficient, since less contents are transferred/buffered.
proxy_pass ${BIRDHOUSE_PROXY_SCHEME}://${BIRDHOUSE_FQDN_PUBLIC}${TWITCHER_VERIFY_PATH}/$request_uri;
proxy_pass_request_body off;
proxy_set_header Host $host;
proxy_set_header Content-Length "";
proxy_set_header X-Original-URI $request_uri;
proxy_set_header X-Forwarded-Proto $real_scheme;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Host $host:$server_port;
}

location = /${WEAVER_MANAGER_NAME} {
return 301 /${WEAVER_MANAGER_NAME}/$is_args$args;
}
location ~ ^/${WEAVER_MANAGER_NAME}/(.*)$ {
auth_request /weaver-auth;
auth_request_set $auth_status $upstream_status;

# NOTE:
# Inject the 'WEAVER_MANAGER_NAME' prefix here to align with 'SCRIPT_NAME' in the docker-compose config.
# This is needed to help UI elements resolve the full URI path with proxy service prefixes. Otherwise, the
# generated locations that the client/browser must interpret/retrieve would not include the proxy
# redirection path prefix, leading to unresolved resources.
proxy_pass http://weaver:4001/${WEAVER_MANAGER_NAME}/$1$is_args$args;
proxy_set_header Host $http_host;
proxy_set_header X-Original-URI $request_uri;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $real_scheme;
proxy_set_header X-Forwarded-Host $http_host:$server_port;
proxy_buffering off;
}

# NOTE:
# Redirecting to the internal twitcher network for both the Weaver root endpoint and its alias allows
# the same 'magpie' permissions to be set on the 'weaver' service defined by "WEAVER_MANAGER_NAME".
# This allows verification of the same service user/group permission references regardless of
# whether the *shortcut* Weaver endpoint, the alias or the explicit 'twitcher' proxy route is used.
# redirect EMS/ADES to actual secured Weaver path
#location /${WEAVER_CONFIG} {
# return 302 ${BIRDHOUSE_PROXY_SCHEME}://${BIRDHOUSE_FQDN_PUBLIC}${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME};
#}

location /${WEAVER_MANAGER_NAME} {
proxy_pass ${BIRDHOUSE_PROXY_SCHEME}://${BIRDHOUSE_FQDN_PUBLIC}${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME};
proxy_set_header Host $host;
proxy_buffering off;
include /etc/nginx/conf.d/cors.include;
location = ${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME} {
return 301 /${WEAVER_MANAGER_NAME}/$is_args$args;
}
location ${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME}/ {
@mishaschwartz (Collaborator), Jan 6, 2025:

Suggested change
- location ${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME}/ {
+ location ~ ^${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME}/(.*)$ {

We need to match something in order to include the subpath in the redirect (see next comment)

return 301 /${WEAVER_MANAGER_NAME}/$is_args$args;
Collaborator:

Suggested change
- return 301 /${WEAVER_MANAGER_NAME}/$is_args$args;
+ return 308 /${WEAVER_MANAGER_NAME}/$1$is_args$args;

Actually include the matched subpath in the redirect.

Also, this should be 308 so that clients don't change a POST request to GET when performing the redirect.

(This was discovered when running the weaver/post-docker-compose-up script which still uses the twitcher proxy routes)

}

# NOTE:
# this is needed only if not using the location already provided by the core configuration
# see 'birdhouse/components/proxy/conf.d/all-services.include.template'
# location where process job outputs will be accessible
#location ^~ ${WEAVER_WPS_OUTPUTS_PATH}/ {
# alias ${WEAVER_WPS_OUTPUTS_DIR}/;
#}
# optional alternate endpoint to access weaver (see 'components/weaver/default.env')
${WEAVER_ALT_PREFIX_PROXY_LOCATION}
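
The effect of the reviewer's suggested regex location and `$1` capture can be sketched in plain shell. This is only a simulation of the path rewriting, not of nginx itself. The `redirect_target` function is hypothetical, and the values of `TWITCHER_PROTECTED_PATH` and `WEAVER_MANAGER_NAME` are assumptions chosen for illustration.

```shell
# Assumed values for illustration; the real ones come from the deployment env files.
TWITCHER_PROTECTED_PATH="/twitcher/ows/proxy"
WEAVER_MANAGER_NAME="weaver"

# Hypothetical simulation of:
#   location ~ ^${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME}/(.*)$ {
#       return 308 /${WEAVER_MANAGER_NAME}/$1$is_args$args;
#   }
# $1 = request path, $2 = query string (may be empty)
redirect_target() {
    path="$1"
    args="$2"
    base="${TWITCHER_PROTECTED_PATH}/${WEAVER_MANAGER_NAME}/"
    case "${path}" in
        "${base}"*) subpath="${path#"${base}"}" ;;  # plays the role of the '$1' capture
        *) return 1 ;;                              # location does not match
    esac
    is_args=""
    [ -n "${args}" ] && is_args="?"                 # '$is_args' is '?' only when args exist
    echo "/${WEAVER_MANAGER_NAME}/${subpath}${is_args}${args}"
}

redirect_target "/twitcher/ows/proxy/weaver/processes" ""
redirect_target "/twitcher/ows/proxy/weaver/jobs" "limit=10"
```

Without the capture, every subpath would collapse to `/weaver/`. The 308 status matters separately from the rewriting: it instructs clients to repeat the request with the same method and body, whereas many clients turn a `POST` into a `GET` when following a 301.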
26 changes: 24 additions & 2 deletions birdhouse/components/weaver/default.env
@@ -29,13 +29,15 @@ EXTRA_VARS='
$WEAVER_WPS_PROVIDERS_MAX_TIME
$WEAVER_WPS_PROVIDERS_RETRY_COUNT
$WEAVER_WPS_PROVIDERS_RETRY_AFTER
$WEAVER_ALT_PREFIX_PROXY_LOCATION
'
# extend the original 'VARS' from 'birdhouse/birdhouse-compose.sh' to employ them for template substitution
# adding them to 'VARS', they will also be validated in case of override of 'default.env' using 'env.local'
VARS="$VARS $EXTRA_VARS"

OPTIONAL_VARS="
$OPTIONAL_VARS
\$WEAVER_ALT_PREFIX
\$WEAVER_DOCKER
\$WEAVER_VERSION
\$WEAVER_WORKER_IMAGE
@@ -53,7 +55,7 @@ OPTIONAL_VARS="
export WEAVER_CONFIG=HYBRID

# default release version that will be used to fetch docker images (API manager & celery workers services)
export WEAVER_VERSION=5.6.1
export WEAVER_VERSION=6.1.1
export WEAVER_DOCKER=pavics/weaver
export WEAVER_IMAGE='${WEAVER_DOCKER}:${WEAVER_VERSION}'
export WEAVER_MANAGER_IMAGE='${WEAVER_IMAGE}-manager'
@@ -63,7 +65,8 @@ export WEAVER_IMAGE_URI='registry.hub.docker.com/${WEAVER_IMAGE}'
# default release of the MongoDB version employed by Weaver
# NOTE:
# MongoDB>=5.0 is REQUIRED for Weaver>=4.5.0
export WEAVER_MONGODB_VERSION=5.0.4
# MongoDB==7.x works, but default remains 5.0 to avoid DB migration issues (update manually as desired)
export WEAVER_MONGODB_VERSION=5.0
# URL is used by both Weaver API and Celery Worker
# it should contain the docker service name as host to map using shared link between images
# if credentials are desired, they can be defined with the override of the URL variable
@@ -96,6 +99,17 @@ export WEAVER_WPS_OUTPUTS_PATH="/wpsoutputs/weaver"
export WEAVER_WPS_OUTPUTS_DIR='${BIRDHOUSE_WPS_OUTPUTS_DIR}/weaver'
export WEAVER_WPS_WORKDIR="/tmp/wps_workdir/weaver"

# Optional alternate endpoint that will redirect to Weaver.
# If explicitly set to empty value, it will not be configured in the proxy.
export WEAVER_ALT_PREFIX=/ogcapi
export WEAVER_ALT_PREFIX_PROXY_LOCATION='
$([ -z "${WEAVER_ALT_PREFIX}" ] && echo "" || echo "
location ~ ^${WEAVER_ALT_PREFIX}(.*)\$ {
return 301 /${WEAVER_MANAGER_NAME}\$1\$is_args\$args;
}
")
'
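
The conditional above can be tried out in isolation with a short shell sketch. The `render_alt_prefix_location` function is a hypothetical stand-in for the delayed evaluation of `WEAVER_ALT_PREFIX_PROXY_LOCATION`, and `WEAVER_MANAGER_NAME=weaver` is an assumed value; the actual substitution is performed by the birdhouse template machinery, which is not reproduced here.

```shell
WEAVER_MANAGER_NAME="weaver"  # assumed value for illustration

# Hypothetical stand-in for the delayed evaluation shown above.
# $1 is the value of WEAVER_ALT_PREFIX.
render_alt_prefix_location() {
    prefix="$1"
    if [ -z "${prefix}" ]; then
        # empty prefix: nothing is added to the proxy configuration
        printf '%s\n' ""
    else
        # '\$' keeps nginx variables literal while shell variables are expanded
        printf '%s\n' "location ~ ^${prefix}(.*)\$ {
    return 301 /${WEAVER_MANAGER_NAME}\$1\$is_args\$args;
}"
    fi
}

render_alt_prefix_location "/ogcapi"
render_alt_prefix_location ""
```

With the default `/ogcapi` prefix, the rendered block redirects a request such as `/ogcapi/processes` to `/weaver/processes`, since the capture group keeps everything after the prefix, including its leading `/`.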

# logging
export WEAVER_MANAGER_LOG_LEVEL=INFO
export WEAVER_WORKER_LOG_LEVEL=INFO
@@ -116,6 +130,7 @@ export WEAVER_UNREGISTER_DROPPED_PROVIDERS="False"

export DELAYED_EVAL="
$DELAYED_EVAL
WEAVER_ALT_PREFIX_PROXY_LOCATION
WEAVER_WPS_OUTPUTS_DIR
WEAVER_MONGODB_DATA_DIR
WEAVER_MONGODB_URL
@@ -124,3 +139,10 @@ export DELAYED_EVAL="
WEAVER_MANAGER_IMAGE
WEAVER_WORKER_IMAGE
"

COMPONENT_DEPENDENCIES="
$COMPONENT_DEPENDENCIES
./components/wps_outputs-volume
./components/magpie
./components/twitcher
"
4 changes: 4 additions & 0 deletions birdhouse/components/weaver/docker-compose-extra.yml
@@ -19,6 +19,10 @@ services:
image: ${WEAVER_MANAGER_IMAGE}
environment:
HOSTNAME: ${BIRDHOUSE_FQDN}
# 'HTTP_HOST' and 'SCRIPT_NAME' are used to guide pyramid in the resolution of resources, such as
# when invoking the 'static_url' endpoint, so it can be made aware of reverse-proxy context
HTTP_HOST: ${BIRDHOUSE_FQDN}
SCRIPT_NAME: /${WEAVER_MANAGER_NAME}
FORWARDED_ALLOW_IPS: "*"
#env_file:
# - ./components/mongodb/credentials.env