From 59a00c7e364993d3885f5c2546bfd54cd896401d Mon Sep 17 00:00:00 2001
From: Helena Rasche
Date: Mon, 4 Nov 2024 11:43:47 +0100
Subject: [PATCH 1/7] Add big caveat

---
 faqs/galaxy/collections_change_datatype.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/faqs/galaxy/collections_change_datatype.md b/faqs/galaxy/collections_change_datatype.md
index 7ca0d4b4a36889..253bcb90ca6a3b 100644
--- a/faqs/galaxy/collections_change_datatype.md
+++ b/faqs/galaxy/collections_change_datatype.md
@@ -4,7 +4,7 @@ description: This will set the datatype for all files in your collection. Does n
 area: collections
 box_type: tip
 layout: faq
-contributors: [shiltemann]
+contributors: [shiltemann, hexylena]
 ---

 1. Click on **Edit** {% icon galaxy-pencil %} next to the collection name in your history
@@ -13,3 +13,10 @@ contributors: [shiltemann]
    - tip: you can start typing the datatype into the field to filter the dropdown menu
 4. Click the **Save** button

+
+**Cannot find the feature?**
+
+If you are on a smaller Galaxy server, i.e. not one of the large (multi)national public servers, you may not be able to find this operation, and there is no indication it is missing or why it is disabled.
+
+Galaxy has recently started putting more features behind a setting and deployment configuration that needs to be enabled by the server administrator.
+Your administrator will need to deploy Celery and potentially additionally flower and redis to their stack to enable changing the datatype of a collection.
From f5f2aca1e2765ff05f2612e1128493b4c590c3f6 Mon Sep 17 00:00:00 2001
From: Helena Rasche
Date: Mon, 4 Nov 2024 12:06:29 +0100
Subject: [PATCH 2/7] link to docs

---
 faqs/galaxy/collections_change_datatype.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/faqs/galaxy/collections_change_datatype.md b/faqs/galaxy/collections_change_datatype.md
index 253bcb90ca6a3b..8cc7ad7d8df0ce 100644
--- a/faqs/galaxy/collections_change_datatype.md
+++ b/faqs/galaxy/collections_change_datatype.md
@@ -18,5 +18,5 @@ contributors: [shiltemann, hexylena]

 If you are on a smaller Galaxy server, i.e. not one of the large (multi)national public servers, you may not be able to find this operation, and there is no indication it is missing or why it is disabled.

-Galaxy has recently started putting more features behind a setting and deployment configuration that needs to be enabled by the server administrator.
-Your administrator will need to deploy Celery and potentially additionally flower and redis to their stack to enable changing the datatype of a collection.
+Galaxy has recently started putting [more features behind a setting and deployment configuration](https://docs.galaxyproject.org/en/master/admin/production.html#use-celery-for-asynchronous-tasks) that needs to be enabled by the server administrator.
+Your administrator will need to deploy Celery and potentially additionally flower and redis to their stack to enable changing the datatype of a collection. Consider sending them also the link to the [GTN tutorial for setting up redis and flower]({% link topics/admin/tutorials/celery/tutorial.md %}).
From 3e48ed6eeebeaab839d757f94b2ccb437b224804 Mon Sep 17 00:00:00 2001
From: Helena Rasche
Date: Mon, 4 Nov 2024 12:33:01 +0100
Subject: [PATCH 3/7] Add simler celery deployment option

---
 topics/admin/tutorials/celeryless/tutorial.md | 67 +++++++++++++++++++
 1 file changed, 67 insertions(+)
 create mode 100644 topics/admin/tutorials/celeryless/tutorial.md

diff --git a/topics/admin/tutorials/celeryless/tutorial.md b/topics/admin/tutorials/celeryless/tutorial.md
new file mode 100644
index 00000000000000..670fd5d7e58118
--- /dev/null
+++ b/topics/admin/tutorials/celeryless/tutorial.md
@@ -0,0 +1,67 @@
+---
+layout: tutorial_hands_on
+
+title: "Alternative Celery Deployment for Galaxy"
+zenodo_link: ""
+questions:
+objectives:
+  - Setup the bare minimum configuration to get tasks working
+time_estimation: "1h"
+key_points:
+contributions:
+  authorship:
+    - hexylena
+requirements:
+  - type: "internal"
+    topic_name: admin
+    tutorials:
+      - ansible
+      - ansible-galaxy
+      - pulsar
+subtopic: data
+tags:
+  - ansible
+---
+
+Celery is a new component to the Galaxy world (ca 2023) and is a distributed task queue that *can* be used to run tasks asynchronously. It isn't mandatory, but you might find some features you expect to use to be missing without it.
+
+If you are running a large production deployment you probably want to follow the [Celery+Redis+Flower Tutorial]({% link topics/admin/tutorials/celery/tutorial.md %}).
+
+However, if you are running a smaller Galaxy you may not want to manage deploying Celery (past what Gravity does for you automatically), you may not want to add Redis to your stack, and you may not have need of Flower!
+
+>
+>
+> 1. TOC
+> {:toc}
+>
+{: .agenda}
+
+# Configuring Galaxy to use Postgres
+
+AMQP is a message queue protocol which processes can use to pass messages between each other.
While a real message queue like RabbitMQ is perhaps the most robust choice, there is an easier option: Postgres
+
+Add the following to your Galaxy configuration to use Postgres:
+
+```bash
+amqp_internal_connection: "sqlalchemy+postgresql:///galaxy?host=/var/run/postgresql"
+```
+
+# Configuring Celery to use Postgres
+
+Celery would prefer you use Redis (a Key-Value store) as a backend to store results. But we have a database! So let's try using that instead:
+
+```
+enable_celery_tasks: true
+celery_conf:
+  broker_url: null # This should default to using amqp_internal_connection
+  result_backend: "db+postgresql:///galaxy?host=/var/run/postgresql"
+  task_routes:
+    galaxy.fetch_data: galaxy.external
+    galaxy.set_job_metadata: galaxy.external
+```
+
+With that we should now be able to [use useful features like](https://docs.galaxyproject.org/en/master/admin/production.html#use-celery-for-asynchronous-tasks):
+
+- Changing the datatype of a collection.
+- Exporting histories
+- other things!
From 8a04e60df4fbd8b3c20096d1fc78aafd6811c79e Mon Sep 17 00:00:00 2001
From: Helena Rasche
Date: Mon, 4 Nov 2024 12:33:49 +0100
Subject: [PATCH 4/7] add alternative

---
 topics/admin/tutorials/celery/tutorial.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/topics/admin/tutorials/celery/tutorial.md b/topics/admin/tutorials/celery/tutorial.md
index 8aa280f9517917..96783d3c8d37fc 100644
--- a/topics/admin/tutorials/celery/tutorial.md
+++ b/topics/admin/tutorials/celery/tutorial.md
@@ -31,9 +31,6 @@ tags:
   - git-gat
 ---

-# Overview
-
-
 Celery is a distributed task queue written in Python that can spawn multiple workers and enables asynchronous task processing on multiple nodes. It supports scheduling, but focuses more on real-time operations.

 From the Celery website:
@@ -52,7 +49,9 @@ From the Celery website:
 >
 {: .quote cite="https://docs.celeryq.dev/en/stable/getting-started/introduction.html#what-s-a-task-queue"}

-[A slideshow presentation on this subject is available](slides.html).
+[A slideshow presentation on this subject is available](slides.html).
+
+If you are not interested in managing Redis and Flower, you might be interested in the [lower-configuration deployment option]({% link topics/admin/tutorials/celeryless/tutorial.md %}).

 >
 >
From ddc8d76a60d777bfd4c71d55c0e26f184f50ea8d Mon Sep 17 00:00:00 2001
From: Helena Rasche
Date: Mon, 4 Nov 2024 12:34:34 +0100
Subject: [PATCH 5/7] note

---
 faqs/galaxy/collections_change_datatype.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/faqs/galaxy/collections_change_datatype.md b/faqs/galaxy/collections_change_datatype.md
index 8cc7ad7d8df0ce..770f05ac861018 100644
--- a/faqs/galaxy/collections_change_datatype.md
+++ b/faqs/galaxy/collections_change_datatype.md
@@ -19,4 +19,4 @@ contributors: [shiltemann, hexylena]
 If you are on a smaller Galaxy server, i.e. not one of the large (multi)national public servers, you may not be able to find this operation, and there is no indication it is missing or why it is disabled.

 Galaxy has recently started putting [more features behind a setting and deployment configuration](https://docs.galaxyproject.org/en/master/admin/production.html#use-celery-for-asynchronous-tasks) that needs to be enabled by the server administrator.
-Your administrator will need to deploy Celery and potentially additionally flower and redis to their stack to enable changing the datatype of a collection. Consider sending them also the link to the [GTN tutorial for setting up redis and flower]({% link topics/admin/tutorials/celery/tutorial.md %}).
+Your administrator will need to deploy Celery and potentially additionally flower and redis to their stack to enable changing the datatype of a collection.
Consider sending your Galaxy administrator the link to the [simpler deployment option]({% link topics/admin/tutorials/celeryless/tutorial.md %}) or the more complex [GTN tutorial for setting up redis and flower]({% link topics/admin/tutorials/celery/tutorial.md %}).
From 55599f3cb52135d6c7ddef3921f434e481be5379 Mon Sep 17 00:00:00 2001
From: Helena Rasche
Date: Mon, 4 Nov 2024 12:36:45 +0100
Subject: [PATCH 6/7] QKPLOs

---
 topics/admin/tutorials/celeryless/tutorial.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/topics/admin/tutorials/celeryless/tutorial.md b/topics/admin/tutorials/celeryless/tutorial.md
index 670fd5d7e58118..68637121acea81 100644
--- a/topics/admin/tutorials/celeryless/tutorial.md
+++ b/topics/admin/tutorials/celeryless/tutorial.md
@@ -4,10 +4,14 @@ layout: tutorial_hands_on
 title: "Alternative Celery Deployment for Galaxy"
 zenodo_link: ""
 questions:
+  - What is *required* for Celery to work in Galaxy?
 objectives:
   - Setup the bare minimum configuration to get tasks working
+  - Avoid deploying, securing, and managing RabbitMQ, Redis, and Flower
 time_estimation: "1h"
 key_points:
+  - While a combination of RabbitMQ and Redis is perhaps the most production-ready, you can use Postgres as a backend for Celery
+  - This significantly reduces operational complexity and the attack surface of your Galaxy.
 contributions:
   authorship:
     - hexylena
@@ -17,7 +21,6 @@ requirements:
     tutorials:
       - ansible
       - ansible-galaxy
-      - pulsar
 subtopic: data
 tags:
   - ansible
From 65f703a5a5a99b9b5c056b7551bed44b77ff09b2 Mon Sep 17 00:00:00 2001
From: Helena Rasche
Date: Mon, 4 Nov 2024 12:39:18 +0100
Subject: [PATCH 7/7] ansible vars

---
 topics/admin/tutorials/celeryless/tutorial.md | 20 +++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/topics/admin/tutorials/celeryless/tutorial.md b/topics/admin/tutorials/celeryless/tutorial.md
index 68637121acea81..291224dadcaa17 100644
--- a/topics/admin/tutorials/celeryless/tutorial.md
+++ b/topics/admin/tutorials/celeryless/tutorial.md
@@ -45,7 +45,7 @@ AMQP is a message queue protocol which processes can use to pass messages betwee

 Add the following to your Galaxy configuration to use Postgres:

-```bash
+```yaml
 amqp_internal_connection: "sqlalchemy+postgresql:///galaxy?host=/var/run/postgresql"
 ```

@@ -53,7 +53,7 @@ amqp_internal_connection: "sqlalchemy+postgresql:///galaxy?host=/var/run/postgre

 Celery would prefer you use Redis (a Key-Value store) as a backend to store results. But we have a database! So let's try using that instead:

-```
+```yaml
 enable_celery_tasks: true
 celery_conf:
   broker_url: null # This should default to using amqp_internal_connection
@@ -63,8 +63,24 @@ celery_conf:
     galaxy.set_job_metadata: galaxy.external
 ```

+
 With that we should now be able to [use useful features like](https://docs.galaxyproject.org/en/master/admin/production.html#use-celery-for-asynchronous-tasks):

 - Changing the datatype of a collection.
 - Exporting histories
 - other things!
+
+# Configuring with Ansible
+
+If you're using Ansible, this could also look like:
+
+```yaml
+amqp_internal_connection: "sqlalchemy+{{ database_connection }}"
+enable_celery_tasks: true
+celery_conf:
+  broker_url: null # This should default to using amqp_internal_connection
+  result_backend: "db+{{ database_connection }}"
+  task_routes:
+    galaxy.fetch_data: galaxy.external
+    galaxy.set_job_metadata: galaxy.external
+```
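
Note on the Ansible variant in PATCH 7/7: the snippet relies on a `database_connection` variable that this series does not define. A minimal sketch of how it might sit in a playbook's group variables is given below. The top-level `database_connection` variable and its value are assumptions, chosen so that the templates expand to the literal socket-based URLs used in the non-Ansible examples. Nesting the settings under `galaxy_config` / `galaxy` is also an assumption about how the role is configured; adjust both to your deployment.

```yaml
# Hypothetical group_vars sketch; the variable name and its value are assumptions.
database_connection: "postgresql:///galaxy?host=/var/run/postgresql"

galaxy_config:
  galaxy:
    database_connection: "{{ database_connection }}"
    # Expands to sqlalchemy+postgresql:///galaxy?host=/var/run/postgresql
    amqp_internal_connection: "sqlalchemy+{{ database_connection }}"
    enable_celery_tasks: true
    celery_conf:
      broker_url: null # left unset so Galaxy falls back to amqp_internal_connection
      # Expands to db+postgresql:///galaxy?host=/var/run/postgresql
      result_backend: "db+{{ database_connection }}"
      task_routes:
        galaxy.fetch_data: galaxy.external
        galaxy.set_job_metadata: galaxy.external
```

Defining the connection string once keeps the broker and result backend pointed at the same database as Galaxy itself, which is the point of this lower-configuration setup.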