diff --git a/content/news/2024-11-04-biohackathon/index.md b/content/news/2024-11-04-biohackathon/index.md
index bf570a82a2..31ddc327ab 100644
--- a/content/news/2024-11-04-biohackathon/index.md
+++ b/content/news/2024-11-04-biohackathon/index.md
@@ -50,7 +50,7 @@ Galaxy already tracks the provenance of every tool execution, and we used this i
 Additionally, we explored ways to enable shared caching among users who opt in. This will be especially useful in collaborative settings, such as training sessions, where large-scale analyses can be performed without unnecessary computational overhead.
 
 ### Scheduling with Sustainability in Mind
-Galaxy’s Total Perspective Vortex (TPV) plugin offers powerful scheduling capabilities by routing jobs to destinations based on custom rules. At the Biohackathon, we expanded TPV's functionality to prioritize sustainability. We gathered detailed job statistics from Galaxy's database and compute nodes, but also from the remote job execution endpoints called [Pulsar](https://github.com/galaxyproject/pulsar). Because Pulsar endpoints don't require any port to be open to the public, we needed to find a way where the Pulsar could send this information actively. This was achived by a script on the Pulsar endpoint that consumes the queue's status and other information and sends it to a message queue ([RabbitMQ](https://www.rabbitmq.com/)) from which another script picks it up and sends it to our InfluxDB server. We developed ranking algorithms that direct jobs to the most environmentally friendly destinations. These algorithms take into account factors such as energy efficiency and resource usage to ensure that Galaxy workflows have a smaller carbon footprint.
+Galaxy’s Total Perspective Vortex (TPV) plugin offers powerful scheduling capabilities by routing jobs to destinations based on custom rules. At the BioHackathon, we expanded TPV's functionality to prioritize sustainability. We gathered detailed job statistics from Galaxy's database and compute nodes, but also from the remote job execution endpoints called [Pulsar](https://github.com/galaxyproject/pulsar). Because Pulsar endpoints don't require any port to be open to the public, we needed to find a way where Pulsar could send this information actively. This was achieved by a script on the Pulsar endpoint that consumes the queue's status and other information and sends it to a message queue ([RabbitMQ](https://www.rabbitmq.com/)) from which another script picks it up and sends it to our InfluxDB server. We developed ranking algorithms that direct jobs to the most environmentally friendly destinations. These algorithms take into account factors such as energy efficiency and resource usage to ensure that Galaxy workflows have a smaller carbon footprint.
 
 ### Network Boot Skill Sharing
 Galaxy Europe's workload fluctuates, with idling servers during holidays and heavy job queues in peak seasons. We are currently using an OpenStack instance where we spin up our images, so called 'Virtual Galaxy Compute Nodes' ([VGCN](https://github.com/usegalaxy-eu/vgcn)) which contain everything needed to pick up jobs from [HTCondor](https://htcondor.org/) and start crunching numbers.