
Commit

Update content/news/2024-11-04-biohackathon/index.md
Co-authored-by: Beatriz Serrano-Solano <[email protected]>
paulzierep and beatrizserrano authored Dec 12, 2024
1 parent aa739ac commit 7142d60
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion content/news/2024-11-04-biohackathon/index.md
@@ -50,7 +50,7 @@ Galaxy already tracks the provenance of every tool execution, and we used this i
Additionally, we explored ways to enable shared caching among users who opt in. This will be especially useful in collaborative settings, such as training sessions, where large-scale analyses can be performed without unnecessary computational overhead.
### Scheduling with Sustainability in Mind

- Galaxy’s Total Perspective Vortex (TPV) plugin offers powerful scheduling capabilities by routing jobs to destinations based on custom rules. At the Biohackathon, we expanded TPV's functionality to prioritize sustainability. We gathered detailed job statistics from Galaxy's database and compute nodes, but also from the remote job execution endpoints called [Pulsar](https://github.com/galaxyproject/pulsar). Because Pulsar endpoints don't require any port to be open to the public, we needed to find a way where the Pulsar could send this information actively. This was achived by a script on the Pulsar endpoint that consumes the queue's status and other information and sends it to a message queue ([RabbitMQ](https://www.rabbitmq.com/)) from which another script picks it up and sends it to our InfluxDB server. We developed ranking algorithms that direct jobs to the most environmentally friendly destinations. These algorithms take into account factors such as energy efficiency and resource usage to ensure that Galaxy workflows have a smaller carbon footprint.
+ Galaxy’s Total Perspective Vortex (TPV) plugin offers powerful scheduling capabilities by routing jobs to destinations based on custom rules. At the BioHackathon, we expanded TPV's functionality to prioritize sustainability. We gathered detailed job statistics from Galaxy's database and compute nodes, but also from the remote job execution endpoints called [Pulsar](https://github.com/galaxyproject/pulsar). Because Pulsar endpoints don't require any port to be open to the public, we needed to find a way where Pulsar could send this information actively. This was achieved by a script on the Pulsar endpoint that consumes the queue's status and other information and sends it to a message queue ([RabbitMQ](https://www.rabbitmq.com/)) from which another script picks it up and sends it to our InfluxDB server. We developed ranking algorithms that direct jobs to the most environmentally friendly destinations. These algorithms take into account factors such as energy efficiency and resource usage to ensure that Galaxy workflows have a smaller carbon footprint.
### Network Boot Skill Sharing
Galaxy Europe's workload fluctuates, with idling servers during holidays and heavy job queues in peak seasons.
We are currently using an OpenStack instance where we spin up our images, so called 'Virtual Galaxy Compute Nodes' ([VGCN](https://github.com/usegalaxy-eu/vgcn)) which contain everything needed to pick up jobs from [HTCondor](https://htcondor.org/) and start crunching numbers.