-
It looks like the agent process loops at increasing intervals to look for available flow runs, backing off to a maximum of 10 seconds. This introduces latency between when a flow is triggered and when it is actually picked up for execution. From the Slack discussion, the main reason for implementing this backoff is to take some polling pressure off the GraphQL API, which does make sense. However, for someone switching from Airflow because of the latency it introduces between task runs, this can be a big deciding factor. So I just wanted to open this up for discussion and understand whether this is something being considered by the Prefect team.
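For readers unfamiliar with the behavior described above, here is a minimal sketch of that kind of polling loop with capped exponential backoff. This is an illustration of the pattern, not Prefect's actual agent code; `query_api`, `handle_run`, and the `max_polls`/`sleep` parameters are hypothetical names added to make it testable.

```python
import time

# Interval schedule matching the behavior described: empty polls move to the
# next, longer interval (capped at 10 s); finding work resets the backoff.
BACKOFF_INTERVALS = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 10.0]

def poll_for_flow_runs(query_api, handle_run, max_polls, sleep=time.sleep):
    """Poll query_api up to max_polls times, sleeping with backoff in between."""
    index = 0
    for _ in range(max_polls):
        runs = query_api()  # ask the API for ready flow runs
        if runs:
            for run in runs:
                handle_run(run)
            index = 0  # work found: reset backoff to the shortest interval
        else:
            # empty poll: back off to reduce pressure on the API
            index = min(index + 1, len(BACKOFF_INTERVALS) - 1)
        sleep(BACKOFF_INTERVALS[index])
```

The tradeoff is exactly the one raised in this thread: the cap bounds API load, but a run submitted right after a poll can wait up to the current interval before being picked up.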
Replies: 5 comments
-
Hi @asandeep, this is definitely something that could be made configurable as an enhancement. The loop intervals are simply declared like this:

```python
# Loop intervals for query sleep backoff
loop_intervals = {
    0: 0.25,
    1: 0.5,
    2: 1.0,
    3: 2.0,
    4: 4.0,
    5: 8.0,
    6: 10.0,
}
```

It would be straightforward to add an agent loop interval that could be set on the agent directly and through a config/env var. If that's something people are open to, I can make the PR, but I do want to make sure we have a minimum set because anything subsecond is trivial.
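As a rough sketch of what that enhancement could look like, the agent could resolve a fixed override interval from an argument or an environment variable, clamped to a minimum floor. The function name, env var name, and floor value below are all hypothetical, not Prefect's actual config surface:

```python
import os

# Floor on the interval so subsecond values can't hammer the API
# (0.25 s is an assumed value, matching the smallest backoff step).
MIN_LOOP_INTERVAL = 0.25

def resolve_loop_interval(explicit=None, env_var="AGENT_LOOP_INTERVAL"):
    """Return a fixed polling interval, or None to keep the default backoff.

    Precedence: explicit argument, then environment variable, then None.
    """
    raw = explicit if explicit is not None else os.environ.get(env_var)
    if raw is None:
        return None
    return max(float(raw), MIN_LOOP_INTERVAL)
```

When this returns a value, the agent would sleep that fixed amount every loop instead of walking the `loop_intervals` schedule.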
-
Thanks @joshmeek for your response. Yes, I think having the flexibility to configure the loop interval will let people experiment and settle on a value that works for their workloads.
-
Hi @asandeep - for my own understanding, what are your latency needs and why? Because flow runs tend to be on the order of minutes to hours for most people, the multi-second spin-up latency is usually not a concern, but I'm curious to learn more! In the meantime, I definitely agree with @joshmeek on introducing a more configurable backoff.
-
Hello @cicdw, I am currently working on a service that accepts files in various formats and converts them to HTML/text. The service is expected to be consumed both in real time and for bulk processing. For the real-time use case, the expectation is to trigger the flow ASAP, convert the file, and show the HTML/text content to the user. The conversion itself can take several seconds to complete, but the current latency in kicking off the flow is blocking me from using the service for the real-time use case. For batch processing, as you said, such latencies are bearable.

Just thinking out loud here: what if it were possible to make a distinction between creating a flow run and submitting a flow run, where creating a flow run would additionally poke the agent to wake up and execute the flow run just submitted? That would help trigger the flow immediately where required, while keeping the regular polling backoff to reduce API pressure.
-
@cicdw @joshmeek I was able to carve out a quick POC PR to demonstrate the idea suggested above: #2914. Would love to get your input before proceeding further on this.