-
It looks like the agent process loops at increasing intervals to look for available flow runs, backing off to a maximum of 10 seconds. This introduces latency between when a flow is triggered and when it is actually picked up for execution. From the Slack discussion, the main reason for implementing this backoff is to take some polling pressure off the GraphQL API, which does make sense. However, for someone switching from Airflow because of the latency it introduces between task runs, this can be a big deciding factor. So I just wanted to open this up for discussion and understand whether this is something being considered by the Prefect team.
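For readers unfamiliar with the behavior described above, here is a minimal sketch of that kind of polling loop with capped exponential backoff. This is an illustration of the pattern, not Prefect's actual agent code; `query_api`, `handle_run`, and the `max_polls`/`sleep` parameters are hypothetical names added to make it testable.

```python
import time

# Interval schedule matching the behavior described: empty polls move to the
# next, longer interval (capped at 10 s); finding work resets the backoff.
BACKOFF_INTERVALS = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 10.0]

def poll_for_flow_runs(query_api, handle_run, max_polls, sleep=time.sleep):
    """Poll query_api up to max_polls times, sleeping with backoff in between."""
    index = 0
    for _ in range(max_polls):
        runs = query_api()  # ask the API for ready flow runs
        if runs:
            for run in runs:
                handle_run(run)
            index = 0  # work found: reset backoff to the shortest interval
        else:
            # empty poll: back off to reduce pressure on the API
            index = min(index + 1, len(BACKOFF_INTERVALS) - 1)
        sleep(BACKOFF_INTERVALS[index])
```

The tradeoff is exactly the one raised in this thread: the cap bounds API load, but a run submitted right after a poll can wait up to the current interval before being picked up.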
Replies: 5 comments
-
Hi @asandeep, this is definitely something that could be made configurable as an enhancement. The loop intervals are simply declared like this:

```python
# Loop intervals for query sleep backoff
loop_intervals = {
    0: 0.25,
    1: 0.5,
    2: 1.0,
    3: 2.0,
    4: 4.0,
    5: 8.0,
    6: 10.0,
}
```

It would be straightforward to add an agent loop interval that could be set on the agent directly and through a config/env var. If that's something people are open to, I can make the PR, but I do want to make sure we have a minimum set because anything subsecond is trivial.
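As a rough sketch of what that enhancement could look like, the agent could resolve a fixed override interval from an argument or an environment variable, clamped to a minimum floor. The function name, env var name, and floor value below are all hypothetical, not Prefect's actual config surface:

```python
import os

# Floor on the interval so subsecond values can't hammer the API
# (0.25 s is an assumed value, matching the smallest backoff step).
MIN_LOOP_INTERVAL = 0.25

def resolve_loop_interval(explicit=None, env_var="AGENT_LOOP_INTERVAL"):
    """Return a fixed polling interval, or None to keep the default backoff.

    Precedence: explicit argument, then environment variable, then None.
    """
    raw = explicit if explicit is not None else os.environ.get(env_var)
    if raw is None:
        return None
    return max(float(raw), MIN_LOOP_INTERVAL)
```

When this returns a value, the agent would sleep that fixed amount every loop instead of walking the `loop_intervals` schedule.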
-
Thanks @joshmeek for your response. Yes, I think having the flexibility to configure the loop interval will let people experiment and settle on a value that works for their workloads.
-
Hi @asandeep - for my own understanding, what are your latency needs and why? Because flow runs tend to be on the order of minutes to hours for most people, the multi-second spin-up latency is usually not a concern, but I'm curious to learn more! In the meantime, I definitely agree with @joshmeek on introducing a more configurable backoff.
-
Hello @cicdw, I am currently working on a service that accepts files in various formats and converts them to HTML/text. The service is expected to be consumed both in real time and for bulk processing. For the real-time use case, the expectation is to trigger the flow ASAP, convert the file, and show the HTML/text content to the user. The conversion itself can take several seconds to complete, but the current latency in kicking off the flow is blocking me from using the service for the real-time use case. For batch processing, as you said, such latencies are bearable.

Just thinking out loud here: what if it were possible to make a distinction between creating a flow run and submitting a flow run, where creating a flow run would additionally poke the agent to wake up and execute the flow run just submitted? That would help trigger the flow immediately where required, while keeping the regular polling backoff to reduce API pressure.
-
@cicdw @joshmeek I was able to carve out a quick POC PR to demonstrate the idea suggested above: #2914. Would love to get your input before proceeding further on this.