Worker shutdown may lead to failed function invocations with ObjectDisposedException (which may be persisted by the WebJobs listener) #2687

Open
danielmarbach opened this issue Sep 6, 2024 · 27 comments
Labels
bug Something isn't working needs-discussion

Comments

@danielmarbach
Contributor

Description

When running the attached reproduction for a while (usually a few hours) in Azure Functions on Linux (we tried it in Switzerland North), the following exception occurs:

Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.ServiceBusMessageHandler_workday_validation_queue
 ---> Microsoft.Azure.WebJobs.Script.Workers.Rpc.RpcException: Result: Failure
Exception: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'IServiceProvider'.
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope()
   at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48
   at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105
   at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 89
   at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88
Stack:    at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope()
   at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48
   at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105
   at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 89
   at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88
   at Microsoft.Azure.WebJobs.Script.Description.WorkerFunctionInvoker.InvokeCore(Object[] parameters, FunctionInvocationContext context) in /src/azure-functions-host/src/WebJobs.Script/Description/Workers/WorkerFunctionInvoker.cs:line 101
   at Microsoft.Azure.WebJobs.Script.Description.FunctionInvokerBase.Invoke(Object[] parameters) in /src/azure-functions-host/src/WebJobs.Script/Description/FunctionInvokerBase.cs:line 82
   at Microsoft.Azure.WebJobs.Host.Executors.VoidTaskMethodInvoker`2.InvokeAsync(TReflected instance, Object[] arguments) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\VoidTaskMethodInvoker.cs:line 20
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker`2.InvokeAsync(Object instance, Object[] arguments) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionInvoker.cs:line 53
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 581
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 527
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 306
   --- End of inner exception stack trace ---
   at Microsoft.Azure.WebJobs.ServiceBus.SessionMessageProcessor.CompleteProcessingMessageAsync(ServiceBusSessionMessageActions actions, ServiceBusReceivedMessage message, FunctionResult result, CancellationToken cancellationToken)
   at Microsoft.Azure.WebJobs.ServiceBus.Listeners.ServiceBusListener.ProcessSessionMessageAsync(ProcessSessionMessageEventArgs args)
   at Azure.Messaging.ServiceBus.ServiceBusProcessor.OnProcessSessionMessageAsync(ProcessSessionMessageEventArgs args)
   at Azure.Messaging.ServiceBus.ServiceBusSessionProcessor.OnProcessSessionMessageAsync(ProcessSessionMessageEventArgs args)
   at Azure.Messaging.ServiceBus.SessionReceiverManager.OnMessageHandler(EventArgs args)
   at Azure.Messaging.ServiceBus.ReceiverManager.ProcessOneMessage(ServiceBusReceivedMessage triggerMessage, CancellationToken cancellationToken)

At the moment we don't think it is anything particular to do with the session in ASB or the ASB integration but rather with the function context handling in the middleware.

It could also be that it is related to the other issues in regard to ObjectDisposedExceptions.

#1929

I have already raised two PRs surrounding my research in this area but I'm unsure if they'll help

#2686
#2685

I have also made a comment about the use of TaskCompletionSource in the synchronization logic between the middleware.

Steps to reproduce

Run the repro on Service Bus for a while Experiment.zip

@danielmarbach danielmarbach added the potential-bug Items opened using the bug report template, not yet triaged and confirmed as a bug label Sep 6, 2024
@jsparent

jsparent commented Sep 11, 2024

We're getting the same kind of errors all over the place when using consumption plans; so far we see no pattern that could help identify the faulting process. It fails about 5% of the time, at random, on our Azure Function Apps on the consumption plan. Moving those functions to an App Service plan resolves the issue, but we would like to keep using consumption plans.

Issue started a few weeks ago when updating references to latest versions.

Sample exception:

Result: Failure Exception: System.ObjectDisposedException: Cannot access a disposed object. Object name: 'IServiceProvider'. 
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException() 
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope() 
at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48 
at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105 
at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 91 
at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88 Stack: 
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException() 
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope() 
at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48 
at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105 
at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 91 
at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88

@domenichelfenstein

domenichelfenstein commented Sep 12, 2024

@danielmarbach's issue was originally mine and he kinda posted it for me (thanks again).

I'm still getting the same exceptions you posted, @jsparent.
However, I've tried publishing the function in Switzerland North and West Europe, and so far I'm only getting the exceptions in Switzerland North (the function has been running for two days now).

Where do you host your function, @jsparent ?

@jsparent

@domenichelfenstein our function apps are located in Canada Central

@domenichelfenstein

Are you using Windows or Linux, @jsparent ?

@jsparent

@domenichelfenstein We're using Linux

@danr-stadion

We are also experiencing the same issue. Our function app is running on a Linux-hosted consumption plan in the West Europe region.

For reference:

Result: Failure
Exception: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'IServiceProvider'.
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope()
at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 46
at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105
at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context)
at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88
Stack: at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope()
at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 46
at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105
at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context)
at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88

@jbenettius

jbenettius commented Sep 27, 2024

Also experiencing the same error. Linux hosted consumption plan in East US. The function is queue triggered and has a queue output.

Exception: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'IServiceProvider'.
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48
at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c.<UseFunctionExecutionMiddleware>b__1_2(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 57
at Microsoft.Azure.Functions.Worker.OutputBindings.OutputBindingsMiddleware.Invoke(FunctionContext context, FunctionExecutionDelegate next) in D:\a\_work\1\s\src\DotNetWorker.Core\OutputBindings\OutputBindingsMiddleware.cs:line 13
at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 89
Stack: at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48
at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c.<UseFunctionExecutionMiddleware>b__1_2(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 57
at Microsoft.Azure.Functions.Worker.OutputBindings.OutputBindingsMiddleware.Invoke(FunctionContext context, FunctionExecutionDelegate next) in D:\a\_work\1\s\src\DotNetWorker.Core\OutputBindings\OutputBindingsMiddleware.cs:line 13
at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 89

@jsparent

UPDATE:
We stopped seeing this error on September 21st, 2024, as of 18:00 EST. We didn't change anything; we even reverted to the original code afterwards, and we haven't had any problems since.

It happened on ~10 different Function Apps, and it all stopped at once. While I'm glad this is no longer an issue, I'm pretty sure this is something happening in the "other" layer of the Function App service. Hardware or routing, maybe?

@domenichelfenstein

Same here: the problem seems to be gone. But without an acknowledgement of the problem by Microsoft and a clear message about what they've changed, I don't know whether these exceptions could reappear all of a sudden.

@AlexMasson

> Same here: Problem seems to be gone. But without an acknowledgement of the problem by Microsoft and a clear message what they've changed, I don't know if these exceptions could reappear all of a sudden.

Unfortunately we encountered the error again a few hours ago (link issue).

@jbenettius

We received this same error with a timer-triggered Durable Function running on a Windows Elastic Premium plan. I opened a support ticket and will report back any findings.

@templarvii

templarvii commented Oct 14, 2024

We encounter the same issue.

We have Azure Functions on pricing tier "Consumption" and "Standard". Both hosted on "Windows" and "West Europe". All of them have the same issue. It seems that the problem started with the migration from ".NET6 in-process" to ".NET8 isolated".

I can visualize this with the following KQL query:

exceptions
| where details contains "System.ObjectDisposedException"
| summarize count() by weekOfYear = week_of_year(timestamp)
| order by weekOfYear asc

Image

@satvu satvu added area: http Items related to experience improvements for HTTP triggers needs-investigation and removed Needs: Triage (Functions) labels Oct 15, 2024
@danielmarbach
Contributor Author

> Fixed race condition that could lead to an ObjectDisposedException when using the ServiceBusSessionProcessor.

https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/servicebus/Azure.Messaging.ServiceBus/CHANGELOG.md#bugs-fixed-9

That's the only potentially related issue I could find in the Service Bus SDK.

It makes me sorrowful that there is no traction on this issue for such a long time.

@domenichelfenstein

@danielmarbach the bug fix you mentioned could be related.
However, the Microsoft.Azure.Functions.Worker.Extensions.ServiceBus version in my error-throwing sample project is 5.22.0, which uses version 7.18.1 of Azure.Messaging.ServiceBus. The fix was introduced in 7.16.1.
So this fix is definitely not solving the issue many of us are encountering here.

I second the sorrowful feeling of @danielmarbach

@NoleNerd

I've been looking at a similar exception from an Orchestrator function without much luck. It has so far only been a one-off occurrence.

  • net8.0
  • isolated
  • dockerized (linux)
  • Function App running in Container Apps
  • microsoft.azure.functions.worker.extensions.servicebus => 5.21.0
"message": "Function 'MainOrchestrator' failed with an unhandled exception.",
"details": "Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.MainOrchestrator
 ---> Microsoft.Azure.WebJobs.Script.Workers.Rpc.RpcException: Result: Failure
Exception: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'IServiceProvider'.
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope()
   at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\\a\\_work\\1\\s\\src\\DotNetWorker.Core\\Context\\DefaultFunctionContext.cs:line 46
   at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\\a\\_work\\1\\s\\src\\DotNetWorker.Core\\Hosting\\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105
   at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context)
   at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\\a\\_work\\1\\s\\src\\DotNetWorker.Grpc\\Handlers\\InvocationHandler.cs:line 88
Stack:    at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope()
   at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\\a\\_work\\1\\s\\src\\DotNetWorker.Core\\Context\\DefaultFunctionContext.cs:line 46
   at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c__DisplayClass3_0`1.<UseMiddleware>b__1(FunctionContext context) in D:\\a\\_work\\1\\s\\src\\DotNetWorker.Core\\Hosting\\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 105
   at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context)
   at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\\a\\_work\\1\\s\\src\\DotNetWorker.Grpc\\Handlers\\InvocationHandler.cs:line 88
   at Microsoft.Azure.WebJobs.Script.Description.WorkerFunctionInvoker.InvokeCore(Object[] parameters, FunctionInvocationContext context) in /src/azure-functions-host/src/WebJobs.Script/Description/Workers/WorkerFunctionInvoker.cs:line 101
   at Microsoft.Azure.WebJobs.Script.Description.FunctionInvokerBase.Invoke(Object[] parameters) in /src/azure-functions-host/src/WebJobs.Script/Description/FunctionInvokerBase.cs:line 82
   at Microsoft.Azure.WebJobs.Script.Description.FunctionGenerator.Coerce[T](Task`1 src) in /src/azure-functions-host/src/WebJobs.Script/Description/FunctionGenerator.cs:line 225
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker`2.InvokeAsync(Object instance, Object[] arguments) in D:\\a\\_work\\1\\s\\src\\Microsoft.Azure.WebJobs.Host\\Executors\\FunctionInvoker.cs:line 53
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.OutOfProcMiddleware.<>c__DisplayClass10_0.<<CallOrchestratorAsync>b__0>d.MoveNext() in D:\\a\\_work\\1\\s\\src\\WebJobs.Extensions.DurableTask\\OutOfProcMiddleware.cs:line 130
--- End of stack trace from previous location ---
   at Microsoft.Azure.WebJobs.Host.Executors.TriggeredFunctionExecutor`1.<>c__DisplayClass7_0.<<TryExecuteAsync>b__0>d.MoveNext() in D:\\a\\_work\\1\\s\\src\\Microsoft.Azure.WebJobs.Host\\Executors\\TriggeredFunctionExecutor.cs:line 51
--- End of stack trace from previous location ---
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) in D:\\a\\_work\\1\\s\\src\\Microsoft.Azure.WebJobs.Host\\Executors\\FunctionExecutor.cs:line 581
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in D:\\a\\_work\\1\\s\\src\\Microsoft.Azure.WebJobs.Host\\Executors\\FunctionExecutor.cs:line 527
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\\a\\_work\\1\\s\\src\\Microsoft.Azure.WebJobs.Host\\Executors\\FunctionExecutor.cs:line 306
   --- End of inner exception stack trace ---
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\\a\\_work\\1\\s\\src\\Microsoft.Azure.WebJobs.Host\\Executors\\FunctionExecutor.cs:line 352
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryExecuteAsync(IFunctionInstance functionInstance, CancellationToken cancellationToken) in D:\\a\\_work\\1\\s\\src\\Microsoft.Azure.WebJobs.Host\\Executors\\FunctionExecutor.cs:line 108"

@jviau
Contributor

jviau commented Oct 30, 2024

@danielmarbach, @NoleNerd, @jbenettius and any others: can you quantify any impact you have here beyond an exception in your logs?

We have investigated and we believe this is noise from application shutdown (most likely a scale in). The call stack points to the root service provider being disposed while we are in the middle of a function invocation. Root service provider disposal only happens on application shutdown.

We are evaluating if we want to clarify this exception to indicate the invocation is aborted due to app shutdown but we have the following to consider:

  1. No guarantee it will fully "solve" the issue. Given the line-by-line execution, the root provider can always be disposed between our shutdown check and accessing this property.
  2. Right now it appears to be noise, so it is not a high priority. We will re-evaluate if we observe impact to customer apps.
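The failure mode and the race in point 1 can be sketched outside the worker. This is a minimal console sketch, assuming only the Microsoft.Extensions.DependencyInjection package; `CreateScopeOrAbort` is a hypothetical helper for illustration and not part of the worker. It disposes the root provider after the shutdown check would have passed, then shows one way the resulting ObjectDisposedException could be translated into a cancellation instead of surfacing as a plain failure.

```csharp
using System;
using System.Threading;
using Microsoft.Extensions.DependencyInjection;

// Build a root provider, as a host would do at startup.
ServiceProvider root = new ServiceCollection().BuildServiceProvider();
using var shutdown = new CancellationTokenSource();

// Simulate the race window: the provider is disposed although the
// shutdown token has not been observed as cancelled yet.
root.Dispose();

try
{
    using IServiceScope scope = CreateScopeOrAbort(root, shutdown.Token);
    Console.WriteLine("Scope created; the invocation would continue.");
}
catch (OperationCanceledException ex)
{
    Console.WriteLine($"Invocation aborted: {ex.Message}");
}

// Hypothetical helper (an assumption for this sketch, not worker code).
static IServiceScope CreateScopeOrAbort(IServiceProvider services, CancellationToken shutdownToken)
{
    // Check-then-use: this check alone cannot close the gap, because
    // disposal may happen on another thread right after it passes.
    shutdownToken.ThrowIfCancellationRequested();
    try
    {
        return services.CreateScope();
    }
    catch (ObjectDisposedException ex)
    {
        // Covers the window between the check and the call.
        throw new OperationCanceledException("Worker is shutting down.", ex);
    }
}
```

The catch block is what makes the helper robust; the up-front check only shortens the window. Whether the host or a listener would treat a cancellation differently from any other failure is not established in this thread.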

@satvu satvu added Needs: Author Feedback needs-discussion and removed needs-investigation area: http Items related to experience improvements for HTTP triggers potential-bug Items opened using the bug report template, not yet triaged and confirmed as a bug labels Oct 30, 2024
@domenichelfenstein

domenichelfenstein commented Oct 31, 2024

@jviau
On the contrary: it's not just background noise. The impact is that messages are not being processed (and, in our case, are therefore dead-lettered).
The impact is quite high!
I urge you to reconsider your prioritization of this issue.

It makes a session queue handler running on .NET 8 in isolated mode unusable.

@NoleNerd

@jviau we saw an instance of our Durable Function Orchestrator fail with this exception. We had a sub-orchestrator waiting on an External Event, it received that External Event, the sub-orchestrator re-animated and did a little bit of work and finished. The parent/main orchestrator took over and did a little bit of work, called a couple more sub-orchestrators, but then it failed with this exception.

Regarding the scaling - I can see that our Container App scaled to 1 pod around the time the requests came in (10/14 at 20:40) but I don't see it going past 1 pod. Our External Event message was received around 20:48 and the functions started doing work. I would not expect a scale in to occur here with active work. Based on the screenshot below, there should have still been an instance running at the time of exception.

Image

@jviau
Contributor

jviau commented Oct 31, 2024

@NoleNerd

@jviau , here you go:

ExecutionTime: 2024-10-31T18:55:11Z
id=d2e5eb67-fed6-456f-acf4-d8cda75c1528
East US

@jviau
Contributor

jviau commented Nov 1, 2024

@NoleNerd which version of durable worker extension are you using?

@NoleNerd

NoleNerd commented Nov 1, 2024

@jviau, we're using Microsoft.Azure.Functions.Worker.Extensions.DurableTask 1.1.5

@jviau
Contributor

jviau commented Nov 1, 2024

Thanks!

Looking at @NoleNerd's app logs and some others, all of the role instances where this occurred are indeed being terminated. This exception is just a symptom and not the root issue. As to why they shut down, I am not sure; that would be a question for the individual platform/sku teams. As for the impact on function triggers, it is up to each trigger to handle this shutdown scenario. For durable specifically, we did make improvements to this scenario, but there may be more work to do. For service bus, this may need to be discussed with the Azure SDK team, as they own the service bus WebJobs extension. A workaround might be to disable auto-complete of messages and use ServiceBusMessageActions to manually complete the message. This may avoid completing the message when the worker is shutting down.
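For readers who want to try the suggested workaround, here is a minimal sketch of manual completion in the isolated worker model. It assumes the Microsoft.Azure.Functions.Worker.Extensions.ServiceBus package; the queue name `my-queue`, the connection setting name `ServiceBusConnection`, and the `ProcessAsync` helper are placeholders invented for illustration. It does not enable sessions, and the thread does not confirm whether this prevents dead-lettering during worker shutdown.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.Functions.Worker;

public class ManualCompletionFunction
{
    [Function(nameof(ManualCompletionFunction))]
    public async Task Run(
        // AutoCompleteMessages = false: the extension no longer completes
        // the message on its own after a successful invocation.
        [ServiceBusTrigger("my-queue", Connection = "ServiceBusConnection", AutoCompleteMessages = false)]
        ServiceBusReceivedMessage message,
        ServiceBusMessageActions messageActions,
        CancellationToken cancellationToken)
    {
        await ProcessAsync(message, cancellationToken);

        // Complete only after the work has actually finished.
        await messageActions.CompleteMessageAsync(message, cancellationToken);
    }

    // Placeholder for the real message handling (hypothetical).
    private static Task ProcessAsync(ServiceBusReceivedMessage message, CancellationToken cancellationToken)
        => Task.CompletedTask;
}
```

With this shape, a message that is never explicitly completed stays subject to the queue's lock and delivery-count settings, so the maximum delivery count still decides when it is dead-lettered.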

We will need to discuss internally more how we want to proceed with this, as it is part platform/sku issue, part worker contract gap, part extension responsibility to be resilient to this.

With that said, we do have a drain mode feature which avoids this in most scenarios. But there are some management actions where drain mode does not run and apps are forcibly shut down (stopping the function app, for example). Also, I am unsure whether Functions on Azure Container Apps supports drain mode.

Contributor

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

If you are not the original author (danielmarbach) and believe this issue is not stale, please comment with /bot not-stale and I will not close it.

@dmansurov83

dmansurov83 commented Nov 20, 2024

We are facing the same issue about once per week (Consumption plan, West Europe):

Exception while executing function: Functions.BackgroundEventsProcessorFunction Result: Failure
Exception: System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'IServiceProvider'.
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
   at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48
   at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c.<UseFunctionExecutionMiddleware>b__1_2(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 57
   at Microsoft.Azure.Functions.Worker.OutputBindings.OutputBindingsMiddleware.Invoke(FunctionContext context, FunctionExecutionDelegate next) in D:\a\_work\1\s\src\DotNetWorker.Core\OutputBindings\OutputBindingsMiddleware.cs:line 13
   at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 89
   at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88
Stack:    at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
   at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices() in D:\a\_work\1\s\src\DotNetWorker.Core\Context\DefaultFunctionContext.cs:line 48
   at Microsoft.Extensions.Hosting.MiddlewareWorkerApplicationBuilderExtensions.<>c.<UseFunctionExecutionMiddleware>b__1_2(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Hosting\WorkerMiddlewareWorkerApplicationBuilderExtensions.cs:line 57
   at Microsoft.Azure.Functions.Worker.OutputBindings.OutputBindingsMiddleware.Invoke(FunctionContext context, FunctionExecutionDelegate next) in D:\a\_work\1\s\src\DotNetWorker.Core\OutputBindings\OutputBindingsMiddleware.cs:line 13
   at Microsoft.Azure.Functions.Worker.FunctionsApplication.InvokeFunctionAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\FunctionsApplication.cs:line 89
   at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.InvokeAsync(InvocationRequest request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 88 

In the Application Insights traces, we see that a new host is started but cannot handle any events, regardless of the trigger type. After ~10 minutes, that host goes down, and the next one that starts works fine.
The impact is quite big because many Service Bus messages are moved to the DLQ after 10 retries within ~1-2 seconds (messages are released after the exception).

@jviau jviau changed the title System.ObjectDisposedException: Cannot access a disposed object. on DefaultFunctionContext Worker shutdown may lead to failed function invocations with ObjectDisposedException (which may be persisted by the WebJobs listener) Dec 12, 2024
@jviau
Contributor

jviau commented Dec 12, 2024

Using this issue to track further investigation and planning a fix.

Summary

In some cases, the dotnet worker is shut down via SIGTERM, which causes all DI containers to be disposed. In-flight invocations may then encounter an ObjectDisposedException. Even though the worker is shutting down, there may be enough time to propagate these failures back to the host. Listeners may then treat this as a user-code failure. This is problematic for message-based listeners, which should retry in this scenario.
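To check whether failed invocations line up with a shutdown, one option is to log when the host begins stopping. The sketch below is a diagnostic aid only, not a fix. It uses a plain generic host with the Microsoft.Extensions.Hosting package, whose default console lifetime reacts to SIGTERM; that the same registration works unchanged in a Functions worker's Program.cs is an assumption.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

using IHost host = Host.CreateDefaultBuilder(args).Build();

IHostApplicationLifetime lifetime =
    host.Services.GetRequiredService<IHostApplicationLifetime>();

// Fires when shutdown starts (for example after SIGTERM), before the
// host and its service provider are disposed.
lifetime.ApplicationStopping.Register(() =>
    Console.WriteLine($"{DateTimeOffset.UtcNow:O} shutdown requested"));

// Fires once the host has finished stopping.
lifetime.ApplicationStopped.Register(() =>
    Console.WriteLine($"{DateTimeOffset.UtcNow:O} host stopped"));

await host.RunAsync();
```

Comparing these timestamps with the exception timestamps shows whether an ObjectDisposedException occurred inside the shutdown window.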
