Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flex Consumption timer functions are killed #10527

Open
konrad11901 opened this issue Oct 16, 2024 · 12 comments
Open

Flex Consumption timer functions are killed #10527

konrad11901 opened this issue Oct 16, 2024 · 12 comments
Assignees
Labels
area: flex-consumption Items related to Flex Consumption support needs-investigation

Comments

@konrad11901
Copy link

Investigative information

Please provide the following:

  • Timestamp: 2024-10-16T10:00:00.0021088Z
  • Function App version: 4
  • Function App name: flexconsumptiontimer-test-func
  • Function name(s) (as appropriate): TimerFunction
  • Invocation ID: c2c3130e-7709-4710-af5c-ee2b4c024f95
  • Region: northeurope

Repro steps

  1. Create a simple timer-triggered C# function like this:
Function source code
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

namespace FlexConsumptionTimer
{
    public class TimerFunction
    {
        private readonly ILogger _logger;

        public TimerFunction(ILoggerFactory loggerFactory)
        {
            _logger = loggerFactory.CreateLogger<TimerFunction>();
        }

        [Function("TimerFunction")]
        public async Task Run([TimerTrigger("0 0 10 * * *")] TimerInfo myTimer, CancellationToken cancellationToken)
        {
            _logger.LogInformation("C# Timer trigger function executed at: {time}", DateTime.Now);

            // Register a callback to log when the function is cancelled
            // so that we can track if Functions worker is cancelling the function
            using var reg = cancellationToken.Register(() =>
            {
                _logger.LogWarning(
                    "Someone (user or Azure Functions host) requested to cancel the job! Trace: {trace}",
                    Environment.StackTrace);
            });

            await Task.Delay(1000 * 60 * 5, cancellationToken); // Wait for 5 minutes

            if (myTimer.ScheduleStatus is not null)
            {
                _logger.LogInformation("Next timer schedule at: {time}", myTimer.ScheduleStatus.Next);
            }
        }
    }
}
Program.cs
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var host = new HostBuilder()
    .ConfigureFunctionsWebApplication()
    .ConfigureServices(services =>
    {
        services.AddApplicationInsightsTelemetryWorkerService();
        services.ConfigureFunctionsApplicationInsights();
    })
    .Build();

host.Run();
Package references
<PackageReference Include="Microsoft.Azure.Functions.Worker" Version="1.23.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Http" Version="3.2.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Http.AspNetCore" Version="1.3.2" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Timer" Version="4.3.1" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="1.18.1" />
<PackageReference Include="Microsoft.ApplicationInsights.WorkerService" Version="2.22.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.ApplicationInsights" Version="1.4.0" />
  1. Create a new Azure Functions instance and necessary dependencies:
az group create --name "<RG name>" --location "northeurope"
az storage account create --name "<Storage Account name>" --location "northeurope" --resource-group "<RG name>" --sku Standard_LRS --allow-blob-public-access false --https-only
az functionapp create --resource-group "<RG name>" --name "flexconsumptiontimer-test-func" --storage-account "<Storage Account name>" --flexconsumption-location "northeurope" --runtime dotnet-isolated --runtime-version 8.0
  1. Deploy the previously created function:
func azure functionapp publish flexconsumptiontimer-test-func

Expected behavior

Function execution finishes successfully.

Actual behavior

Function execution is being killed by the host within a minute of starting the function.
It looks like there is some scaling down involved (that's my guess based on DrainMode mention in logs), which kills the ongoing execution of the function?

Known workarounds

Manual rerun of the function seems to work.

Related information

We had this issue earlier too (back in August) - then a simple package update seemed to help as the functions were working fine till the beginning of October.

This issue is not occurring on the Consumption plan.

Programming language used: C#
Bindings used: Timer

Logs
10/16/2024, 10:00:00.002 AM Executing 'Functions.TimerFunction' (Reason='Timer fired at 2024-10-16T09:59:59.9998519+00:00', Id=c2c3130e-7709-4710-af5c-ee2b4c024f95)
10/16/2024, 10:00:25.346 AM DrainMode mode enabled
10/16/2024, 10:00:25.346 AM Calling StopAsync on the registered listeners
10/16/2024, 10:00:25.347 AM Stopping the listener 'Microsoft.Azure.WebJobs.Host.Listeners.SingletonListener' for function 'TimerFunction'
10/16/2024, 10:00:25.356 AM Someone (user or Azure Functions host) requested to cancel the job! Trace: at System.Environment.get_StackTrace() at FlexConsumptionTimer.TimerFunction.<Run>b__2_0() in <...>\FlexConsumptionTimer\TimerFunction.cs:line 24 at System.Threading.CancellationTokenSource.Invoke(Delegate d, Object state, CancellationTokenSource source) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException) at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.TryCancel(String invocationId) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 147 at Microsoft.Azure.Functions.Worker.GrpcWorker.InvocationCancelRequestHandler(InvocationCancel request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 127 at Microsoft.Azure.Functions.Worker.GrpcWorker.ProcessRequestCoreAsync(StreamingMessage request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 109 at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine) at Microsoft.Azure.Functions.Worker.GrpcWorker.ProcessRequestCoreAsync(StreamingMessage request) at Microsoft.Azure.Functions.Worker.GrpcWorker.<>c__DisplayClass11_0.<Microsoft.Azure.Functions.Worker.Grpc.IMessageProcessor.ProcessMessageAsync>b__0() in D:\a\_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 66 at System.Threading.Tasks.Task`1.InnerInvoke() at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread) at System.Threading.ThreadPoolWorkQueue.Dispatch() at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
10/16/2024, 10:00:25.380 AM Executed 'Functions.TimerFunction' (Failed, Id=c2c3130e-7709-4710-af5c-ee2b4c024f95, Duration=25380ms)
10/16/2024, 10:00:25.414 AM Stopped the listener 'Microsoft.Azure.WebJobs.Host.Listeners.SingletonListener' for function 'TimerFunction'
10/16/2024, 10:00:25.415 AM Call to StopAsync complete, registered listeners are now stopped
@bhagyshricompany
Copy link

Hi @konrad11901 Thanks for reporting I have checked and created same for run every min same funct it working fine. locally.can you check it again.Image

@konrad11901
Copy link
Author

@bhagyshricompany I checked that and locally it seems to be working fine. So it looks like the issue occurs only after deployment to Azure.

@bhagyshricompany
Copy link

Hi @konrad11901 Thanks for your patience can you check it again it seems working.Image

@konrad11901
Copy link
Author

@bhagyshricompany It's still not working for me for timer-triggered runs. The function starts, but it's being killed <1min after.
Image

@bhagyshricompany
Copy link

can you try with some other plan like consumption and different region.please.

@konrad11901
Copy link
Author

@bhagyshricompany Sorry for a late response. A few days ago I deployed a new Flex Consumption function app in Sweden Central and it seems to work just fine (last 5 daily executions finished successfully).
Today I created a brand new North Europe-based Flex Consumption function (used the same template as for creating an app in Sweden Central) and the issue still occurs. It also still occurs on the original function app that I deployed in North Europe (there hasn't been a single successfully finished execution).
So the problem might occur only in North Europe...? I haven't tested other regions yet.

Other plans (Consumption, Premium, App Service Plan) were working always fine for us, this issue is specifically about Flex Consumption.

For reference, here are the details about the Sweden Central app:

  • Function App name: flexconsumptiontimer2-test-func
  • Function name(s) (as appropriate): TimerFunction
  • Invocation ID: 5572594e-159b-462c-9260-a07d4200e19e (last run's ID)

@bhagyshricompany
Copy link

@satvu please comment.Thanks

@satvu
Copy link
Member

satvu commented Nov 12, 2024

@pragnagopa and @paulbatum are investigating.

@johnrom
Copy link

johnrom commented Dec 3, 2024

I am having this issue on East US with a .NET 8 Isolated Flex Consumption function app that only has a single Timer function (there should only ever be 1 instance running).

This is the order of operations of an unsuccessful run. First the Drain Mode message appears, then it calls StopAsync on the SingletonListener which cancels the running application. The order of logs including DrainMode, StopAsync, etc, is the same exact order as when a run is successful and the SingletonListener.StopAsync is subsequently called. Maybe it thinks the run completed when it hasn't? I'm using an async function returning a task with its CancellationToken passed through the whole application. It is not a durable function. Most of the runs are unsuccessful, but I'd say about 1/3 of them complete. The runs only take up to 30-40 seconds to complete, but it can be canceled as early as 5 seconds in.

Here's an unsuccessful run. The timestamps read bottom to top.

Image

Here's a successful run:

Image

@Peter-Juhasz
Copy link

We are having the same issues with both Queue and Timer triggers:
Azure/azure-functions-dotnet-worker#2900

All invocations get cancelled after a few seconds or sometimes even a minute, essentially our function apps are unusable.

@jorodragoev
Copy link

jorodragoev commented Jan 7, 2025

same here - it is durable function, net-isolated, sometimes killed while executing any of the app activities
noticed that if the first log message is
POST /AzureFunctionsRpcMessages.FunctionRpc/EventStream
then in between every Executing Executed messages there is "Invoke" dependency message in the log
in this case the app is NOT prematurely killed

but when function wake up and above message is missing
then there is no "Invoke" dependency message in the log in between every Executing Executed messages
and the app is killed in 2-5 sec

it starts again shortly but this redundant restart complicates my life

Consumption AU-East

@b10-as
Copy link

b10-as commented Jan 21, 2025

I am also running into this issue. A TimerTrigger on a Flex Consumption function has its CancellationToken cancelled before the functionTimeout value of 1.5 hours. This happens anywhere between a few seconds and a few minutes. On the 20th of January there was a long stretch where cancellations were almost always at 2 minutes.
Running the function app code on a P0V3 ASP did not result in this issue.

Other triggers appear to be unaffected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: flex-consumption Items related to Flex Consumption support needs-investigation
Projects
None yet
Development

No branches or pull requests

7 participants