[BUG] Behaviour of InstanceCount poorly documented #1467

Open
asos-robbell opened this issue Oct 11, 2023 · 9 comments
Labels
type-code-defect Something isn't working

Comments

@asos-robbell

Describe the bug
It's very unclear based on documentation what the behaviour of InstanceCount is. If I set InstanceCount=1 does that mean 'at least one instance' or 'no more than one instance'? From the documentation I've seen it's unclear and I have a scenario where I want at most one instance of my application running at any one time.

Area/Component:
Placement

To Reproduce
Steps to reproduce the behavior:

  1. Set InstanceCount=1

Expected behavior
Unknown

Observed behavior:
Unclear

Screenshots
If applicable, add screenshots to help explain your problem.

Service Fabric Runtime Version:
ex: 7.1., 7.2.

Environment:

  • Must be one of these values [Standalone OR Azure OR OneBox/Dev cluster]
  • OS: [e.g. Windows 2019, Ubuntu 18.04]
  • Version [e.g. 7.1, 7.2 ]

If this is a regression, which version did it regress from?

Additional context
I've raised this on StackOverflow where it was met with similar uncertainty. This behaviour needs to be documented:

https://stackoverflow.com/questions/77186214/does-instancecount-1-in-servicefabric-mean-at-least-one-instance-or-no-more-t/77254680


Assignees: /cc @microsoft/service-fabric-triage

@asos-robbell asos-robbell added the type-code-defect Something isn't working label Oct 11, 2023
@flower7434

"...and I have a scenario where I want at most one instance of my application running at any one time." You can never trust SF to have a single instance of anything. So, no, that design will never fly on SF.

@mfmadsen

Yes, SF will do its best to honor your intention of a single instance, but SF treats the availability of your service as the top priority. When a service instance is about to be moved (for example, because a node is being disabled for maintenance), SF will most likely spin up a new instance on another node BEFORE closing the instance on the node being deactivated. In that sense InstanceCount=1 means 'at least one instance' (and, under normal conditions, 'no more than one').
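
To make the ordering described above concrete, here is a toy model of a "start the replacement first, then close the original" move. It is only a sketch: `ToyCluster` and its methods are hypothetical names invented for illustration, not part of any Service Fabric API, and the real placement logic is far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model of a stateless service with a target instance count of 1."""

    running: set = field(default_factory=set)    # nodes currently hosting an instance
    history: list = field(default_factory=list)  # instance count after each step

    def start(self, node):
        self.running.add(node)
        self.history.append(len(self.running))

    def stop(self, node):
        self.running.discard(node)
        self.history.append(len(self.running))

    def move(self, source, target):
        # Availability first: bring up the replacement, then close the original.
        self.start(target)
        self.stop(source)


cluster = ToyCluster()
cluster.start("node-1")
cluster.move("node-1", "node-2")
print(cluster.history)  # [1, 2, 1]
```

The middle value is the point: during the move there are briefly two instances, even though the configured count is 1.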

@asos-robbell
Author

Thanks, @mfmadsen and @FredrikDahlberg. Is there any way to guarantee essentially a singleton service in Service Fabric, e.g. never more than one instance?

@flower7434

"Thanks, @mfmadsen and @FredrikDahlberg. Is there any way to guarantee essentially a singleton service in Service Fabric, e.g. never more than one instance?"

I believe we have some locks in a stateful service that we use to block reentrancy. Maybe something similar can be used.
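
One way such a lock might look is a lease with a fencing token: only the holder of the current lease may do work, and a stale holder is rejected when it tries to write. The sketch below is an assumption about the general technique, not the commenter's actual implementation. `LeaseStore` is a hypothetical in-memory stand-in for a shared, strongly consistent store, and the injectable clock exists only to make the example deterministic.

```python
import time


class LeaseStore:
    """In-memory stand-in for a shared, strongly consistent store (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._holder = None
        self._expires_at = 0.0
        self._token = 0  # fencing token, bumped whenever the lease is newly granted

    def try_acquire(self, owner, ttl_seconds):
        """Return a fencing token if `owner` now holds the lease, else None."""
        now = self._clock()
        if self._holder not in (None, owner) and now < self._expires_at:
            return None  # someone else holds a live lease
        if self._holder != owner or now >= self._expires_at:
            self._token += 1  # new holder or expired lease: issue a fresh token
        self._holder = owner
        self._expires_at = now + ttl_seconds
        return self._token

    def is_current(self, token):
        """Writers present their token; stale or expired holders are rejected."""
        return token == self._token and self._clock() < self._expires_at


# Usage with a fake clock so the outcome does not depend on real time.
now = [0.0]
store = LeaseStore(clock=lambda: now[0])

token_a = store.try_acquire("instance-a", ttl_seconds=10)       # granted
blocked = store.try_acquire("instance-b", ttl_seconds=10)       # None: lease is live
now[0] = 11.0                                                   # instance-a's lease expires
token_b = store.try_acquire("instance-b", ttl_seconds=10)       # granted, new token
print(store.is_current(token_a), store.is_current(token_b))     # False True
```

Note that this does not stop a second instance from running; it only lets the shared store refuse work from an instance that no longer holds the lease.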

@flower7434

"You can NEVER guarantee a singleton service, and despite the documentation, you can't even be assured that an instance of an actor won't be running concurrently. If Service Fabric moves the actor service, it does not wait for the existing actors to shut down, they continue running, but are no longer primary, so they won't be able to update state, but they're still running."

Correct, but even more importantly, Service Fabric does not guarantee that ANY instance of a service is running, no matter what you set InstanceCount to. If you have 0 instances, increasing it does not help; it will still be 0.

@flower7434

"Correct, but even more importantly, Service Fabric does not guarantee that ANY instance of a service is running, no matter what you set InstanceCount to. If you have 0 instances, increasing it does not help; it will still be 0."

"I'm not sure what you mean by this. My experience has always been that SF creates the services you configure it to create. If it didn't, I would have abandoned it ages ago."

It is not often, but it happens a couple of times per year for us. Typically SF tries to create an instance but fails, sometimes because it has not reloaded the config or something similar, and usually after deployments. The rollout logic seems to be wrong: even if something fails, it just continues with the next nodes. I would be much happier if it were possible to restart a service, to redeploy a service, or at least to deploy the same version of an app again. The workaround now is to rebuild the app and then deploy it, which may take over an hour. I have never managed to get a failed service running again without deploying a new version.

@JohnNilsson

JohnNilsson commented Nov 17, 2023

@FredrikDahlberg we've been running SF for six years now. Never saw that behaviour. My guess is you've missed something about how upgrades work.

Things to check:

  • Are you using Monitored upgrades? Unmonitored upgrades would ignore errors as you describe
  • Are you updating the version number of all components you need to reload/restart? Not changing the version of the config package would result in not triggering a reload
  • Is the service configured to only refresh the config if nothing else changed? Are you reacting to those events?
  • Are you using diff deploys? If a new config package is not included in the deploy package there is nothing to reload. Conversely, if a new code package is not included, there is nothing to restart.
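
The version-number items in the checklist above can be checked mechanically before a deploy. The following is a minimal sketch under stated assumptions: the dictionaries mapping a package name to a `(version, content_hash)` pair are a hypothetical representation made up for this example, not a Service Fabric manifest format, and `unbumped_packages` is an illustrative helper rather than an existing tool.

```python
def unbumped_packages(previous, candidate):
    """Return names of packages whose content changed but whose version did not.

    Each argument maps package name -> (version, content_hash).
    """
    stale = []
    for name, (version, content_hash) in candidate.items():
        if name not in previous:
            continue  # new package, nothing to compare against
        old_version, old_hash = previous[name]
        if content_hash != old_hash and version == old_version:
            stale.append(name)
    return sorted(stale)


previous = {"Code": ("1.0.0", "aaa"), "Config": ("1.0.0", "bbb")}
candidate = {"Code": ("1.0.1", "ccc"), "Config": ("1.0.0", "ddd")}
print(unbumped_packages(previous, candidate))  # ['Config']
```

Here the config package changed but kept its version, which is the situation the checklist warns would not trigger a reload.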

5 participants
@JohnNilsson @mfmadsen @flower7434 @asos-robbell and others