[Retrospective] Release 2.8.0 #3604

Closed
peterzhuamazon opened this issue Jun 6, 2023 · 5 comments

@peterzhuamazon
Member

This is the retrospective for release 2.8.0:

How to use this issue?
Please add comments to this issue; they can be small or large in scope. Honest feedback is important for improving our processes. Suggestions are also welcome but not required.

What will happen to this issue post release?
There will be one or more discussions about how the release went and how the next release can be improved. This ticket will then be updated with the notes from those discussions alongside action items.

@acarbonetto

Request to merge [AUTO] issues that increment snapshot versions quickly so as not to block bwc CI on downstream dependencies. These are raised by bots and are super-simple to approve/merge.
e.g. opensearch-project/common-utils#444

@MaxKsyunz
Contributor

MaxKsyunz commented Jun 10, 2023

> Request to merge [AUTO] issues that increment snapshot versions quickly so as not to block bwc CI on downstream dependencies. These are raised by bots and are super-simple to approve/merge. e.g. opensearch-project/common-utils#444

Yes! It's more involved than that -- the manifest file needs to be updated incrementally as the version bump PRs are merged.

Currently, only OpenSearch snapshots are built for 2.9.

@dblock
Member

dblock commented Jun 13, 2023

A workflow that tries to add plugins one by one from the previous release for as long as the build passes, and opens a PR with each new addition, would be a nice automation solution to this.
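A rough sketch of that idea, for illustration only. The function name `add_plugins_incrementally` and the `build_passes` callback are hypothetical; a real workflow would trigger an actual build and open a PR at the marked step. This version stops at the first failing addition, which is one reading of "for as long as the build passes".

```python
from typing import Callable, Iterable, List


def add_plugins_incrementally(
    candidates: Iterable[str],
    build_passes: Callable[[List[str]], bool],
    base: Iterable[str] = ("OpenSearch",),
) -> List[str]:
    """Grow the manifest one component at a time until a build fails.

    `build_passes` is a hypothetical callback that builds the given
    component list and reports success.
    """
    manifest = list(base)
    for plugin in candidates:
        trial = manifest + [plugin]
        if not build_passes(trial):
            break  # stop at the first addition that breaks the build
        manifest = trial  # a real workflow would open a PR here
    return manifest


if __name__ == "__main__":
    # Pretend only these components currently build for the next version.
    ready = {"OpenSearch", "common-utils"}
    result = add_plugins_incrementally(
        ["common-utils", "ml-commons", "sql"],
        lambda components: set(components) <= ready,
    )
    print(result)
```

The candidate list should be ordered so that dependencies come before their dependents; otherwise a plugin can fail only because something it needs has not been added yet.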

@Yury-Fridlyand
Contributor

Hi!

On behalf of the OpenSearch SQL plugin team, I would like to share our observations on the current weaknesses of the release process.


Dependency chains

Dependency chains should be defined across the whole project. This is crucial for making breaking changes as painless as possible.

Some people may be surprised that the SQL plugin depends on the ML plugin, which depends on ml-commons, which depends on common-utils, which depends on OpenSearch core.
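To illustrate what an explicit definition could look like, here is a minimal sketch that records the chain above as data and derives a build order from it. The dictionary layout and the component labels (for example `ml-plugin`) are assumptions made for the example, not an existing project file.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# component -> components it depends on (chain taken from the comment above;
# the labels are illustrative, not official repository names)
DEPENDS_ON = {
    "sql": {"ml-plugin"},
    "ml-plugin": {"ml-commons"},
    "ml-commons": {"common-utils"},
    "common-utils": {"OpenSearch"},
    "OpenSearch": set(),
}


def build_order(depends_on):
    """Return components ordered so every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(" -> ".join(build_order(DEPENDS_ON)))
```

The same order applies to version bumps: a component can only be bumped once everything earlier in the list has published a snapshot.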

Unblocking dependencies

Every component that has dependents should have designated owners. Those owners should be the first responders for unblocking dependents and applying the fixes required to do so. This task should always have the highest priority.

Notification for breaking changes

When a component team is about to make a breaking change, a notification (email or GitHub tagging) should be sent to all first responders of dependent modules/components. I also propose postponing the merge of a breaking change for one day after the notification is sent; this will give people time to prepare.
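As a sketch of how the notification list could be derived, the snippet below walks a dependency map in reverse to find every direct and transitive dependent of a changed component. The map contents and the function name `dependents_to_notify` are assumptions for illustration; mapping components to actual people is left out.

```python
from collections import deque

# component -> components it depends on (illustrative labels, not official names)
DEPENDS_ON = {
    "sql": {"ml-plugin"},
    "ml-plugin": {"ml-commons"},
    "ml-commons": {"common-utils"},
    "common-utils": {"OpenSearch"},
    "OpenSearch": set(),
}


def dependents_to_notify(depends_on, changed):
    """Return all components that directly or transitively depend on `changed`."""
    # Invert the map: dependency -> components that depend on it.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    # Breadth-first walk over the inverted edges.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    print(sorted(dependents_to_notify(DEPENDS_ON, "common-utils")))
```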

Fallback strategy

As I understand from discussions of the problems with the current release, if a component is not ready for the release, a previous version of it can be included in the release build instead. Unfortunately, version N of a component can be incompatible with version N+1 of core or of other components. Keeping a fallback version ready for the release is a good strategy for avoiding release delays.

Fallback maintenance

Unfortunately, extra effort is required from all teams to keep version N of a component ready for release N+1. Some changes need to be backported to make version N compatible with N+1. That version won't be N anymore; it will be N*, which requires additional testing.

Branch integration testing

The CI workflow described in sql#1242 should be adopted for all project repositories. Suppose a PR from branch A to branch B has CI workflows running on branch A (call this run CI_1). Once the PR is merged, branch B is updated and a new CI run starts (CI_2), but nobody looks at its results. If CI_1 fails, it blocks the merge, but failures of CI_2 don't, and they cause trouble for later PRs. The goal is to run CI_2 in parallel with CI_1 as part of the PR checks. This is fairly simple to do in GitHub Actions.

Flaky tests

The OpenSearch release build runs tests for N components K times (there are builds for rpm, deb, and so on), where every component has M thousand tests. When 1% of tests fail with 1% probability, the total chance of failing the entire build becomes quite high. I propose prioritizing fixes for flaky tests, perhaps second only to unblocking dependents and dependencies.
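To put a number on this, here is a back-of-the-envelope calculation. It assumes flaky failures are independent and uses made-up values for N, K and M (10 components, 3 build flavors, 1,000 tests each); only the two 1% figures come from the comment above.

```python
def build_failure_probability(
    components: int,
    builds: int,
    tests_per_component: int,
    flaky_fraction: float = 0.01,  # share of tests that are flaky
    flake_rate: float = 0.01,      # chance that a flaky test fails on a given run
) -> float:
    """Chance that at least one flaky test fails somewhere in the release build.

    Assumes every flaky test run fails independently of the others.
    """
    flaky_runs = components * builds * tests_per_component * flaky_fraction
    all_pass = (1.0 - flake_rate) ** flaky_runs
    return 1.0 - all_pass


if __name__ == "__main__":
    # Hypothetical sizes: N = 10 components, K = 3 builds, M = 1 (1,000 tests each)
    p = build_failure_probability(components=10, builds=3, tests_per_component=1000)
    print(f"Chance of at least one flaky failure: {p:.0%}")
```

With these assumed sizes there are 300 flaky test runs per release build, and the chance that at least one of them fails is roughly 95%, so almost every build would need a retry.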

@peterzhuamazon
Member Author

Thanks, everyone, for the feedback and the retro here.
I will close this issue for now.

Thanks.


6 participants