tests/robustness: init with powerfailure case #622
Conversation
ping @ahrtr
strategy:
  matrix:
    os: [ubuntu-latest]
runs-on: ${{ matrix.os }}
Suggested change: drop the single-entry matrix and assign the runner directly:

runs-on: ubuntu-latest
Updated
// FIXME: gofail should support unix sockets so that the test cases won't
// conflict.
Where is the conflict?
- The existing https://github.com/etcd-io/bbolt/blob/master/tests/failpoint/db_failpoint_test.go doesn't require exporting a port for failpoint;
- We will not run the test cases under test/robustness in parallel.
I ran into a port-already-in-use issue in an etcd flaky test when that test case didn't close the port after finishing.
I removed that comment since it seems confusing.
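For context, gofail exposes failpoints over an HTTP endpoint configured through the GOFAIL_HTTP environment variable, which is why a port is involved at all. Below is a minimal sketch of a helper along the lines of the `activeFailpoint` call quoted further down, assuming `fpURL` carries the `host:port` value passed via `GOFAIL_HTTP`; the actual helper in this PR may look different.

```go
package robustness

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"testing"
)

// activeFailpoint enables a gofail failpoint over HTTP. The term (e.g. "panic")
// is sent as the PUT request body, which is how gofail's HTTP handler expects
// failpoints to be configured.
func activeFailpoint(t *testing.T, fpURL, fpName, fpTerm string) {
	t.Helper()

	target := fmt.Sprintf("http://%s/%s", fpURL, fpName)
	req, err := http.NewRequest(http.MethodPut, target, strings.NewReader(fpTerm))
	if err != nil {
		t.Fatalf("failed to build request for %s: %v", target, err)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		t.Fatalf("failed to activate failpoint %q: %v", fpName, err)
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 300 {
		body, _ := io.ReadAll(resp.Body)
		t.Fatalf("unexpected status %d activating %q: %s", resp.StatusCode, fpName, body)
	}
}
```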
time.Sleep(time.Duration(time.Now().UnixNano()%5+1) * time.Second)
t.Logf("simulate power failure")

activeFailpoint(t, fpURL, "beforeSyncMetaPage", "panic")
I am thinking we should also support forcibly killing the process, so that it can exit at a random point. This can be resolved in a follow-up PR.
Yeah, I am thinking about introducing random panics, including force-kill. Let me handle this in the follow-up. Thanks.
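A rough sketch of what the force-kill variant could look like, assuming the workload runs as a child process started via `os/exec`; the helper name and structure are illustrative, not taken from this PR.

```go
package robustness

import (
	"math/rand"
	"os/exec"
	"testing"
	"time"
)

// killAtRandomPoint lets the workload run for a random 1-5 seconds and then
// SIGKILLs it, so the process exits at an arbitrary point without flushing
// anything. cmd is assumed to be a child process already running via cmd.Start().
func killAtRandomPoint(t *testing.T, cmd *exec.Cmd) {
	t.Helper()

	time.Sleep(time.Duration(rand.Intn(5)+1) * time.Second)
	t.Logf("simulate crash by force-killing the process")

	if err := cmd.Process.Kill(); err != nil {
		t.Fatalf("failed to kill process: %v", err)
	}
	// Reap the child; an error is expected here because it was killed.
	_ = cmd.Wait()
}
```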
In this test, you inject the failure on the device (fs) after the process has already terminated. Should we inject the failure (dropWrite) before we terminate (panic) the process?
For the forcible-kill case (which we will support in a follow-up PR), we do need to inject the failure on the device (fs) after the process has already terminated.
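If the device-level failure is implemented with a dm-flakey target (which the dropWrite name suggests, though that is an assumption on my part), toggling it could look roughly like the sketch below; the device name, size, and table parameters are placeholders, not the PR's actual code.

```go
package robustness

import (
	"fmt"
	"os/exec"
)

// dropWrites reloads a dm-flakey table so that subsequent writes are silently
// discarded while the device still appears writable, approximating a disk that
// loses data which was never synced. numSectors is the device size in 512-byte
// sectors; the 0/3600 up/down intervals keep the device in its unreliable state
// for the duration of the test.
func dropWrites(dmName, blockDev string, numSectors int64) error {
	table := fmt.Sprintf("0 %d flakey %s 0 0 3600 1 drop_writes", numSectors, blockDev)
	for _, args := range [][]string{
		{"suspend", dmName},
		{"load", dmName, "--table", table},
		{"resume", dmName},
	} {
		if out, err := exec.Command("dmsetup", args...).CombinedOutput(); err != nil {
			return fmt.Errorf("dmsetup %v failed: %v (%s)", args, err, out)
		}
	}
	return nil
}
```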
Discussed with @fuweid; let's support more cases in follow-up PRs.
Sync times: t1 t2 t3 x
FS:         f1 f2 f3 f4

if f4 < x: f4 ~ x
if f4 > x: t3 ~ x
Use gofailpoint (see the sketch after this list for how the commit interval could be applied):
- Set a huge value for `commit interval`: make sure all data after the last `sync` is lost
- Set a proper value for `commit interval`: make sure part of the data since the last `sync` is lost
- Set a very small value for `commit interval`: almost no data loss

Forcibly killing the process:
- same as above, to support different commit intervals
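Assuming `commit interval` above refers to ext4's journal commit interval (the `-o commit=<seconds>` mount option), applying it could look roughly like this sketch; the device and mount-point arguments are placeholders, not the PR's actual code.

```go
package robustness

import (
	"fmt"
	"os/exec"
)

// mountWithCommitInterval mounts dev on mountPoint with ext4's journal commit
// interval set to commitSeconds. A huge interval means almost nothing is flushed
// between explicit syncs (maximizing the window of lost data after a simulated
// power failure); a tiny interval means almost no data loss.
func mountWithCommitInterval(dev, mountPoint string, commitSeconds int) error {
	opts := fmt.Sprintf("commit=%d", commitSeconds)
	out, err := exec.Command("mount", "-o", opts, dev, mountPoint).CombinedOutput()
	if err != nil {
		return fmt.Errorf("mount %s on %s with %q failed: %v (%s)", dev, mountPoint, opts, err, out)
	}
	return nil
}
```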
Add `Robustness Test` pipeline for robustness test cases. Signed-off-by: Wei Fu <[email protected]>
Add `Robustness Test` pipeline for robustness test cases. REF: #568