auto-revert SIGSEGV #426

Open
ricochet1k opened this issue Sep 28, 2021 · 2 comments · Fixed by cyrilgdn/levant#1 · May be fixed by #473
Labels: stage/needs-discussion, theme/deploy (Relates to the deployment of jobs), type/bug

Comments

@ricochet1k

Description

Levant crashes every time I deploy a job with a failing check when it gets to the auto-revert state.

By the way, why does Levant even do auto-revert, since Nomad already returns to a stable working version on its own? Can we add a flag to disable Levant's auto-revert functionality?

2021-09-27T23:58:41Z |INFO| levant/auto_revert: job the-job has entered auto-revert state; launching auto-revert checker job_id=the-job
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x77098e]

goroutine 1 [running]:
github.com/hashicorp/levant/levant.(*levantDeployment).autoRevert(0xc0003fd040, 0xc00011e2f0, 0xc00011e2d0)
	/home/circleci/project/project/levant/auto_revert.go:25 +0x16e
github.com/hashicorp/levant/levant.(*levantDeployment).checkAutoRevert(0xc0003fd040, 0xc00011e2d0)
	/home/circleci/project/project/levant/auto_revert.go:71 +0x186
github.com/hashicorp/levant/levant.(*levantDeployment).deploy(0xc0003fd040, 0x1)
	/home/circleci/project/project/levant/deploy.go:195 +0x505
github.com/hashicorp/levant/levant.TriggerDeployment(0xc0000ccd20, 0x0, 0xc0001696f0)
	/home/circleci/project/project/levant/deploy.go:81 +0x7e
github.com/hashicorp/levant/command.(*DeployCommand).Run(0xc0000aedf8, 0xc0000c0020, 0x3, 0x3, 0xc0000ccb00)
	/home/circleci/project/project/command/deploy.go:197 +0x939
github.com/mitchellh/cli.(*CLI).Run(0xc0000edcc0, 0xc0000edcc0, 0x7, 0xc0000aed08)
	/go/pkg/mod/github.com/mitchellh/[email protected]/cli.go:260 +0x41a
main.RunCustom(0xc0000c0010, 0x4, 0x4, 0xc000196870, 0x406365)
	/home/circleci/project/project/main.go:49 +0x33e
main.Run(0xc0000c0010, 0x4, 0x4, 0xc000094058)
	/home/circleci/project/project/main.go:17 +0x56
main.main()
	/home/circleci/project/project/main.go:11 +0x65

Relevant Nomad job specification file
I don't think this is very relevant; it happens every time.

Output of levant version:

Levant v0.3.0

Output of consul version:

Consul v1.10.0
Revision 27de64da7
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

Output of nomad version:

Nomad v1.1.2 (60638a086ef9630e2a9ba1e237e8426192a44244)

Additional environment details:

Debug log outputs from Levant:

@rytyr

rytyr commented Dec 7, 2022

We also hit a similar issue when enabling auto_revert with the hashicorp/levant:0.3.1 Docker image.

2022-12-07T04:19:10Z |INFO| levant/auto_revert: job develop--vas-service has entered auto-revert state; launching auto-revert checker job_id=develop--job
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x713dc4]
goroutine 1 [running]:
github.com/hashicorp/levant/levant.(*levantDeployment).autoRevert(0xc0003433f0, 0xc00043f2b0, 0xc00043f290)
	/home/circleci/project/project/levant/auto_revert.go:25 +0x124
github.com/hashicorp/levant/levant.(*levantDeployment).checkAutoRevert(0xc000177bb0, 0xc00043f290)
	/home/circleci/project/project/levant/auto_revert.go:71 +0x129
github.com/hashicorp/levant/levant.(*levantDeployment).deploy(0xc0003433f0)
	/home/circleci/project/project/levant/deploy.go:193 +0x377
github.com/hashicorp/levant/levant.TriggerDeployment(0x7ffc04bfbf70, 0x4)
	/home/circleci/project/project/levant/deploy.go:81 +0x45
github.com/hashicorp/levant/command.(*DeployCommand).Run(0xc00000ce70, {0xc00001e0a0, 0x2, 0x2})
	/home/circleci/project/project/command/deploy.go:197 +0x934
github.com/mitchellh/cli.(*CLI).Run(0xc00022e000)
	/go/pkg/mod/github.com/mitchellh/[email protected]/cli.go:260 +0x5f8
main.RunCustom({0xc00001e090, 0xc0000f9f70, 0x405d79}, 0xc0001e0960)
	/home/circleci/project/project/main.go:49 +0x26a
main.Run({0xc00001e090, 0x3, 0x3})
	/home/circleci/project/project/main.go:17 +0x45
main.main()
	/home/circleci/project/project/main.go:11 +0x50

Is this normal/expected behavior when doing an auto-revert?

@cyrilgdn

Hi,

I have a similar problem. After looking at the code, I discovered that it happens for jobs that are not in the default namespace.
This call:

dep, _, err := l.nomad.Jobs().LatestDeployment(*jobID, nil)
does not pass the namespace, so it doesn't find the deployment (and dep is nil).
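
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the actual patch and does not use the real Nomad client: `fakeJobs`, `QueryOptions`, `Deployment`, `latestDeploymentID` and `errNoDeployment` are hypothetical stand-ins, and the stand-in `LatestDeployment` returns two values rather than the three the real call returns. It assumes, as described above, that a lookup without a namespace falls back to the default namespace and returns a nil deployment with no error. The idea is to pass the job's namespace in the query options and to check `dep` for nil before using it.

```go
package main

import (
	"errors"
	"fmt"
)

// QueryOptions and Deployment are hypothetical stand-ins for the Nomad API
// types of the same name; only the fields needed here are modelled.
type QueryOptions struct {
	Namespace string
}

type Deployment struct {
	ID        string
	Namespace string
}

// fakeJobs is a hypothetical in-memory stand-in for the Nomad jobs client.
// Deployments are keyed by "namespace/jobID".
type fakeJobs struct {
	deployments map[string]*Deployment
}

// LatestDeployment mimics the behaviour described in this thread: with nil
// options the lookup uses the "default" namespace, and a miss yields a nil
// deployment together with a nil error.
func (j *fakeJobs) LatestDeployment(jobID string, q *QueryOptions) (*Deployment, error) {
	ns := "default"
	if q != nil && q.Namespace != "" {
		ns = q.Namespace
	}
	return j.deployments[ns+"/"+jobID], nil
}

var errNoDeployment = errors.New("no deployment found for job")

// latestDeploymentID passes the job's namespace through and guards against a
// nil deployment instead of dereferencing it.
func latestDeploymentID(j *fakeJobs, jobID, namespace string) (string, error) {
	dep, err := j.LatestDeployment(jobID, &QueryOptions{Namespace: namespace})
	if err != nil {
		return "", err
	}
	if dep == nil {
		return "", errNoDeployment
	}
	return dep.ID, nil
}

func main() {
	jobs := &fakeJobs{deployments: map[string]*Deployment{
		"staging/the-job": {ID: "dep-1", Namespace: "staging"},
	}}

	// Without options the deployment is not found, so dep is nil.
	dep, _ := jobs.LatestDeployment("the-job", nil)
	fmt.Println("nil options, dep == nil:", dep == nil)

	// With the namespace passed through, the deployment is found.
	id, err := latestDeploymentID(jobs, "the-job", "staging")
	fmt.Println("with namespace:", id, err)

	// A miss now surfaces as an error instead of a nil pointer dereference.
	_, err = latestDeploymentID(jobs, "the-job", "default")
	fmt.Println("wrong namespace:", err)
}
```

In the real code the equivalent change would presumably be to replace the `nil` argument with query options carrying the job's namespace, plus a nil check on `dep`; where exactly the namespace comes from in Levant is not shown in this thread.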

I have a fix and will try to create a PR. I'm not sure it will be merged, though, as Levant development seems to have stopped (I think they focus more on the Nomad CLI / nomad-pack, IIRC). I switched to custom releases from my personal fork (to get features like HCL2 support from #398 and fixes like this one), as Levant is still a very convenient tool with a better deployment experience than the Nomad CLI currently offers (specifically for deployments from CI/CD).

4 participants