Node add isn't consistently working #728

Open
gsstoykov opened this issue Oct 22, 2024 · 1 comment
Labels
Bug: An error that causes the feature to behave differently than what was expected based on design docs. Pending Triage: New issue that needs to be triaged by the team.

Comments

gsstoykov commented Oct 22, 2024

To Reproduce

Initialisation steps from #727 and:

npm run solo -- node add --gossip-keys true --tls-keys true --release-tag v0.54.0-alpha.4 --namespace solo-e2e

Describe the bug

◼ Finalize
node:internal/process/promises:289
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "#<ErrorEvent>".] {
  code: 'ERR_UNHANDLED_REJECTION'
}

Node.js v21.7.1

I've also seen it fail with the error from #727.
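
The reason string "#<ErrorEvent>" suggests the promise was rejected with an object that Node.js did not treat as an Error, which would explain why no message or stack trace is printed. The log does not show which component produced that object. As a minimal debugging sketch (not Solo's actual code), a process-level handler could convert such a reason into an Error before logging it. The `ErrorEvent` class below is a hypothetical stand-in for whatever object was rejected, and `toError` and `installRejectionLogger` are illustrative names, not part of any library.

```javascript
// Hypothetical stand-in for the rejected object; the real class is not
// identified in the log, only its constructor name "ErrorEvent".
class ErrorEvent {
  constructor(type, init = {}) {
    this.type = type;
    this.message = init.message ?? '';
    this.error = init.error;
  }
}

// Convert an arbitrary rejection reason into an Error so that logging it
// yields a message and a stack trace instead of "#<ErrorEvent>".
function toError(reason) {
  if (reason instanceof Error) return reason;
  if (reason !== null && typeof reason === 'object') {
    // Event-like objects often carry the underlying Error in `.error`.
    if (reason.error instanceof Error) return reason.error;
    const name = reason.constructor?.name ?? 'Object';
    const detail =
      typeof reason.message === 'string' && reason.message !== ''
        ? reason.message
        : 'no message';
    return new Error(`${name}: ${detail}`);
  }
  return new Error(String(reason));
}

// Register a process-level handler that logs the normalized error and marks
// the run as failed. Returns a function that removes the handler again.
function installRejectionLogger(log = console.error) {
  const handler = (reason) => {
    log(toError(reason));
    process.exitCode = 1;
  };
  process.on('unhandledRejection', handler);
  return () => process.off('unhandledRejection', handler);
}
```

Calling `installRejectionLogger()` once at startup would be enough to surface the underlying message on the next failing run. It only improves the diagnostics and does not address whatever rejects the promise.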

Describe the expected behavior

The node is added and functioning. The failure does not happen every time, but the command is not reliable enough for our testing.

Whole JUnit/CLI Logs

npm run solo -- node add --gossip-keys true --tls-keys true --release-tag v0.54.0-alpha.4 --namespace solo-e2e

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs node add --gossip-keys true --tls-keys true --release-tag v0.54.0-alpha.4 --namespace solo-e2e


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize [0.1s]
✔ Check that PVCs are enabled
✔ Identify existing network nodes
  ✔ Check network pod: node1
✔ Determine new node account number
✔ Generate Gossip key [0.3s]
  ✔ Backup old files
  ✔ Gossip key for node: node2 [0.3s]
✔ Generate gRPC TLS key [0.4s]
  ✔ Backup old files
  ✔ TLS key for node: node2 [0.4s]
✔ Load signing key certificate
✔ Compute mTLS certificate hash
✔ Prepare gossip endpoints
✔ Prepare grpc service endpoints
✔ Prepare upgrade zip file for node upgrade process [2s]
✔ Check existing nodes staked amount [2s]
✔ Send node create transaction [2s]
✔ Send prepare upgrade transaction [4s]
✔ Send freeze upgrade transaction [2s]
✔ Download generated files from an existing node [0.5s]
✔ Prepare staging directory
  ✔ Copy Gossip keys to staging
  ✔ Copy gRPC TLS keys to staging
✔ Copy node keys to secrets [0.1s]
  ✔ Copy TLS keys [0.1s]
  ✔ Node: node1
    ✔ Copy Gossip keys
  ✔ Node: node2
    ✔ Copy Gossip keys
✔ Check network nodes are frozen [9s]
  ✔ Check network pod: node1  - status FREEZE_COMPLETE, attempt: 3/120 [9s]
✔ Get node logs and configs [2s]
✔ Deploy new network node [5s]
✔ Kill nodes to pick up updated configMaps
✔ Check node pods are running [58s]
  ✔ Check Node: node1
  ✔ Check Node: node2 [58s]
❯ Fetch platform software into all network nodes
  ⠇ Update node: node1 [ platformVersion = v0.54.0-alpha.4 ]
  ⠇ Update node: node2 [ platformVersion = v0.54.0-alpha.4 ]
◼ Download last state from an existing node
◼ Upload last saved state to new network node
◼ Setup new network node
◼ Start network nodes
◼ Enable port forwarding for JVM debugger
◼ Check all nodes are ACTIVE
◼ Check all node proxies are ACTIVE
◼ Stake new node
◼ Trigger stake weight calculate
◼ Finalize
node:internal/process/promises:289
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "#<ErrorEvent>".] {
  code: 'ERR_UNHANDLED_REJECTION'
}

Node.js v21.7.1

Additional Context

No response

gsstoykov added the Bug and Pending Triage labels Oct 22, 2024
gsstoykov (Author) commented

I also tried the same flow using the C++ SDK's NodeCreateTransaction, followed by npm run solo -- node add-execute --input-dir context. The node pod appears to be created correctly, and the setup and start steps pass, but node2 never becomes ACTIVE. The log:

npm run solo -- node add-execute --input-dir context

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs node add-execute --input-dir context


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize [0.1s]
✔ Identify existing network nodes
  ✔ Check network pod: node1
✔ Load context data
✔ Download generated files from an existing node [0.4s]
✔ Prepare staging directory
  ✔ Copy Gossip keys to staging
  ✔ Copy gRPC TLS keys to staging
✔ Copy node keys to secrets
  ✔ Copy TLS keys
  ✔ Node: node1
    ✔ Copy Gossip keys
  ✔ Node: node2
    ✔ Copy Gossip keys
✔ Check network nodes are frozen [6s]
  ✔ Check network pod: node1  - status FREEZE_COMPLETE, attempt: 0/120 [6s]
✔ Get node logs and configs [8s]
✔ Deploy new network node [2s]
✔ Kill nodes to pick up updated configMaps
✔ Check node pods are running [30s]
  ✔ Check Node: node1
  ✔ Check Node: node2 [30s]
✔ Fetch platform software into all network nodes [5s]
  ✔ Update node: node1 [ platformVersion = v0.54.0-alpha.4 ] [5s]
  ✔ Update node: node2 [ platformVersion = v0.54.0-alpha.4 ] [5s]
✔ Download last state from an existing node [0.4s]
✔ Upload last saved state to new network node [0.4s]
✔ Setup new network node [0.1s]
  ✔ Node: node1 [0.1s]
    ✔ Set file permissions [0.1s]
  ✔ Node: node2
    ✔ Set file permissions
✔ Start network nodes [0.1s]
  ✔ Start node: node1
  ✔ Start node: node2
↓ Enable port forwarding for JVM debugger
❯ Check all nodes are ACTIVE
  ✔ Check network pod: node1  - status ACTIVE, attempt: 16/120 [24s]
  ✖ node 'node2' is not ACTIVE[ attempt = 120/120 ]
◼ Check all node proxies are ACTIVE
◼ Stake new node
◼ Trigger stake weight calculate
◼ Finalize
*********************************** ERROR *****************************************
Error in setting up nodes: node 'node2' is not ACTIVE[ attempt = 120/120 ]
***********************************************************************************
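
The "attempt = 120/120" message indicates a status check that is retried up to a fixed limit and then gives up. The sketch below shows that general pattern under stated assumptions. It is not Solo's implementation: `waitForStatus`, its options, and the `getStatus` callback are hypothetical names, and the delay between attempts is a guess. Including the last observed status in the error is one way such a check could make a failure like the one above easier to diagnose.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `getStatus` until it returns `want` or `maxAttempts` is reached.
// `getStatus` is a caller-supplied async function (hypothetical here).
async function waitForStatus(
  getStatus,
  { want = 'ACTIVE', maxAttempts = 120, delayMs = 1000 } = {},
) {
  let last;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      last = await getStatus();
    } catch (err) {
      // Treat a failed probe as "not ready yet" but remember why it failed.
      last = `error: ${err.message}`;
    }
    if (last === want) return { attempt, status: last };
    // Do not sleep after the final attempt.
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  throw new Error(
    `not ${want} [ attempt = ${maxAttempts}/${maxAttempts}, last status = ${last} ]`,
  );
}
```

With a callback that reads the node's current status, a caller would await `waitForStatus(readStatus)` and, on timeout, see which status the node was stuck in rather than only the attempt count.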
