Test use_sbd as a boolean #183

Draft
wants to merge 1 commit into
base: main

mpagot
Collaborator

@mpagot mpagot commented Oct 13, 2023

Verification

SBD

Azure

Inventory has

all:
  vars:
    use_sbd: true

All playbooks are called without any -e use_sbd=*: http://openqaworker15.qa.suse.cz/tests/249497/file/configure-qesap_azure_fullypatch.yaml

ansible:
  create:
    - registration.yaml -e reg_code='****' -e email_address='****'
    - fully-patch-system.yaml
    - pre-cluster.yaml
    - sap-hana-preconfigure.yaml -e use_reboottimeout=900
    - cluster_sbd_prep.yaml
    - sap-hana-storage.yaml
    - sap-hana-download-media.yaml
    - sap-hana-install.yaml
    - sap-hana-system-replication.yaml
    - sap-hana-system-replication-hooks.yaml
    - sap-hana-cluster.yaml

In the Ansible log http://openqaworker15.qa.suse.cz/tests/249497/logfile?filename=deploy-qesap_exec_ansible__profile.log.txt

DEBUG    OUTPUT: TASK [Variable use_sbd] ********************************************************
DEBUG    OUTPUT: task path: /root/qe-sap-deployment/ansible/playbooks/tasks/azure-cluster-hana.yaml:33
DEBUG    OUTPUT: Wednesday 18 October 2023  10:40:59 -0400 (0:00:01.807)       0:02:04.776 ***** 
DEBUG    OUTPUT: META: noop
DEBUG    OUTPUT: ok: [vmhana01] => {
DEBUG    OUTPUT:     "msg": "Cloud platform appears to be use_sbd:True use_sbd bool:True use_sbd string:False"
DEBUG    OUTPUT: }
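
For reference, a debug task along the following lines would print a message of this shape. This is only a sketch: the actual task in azure-cluster-hana.yaml is not quoted in this description, so the module and the three expressions below are assumptions reconstructed from the log line.

```yaml
# Hypothetical reconstruction from the log output; not copied from the playbook.
- name: Variable use_sbd
  ansible.builtin.debug:
    # Prints the raw value, its boolean coercion, and whether it is a string.
    msg: >-
      Cloud platform appears to be
      use_sbd:{{ use_sbd }}
      use_sbd bool:{{ use_sbd | bool }}
      use_sbd string:{{ use_sbd is string }}
```

With use_sbd: true coming from the inventory, the value is a real boolean, which would match the string:False seen above.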

All conditional tasks from ansible/playbooks/tasks/azure-cluster-bootstrap.yaml, such as Slurp SBD config, are executed.
The conditional tasks Set stonith-timeout [sdb] and Set stonith-timeout [azure fencing] from ansible/playbooks/tasks/azure-cluster-hana.yaml are not executed.
The task file tasks/cluster-bootstrap.yaml is not included, so none of its tasks are executed.

crm status after the deployment

Status of pacemakerd: 'Pacemaker is running' (last updated 2023-10-18 14:43:25Z)
Cluster Summary:
  * Stack: corosync
  * Current DC: vmhana01 (version 2.1.5+20221208.a3f44794f-150500.6.5.8-2.1.5+20221208.a3f44794f) - partition with quorum
  * Last updated: Wed Oct 18 14:43:25 2023
  * Last change:  Wed Oct 18 14:43:17 2023 by root via crm_attribute on vmhana01
  * 2 nodes configured
  * 9 resource instances configured

Node List:
  * Online: [ vmhana01 vmhana02 ]

Full List of Resources:
  * stonith-sbd	(stonith:external/sbd):	 Started vmhana01
  * Clone Set: cln_azure-events [rsc_azure-events]:
    * Started: [ vmhana01 vmhana02 ]
  * Clone Set: cln_SAPHanaTopology_HDB_HA000 [rsc_SAPHanaTopology_HDB_HA000]:
    * Started: [ vmhana01 vmhana02 ]
  * Clone Set: msl_SAPHana_HDB_HA000 [rsc_SAPHana_HDB_HA000] (promotable):
    * rsc_SAPHana_HDB_HA000	(ocf::suse:SAPHana):	 FAILED vmhana01 (Monitoring)
    * Slaves: [ vmhana02 ]
  * rsc_socat_HDB_HA000	(ocf::heartbeat:azure-lb):	 Started vmhana02
  * Resource Group: g_ip_HDB_HA000:
    * rsc_ip_HA000	(ocf::heartbeat:IPaddr2):	 Started vmhana01

Failed Resource Actions:
  * rsc_SAPHana_HDB_HA000_monitor_61000 on vmhana01 'promoted' (8): call=50, status='complete', last-rc-change='Wed Oct 18 14:42:27 2023', queued=0ms, exec=4896ms
  

AWS

VR using qesap_reg_aws_gcp_sbd

GCP

VR using qesap_reg_aws_gcp_sbd

Native

AWS

using -e use_sbd=false

Inventory has

all:
  vars:
    use_sbd: false

Ansible playbooks are called as http://openqaworker15.qa.suse.cz/tests/249498/file/configure-qesap_aws_fullypatch.yaml

- sap-hana-cluster.yaml -e use_sbd=false

The Ansible log has: http://openqaworker15.qa.suse.cz/tests/249498/logfile?filename=deploy-qesap_exec_ansible__profile.log.txt

ERROR    OUTPUT:          TASK [Variable use_sbd] ********************************************************
ERROR    OUTPUT:          task path: /root/qe-sap-deployment/ansible/playbooks/tasks/cluster-bootstrap.yaml:221
ERROR    OUTPUT:          Wednesday 18 October 2023  11:40:25 -0400 (0:00:00.096)       0:00:49.478 ***** 
ERROR    OUTPUT:          ok: [vmhana01] => {
ERROR    OUTPUT:              "msg": "Cloud platform appears to be use_sbd:false use_sbd bool:False use_sbd string:True"
ERROR    OUTPUT:          }
ERROR    OUTPUT:          ok: [vmhana02] => {
ERROR    OUTPUT:              "msg": "Cloud platform appears to be use_sbd:false use_sbd bool:False use_sbd string:True"
ERROR    OUTPUT:          }

So these tasks from ansible/playbooks/tasks/cluster-bootstrap.yaml:

  • [Enable SBD [sbd]]
  • [Set stonith-timeout [sdb]]
  • [Set stonith timeout [native - aws]]
  • [Enable stonith]
  • [Disable stonith action [aws]]

are all skipped. Ansible then fails with a fatal error at task Configure cluster IP [aws]:

ERROR    OUTPUT:                      "_raw_params": "crm configure primitive rsc_ip_HDB_HDB00 ocf:suse:aws-vpc-move-ip params ip=192.168.1.10 routing_table=rtb-02e3b718a0f668552 interface=eth0 profile=default op start interval=0 timeout=180 op stop interval=0 timeout=180 op monitor interval=60 timeout=60",

...

ERROR    OUTPUT:              "rc": 1,

...

ERROR    OUTPUT:              "stderr_lines": [
ERROR    OUTPUT:                  "\u001b[33mWARNING\u001b[0m: (cluster_status) \twarning: Fencing and resource management disabled due to lack of quorum",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: (unpack_resources) \terror: Resource start-up disabled since no STONITH resources have been defined",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: (unpack_resources) \terror: Either configure some or disable STONITH with the stonith-enabled option",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: (unpack_resources) \terror: NOTE: Clusters with shared data need STONITH to ensure data integrity",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: crm_verify: Errors found during check: config not valid"
ERROR    OUTPUT:              ],
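
The use_sbd string:True value in the debug output above is consistent with how Ansible treats -e key=value arguments: they are passed as strings, so use_sbd=false arrives as the string "false" rather than a boolean. A minimal sketch of conditions that behave the same for a real boolean and for the string form is shown below. The task names and bodies are hypothetical and are not the tasks in cluster-bootstrap.yaml.

```yaml
# Hypothetical tasks, for illustration only.
- name: Example task that should run only with SBD
  ansible.builtin.debug:
    msg: "SBD path"
  # The bool filter maps true/"true"/"yes" to True and false/"false"/"no" to False.
  when: use_sbd | bool

- name: Example task that should run only with native fencing
  ansible.builtin.debug:
    msg: "native fencing path"
  when: not (use_sbd | bool)
```

Alternatively, passing the variable in JSON form, for example -e '{"use_sbd": false}', keeps it a boolean on the command line.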

not using -e use_sbd=false

Obtained by also including os-autoinst/os-autoinst-distri-opensuse#17978

GCP

using -e use_sbd=false

not using -e use_sbd=false

Obtained by also including os-autoinst/os-autoinst-distri-opensuse#17978

VR: sle-15-SP5-Qesap-Gcp-Byos-x86_64-BuildLATEST_GCE_SLE15_5_BYOS-qesap_gcp_sapconf_test@64bit -> http://openqaworker15.qa.suse.cz/tests/246979

Inventory has

all:
  vars:
    use_sbd: false

and playbooks are called without any -e use_sbd= http://openqaworker15.qa.suse.cz/tests/246979/file/configure-qesap_gcp_sapconf.yaml

The task Enable SBD [sbd] in the Ansible log is now skipped:

ERROR    OUTPUT:          TASK [Enable SBD [sbd]] ********************************************************
ERROR    OUTPUT:          task path: /root/qe-sap-deployment/ansible/playbooks/tasks/cluster-bootstrap.yaml:225
ERROR    OUTPUT:          Friday 13 October 2023  10:38:33 -0400 (0:00:00.070)       0:00:40.621 ******** 
ERROR    OUTPUT:          skipping: [qesapval246979-vmhana01] => {
ERROR    OUTPUT:              "changed": false,
ERROR    OUTPUT:              "skip_reason": "Conditional result was False"
ERROR    OUTPUT:          }
ERROR    OUTPUT:          skipping: [qesapval246979-vmhana02] => {
ERROR    OUTPUT:              "changed": false,
ERROR    OUTPUT:              "skip_reason": "Conditional result was False"
ERROR    OUTPUT:          }

But Ansible fails later with

ERROR    OUTPUT:          TASK [Configure cluster IP [gcp]] **********************************************


...

ERROR    OUTPUT:          fatal: [qesapval246979-vmhana01]: FAILED! => {

...

ERROR    OUTPUT:                      "_raw_params": "crm configure primitive rsc_ip_HDB_HDB00 IPaddr2 params ip=10.0.0.12 cidr_netmask=32 nic=eth0 op monitor interval=3600s timeout=60s",


...

ERROR    OUTPUT:              "rc": 1,

...

ERROR    OUTPUT:              "stderr_lines": [
ERROR    OUTPUT:                  "\u001b[33mWARNING\u001b[0m: (cluster_status) \twarning: Fencing and resource management disabled due to lack of quorum",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: (unpack_resources) \terror: Resource start-up disabled since no STONITH resources have been defined",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: (unpack_resources) \terror: Either configure some or disable STONITH with the stonith-enabled option",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: (unpack_resources) \terror: NOTE: Clusters with shared data need STONITH to ensure data integrity",
ERROR    OUTPUT:                  "\u001b[31mERROR\u001b[0m: crm_verify: Errors found during check: config not valid"
ERROR    OUTPUT:              ],


ERROR    OUTPUT:              "stdout": "",

crm status after the deployment is not executed

Remove assignment to constant value
Add debug task to print what is coming from the external