-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replace check-instance-ready endpoint to use status instead of stats from CanarieAPI #293
base: master
Are you sure you want to change the base?
Conversation
run tests |
@matprov
|
@fmigneault Yeah that's totally fine - I'm currently changing things to make sure pipeline don't get run over non-existent branches, but changes are in progress. This issue has been introduced some minutes ago. I'll run the tests for this PR when it'll be stable. |
run tests |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1238/Result : failure BIRDHOUSE_DEPLOY_BRANCH : check-instance-ready DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-91.rdext.crim.ca Infrastructure deployment failed. Instance has not been destroyed. @matprov |
@fmigneault
|
@matprov |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac/60/Result : failure BIRDHOUSE_DEPLOY_BRANCH : check-instance-ready DACCS_CONFIGS_BRANCH : PAVICS_E2E_WORKFLOW_TESTS_BRANCH : PAVICS_SDI_BRANCH : DESTROY_INFRA_ON_EXIT : PAVICS_HOST : https:// PAVICS-e2e-workflow-tests Pipeline ResultsTests URL :NOTEBOOK TEST RESULTS
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to keep /stats
.
@@ -13,29 +13,31 @@ COMPOSE_DIR="`dirname "$THIS_DIR"`" | |||
|
|||
if [ -f "$COMPOSE_DIR/env.local" ]; then | |||
# Get PAVICS_FQDN | |||
. $COMPOSE_DIR/env.local | |||
. "${COMPOSE_DIR}/env.local" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watch out, this change will conflict with the DELAYED_EVAL PR.
fi | ||
|
||
MONITOR_URL="https://${PAVICS_FQDN}/canarie/node/service/status" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing we could do is use /status
with the Accept: application/json
to better validate the contents.
I think the 200 is returned only for the HTML representation (though the correct code could be returned in that case as well...)
Technically, /stats
could be completely empty if it never (yet) ran the log parsing cron job.
I've encountered this issue recently in the PR pipeline where tests were started too early because /stats
looked OK as everything was empty.
If Solr and ncWMS2 frequently break in your pipeline, we can remove them from the canarie-api montoring list since they are scheduled to be removed anyways. This will probably make |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1256/Result : failure BIRDHOUSE_DEPLOY_BRANCH : check-instance-ready DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-216.rdext.crim.ca PAVICS-e2e-workflow-tests Pipeline ResultsTests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/962/NOTEBOOK TEST RESULTS |
@tlvu |
Misha will do the removal for part 1 of the components refactoring. It might take a while as he is ramping up. If this is frequently causing issues with CRIM pipeline, you can remove them early in this canarie-api monitoring. Talking of monitoring, you would want to add Weaver to this canarie-api monitoring at the same time. |
Will do in #284
It's already included in it. Same with Cowbird. |
Overview
Fix errors seen in #283 (comment) and #283 (comment).
More precisely, the
/stats
endpoint indicates 503, while all services under it seems OK(https://host-140-154.rdext.crim.ca/canarie/node/service/stats)
Using
/status
is more reliable(https://host-140-154.rdext.crim.ca/canarie/node/service/status)
Changes
Non-breaking changes
Scripts: fix
check-instance-ready
script.Previously employed
/canarie/node/service/stats
endpoint could be unreliable for some services under the node thatproduced log collection errors to populate stats. Instead, use
/canarie/node/service/status
that check only if theservices are responsive according to configured endpoints under CanarieAPI. This status endpoint is the same one that
is employed by the CI test suite to check that the instance is ready before starting notebook tests.
Breaking changes
Related Issue / Discussion