-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC-32874 Add WsLogAcces Health Report Method #19307
base: candidate-9.8.x
Are you sure you want to change the base?
HPCC-32874 Add WsLogAcces Health Report Method #19307
Conversation
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32874 Jirabot Action Result: |
Add WsLogAcces Health Report Method - Adds healthreport method to logaccess interface - Implements healthreport method for all logaccess plugins Signed-off-by: Rodrigo Pastrana <[email protected]>
9224ed6
to
f77d122
Compare
@jeclrsg I added this method to help developers debug config/connection logaccess issues, but thought ECLWatch might be able to make use of it as well. As of now, the different plugins (Elastic Stack, Azure, loki) generate their own reports with unique content related to the particular plugin, but we could enforce certain entries if you think ECLWatch could use them. Let me know what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The biggest item in my mind is a preference for a common structured output. Second are questions about encoding output for security- I'm in that mindset now, but not sure what would be best in this case.
esp/scm/ws_logaccess.ecm
Outdated
bool IncludeSampleQuery(true); | ||
}; | ||
|
||
ESPResponse GetHealthReportResponse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should have [exceptions_inline] also
esp/scm/ws_logaccess.ecm
Outdated
|
||
ESPResponse GetHealthReportResponse | ||
{ | ||
string Report; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd much rather see a structured response that has some common subset of useful values for ECLWatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, commented below for further discussion
bool success = true; | ||
if (!queryRemoteLogAccessor()) | ||
{ | ||
report.append("\"Error\": \"LogAccess plugin not available, review logAccess configuration!\""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also be part of a structured response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In agreement. what is the ESP standard for these types of messages? (ie, what's the expected structure format)
resp.setReport(report.str()); | ||
|
||
return success; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing final newline
system/jlib/jlog.hpp
Outdated
@@ -1675,6 +1710,106 @@ struct LogQueryResultDetails | |||
unsigned int totalReceived; | |||
unsigned int totalAvailable; | |||
}; | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing before you finalize you'll remove these comments and those in WsLogAccessService.cpp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some, but kept the structure around as a mechanism to track structured responses for the ESP
{ | ||
StringBuffer configJSON; | ||
toJSON(m_pluginCfg, configJSON, 0, JSON_Format|JSON_HideRootArrayObject); | ||
report.appendf("\"ConfigurationTree\": %s", configJSON.str()); //json encode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is supposed to be more JSON appended to the report, not a string that contains the JSON configuration, correct? If it is the latter then it probably should be encoded.
{ | ||
StringBuffer configJSON; | ||
toJSON(m_pluginCfg, configJSON, 0); | ||
report.appendf("\"ConfigurationTree\": %s", configJSON.str()); //json encode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here as in ElasticStackLogAccess.cpp about JSON vs JSON in a string which needs encoding.
StringBuffer configJSON; | ||
toJSON(m_pluginCfg, configJSON, 0); | ||
report.appendf("\"ConfigurationTree\": %s", configJSON.str()); //json encode | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any sensitive information in the configuration that shouldn't be exposed, like maybe tokens or passwords or internal URLs? On the other hand maybe this is a moot question since ECLWatch already gives access to component configs for roxie, esp etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair question, I don't think we support credentials in the config, but I'll make sure we don't, and if we do I'll mask it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HELM chart Schema doesn't call for any sensitive data, I think we're ok.
HPCC-Platform/helm/hpcc/values.schema.json
Line 3192 in 70ccefc
"logAccess": { |
StringBuffer queryString, queryIndex; | ||
populateKQLQueryString(queryString, queryIndex, queryOptions); | ||
|
||
StringBuffer encodedValue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the only string that needs to be encoded, or are there other cases where user input could be part of any value returned in the report? Would it be a good idea to encode all strings returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, within the sample query section, yes. But point taken, I'll look around and see if anything else could/should be encoded.
@@ -107,4 +109,5 @@ class GrafanaLogAccessCurlClient : public CInterfaceOf<IRemoteLogAccess> | |||
virtual IRemoteLogAccessStream * getLogReader(const LogAccessConditions & options, LogAccessLogFormat format) override; | |||
virtual IRemoteLogAccessStream * getLogReader(const LogAccessConditions & options, LogAccessLogFormat format, unsigned int pageSize) override; | |||
virtual bool supportsResultPaging() const override { return false;} | |||
virtual bool healthReport(StringBuffer & report, LogAccessHealthReportOptions options) override; | |||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing final newline
report.append("\"ConnectionInfo\": { "); | ||
report.appendf("\"ConnectionString\": \"%s\"", m_grafanaConnectionStr.str()); | ||
report.appendf(", \"UserName\": \"%s\"", m_grafanaUserName.str()); | ||
report.appendf(", \"PasswordProvided\": %s", isEmptyString(m_grafanaPassword.str()) ? "true" : "false"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the only thing I noticed that Terrence hadn't mentioned already. Are these backwards? Seems like if isEmptyString(...) then PasswordProvided
should be set to false
.
Thanks for the review guys (@asselitx / @jeclrsg ) I'll work those suggestions in today. I do want to discuss the suggestion to provide further structure, which I fully agree with, and I'm glad it was brought up. Since the method is not meant for client consumption (it's really just for developer/operations debugging) it doesn't Here's my thought, if ECLWatch commits to consuming some common values, we'll go through the effort of structuring the report. Something like:
It feels to me like ECLWatch would only benefit from the Report/Conection/Status. Thoughts? |
@rpastrana That makes sense to me. I was looking at some of the docs yesterday, and one of them (either Grafana or Kibana) made it sound like they had an api status URI that only returned a 200 (no actual response body) if everything is ok. But I'd agree that ECL Watch probably only needs to get a status flag, and if the engine isn't available some message as to why. |
@@ -11,7 +11,7 @@ global: | |||
id: "1" | |||
name: "Loki" | |||
namespace: | |||
name: "hpcc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to revert this change
A "status" and "message" field makes sense to me too- really whatever it is that the UI would need. Then if more details are needed in the future we could extend the interface. I think it still makes sense to return discrete fields that don't require further parsing and have a neat SOAP representation even though ECLWatch uses (exclusively?) JSON. If there is also a need for the full raw "health report" for debugging/troubleshooting exposed through ESP I'd not be opposed to that as a separate method. Trying to make a common representation for that probably doesn't make sense and may even make it less useful. |
@asselitx I think your comment makes a lot of sense. I'm wondering though if this single method could/should serve the debug report based on a flag. IE by default it only reports well established "status" and "message", but also optionally provides the debug report as a separate text field (in JSON format). |
- Adds ESP respons structure - Introduces top level gree/yellow/red status concept - Introduces jlog structs to help status reporting - Utilizes JSON appending/encoding functionality - Various other minor changes Signed-off-by: Rodrigo Pastrana <[email protected]>
bool IncludeSampleQuery(false); | ||
}; | ||
|
||
ESPenum LogAccessStatusCode : int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeclrsg this service can either provide a green/yellow/red type status, or more specific status codes as in the enum below. Let me know what you'd prefer.
@asselitx updated the ESP interface and some internal jlog interfaces to match the suggestions from code review. I only updated the Azure client, and will update the other clients once we finalize the ESP interface. |
@jeclrsg take a look at the latest proposed format. I'm not sure about the color based status/code (green|yellow|green), but the alternative is for the clients to send back a numeric value representing one of many possible statuses which you would then interpret as success, success with warnings, failure. Let me know what you think |
@rpastrana Yeah, I would vote for status values being explicit, rather than subjective like the colors. I'd actually thought (as a health check type of api) status would be a boolean for if the engine is currently available & if not the message would communicate why. |
I see the health status as at least 3 distinct states: 1 healthy (configured correctly, timely connection, timely response, reasonable response content) 2 Warning (configured correctly, connected (could be lengthy connection), received response (could be sluggish), no logs returned (could signify no logs available) 3 Error (configuration error? connection error? invalid or no response?) I can convert the color code to a 3 entry enumeration representing those states and ensure a message is provided with details. thoughts? |
- Changed status codes to success/warning/fail Signed-off-by: Rodrigo Pastrana <[email protected]>
Updated status enumeration values: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Rodrigo this update looks great. I like the approach to have the default behavior return just a Status or maybe Code, with options to include the full raw dumps of info. That gives a good balance between needing a simple check and more in-depth debugging details.
Use of the JSON append functions makes the report-building more readable which is nice.
I'm assuming you've just fleshed out the Azure implementation for now and would do similar for the others once you know to use the Status vs. Code approach.
My gut says that the Codes may be more useful, but I'd defer to Jeremy's preference. If we stick with the stoplight Status, I'd suggest commenting somewhere generally what the colors mean or what sorts of problems would put you in yellow vs. red.
HPCC-32874 Add WsLogAcces Health Report Method
Type of change:
Checklist:
Smoketest:
Testing: