Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add external ports to job automatically when promMetrics are enabled #3643

Merged
merged 4 commits into from
Jul 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -246,7 +246,6 @@ export class K8sResource {
.push(
{
name: portValue.name,
// @ts-expect-error TODO: don't know types here
containerPort: portValue.port
}
);
Expand Down
32 changes: 32 additions & 0 deletions packages/teraslice/src/lib/cluster/services/jobs.ts
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ export class JobsService {
});
}

this.addExternalPortsToJobSpec(jobSpec);
const validJob = await this._validateJobSpec(jobSpec);

// We don't create with the fully parsed validJob as it changes the asset names
Expand Down Expand Up @@ -122,6 +123,7 @@ export class JobsService {
if (originalJob.active !== false && jobSpec.active === false) {
this.logger.info(`Skipping job validation to set jobId ${jobId} as _inactive`);
} else {
this.addExternalPortsToJobSpec(jobSpec);
await this._validateJobSpec(jobSpec);
}

Expand Down Expand Up @@ -338,4 +340,34 @@ export class JobsService {

return parsedAssetJob;
}

/**
* Automatically add external_ports to jobSpec if needed.
* This ensures that the Prometheus exporter server can be scraped.
* Check if prom_metrics_enabled is true on jobSpec or teraslice config.
* If so, add or update external_ports property with correct port.
* @param {Partial<JobConfig>} jobSpec
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should describe the precedence behavior in the docs here:

https://terascope.github.io/teraslice/docs/configuration/clustering-k8s#prometheus-metrics

Otherwise this is good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following properties will override the matching property set within the terafoundation configuration.

I'm guessing this sentence from the docs is not clear enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I need to think through that logic more carefully overall.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know why this is hard.

Technically, I don't think we have enough properties to control all the conditions an operator/user may want. We have a single prom_metrics_enabled property, set either at the Terafoundation or the job level. But we may want the ability to trigger Prometheus independently on the master or on the jobs. That being the case what does the Terafoundation level prom_metrics_enabled represent? Is it the setting for the metrics on the master or is it the default for the jobs?

I hadn't really considered this until I saw your conditional:

        if (enabledInJob === true || (enabledInJob === undefined && enabledInTF)) {

and wondered what enabledInJob === false would imply.

I think the logic you have here is a great compromise.

Here are the conditions we hypothetically want to support:

  • master metrics off, job metrics off
    • no settings anywhere
    • good
  • master metrics on, job metrics off
    • enabledInTF: True and enabledInJob: False in all jobs
    • I can see this one potentially being a problem, lets say the metrics are costly somehow but we want them in the master, but not in the jobs, we will be forced to set this for prom_metrics_enabled: False in EVERY job to work around this.
  • master metrics on, job metrics on
    • enabledInTF: True
    • good
  • master metrics off, job metrics on
    • enabledInTF: False and enabledInJob: True in all jobs
    • this is pretty good, because the odds of not wanting master metrics is low

What I'm getting at is our properties were kind of wrong from the start, for completeness they might need to be:

  • Terafoundation
    • prom_metrics_enabled_master
    • prom_metrics_enabled_job_default
  • Job
    • prom_metrics_enabled

But really this all seems a little fussy, unless we really need it for that second scenario above.

addExternalPortsToJobSpec(jobSpec: Partial<JobConfig>) {
const {
prom_metrics_enabled: enabledInTF,
prom_metrics_port: tfPort
} = this.context.sysconfig.terafoundation;
const { prom_metrics_enabled: enabledInJob, prom_metrics_port: jobPort } = jobSpec;
const portToUse: number = jobPort || tfPort;

if (enabledInJob === true || (enabledInJob === undefined && enabledInTF)) {
let portPresent = false;
if (!jobSpec.external_ports) {
jobSpec.external_ports = [];
}
for (const item of jobSpec.external_ports) {
const currentPort = typeof item === 'number' ? item : item.port;
if (currentPort === portToUse) portPresent = true;
}
if (!portPresent) {
jobSpec.external_ports.push({ name: 'metrics', port: portToUse });
}
}
}
}
2 changes: 1 addition & 1 deletion packages/types/src/teraslice.ts
Original file line number Diff line number Diff line change
Expand Up @@ -308,7 +308,7 @@ export interface Targets {

export interface ExternalPort {
name: string;
containerPort: number
port: number
}

export interface Volume {
Expand Down
Loading