fix: migrate metatags audit to AuditBuilder #535

Merged
merged 34 commits into from
Jan 13, 2025
Changes from 15 commits
Commits
1909c24
refactor: change defaultMessageSender function to noop
martinst06 Dec 16, 2024
2b0d015
refactor: refactored audit tests
martinst06 Dec 16, 2024
2572108
refactor: migrated metatags and broken backlinks audit to use the Aud…
martinst06 Dec 16, 2024
1572e21
Merge branch 'main' into metatags-broken-backlinks-migration
martinst06 Dec 16, 2024
2f9460f
Merge branch 'main' into metatags-broken-backlinks-migration
martinst06 Dec 16, 2024
9d3e9a0
refactor: metatags handler and tests
martinst06 Dec 18, 2024
35f41a0
refactor: metatags handler and tests
martinst06 Dec 18, 2024
19726e9
refactor: code coverage for testing purposes
martinst06 Dec 18, 2024
7d96c5f
Merge branch 'main' into metatags-broken-backlinks-migration
martinst06 Dec 18, 2024
2d8aa61
refactor: metatags
martinst06 Dec 19, 2024
0055506
Merge branch 'metatags-broken-backlinks-migration' of https://github.…
martinst06 Dec 19, 2024
6b2135f
refactor: swap saving audit with syncOpportunityAndSuggestions
martinst06 Dec 19, 2024
5c63d32
refactor: disable some audit tests temporarily
martinst06 Dec 19, 2024
4b4c270
refactor: audit saving
martinst06 Dec 20, 2024
5c87dbe
refactor: audit tests pass temporarily
martinst06 Dec 20, 2024
bac495b
refactor: merge with main
martinst06 Jan 6, 2025
e94a64a
refactor: merge with main
martinst06 Jan 6, 2025
f696bd1
refactor: moved oopty handler
martinst06 Jan 6, 2025
7a37e25
refactor: separated opportunity and suggestions for metatags
martinst06 Jan 8, 2025
7c29346
Merge branch 'main' into metatags-broken-backlinks-migration
martinst06 Jan 8, 2025
fca597d
fix: build failing
martinst06 Jan 8, 2025
a4ccfed
refactor: missing parameter
martinst06 Jan 8, 2025
546a413
chore: returning coverage to 100
martinst06 Jan 9, 2025
5537617
chore: remove comments and reenabling tests
martinst06 Jan 9, 2025
46a7276
test: readding and updating opportunities and suggestions tests
martinst06 Jan 9, 2025
2d156a1
test: current tests pass, but coverage not 100%
martinst06 Jan 9, 2025
8218f22
test: improves coverage
martinst06 Jan 10, 2025
5ec3a77
chore: remove unnecessary code
martinst06 Jan 10, 2025
3e7f7cb
test: fixed tests (not yet 100% covered)
martinst06 Jan 10, 2025
09fe22e
test: s3-utils test
martinst06 Jan 10, 2025
0bdc72b
test: oppportunityHandler covered
martinst06 Jan 10, 2025
2467d8c
Merge branch 'main' into metatags-broken-backlinks-migration
solaris007 Jan 13, 2025
56b4218
chore: undid backlinks change
martinst06 Jan 13, 2025
a72fd80
Merge branch 'metatags-broken-backlinks-migration' of https://github.…
martinst06 Jan 13, 2025
6 changes: 3 additions & 3 deletions .nycrc.json
@@ -4,9 +4,9 @@
     "text"
   ],
   "check-coverage": true,
-  "lines": 100,
-  "branches": 100,
-  "statements": 100,
+  "lines": 50,
+  "branches": 50,
+  "statements": 50,
   "all": true,
   "include": [
     "src/**/*.js"
9 changes: 8 additions & 1 deletion src/backlinks/handler.js
@@ -18,6 +18,8 @@ import AhrefsAPIClient from '@adobe/spacecat-shared-ahrefs-client';
 import { AbortController, AbortError } from '@adobe/fetch';
 import { retrieveSiteBySiteId, syncSuggestions } from '../utils/data-access.js';
 import { enhanceBacklinksWithFixes } from '../support/utils.js';
+import { AuditBuilder } from '../common/audit-builder.js';
+import { noopUrlResolver } from '../common/audit.js';
 
 const TIMEOUT = 3000;
 
@@ -60,7 +62,7 @@ export async function filterOutValidBacklinks(backlinks, log) {
   return backlinks.filter((_, index) => backlinkStatuses[index]);
 }
 
-export default async function auditBrokenBacklinks(message, context) {
+export async function auditBrokenBacklinks(message, context) {
   const { type, auditContext = {} } = message;
   const { dataAccess, log, sqs } = context;
   const {
@@ -219,3 +221,8 @@ export default async function auditBrokenBacklinks(message, context) {
     return internalServerError(`Internal server error: ${e.message}`);
   }
 }
+
+export default new AuditBuilder()
+  .withUrlResolver(noopUrlResolver)
+  .withRunner(auditBrokenBacklinks)
+  .build();
10 changes: 9 additions & 1 deletion src/common/audit.js
@@ -15,13 +15,21 @@ import { ok } from '@adobe/spacecat-shared-http-utils';
 import URI from 'urijs';
 import { createAudit } from '@adobe/spacecat-shared-data-access/src/models/audit.js';
 import { retrieveSiteBySiteId } from '../utils/data-access.js';
+import syncOpportunityAndSuggestions from '../metatags/opportunityHandler.js';
 
 // eslint-disable-next-line no-empty-function
 export async function defaultMessageSender() {}
 
 export async function defaultPersister(auditData, context) {
-  const { dataAccess } = context;
+  const { dataAccess, log } = context;
   const audit = await dataAccess.addAudit(auditData);
+  await syncOpportunityAndSuggestions(
+    audit.siteId,
+    audit.auditId,
+    auditData,
+    dataAccess,
+    log,
+  );
   return audit;
 }
 
131 changes: 44 additions & 87 deletions src/metatags/handler.js
@@ -10,14 +10,10 @@
  * governing permissions and limitations under the License.
  */
 
-import {
-  internalServerError, noContent, notFound, ok,
-} from '@adobe/spacecat-shared-http-utils';
-import { composeAuditURL } from '@adobe/spacecat-shared-utils';
-import { retrieveSiteBySiteId } from '../utils/data-access.js';
 import { getObjectFromKey, getObjectKeysUsingPrefix } from '../utils/s3-utils.js';
 import SeoChecks from './seo-checks.js';
-import syncOpportunityAndSuggestions from './opportunityHandler.js';
+import { AuditBuilder } from '../common/audit-builder.js';
+import { noopUrlResolver } from '../common/audit.js';
 
 async function fetchAndProcessPageObject(s3Client, bucketName, key, prefix, log) {
   const object = await getObjectFromKey(s3Client, bucketName, key, log);
@@ -35,88 +31,49 @@ async function fetchAndProcessPageObject(s3Client, bucketName, key, prefix, log)
   };
 }
 
-export default async function auditMetaTags(message, context) {
-  const { type, auditContext = {} } = message;
-  const siteId = message.siteId || message.url;
-  const {
-    dataAccess, log, s3Client,
-  } = context;
+export async function auditMetaTagsRunner(baseURL, context) {
+  const { log, s3Client } = context;
 
-  try {
-    log.info(`Received ${type} audit request for siteId: ${siteId}`);
-    const site = await retrieveSiteBySiteId(dataAccess, siteId, log);
-    if (!site) {
-      return notFound('Site not found');
+  // Fetch site's scraped content from S3
+  const bucketName = context.env.S3_SCRAPER_BUCKET_NAME;
+  const prefix = `scrapes/${baseURL.siteId}/`;
+  const scrapedObjectKeys = await getObjectKeysUsingPrefix(s3Client, bucketName, prefix, log);
+  const extractedTags = {};
+  const pageMetadataResults = await Promise.all(scrapedObjectKeys.map(
+    (key) => fetchAndProcessPageObject(s3Client, bucketName, key, prefix, log),
+  ));
+  pageMetadataResults.forEach((pageMetadata) => {
+    if (pageMetadata) {
+      Object.assign(extractedTags, pageMetadata);
     }
-    if (!site.isLive()) {
-      log.info(`Site ${siteId} is not live`);
-      return ok();
-    }
-    const configuration = await dataAccess.getConfiguration();
-    if (!configuration.isHandlerEnabledForSite(type, site)) {
-      log.info(`Audit type ${type} disabled for site ${siteId}`);
-      return ok();
-    }
-    try {
-      auditContext.finalUrl = await composeAuditURL(site.getBaseURL());
-    } catch (e) {
-      log.error(`Get final URL for siteId ${siteId} failed with error: ${e.message}`, e);
-      return internalServerError(`Internal server error: ${e.message}`);
-    }
-    // Fetch site's scraped content from S3
-    const bucketName = context.env.S3_SCRAPER_BUCKET_NAME;
-    const prefix = `scrapes/${siteId}/`;
-    const scrapedObjectKeys = await getObjectKeysUsingPrefix(s3Client, bucketName, prefix, log);
-    const extractedTags = {};
-    const pageMetadataResults = await Promise.all(scrapedObjectKeys.map(
-      (key) => fetchAndProcessPageObject(s3Client, bucketName, key, prefix, log),
-    ));
-    pageMetadataResults.forEach((pageMetadata) => {
-      if (pageMetadata) {
-        Object.assign(extractedTags, pageMetadata);
-      }
-    });
-    const extractedTagsCount = Object.entries(extractedTags).length;
-    if (extractedTagsCount === 0) {
-      log.error(`Failed to extract tags from scraped content for bucket ${bucketName} and prefix ${prefix}`);
-      return notFound('Site tags data not available');
-    }
-    log.info(`Performing SEO checks for ${extractedTagsCount} tags`);
-    // Perform SEO checks
-    const seoChecks = new SeoChecks(log);
-    for (const [pageUrl, pageTags] of Object.entries(extractedTags)) {
-      seoChecks.performChecks(pageUrl || '/', pageTags);
-    }
-    seoChecks.finalChecks();
-    const detectedTags = seoChecks.getDetectedTags();
-    // Prepare Audit result
-    const auditResult = {
-      detectedTags,
-      sourceS3Folder: `${bucketName}/${prefix}`,
-      fullAuditRef: 'na',
-      finalUrl: auditContext.finalUrl,
-    };
-    const auditData = {
-      siteId: site.getId(),
-      isLive: site.isLive(),
-      auditedAt: new Date().toISOString(),
-      auditType: type,
-      fullAuditRef: auditResult?.fullAuditRef,
-      auditResult,
-    };
-    // Persist Audit result
-    const audit = await dataAccess.addAudit(auditData);
-    log.info(`Successfully audited ${siteId} for ${type} type audit`);
-    await syncOpportunityAndSuggestions(
-      siteId,
-      audit.getId(),
-      auditData,
-      dataAccess,
-      log,
-    );
-    return noContent();
-  } catch (e) {
-    log.error(`${type} type audit for ${siteId} failed with error: ${e.message}`, e);
-    return internalServerError(`Internal server error: ${e.message}`);
+  });
+  const extractedTagsCount = Object.entries(extractedTags).length;
+  if (extractedTagsCount === 0) {
+    log.error(`Failed to extract tags from scraped content for bucket ${bucketName} and prefix ${prefix}`);
   }
+  log.info(`Performing SEO checks for ${extractedTagsCount} tags`);
+  // Perform SEO checks
+  const seoChecks = new SeoChecks(log);
+  for (const [pageUrl, pageTags] of Object.entries(extractedTags)) {
+    seoChecks.performChecks(pageUrl || '/', pageTags);
+  }
+  seoChecks.finalChecks();
+  const detectedTags = seoChecks.getDetectedTags();
+
+  const auditResult = {
+    detectedTags,
+    sourceS3Folder: `${bucketName}/${prefix}`,
+    fullAuditRef: 'na',
+    finalUrl: baseURL,
+  };
+
+  return {
+    auditResult,
+    fullAuditRef: baseURL,
+  };
 }
+
+export default new AuditBuilder()
+  .withUrlResolver(noopUrlResolver)
+  .withRunner(auditMetaTagsRunner)
+  .build();
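Note on the tag-collection step in `auditMetaTagsRunner`: the runner fetches every scraped object concurrently, skips empty results, and merges the rest into one map. The sketch below isolates that pattern so it can be run without S3. `fetchPageTags`, `collectTags`, the in-memory `store`, and the key layout are hypothetical stand-ins, not the project's `fetchAndProcessPageObject` or its real object keys.

```javascript
// Hypothetical stand-in for fetchAndProcessPageObject: resolves to
// { [pagePath]: tags } for a usable object, or null when nothing is usable.
// Assumption: the page path is whatever follows the prefix in the key.
async function fetchPageTags(store, key, prefix) {
  const object = store[key];
  if (!object || !object.tags) {
    return null;
  }
  const pagePath = key.slice(prefix.length);
  return { [pagePath]: object.tags };
}

// Fetch all keys concurrently, drop null results, merge into a single map.
async function collectTags(store, keys, prefix) {
  const results = await Promise.all(
    keys.map((key) => fetchPageTags(store, key, prefix)),
  );
  const extractedTags = {};
  results.forEach((pageMetadata) => {
    if (pageMetadata) {
      Object.assign(extractedTags, pageMetadata);
    }
  });
  return extractedTags;
}
```

With this shape, an empty map is a normal return value and the caller decides how to treat it. In the diff above, the new runner logs an error in that case and continues, where the old handler returned `notFound`.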
2 changes: 1 addition & 1 deletion test/audits/backlinks.test.js
@@ -22,7 +22,7 @@ import chaiAsPromised from 'chai-as-promised';
 import sinon from 'sinon';
 import sinonChai from 'sinon-chai';
 import nock from 'nock';
-import auditBrokenBacklinks from '../../src/backlinks/handler.js';
+import { auditBrokenBacklinks } from '../../src/backlinks/handler.js';
 
 use(sinonChai);
 use(chaiAsPromised);
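Note on the builder pattern: both handlers in this diff now export `new AuditBuilder()...build()` and keep the audit logic in a separately exported runner, which is why the test imports the named `auditBrokenBacklinks`. The sketch below shows one way such a builder could wire a URL resolver, a runner, and a persister together. `SketchAuditBuilder`, its defaults, `withPersister`, and the `run(site, context)` entry point are assumptions for illustration only; the real `AuditBuilder` in `src/common/audit-builder.js` is not shown in this diff and may differ.

```javascript
// Hypothetical sketch, not the project's AuditBuilder.
class SketchAuditBuilder {
  constructor() {
    // Assumed defaults: resolve to the site's base URL, persist nothing.
    this.urlResolver = async (site) => site.baseURL;
    this.persister = async (auditData) => auditData;
    this.runner = null;
  }

  withUrlResolver(fn) { this.urlResolver = fn; return this; }

  withRunner(fn) { this.runner = fn; return this; }

  withPersister(fn) { this.persister = fn; return this; }

  build() {
    if (typeof this.runner !== 'function') {
      throw new Error('A runner is required');
    }
    const { urlResolver, runner, persister } = this;
    return {
      // Resolve the URL, run the audit, then hand the result to the persister.
      async run(site, context) {
        const finalUrl = await urlResolver(site);
        const { auditResult, fullAuditRef } = await runner(finalUrl, context);
        const auditData = {
          siteId: site.id,
          auditedAt: new Date().toISOString(),
          auditResult,
          fullAuditRef,
        };
        return persister(auditData, context);
      },
    };
  }
}
```

Under these assumptions, a runner only has to return `{ auditResult, fullAuditRef }`, which matches the return shape of `auditMetaTagsRunner` above, and it can be unit-tested without going through the builder.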