
Spike: Test JSON logs in new cloud.gov logging service #1227

Open
exalate-issue-sync bot opened this issue Nov 29, 2024 · 3 comments
exalate-issue-sync bot commented Nov 29, 2024

Original ticket: https://fecgov.atlassian.net/browse/FECFILE-241

Now that cloud.gov fully supports JSON logging, we should test JSON logging in Kibana to see how it works for us.

Deploy a version of the app to dev that outputs its logs as JSON. What does it look like when viewing the logs vs. searching the logs?

Identify whether JSON logging enables flat searching of custom fields such as User ID.

Create follow-up tickets as needed and link to this ticket.

QA Notes

null

DEV Notes

Note: Structlog supports JSON out of the box.
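For illustration, here is a minimal, dependency-free sketch of what a JSON log line looks like. Structlog's JSONRenderer processor produces the same shape out of the box; the `user_id` field and logger names below are hypothetical, not our actual schema.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Custom fields passed via `extra=` land on the record and become
        # flat, individually searchable top-level keys in the JSON output.
        for key in ("user_id", "request_id"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)

logger = logging.getLogger("api")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Emits: {"level": "INFO", "logger": "api", "message": "request complete", "user_id": 42}
logger.info("request complete", extra={"user_id": 42})
```

Each line being a self-contained JSON object is what lets the logging system index custom fields individually rather than treating the whole message as opaque text.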

Design

null

See full ticket and images here: FECFILE-1864


exalate-issue-sync bot commented Dec 17, 2024

Elaine Krauss commented: Formatting our logs as JSON is easily done (see the example PR [here|https://github.com//pull/1256]), but on its own it conveys no intrinsic benefit. The logs, as seen in the new OpenSearch-based logging system on [logs.fr.cloud.gov|http://logs.fr.cloud.gov], show no significant visible difference before and after implementing JSON-based logging. The new system’s support for JSON logs is oriented more towards interpreting and accessing the data of individual fields than towards changing how log entries are displayed.

The [announcement post|https://cloud.gov/2024/11/21/new-logging-system/] describes how JSON logs can optionally be ingested as “flat objects” (see the docs [here|https://opensearch.org/docs/latest/field-types/supported-field-types/flat-object/] and [here|https://opensearch.org/docs/latest/field-types/]). This is done by creating an “explicit mapping” that describes the layout of a log entry to OpenSearch. Our OpenSearch setup currently uses automatic “dynamic” mapping, but the documentation recommends explicit mappings instead for the sake of consistency and performance. We should create explicit mappings for our logs anyway, and as we make them, we can define specific fields as JSON-based “flat objects” in order to access nested data if necessary.
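As a sketch of what such an explicit mapping might look like (the index layout and field names below are hypothetical, not our actual schema), a nested log field can be declared as a flat object alongside ordinary typed fields:

```python
import json

# Hypothetical explicit mapping for an API log index. Most fields get
# concrete types; "payload" is stored as an OpenSearch flat_object, which
# is cheap to fetch but cannot be filtered or aggregated by subfield.
api_log_mapping = {
    "mappings": {
        "properties": {
            "timestamp": {"type": "date"},
            "level": {"type": "keyword"},
            "message": {"type": "text"},
            "user_id": {"type": "keyword"},
            "payload": {"type": "flat_object"},
        }
    }
}

print(json.dumps(api_log_mapping, indent=2))
```

With the opensearch-py client, a body like this would be passed to `client.indices.create()`; the same JSON also works as a request body when creating the index through the REST API.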

The use of “flat objects” comes with a few benefits as listed in the documentation:

  • Efficient reads: Fetching performance is similar to that of a keyword field.
  • Memory efficiency: Storing the entire complex JSON object in one field without indexing all of its subfields reduces the number of fields in an index.
  • Space efficiency: OpenSearch does not create an inverted index for subfields in flat objects, thereby saving space.
  • Compatibility for migration: You can migrate your data from systems that support similar flat types to OpenSearch.

But they also lack certain features compared to more standard fields (also taken from the docs):

  • Type-specific parsing.
  • Numerical operations, such as numerical comparison or numerical sorting.
  • Text analysis.
  • Highlighting.
  • Aggregations of subfields using dot notation.
  • Filtering by subfields.

Lacking the ability to filter on the nested data inside a flat object is, in my opinion, a fairly significant drawback. It also seems somewhat academic, though, since I am not aware of any fields in our logs that nest data inside a JSON object. Still, it’s a tool worth keeping in mind. Regardless of whether or not we map any fields as flat objects, I would recommend that we take the time to add explicit mappings for our log entry fields, if only for the performance benefits [described|https://opensearch.org/docs/latest/field-types/] in the OpenSearch documentation.


Elaine Krauss commented: To create explicit mappings in OpenSearch, open the menu (via the hamburger icon in the top-left corner), scroll down to the “Management” category, and click “Index Management.” Next, click “Create Index” in the top-right corner. We then need to create an index (see the docs [here|https://opensearch.org/docs/latest/api-reference/index-apis/create-index/]) for a specific type of document (e.g., API log entries), and the (optional) last step is index mapping. OpenSearch provides a visual tool for this that should prove easy enough to use, but the documentation for mappings can be found [here|https://opensearch.org/docs/latest/field-types/].

[Image: image-20241217-173523.png]

Regarding the other fields on the form: the index name doesn’t seem to require any special logic; it just needs to be unique and descriptive. For the three other listed index settings (number of primary shards, number of replicas, and refresh interval), the default values should prove sufficient for our use case, but further information on each can be found [here|https://opensearch.org/docs/latest/install-and-configure/configuring-opensearch/index-settings/].
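Those three settings can also be sketched as an index-creation body. The values shown are the OpenSearch defaults, and the index name in the comment is hypothetical:

```python
import json

# Index settings mirroring the three fields on the Create Index form.
# These are the OpenSearch defaults, which should suffice for our use case.
index_body = {
    "settings": {
        "index": {
            "number_of_shards": 1,       # primary shards
            "number_of_replicas": 1,     # replica copies of each shard
            "refresh_interval": "1s",    # how often new docs become searchable
        }
    }
}

# With the opensearch-py client this would be sent as, for example:
#   client.indices.create(index="api-logs-dev", body=index_body)
print(json.dumps(index_body, indent=2))
```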


Elaine Krauss commented: See my previous two comments, [~accountid:5b92c509d0b4022bdc51bdf4] [~accountid:712020:2a1493e5-adee-45bd-b27e-868a5c8d3f62] for what to peer review.
