[Feature] Full-Page Caching for 404s #46

Closed
wants to merge 54 commits into from
Changes from 48 commits
Commits
81b09b9
Serve cache for 404 pages - WIP
mslinnea Jun 15, 2023
4d57b69
add stale cache, switch to using a cron job
mslinnea Jun 22, 2023
4598a31
phpcs and work on tests
mslinnea Jun 22, 2023
bc52963
use output buffering to save the cache
mslinnea Jun 28, 2023
1c97d49
Merge remote-tracking branch 'origin/main' into feature/9/caching-404s
mslinnea Jun 28, 2023
b6e709f
schedule single event. remove stderror flag because test failures wer…
mslinnea Jun 28, 2023
730dd63
Merge branch 'main' into feature/9/caching-404s
mslinnea Nov 1, 2023
c8c9f8b
Update tests/alley/wp/alleyvate/features/test-full-page-cache-404.php
mslinnea Nov 1, 2023
466282f
Merge remote-tracking branch 'origin/main' into feature/9/caching-404s
mslinnea Dec 28, 2023
38ddf23
prevent outputting headers if already sent
mslinnea Dec 28, 2023
6e3946d
Merge branch 'feature/9/caching-404s-local' into feature/9/caching-404s
mslinnea Dec 28, 2023
951296c
Avoid setting cache to empty string
mslinnea Dec 28, 2023
71b88bb
phpcs
mslinnea Dec 28, 2023
20ad3e8
php cs fixer
mslinnea Dec 28, 2023
a51bcc9
Server 404 page early
mslinnea Dec 28, 2023
4a2ae4a
Logged in users should bypass cache
mslinnea Dec 28, 2023
b776b39
Fix issue where HTTP header was set incorrectly
mslinnea Dec 29, 2023
789eb19
Switch to static methods, add missing types, update phpdoc, remove se…
mslinnea Jan 9, 2024
15364ae
replace generator URI with request URI
mslinnea Jan 15, 2024
681cc8e
temp add --testdox to help with debugging unit tests
mslinnea Jan 15, 2024
70fff9a
Merge remote-tracking branch 'origin' into feature/9/caching-404s
mslinnea Jan 15, 2024
9b30221
phpcs
mslinnea Jan 15, 2024
b91b3da
use static methods
mslinnea Jan 15, 2024
900cb71
Merge branch 'main' into feature/9/caching-404s
renatonascalves Feb 5, 2024
b31104f
Adding tests
renatonascalves Feb 13, 2024
cde8591
Merge branch 'main' into feature/9/caching-404s
renatonascalves Feb 13, 2024
c9210e4
Merge branch 'feature/9/caching-404s' of https://github.com/alleyinte…
renatonascalves Feb 13, 2024
b53b3de
Merge branch 'feature/9/caching-404s' into feature/9/caching-404s-uni…
renatonascalves Feb 13, 2024
56b4921
Making `phpcs` happy
renatonascalves Feb 13, 2024
d36f87f
Making `php-cs-fixer` happy
renatonascalves Feb 13, 2024
7782eb8
php-cs-fixer lol
renatonascalves Feb 13, 2024
93c4b69
Making phpcs happy, conflicting tools ¯\_(ツ)_/¯
renatonascalves Feb 13, 2024
13c94f1
Only boot feature if external object cache is being used
renatonascalves Feb 13, 2024
0fcef02
Add object cache to Mantle
renatonascalves Feb 13, 2024
6b166c9
Set `MANTLE_REQUIRE_OBJECT_CACHE`
renatonascalves Feb 13, 2024
b326f71
Test using `niden/actions-memcached@v7`
renatonascalves Feb 13, 2024
ff06717
Organize tests
renatonascalves Feb 13, 2024
d905e53
Revert last change and debug Mantle
renatonascalves Feb 13, 2024
df26556
Reset
renatonascalves Feb 13, 2024
6336e62
Set `INSTALL_OBJECT_CACHE` via env
renatonascalves Feb 13, 2024
83b95d2
Set `INSTALL_OBJECT_CACHE` via env
renatonascalves Feb 13, 2024
5b2021b
Set `INSTALL_OBJECT_CACHE`
renatonascalves Feb 13, 2024
6bacb48
Remove `INSTALL_OBJECT_CACHE: true`
renatonascalves Feb 13, 2024
6f9c270
Skip tests if object cache is not available
renatonascalves Feb 13, 2024
7a8c1ee
Disable tests if object cache is not in use
renatonascalves Feb 13, 2024
f6f985f
Adding CR suggestions
renatonascalves Feb 14, 2024
cd33707
Minor tweak
renatonascalves Feb 14, 2024
f628f26
Merge pull request #76 from alleyinteractive/feature/9/caching-404s-u…
renatonascalves Feb 14, 2024
1590143
Sync with the latest
renatonascalves Feb 22, 2024
ac984cd
Dot not clean buffer too early
renatonascalves Feb 22, 2024
8017c6a
Clean up any previous buffer before starting our own
renatonascalves Feb 22, 2024
a2c04df
The "Full-Page Caching for 404s" feature requires ssl
renatonascalves Feb 28, 2024
370860d
php-cs-fixer fixes
renatonascalves Feb 28, 2024
006f900
Merge pull request #77 from alleyinteractive/feature/9/caching-404s-r…
renatonascalves Feb 28, 2024
9 changes: 4 additions & 5 deletions .php-cs-fixer.dist.php
@@ -24,11 +24,10 @@
 // Enabled by '@PHP81Migration' but generates invalid spacing for WordPress.
 'method_argument_space' => false,

-'final_class' => true,
-'native_constant_invocation' => true,
-'native_function_casing' => true,
-'native_function_invocation' => true,
-'native_function_type_declaration_casing' => true,
+'final_class' => true,
+'native_constant_invocation' => true,
+'native_function_casing' => true,
+'native_function_invocation' => true,
 ]
 );
 $config->setFinder( $finder );
306 changes: 306 additions & 0 deletions src/alley/wp/alleyvate/features/class-full-page-cache-404.php
@@ -0,0 +1,306 @@
<?php
/**
* Class file for Full Page Cache for 404s.
*
* (c) Alley <[email protected]>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*
* @package wp-alleyvate
*/

declare( strict_types=1 );

namespace Alley\WP\Alleyvate\Features;

use Alley\WP\Alleyvate\Feature;

/**
* Full Page Cache for 404s.
*/
final class Full_Page_Cache_404 implements Feature {

/**
* Cache group.
*
* @var string
*/
public const CACHE_GROUP = 'alleyvate';

/**
* Cache key.
*
* @var string
*/
public const CACHE_KEY = '404_cache';

/**
* Cache key for stale cache.
*
* @var string
*/
public const STALE_CACHE_KEY = '404_cache_stale';

/**
* Cache time.
*
* @var int
*/
public const CACHE_TIME = HOUR_IN_SECONDS;

/**
* Stale cache time.
*
* @var int
*/
public const STALE_CACHE_TIME = DAY_IN_SECONDS;

/**
* Guaranteed 404 URI.
* Used for populating the cache.
*
* @var string
*/
public const TEMPLATE_GENERATOR_URI = '/wp-alleyvate/404-template-generator/?generate=1&uri=1';

/**
* Boot the feature.
*/
public function boot(): void {

/**
* Only boot feature if external object cache is being used.
*
* We don't want to store the cached 404 page in the database.
*/
if ( ! (bool) wp_using_ext_object_cache() ) {
return;
}

// Return 404 page cache on template_redirect.
add_action( 'template_redirect', [ self::class, 'action__template_redirect' ], 1 );

// For the Guaranteed 404 page, hook in on WP to start output buffering, to capture the HTML.
add_action( 'wp', [ self::class, 'action__wp' ] );

// Replenish the cache every hour.
if ( ! wp_next_scheduled( 'alleyvate_404_cache' ) ) {
wp_schedule_event( time(), 'hourly', 'alleyvate_404_cache' );
}

// Callback for Cron Event.
add_action( 'alleyvate_404_cache', [ self::class, 'trigger_404_page_cache' ] );
add_action( 'alleyvate_404_cache_single', [ self::class, 'trigger_404_page_cache' ] );
}

/**
* Get 404 Page Cache and return early if found.
*/
public static function action__template_redirect(): void {

// Don't cache if user is logged in.
if ( is_user_logged_in() ) {
return;
}

// Don't cache if not a 404.
if ( ! is_404() ) {
return;
}

// Allow this URL through, as this request will populate the cache.
if ( isset( $_SERVER['REQUEST_URI'] ) && self::TEMPLATE_GENERATOR_URI === $_SERVER['REQUEST_URI'] ) {
return;
}

echo self::get_cached_response_with_headers(); // phpcs:ignore WordPress.Security.EscapeOutput.OutputNotEscaped

if ( \defined( 'MANTLE_IS_TESTING' ) && MANTLE_IS_TESTING ) {
wp_die( '', '', [ 'response' => 404 ] );
}

exit;
}

/**
* Get cached response with headers.
*
* @return string
*/
public static function get_cached_response_with_headers(): string {
$stale_cache_in_use = false;
$cache = self::get_cache();

if ( false === $cache ) {
$cache = self::get_stale_cache();
$stale_cache_in_use = true;
}

if ( ! empty( $cache ) ) {
$html = self::prepare_response( $cache );

self::send_header( 'HIT', $stale_cache_in_use );

// Cached content is already escaped.
return $html; // phpcs:ignore WordPress.Security.EscapeOutput.OutputNotEscaped
}

// Schedule a single event to generate the cache immediately.
if ( ! wp_next_scheduled( 'alleyvate_404_cache_single' ) ) {
wp_schedule_single_event( time(), 'alleyvate_404_cache_single' );
}

self::send_header( 'MISS' );

// If no cache, return an empty string.
return '';
}

/**
* Send X-Alleyvate HTTP Header.
*
* @param string $type HIT or MISS.
* @param bool $stale Whether the stale cache is in use. Default false.
*/
public static function send_header( string $type, bool $stale = false ): void {

if ( headers_sent() ) {
return;
}

if ( ! $stale && 'HIT' === $type ) {
header( 'X-Alleyvate-404-Cache: HIT' );
} elseif ( $stale && 'HIT' === $type ) {
header( 'X-Alleyvate-404-Cache: HIT (stale)' );
} elseif ( 'MISS' === $type ) {
header( 'X-Alleyvate-404-Cache: MISS' );
}
}

/**
* Start output buffering, so we can cache the 404 page.
*
* @global WP_Query $wp_query WordPress Query object.
*/
public static function action__wp(): void {
if ( isset( $_SERVER['REQUEST_URI'] ) && self::TEMPLATE_GENERATOR_URI === $_SERVER['REQUEST_URI'] ) {
global $wp_query;

if ( ! $wp_query->is_404() ) {
return;
}

ob_start( [ self::class, 'finish_output_buffering' ] );

// Clean up the buffer.
ob_get_clean();
Contributor

So, while testing this on a client site, I noticed PHP can run multiple output buffers in a request. In my testing environment there was only one, but on the client site, by the time the code gets here, it is buffer number 2. So it returns the wrong value (empty).

I'm inclined to remove this line since it will exit anyway.

Contributor

I get the following error while running the unit tests, but not while testing on a site. 🤔

Test code or tested code did not (only) close its own output buffers

Contributor (renatonascalves, Feb 22, 2024)

So, I learned that buffers can actually be nested and it is pretty hard to match the status with the initial value.

So here, we are clearing any previous buffers before starting our own.

8017c6a

Member

We should be closing the output buffer that we're creating. We could also, potentially, keep the output buffer that already exists if output buffering is already active.

ob_get_status will tell us the current status of output buffering. If there is an active buffer, it will return a non-empty array. If there is not an active buffer, it will return an empty array. We could use that function to detect whether an output buffer is currently active and set a flag, which could control whether we spin up a new output buffer or use the current one, or whether we capture anything that's already in an existing output buffer for use later.

Also, the current implementation relies on a callback for when the buffer is naturally flushed, but we could perhaps hook into shutdown and capture the output there and add it to the cache.

Alternately, this doesn't need to be considered an Alleyvate bug, and could instead be incumbent upon sites that use this plugin to ensure they don't have an active output buffering session on the 404 page, as it would break this feature (as written). Having an active output buffering session on the 404 page that doesn't come from this plugin is likely to be an edge case and could better be handled in that codebase rather than trying to code around it here.
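
One way to read the suggestion about closing only the buffer we create is sketched below. This is a minimal illustration, not code from the PR: `capture_output()` is a hypothetical helper, and it records the nesting depth with `ob_get_level()` as a simpler stand-in for inspecting `ob_get_status()`. Buffers that were already open when the helper is called are left untouched.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical helper (not part of the PR): run $render inside a fresh
 * output buffer and return what it printed, leaving any buffers that were
 * already active exactly as they were.
 */
function capture_output( callable $render ): string {
    // Remember how many buffers were open before we added ours.
    $level_before = ob_get_level();

    ob_start();
    $render();

    // If $render opened buffers and forgot to close them, flush each one
    // into its parent so the content ends up in our buffer.
    while ( ob_get_level() > $level_before + 1 ) {
        ob_end_flush();
    }

    // Close our own buffer, and only our own.
    return (string) ob_get_clean();
}
```

A caller that already has a buffer open keeps it: after `capture_output()` returns, `ob_get_level()` is back to the value it had before the call, and the pre-existing buffer still holds its earlier content.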

Contributor

So I revisited this morning and found 3 buffers already present before this one, on both a regular site and a client site. No idea how yet. ¯_(ツ)_/¯

But I'll take your notes and do a bit more digging and hopefully apply a solution that works by default.

> Also, the current implementation relies on a callback for when the buffer is naturally flushed, but we could perhaps hook into shutdown and capture the output there and add it to the cache.

🤔 I'd expect more buffers to be available at this point, added by plugins, etc. I'll test this approach just in case.

> Alternately, this doesn't need to be considered an Alleyvate bug, and could instead be incumbent upon sites that use this plugin to ensure they don't have an active output buffering session on the 404 page,

I'll try to find a way to make it work for "any" sites.

Contributor (renatonascalves, Feb 23, 2024)

Quick feedback after some testing:

  • Hooking into the shutdown hook: the buffer always returns empty. My guess is that it has already been cleared by the time it gets there. It's not that easy to track buffers.
  • Using ob_get_status is useful, but any attempt to clear the buffer after that clears the HTML, and it returns empty instead of the 404 page buffer. I tried a mixture of flushing the previous buffer and flushing ours later. I'm at a loss as to why it is empty.

So far, any attempt to clear the buffer after ob_start( [ self::class, 'finish_output_buffering' ] ); essentially clears the HTML from the buffer and returns "", meaning that we are not caching the HTML page.
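
A possible explanation for the empty result, sketched in plain PHP without WordPress: `ob_get_clean()` returns the contents of the active buffer and then turns that buffer off. Called directly after `ob_start()`, it closes the buffer that was just opened while it is still empty, so the callback is detached before the 404 template prints anything. `demo_early_clean()` is a hypothetical name used only for this illustration, and the outer `ob_start()` stands in for whatever buffer was already active on the site.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical demo (not part of the PR): open a buffer with a callback,
 * clean it immediately, then print the "page".
 */
function demo_early_clean(): array {
    $seen = [];

    // Outer buffer: stands in for a buffer that was already active.
    ob_start();
    $level_before = ob_get_level();

    // Our buffer, with a callback that records every chunk it is handed.
    ob_start(
        static function ( string $buffer ) use ( &$seen ): string {
            $seen[] = $buffer;
            return $buffer;
        }
    );

    // Returns the (empty) contents and closes the buffer opened above.
    $cleaned     = ob_get_clean();
    $level_after = ob_get_level();

    // The page is printed only now, after our buffer is gone.
    echo '<h1>Not found</h1>';

    // The page landed in the outer buffer instead.
    $outer = ob_get_clean();

    return [
        'cleaned'         => $cleaned,
        'handler_removed' => $level_after === $level_before,
        'outer'           => $outer,
        'seen'            => $seen,
    ];
}
```

In this sketch the page markup ends up in the outer buffer and never reaches the callback, which matches the behaviour described above: nothing non-empty is ever handed to the function that would write the cache.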

}
}

/**
* Finish output buffering.
*
* @global WP_Query $wp_query WordPress Query object.
*
* @param string $buffer Buffer.
* @return string
*/
public static function finish_output_buffering( string $buffer ): string {
global $wp_query;

if ( ! $wp_query->is_404() ) {
return $buffer;
}

if ( is_user_logged_in() ) {
return $buffer;
}

if ( ! self::get_cache() && ! empty( $buffer ) ) {
self::set_cache( $buffer );
}

return $buffer;
}

/**
* Get cache.
*
* @return mixed
*/
public static function get_cache(): mixed {
return wp_cache_get( self::CACHE_KEY, self::CACHE_GROUP );
}

/**
* Get stale cache.
*
* @return mixed
*/
public static function get_stale_cache(): mixed {
return wp_cache_get( self::STALE_CACHE_KEY, self::CACHE_GROUP );
}

/**
* Set cache.
*
* @param string $buffer The Output Buffer.
*/
public static function set_cache( string $buffer ): void {
wp_cache_set( self::CACHE_KEY, $buffer, self::CACHE_GROUP, self::CACHE_TIME ); // phpcs:ignore WordPressVIPMinimum.Performance.LowExpiryCacheTime.CacheTimeUndetermined
wp_cache_set( self::STALE_CACHE_KEY, $buffer, self::CACHE_GROUP, self::STALE_CACHE_TIME ); // phpcs:ignore WordPressVIPMinimum.Performance.LowExpiryCacheTime.CacheTimeUndetermined
}

/**
* Delete cache.
*/
public static function delete_cache(): void {
wp_cache_delete( self::CACHE_KEY, self::CACHE_GROUP );
wp_cache_delete( self::STALE_CACHE_KEY, self::CACHE_GROUP );
}

/**
* Prepare response.
*
* @param string $content The content.
* @return string
*/
public static function prepare_response( string $content ): string {
// To avoid analytics issues, replace the Generator URI with the requested URI.
$uri = sanitize_text_field( $_SERVER['REQUEST_URI'] ?? '' );

return str_replace(
[
self::TEMPLATE_GENERATOR_URI,
wp_json_encode( self::TEMPLATE_GENERATOR_URI ),
esc_html( self::TEMPLATE_GENERATOR_URI ),
esc_url( self::TEMPLATE_GENERATOR_URI ),
],
[
$uri,
wp_json_encode( $uri ),
esc_html( $uri ),
esc_url( $uri ),
],
$content
);
}

/**
* Spin up a request to the guaranteed 404 page to populate the cache.
*/
public static function trigger_404_page_cache(): void {
$url = home_url( self::TEMPLATE_GENERATOR_URI, 'https' );

// Replace http with https to ensure the styles don't get blocked due to insecure content.
$url = str_replace( 'http://', 'https://', $url );

// This request will populate the cache using output buffering.
if ( \function_exists( 'wpcom_vip_file_get_contents' ) ) {
wpcom_vip_file_get_contents( $url );
} else {
wp_remote_get( $url ); // phpcs:ignore WordPressVIPMinimum.Functions.RestrictedFunctions.wp_remote_get_wp_remote_get
}
}
}
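
The fresh-then-stale lookup in `get_cached_response_with_headers()` can be illustrated without WordPress. The sketch below is an assumption-laden stand-in, not the plugin's code: `Array_Cache` and `lookup_404()` are hypothetical names, the cache is a plain in-memory array with explicit timestamps, and the status strings mirror the `X-Alleyvate-404-Cache` header values. It checks for `false` rather than using `empty()` as the class above does.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical in-memory cache with expiry, standing in for the object cache.
 */
final class Array_Cache {
    /** @var array<string, array{value: string, expires: int}> */
    private array $items = [];

    public function set( string $key, string $value, int $ttl, int $now ): void {
        $this->items[ $key ] = [ 'value' => $value, 'expires' => $now + $ttl ];
    }

    /** Returns the value, or false when the key is missing or expired. */
    public function get( string $key, int $now ): string|false {
        $item = $this->items[ $key ] ?? null;
        if ( null === $item || $item['expires'] <= $now ) {
            return false;
        }
        return $item['value'];
    }
}

/**
 * Try the fresh entry first, then the stale one, then report a miss.
 *
 * @return array{0: string, 1: string} Header status and HTML.
 */
function lookup_404( Array_Cache $cache, int $now ): array {
    $html = $cache->get( '404_cache', $now );
    if ( false !== $html ) {
        return [ 'HIT', $html ];
    }

    $html = $cache->get( '404_cache_stale', $now );
    if ( false !== $html ) {
        return [ 'HIT (stale)', $html ];
    }

    // In the plugin, this is the point where a single cron event is scheduled.
    return [ 'MISS', '' ];
}
```

With both entries written at time 0 using one-hour and one-day lifetimes, a lookup shortly afterwards is a fresh hit, a lookup after the first hour falls back to the stale copy, and a lookup after a day is a miss.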
1 change: 1 addition & 0 deletions src/alley/wp/alleyvate/load.php
@@ -33,6 +33,7 @@ function available_features(): array {
'redirect_guess_shortcircuit' => new Features\Redirect_Guess_Shortcircuit(),
'site_health' => new Features\Site_Health(),
'user_enumeration_restrictions' => new Features\User_Enumeration_Restrictions(),
'full_page_cache_404' => new Features\Full_Page_Cache_404(),
];
}
