Patterns: Decode potentially-malformed ampersands in block content #533

Merged
merged 1 commit into from
Oct 27, 2022
@@ -298,7 +298,7 @@ function register_rest_fields() {
 		'pattern_content',
 		array(
 			'get_callback' => function() {
-				return wp_kses_post( get_the_content() );
+				return decode_pattern_content( get_the_content() );

Do you think it's safe not to sanitize through wp_kses_post here? See my comment: WordPress/gutenberg#47636 (comment)

The issue here is a result of Gutenberg (GB) sanitizing the pattern content through wp_kses_post. The flow there is:

  • wp_kses_normalize_entities replaces & with &amp;
  • wp_pre_kses_block_attributes replaces & with \u0026, which results in \u0026amp; in the final output.

This is done for security reasons and is normally not an issue for block attributes that are not URLs. In the Cover
block case it is a problem, though, because it results in a malformed URL.
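
The two-step flow described above can be reproduced in isolation. The following is a minimal sketch, assuming plain str_replace calls as simplified stand-ins for what wp_kses_normalize_entities and wp_pre_kses_block_attributes do to a bare ampersand. The helper names and the example URL are hypothetical, and the real WordPress functions do considerably more than this.

```php
<?php
// Hypothetical stand-in: models only the bare-ampersand case of
// wp_kses_normalize_entities (& becomes &amp;).
function sketch_normalize_entities( string $content ): string {
	return str_replace( '&', '&amp;', $content );
}

// Hypothetical stand-in: models only the ampersand escaping applied to
// block attributes (& becomes the literal text \u0026).
function sketch_escape_block_attribute( string $content ): string {
	// Single quotes keep the backslash literal in PHP.
	return str_replace( '&', '\u0026', $content );
}

// Hypothetical image URL with a query string.
$url        = 'https://example.com/image.jpg?w=1200&h=1200';
$normalized = sketch_normalize_entities( $url );
$escaped    = sketch_escape_block_attribute( $normalized );

echo $normalized, "\n"; // https://example.com/image.jpg?w=1200&amp;h=1200
echo $escaped, "\n";    // https://example.com/image.jpg?w=1200\u0026amp;h=1200
```

Each step is harmless on its own. It is the combination that leaves a string no URL parser will read as a query separator.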

 			},
 			'schema' => array(
@@ -793,3 +793,28 @@ function() {
 		exit;
 	}
 );
+
+/**
+ * Intercept the post object and decode the content.
+ */
+add_action(
+	'the_post',
+	function( $post ) {
+		$post->post_content = decode_pattern_content( $post->post_content );
+	}
+);
+
+/**
+ * Process post content, replacing broken encoding.
+ *
+ * Some image URLs have &s, which are double-encoded and sanitized to become malformed,
+ * for example, `https://img.rawpixel.com/s3fs-private/rawpixel_images/website_content/a010-markuss-0964.jpg?w=1200\u0026amp;h=1200\u0026amp;fit=clip\u0026amp;crop=default\u0026amp;dpr=1\u0026amp;q=75\u0026amp;vib=3\u0026amp;con=3\u0026amp;usm=15\u0026amp;cs=srgb\u0026amp;bg=F4F4F3\u0026amp;ixlib=js-2.2.1\u0026amp;s=7d494bd5db8acc2a34321c15ed18ace5`.
+ *
+ * @param string $content The raw post content.
+ *
+ * @return string
+ */
+function decode_pattern_content( $content ) {
+	// Sometimes the initial `\` is missing, so look for both versions.
+	return str_replace( [ '\u0026amp;', 'u0026amp;' ], '&', $content );
+}
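
As a quick check of the replacement logic, here is a minimal sketch that runs a standalone copy of decode_pattern_content (without the plugin namespace) against a hypothetical URL containing both malformed variants. The input string is made up for illustration.

```php
<?php
// Standalone copy of the function from the diff, outside the plugin namespace.
function decode_pattern_content( $content ) {
	// Sometimes the initial `\` is missing, so look for both versions.
	return str_replace( [ '\u0026amp;', 'u0026amp;' ], '&', $content );
}

// Hypothetical input: the first separator keeps its backslash, the second lost it.
$malformed = 'https://example.com/a.jpg?w=1200\u0026amp;h=1200u0026amp;fit=clip';

echo decode_pattern_content( $malformed ), "\n";
// https://example.com/a.jpg?w=1200&h=1200&fit=clip
```

The order of the search array matters. str_replace works through the entries in sequence, so the backslash form is consumed first. With the order reversed, the shorter pattern would match inside the longer one and leave a stray `\` before each ampersand.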
20 changes: 0 additions & 20 deletions public_html/wp-content/themes/pattern-directory/functions.php
@@ -213,26 +213,6 @@ function user_has_flagged_pattern() {
 	return $items->have_posts();
 }

-/**
- * Get the full, filtered content of a post, ignoring more and noteaser tags and pagination.
- *
- * See https://github.com/WordPress/wordcamp.org/blob/442ea26d8e6a1b39f97114e933842b1ec4f8eef9/public_html/wp-content/mu-plugins/blocks/includes/content.php#L21
- *
- * @param int|WP_Post $post Post ID or post object.
- * @return string The full, filtered post content.
- */
-function get_all_the_content( $post ) {
-	$post = get_post( $post );
-
-	$content = wp_kses_post( $post->post_content );
-
-	/** This filter is documented in wp-includes/post-template.php */
-	$content = apply_filters( 'the_content', $content );
-	$content = str_replace( ']]>', ']]&gt;', $content );
-
-	return $content;
-}
-
 /**
  * Set up redirects for the site.
  *
@@ -8,43 +8,42 @@
  */

 namespace WordPressdotorg\Pattern_Directory\Theme;
-use function WordPressdotorg\Pattern_Directory\Theme\get_all_the_content;
+use function WordPressdotorg\Pattern_Directory\Pattern_Post_Type\decode_pattern_content;

 get_header();

-$user_has_reported = is_user_logged_in() ? user_has_flagged_pattern() : false;
-$raw_block_content = get_the_content();
-?>
-<input id="block-data" type="hidden" value="<?php echo rawurlencode( wp_json_encode( $raw_block_content ) ); ?>" />
-<main id="main" class="site-main col-12" role="main">
+while ( have_posts() ) :
+the_post();

-<?php
-while ( have_posts() ) :
-the_post();
-?>
+$user_has_reported = is_user_logged_in() ? user_has_flagged_pattern() : false;
+$raw_block_content = decode_pattern_content( get_the_content() );
+?>

-<article id="post-<?php the_ID(); ?>" <?php post_class(); ?>>
-<div
-hidden
-class="pattern__container"
-data-post-id="<?php echo intval( get_the_ID() ); ?>"
-data-user-has-reported="<?php echo json_encode( $user_has_reported ); ?>"
-></div><!-- .pattern__container -->
+<input id="block-data" type="hidden" value="<?php echo rawurlencode( wp_json_encode( $raw_block_content ) ); ?>" />
+<main id="main" class="site-main col-12" role="main">

-<div class="entry-content hide-if-pattern-loaded">
-<?php the_content(); ?>
+<article id="post-<?php the_ID(); ?>" <?php post_class(); ?>>
+<div
+hidden
+class="pattern__container"
+data-post-id="<?php echo intval( get_the_ID() ); ?>"
+data-user-has-reported="<?php echo json_encode( $user_has_reported ); ?>"
+></div><!-- .pattern__container -->

-<hr />
+<div class="entry-content hide-if-pattern-loaded">
+<?php the_content(); ?>

-<label for="pattern-code"><?php esc_html_e( 'Pattern Code', 'wporg-patterns' ); ?></label>
-<textarea id="pattern-code" class="pattern-code"><?php echo esc_attr( $raw_block_content ); ?></textarea>
-</div>
+<hr />

-</article><!-- #post-## -->
+<label for="pattern-code"><?php esc_html_e( 'Pattern Code', 'wporg-patterns' ); ?></label>
+<textarea id="pattern-code" class="pattern-code"><?php echo esc_attr( $raw_block_content ); ?></textarea>
+</div>

-<?php endwhile; ?>
+</article><!-- #post-## -->

 </main><!-- #main -->

-<?php
+<?php
+endwhile;
+
 get_footer();
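
The template writes the decoded block content into the hidden `block-data` input as URL-encoded JSON. The following is a minimal sketch of that encoding, assuming PHP's json_encode as a stand-in for wp_json_encode and a hypothetical Cover block string. How the front-end script reads the value is not shown in this diff, so the decoding step is done in PHP purely to illustrate that the encoding is reversible.

```php
<?php
// Hypothetical block markup with an ampersand in the image URL.
$raw_block_content = '<!-- wp:cover {"url":"https://example.com/a.jpg?w=1200&h=1200"} /-->';

// Same shape as the template: JSON-encode, then percent-encode for the attribute.
// json_encode is used here as a stand-in for wp_json_encode.
$encoded = rawurlencode( json_encode( $raw_block_content ) );

// Reverse the two steps to recover the original string.
$decoded = json_decode( rawurldecode( $encoded ) );

var_dump( $decoded === $raw_block_content );
```

Because rawurlencode percent-encodes quotes, angle brackets, and ampersands, the value is safe inside the attribute without further escaping. Any ampersands that were already malformed before this point would be carried through unchanged, which is why the content is decoded first.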