
Content Processing

All WordPress post content goes through Model/ContentProcessor.php before being rendered. The pipeline has four steps applied in sequence.

Raw WP content (HTML string)
-> Step 1: rewriteInternalLinks()
-> Step 2: addLazyLoading()
-> Step 3: stripShortcodes()
-> Step 4: sanitizeDangerousHtml()
-> Processed content (HTML string)
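
The sequential flow above can be sketched as a loop over string-to-string callables. This is a minimal illustration, not the code in ContentProcessor.php: the helper name runContentPipeline and the two stand-in steps are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pass an HTML string through an ordered list of
 * string-to-string steps, feeding each step the previous step's output.
 *
 * @param callable[] $steps
 */
function runContentPipeline(string $html, array $steps): string
{
    foreach ($steps as $step) {
        $html = $step($html);
    }
    return $html;
}

// Stand-in steps for illustration only; the real methods live in ContentProcessor.
$steps = [
    static fn (string $h): string => str_replace('https://blog.example.com/', '/blog/', $h),
    static fn (string $h): string => preg_replace('/\[\/?\w[^\]]*\]/', '', $h) ?? $h,
];

$processed = runContentPipeline(
    '<a href="https://blog.example.com/my-post">Read more</a>[gallery ids="1,2,3"]',
    $steps
);
```

Because each step receives the previous step's output, the order of the list determines what every later step sees.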

Step 1: rewriteInternalLinks()

Links that point to the configured WordPress URL are rewritten to the Magento blog path (/blog/ in the examples below).

Before:

<a href="https://blog.example.com/my-post">Read more</a>
<a href="https://blog.example.com/2024/01/15/my-post">Old permalink</a>

After:

<a href="/blog/my-post">Read more</a>
<a href="/blog/my-post">Old permalink</a>

The regex matches href attributes pointing to the configured WordPress URL. Date-based permalink prefixes (YYYY/MM/DD/) are stripped from paths before rewriting.
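
A minimal sketch of this kind of rewrite, assuming the WordPress base URL and the Magento blog path are passed in as plain strings. The function name, parameters, and regex are illustrative and may differ from the module's actual implementation.

```php
<?php
declare(strict_types=1);

/**
 * Sketch only: rewrite href attributes on the WordPress host to a blog path.
 * $wpBaseUrl and $blogPath are assumed configuration values.
 */
function rewriteInternalLinksSketch(
    string $html,
    string $wpBaseUrl,
    string $blogPath = '/blog'
): string {
    $base = preg_quote(rtrim($wpBaseUrl, '/'), '~');

    // Group 1: the quote character. Group 2: the path after the base URL.
    // The base must be followed by "/" or the closing quote, so a host such
    // as blog.example.com.evil.test is not treated as internal.
    $pattern = '~href=(["\'])' . $base . '(?:/([^"\']*))?\1~i';

    $rewritten = preg_replace_callback(
        $pattern,
        static function (array $m) use ($blogPath): string {
            $path = $m[2] ?? '';
            // Strip a leading date-based permalink prefix such as 2024/01/15/.
            $path = preg_replace('~^\d{4}/\d{2}/\d{2}/~', '', $path) ?? $path;
            return 'href=' . $m[1] . rtrim($blogPath, '/') . '/' . $path . $m[1];
        },
        $html
    );

    return $rewritten ?? $html;
}
```

Links to any other host are left untouched, since the pattern only matches the quoted base URL.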

Step 2: addLazyLoading()

The attribute loading="lazy" is added to every <img> tag that does not already have a loading attribute.

Before:

<img src="photo.jpg" alt="Photo">
<img src="hero.jpg" alt="Hero" loading="eager">

After:

<img loading="lazy" src="photo.jpg" alt="Photo">
<img src="hero.jpg" alt="Hero" loading="eager">

The regex uses a negative lookahead to skip images that already have a loading attribute.
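
One way such a negative lookahead could look is sketched below. The function name and the exact pattern are assumptions for illustration, not the module's code.

```php
<?php
declare(strict_types=1);

/**
 * Sketch only: add loading="lazy" to <img> tags without a loading attribute.
 */
function addLazyLoadingSketch(string $html): string
{
    // Match "<img" only when no whitespace-preceded "loading=" appears
    // before the tag's closing ">". Requiring \s avoids treating an
    // attribute such as data-loading as a match.
    $pattern = '/<img\b(?![^>]*\sloading\s*=)/i';

    // $0 re-inserts the matched "<img" so the original casing is preserved.
    return preg_replace($pattern, '$0 loading="lazy"', $html) ?? $html;
}
```

Like any regex over HTML, this assumes attribute values do not contain a literal ">" character.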

Step 3: stripShortcodes()

Any shortcodes that WordPress did not expand into HTML are removed. This prevents text like [gallery ids="1,2,3"] or [/caption] from appearing in the rendered output.

The pattern /\[\/?\w[^\]]*\]/ matches both opening ([foo]) and closing ([/foo]) shortcode syntax.
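
The pattern can be tried in isolation with a small wrapper. The function name below is hypothetical; only the regex comes from the text above.

```php
<?php
declare(strict_types=1);

/**
 * Sketch only: remove leftover shortcode tags using the documented pattern.
 */
function stripShortcodesSketch(string $html): string
{
    // \[ opens, \/? allows a closing tag, \w requires a word character,
    // [^\]]* consumes everything up to the first closing bracket.
    return preg_replace('/\[\/?\w[^\]]*\]/', '', $html) ?? $html;
}
```

Two properties follow from the pattern itself: only the bracketed tags are removed, so text between an opening and closing shortcode stays in place, and any other bracketed token that starts with a word character, such as a footnote marker like [1], is removed as well.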

Step 4: sanitizeDangerousHtml()

Four categories of dangerous HTML constructs are removed or neutralized:

| What | Pattern | Action |
| --- | --- | --- |
| Script blocks | <script>...</script> | Remove entire block including content |
| Style blocks | <style>...</style> | Remove entire block including content |
| Event handlers | onclick=, onload=, etc. | Remove attribute |
| JS protocol | href="javascript:..." | Replace with href="#" |

This is a lightweight pass. If your use case requires stricter sanitization (e.g., user-submitted content), consider adding HTMLPurifier as an additional step after this pipeline.
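
The four rules could be approximated with one regex per rule, as in the sketch below. The function name and the patterns are assumptions for illustration; the module's actual expressions may differ.

```php
<?php
declare(strict_types=1);

/**
 * Sketch only: a lightweight, regex-based pass over the four categories.
 * Not a substitute for a real HTML sanitizer.
 */
function sanitizeDangerousHtmlSketch(string $html): string
{
    $rules = [
        // Script and style blocks: drop the tags and everything between them.
        '~<script\b[^>]*>.*?</script\s*>~is' => '',
        '~<style\b[^>]*>.*?</style\s*>~is' => '',
        // Event handlers: on<name>= with a quoted or unquoted value.
        '~\s+on[a-z]+\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)~i' => '',
        // javascript: URLs in href: neutralize by pointing at "#".
        '~href\s*=\s*(["\'])\s*javascript:.*?\1~is' => 'href="#"',
    ];

    foreach ($rules as $pattern => $replacement) {
        $html = preg_replace($pattern, $replacement, $html) ?? $html;
    }

    return $html;
}
```

The event-handler rule is not tag-aware, so visible text that happens to look like an on...= attribute would also be removed. That is one reason a parser-based sanitizer is the safer choice for untrusted input.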

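To add a custom processing step, create a Magento after plugin on ContentProcessor and declare it in your module's di.xml. The plugin's afterProcess() method receives the fully processed HTML and returns the modified string: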

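class MyContentPlugin
{
    /**
     * After plugin for ContentProcessor::process().
     * $result is the HTML returned by the built-in pipeline.
     */
    public function afterProcess(
        ContentProcessor $subject,
        string $result
    ): string {
        // Add custom processing, e.g. swap an old domain for a new one
        return str_replace('old-domain.com', 'new-domain.com', $result);
    }
}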