Add RewriteAmpUrls transformer #119

Merged
merged 62 commits into from
Apr 9, 2021

Collaborator

@schlessera schlessera commented Apr 1, 2021

This PR adds the RewriteAmpUrls transformer that enables the following features:

  • ES modules support (enabled by default)
  • LTS support
  • Custom prefix support
  • Custom runtime version support
  • Custom geo-api URL

This PR also moves some code from AMP_DOM_Utils into the library to enable the library to use copyAttributes().

This PR also includes changes to the ReorderHead transformer to make it work properly with module/nomodule script combinations.

Fixes #20

@schlessera schlessera added this to the 0.4.0 milestone Apr 1, 2021
        return false;
    }

    return strpos($url, Amp::CACHE_HOST) === 0;
Member

Minor thing, but in shouldPreload it's using substr_compare(). Would that not be minimally better than using strpos() here?

Member

Done in e0dd638
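
For illustration, a prefix check based on substr_compare() might look like the sketch below. The helper name `startsWithPrefix` and the `CACHE_HOST` constant are hypothetical stand-ins, and the host value is assumed from the cdn.ampproject.org URLs quoted elsewhere in this thread; this is not the code from commit e0dd638.

```php
<?php
// Assumed value for illustration only; the library's Amp::CACHE_HOST may differ.
const CACHE_HOST = 'https://cdn.ampproject.org';

// Hypothetical helper: true when $url begins with $prefix.
function startsWithPrefix(string $url, string $prefix): bool
{
    // A URL shorter than the prefix can never start with it.
    if (strlen($url) < strlen($prefix)) {
        return false;
    }

    // Compare only the first strlen($prefix) bytes instead of searching
    // the whole string for the prefix, as strpos() would.
    return substr_compare($url, $prefix, 0, strlen($prefix)) === 0;
}

var_dump(startsWithPrefix('https://cdn.ampproject.org/v0.mjs', CACHE_HOST));
var_dump(startsWithPrefix('https://example.com/v0.mjs', CACHE_HOST));
```

The practical difference from strpos() is that strpos() keeps scanning a non-matching string for a later occurrence of the prefix, while substr_compare() only looks at the leading bytes.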

schlessera and others added 2 commits April 6, 2021 08:14
     * @param string $name Name of the part to return.
     * @return string|null Part string or null if it was not found during parsing.
     */
    public function __get($name)
Member

Eventually we should implement __set() as well, since it would be very useful to be able to change out just the host, when rewriting a URL to use the AMP Cache for example.

Either this, or make the member variables public.

Collaborator Author

I'd recommend against that. This is supposed to behave like a value object, so it should be immutable to avoid bugs.

It could have a method to retrieve a new clone with a different host, though:

$cachedUrl = $url->withHost($cachedHost);

The important part is that the original object is not changed (as that would change it in all the places it was passed around), but rather we return a new instance with the adapted host.
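
A minimal sketch of that idea, assuming a simplified stand-in class: `SimpleUrl`, its constructor, and the example host names are hypothetical and are not the library's actual Url class. Only the `withHost()` call shape is taken from the comment above.

```php
<?php
// Hypothetical, simplified value object used only to illustrate withHost().
final class SimpleUrl
{
    private $scheme;
    private $host;
    private $path;

    public function __construct(string $scheme, string $host, string $path)
    {
        $this->scheme = $scheme;
        $this->host   = $host;
        $this->path   = $path;
    }

    // Returns a modified copy; the instance it is called on stays untouched.
    public function withHost(string $host): self
    {
        $clone       = clone $this;
        $clone->host = $host;

        return $clone;
    }

    public function __toString(): string
    {
        return "{$this->scheme}://{$this->host}{$this->path}";
    }
}

$url       = new SimpleUrl('https', 'example.com', '/article/');
$cachedUrl = $url->withHost('example-com.cdn.ampproject.org');

echo $url, PHP_EOL;       // the original still points at example.com
echo $cachedUrl, PHP_EOL; // the copy carries the new host
```

Because no setter exists, any code that received `$url` earlier keeps seeing the same host, which is the bug-avoidance property described above.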

@@ -270,7 +270,7 @@ private function detectImageWithAttribute(Element $element, $attribute)
         }

         $src = $element->getAttribute(Attribute::SRC);
-        if ($element->tagName === Extension::IMG && Url::isValidNonDataUrl($src)) {
+        if ($element->tagName === Extension::IMG && (new Url($src))->isValidNonDataUrl()) {
Member

Now that Url can throw an exception, should this method indicate that it "@throws FailedToParseUrl Exception when the URL or Base URL is malformed."? Should the exception get added to $errors?

Comment on lines +274 to +276
        if (!empty($url->scheme) && !empty($url->host)) {
            $origin = "{$url->scheme}://{$url->host}";
            $this->addMeta($document, 'runtime-host', $origin);
Member

What about scheme-relative URLs? In that case, $url->scheme would be empty and the $origin would be "//{$url->host}". I guess that's not supported, and we'd want an explicit HTTPS URL anyway.

Collaborator Author

That would fail the first check (!empty($url->scheme)) and generate an error. Looks correct to me.

Member

I mean, if such URLs should be supported.

Collaborator Author

Not sure what the requirements of the runtime are here. @sebastianbenz ?

Weston is right, we want an explicit https URL.
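
To make the scheme-relative case concrete, the sketch below uses PHP's built-in parse_url() rather than the library's Url class. The helper name `httpsOriginOrNull` is hypothetical, and it is deliberately stricter than the excerpt under review: it rejects any scheme other than https, which is an assumption based on the comment above rather than the merged behavior.

```php
<?php
// Hypothetical helper: returns "https://<host>" for an explicit HTTPS URL,
// or null for scheme-relative, non-HTTPS, or unparsable input.
// Ports are ignored here to keep the sketch short.
function httpsOriginOrNull(string $url): ?string
{
    $parts = parse_url($url);

    // parse_url() returns false for seriously malformed URLs; for a
    // scheme-relative URL such as "//example.com/" it returns a host
    // but no scheme.
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return null;
    }

    if (strtolower($parts['scheme']) !== 'https') {
        return null;
    }

    return "https://{$parts['host']}";
}

var_dump(httpsOriginOrNull('https://example.com/amp/'));
var_dump(httpsOriginOrNull('//example.com/amp/'));
```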

Comment on lines +89 to +94
    /**
     * Query string.
     *
     * @var string|null
     */
    private $query;
Member

Eventually it will be nice to add a way to access/manipulate the query vars as an array.

Collaborator Author

That would be powered by a separate object, so that you'd be able to use it as part of a URL, or stand-alone when dealing with WP query vars, for example.
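
One possible shape for such a stand-alone object is sketched below. The class name `QueryVars` and its methods are hypothetical; nothing like this is part of the PR. It relies only on PHP's built-in parse_str() and http_build_query().

```php
<?php
// Hypothetical immutable wrapper around query variables.
final class QueryVars
{
    private $vars;

    private function __construct(array $vars)
    {
        $this->vars = $vars;
    }

    // Builds the object from a raw query string, with or without a leading "?".
    // Note: parse_str() rewrites dots and spaces in variable names to underscores.
    public static function fromString(string $query): self
    {
        parse_str(ltrim($query, '?'), $vars);

        return new self($vars);
    }

    // Returns a new instance with one variable added or replaced.
    public function with(string $name, string $value): self
    {
        $vars        = $this->vars;
        $vars[$name] = $value;

        return new self($vars);
    }

    public function toArray(): array
    {
        return $this->vars;
    }

    public function __toString(): string
    {
        return http_build_query($this->vars, '', '&', PHP_QUERY_RFC3986);
    }
}

$query   = QueryVars::fromString('amp=1&page=2');
$updated = $query->with('page', '3');

echo $query, PHP_EOL;
echo $updated, PHP_EOL;
```

Keeping it separate from the URL object means the same class could wrap a bare query string, which matches the stand-alone use mentioned above.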

Comment on lines +315 to +339
    private function calculateHost()
    {
        $lts = $this->configuration->get(RewriteAmpUrlsConfiguration::LTS);
        $rtv = $this->configuration->get(RewriteAmpUrlsConfiguration::RTV);

        if ($lts && $rtv) {
            throw InvalidConfiguration::forMutuallyExclusiveFlags(
                RewriteAmpUrlsConfiguration::LTS,
                RewriteAmpUrlsConfiguration::RTV
            );
        }

        $ampUrlPrefix = $this->configuration->get(RewriteAmpUrlsConfiguration::AMP_URL_PREFIX);
        $ampRuntimeVersion = $this->configuration->get(RewriteAmpUrlsConfiguration::AMP_RUNTIME_VERSION);

        $ampUrlPrefix = rtrim($ampUrlPrefix, '/');

        if ($ampRuntimeVersion && $rtv) {
            $ampUrlPrefix = RuntimeVersion::appendRuntimeVersion($ampUrlPrefix, $ampRuntimeVersion);
        } elseif ($lts) {
            $ampUrlPrefix .= '/lts';
        }

        return $ampUrlPrefix;
    }
Member

I just realized that the rtv and ampRuntimeVersion are both empty. Instead of seeing the AMP runtime loaded with https://cdn.ampproject.org/rtv/012103261048002/v0.mjs it comes out rtv-less: https://cdn.ampproject.org/v0.mjs

However, if I hack this to force $rtv to be true and $ampRuntimeVersion to be 012103261048002, then the result is invalid AMP:

The attribute 'src' in tag 'amphtml module engine script' is set to the invalid value 'https://cdn.ampproject.org/rtv/012103261048002/v0.mjs'

I guess versioned URLs are not yet valid AMP?

Member

Only lts is valid. RTV-specific scripts are not yet valid: https://github.com/ampproject/amphtml/blob/52407d1c72c5e2051fda427270b9503ae9900ebf/validator/validator-main.protoascii#L3109-L3256

I think this is being tracked by ampproject/amphtml#27546, although this is about self-hosting and not allowing versioned script URLs. @sebastianbenz is that right?

Correct, there are no plans to support RTV versions served from cdn.ampproject.org.
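
The three outcomes discussed in this thread (plain prefix, lts, and rtv) can be reproduced with a stand-alone sketch of the branch logic. The function `calculateHostSketch` is hypothetical and takes plain arguments instead of the configuration object; the `/rtv/<version>` path format is assumed from the URL quoted above, not from RuntimeVersion::appendRuntimeVersion() itself, and the built-in InvalidArgumentException stands in for InvalidConfiguration.

```php
<?php
// Hypothetical stand-alone version of the host calculation.
function calculateHostSketch(string $prefix, bool $lts, bool $rtv, string $runtimeVersion = ''): string
{
    if ($lts && $rtv) {
        // Stand-in for InvalidConfiguration::forMutuallyExclusiveFlags().
        throw new InvalidArgumentException('The lts and rtv flags are mutually exclusive.');
    }

    $prefix = rtrim($prefix, '/');

    if ($rtv && $runtimeVersion !== '') {
        // Assumed format: <prefix>/rtv/<version>.
        return "{$prefix}/rtv/{$runtimeVersion}";
    }

    if ($lts) {
        return "{$prefix}/lts";
    }

    // With neither flag set (or no version known), the URL stays unversioned.
    return $prefix;
}

echo calculateHostSketch('https://cdn.ampproject.org/', false, false), PHP_EOL;
echo calculateHostSketch('https://cdn.ampproject.org/', true, false), PHP_EOL;
echo calculateHostSketch('https://cdn.ampproject.org/', false, true, '012103261048002'), PHP_EOL;
```

Per the replies above, only the first two results would yield valid AMP when served from cdn.ampproject.org; the rtv form fails validation.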

…into add/20-rewrite-amp-urls-transformer

* 'main' of https://github.com/ampproject/amp-toolbox-php:
  Sync spec test suite - 2021-04-09
  Sync local fallback files - 2021-04-09
  Only construct preload if it is needed
  Sync local fallback files - 2021-04-08
  Replace phpcs exclude-pattern with explicit file includes
  Delete .phpunit.result.cache
  Sync local fallback files - 2021-04-07
  moving the curl_errno() call up, otherwise won't be reached if the response body is empty
  Calling curl_errno() before closing the conextion trhows an error
  Sync local fallback files - 2021-04-04
  Sync local fallback files - 2021-04-03
  Fix SSR for nested AMP components
  Sync spec test suite - 2021-04-02
  Sync local fallback files - 2021-04-02
  Sync local fallback files - 2021-04-01
@schlessera schlessera merged commit 6df0c85 into main Apr 9, 2021
@schlessera schlessera deleted the add/20-rewrite-amp-urls-transformer branch April 9, 2021 15:15

Successfully merging this pull request may close these issues.

Extend Optimizer with transformer for module/nomodule scripts (ESM)
3 participants