Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xff: add support for configuring a list of trusted CIDRs #31831

Merged
merged 51 commits into from
Sep 9, 2024

Conversation

jamesog
Copy link
Contributor

@jamesog jamesog commented Jan 15, 2024

Commit Message: xff: add support for configuring a list of trusted CIDRs

The original client IP address can be determined from the x-forwarded-for header either by a fixed number of trusted hops, or by evaluating the client IP address against a list of trusted addresses.

This adds support for configuring a list of CIDRs in the xff original IP detection extension. The remote IP address is evaluated against these, and optionally recurses through XFF to find the last non-trusted address.

Additional Description:
This feature is generally used by people with a CDN in front of their edge proxy to ensure that XFF is only parsed when the remote connection comes from a CDN server.

The behaviour of the new functionality should be the same as Nginx's realip module.

Disclaimer: This is my first time writing C++ so I'm not certain my changes are completely idiomatic, but I've tried to stick with existing style in the codebase. Feedback very welcome!

Risk Level: Medium
Testing: Unit tests, manual tests
Docs Changes: Updates to HTTP Connection Manager header manipulation docs, and proto docs.
Release Notes: Added to changelogs/current.yaml
Platform Specific Features: None
Fixes #21639
Relates to #31296

X-Forwarded-For parsing can either trust a number of hops, as Envoy does
today, or it can trust by IP address.

This enables configuration to add a set of CIDRs to the XFF config, and
a recurse option.

When recurse is not enabled, HCM should trust a remote IP that is
covered by one of the configured CIDRs and use the last IP in XFF as the
remote IP.

When recurse is enabled, if the remote IP is covered by one of the
CIDRs, HCM should also walk backwards through XFF to find the first IP
not covered by the trusted CIDR list.

Signed-off-by: James O'Gorman <[email protected]>
remoteAddressIsTrustedProxy checks the remote IP against a list of
CIDRs. If it is covered by any CIDR the remote is considered a trusted
proxy.

getLastNonTrustedAddressFromXFF walks backwards through XFF to find the
first IP that is not covered by a list of trusted CIDRs. This IP is
considered the "true" remote IP.

Signed-off-by: James O'Gorman <[email protected]>
If a list of trusted CIDRs is set, rather than checking for a set number
of trusted hops, instead check the remote IP against the trusted CIDR
list.

If recurse is set, XFF is also evaluated to find the last IP in the list
that is not covered by the trusted CIDRs.

Signed-off-by: James O'Gorman <[email protected]>
Copy link

Hi @jamesog, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #31831 was opened by jamesog.

see: more, trace.

Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #31831 was opened by jamesog.

see: more, trace.

Signed-off-by: James O'Gorman <[email protected]>
Signed-off-by: James O'Gorman <[email protected]>
Override xff_num_trusted_hops in the constructor instead of validating
in the proto message when xff_trusted_cidrs is set.

Signed-off-by: James O'Gorman <[email protected]>
Signed-off-by: James O'Gorman <[email protected]>
Signed-off-by: James O'Gorman <[email protected]>
Signed-off-by: James O'Gorman <[email protected]>
Signed-off-by: James O'Gorman <[email protected]>
@markdroth
Copy link
Contributor

/lgtm api

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So sorry for the review delay. Thanks for the well documented and tested first pass - my thoughts below.
/wait

@@ -22,5 +26,37 @@ message XffConfig {
// determining the origin client's IP address. The default is zero if this option
// is not specified. See the documentation for
// :ref:`config_http_conn_man_headers_x-forwarded-for` for more information.
//
// If ``xff_trusted_cidrs`` is configured, this field is not used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we disallow it being set and reject config if it is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had this change as using a oneof encapsulating the original xff_num_trusted_hops and the new xff_trusted_cidrs fields, but Mark told me that would be a breaking API change and to do it this way instead. Do you know if there's another way to handle rejecting it without it being a breaking change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to disallow if both the old and new fields are set, since that makes it hard for control planes to start using this new feature.

If we reject when both fields are set, then a control plane must know whether any given client supports the new field. If it sets only the new field, old clients will not work properly, and if it sets both, new clients will break. So the only way to transition from the old field to the new field is to either (a) wait for all clients to be upgraded before starting to use the new field or (b) have some way for the control plane to know which client supports the new field and which clients support the old field so that it can send the right config to each one -- and doing that in a way that does not break cacheability requires some dances with dynamic parameter support, which means that the clients have to be configured to tell the control plane whether or not they support the new field.

In contrast, if we allow both fields to be set and specify the precedence order, then this becomes much easier. Control planes can simply set both fields. Old clients will use the old field, and new clients will prefer the new field.

I think the latter approach is much cleaner and should be preferred any time we add this kind of new field in xDS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about the change I made since Alyssa's feedback? d976354

She suggested it should either log or reject; I chose to reject by throwing an error with a message that clearly informs that only one field can be set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm saying is that I don't think that's the behavior we want. I think the way you had this originally was the right thing to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I'm getting conflicting advice from two different maintainers :-)

@alyssawilk perhaps you can help decide which way I should take this based on Mark's feedback?

//
// If ``recurse`` is set the :ref:`config_http_conn_man_headers_x-forwarded-for`
// header is also evaluated against the trusted CIDR list, and the last non-trusted
// address is used as the original client address.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooc when would we not want to recurse?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question! I added some commentary to the docs in 1f76584.

}
return {addr, true};
}
return {nullptr, false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if every address in the XFF is trusted don't we want to return the last address in the list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one made me think! Originally I was assuming this might be some kind of undefined behaviour and so it was safer to return null, but after talking about this with a colleague, and looking over what other proxy servers do, I implemented your suggestion in a016a6c. Thanks for questioning my logic :-)

When parsing XFF using a list of trusted CIDRs, if all IP in XFF
match the trusted CIDRs (there are no untrusted IPs), return the first
IP in XFF (the last evaluated) - and a test to confirm the behaviour.

Signed-off-by: James O'Gorman <[email protected]>
@jamesog

This comment was marked as outdated.

@alyssawilk alyssawilk removed their assignment Aug 20, 2024
@adisuissa
Copy link
Contributor

/retest

Comment on lines 57 to 60
// For proper proxy behaviour it is not recommended to set this option.
// For backwards compatibility, if this option is unset and
// ``xff_num_trusted_hops`` is set, it defaults to true. When
// ``xff_trusted_cidrs`` is set, it defaults to false.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different default values only brings confusion. Let's only use a default value with true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the related conversation below: #31831 (comment)

@alyssawilk seemed half inclined to go with this split default, but let me know if you still think we should really keep one default.

true is a bad default, to be honest, but I also don't want to break any existing users of xff_num_trusted_hops.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From API's perspective, explicit and predictable behavior is always better. Different default values is always not recommend for our API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I've updated this to default to true.

Comment on lines 760 to 769
bool Utility::remoteAddressIsTrustedProxy(
const Envoy::Network::Address::InstanceConstSharedPtr& remote,
const std::vector<Network::Address::CidrRange> trusted_cidrs) {
for (const auto& cidr : trusted_cidrs) {
if (cidr.isInRange(*remote.get())) {
return true;
}
}
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const Envoy::Network::Address::Instance& remote: const reference to shared pointer here make no sense.
absl::Span<const Network::Address::CidrRange>: to avoid copy whole vector. (or vector reference should be used.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, these have been updated.

Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. Some comments are added.


Utility::GetLastAddressFromXffInfo Utility::getLastNonTrustedAddressFromXFF(
const Http::RequestHeaderMap& request_headers,
const std::vector<Network::Address::CidrRange> trusted_cidrs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto. Please use span or vector reference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks.

static constexpr absl::string_view separator(",");
static constexpr size_t MAX_ALLOWED_XFF_ENTRIES = 20;

const auto xff_entries = StringUtil::splitToken(xff_header, separator, false, true);
Copy link
Member

@wbpcode wbpcode Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which type is used here to store the splitted tokens? Seems auto is not appropriate here. Maybe you should specify a type here. InlineVector<absl::string_view, 8> InlinedVector<absl::string_view, 8> will be recommend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. I used std::vector<absl::string_view> as this seemed to be the style elsewhere and I couldn't find any reference of InlineVector.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absl::InlinedVector provide better performance when we only has limited hops (in most cases?). And sorry for the spelling error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I've updated to use InlinedVector now.

Comment on lines 17 to 20
if (config.has_xff_trusted_cidrs() && config.xff_num_trusted_hops() > 0) {
ENVOY_LOG(warn, "Both xff_num_trusted_hops and xff_trusted_cidrs are configured; only "
"xff_trusted_cidrs will be used");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be safe to reject the configuration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had previously tried to do that by throwing, but this causes failures for Mobile. Is there another way to reject the configuration?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can should use throwExceptionOrPanic method for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is panicking acceptable for Mobile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the throwExceptionOrPanic. Hopefully this won't cause the Mobile tests to fail this time. Thanks for the pointer.

@wbpcode
Copy link
Member

wbpcode commented Aug 21, 2024

/wait

@jamesog
Copy link
Contributor Author

jamesog commented Aug 23, 2024

/retest

@wbpcode
Copy link
Member

wbpcode commented Aug 31, 2024

Sorry for the delay. Some times I may ignore the notification from tools. It's OK to ping me directly on slack.

@wbpcode
Copy link
Member

wbpcode commented Aug 31, 2024

/wait

The API should provide a consistent default value, and we cannot set
this to false without causing a breaking change. Instead we have a
warning in the API docs recommending to disable this option.

Signed-off-by: James O'Gorman <[email protected]>
Signed-off-by: James O'Gorman <[email protected]>
This reverts c69df1c and changes to use
throwEnvoyExceptionOrPanic, which will cause a config error in Envoy,
and a panic in Envoy Mobile.

Signed-off-by: James O'Gorman <[email protected]>
Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great contribution. Only last comment, I think.

Comment on lines 760 to 761
bool Utility::remoteAddressIsTrustedProxy(
const Envoy::Network::Address::InstanceConstSharedPtr remote,
Copy link
Member

@wbpcode wbpcode Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See, basically we only use the smart point as input parameter when we wants to change the ownership. But in this case, if we ensure this remote always be valid, then const Envoy::Network::Address::Instance& should be used.

If we can ensure this remote always be valid, then OptRef<const Envoy::Network::Address::Instance> should be used.

I think there should be the first case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wait

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understood you correctly. This is getting beyond my knowledge here so I wasn't quite sure. Please see the latest commit that now uses const Network::Address:Instance&.

Signed-off-by: James O'Gorman <[email protected]>
Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

@repokitteh-read-only repokitteh-read-only bot removed the api label Sep 9, 2024
@wbpcode wbpcode merged commit fbc6ee2 into envoyproxy:main Sep 9, 2024
39 of 40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Configurable filter of trusted proxies to handle x-forwarded-for header
8 participants