pasta: figure out how to deal with /etc/{hosts,resolv.conf} entries #19213

Closed
Luap99 opened this issue Jul 12, 2023 · 41 comments · Fixed by #23791
Labels: kind/bug, pasta, stale-issue

Comments

@Luap99
Member

Luap99 commented Jul 12, 2023

There are some basic problems with our hosts and resolv.conf handling when the pasta network mode is used.

For resolv.conf, unless custom dns servers are specified via config or cli, podman will read your host's resolv.conf and add its entries to the container's resolv.conf, except that we filter out localhost addresses because they will not be reachable from within the container netns.
Because pasta by default uses no NAT and reuses the host ip, this causes the side effect that the host ip in the host netns is not reachable either. This means that if you have your host ip in resolv.conf by default, then podman adds an entry that is actually not reachable. This is a bad user experience, so we should figure out how to best handle this.
This also gets more complicated depending on what pasta options the user specifies (--map-gw, --address, --gateway). They will all result in some NAT for the container and may make certain addresses unavailable.
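
To make the resolv.conf problem concrete, here is a minimal sketch in Python (not podman's actual Go code). The function name and the idea of passing in the pasta ip are assumptions for illustration; it only shows which host nameservers would be unreachable from a container that shares the host ip without NAT.

```python
import ipaddress


def split_nameservers(resolv_conf, pasta_ip):
    """Split host nameservers into (usable, unreachable) for a pasta container.

    pasta_ip is the address pasta copied from the host into the netns; with the
    default no-NAT setup, traffic to it never leaves the container.
    """
    shared = ipaddress.ip_address(pasta_ip)
    usable, unreachable = [], []
    for line in resolv_conf.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue  # comments, search/options lines, blank lines
        try:
            addr = ipaddress.ip_address(fields[1])
        except ValueError:
            continue  # skip malformed entries rather than guess
        if addr.is_loopback or addr == shared:
            unreachable.append(str(addr))
        else:
            usable.append(str(addr))
    return usable, unreachable
```

With a host resolv.conf listing 127.0.0.53, the host's own ip, and the router, only the router entry would end up in the usable list.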

For /etc/hosts it is not as problematic, but there is still the host.containers.internal entry, which should point to the host ip. Right now this is just the first non-localhost ip we find. By default this will very often be the same ip used by pasta, so then again you would not actually reach the host but stay in the container netns. Many applications use host.containers.internal to connect to services running on the host, so with pasta this will not work. Again, this may also be impacted by the pasta option --address.

We need to figure out what pasta options affect this behavior and how to deal with that accordingly. This is something we should address before making pasta the default, as this has a high chance of causing regressions if we do not deal with it correctly.

@Luap99 Luap99 added kind/bug Categorizes issue or PR as related to a bug. pasta pasta(1) bugs or features labels Jul 12, 2023
@Luap99
Member Author

Luap99 commented Jul 12, 2023

cc @sbrivio-rh @dgibson

@dgibson
Collaborator

dgibson commented Jul 13, 2023

@Luap99 a couple of background queries to help me understand the problem better.

  1. Was there a specific rationale for invoking pasta by default with --no-map-gw? With that option, there's pretty fundamentally no way to access the host from the container. We hope to make that a bit more flexible with the future forwarding model we have in mind, but that might be a while before it's implemented.

  2. Does podman have the infrastructure to allocate IP addresses (from some private network), or did it always rely on other components for that? If so we should be able to re-use that along with the DNS specific NAT options to handle resolv.conf and name resolution. But we need to get an IP from somewhere, and pasta doesn't have enough view of the surrounding network to really do so. This is kind of the inevitable tradeoff for avoiding NAT in most cases.

@Luap99
Member Author

Luap99 commented Jul 13, 2023

Was there a specific rationale for invoking pasta by default with --no-map-gw? With that option, there's pretty fundamentally no way to access the host from the container. We hope to make that a bit more flexible with the future forwarding model we have in mind, but that might be a while before it's implemented.

I think my concern was (still is) that a container must never have access to processes listening on 127.0.0.1 on the host ns, at least by default. That decision requires user opt in (i.e. allow_host_loopback=true for slirp4netns). As I understand in pasta by default the gateway ip is mapped to localhost on the host so it bypasses that guarantee. If you could map it to the actual host ip then I would not have any problems with it because this one can be accessed by all the other network modes as well. But then keep in mind that means we can no longer connect to the actual network gw and at least on common home network setups the home router will set itself as dns server which means a lot of users would be hit by this problem.

Does podman have the infrastructure to allocate IP addresses (from some private network), or did it always rely on other components for that? If so we should be able to re-use that along with the DNS specific NAT options to handle resolv.conf and name resolution. But we need to get an IP from somewhere, and pasta doesn't have enough view of the surrounding network to really do so. This is kind of the inevitable tradeoff for avoiding NAT in most cases.

Exactly, that is the problem. For rootful we just assume 10.88.0.0/16 is free (if not, a user has to change it in the config manually); with slirp4netns it uses their default of 10.0.2.0/24 (can also be changed in the config). Both are obviously not great, as if you already use those subnets it will not work out of the box.
And now that I think of it, the resolv.conf problem with potentially adding ips that are not reachable would exist there too.

With pasta we have the unique advantage that we only lose a single ip, which is much better and I love that. Certainly we could just define a specific ip and I assume we could set --dns-forward by default to implement that? If we assign an ip, we must keep backwards compatibility in mind. But I think we could just pick an ip from a reserved range such as 169.254.0.0/16 which should not cause problems for users?

@dgibson
Collaborator

dgibson commented Jul 19, 2023

Was there a specific rationale for invoking pasta by default with --no-map-gw? With that option, there's pretty fundamentally no way to access the host from the container. We hope to make that a bit more flexible with the future forwarding model we have in mind, but that might be a while before it's implemented.

I think my concern was (still is) that a container must never have access to processes listening on 127.0.0.1 on the host ns, at least by default. That decision requires user opt in (i.e. allow_host_loopback=true for slirp4netns). As I understand in pasta by default the gateway ip is mapped to localhost on the host so it bypasses that guarantee.

Your understanding is correct, so, yes, that constraint absolutely rules out map-gw in its present form.

If you could map it to the actual host ip then I would not have any problems with it because this one can be accessed by all the other network modes as well.

So, we want to allow that, but it's harder than it sounds. There are, alas, some assumptions about where things are mapped that influence how the port tracking stuff in the UDP code works. Sorting that out is definitely planned, but it's not that easy.

But then keep in mind that means we can no longer connect to the actual network gw and at least on common home network setups the home router will set itself as dns server which means a lot of users would be hit by this problem.

Right.

Does podman have the infrastructure to allocate IP addresses (from some private network), or did it always rely on other components for that? If so we should be able to re-use that along with the DNS specific NAT options to handle resolv.conf and name resolution. But we need to get an IP from somewhere, and pasta doesn't have enough view of the surrounding network to really do so. This is kind of the inevitable tradeoff for avoiding NAT in most cases.

Exactly, that is the problem. For rootful we just assume 10.88.0.0/16 is free (if not, a user has to change it in the config manually); with slirp4netns it uses their default of 10.0.2.0/24 (can also be changed in the config). Both are obviously not great, as if you already use those subnets it will not work out of the box. And now that I think of it, the resolv.conf problem with potentially adding ips that are not reachable would exist there too.

In the short to medium term, my inclination here would be to allocate a fake DNS server from the 10.88.0.0/16 range, and pass that to the --dns-forward option. Obviously that can break down if that subnet is in use, but it seems like that's probably a better option than preventing either the host, or the (real) local gateway from being the DNS server.

With pasta we have the unique advantage that we only lose a single ip, which is much better and I love that. Certainly we could just define a specific ip and I assume we could set --dns-forward by default to implement that? If we assign an ip, we must keep backwards compatibility in mind. But I think we could just pick an ip from a reserved range such as 169.254.0.0/16 which should not cause problems for users?

Well... it depends what "reserved" means, exactly. Obviously 10.0.0.0/8 or 192.168.0.0/16 can fail easily if those are used for a private network on the host. Something like 192.0.2.0/24 would probably work in practice, but not if you're trying to run this inside an example environment already using that range - and it's not really what RFC 5737 says you should use it for. Most of the other reserved ranges have similar issues.

The link local range, 169.254.0.0/16, specifically is an interesting case. Because it's link-local we can potentially use it safely even if it's also in use on the host side. This then comes down to a general question of how to handle link local addresses (both IPv4 and IPv6) in pasta. One option is to treat the "link" as purely between the guest/container and pasta, in which case we can freely assign and use link-local addresses - but it means anything only accessible to the host via link-local addressing is not accessible to the guest/container. Another is for pasta to act as though it's a window out onto one of the host's link-local spaces. At present, we're a bit of an unholy mix of the two. My long term plan is to allow either of these options - there's actually a bunch of other curly edge cases where it becomes clearer what to do when we explicitly choose one of these two options. But, again, that will require a fair bit of work to reach.
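
As a rough illustration of the range discussion above, the following Python sketch checks a candidate subnet against the networks a host already uses. The helper name and the example host networks are made up for illustration; only the standard library `ipaddress` module is used.

```python
import ipaddress


def overlapping_host_networks(candidate, host_networks):
    """Return the host networks that overlap the candidate subnet."""
    cand = ipaddress.ip_network(candidate)
    return [n for n in host_networks if cand.overlaps(ipaddress.ip_network(n))]


# Example host with a 10.0.0.0/8 VPN and a home LAN (made-up values).
host = ["10.0.0.0/8", "192.168.1.0/24"]
print(overlapping_host_networks("10.88.0.0/16", host))    # ['10.0.0.0/8']
print(overlapping_host_networks("169.254.0.0/16", host))  # []
print(ipaddress.ip_network("169.254.0.0/16").is_link_local)  # True
```

An empty overlap list only says the range is free on this particular host; it does not settle the question of how pasta should treat link-local addresses.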

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Aug 22, 2023

@Luap99 Any update on this?

@Luap99
Member Author

Luap99 commented Aug 23, 2023

No

@github-actions

github-actions bot commented Oct 4, 2023

A friendly reminder that this issue had no activity for 30 days.

@Luap99
Member Author

Luap99 commented Mar 5, 2024

@mheon @dgibson @sbrivio-rh I lost track of this one. I think we need to take a look here again and figure out how to fix this in the best way.

For /etc/hosts it is not as problematic, but there is still the host.containers.internal entry, which should point to the host ip. Right now this is just the first non-localhost ip we find. By default this will very often be the same ip used by pasta, so then again you would not actually reach the host but stay in the container netns. Many applications use host.containers.internal to connect to services running on the host, so with pasta this will not work. Again, this may also be impacted by the pasta option --address.

This is still an issue. Today we already look up the pasta ip by checking the interface inside the netns, so one easy thing to do would be to keep looking for another ip on the host if it is the same as the one pasta uses. However, this only works if a system has more than one non-localhost ip address, which may not be common. Keep in mind that we pass --no-map-gw to pasta so using the gw address is not possible, and even if we would like to use it, it is not suitable for us as it remaps to localhost on the host, which we consider insecure and as such is a non-starter.
So what we would ideally need from pasta is the possibility to remap some ip in the container netns to the host ip that pasta uses and that should not connect to localhost.
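
The "keep looking for another ip" idea could look roughly like the Python sketch below. It is not podman's implementation; the function name and inputs are hypothetical, and it returns None exactly in the single-ip case described above where no suitable address exists.

```python
import ipaddress


def pick_host_containers_internal(host_ips, pasta_ip):
    """Return the first host ip that is neither loopback nor the pasta ip."""
    shared = ipaddress.ip_address(pasta_ip)
    for candidate in host_ips:
        addr = ipaddress.ip_address(candidate)
        if addr.is_loopback or addr == shared:
            continue  # would stay inside the container netns
        return str(addr)
    return None  # host has no second non-localhost ip
```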

I am not sure where the passt/pasta work on this stands, but I think you were talking about working on some form of generic ip remapping. So maybe this is something we could implement today? It is totally fine if podman needs to pass a new option.

And then if we can do that, we could reuse it if the nameserver is the host ip, because then podman could just write the ip we used to remap to the container's resolv.conf and it should just work.

The alternative of course is that podman could pass --address,--gateway with the old slirp4netns addresses to force NAT but I don't think this is what any of us want.

@sbrivio-rh
Collaborator

So what we would ideally need from pasta is the possibility to remap some ip in the container netns to the host ip that pasta uses and that should not connect to localhost.

Can --dns-forward ADDR help? You would tell pasta to map ADDR to the first configured resolver, where ADDR should match whatever Podman configures in the container's /etc/resolv.conf.
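
A small Python sketch of what keeping the two sides in sync could mean: the same ADDR is used both for the `--dns-forward` argument and for the nameserver line in the container's resolv.conf. The helper name is hypothetical and the address in the usage note is an arbitrary example.

```python
import ipaddress


def dns_forward_setup(addr):
    """Build the pasta argument and the matching resolv.conf content."""
    ipaddress.ip_address(addr)  # raises ValueError on a malformed address
    pasta_args = ["--dns-forward", addr]
    resolv_conf = "nameserver {}\n".format(addr)
    return pasta_args, resolv_conf
```

For example, `dns_forward_setup("192.168.0.1")` yields the pair `--dns-forward 192.168.0.1` and `nameserver 192.168.0.1`.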

I am not sure where the passt/pasta work on this stands, but I think you were talking about working on some form of generic ip remapping. So maybe this is something we could implement today? It is totally fine if podman needs to pass a new option.

This is still work in progress, I think we're quite far from having options that could be used now, unless @dgibson sees a way to do that.

@Luap99
Member Author

Luap99 commented Mar 5, 2024

So what we would ideally need from pasta is the possibility to remap some ip in the container netns to the host ip that pasta uses and that should not connect to localhost.

Can --dns-forward ADDR help? You would tell pasta to map ADDR to the first configured resolver, where ADDR should match whatever Podman configures in the container's /etc/resolv.conf.

This option only handles dns remapping so it does not fix the generic host.containers.internal issue.
Also, because we use --no-map-gw, this will not work when there are only localhost resolvers (systemd-resolved). Or well, that is at least what I think from reading the code, because it prints "Couldn't get any nameserver address"; however, it seems to work in this case, which totally confuses me.

$ grep nameserver /etc/resolv.conf 
nameserver 127.0.0.53
$ pasta --config-net --no-map-gw  --dns-forward 192.168.0.1 nslookup google.com 192.168.0.1
No routable interface for IPv6: IPv6 is disabled
Couldn't get any nameserver address
Server:		192.168.0.1
Address:	192.168.0.1#53

Non-authoritative answer:
Name:	google.com
Address: 216.58.212.142
Name:	google.com
Address: 2a00:1450:4001:82a::200e


So it is certainly an option to fix some of the problems, however it has the same problem in that it maps to 127.0.0.1, so it will not work in the case where the host ip is used as nameserver and only listens on that ip (e.g. eth0) but not on localhost.

@sbrivio-rh
Collaborator

So what we would ideally need from pasta is the possibility to remap some ip in the container netns to the host ip that pasta uses and that should not connect to localhost.

Can --dns-forward ADDR help? You would tell pasta to map ADDR to the first configured resolver, where ADDR should match whatever Podman configures in the container's /etc/resolv.conf.

This option only handles dns remapping so it does not fix the generic host.containers.internal issue. Also, because we use --no-map-gw, this will not work when there are only localhost resolvers (systemd-resolved). Or well, that is at least what I think from reading the code, because it prints "Couldn't get any nameserver address"; however, it seems to work in this case, which totally confuses me.

I have to admit it confuses me as well. It might be a side effect of commit bad252687271 ("conf, udp: Allow any loopback address to be used as resolver"). I need to look into this a bit further.

So it is certainly an option to fix some of the problems, however it has the same problem in that it maps to 127.0.0.1, so it will not work in the case where the host ip is used as nameserver and only listens on that ip (e.g. eth0) but not on localhost.

...is this case actually a thing? I've never seen systemd-resolved or dnsmasq binding to a specific address or interface.

@Luap99
Member Author

Luap99 commented Mar 5, 2024

So it is certainly an option to fix some of the problems, however it has the same problem in that it maps to 127.0.0.1, so it will not work in the case where the host ip is used as nameserver and only listens on that ip (e.g. eth0) but not on localhost.

...is this case actually a thing? I've never seen systemd-resolved or dnsmasq binding to a specific address or interface.

Yeah, not for systemd/dnsmasq when used as local resolvers. However, one place where I have done this is running a dns server in podman. So I put my eth0 ip in resolv.conf as I want to use it also from within all my other containers and podman has to skip localhost resolvers. Now I could bind to all addresses (0.0.0.0) but there is a catch with that as well. Podman uses aardvark-dns which by default listens on the bridge ip on port 53 to offer name resolution for container names. So this would fail if there is already a dns server running on 0.0.0.0.
I am well aware that this is a totally obscure example and maybe nobody besides me has ever done that and there are plenty of ways to work around it/set up differently in a way that it would work so I do not really worry about it personally.

Just saying this because technically you ignore the nameserver ip in this specific case and remap it to 127.0.0.1 which I find weird.

@dgibson dgibson self-assigned this Mar 6, 2024
@dgibson
Collaborator

dgibson commented Mar 6, 2024

Both pasta and slirp4netns need to make a design tradeoff to deal with the fact that they don't have the capacity to allocate a genuinely new IP for their guest. Each has chosen a different option, and to some extent this issue is a fundamental consequence of that choice:

  • slirp4netns chooses to put the guest on its own NATted subnetwork. That makes things simple for internal address handling, but it has the usual problems of NAT.
  • pasta chooses to avoid NAT and instead have the guest share the host IP. This has a number of advantages, but the cost is that it's now impossible to directly address the host from the guest.

The gw mapping option is pasta's attempt to mitigate the trade-off. It allows access to the host, but at the cost of not allowing access to the original gateway. It's also inflexible, in that it doesn't allow the user to control what address is mapped to the host, or to control which host port it maps to.

I'm intending to make this NAT special case more flexible, allowing the user (podman in this case) to choose some arbitrary address which can be mapped to the host, or even several different addresses which can be mapped to different host addresses. However, implementing this sanely has a fair bit of prerequisite work. I'm gradually getting there, but it's a pretty long road.

@sbrivio-rh
Collaborator

sbrivio-rh commented Mar 6, 2024

This option only handles dns remapping so it does not fix the generic host.containers.internal issue. Also, because we use --no-map-gw, this will not work when there are only localhost resolvers (systemd-resolved).

Doesn't aardvark-dns resolve host.containers.internal to a non-local address for a host interface? Because even with --no-map-gw, one can do this:

$ ip -4 ad sh dev enp9s0
2: enp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 88.198.0.164/27 brd 88.198.0.191 scope global enp9s0
       valid_lft forever preferred_lft forever
$ pasta --config-net --no-map-gw
# grep host\.containers\.internal /etc/hosts
88.198.0.164	host.containers.internal
# ping -nc1 host.containers.internal
PING host.containers.internal (88.198.0.164) 56(84) bytes of data.
64 bytes from 88.198.0.164: icmp_seq=1 ttl=64 time=0.032 ms

--- host.containers.internal ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.032/0.032/0.032/0.000 ms

Or well, that is at least what I think from reading the code, because it prints "Couldn't get any nameserver address"; however, it seems to work in this case, which totally confuses me.

$ grep nameserver /etc/resolv.conf 
nameserver 127.0.0.53
$ pasta --config-net --no-map-gw  --dns-forward 192.168.0.1 nslookup google.com 192.168.0.1
No routable interface for IPv6: IPv6 is disabled
Couldn't get any nameserver address
Server:		192.168.0.1
Address:	192.168.0.1#53

Non-authoritative answer:
Name:	google.com
Address: 216.58.212.142
Name:	google.com
Address: 2a00:1450:4001:82a::200e

I think this works because if --dns-forward is given, we set dns_match in conf.c, and not dns_host, so that's 0.0.0.0. At that point, we'll just use 0.0.0.0 as destination, which means "this host".

So it is certainly an option to fix some of the problems, however it has the same problem in that it maps to 127.0.0.1, so it will not work in the case where the host ip is used as nameserver and only listens on that ip (e.g. eth0) but not on localhost.

We can also pass another address, not 127.0.0.1, with the --dns option.

If this works, would there be any remaining issue?

@Luap99
Member Author

Luap99 commented Mar 6, 2024

This option only handles dns remapping so it does not fix the generic host.containers.internal issue. Also, because we use --no-map-gw, this will not work when there are only localhost resolvers (systemd-resolved).

Doesn't aardvark-dns resolve host.containers.internal to a non-local address for a host interface? Because even with --no-map-gw, one can do this:

aardvark-dns is not in the picture here; we do not use it for host.containers.internal, this entry is added to /etc/hosts in the container. I don't understand what you are trying to show in your example: the host ip 88.198.0.164 would of course be pingable inside the container, as this is local to the netns. host.containers.internal has to resolve to an ip that reaches the host side; if a container connects to that address, it must be able to talk to services listening on the host. Or are you saying enp9s0 is not your default interface used by pasta? Then yes, using another ip on the host will work, but it requires that such an ip exists.

So it is certainly an option to fix some of the problems, however it has the same problem in that it maps to 127.0.0.1, so it will not work in the case where the host ip is used as nameserver and only listens on that ip (e.g. eth0) but not on localhost.

We can also pass another address, not 127.0.0.1, with the --dns option.

That means we will read resolv.conf on our side? Seems kinda silly considering that pasta already does that.
I think just using --dns-forward would work if pasta wouldn't throw a warning that it cannot use localhost resolvers with --no-map-gw as this clearly is not the case.

@Luap99
Member Author

Luap99 commented Mar 6, 2024

I'm intending to make this NAT special case more flexible, allowing the user (podman in this case) to choose some arbitrary address which can be mapped to the host, or even several different addresses which can be mapped to different host addresses. However, implementing this sanely has a fair bit of prerequisite work. I'm gradually getting there, but it's a pretty long road.

Yeah, that does sound useful to me for mapping to the actual host. We would need to choose some arbitrary address inside the netns but this shouldn't be a problem.

@sbrivio-rh
Collaborator

That means we will read resolv.conf on our side? Seems kinda silly considering that pasta already does that.

Hmm, yes, right.

I think just using --dns-forward would work if pasta wouldn't throw a warning that it cannot use localhost resolvers with --no-map-gw as this clearly is not the case.

Okay, so I can prepare a patch for pasta that avoids the warning in that case. Then we need to pass another option in Podman. Should I try to make that change as well? (I'd rather leave it to you or somebody else at the moment, if possible)

@Luap99
Member Author

Luap99 commented Mar 7, 2024

Then we need to pass another option in Podman. Should I try to make that change as well? (I'd rather leave it to you or somebody else at the moment, if possible)

I can do that.

Luap99 added a commit to Luap99/common that referenced this issue Mar 13, 2024
This reverts commit 92784a2.
I plan on using --dns-forward now so we do not want to disable dns by
default, see [1].

[1] containers/podman#19213

Signed-off-by: Paul Holzinger <[email protected]>
@bmenant

bmenant commented Apr 30, 2024

@SebTM Maybe set the network mode to slirp4netns from your compose file? https://docs.podman.io/en/latest/markdown/podman-pod-create.1.html#network-mode-net

Pasta works fine in a similar stack of mine (pod setup, not compose) as soon as there’s another ip address on the host to assign to host.containers.internal (podman 5.0.2 picks up a virtual network interface on my workstation for instance), as mentioned here: #22502 (comment)

@wangmaster

@SebTM consider reverting your default networking tool to slirp4netns instead of pasta as described here: https://blog.podman.io/2024/03/podman-5-0-breaking-changes-in-detail/

Thanks for posting that link. I'd missed that on the podman blog. The second option (to assign an alternate IP address for the containers) worked to provide access to the host. What I'm having a hard time finding is good documentation clarifying what the ramifications are (and why this isn't the default behavior, as it seems more like the slirp4netns behavior). The pasta man page is.. about as clear as mud to me, probably because I haven't found the time to understand pasta.

@dgibson
Collaborator

dgibson commented May 1, 2024

@SebTM consider reverting your default networking tool to slirp4netns instead of pasta as described here: https://blog.podman.io/2024/03/podman-5-0-breaking-changes-in-detail/

Thanks for posting that link. I'd missed that on the podman blog. The second option (to assign an alternate IP address for the containers) worked to provide access to the host. What I'm having a hard time finding is good documentation clarifying what the ramifications are (and why this isn't the default behavior, as it seems more like the slirp4netns behavior). The pasta man page is.. about as clear as mud to me, probably because I haven't found the time to understand pasta.

Both pasta and slirp4netns need to deal with the fact that we can't allocate "real" IP addresses. pasta has chosen a different approach here, which we think is better in more cases, but it's an unavoidable tradeoff, so there are some situations where pasta's approach causes trouble.

slirp's approach is to NAT the guest/container - it sees a private (usually 10.0.2.0/24) network, but packets sent from the container appear from the outside to have come from (one of) the host's IPs. This approach means there's a private IP range where we can allocate the guest's address, and anything else we need. However, NAT means that the address the container sees isn't an address that's meaningful to anything outside, so anything which tries to communicate its IP out to the world will fail.

pasta instead chooses not to NAT; the container sees the IP from which its packets will appear on the outside - (one of) the host's IPs. This both simplifies the logic and avoids the problems above. The downside is that since it shares the host IP, there's no easy way for the host and the container to communicate with each other. Standalone pasta, by default, implements a special case NAT to handle that case, but it's rather limited and involves some other tradeoffs. podman disables that by default (you can re-enable it with --map-gw). We're aiming to have a more flexible set up for these special case NATs which should be usable in more situations, but it's still a ways off.

@jalberto

Setting pasta_options didn't work for me (fedora 40) so I reverted to slirp4netns

Defaulting to pasta without taking this into consideration seems not only a breaking change but the full removal of a feature

@sbrivio-rh
Collaborator

Setting pasta_options didn't work for me (fedora 40) so I reverted to slirp4netns

Defaulting to pasta without taking this into consideration seems not only a breaking change but the full removal of a feature

That was not intentional, see #19213 (comment), and now the fix is work in progress, see also #22653.

@Luap99
Member Author

Luap99 commented Aug 22, 2024

pasta 2024_08_21 added a new --map-guest-addr option that we can set by default with some fixed address and then make podman add the corresponding hosts entry for host.containers.internal, which should make that work in all cases now, not just on hosts with more than one ip address. I haven't had a chance to test this yet but I assume it "just works".

I will try to work on that next week. That however leaves one question open: how do we want to deal with backwards compatibility with older pasta versions? If we pass this option by default, users with older pasta versions will break. So we would have a hard requirement on this pasta version.
Alternatively we could do a feature check, e.g. run pasta --help and check for the flag in its output. This of course is more code and "slower" although the overhead will likely not be noticeable in overall startup time of a container.

While I would like to say distros/users need to provide the right versions, it will make things like bisection much harder, so I think the --help check makes the most sense. I know this is what we do with slirp4netns as well.

@sbrivio-rh
Collaborator

That however leaves one question open: how do we want to deal with backwards compatibility with older pasta versions? If we pass this option by default, users with older pasta versions will break. So we would have a hard requirement on this pasta version. Alternatively we could do a feature check, e.g. run pasta --help and check for the flag in its output. This of course is more code and "slower" although the overhead will likely not be noticeable in overall startup time of a container.

Can we do an optimistic version of it, so that we don't make things slower in the general case? That is, try with the new option. If pasta exits with 1 (possibly with unrecognized option '--map-guest-addr'), then we fall back to the previous behaviour.
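
A rough Python sketch of that optimistic approach, to show the control flow only. The function name and the injectable `run` parameter are made up for illustration, and the exit code 1 plus the "unrecognized option" text are taken from the guess above, so real pasta output may differ.

```python
import subprocess


def start_pasta(base_args, guest_addr, run=subprocess.run):
    """Try pasta with --map-guest-addr first; retry without it if rejected.

    Returns (result, used_new_option).
    """
    new_args = ["pasta", *base_args, "--map-guest-addr", guest_addr]
    result = run(new_args, capture_output=True, text=True)
    # Assumption: an older pasta exits with 1 and names the unknown option.
    if result.returncode == 1 and "unrecognized option" in result.stderr:
        old_args = ["pasta", *base_args]
        return run(old_args, capture_output=True, text=True), False
    return result, True
```

The extra process launch only happens on hosts with an older pasta, so the common case pays nothing for the compatibility check.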

@Luap99
Member Author

Luap99 commented Aug 22, 2024

Can we do an optimistic version of it, so that we don't make things slower in the general case? That is, try with the new option. If pasta exits with 1 (possibly with unrecognized option '--map-guest-addr'), then we fall back to the previous behaviour.

Yes good idea

Luap99 added a commit to Luap99/common that referenced this issue Aug 26, 2024
The --map-guest-addr option allows us to specify an ip that is remapped
to the actual host ip that was used by pasta. This is done to fix the
problem where connecting to the host ip was not possible as the same ip
was used in the netns.

We now set --map-guest-addr 169.254.0.2 which follows the same idea we
already used for the --dns-forward option. With that podman can use this
ip to set it for host.containers.internal which should fix the case where
there was no second host ip available, see
containers/podman#19213

Signed-off-by: Paul Holzinger <[email protected]>
@akostadinov

One other issue is that when --network=pasta:--map-gw is used, host.containers.internal entry is NOT set. So it is inconvenient to know where host is inside the container. I think that when gw is mapped, then host.containers.internal should be set to that IP address.

@dgibson
Collaborator

dgibson commented Sep 2, 2024

One other issue is that when --network=pasta:--map-gw is used, host.containers.internal entry is NOT set. So it is inconvenient to know where host is inside the container. I think that when gw is mapped, then host.containers.internal should be set to that IP address.

@akostadinov, I don't think that quite makes sense. host.containers.internal is supposed to map to the host's global address, not its loopback address. --map-gw, which is now essentially just shorthand for --map-host-loopback=<gw addr>, maps the loopback address instead.

Luap99 added a commit to Luap99/common that referenced this issue Sep 2, 2024
The --map-guest-addr option allows us to specify an ip that is remapped
to the actual host ip that was used by pasta. This is done to fix the
problem where connecting to the host ip was not possible as the same ip
was used in the netns.

We now set --map-guest-addr 169.254.1.2 which follows the same idea we
already used for the --dns-forward option. With that podman can use this
ip to set it for host.containers.internal which should fix the case where
there was no second host ip available, see
containers/podman#19213

Signed-off-by: Paul Holzinger <[email protected]>
@akostadinov

Thank you, @dgibson. --map-host-loopback can be much more useful than --map-gw. I think this discussion is very much needed, because I consider myself an experienced podman user (still only a user), and the current behavior plus the available documentation turned into a frustrating experience for me. So here is my user perspective.

As a user I desire things to just work oob and when they don't, I expect to see some useful error/warning messages and better documentation how to achieve certain outcomes.

I think better documentation is needed around --map-host-loopback and --map-gw. --map-gw is in man podman run, but --map-host-loopback is not. Also, I don't think a regular user would expect that:

host.containers.internal is supposed to map to the host's global address, not its loopback address

In the Docker world, host.docker.internal sounds like loopback. So my first expectation was that host.containers.internal is equivalent (moreover, host.docker.internal is also set by podman, pointing at the same IP). Either way, host.containers.internal is not mentioned in man podman run, and I believe it is important enough to be. Presently there is no explanation of when it will be created and when not, for example that you need more than one network on more than one interface for it to be created. It took me many hours to understand.

This paragraph in the issue description is not valid anymore:

For /etc/hosts it is not as problematic but there is still the host.containers.internal entry which should point to the host ip. Right now this is just the first non localhost ip we find. By default this will be very often the same ip used by pasta so then again you would not actually reach the host but stay in the container netns. Many application use host.containers.internal to connect to services running on the host so with pasta this will not work. Again this may also be impacted by the pasta option --address.

host.containers.internal is no longer created with a single interface and a single IP, or even a single interface with multiple IPv4 addresses. This hostname is added only when a different interface with a different IPv4 address is present, which is a little less confusing than what is described in that paragraph. But as I said, it still cost me many hours. FYI for any lurkers, this is my ugly way to create a secondary interface with an IP that is unlikely to ever interfere with a real network and that allows host.containers.internal and host.docker.internal to be created:

sudo nmcli connection add type tun ifname tun0 con-name dummytun mode tun owner 0 ip4 172.16.0.0/32

Specifically, my opinion is that, whatever behavior is decided on, host.containers.internal should be considered a major feature, and when it cannot be created, a warning message should be output to stderr with a reference to documentation on how to make it work. Or at least this should be added to the podman run man page. If the internal host is mapped to any IP, I think host.containers.internal should be set to it. If container runs inside a bridge network, then ideally it would be set to the IP of the host in that network (maybe unless it is marked as --internal). I don't know if slirp4netns will be removed, but ideally it should also set host.containers.internal.

@Luap99
Member Author

Luap99 commented Sep 2, 2024

Which is why we are going to use --map-guest-addr by default in podman (#23791). Then all containers using pasta, either in pasta network mode or rootless bridge mode (custom networks), will make use of it, so this just works out of the box afterwards.

and when it cannot be created, a warning message should be output to stderr with a reference to documentation

That would be a major change. The fact is that likely 99+% of users will never care about this host entry or need the connection to the host, so throwing a warning by default would cause much more confusion for those users.
There are network environments where there might never be a host IP we can use, so throwing warnings by default is just not correct. I do agree that docs around the host.containers.internal handling are needed; contributions welcome.

If container runs inside a bridge network, then ideally it would be set to the IP of the host in that network

That is already the case for rootful bridge networks, where we use the bridge IP, but of course that only makes sense as root because the rootless bridge IP isn't actually on the host (#22943 (comment)).

but ideally it should also set host.containers.internal

It already sets this entry to any host ip

@akostadinov

--map-guest-addr sounds great, thank you. Just a clarification that presently [1][2] host.containers.internal does not "already sets this entry to any host ip" at least when only a single network interface on host is present.

[1] podman-5.2.2-1.fc40.x86_64
[2] passt-0^20240821.g1d6142f-1.fc40.x86_64

@dgibson
Collaborator

dgibson commented Sep 3, 2024

Thank you, @dgibson. --map-host-loopback can be much more useful than --map-gw. I think this discussion is very much needed, because I consider myself an experienced podman user (still only a user), and the current behavior plus the available documentation turned into a frustrating experience for me. So here is my user perspective.

Right, --map-host-loopback is strictly more flexible than --map-gw, which is (now) just shorthand for --map-host-loopback=<gw address>. The more salient distinction for hosts entries is between --map-host-loopback and --map-guest-addr.

As a user I desire things to just work oob and when they don't, I expect to see some useful error/warning messages and better documentation how to achieve certain outcomes.

I think better documentation is needed around --map-host-loopback and --map-gw. --map-gw is in man podman run, but --map-host-loopback is not. Also, I don't think a regular user would expect that:

Sure. We only just implemented the new options in pasta, so the documentation hasn't yet been added to podman. @Luap99 is better placed to answer when that might happen.

host.containers.internal is supposed to map to the host's global address, not its loopback address

In the Docker world, host.docker.internal sounds like loopback. So my first expectation was that host.containers.internal is equivalent (moreover, host.docker.internal is also set by podman, pointing at the same IP). Either way, host.containers.internal is not mentioned in man podman run, and I believe it is important enough to be. Presently there is no explanation of when it will be created and when not, for example that you need more than one network on more than one interface for it to be created. It took me many hours to understand.

That's a policy discussion only the podman people can address (I'm a pasta dev, not a podman dev). I was just repeating my understanding of why --map-host-loopback and the earlier --map-gw weren't put in /etc/hosts already. @Luap99 may have more background here.

This paragraph in the issue description is not valid anymore:

For /etc/hosts it is not as problematic but there is still the host.containers.internal entry which should point to the host ip. Right now this is just the first non localhost ip we find. By default this will be very often the same ip used by pasta so then again you would not actually reach the host but stay in the container netns. Many application use host.containers.internal to connect to services running on the host so with pasta this will not work. Again this may also be impacted by the pasta option --address.

host.containers.internal is no longer created with a single interface and a single IP, or even a single interface with multiple IPv4 addresses. This hostname is added only when a different interface with a different IPv4 address is present, which is a little less confusing than what is described in that paragraph. But as I said, it still cost me many hours. FYI for any lurkers, this is my ugly way to create a secondary interface with an IP that is unlikely to ever interfere with a real network and that allows host.containers.internal and host.docker.internal to be created:

@Luap99 has draft patches which address this using the new --map-guest-addr pasta option.

momeni added a commit to momeni/clean-arch that referenced this issue Sep 5, 2024
Luap99 added a commit to Luap99/libpod that referenced this issue Sep 6, 2024
pasta added a new --map-guest-addr option that maps an address to the actual
host ip. This is exactly what we need for the host.containers.internal
entry. So we now make use of this option by default but still have to
keep the exclude fallback because the option is very new and some
users/distros will not have it yet.

This also fixes an issue where the --dns-forward ip was not used when
using the bridge network mode. This is only useful when not using
aardvark-dns, as that already used the proper ips from the rootless
netns resolv.conf file.

Fixes containers#19213

Signed-off-by: Paul Holzinger <[email protected]>
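
The address selection for host.containers.internal discussed in this thread (use the --map-guest-addr address when pasta supports the option, otherwise fall back to excluding the IPs that pasta already uses) can be sketched in Python as below. This is a simplified illustration, not podman's actual code: the function name and parameters are hypothetical, and 169.254.1.2 is the value from the later commit message in this thread.

```python
import ipaddress
from typing import Iterable, Optional

# Fixed link-local address taken from the commit message in this thread.
MAP_GUEST_ADDR = "169.254.1.2"


def host_containers_internal(
    host_ips: Iterable[str],
    pasta_ips: Iterable[str],
    map_guest_addr_supported: bool,
) -> Optional[str]:
    """Pick the address for the host.containers.internal hosts entry.

    Returns None when no usable address exists, in which case no entry
    would be written.
    """
    if map_guest_addr_supported:
        # pasta remaps this address to the host, so it is always usable.
        return MAP_GUEST_ADDR

    # Fallback: skip loopback addresses and any address pasta copied into
    # the container netns, since those would not reach the host.
    excluded = {ipaddress.ip_address(ip) for ip in pasta_ips}
    for ip in host_ips:
        addr = ipaddress.ip_address(ip)
        if addr.is_loopback or addr in excluded:
            continue
        return str(addr)
    return None
```

With a single host address that pasta also uses, the fallback returns None, which matches the report above that the entry is missing on hosts with only one interface.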