fix(enginenetx): fast recovery with unusable bridge #1552

Closed
wants to merge 139 commits into from

@bassosimone bassosimone commented Apr 15, 2024

This diff changes how we mix tactics, so that DNS-based tactics are tried earlier when the bridge inside the source tree becomes unusable (e.g., because of IP-based censorship).

While there, this diff introduces a DESIGN.md document that explains the enginenetx package design.

To make this diff simpler, we previously landed the following diffs, extracted from this pull request:

This work is part of ooni/probe#2704.

This diff refactors the code generating tactics to mix bridge and DNS
tactics, so that we no longer try all bridge tactics before falling
back to DNS tactics. When the bridge is IP or endpoint blocked, this
change makes sure we try DNS tactics earlier, which means a faster
bootstrap if the DNS is working.
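
As a rough illustration of the idea, the following is a minimal sketch of interleaving two lists of tactics so that the second list gets a chance before the first one is exhausted. The `tactic` type, its fields, the `interleave` function, and the SNI and endpoint values are all hypothetical names chosen for this example; the actual enginenetx code may mix tactics differently.

```go
package main

import "fmt"

// tactic is a simplified, hypothetical stand-in for a dialing tactic.
type tactic struct {
	Source   string // "bridge" or "dns" (labels assumed for this sketch)
	Endpoint string // TCP endpoint to dial
}

// interleave alternates between the two lists, so the second list gets
// its first attempt right after the first entry of the first list.
// Leftover entries from the longer list follow in their original order.
func interleave(first, second []tactic) []tactic {
	out := make([]tactic, 0, len(first)+len(second))
	for i := 0; i < len(first) || i < len(second); i++ {
		if i < len(first) {
			out = append(out, first[i])
		}
		if i < len(second) {
			out = append(out, second[i])
		}
	}
	return out
}

func main() {
	// 10.0.0.1 mimics the unusable bridge address used for testing;
	// the DNS-derived endpoints are made-up documentation addresses.
	bridge := []tactic{
		{Source: "bridge", Endpoint: "10.0.0.1:443"},
		{Source: "bridge", Endpoint: "10.0.0.1:443"},
		{Source: "bridge", Endpoint: "10.0.0.1:443"},
	}
	dns := []tactic{
		{Source: "dns", Endpoint: "192.0.2.1:443"},
		{Source: "dns", Endpoint: "192.0.2.2:443"},
	}
	for idx, t := range interleave(bridge, dns) {
		fmt.Printf("%d: %s %s\n", idx, t.Source, t.Endpoint)
	}
}
```

With this ordering, a blocked bridge costs at most one failed attempt before the first DNS tactic is tried, instead of one failed attempt per bridge tactic.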

Based on testing, where I replaced the bridge address with 10.0.0.1, we
try DNS tactics after 8 seconds. After the first run, if the DNS tactics
are working, we would immediately use them before bridge tactics, since
we store information about tactics inside the $OONI_HOME/engine dir.

Part of ooni/probe#2704.
Previously, we were only testing with the DNS returning an error. Now
that we're mixing tactics together, we should also have a test case for
when the DNS is working.
@bassosimone

To implement the required changes, I think the best approach is to decouple the policy for generating tactics from the policy for mixing them, rather than invoking mixing algorithms inside policies. This feels like a small refactoring, and I started experimenting with it in this branch. It should pay off a lot, though: at the moment, there are implicit assumptions in the ordering of policies and in how mixing works, whereas after the change it will be much easier to compose several algorithms together. So, I am going to implement these changes first and commit them into master before circling back to (a) adapt the policy we're using here and (b) update the design document.
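
To make the proposed decoupling concrete, here is a minimal sketch, assuming a generator interface that only produces tactics and a separate mixing function that only decides their order. The names `policy`, `mixer`, `staticPolicy`, `mixedPolicy`, and `sequential` are hypothetical and are not taken from the enginenetx package.

```go
package main

import "fmt"

// policy is a hypothetical interface: it only generates candidate
// tactics and knows nothing about how they are mixed with others.
type policy interface {
	Tactics(domain string) []string
}

// staticPolicy returns a fixed list of tactics (illustration only).
type staticPolicy struct {
	tactics []string
}

func (p staticPolicy) Tactics(_ string) []string { return p.tactics }

// mixer is a hypothetical function type that decides the final order.
type mixer func(first, second []string) []string

// sequential emits all of first, then all of second.
func sequential(first, second []string) []string {
	out := make([]string, 0, len(first)+len(second))
	out = append(out, first...)
	return append(out, second...)
}

// mixedPolicy composes two generators using an external mixer, so the
// ordering decision lives in exactly one place.
type mixedPolicy struct {
	first, second policy
	mix           mixer
}

func (m mixedPolicy) Tactics(domain string) []string {
	return m.mix(m.first.Tactics(domain), m.second.Tactics(domain))
}

func main() {
	dns := staticPolicy{tactics: []string{"dns-1", "dns-2"}}
	bridge := staticPolicy{tactics: []string{"bridge-1"}}
	// Giving the DNS priority is just a matter of argument order.
	overall := mixedPolicy{first: dns, second: bridge, mix: sequential}
	fmt.Println(overall.Tactics("api.example.org"))
}
```

Because `mixedPolicy` itself satisfies `policy`, composed policies can be nested, and swapping `sequential` for another mixer changes the ordering without touching any generator.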

bassosimone added a commit that referenced this pull request May 9, 2024
As mentioned in
#1552 (comment), we
want to split the generation of tactics and the mixing of tactics, such
that it's easier to compose the desired overall policy.

Part of ooni/probe#2704.

---------

Co-authored-by: Arturo Filastò <[email protected]>
bassosimone added a commit that referenced this pull request May 9, 2024
This implements the changes requested in
#1552. We rearrange the chains
such that the DNS has priority and extended policies come after it. Part
of ooni/probe#2704.

---------

Co-authored-by: Arturo Filastò <[email protected]>
Conflicts:
	internal/enginenetx/network.go
bassosimone added a commit that referenced this pull request May 9, 2024
In #1592 and previous pull
requests, I replaced the policies that embedded mixing logic with
neutral policies and external mixing logic, which enabled me to
implement what was requested in the
#1552 pull request review. Now,
with this pull request, I am cleaning up, by removing the policies that
we were previously using. Work part of
ooni/probe#2704.

---------

Co-authored-by: Arturo Filastò <[email protected]>
Conflicts:
	internal/enginenetx/bridgespolicy.go
	internal/enginenetx/bridgespolicy_test.go
	internal/enginenetx/statspolicy.go
	internal/enginenetx/userpolicy_test.go
This diff addresses a bug observed in the wild where a slow DNS
causes several tactics to be ready concurrently.

If we want several tactics to be ready concurrently, we should
arrange for that explicitly, which is not the case for now.

Part of ooni/probe#2704.
bassosimone added a commit that referenced this pull request May 10, 2024
Previously, the code was computing the zero time when we started
resolving. However, I have observed in the wild that, if the DNS lookup
time is high, we're going to have several ready tactics. We did not
previously see this bug because we gave priority to bridges and stats
tactics, hence we always had some ready tactics from the get go.

This PR is part of settling the dust after the changes requested in the
#1552 review.

The related tracking issue is ooni/probe#2704.

---------

Co-authored-by: Arturo Filastò <[email protected]>
@bassosimone

I guess this PR has served its purpose as a platform for discussing the previously existing design, for experimenting with fast recovery in case of issues, and as an integration branch for testing out changes to be merged. When we merge #1595, we should close this PR and keep it as documentation of the design discussion that happened and of the various changes that were made to address the requested design changes.

bassosimone added a commit that referenced this pull request May 10, 2024
This design document documents the current implementation in light of
the changes requested in the #1552
pull request review. The actual changes have been implemented by
previous pull requests and basically boil down to ensuring that we give
the DNS priority when dialing.

See #1552 for the original design
review as well as for a list of all the subsequent pull requests that
were merged to address the review comments.

Additionally, this PR explains in the design document what the
current limitations are and what we could do next.

With the merging of this PR, we can close
ooni/probe#2704.

Closes #1552.

---------

Co-authored-by: Arturo Filastò <[email protected]>
@bassosimone bassosimone deleted the issue/2704 branch May 10, 2024 15:05