fix(minipipeline/analysis): distinguish between None and empty #1401

Merged
bassosimone merged 82 commits into master from nullability on Nov 29, 2023

Conversation

@bassosimone (Contributor) commented on Nov 29, 2023

None means that an algorithm did not run or did not find enough data to produce a result. Empty means that the algorithm did run, did find enough data, and did produce an empty result.

The difference between these two states, which generally is not important when writing Go code, is extremely important for making the correct decisions when assigning the results of web measurements.

Accordingly, this diff goes through each algorithm and ensures we start with a None state and only switch to the empty state when we have seen enough data to determine that the result is indeed empty.
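
To illustrate the None-versus-empty distinction described above, here is a minimal sketch. It is not the code in this diff: the `Optional` type, the `observation` struct, and `computeUnexpectedFailures` are hypothetical names made up for the example. The point is only the pattern of starting from None and switching to an empty set once enough data has been seen.

```go
package main

// Optional is a hypothetical container (an assumption for this sketch)
// that distinguishes "no result" (None) from "a result that is empty".
type Optional[T any] struct {
	value T
	set   bool
}

// None returns an Optional holding no result.
func None[T any]() Optional[T] { return Optional[T]{} }

// Some returns an Optional holding v, even when v is empty.
func Some[T any](v T) Optional[T] { return Optional[T]{value: v, set: true} }

// IsNone reports whether no result was produced.
func (o Optional[T]) IsNone() bool { return !o.set }

// Unwrap returns the value and whether a result was produced.
func (o Optional[T]) Unwrap() (T, bool) { return o.value, o.set }

// observation is a hypothetical, simplified record of one endpoint measurement.
type observation struct {
	TransactionID int64
	Attempted     bool   // whether the operation was actually attempted
	Failure       string // empty string means success
}

// computeUnexpectedFailures starts in the None state and only switches
// to a (possibly empty) set after seeing at least one attempted operation.
func computeUnexpectedFailures(obs []observation) Optional[map[int64]bool] {
	result := None[map[int64]bool]()
	for _, o := range obs {
		if !o.Attempted {
			continue // not enough data: stay in the current state
		}
		set, ok := result.Unwrap()
		if !ok {
			// First useful observation: move from None to empty.
			set = make(map[int64]bool)
			result = Some(set)
		}
		if o.Failure != "" {
			set[o.TransactionID] = true
		}
	}
	return result
}
```

With this shape, a caller can tell "the algorithm had nothing to look at" (`IsNone()` is true) apart from "the algorithm looked and found no failures" (a result is present and has length zero).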

Additionally, in this diff we also add the following new analysis rules:

  • ComputeDNSPossiblyInvalidAddrsClassic, which is like ComputeDNSPossiblyInvalidAddrs but does not consider TLS, which in turn is useful to emulate the original Web Connectivity v0.4 behavior;
  • ComputeDNSPossiblyNonexistingDomains, which tells us for which domains the probe and the control agree that those domains are undefined (i.e., they both get NXDOMAIN), which is useful to detect this specific case.
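
As a rough sketch of the second rule in the list, the following shows one way to collect domains for which both the probe and the control see NXDOMAIN. The `dnsLookup` struct, its fields, the `nxdomain` marker string, and the function signature are all assumptions made for this example and do not mirror the actual minipipeline types. The boolean return value plays the role of "None versus not None".

```go
package main

// nxdomain is a placeholder failure label used only in this sketch.
const nxdomain = "NXDOMAIN"

// dnsLookup is a hypothetical, simplified view of one DNS lookup as
// seen by the probe and by the control.
type dnsLookup struct {
	Domain         string
	HasProbe       bool   // the probe produced a lookup result
	ProbeFailure   string // empty string means success
	HasControl     bool   // the control produced a lookup result
	ControlFailure string // empty string means success
}

// computeDNSPossiblyNonexistingDomains returns the domains for which
// probe and control both got NXDOMAIN. The second return value is false
// (the None state) when no lookup had both a probe and a control result.
func computeDNSPossiblyNonexistingDomains(lookups []dnsLookup) (map[string]bool, bool) {
	var result map[string]bool // nil stands for None
	for _, l := range lookups {
		if !l.HasProbe || !l.HasControl {
			continue // cannot compare: this lookup tells us nothing
		}
		if result == nil {
			// We can compare at least one lookup: move from None to empty.
			result = make(map[string]bool)
		}
		if l.ProbeFailure == nxdomain && l.ControlFailure == nxdomain {
			result[l.Domain] = true
		}
	}
	return result, result != nil
}
```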

We also renamed ComputeHTTPFinalResponses to ComputeHTTPFinalResponsesWithControl and changed its definition so that it only includes responses for which we have a control. This is the core rule for deciding whether we should move forward with considering the results of the HTTP diff set of algorithms.
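
The gating role of this rule could look roughly like the sketch below. The `httpResponse` struct, its `IsFinal` and `HasControl` fields, and the `shouldRunHTTPDiff` helper are hypothetical and exist only to show the idea: keep a final response only when a control is available, and skip the HTTP diff algorithms when nothing qualifies.

```go
package main

// httpResponse is a hypothetical, simplified view of one HTTP response
// observed by the probe.
type httpResponse struct {
	TransactionID int64
	IsFinal       bool // assumed to mean "not a redirect"
	HasControl    bool // assumed to mean "a control response is available"
}

// computeHTTPFinalResponsesWithControl returns the IDs of final responses
// that also have a control. The second return value is false (the None
// state) when no response qualifies.
func computeHTTPFinalResponsesWithControl(responses []httpResponse) (map[int64]bool, bool) {
	var result map[int64]bool // nil stands for None
	for _, r := range responses {
		if !r.IsFinal || !r.HasControl {
			continue // without a control there is nothing to diff against
		}
		if result == nil {
			result = make(map[int64]bool)
		}
		result[r.TransactionID] = true
	}
	return result, result != nil
}

// shouldRunHTTPDiff reports whether the HTTP diff algorithms have any
// input to work on in this sketch.
func shouldRunHTTPDiff(responses []httpResponse) bool {
	ids, ok := computeHTTPFinalResponsesWithControl(responses)
	return ok && len(ids) > 0
}
```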

The reference issue is ooni/probe#2634.

We're introducing failure modes that do not exist hence it seems
this is not the correct way of moving forward.
I'm doing this mainly to explore whether we could have more
robust webconnectivity v0.5 analysis code
Because I am dropping the requests again, we break again the tests
with the redirects. I could possibly fix it by putting requests back
again but I am not super happy about doing this because that would
cause the DSL to do some strange work and I'd honestly rather not do this.
what remains to be done now is to make sure all the tests that
are currently skipped become green

we also need to account for differences between the two
the next step is to sort out this mess :-)
(I am thankful there's a ~comprehensive test suite.)
this happens because LTE successfully handshakes with the wrong address
Conflicts:
	internal/experiment/webconnectivitylte/cleartextflow.go
	internal/experiment/webconnectivitylte/secureflow.go
I am not planning on butchering lte on master, but I need to
butcher it here because I really need to figure out how to align
them correctly. What I have so far is good, but there's some
theory/abstraction that I am still missing.
@bassosimone changed the title from "Nullability" to "fix(minipipeline): analysis distinguishes between None and empty" on Nov 29, 2023
@bassosimone marked this pull request as ready for review on November 29, 2023 00:18
@bassosimone changed the title from "fix(minipipeline): analysis distinguishes between None and empty" to "fix(minipipeline/analysis): distinguish between None and empty" on Nov 29, 2023
@bassosimone merged commit f452bb0 into master on Nov 29, 2023
8 checks passed
@bassosimone deleted the nullability branch on November 29, 2023 00:34
Murphy-OrangeMud pushed a commit to Murphy-OrangeMud/probe-cli that referenced this pull request Feb 13, 2024