feat: experiment to measure blocking of looks-like-random traffic #271
base: master
Thanks a lot for contributing this interesting new experiment! I think we need to discuss a number of design details before converging on a final spec. I also provided more detailed suggestions for making the spec clearer and more readable.
In addition to my inline comments, I have a broader comment. This experiment is deeply based on what you folks learned about the GFW. The bundled list of TCP endpoints, in particular, is based on what you learned about which ASNs are blocked. I understand this list of ASNs is unlikely to shrink over time. However, IP addresses could be reassigned and become irrelevant. Contributing an experiment to OONI also means adding extra burden on the OONI team to maintain the experiment. So, I would appreciate it if you could add a section to the spec describing what is required to update the list of bundled TCP endpoints.
Another comment that I have is the following. Because this experiment is deeply based on what you learned about the GFW, how would this experiment apply to other countries where there are similar, if not more restrictive, random traffic filters (e.g., Iran)? I suppose we can consider it safe to run, because more restrictive filters would just block all "looks like random" traffic, but I would like to (a) have your opinion on the matter and (b) see this topic being briefly mentioned and explained in the specification.
Thanks again! 🙌 🙌 🙌 🙌
nettests/ts-039-randomtraffic.md
Outdated
Ability to detect the censorship of fully-encrypted protocols which encrypt every byte of traffic in an attempt to appear completely random.

```
Note: This does not include TLS as TLS has a standard handshake to begin with.
```
I suggest expanding upon this section to explain what fully-encrypted protocols are. For example, you can mention ShadowProxy, VMess, and OBFS4. You should integrate the remark that TLS is not a fully encrypted protocol into the whole discussion on fully-encrypted protocols, probably as the last sentence.
This section should also explain that this experiment is based on a paper. Even though the paper is not publicly available, I think you should mention the paper title and its primary author.
I think it's also important for you to summarize the findings of the paper in a very brief way. Basically, I would recommend mentioning the following points:
- the paper investigated passive blocking of fully-encrypted traffic by the GFW
- the paper characterized the rules used by the GFW to block such traffic
- the nettest produces random traffic that should be blocked
- blocking in this context means that, once the offending payload has been observed, the GFW installs rules that null-route traffic to the server endpoint for a given amount of time; this blocking is nondeterministic, and it sometimes takes several connections to the same destination endpoint with an offending payload to trigger it
- the nettest records the characteristics of the generated traffic, whether it was blocked, and the characteristics of the payload that eventually triggered blocking
Updated the specification accordingly
nettests/ts-039-randomtraffic.md
Outdated
@@ -0,0 +1,104 @@
# Specification version number
Please rename this file to `ts-040-randomtraffic.md`. In the meantime, we merged `ts-039-echcheck.md`, so we need to bump the number used by this nettest.
Please make sure you wrap long lines at around column 80 to make the spec easier to read from a terminal. Most users will read it on the web, but it does not cost us much to help people using the terminal. Shorter lines also help with reviewing the spec on GitHub and providing suggestions.
Updated the specification accordingly! 👍
nettests/ts-039-randomtraffic.md
Outdated
The main goal of the test is to inform the user whether or not they are experiencing censorship on connections that send fully encrypted packets that appear random, as well as to record information about censored packets in order to better understand the censorship algorithm. The test seeks to accomplish these goals by doing the following:

1. If no IP address is given by the user, select an IP address from the list of IP addresses in the affected range
You mention above that the nettest does not take any input. But the first point of the algorithm says that the user can provide an IP address. I find this a bit confusing, and we should address it.
You chose to write an experiment that provides itself with input as a static list. You are using the `InputNone` input policy. This choice is perfectly in line with what I would have done considering the constraints imposed on you by the OONI engine. You then added the possibility for users to specify a target from the command line, which it seems to me you did mostly for testing purposes. However, it may also be useful to test with a given endpoint from a possibly censored location, bypassing the default set of IP addresses.
Upcoming changes to OONI Probe will eventually allow us to provision this kind of input to your experiment in a smoother way (the high-level activity is ooni/ooni.org#1291, which I am cross-referencing here to make sure I reference use cases made possible by this improvement).
Until these changes are ready, I think it does not make sense for the spec to advertise the possibility of providing targets for this experiment using `miniooni -O Target=1.2.3.4:5678`. Therefore, I suggest you restructure this sentence to say that:
- this experiment contains a set of TCP endpoints known to possibly host circumvention servers (e.g., an Outline server) and,
- when started, this experiment will randomize this list and operate on the randomized permutation, starting with the first endpoint and then moving on to subsequent endpoints
I think this is also a good place to point out a metonymy issue across the whole specification and implementation: you refer to "IP addresses" (e.g., `1.2.3.4`) while what you are actually dealing with are TCP endpoints (e.g., `1.2.3.4:80/tcp`). I think the wording should be more precise and explicitly say the whole experiment only deals with TCP endpoints.
OK, all sounds good. I changed the specification but left the functionality in there just for testing purposes.
nettests/ts-039-randomtraffic.md
Outdated
The main goal of the test is to inform the user whether or not they are experiencing censorship on connections that send fully encrypted packets that appear random, as well as to record information about censored packets in order to better understand the censorship algorithm. The test seeks to accomplish these goals by doing the following:

1. If no IP address is given by the user, select an IP address from the list of IP addresses in the affected range
2. Complete a TCP handshake with the IP address and send a stream of null bytes as a control test. If this control test succeeds then proceed with the experiment, otherwise attempt the control test with a new IP address two more times or until the control test is successful. If no control test succeeds end the test and return the error.
(Please, remember to wrap long sentences for readability.)
I think here you should say that you try with the first three TCP endpoints in the random permutation. If none of them works the test fails. In this case, the test would return an error to signal to the OONI Engine that you do not want to submit a measurement (<- is this the intended behavior?).
Then, you should explain that "success" in this preliminary check consists of performing a TCP connect (aka TCP handshake) and then sending a string of zero bytes with a random length. I think it may be useful here to explain why using all zeroes is considered safe with respect to the GFW. (Would it work in, say, Iran, which is known to have a much more restrictive set of filters for "unknown" traffic?)
I also have a methodological question here. You are sending a string of zero bytes, but your code is not checking the result of "send". Additionally, even if you were checking for errors, I am not sure whether the error would be informative in most cases, because you're supposed to be able to enqueue on the socket buffer. Checking for an error here could still be interesting in case you received an ICMP message or other interference right after establishing the connection, but the window of opportunity seems very small to me. That said, I am missing the real point of sending a string of zero bytes here. I suppose you are sending it to trigger some side effect, but I cannot fully see what the side effect is. Maybe your concern is that you want to know you can use a TCP endpoint before actually using it for the test, but, in that case, what is the gain in sending the zero bytes, given that after TCP connect succeeds you are not checking any other error? What would change methodologically if you avoided sending the zero-byte sequence and limited the control check to ensuring that you can connect to the given IP address (to rule out it being already blocked, I suppose)?
Made all of the desired changes. I have one question, however: if the test returns an error, are the results not recorded by OONI? That is still the desired functionality; however, it may lead to some test keys becoming irrelevant. Also, a great point about the string of zero bytes: you are 100% correct that it is unnecessary in this case. We decided to remove it for now but may reimplement it in the future if we decide to generalize the test. The specification was updated accordingly!
nettests/ts-039-randomtraffic.md
Outdated
1. If no IP address is given by the user, select an IP address from the list of IP addresses in the affected range
2. Complete a TCP handshake with the IP address and send a stream of null bytes as a control test. If this control test succeeds then proceed with the experiment, otherwise attempt the control test with a new IP address two more times or until the control test is successful. If no control test succeeds end the test and return the error.
3. Complete a TCP handshake with the IP address and send a stream of random bytes. If this connection times out, we attempt to connect once more to check for residual censorship. If the residual censorship test results in a timeout, we end the test, record information about the blocked packet, and inform the user they are experiencing censorship. Otherwise we continue with the test
4. Step 3 is repeated 19 more times to account for the blocking rate
I think you should restructure the algorithm to say that you repeat 20 times, followed by a nested list containing what is currently the content of step 3.
Updated the specification accordingly! 👍
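To make the "repeat 20 times, with the per-connection steps nested inside" structure concrete, here is a rough Go sketch. The network operations are abstracted behind two callbacks so the control flow can be read (and exercised) without a network; the type and function names are hypothetical and not taken from the actual implementation.

```go
package main

import "fmt"

// outcome is a hypothetical summary of a single connection attempt.
type outcome int

const (
	succeeded outcome = iota
	timedOut
	failed // any unexpected, non-timeout error
)

// verdict is a hypothetical summary of the whole run.
type verdict struct {
	Censorship bool // a send timed out and so did the follow-up connect
	Trials     int  // number of trials started
	Failed     bool // an unexpected error ended the run
}

// runTrials repeats the measurement up to n times. sendRandom connects and
// sends random bytes; reconnect performs the residual-censorship check.
func runTrials(n int, sendRandom, reconnect func() outcome) verdict {
	for i := 1; i <= n; i++ {
		switch sendRandom() {
		case succeeded:
			continue
		case failed:
			return verdict{Trials: i, Failed: true}
		}
		// The send timed out: connect once more to check residual censorship.
		switch reconnect() {
		case timedOut:
			return verdict{Censorship: true, Trials: i}
		case failed:
			return verdict{Trials: i, Failed: true}
		}
		// The reconnect succeeded, so keep going with the next trial.
	}
	return verdict{Trials: n}
}

func main() {
	calls := 0
	send := func() outcome {
		calls++
		if calls == 3 {
			return timedOut // simulate blocking on the third connection
		}
		return succeeded
	}
	reconnect := func() outcome { return timedOut }
	fmt.Printf("%+v\n", runTrials(20, send, reconnect))
}
```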
nettests/ts-039-randomtraffic.md
Outdated
1. If no IP address is given by the user, select an IP address from the list of IP addresses in the affected range
2. Complete a TCP handshake with the IP address and send a stream of null bytes as a control test. If this control test succeeds then proceed with the experiment, otherwise attempt the control test with a new IP address two more times or until the control test is successful. If no control test succeeds end the test and return the error.
3. Complete a TCP handshake with the IP address and send a stream of random bytes. If this connection times out, we attempt to connect once more to check for residual censorship. If the residual censorship test results in a timeout, we end the test, record information about the blocked packet, and inform the user they are experiencing censorship. Otherwise we continue with the test
- 3. Complete a TCP handshake with the IP address and send a stream of random bytes. If this connection times out, we attempt to connect once more to check for residual censorship. If the residual censorship test results in a timeout, we end the test, record information about the blocked packet, and inform the user they are experiencing censorship. Otherwise we continue with the test
+ 3. Complete a TCP handshake with the IP address and send a stream of random bytes. If this connection times out, we attempt to connect once more to check for residual censorship. If the residual censorship test results in a timeout, we end the test, record information about the blocked packet, and inform the user they are experiencing censorship. Otherwise we continue with the test.
In addition to the suggestion, I think it would be useful to specify what should happen in terms of submitting the measurement when you get an error that is not a timeout. Should the engine submit the measurement also in that case, or do you think we should not submit when we get, say, `ENETUNREACH`?
In this case we are looking for one of three different results. First, there is the case where the user is not experiencing censorship, in which we expect no errors. Second, there is the case where the user is indeed experiencing censorship, in which we expect a timeout error and only a timeout error. Finally, there is the case where any other unexpected network error occurs, in which the test simply returns the error and records the test as failed. The specification was updated to explain this.
nettests/ts-039-randomtraffic.md
Outdated
* The result of the test, 'success' or failure type
* Whether or not the censorship was detected

## Semantics
Please, restructure this section to read like this:
this experiment generates a "test keys" result object containing the following keys:
Additionally, please, use the name in the JSON output for each key rather than the name inside the Go implementation, which is just an implementation detail. (The data consumer sees the JSON file.)
Updated the specification accordingly! 👍
nettests/ts-039-randomtraffic.md
Outdated
* PopcountRange: True if final popcount is less than 3.4 or greater than 4.6
* MatchesHTTP: True if fingerprinted as HTTP
* MatchesTLS: True if fingerprinted as TLS
* Payload: Payload of final packet
I think this definition should be improved. IIUC, this is the packet that triggered blocking in case there is censorship, and the last packet that was generated otherwise.
Additional, broader design questions for you: Is there value in uploading the final packet to the OONI backend in case of success? Could we be missing information by not submitting all the packets that did not trigger censorship?
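Since the `PopcountRange` key in the quoted hunk depends on how "popcount" is computed for a payload, a small Go sketch may help. It assumes that popcount means the average number of set bits per byte (the thresholds 3.4 and 4.6 only make sense on a 0 to 8 scale); that reading, and both function names, are assumptions rather than something the hunk states.

```go
package main

import (
	"fmt"
	"math/bits"
)

// popcountPerByte returns the average number of 1 bits per byte of payload.
// An empty payload yields 0 (a choice made for this sketch).
func popcountPerByte(payload []byte) float64 {
	if len(payload) == 0 {
		return 0
	}
	total := 0
	for _, b := range payload {
		total += bits.OnesCount8(b)
	}
	return float64(total) / float64(len(payload))
}

// popcountOutOfRange mirrors the PopcountRange description: true when the
// average is below 3.4 or above 4.6.
func popcountOutOfRange(payload []byte) bool {
	avg := popcountPerByte(payload)
	return avg < 3.4 || avg > 4.6
}

func main() {
	zeros := make([]byte, 16)             // 0 bits set per byte
	balanced := []byte{0x0F, 0xF0, 0x3C}  // exactly 4 bits set per byte
	fmt.Println(popcountPerByte(zeros), popcountOutOfRange(zeros))
	fmt.Println(popcountPerByte(balanced), popcountOutOfRange(balanced))
}
```

Uniformly random bytes average about 4 set bits per byte, which is why a payload inside the 3.4 to 4.6 band "looks random" under this measure.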
nettests/ts-039-randomtraffic.md
Outdated
* MatchesTLS: True if fingerprinted as TLS
* Payload: Payload of final packet
* Censorship: False if all 20 connections succeeded
* Error: String of error
Please rename `Error` to `failure`. Most OONI experiments use `failure` rather than `error`.
Also, it seems to me that `censorship`, `success`, and `error` could all derive from the value of `error`. If that is the case, then I would recommend just keeping `error` around. We don't need redundant information (some OONI experiments have that, but that's no excuse not to be tidier with new experiments).
I understand that it seems both success and censorship could be derived from error; however, in our test we do not consider a timeout to be an error, because it is expected when a user is experiencing censorship. Error is used to record the type of any unexpected error a user may have experienced while running the experiment. It is true, however, that success can be derived from error, as a test is deemed successful if there were no unexpected errors.
nettests/ts-039-randomtraffic.md
Outdated
## Example output sample

```JSON
{
Please, make sure you update the JSON to the latest version of the experiment.
Updated the specification accordingly! 👍
Checklist
Description
This test aims to detect the censorship of fully random traffic. In short, the experiment sends random bytes to an IP address chosen at random from a list of pre-determined public IP addresses that were affected by this censorship in the past and records information about the nature of censorship. This censorship was originally observed at the Great Firewall of China (GFW).
Censorship Description
Our team reverse engineered the GFW's new censorship system and determined that it uses the following rules to exempt traffic from blocking:
For the first TCP payload sent by the client, allow the traffic to continue if any of the following hold:
In addition to these rules, the censorship only occurs when connecting to a certain list of IP addresses.
If the IP address is in the censored range and none of the above hold, there is approximately a 26.3% chance that the connection is censored. For a more detailed description of the censorship please see the reading copy of our paper.
Test Goals and Procedure
The main goal of the test is to inform the user whether or not they are experiencing censorship on connections that send fully encrypted packets that appear random, as well as to record information about censored packets in order to better understand the censorship algorithm. The test seeks to accomplish these goals by doing the following:
False Negative and False Positive Rates
Using an IP known to be in the censored range, the false negative rate (the rate at which the test will say there is no censorship present when in fact there is) of this test was calculated to be approximately 1.05%. On the other hand, after running the test 10,000 times from a location not experiencing censorship, no false positives were recorded.
IP List Construction
The IP list was created by first obtaining a large list of public TCP servers. The test was then performed five times on each IP from a computer where censorship is expected. The final list is made up of only the IP addresses that reported censorship all five times. For one of these IP addresses not to be in the censored range, all five reports of censorship would have to be false positives, which we know to be extremely unlikely, so we can label these IP addresses as being in the censored range.