feat(engine): add randomtraffic experiment #1026
Open
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Checklist
Description
This test aims to detect the censorship of fully random traffic. In short, the experiment sends random bytes to an IP address chosen at random from a list of pre-determined public IP addresses that were affected by this censorship in the past and records information about the nature of censorship. This censorship was originally detected from the Great Firewall of China (GFW).
Censorship Description
Our team reverse engineered the GFW's new censorship system and determined that it uses the following rules to exempt traffic from blocking:
For the first TCP payload sent by the client, allow the traffic to continue if any of the following hold:
In addition to these rules, the censorship only occurs when connecting to a certain list of IP addresses.
If the IP address is in the censored range and none of the above hold, there is an approximate 26.3% chance the connection is censored. For a more detailed description of the censorship please see the reading copy of our paper.
Test Goals and Procedure
The main goal of the test is to inform the user whether or not they are experiencing censorship on connections that send fully encrypted packets that appear random, as well as to record information about censored packets in order to better understand the censorship algorithm. The test seeks to accomplish these goals by doing the following:
False Negative and False Positive Rates
Using an IP known to be in the censored range, the false negative rate (the rate at which the test will say there is no censorship present when in fact there is) of this test was calculated to be approximately 1.05%. On the other hand, after running the test 10,000 times from a location not experiencing censorship, no false positives were recorded.
IP List Construction
The IP list was created by first obtaining a large list of public TCP servers. The test was then performed five times on each IP from a computer where censorship is expected. The final list of IP addresses is made up of only the IP addresses which reported censorship all five times. In order for one of these IP addresses to not be in the censored range, each of the five reports of censorship would have had to have been false positives, which we know to be extremely unlikely, meaning we can label these IP addresses as in the censored range.