

On the Accuracy of Tor Bandwidth Estimation


This page provides instructions for reproducing our relay speed test analysis. See the front page for more context.

The speed test itself was run on the Tor network using some changes to Tor and a Python controller script, as discussed in the paper. Here are the components that we used when running the speed test.

The results are stored in speedtester.json.xz and are used in the analysis below, wherein we attempt to better understand the effects of the speed test.
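The data files in this analysis are xz-compressed JSON. For inspecting one outside the provided scripts, a minimal sketch using only the Python standard library could look like the following. The helper names `load_json_xz` and `save_json_xz` are hypothetical and are not part of this repository, and nothing is assumed here about the structure of the JSON inside the file.

```python
import json
import lzma


def load_json_xz(path):
    """Read an xz-compressed JSON file and return the decoded object."""
    # lzma.open in text mode decompresses on the fly, so the file
    # never needs to be unpacked to disk first.
    with lzma.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def save_json_xz(obj, path):
    """Write obj as xz-compressed JSON (hypothetical counterpart helper)."""
    with lzma.open(path, "wt", encoding="utf-8") as f:
        json.dump(obj, f)


# Example (assumes the file is in the current directory):
# results = load_json_xz("speedtester.json.xz")
# print(type(results))
```

Printing the type and, for a dictionary, a few keys is a quick way to learn the layout before running the later steps.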

Note: the data processing tasks in Steps 1-4 have already been done, and the output from those steps has been cached in this repository. If you just want to re-plot the graphs, do Step 0 and then skip to Step 5.

Step 0: prepare a Python virtual environment

python3 -m venv myenv
source myenv/bin/activate
pip3 install stem matplotlib numpy scipy

Step 1: download raw Tor metrics data


Step 2: decompress

tar xJf server-descriptors-2019-08.tar.xz
tar xJf consensuses-2019-08.tar.xz
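If `tar` is unavailable, or its xz support is missing, the same extraction can be sketched with Python's standard `tarfile` module. The function name `extract_tar_xz` is hypothetical. As with `tar` itself, only run it on archives from a source you trust.

```python
import tarfile
from pathlib import Path


def extract_tar_xz(archive, dest="."):
    """Extract an xz-compressed tarball into dest, like `tar xJf archive`."""
    # Mode "r:xz" tells tarfile to expect xz (LZMA) compression.
    with tarfile.open(archive, "r:xz") as tar:
        tar.extractall(path=dest)
    return Path(dest)


# Example (assumes both archives are in the current directory):
# for name in ("server-descriptors-2019-08.tar.xz",
#              "consensuses-2019-08.tar.xz"):
#     extract_tar_xz(name)
```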

Step 3: extract bandwidth info

source myenv/bin/activate

# output is tor.archive.json
python3 consensuses-2019-08 server-descriptors-2019-08
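To illustrate the kind of bandwidth information a server descriptor carries, here is a small standard-library sketch. It is not the extraction script used in this step, which presumably relies on the `stem` library installed in Step 0. It reads the `bandwidth` line that the Tor directory specification defines as three values in bytes per second: average, burst, and observed. The `advertised` property assumes the advertised bandwidth is the minimum of those three values, as the Onionoo documentation describes it; the repository's own computation may differ. All names below are hypothetical.

```python
from typing import NamedTuple, Optional


class DescriptorBandwidth(NamedTuple):
    """The three values of a descriptor's `bandwidth` line, in bytes/s."""
    average: int
    burst: int
    observed: int

    @property
    def advertised(self) -> int:
        # Assumption: advertised bandwidth = min(average, burst, observed).
        return min(self.average, self.burst, self.observed)


def parse_bandwidth_line(descriptor_text: str) -> Optional[DescriptorBandwidth]:
    """Return the first `bandwidth` line found in a descriptor, or None."""
    for line in descriptor_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "bandwidth":
            return DescriptorBandwidth(*(int(p) for p in parts[1:]))
    return None


# Made-up descriptor fragment, for illustration only.
sample = "router example 192.0.2.1 9001 0 0\nbandwidth 1000 2000 500\n"
bw = parse_bandwidth_line(sample)
print(bw, bw.advertised)
```

A relay whose observed bandwidth is below its configured rate, as in the sample, is limited by the observed value under this assumption.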

Step 4: compute

source myenv/bin/activate

# input is speedtester.json.xz, output is speedtest.measured.json.xz

# input is speedtest.measured.json.xz and tor.archive.json
# output is speedtest.diffs.json.xz and advbw_over_time.json.xz

Step 5: plot the graphs

source myenv/bin/activate

# input is advbw_over_time.json.xz
python3 > stats_timeseries.txt

# input is speedtest.measured.json.xz, speedtest.diffs.json.xz,
# ../capacity_variation/relay_uptime.json.xz, and
# ../capacity_variation/relay_position.json.xz
python3 > stats_explore.txt