Commit

tweaks and corrections in binning analysis tutorial, make it use the same data sets as the autocorrelation tutorial
biermanncarl committed May 12, 2021
1 parent 360587b commit ae19c5a
Showing 1 changed file with 12 additions and 13 deletions.
25 changes: 12 additions & 13 deletions doc/tutorials/error_analysis/error_analysis_part1.ipynb
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "# Tutorial: Error Estimation"
+ "# Tutorial: Error Estimation - Part 1 (Introduction and Binning Analysis)"
]
},
{
@@ -43,7 +43,7 @@
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
- "np.random.seed(44)\n",
+ "np.random.seed(43)\n",
"\n",
"def ar_1_process(n_samples, y0, c, phi, eps, n_warmup):\n",
" y = y0\n",
@@ -57,11 +57,11 @@
"\n",
"N_SAMPLES = 100000\n",
"\n",
- "time_series_1 = ar_1_process(N_SAMPLES, 0.0, 1.0, 0.9, 3.0, 100)\n",
- "time_series_2 = ar_1_process(N_SAMPLES, 0.0, 0.05, 0.998, 1.0, 1000)\n",
+ "time_series_1 = ar_1_process(N_SAMPLES, 0.0, 2.0, 0.85, 2.0, 100)\n",
+ "time_series_2 = ar_1_process(N_SAMPLES, 0.0, 0.05, 0.995, 1.0, 1000)\n",
"\n",
"\n",
- "plt.title(\"The first 2000 samples of both time series\")\n",
+ "plt.title(\"The first 1000 samples of both time series\")\n",
"plt.plot(time_series_1[0:1000], label=\"time series 1\")\n",
"plt.plot(time_series_2[0:1000], label=\"time series 2\")\n",
"plt.legend()\n",
@@ -140,6 +140,7 @@
"outputs": [],
"source": [
"plt.plot(time_series_1[1000:1050],\"x\")\n",
+ "plt.ylim((8,19))\n",
"plt.show()"
]
},
@@ -149,7 +150,7 @@
"source": [
"One can clearly see that each sample lies in the vicinity of the previous one.\n",
"\n",
- "Below is an example for almost completely uncorrelated samples. The data points are taken from the same time series as in the previous example, but this time they are chosen with large gaps in between (every 200th sample is used). These samples appear to fluctuate a lot more randomly."
+ "Below is an example for almost completely uncorrelated samples. The data points are taken from the same time series as in the previous example, but this time they are chosen with large gaps in between (every 800th sample is used). These samples appear to fluctuate a lot more randomly."
]
},
{
@@ -158,7 +159,8 @@
"metadata": {},
"outputs": [],
"source": [
- "plt.plot(time_series_1[1000:11000:200],\"x\")\n",
+ "plt.plot(time_series_1[2000:42000:800],\"x\")\n",
+ "plt.ylim((8,19))\n",
"plt.show()"
]
},
@@ -179,9 +181,7 @@
"\n",
"Once we have computed the bin averages $\\overline{X}_i$, getting the SEM is straightforward: we can simply treat $\\overline{X}_i$ as an uncorrelated time series. In other words, we can compute the SEM by using equation (1) and (2)!\n",
"\n",
- "Let's implement this.\n",
- "\n",
- "In the code cell below, we load the simulation data into numpy arrays so that we can analyze them."
+ "Let's implement this."
]
},
{
@@ -190,7 +190,6 @@
"metadata": {},
"outputs": [],
"source": [
- "N_SAMPLES = len(time_series_1[1000:])\n",
"BIN_SIZE = 2000"
]
},
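
Reader's note on the hunk above: the markdown cell describes the binning analysis (average the series over bins, then treat the bin averages as uncorrelated samples), but the implementation lies outside the changed lines. The sketch below shows one way it could look. It is not the notebook's code: `ar_1_sketch`, `binned_sem`, the Gaussian noise, and the update rule `y = c + phi * y + eps * noise` are assumptions made for illustration, since the body of `ar_1_process` is not visible in this commit view. The parameters are the new ones passed for `time_series_1`.

```python
import numpy as np


def ar_1_sketch(n_samples, y0, c, phi, eps, n_warmup, rng):
    """Hypothetical AR(1) generator (assumed form, not the notebook's):
    y_next = c + phi * y + eps * standard normal noise.
    The first n_warmup values are discarded."""
    y = y0
    samples = np.empty(n_samples)
    for i in range(n_samples + n_warmup):
        y = c + phi * y + eps * rng.standard_normal()
        if i >= n_warmup:
            samples[i - n_warmup] = y
    return samples


def binned_sem(time_series, bin_size):
    """Standard error of the mean estimated from bin averages."""
    n_bins = len(time_series) // bin_size
    # Drop trailing samples that do not fill a complete bin.
    trimmed = time_series[:n_bins * bin_size]
    bin_avgs = trimmed.reshape(n_bins, bin_size).mean(axis=1)
    # Treat the bin averages as uncorrelated samples.
    return np.std(bin_avgs, ddof=1) / np.sqrt(n_bins)


rng = np.random.default_rng(43)
series = ar_1_sketch(100000, 0.0, 2.0, 0.85, 2.0, 100, rng)

naive_sem = binned_sem(series, 1)    # ignores correlations entirely
sem_2000 = binned_sem(series, 2000)  # bin size used in the tutorial
print(f"naive SEM: {naive_sem:.4f}, binned SEM: {sem_2000:.4f}")
```

For strongly correlated data, the SEM at a bin size of 2000 should come out noticeably larger than the naive estimate at bin size 1. The exact values depend on the assumed noise model and need not match the figures quoted in the tutorial.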
@@ -352,7 +351,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "You should see that the series converges to a value between 0.05 and 0.06, before transitioning into a noisy tail. The tail becomes increasingly noisy, because as the block size increases, the number of blocks decreases, thus resulting in worse statistics.\n",
+ "You should see that the series converges to a value between 0.02 and 0.03, before transitioning into a noisy tail. The tail becomes increasingly noisy, because as the block size increases, the number of blocks decreases, thus resulting in worse statistics.\n",
"\n",
"To extract the correct SEM from this plot, we can fit an exponential function to the first part of the data, that doesn't suffer from too much noise."
]
@@ -366,7 +365,7 @@
"from scipy.optimize import curve_fit\n",
"\n",
"# only fit to the first couple of SEMs\n",
- "CUTOFF = 300\n",
+ "CUTOFF = 600\n",
"\n",
"# sizes of the corresponding bins\n",
"sizes = np.arange(3,3+CUTOFF,dtype=int)\n",