tweaks and corrections in binning analysis tutorial, make it use the same data sets as the autocorrelation tutorial
espressomd · May 12, 2021 · ae19c5a
1 parent 360587b
Showing 1 changed file with 12 additions and 13 deletions.
diff --git a/doc/tutorials/error_analysis/error_analysis_part1.ipynb b/doc/tutorials/error_analysis/error_analysis_part1.ipynb
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Tutorial: Error Estimation"
+    "# Tutorial: Error Estimation - Part 1 (Introduction and Binning Analysis)"
    ]
   },
   {
@@ -43,7 +43,7 @@
     "import numpy as np\n",
     "import matplotlib.pyplot as plt\n",
     "\n",
-    "np.random.seed(44)\n",
+    "np.random.seed(43)\n",
     "\n",
     "def ar_1_process(n_samples, y0, c, phi, eps, n_warmup):\n",
     "    y = y0\n",
@@ -57,11 +57,11 @@
     "\n",
     "N_SAMPLES = 100000\n",
     "\n",
-    "time_series_1 = ar_1_process(N_SAMPLES, 0.0, 1.0, 0.9, 3.0, 100)\n",
-    "time_series_2 = ar_1_process(N_SAMPLES, 0.0, 0.05, 0.998, 1.0, 1000)\n",
+    "time_series_1 = ar_1_process(N_SAMPLES, 0.0, 2.0, 0.85, 2.0, 100)\n",
+    "time_series_2 = ar_1_process(N_SAMPLES, 0.0, 0.05, 0.995, 1.0, 1000)\n",
     "\n",
     "\n",
-    "plt.title(\"The first 2000 samples of both time series\")\n",
+    "plt.title(\"The first 1000 samples of both time series\")\n",
     "plt.plot(time_series_1[0:1000], label=\"time series 1\")\n",
     "plt.plot(time_series_2[0:1000], label=\"time series 2\")\n",
     "plt.legend()\n",
@@ -140,6 +140,7 @@
    "outputs": [],
    "source": [
     "plt.plot(time_series_1[1000:1050],\"x\")\n",
+    "plt.ylim((8,19))\n",
     "plt.show()"
    ]
   },
@@ -149,7 +150,7 @@
    "source": [
     "One can clearly see that each sample lies in the vicinity of the previous one.\n",
     "\n",
-    "Below is an example for almost completely uncorrelated samples. The data points are taken from the same time series as in the previous example, but this time they are chosen with large gaps in between (every 200th sample is used). These samples appear to fluctuate a lot more randomly."
+    "Below is an example for almost completely uncorrelated samples. The data points are taken from the same time series as in the previous example, but this time they are chosen with large gaps in between (every 800th sample is used). These samples appear to fluctuate a lot more randomly."
    ]
   },
   {
@@ -158,7 +159,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "plt.plot(time_series_1[1000:11000:200],\"x\")\n",
+    "plt.plot(time_series_1[2000:42000:800],\"x\")\n",
+    "plt.ylim((8,19))\n",
     "plt.show()"
    ]
   },
@@ -179,9 +181,7 @@
     "\n",
     "Once we have computed the bin averages $\\overline{X}_i$, getting the SEM is straightforward: we can simply treat $\\overline{X}_i$ as an uncorrelated time series. In other words, we can compute the SEM by using equation (1) and (2)!\n",
     "\n",
-    "Let's implement this.\n",
-    "\n",
-    "In the code cell below, we load the simulation data into numpy arrays so that we can analyze them."
+    "Let's implement this."
    ]
   },
   {
@@ -190,7 +190,6 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "N_SAMPLES = len(time_series_1[1000:])\n",
     "BIN_SIZE = 2000"
    ]
   },
@@ -352,7 +351,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "You should see that the series converges to a value between 0.05 and 0.06, before transitioning into a noisy tail. The tail becomes increasingly noisy, because as the block size increases, the number of blocks decreases, thus resulting in worse statistics.\n",
+    "You should see that the series converges to a value between 0.02 and 0.03, before transitioning into a noisy tail. The tail becomes increasingly noisy, because as the block size increases, the number of blocks decreases, thus resulting in worse statistics.\n",
     "\n",
     "To extract the correct SEM from this plot, we can fit an exponential function to the first part of the data, that doesn't suffer from too much noise."
    ]
@@ -366,7 +365,7 @@
     "from scipy.optimize import curve_fit\n",
     "\n",
     "# only fit to the first couple of SEMs\n",
-    "CUTOFF = 300\n",
+    "CUTOFF = 600\n",
     "\n",
     "# sizes of the corresponding bins\n",
     "sizes = np.arange(3,3+CUTOFF,dtype=int)\n",
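
Note on the diff: the hunks show only the signature and first line of `ar_1_process`, so the sketch below is an assumption rather than the notebook's code. It assumes the usual AR(1) update `y <- c + phi * y + eps * N(0, 1)`, adds a hypothetical `rng` argument instead of the global `np.random.seed(43)`, and introduces a hypothetical helper `binned_sem`. It is meant only to illustrate what the new `time_series_1` parameters and `BIN_SIZE = 2000` feed into.

```python
import numpy as np


def ar_1_process(n_samples, y0, c, phi, eps, n_warmup, rng):
    """AR(1) series; the update rule is an assumption (body not shown in the diff)."""
    y = y0
    samples = np.empty(n_samples)
    noise = rng.normal(size=n_warmup + n_samples)
    for i in range(n_warmup + n_samples):
        y = c + phi * y + eps * noise[i]
        # discard the warm-up steps, keep the rest
        if i >= n_warmup:
            samples[i - n_warmup] = y
    return samples


def binned_sem(data, bin_size):
    """Hypothetical helper: SEM computed from the averages of non-overlapping bins."""
    n_bins = len(data) // bin_size
    # drop the incomplete trailing bin, then average within each bin
    bin_avgs = data[:n_bins * bin_size].reshape(n_bins, bin_size).mean(axis=1)
    # treat the bin averages as uncorrelated samples
    return np.std(bin_avgs, ddof=1) / np.sqrt(n_bins)


N_SAMPLES = 100000
BIN_SIZE = 2000

rng = np.random.default_rng(43)
time_series_1 = ar_1_process(N_SAMPLES, 0.0, 2.0, 0.85, 2.0, 100, rng)

# naive SEM ignores correlations between successive samples
naive_sem = np.std(time_series_1, ddof=1) / np.sqrt(N_SAMPLES)
sem_binned = binned_sem(time_series_1, BIN_SIZE)

print(f"naive SEM:  {naive_sem:.4f}")
print(f"binned SEM: {sem_binned:.4f}")
```

Under these assumptions the binned estimate should come out noticeably larger than the naive one, since neighbouring samples of an AR(1) process with `phi = 0.85` are strongly correlated.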