V0.4.1 (#162)

* Change README logo if in dark mode (#95)
* Start transform tests & minor `RandomTimeShift` optimization (#94)
* Start transform tests & slight optimization in RandomTimeShift
* Add new length check for TimeCrop tests
* Code style
* Fixed name mangling.
* QAM/PSK Pulse shaping filter transition bandwidth corrected (#98)
* excess bandwidth is defined in proportion to signal bandwidth, not sampling rate, thus needs to be scaled by the samples per symbol
* filling in a comment to describe modification to code
* OFDM Modulator filter lengths estimated and bandwidth randomized (#99)
* cutoff frequency for LPF now randomized when using 'rand_lpf'
* derives a transition bandwidth from the cutoff frequency
* uses filter length approximating function for the randomized LPF
* using filter estimation function for pre-computed LPF taps
* Tests for visual inspection. (#103)
* 91 create generation performance benchmarks for each modulation type (#104)
* Initial benchmarking code.
* Some benchmarks
* Adding initial benchmarks.
* Fix action.
* 75 examine ofdm generation for potential speedups for sig53 (#105)
* Tests for visual inspection of modulation generation. (#102)
* Optimizations show significant improvement in generation speed.
* Nominal behavior after using scipy.
* Adding initial Dockerfile (#108)
* Incrementing version
* Fix float issue (#111)
* Initial draft of restructuring transforms (#106)
* Flatten transforms to mirror torchvision/audio structure & add reprs
* Address flake8 errors
* Update transform imports with restructuring
* Formatting.
* Fixing broken examples, formatting files, etc.
* Adjustments for generation speed
* Wideband generation working.
* 45 consider compatibility with torch 20 (#115)
* Seems to work
* Workflow is broken
* Python 3.7 not supported for Torch 2.0
* Adjusting test workflow
* Extracted out a method for filter design. Put DSP-only things in utils/dsp (#116)
* Benchmark and visualize wideband dataset generation. (#118)
* Extracted out a method for filter design. Put DSP-only things in utils/dsp
* New tests.
* Migrated to pytest. (#119)
* Added model instantiation tests for narrowband signals. (#120)
* 85 gmskgfsk also needs faster filtering due to convolution of long signals with gaussian pulse shape (#121)
* Change all references to convolution to scipy-based convolutions.
* Missed one.
* Add mypy workflow check and fix all mypy-found bugs (#123)
* Fix mypy in target_transforms
* Add mypy workflow for static type checking
* Fix typo in mypy.yml workflow
* Fix mypy in torchsig/transforms/functional.py
* Fix mypy in utils/types.py
* Fix mypy in torchsig/utils/
* Fix mypy in torchsig/transforms/
* Fix mypy in torchsig/datasets/
* Fix mypy in torchsig/models/
* Format with pyfmt
* Fix isinstance(x, Callable)
* Add PR template
* Adjusting module (#130)
* 131 configure package for pypi release (#132)
* Adjusting installation
* More information on pyproject.toml
* More information on pyproject.toml
* I guess some keywords in current documentation do not work :/.
* Ignore distributable artifacts
* More documentation and examples (#134)
* More documentation and examples
* Added script to train
* Remove duplicate script
* 126 create dockerfile and script for generating all versions of widebandsig53 (#135)
* Added generation script.
* Adding scripts
* Fixing mypy issue.
* 127 create test suite for visually validating transforms (#136)
* Some transforms here
* Most transforms included
* Fix path in generation script
* Added dependencies in Docker (#141)
* Removed use_gpu (#142)
* Adjustments for dataset generation.
* fixing syntax error, os.root is invalid, but os.path is valid (#147)
* Working example notebooks
* Working on another platform
* Bumping version

Co-authored-by: lboegner
Co-authored-by: Garrett Vanhoy
Co-authored-by: MattCarrickPL
TorchDSP · Jul 27, 2023 · 8049b43
1 parent 6bd7509
Showing 25 changed files with 1,177 additions and 956 deletions.
diff --git a/.dockerignore b/.dockerignore
@@ -0,0 +1 @@
+examples/*
diff --git a/.gitignore b/.gitignore
@@ -12,3 +12,4 @@ lightning_logs/
 *.jpg
 *.benchmarks/
 dist/
+examples/*.ipynb_checkpoints/
diff --git a/Dockerfile b/Dockerfile
@@ -2,14 +2,17 @@ FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
 
 ENV DEBIAN_FRONTEND=noninteractive
 
+RUN apt-get update && apt-get install -y \
+    build-essential \
+    libgl1-mesa-glx && \
+    rm -rf /var/lib/apt/lists/*
+
 ADD torchsig/ /build/torchsig
 
 ADD pyproject.toml /build/pyproject.toml
 
 RUN pip3 install /build
 
-RUN pip3 install notebook
+RUN pip3 install notebook jupyterlab
 
 WORKDIR /workspace/code
-
-ADD examples/ /workspace/code/examples
diff --git a/README.md b/README.md
@@ -48,36 +48,36 @@ If you'd like to generate the named datasets without messing with your current P
 
 ```
 docker build -t torchsig -f Dockerfile .
-docker run -u $(id -u ${USER}):$(id -g ${USER}) -v `pwd`:/workspace/code/torchsig torchsig python3 torchsig/scripts/generate_sig53.py --root=/workspace/code/torchsig/data --all=True
+docker run -u $(id -u ${USER}):$(id -g ${USER}) -v `pwd`:/workspace/code/torchsig torchsig python3 torchsig/scripts/generate_sig53.py --root=/workspace/code/torchsig/examples/sig53 --all=True
 ```
 
 For the wideband dataset, you can do:
 
 ```
 docker build -t torchsig -f Dockerfile .
-docker run -u $(id -u ${USER}):$(id -g ${USER}) -v `pwd`:/workspace/code/torchsig torchsig python3 torchsig/scripts/generate_wideband_sig53.py --root=/workspace/code/torchsig/data --all=True
+docker run -u $(id -u ${USER}):$(id -g ${USER}) -v `pwd`:/workspace/code/torchsig torchsig python3 torchsig/scripts/generate_wideband_sig53.py --root=/workspace/code/torchsig/examples/wideband_sig53 --all=True
 ```
 
 If you do not need to use Docker, you can also just generate using the regular command-line interface
 
 ```
-python3 torchsig/scripts/generate_sig53.py --root=torchsig/data --all=True
+python3 torchsig/scripts/generate_sig53.py --root=torchsig/examples --all=True
 ```
 
 or for the wideband dataset:
 
 ```
-python3 torchsig/scripts/generate_wideband_sig53.py --root=torchsig/data --all=True
+python3 torchsig/scripts/generate_wideband_sig53.py --root=torchsig/examples --all=True
 ```
 
-Then, be sure to point scripts looking for ```root``` to ```torchsig/data```.
+Then, be sure to point scripts looking for ```root``` to ```torchsig/examples```.
 
 ## Using the Dockerfile
 If you have Docker installed along with compatible GPUs and drivers, you can try:
 
 ```
 docker build -t torchsig -f Dockerfile .
-docker run -d --rm --network=host --shm-size=32g --gpus all --name torchsig_workspace torchsig tail -f /dev/null
+docker run -d --rm --network=host --shm-size=32g --gpus all --name torchsig_workspace -v `pwd`/examples:/workspace/code/examples torchsig tail -f /dev/null
 docker exec torchsig_workspace jupyter notebook --allow-root --ip=0.0.0.0 --no-browser
 ```
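
The `-v` bind mounts in the commands above mean that a `--root` given as a container path ends up under the host checkout. As a rough sketch of that mapping (the helper `container_to_host` and the host directory `/home/user/torchsig` are hypothetical, chosen only for illustration; substitute the output of `pwd`):

```python
from pathlib import PurePosixPath


def container_to_host(container_path: str, host_dir: str, mount_point: str) -> str:
    """Translate a path inside the container to the host path behind a bind mount.

    Hypothetical helper for illustration; it is not part of torchsig or Docker.
    """
    # Strip the mount point, then re-root the remainder under the host directory.
    relative = PurePosixPath(container_path).relative_to(mount_point)
    return str(PurePosixPath(host_dir) / relative)


# Assumed host checkout location (stands in for `pwd` in the README commands).
host_checkout = "/home/user/torchsig"

sig53_root = container_to_host(
    "/workspace/code/torchsig/examples/sig53",
    host_dir=host_checkout,
    mount_point="/workspace/code/torchsig",
)
print(sig53_root)
```

Under these assumptions the generated Sig53 files land in `examples/sig53` of the checkout, which lines up with the `root = "sig53/"` setting used by the example notebook when it is run from `examples/`.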
 

diff --git a/examples/00_example_sig53_dataset.ipynb b/examples/00_example_sig53_dataset.ipynb
@@ -0,0 +1,287 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "579e425b-e5de-4fdc-9908-ed8706d57194",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "# Example 00 - The Official Sig53 Dataset\n",
+    "This notebook walks through an example of how the official Sig53 dataset can be instantiated and analyzed."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0d636a9e-55c1-47a1-bc20-9c472acecc3b",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "----\n",
+    "### Import Libraries\n",
+    "First, import all the necessary public libraries as well as a few classes from the `torchsig` toolkit."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "edd181f0-893f-4646-8d7a-2fe2ee2280f6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from torchsig.utils.visualize import IQVisualizer, SpectrogramVisualizer\n",
+    "from torchsig.utils.dataset import SignalDataset\n",
+    "from torchsig.datasets.sig53 import Sig53\n",
+    "from torch.utils.data import DataLoader\n",
+    "from matplotlib import pyplot as plt\n",
+    "from typing import List\n",
+    "from tqdm import tqdm\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9d511e6b-7670-473b-a962-c08a9d341ec8",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "----\n",
+    "### Instantiate Sig53 Dataset\n",
+    "To instantiate the Sig53 dataset, several parameters are given to the imported `Sig53` class. These parameters are:\n",
+    "- `root` ~ A string to specify the root directory of where to instantiate and/or read an existing Sig53 dataset\n",
+    "- `train` ~ A boolean to specify if the Sig53 dataset should be the training (True) or validation (False) sets\n",
+    "- `impaired` ~ A boolean to specify if the Sig53 dataset should be the clean version or the impaired version\n",
+    "- `transform` ~ Optionally, pass in any data transforms here if the dataset will be used in an ML training pipeline\n",
+    "- `target_transform` ~ Optionally, pass in any target transforms here if the dataset will be used in an ML training pipeline\n",
+    "\n",
+    "A combination of the `train` and the `impaired` booleans determines which of the four (4) distinct Sig53 datasets will be instantiated:\n",
+    "- `train=True` & `impaired=False` = Clean training set of 1.06M examples\n",
+    "- `train=True` & `impaired=True` = Impaired training set of 5.3M examples\n",
+    "- `train=False` & `impaired=False` = Clean validation set of 106k examples\n",
+    "- `train=False` & `impaired=True` = Impaired validation set of 106k examples\n",
+    "\n",
+    "The final option of the impaired validation set is the dataset to be used when reporting any results with the official Sig53 dataset.\n",
+    "\n",
+    "Additional optional parameters of potential interest are:\n",
+    "- `regenerate` ~ A boolean specifying if the dataset should be regenerated even if an existing dataset is detected (Default: False)\n",
+    "- `eb_no` ~ A boolean specifying if the SNR should be defined as Eb/No if True (making higher order modulations more powerful) or as Es/No if False (Default: False)\n",
+    "- `use_signal_data` ~ A boolean specifying if the data and target information should be converted to `SignalData` objects as they are read in (Default: False)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ee772ec3-c2b8-4cde-af9a-b1284df09342",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Specify Sig53 Options\n",
+    "root = \"sig53/\"\n",
+    "train = False\n",
+    "impaired = False\n",
+    "transform = None\n",
+    "target_transform = None\n",
+    "\n",
+    "# Instantiate the Sig53 Dataset\n",
+    "sig53 = Sig53(\n",
+    "    root=root,\n",
+    "    train=train,\n",
+    "    impaired=impaired,\n",
+    "    transform=transform,\n",
+    "    target_transform=target_transform,\n",
+    ")\n",
+    "\n",
+    "# Retrieve a sample and print out information\n",
+    "idx = np.random.randint(len(sig53))\n",
+    "data, (label, snr) = sig53[idx]\n",
+    "print(\"Dataset length: {}\".format(len(sig53)))\n",
+    "print(\"Data shape: {}\".format(data.shape))\n",
+    "print(\"Label Index: {}\".format(label))\n",
+    "print(\"Label Class: {}\".format(Sig53.convert_idx_to_name(label)))\n",
+    "print(\"SNR: {}\".format(snr))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "80db34ff-80c2-49a0-96f3-d206cb307809",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "----\n",
+    "### Plot Subset to Verify\n",
+    "The `IQVisualizer` and the `SpectrogramVisualizer` can be passed a `DataLoader` and plot visualizations of the dataset. The `batch_size` of the `DataLoader` determines how many examples to plot for each iteration over the visualizer. Note that the dataset itself can also be indexed and plotted sequentially with any familiar Python plotting tools, as an alternative to the `torchsig` `Visualizer` shown below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b6b1d1fb-3663-459a-a6f7-35ca255c1365",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# For plotting, omit the SNR values\n",
+    "class DataWrapper(SignalDataset):\n",
+    "    def __init__(self, dataset):\n",
+    "        self.dataset = dataset\n",
+    "        super().__init__(dataset)\n",
+    "\n",
+    "    def __getitem__(self, idx):\n",
+    "        x, (y, _) = self.dataset[idx]\n",
+    "        return x, y\n",
+    "\n",
+    "    def __len__(self) -> int:\n",
+    "        return len(self.dataset)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "84e05a27",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plot_dataset = DataWrapper(sig53)\n",
+    "\n",
+    "data_loader = DataLoader(dataset=plot_dataset, batch_size=16, shuffle=True)\n",
+    "\n",
+    "\n",
+    "# Transform the plotting titles from the class index to the name\n",
+    "def target_idx_to_name(tensor: np.ndarray) -> List[str]:\n",
+    "    batch_size = tensor.shape[0]\n",
+    "    label = []\n",
+    "    for idx in range(batch_size):\n",
+    "        label.append(Sig53.convert_idx_to_name(int(tensor[idx])))\n",
+    "    return label\n",
+    "\n",
+    "\n",
+    "visualizer = IQVisualizer(\n",
+    "    data_loader=data_loader,\n",
+    "    visualize_transform=None,\n",
+    "    visualize_target_transform=target_idx_to_name,\n",
+    ")\n",
+    "\n",
+    "for figure in iter(visualizer):\n",
+    "    figure.set_size_inches(14, 9)\n",
+    "    plt.show()\n",
+    "    break"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "16be4f03-fa82-4d29-9f08-fe547fd7053a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Repeat but plot the spectrograms for a new random sampling of the data\n",
+    "visualizer = SpectrogramVisualizer(\n",
+    "    data_loader=data_loader,\n",
+    "    nfft=1024,\n",
+    "    visualize_transform=None,\n",
+    "    visualize_target_transform=target_idx_to_name,\n",
+    ")\n",
+    "\n",
+    "for figure in iter(visualizer):\n",
+    "    figure.set_size_inches(14, 9)\n",
+    "    plt.show()\n",
+    "    break"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0e8e793e-48f9-45a7-81a0-8276f61cc94a",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "----\n",
+    "### Analyze Dataset\n",
+    "The dataset can also be analyzed at the macro level for details such as the distribution of classes and SNR values. This exercise is performed below to show the nearly uniform distribution across each."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a988b188-fa07-4505-8f59-9bfab387243d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Loop through the dataset recording classes and SNRs\n",
+    "class_counter_dict = {\n",
+    "    class_name: 0 for class_name in list(Sig53._idx_to_name_dict.values())\n",
+    "}\n",
+    "all_snrs = []\n",
+    "\n",
+    "for idx in tqdm(range(len(sig53))):\n",
+    "    data, (modulation, snr) = sig53[idx]\n",
+    "    class_counter_dict[Sig53.convert_idx_to_name(modulation)] += 1\n",
+    "    all_snrs.append(snr)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "020b5655-c6c4-4806-8b6a-dd027dbdb36f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Plot the distribution of classes\n",
+    "class_names = list(class_counter_dict.keys())\n",
+    "num_classes = list(class_counter_dict.values())\n",
+    "\n",
+    "plt.figure(figsize=(9, 9))\n",
+    "plt.pie(num_classes, labels=class_names)\n",
+    "plt.title(\"Class Distribution Pie Chart\")\n",
+    "plt.show()\n",
+    "\n",
+    "plt.figure(figsize=(11, 4))\n",
+    "plt.bar(class_names, num_classes)\n",
+    "plt.xticks(rotation=90)\n",
+    "plt.title(\"Class Distribution Bar Chart\")\n",
+    "plt.xlabel(\"Modulation Class Name\")\n",
+    "plt.ylabel(\"Counts\")\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c12ff742-cf0f-47f4-96ee-7ccdd147add2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Plot the distribution of SNR values\n",
+    "plt.figure(figsize=(11, 4))\n",
+    "plt.hist(x=all_snrs, bins=100)\n",
+    "plt.title(\"SNR Distribution\")\n",
+    "plt.xlabel(\"SNR Bins (dB)\")\n",
+    "plt.ylabel(\"Counts\")\n",
+    "plt.show()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.11"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
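
The notebook above relies on two small patterns: a wrapper that drops the SNR from each `(data, (label, snr))` item, and a loop that tallies classes and SNRs. The following is a minimal sketch of both without requiring `torchsig` or a generated dataset. `ToyDataset` and `DropSNR` are hypothetical stand-ins (not torchsig classes), and the data and SNR values are arbitrary placeholders; only the tuple layout mirrors what the notebook unpacks.

```python
from collections import Counter
import random


class ToyDataset:
    """Hypothetical stand-in for Sig53: items are (data, (class_index, snr_db))."""

    def __init__(self, num_items: int, num_classes: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Placeholder data and arbitrary SNR values; only the tuple layout matters.
        self.items = [
            ([0.0, 0.0], (idx % num_classes, rng.uniform(0.0, 30.0)))
            for idx in range(num_items)
        ]

    def __getitem__(self, idx: int):
        return self.items[idx]

    def __len__(self) -> int:
        return len(self.items)


class DropSNR:
    """Wrap a dataset so indexing returns (data, label) and omits the SNR."""

    def __init__(self, dataset) -> None:
        self.dataset = dataset

    def __getitem__(self, idx: int):
        data, (label, _snr) = self.dataset[idx]
        return data, label

    def __len__(self) -> int:
        return len(self.dataset)


dataset = ToyDataset(num_items=530, num_classes=53)
plot_dataset = DropSNR(dataset)

# Tally how many examples fall into each class index.
class_counts = Counter(plot_dataset[idx][1] for idx in range(len(plot_dataset)))

# Collect the SNR of every example from the unwrapped dataset.
all_snrs = [dataset[idx][1][1] for idx in range(len(dataset))]

print(len(class_counts), min(class_counts.values()), max(class_counts.values()))
```

With 530 toy items spread round-robin over 53 classes, every class is counted 10 times. Using `Counter` here is a design choice of this sketch; the notebook itself pre-fills a dictionary keyed by class name and increments it inside the loop.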