updated documentation
rmj3197 committed Sep 4, 2024
1 parent 472486e commit 787a957
Showing 8 changed files with 191 additions and 4 deletions.
1 change: 1 addition & 0 deletions QuadratiK/spherical_clustering/_pkbd.py
@@ -18,6 +18,7 @@
class PKBD:
    """
    Class for estimating density and generating samples of Poisson-kernel based distribution (PKBD).
    Details on PKBD can be found in :ref:`User Guide <pkbd>`.
    """

    def __init__(self) -> None:
11 changes: 8 additions & 3 deletions README.rst
@@ -75,9 +75,11 @@ Introduction

The QuadratiK package is implemented in both **R** and **Python**, providing a comprehensive set of goodness-of-fit tests and a clustering technique using kernel-based quadratic distances. This framework aims to bridge the gap between the statistical and machine learning literatures. It includes:

* **Goodness-of-Fit Tests** : The software implements one, two, and k-sample tests for goodness of fit, offering an efficient and mathematically sound way to assess the fit of probability distributions. Expanded capabilities include supporting tests for uniformity on the $d$-dimensional Sphere based on Poisson kernel densities.
* **Goodness-of-Fit Tests** : The software implements one, two, and k-sample tests for goodness of fit, offering an efficient and mathematically sound way to assess the fit of probability distributions. Expanded capabilities include tests for uniformity on the $d$-dimensional sphere based on Poisson kernel densities. Our tests are particularly useful for large, high-dimensional datasets where assessing the fit of probability models is of interest. Specifically, we offer tests for normality, as well as two- and k-sample tests for the equality of two or more distributions, i.e., $H_0: F_1 = F_2$ and $H_0: F_1 = \\ldots = F_k$ respectively. The proposed tests perform well in terms of level and power for contiguous alternatives, heavy-tailed distributions, and higher dimensions.

* **Clustering Algorithm for Spherical Data**: the package incorporates a unique clustering algorithm specifically tailored for spherical data. This algorithm leverages a mixture of Poisson-kernel-based densities on the sphere, enabling effective clustering of spherical data or data that has been spherically transformed. This facilitates the uncovering of underlying patterns and relationships in the data. Additionally, the package also includes Poisson Kernel-based Densities random number generation.
* **Poisson Kernel-based Distribution (PKBD)** : The package also includes functionality for generating random samples from PKBD and computing the density value. A short guide on PKBD is included in the `User Guide <https://quadratik.readthedocs.io/en/latest/user_guide/pkbd.html>`_. For more details, please see `Golzy and Markatou (2020) <https://www.tandfonline.com/doi/abs/10.1080/10618600.2020.1740713>`_.

* **Clustering Algorithm for Spherical Data**: The package incorporates a unique clustering algorithm specifically tailored for spherical data. This algorithm leverages a mixture of Poisson-kernel-based densities on the sphere, enabling effective clustering of spherical data or data that has been spherically transformed. This facilitates the uncovering of underlying patterns and relationships in the data. The clustering algorithm is especially useful when the data are noisy and when there is non-negligible overlap between clusters.

* **Additional Features**: Alongside these functionalities, the software includes additional graphical functions, aiding users in validating cluster results as well as visualizing and representing clustering results. This enhances the interpretability and usability of the analysis.

@@ -107,6 +109,9 @@ Usage Examples
- `QuadratiK Examples <https://quadratik.readthedocs.io/en/latest/user_guide/basic_usage.html>`_:
A collection of basic examples that demonstrate how to use the core functionalities of the QuadratiK package. Ideal for new users to get started quickly.

- `An Introduction to Poisson Kernel-Based distributions <https://quadratik.readthedocs.io/en/latest/user_guide/pkbd.html>`_:
A short introduction to the Poisson Kernel-Based distributions.

- `Random sampling from the Poisson kernel-based density <https://quadratik.readthedocs.io/en/latest/user_guide/gen_plot_rpkb.html>`_:
Learn how to generate random samples from the Poisson kernel-based density and visualize the results.

@@ -217,7 +222,7 @@ The code of conduct can be found at `Code of Conduct <https://quadratik.readthed
License
--------

This project uses the GPL-3.0 license, with a full version of the license included in the repository `here <https://github.com/rmj3197/QuadratiK/blob/master/LICENSE>`_.
This project uses the GPL-3.0 license, with a full version of the license included in the `repository <https://github.com/rmj3197/QuadratiK/blob/master/LICENSE>`_.


Citation
@@ -52,6 +52,68 @@
@@ -52,6 +52,32 @@
@@ -58,6 +58,48 @@
2 changes: 1 addition & 1 deletion doc/source/changelog/v1.2.dev0.rst
@@ -1,5 +1,5 @@
QuadratiK Version 1.2.dev0
=========================
============================

This is the current version under development. So far, this version includes:

1. **[NEW]** Included usage instructions for the Dashboard Application under the User Guide.
1 change: 1 addition & 0 deletions doc/source/user_guide/index.rst
@@ -21,6 +21,7 @@ Random Sampling from PKBD
.. toctree::
   :maxdepth: 4

   pkbd.rst
   gen_plot_rpkb

Usage Instructions for Dashboard Application
50 changes: 50 additions & 0 deletions doc/source/user_guide/pkbd.rst
@@ -0,0 +1,50 @@
.. _pkbd:

An Introduction to Poisson Kernel-Based Distributions
=======================================================

The Poisson kernel-based densities are based on the normalized Poisson kernel
and are defined on the unit sphere in :math:`\mathbb{R}^d`. Given a vector
:math:`\boldsymbol{\mu} \in \mathcal{S}^{d-1}`, where :math:`\mathcal{S}^{d-1} =
\{\mathbf{x} \in \mathbb{R}^d : \|\mathbf{x}\| = 1\}`, and a parameter :math:`\rho` such that
:math:`0 < \rho < 1`, the probability density function of a :math:`d`-variate
Poisson kernel-based density is defined by:

.. math::

   f(\mathbf{x}|\rho, \boldsymbol{\mu}) = \frac{1-\rho^2}{\omega_d
   \|\mathbf{x} - \rho \boldsymbol{\mu}\|^d},

where :math:`\boldsymbol{\mu}` is the vector that orients the center of the
distribution, and :math:`\rho` controls the concentration of the distribution
around :math:`\boldsymbol{\mu}` and is related to the variance of the
distribution. Recall that, for :math:`\mathbf{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d`,
:math:`\|\mathbf{x}\| = \sqrt{x_1^2 + \ldots + x_d^2}`. Furthermore, :math:`\omega_d =
2\pi^{d/2} [\Gamma(d/2)]^{-1}` is the surface area of the unit sphere in
:math:`\mathbb{R}^d` (see Golzy and Markatou, 2020). When :math:`\rho \to 0`,
the Poisson kernel-based density tends to the uniform density on the sphere.
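
The density formula can be evaluated directly. The following is a minimal sketch in plain Python, not the QuadratiK implementation; the function names ``sphere_area`` and ``pkbd_density`` are hypothetical and chosen only for illustration. It assumes that both the evaluation point and the center are unit vectors of the same length.

```python
import math


def sphere_area(d):
    """Surface area omega_d = 2 * pi^(d/2) / Gamma(d/2) of the unit sphere in R^d."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)


def pkbd_density(x, mu, rho):
    """Evaluate the PKBD density at the unit vector x (illustrative helper).

    x and mu are sequences of d floats with unit Euclidean norm,
    and rho must satisfy 0 < rho < 1.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must satisfy 0 < rho < 1")
    d = len(x)
    # Euclidean distance between x and the scaled center rho * mu.
    dist = math.sqrt(sum((xi - rho * mi) ** 2 for xi, mi in zip(x, mu)))
    return (1.0 - rho ** 2) / (sphere_area(d) * dist ** d)
```

As a sanity check, for very small :math:`\rho` the returned value approaches :math:`1/\omega_d`, the uniform density on the sphere, in agreement with the limit stated above.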

The connection of the PKBDs to other distributions is discussed in detail in
Golzy and Markatou (2020). Here we note that when :math:`d=2`, PKBDs reduce to
the wrapped Cauchy distribution. Additionally, for a specific choice of the
parameters :math:`\rho` and :math:`\boldsymbol{\mu}`, the two-dimensional PKBD becomes a
two-dimensional projected normal distribution. However, the connection with
:math:`d`-dimensional projected normal distributions does not extend beyond
:math:`d=2`.
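
To make the :math:`d=2` case concrete: writing the point and the center in terms of angles :math:`\theta` and :math:`\theta_0`, the squared distance is :math:`1 + \rho^2 - 2\rho\cos(\theta - \theta_0)` and :math:`\omega_2 = 2\pi`, so the PKBD density becomes the standard wrapped Cauchy density :math:`(1-\rho^2) / [2\pi(1 + \rho^2 - 2\rho\cos(\theta - \theta_0))]`. The sketch below checks this identity numerically; the function names and the parameter values are illustrative assumptions, not part of the package.

```python
import math


def pkbd_density_2d(theta, theta0, rho):
    """PKBD density on the unit circle, with x and mu given by their angles."""
    x = (math.cos(theta), math.sin(theta))
    mu = (math.cos(theta0), math.sin(theta0))
    # For d = 2 the denominator uses the squared distance and omega_2 = 2 * pi.
    sq_dist = (x[0] - rho * mu[0]) ** 2 + (x[1] - rho * mu[1]) ** 2
    return (1.0 - rho ** 2) / (2.0 * math.pi * sq_dist)


def wrapped_cauchy_density(theta, theta0, rho):
    """Wrapped Cauchy density with mean direction theta0 and concentration rho."""
    return (1.0 - rho ** 2) / (
        2.0 * math.pi * (1.0 + rho ** 2 - 2.0 * rho * math.cos(theta - theta0))
    )


# Largest absolute difference between the two densities on a grid of angles.
max_gap = max(
    abs(pkbd_density_2d(math.radians(k), 0.3, 0.7)
        - wrapped_cauchy_density(math.radians(k), 0.3, 0.7))
    for k in range(360)
)
```

The value of ``max_gap`` should be at the level of floating-point rounding error.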

Golzy and Markatou (2020) proposed an acceptance-rejection method for
simulating data from a PKBD using von Mises-Fisher envelopes (:code:`rejvmf`).
Furthermore, Sablica, Hornik, and Leydold (2023) proposed new methods
for simulating from the PKBD, using angular central Gaussian envelopes
(:code:`rejacg`).
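
The general acceptance-rejection idea can be illustrated with a much simpler envelope. The sketch below uses a *uniform* envelope on the sphere, which is neither the :code:`rejvmf` nor the :code:`rejacg` method and becomes inefficient as :math:`\rho \to 1`; it is shown only to make the mechanism explicit. Because the density is largest at :math:`\mathbf{x} = \boldsymbol{\mu}`, where the distance equals :math:`1-\rho`, a uniform proposal is accepted with probability :math:`[(1-\rho)/\|\mathbf{x} - \rho\boldsymbol{\mu}\|]^d`. The function names are hypothetical, and the center is assumed to be a unit vector.

```python
import math
import random


def uniform_on_sphere(d, rng):
    """Draw a point uniformly on the unit sphere in R^d by normalizing Gaussians."""
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in z))
        if norm > 0.0:
            return [v / norm for v in z]


def sample_pkbd_uniform_envelope(n, mu, rho, seed=None):
    """Draw n points from a PKBD by rejection from a uniform envelope (sketch).

    mu is assumed to be a unit vector and rho must satisfy 0 < rho < 1.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must satisfy 0 < rho < 1")
    rng = random.Random(seed)
    d = len(mu)
    samples = []
    while len(samples) < n:
        x = uniform_on_sphere(d, rng)
        dist = math.sqrt(sum((xi - rho * mi) ** 2 for xi, mi in zip(x, mu)))
        # Ratio of the target density to its maximum, attained at x = mu.
        if rng.random() < ((1.0 - rho) / dist) ** d:
            samples.append(x)
    return samples
```

With this envelope the expected acceptance rate is :math:`(1-\rho)^{d-1}/(1+\rho)`, which is why tighter envelopes such as those cited above are preferable for concentrated distributions or higher dimensions.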

Please see :py:class:`QuadratiK.spherical_clustering.PKBD` for details on using the
random sample generation and density estimation functions of the PKB distribution.
Usage examples are included in the :doc:`User Guide <index>`.

References
************

Golzy, M., & Markatou, M. (2020). Poisson Kernel-Based Clustering on the Sphere: Convergence Properties, Identifiability, and a Method of Sampling. Journal of Computational and Graphical Statistics, 29(4), 758–770. https://doi.org/10.1080/10618600.2020.1740713

Sablica, L., Hornik, K., & Leydold, J. (2023). Efficient sampling from the PKBD distribution. Electronic Journal of Statistics, 17(2), 2180–2209.
