SMC + FIVO implementation #16

Open
wants to merge 97 commits into
base: main


andrewwarrington

Main components:

  • SMC implementation:
    • Self-contained code in ssm -> inference -> smc.py.
    • Implements both BPFs and SMC.
    • To use proposals, the proposal must be passed in as a callable with a fixed set of arguments.
    • Returns an object of type SMCPosterior that contains all sweep information.
    • Can be vmapped fairly easily, either to run repeated sweeps or to batch over different trials.
    • Accompanying notebook in ssm -> notebooks -> smc-lds.ipynb
    • pytest scripts in tests -> inference -> smc.py
      • One short test that checks that everything compiles, runs, and returns the right shape.
      • One longer test that checks that the estimated marginal likelihood is correct. (Marked by @pytest.mark.slow)
  • Conditional generators and proposals:
    • ssm -> inference -> conditional_generators.py contains a template for flexible implementations of conditional distributions.
    • The example included is for a conditional independent multivariate Gaussian.
    • The structure is a little convoluted: you call build_independent_gaussian_generator, which builds a Flax linen neural network module with the prescribed trunk function and mean and variance head functions.
    • Jitting this sometimes fails because of the way I build it, but it is sufficient for now.
    • ssm -> inference -> proposals.py then wraps a call to a conditional generator with the prescribed trunk function, and mean and variance head functions, and also contains functions for formatting the input and output to the proposal. These input and output wrappers would need to be changed for different proposal structures or input templates.
    • There can be either a single proposal or multiple stacked proposals. To use multiple proposals (indicated by proposal_type = 'INDEPENDENT'), there must be as many proposals as there are time steps; the time step then indexes the proposal to use.
  • FIVO:
    • ssm -> inference -> fivo.py contains some helper functions for running FIVO on a model.
    • FIVO uses the SMC sweep to compute a biased estimate of the expected log marginal likelihood (by Jensen's inequality, the expected log of the SMC evidence estimate is a lower bound on the log marginal likelihood).
    • The boilerplate needed to implement FIVO is fairly minimal, but much of the model-specific configuration has to be implemented by the user. There is therefore a template FIVO implementation in the accompanying notebook ssm -> notebooks -> fivo-lds.ipynb.
    • For this code I have introduced a ._parameters paradigm. Using boilerplate code this will capture the default calling arguments when the model is initialised. This then allows for a shallow tree-flatten and un-flatten to be performed, but using the named and interpretable calling arguments.
      • Requires that the inputs are unconstrained parameters. It also implicitly requires that the inputs are "leaf-like" variables/nodes; if they are not, it may fail silently.
      • This will need to be updated to something more thorough, but it is good enough for the time being.
    • The model parameters to be learned are designated using string names, which are then used either to pull the values out of ._parameters or to inject the values into a new instantiation of the model.
    • Resampling gradients are currently commented out (and will throw a loud NotImplementedError).
  • Added a few small extras to utils.py and started a utility file for neural network code, nn_util.py.
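
For readers unfamiliar with the SMC component described above, here is a minimal sketch of a bootstrap particle filter in JAX. It is not the code in smc.py: the function name `bootstrap_particle_filter`, its arguments, and the toy 1-D linear Gaussian model are all assumptions made for illustration. It shows the two properties the description relies on, namely that one sweep returns a log marginal likelihood estimate and that sweeps can be repeated with vmap.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp
from jax.scipy.stats import norm


def bootstrap_particle_filter(key, observations, num_particles=200,
                              a=0.9, q_std=1.0, r_std=0.5):
    """Bootstrap particle filter for a toy 1-D linear Gaussian model (hypothetical).

    x_0 ~ N(0, q_std^2), x_t ~ N(a * x_{t-1}, q_std^2), y_t ~ N(x_t, r_std^2).
    Returns the log marginal likelihood estimate and the resampled particles.
    """
    key, init_key = jax.random.split(key)
    init_particles = q_std * jax.random.normal(init_key, (num_particles,))

    def step(carry, inputs):
        particles, log_z = carry
        y_t, step_key = inputs
        resample_key, propagate_key = jax.random.split(step_key)

        # Bootstrap proposal = prior dynamics, so weights are emission likelihoods.
        log_w = norm.logpdf(y_t, loc=particles, scale=r_std)
        # Accumulate the log of the average unnormalised weight.
        log_z = log_z + logsumexp(log_w) - jnp.log(num_particles)

        # Multinomial resampling at every time step.
        idx = jax.random.categorical(resample_key, log_w, shape=(num_particles,))
        resampled = particles[idx]

        # Propagate the resampled particles through the transition dynamics.
        noise = jax.random.normal(propagate_key, (num_particles,))
        next_particles = a * resampled + q_std * noise
        return (next_particles, log_z), resampled

    step_keys = jax.random.split(key, observations.shape[0])
    (_, log_z), filtered = jax.lax.scan(
        step, (init_particles, jnp.zeros(())), (observations, step_keys))
    return log_z, filtered


# Stand-in observations, not data from the PR.
observations = jnp.sin(jnp.linspace(0.0, 3.0, 25))
log_z, filtered = bootstrap_particle_filter(jax.random.PRNGKey(0), observations)

# Repeated sweeps: vmap over independent PRNG keys.
sweep_keys = jax.random.split(jax.random.PRNGKey(1), 4)
log_zs = jax.vmap(lambda k: bootstrap_particle_filter(k, observations)[0])(sweep_keys)
```

Averaging `log_zs` over sweeps gives the kind of quantity FIVO maximises, although the actual objective in fivo.py may be assembled differently.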

Obviously the FIVO code only works for Gaussian LDSs at the moment. The SMC code should work for everything. Only independent Gaussian proposals are defined at the moment; we should look to add more types of proposal.
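
As a rough illustration of stacked independent Gaussian proposals, where the time step selects which proposal to use, here is a minimal sketch. The factory name `make_stacked_gaussian_proposal` and the callable signature `(key, t, prev_particles)` are hypothetical; the fixed argument set expected by smc.py is not reproduced here, and the real proposals are built from neural network heads rather than raw parameter arrays.

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm


def make_stacked_gaussian_proposal(means, log_stds):
    """Build a proposal callable from per-time-step parameters (hypothetical API).

    means, log_stds: arrays of shape (num_timesteps, latent_dim); row t holds the
    parameters of the diagonal Gaussian proposal used at time step t.
    """
    def proposal(key, t, prev_particles):
        # The time step indexes which of the stacked proposals to use.
        mean_t, std_t = means[t], jnp.exp(log_stds[t])
        loc = prev_particles + mean_t  # assumption: propose a shift of the previous state
        noise = jax.random.normal(key, prev_particles.shape)
        new_particles = loc + std_t * noise
        # Independent dimensions: the joint log density is a sum over dimensions.
        log_q = norm.logpdf(new_particles, loc=loc, scale=std_t).sum(axis=-1)
        return new_particles, log_q

    return proposal


num_timesteps, latent_dim, num_particles = 5, 2, 8
# Proposal t shifts the previous particles by t in every dimension, with unit variance.
means = jnp.arange(5.0)[:, None] * jnp.ones((1, latent_dim))
log_stds = jnp.zeros((num_timesteps, latent_dim))
proposal = make_stacked_gaussian_proposal(means, log_stds)

particles = jnp.zeros((num_particles, latent_dim))
new_particles, log_q = proposal(jax.random.PRNGKey(0), 3, particles)
```

Returning the log density alongside the sample keeps the importance-weight computation in the SMC sweep independent of the proposal family, which would make it easier to add other proposal types later.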

I am reasonably confident in this current implementation, but we should sit down and do as thorough a code review as you like.

A

andrew warrington added 30 commits November 9, 2021 15:57
… is resolved by reducing the emission covariance, which suggests that maybe it isnt actually a shift by one, but is a shift by some parameteric amount) and why the evidence approximations converge to the wrong value
… i expected though. it consistently overestimates the evidence for higher initial and emission covariances, which is kind of weird. but squashing these down and increasing the number of particles and the evidence approximations converge.
…he functions so that it can be notebook-ized and does not require duplicating a ton of code

github-actions bot commented Dec 6, 2021

Unit Test Results

1 file ±0 | 1 suite ±0 | duration 10m 16s ⏱️ (-20m 47s)
38 tests (-34): 38 passed ✔️ (-34), 0 skipped 💤 (±0), 0 failed (±0)

Results for commit 0111b9e. ± Comparison against base commit d84b3fe.

This pull request removes 72 and adds 38 tests. Note that renamed tests count towards both.
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_emissions_dim[10]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_emissions_dim[12]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_emissions_dim[2]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_emissions_dim[4]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_emissions_dim[6]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_emissions_dim[8]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_latent_dim[10]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_latent_dim[12]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_latent_dim[2]
tests.timing_comparisons.test_time_hmm.TestGaussianARHMM ‑ test_arhmm_em_fit_latent_dim[4]
…
tests.arhmm.test_arhmm ‑ test_gaussian_arhmm_em_fit
tests.arhmm.test_arhmm ‑ test_gaussian_arhmm_jit
tests.arhmm.test_arhmm ‑ test_gaussian_arhmm_sample
tests.arhmm.test_arhmm ‑ test_gaussian_arhmm_sample_is_consistent
tests.hmm.test_hmm ‑ test_bernoulli_hmm_em_fit
tests.hmm.test_hmm ‑ test_bernoulli_hmm_jit
tests.hmm.test_hmm ‑ test_bernoulli_hmm_sample
tests.hmm.test_hmm ‑ test_bernoulli_hmm_sample_is_consistent
tests.hmm.test_hmm ‑ test_gaussian_hmm_em_fit
tests.hmm.test_hmm ‑ test_gaussian_hmm_jit
…

♻️ This comment has been updated with latest results.

andrew warrington added 8 commits December 7, 2021 13:06
…est script. also added the flexibility for smc posterior to handle states without a latent dimension. also added some of the infrastructure for using arbitrary pytrees in the proposal. i figure that if this is the case then the user has created the _front-end_ for handling such a pytree as output
@andrewwarrington
Author

Coolio, so, I've added some extra tests (FIVO and SMC in some discrete models), and I've fixed up some of the interface and tools stuff Collin and I spoke about. Holla and let me know :)


3 participants