
Releases: pymc-devs/pymc

v4.0.0b2

14 Jan 14:20
1a0b1e9
Pre-release

PyMC 4.0.0 beta 2

This beta release removes several warnings, polishes APIs, adds more distributions, and includes internal refactorings.

Here is the full list of changes compared to 4.0.0b1.

For a current list of changes w.r.t. the upcoming v3.11.5 see RELEASE-NOTES.md.

Notable changes & features

  • Introduction of pm.Data(..., mutable=False/True) and corresponding pm.ConstantData/pm.MutableData wrappers (see #5295 and the sketch after this list).
  • The warning about theano or pymc3 being installed in parallel was removed.
  • dims can again be specified alongside shape or size (see #5325).
  • pm.draw was added to draw prior samples from a variable (see #5340).
  • Renames of model properties & methods like Model.logpt.
  • A function to find a prior based on lower/upper bounds (see #5231).
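
A minimal sketch of the new data containers and pm.draw, assuming the 4.0.0b2 API and made-up data and variable names (x, sigma, mu, y):

import pymc as pm

with pm.Model() as model:
    x = pm.MutableData("x", [1.0, 2.0, 3.0])   # re-settable shared container
    sigma = pm.ConstantData("sigma", 1.0)      # fixed constant baked into the graph
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("y", mu=mu * x, sigma=sigma, observed=[1.1, 2.3, 2.9])

# Draw prior samples from a single variable (the draws kwarg is assumed here)
prior_mu = pm.draw(mu, draws=500)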

v4.0.0b1

16 Dec 10:14
Pre-release

PyMC 4.0.0 beta 1

⚠ This is the first beta of the next major release, PyMC 4.0.0 (formerly PyMC3). 4.0.0 is a rewrite of large parts of the PyMC code base that makes it faster, adds many new features, and introduces some breaking changes. For the most part, the API remains stable and we expect that most models will work without any changes.

Not-yet working features

We plan to get these working again, but at this point, their inner workings have not been refactored.

  • Timeseries distributions (see #4642)
  • Mixture distributions (see #4781)
  • Cholesky distributions (see WIP PR #4784)
  • Variational inference submodule (see WIP PR #4582)
  • Elliptical slice sampling (see #5137)
  • BaseStochasticGradient (see #5138)
  • pm.sample_posterior_predictive_w (see #4807)
  • Partially observed Multivariate distributions (see #5260)

Also, check out the milestones for a potentially more complete list.

Unexpected breaking changes (action needed)

  • New API is not available in v3.11.5.
  • Old API does not work in v4.0.0.

All of the above applies to:

  • ⚠ The library is now named, installed, and imported as "pymc". For example: pip install pymc. (Use pip install pymc --pre while we are in the pre-release phase.)
  • ⚠ Theano-PyMC has been replaced with Aesara, so all external references to theano, tt, and pymc3.theanof need to be replaced with aesara, at, and pymc.aesaraf (see #4471).
  • pm.Distribution(...).logp(x) is now pm.logp(pm.Distribution(...), x); a short migration sketch follows this list.
  • pm.Distribution(...).logcdf(x) is now pm.logcdf(pm.Distribution(...), x)
  • pm.Distribution(...).random() is now pm.Distribution(...).eval()
  • pm.draw_values(...) and pm.generate_samples(...) were removed. The tensors can now be evaluated with .eval().
  • pm.fast_sample_posterior_predictive was removed.
  • pm.sample_prior_predictive, pm.sample_posterior_predictive and pm.sample_posterior_predictive_w now return an InferenceData object by default, instead of a dictionary (see #5073).
  • pm.sample_prior_predictive no longer returns transformed variable values by default. Pass them by name in var_names if you want to obtain these draws (see #4769).
  • pm.sample(trace=...) no longer accepts MultiTrace or len(.) > 0 traces (see #5019).
  • The GLM submodule was removed; please use Bambi instead.
  • pm.Bound interface no longer accepts a callable class as an argument; instead, it requires an instantiated distribution (created via the .dist() API) to be passed as an argument. In addition, Bound no longer returns a class instance but works as a normal PyMC distribution. Finally, it is no longer possible to do predictive random sampling from Bounded variables. Please consult the new documentation for details on how to use Bounded variables (see #4815).
  • pm.logpt(transformed=...) kwarg was removed (816b5f).
  • Model(model=...) kwarg was removed
  • Model(theano_config=...) kwarg was removed
  • Model.size property was removed (use Model.ndim instead).
  • dims and coords handling:
    • Model.RV_dims and Model.coords are now read-only properties. To modify the coords dictionary use Model.add_coord.
    • dims or coordinate values that are None will be auto-completed (see #4625).
    • Coordinate values passed to Model.add_coord are always converted to tuples (see #5061).
  • Model.update_start_values(...) was removed. Initial values can be set in the Model.initial_values dictionary directly.
  • Test values can no longer be set through pm.Distribution(testval=...) and must be assigned manually.
  • Transform.forward and Transform.backward signatures changed.
  • pm.DensityDist no longer accepts the logp as its first positional argument. It is now an optional keyword argument. If you pass a callable as the first positional argument, a TypeError will be raised (see #5026).
  • pm.DensityDist now accepts distribution parameters as positional arguments. Passing them as a dictionary in the observed keyword argument is no longer supported and will raise an error (see #5026).
  • The signature of the logp and random functions that can be passed into a pm.DensityDist has been changed (see #5026).
  • Changes to the Gaussian process (gp) submodule:
    • The gp.prior(..., shape=...) kwarg was renamed to size.
    • Multiple methods including gp.prior now require explicit kwargs.
  • Changes to the BART implementation:
    • A BART variable can be combined with other random variables. The inv_link argument has been removed (see #4914).
    • Moved BART to its own module (see #5058).
  • Changes to the Gaussian Process (GP) submodule (see #5055):
    • For all implementations, gp.Latent, gp.Marginal etc., cov_func and mean_func are required kwargs.
    • In the Windows test conda environment, the mkl version is pinned to 2020.4 and mkl-service to 2.3.0. This was required for gp.MarginalKron to function properly.
    • gp.MvStudentT uses rotated samples from StudentT directly now, instead of sampling from pm.Chi2 and then from pm.Normal.
    • The "jitter" parameter, or the diagonal noise term added to Gram matrices such that the Cholesky is numerically stable, is now exposed to the user instead of hard-coded. See the function gp.util.stabilize.
    • The is_observed argument for gp.Marginal* implementations has been deprecated.
    • In the gp.utils file, the kmeans_inducing_points function now passes through kmeans_kwargs to scipy's k-means function.
    • The function replace_with_values function has been added to gp.utils.
    • MarginalSparse has been renamed MarginalApprox.
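
A short migration sketch for the logp/logcdf/random changes above, assuming the v4 API and a made-up standalone Normal distribution:

import pymc as pm

dist = pm.Normal.dist(mu=0.0, sigma=1.0)

logp_graph = pm.logp(dist, 0.5)      # was: dist.logp(0.5)
logcdf_graph = pm.logcdf(dist, 0.5)  # was: dist.logcdf(0.5)
draw = dist.eval()                   # was: dist.random()

# pm.logp / pm.logcdf return Aesara tensors; evaluate them explicitly if needed
print(logp_graph.eval(), logcdf_graph.eval(), draw)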

Expected breaks

  • New API was already available in v3.
  • Old API had deprecation warnings since at least 3.11.0 (2021-01).
  • Old API stops working in v4 (preferably with informative errors).

All of the above applies to:

  • pm.sample(return_inferencedata=True) is now the default (see #4744).
  • ArviZ plots and stats wrappers were removed. The functions are now just available by their original names (see #4549 and 3.11.2 release notes).
  • pm.sample_posterior_predictive(vars=...) kwarg was removed in favor of var_names (see #4343); a short sketch follows this list.
  • ElemwiseCategorical step method was removed (see #4701).
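
A minimal sketch of these defaults, assuming the v4 API and made-up model and variable names:

import pymc as pm

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("obs", mu=mu, sigma=1.0, observed=[0.2, -0.1, 0.4])
    idata = pm.sample()  # returns arviz.InferenceData by default
    ppc = pm.sample_posterior_predictive(idata, var_names=["obs"])  # var_names replaces vars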

Ongoing deprecations

  • Old API still works in v4 and has a deprecation warning.
  • Preferably, the new API should already be available in v3.

New features

  • The length of dims in the model is now tracked symbolically through Model.dim_lengths (see #4625).
  • The CAR distribution has been added to allow for conditional autoregressions, which are often used in spatial and network models.
  • The dimensionality of model variables can now be parametrized through any of shape, dims or size (see #4696 and the sketch after this list):
    • With shape the length of dimensions must be given numerically or as scalar Aesara Variables. Numeric entries in shape restrict the model variable to the exact length and re-sizing is no longer possible.
    • dims keeps model variables re-sizeable (for example through pm.Data) and leads to well-defined coordinates in InferenceData objects.
    • The size kwarg behaves as it does in Aesara/NumPy. For univariate RVs it is the same as shape, but for multivariate RVs it depends on how the RV implements broadcasting to dimensionality greater than RVOp.ndim_supp.
    • An Ellipsis (...) in the last position of shape or dims can be used as shorthand notation for implied dimensions.
  • Added a logcdf implementation for the Kumaraswamy distribution (see #4706).
  • The OrderedMultinomial distribution has been added for use on ordinal data which are aggregated by trial, like multinomial observations, whereas OrderedLogistic only accepts ordinal data in a disaggregated format, like categorical observations (see #4773).
  • The Polya-Gamma distribution has been added (see #4531). To make use of this distribution, the polyagamma>=1.3.1 library must be installed and available in the user's environment.
  • A small change to the mass matrix tuning methods jitter+adapt_diag (the default) and adapt_diag improves performance early on during tuning for some models (see #5004).
  • New experimental mass matrix tuning method jitter+adapt_diag_grad (see #5004). …
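
A minimal sketch of the shape/dims/size parametrization described above, assuming made-up coordinate and variable names:

import pymc as pm

coords = {"city": ["Berlin", "Paris", "Rome"]}

with pm.Model(coords=coords) as model:
    a = pm.Normal("a", 0.0, 1.0, shape=3)      # fixed length; not resizable
    b = pm.Normal("b", 0.0, 1.0, dims="city")  # resizable; labeled coords in InferenceData
    c = pm.Normal("c", 0.0, 1.0, size=3)       # NumPy/Aesara-style size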

PyMC3 3.11.4 (20 August 2021)

24 Aug 01:16

This release marks 3.11.3 as broken, per discussion, and updates __init__.py and RELEASE-NOTES.md accordingly.

PyMC3 3.11.3 (19 August 2021)

20 Aug 16:14
03be0d8

Release PyMC3 v3.11.3 (#4941). This release was later marked as broken (see 3.11.4 above).

PyMC3 3.11.2 (14 March 2021)

14 Mar 22:30

New Features

  • pm.math.cartesian can now handle inputs that are themselves >1D (see #4482).
  • Statistics and plotting functions that were removed in 3.11.0 were brought back, albeit with deprecation warnings if an old naming scheme is used (see #4536). To future-proof your code, rename these function calls:
    • pm.traceplot → pm.plot_trace
    • pm.compareplot → pm.plot_compare (here you might need to rename some columns in the input according to the arviz.plot_compare documentation)
    • pm.autocorrplot → pm.plot_autocorr
    • pm.forestplot → pm.plot_forest
    • pm.kdeplot → pm.plot_kde
    • pm.energyplot → pm.plot_energy
    • pm.densityplot → pm.plot_density
    • pm.pairplot → pm.plot_pair

Maintenance

  • ⚠ Our memoization mechanism wasn't robust against hash collisions (#4506), sometimes resulting in incorrect values in, for example, posterior predictives. The pymc3.memoize module was removed and replaced with cachetools. The hashable function and WithMemoization class were moved to pymc3.util (see #4525).
  • pm.make_shared_replacements now retains broadcasting information which fixes issues with Metropolis samplers (see #4492).

Release manager for 3.11.2: Michael Osthege (@michaelosthege)

PyMC3 3.11.1 (12 February 2021)

12 Feb 18:39

New Features

  • Automatic imputations now also work with ndarray data, not just pd.Series or pd.DataFrame (see #4439).
  • pymc3.sampling_jax.sample_numpyro_nuts now returns samples from transformed random variables, rather than from the unconstrained representation (see #4427).

Maintenance

  • We upgraded to Theano-PyMC v1.1.2 which includes bugfixes for...
    • ⚠ a problem with tt.switch that affected the behavior of several distributions, including at least the following special cases (see #4448)
      1. Bernoulli when all the observed values were the same (e.g., [0, 0, 0, 0, 0]).
      2. TruncatedNormal when sigma was constant and mu was being automatically broadcasted to match the shape of observations.
    • Warning floods and compiledir locking (see #4444)
  • math.log1mexp_numpy no longer raises RuntimeWarning when given very small inputs. These were commonly observed during NUTS sampling (see #4428).
  • ScalarSharedVariable can now be used as an input to other RVs directly (see #4445).
  • pm.sample and pm.find_MAP no longer change the start argument (see #4458).
  • Fixed Dirichlet.logp method to work with unit batch or event shapes (see #4454).
  • Bugfix in logp and logcdf methods of Triangular distribution (see #4470).

Release manager for 3.11.1: Michael Osthege (@michaelosthege)

PyMC3 3.11.0 (21 January 2021)

21 Jan 08:36

This release breaks some APIs w.r.t. 3.10.0. It also brings some dreadfully awaited fixes, so be sure to go through the (breaking) changes below.

Breaking Changes

  • ⚠ Many plotting and diagnostic functions that were just aliasing ArviZ functions were removed (see #4397). This includes pm.summary, pm.traceplot, pm.ess and many more!
  • ⚠ We now depend on Theano-PyMC version 1.1.0 exactly (see #4405). Major refactorings were done in Theano-PyMC 1.1.0. If you implement custom Ops or interact with Theano in any way yourself, make sure to read the Theano-PyMC 1.1.0 release notes.
  • ⚠ Python 3.6 support was dropped (by no longer testing) and Python 3.9 was added (see #4332).
  • ⚠ Changed shape behavior: No longer collapse length 1 vector shape into scalars. (see #4206 and #4214)
    • Applies to random variables and also the .random(size=...) kwarg!
    • To create scalar variables you must now use shape=None or shape=().
    • shape=(1,) and shape=1 now become vectors; previously they were collapsed into scalars (see the sketch below).
    • 0-length dimensions are now ruled illegal for random variables and raise a ValueError.
  • In sample_prior_predictive the vars kwarg was removed in favor of var_names (see #4327).
  • Removed theanof.set_theano_config because it illegally changed Theano's internal state (see #4329).
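
A minimal sketch of the new shape behavior, assuming made-up variable names:

import pymc3 as pm

with pm.Model():
    scalar = pm.Normal("scalar", 0.0, 1.0)                      # default -> scalar
    also_scalar = pm.Normal("also_scalar", 0.0, 1.0, shape=())  # explicit scalar
    length_one = pm.Normal("length_one", 0.0, 1.0, shape=(1,))  # length-1 vector, no longer collapsed
    vector = pm.Normal("vector", 0.0, 1.0, shape=1)             # also a length-1 vector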

New Features

  • Option to set check_bounds=False when instantiating pymc3.Model(). This turns off bounds checks that ensure that input parameters of distributions are valid. For correctly specified models, this is unnecessary as all parameters get automatically transformed so that all values are valid. Turning this off should lead to faster sampling (see #4377 and the sketch after this list).
  • OrderedProbit distribution added (see #4232).
  • plot_posterior_predictive_glm now works with arviz.InferenceData as well (see #4234)
  • Add logcdf method to all univariate discrete distributions (see #4387).
  • Add random method to MvGaussianRandomWalk (see #4388)
  • AsymmetricLaplace distribution added (see #4392).
  • DirichletMultinomial distribution added (see #4373).
  • Added a new predict method to BART to compute out of sample predictions (see #4310).
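
A minimal sketch of turning off bounds checks, assuming a made-up, correctly specified model:

import numpy as np
import pymc3 as pm

data = np.random.default_rng(42).normal(size=100)

with pm.Model(check_bounds=False) as model:  # skip validity checks for speed
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("obs", mu=0.0, sigma=sigma, observed=data)
    trace = pm.sample(1000, tune=1000)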

Maintenance

  • Fixed a bug whereby partial traces returned after a keyboard interrupt during parallel sampling had fewer draws than would have been available (see #4318).
  • Make sample_shape same across all contexts in draw_values (see #4305).
  • The notebook gallery has been moved to https://github.com/pymc-devs/pymc-examples (see #4348).
  • math.logsumexp now matches scipy.special.logsumexp when arrays contain infinite values (see #4360).
  • Fixed mathematical formulation in MvStudentT random method. (see #4359)
  • Fix issue in logp method of HyperGeometric. It now returns -inf for invalid parameters (see #4367).
  • Fixed MatrixNormal random method to work with parameters as random variables (see #4368).
  • Update the logcdf method of several continuous distributions to return -inf for invalid parameters and values, and raise an informative error when multiple values cannot be evaluated in a single call (see #4393 and #4421).
  • Improve numerical stability in logp and logcdf methods of ExGaussian (see #4407)
  • Issue UserWarning when doing prior or posterior predictive sampling with models containing Potential factors (see #4419)
  • Dirichlet distribution's random method is now optimized and gives outputs in correct shape (see #4416)
  • Attempting to sample a named model with SMC will now raise a NotImplementedError. (see #4365)

Release manager for 3.11.0: Eelke Spaak (@Spaak)

PyMC3 v3.10.0 (7 December 2020)

07 Dec 15:59
a9806db

This is a major release with many exciting new features. The biggest change is that we now rely on our own fork of Theano-PyMC. This is in line with our big announcement about our commitment to PyMC3 and Theano.

When upgrading, make sure that Theano-PyMC and not Theano are installed (the imports remain unchanged, however). If not, you can uninstall Theano:

conda remove theano

And to install:

conda install -c conda-forge theano-pymc

Or, if you are using pip (not recommended):

pip uninstall theano

And to install:

pip install theano-pymc

This new version of Theano-PyMC comes with an experimental JAX backend which, when combined with the new and experimental JAX samplers in PyMC3, can greatly speed up sampling in your model. As this is still very new, please do not use it in production yet but do test it out and let us know if anything breaks and what results you are seeing, especially speed-wise.

New features

  • New experimental JAX samplers in pymc3.sampling_jax (see notebook and #4247). Requires JAX and either TFP or numpyro. A short sketch follows this list.
  • Add MLDA, a new stepper for multilevel sampling. MLDA can be used when a hierarchy of approximate posteriors of varying accuracy is available, offering improved sampling efficiency especially in high-dimensional problems and/or where gradients are not available (see #3926)
  • Add Bayesian Additive Regression Trees (BARTs) (see #4183).
  • Added pymc3.gp.cov.Circular kernel for Gaussian Processes on circular domains, e.g. the unit circle (see #4082).
  • Added a new MixtureSameFamily distribution to handle mixtures of arbitrary dimensions in vectorized form for improved speed (see #4185).
  • sample_posterior_predictive_w can now take an xarray.Dataset, e.g. from InferenceData.posterior, as input (see #4042).
  • Change SMC metropolis kernel to independent metropolis kernel (see #4115).
  • Add alternative parametrization to NegativeBinomial distribution in terms of n and p (see #4126)
  • Added semantically meaningful str representations to PyMC3 objects for console, notebook, and GraphViz use (see #4076, #4065, #4159, #4217, #4243, and #4260).
  • Add Discrete HyperGeometric Distribution (see #4249)
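
An experimental sketch of the NumPyro-based NUTS sampler, assuming a made-up model; the exact signature may change while the JAX backend is experimental (requires JAX and numpyro):

import pymc3 as pm
from pymc3 import sampling_jax

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("obs", mu=mu, sigma=1.0, observed=[0.3, -0.2, 0.1])

trace = sampling_jax.sample_numpyro_nuts(draws=1000, tune=1000, model=model)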

Maintenance

  • Switch the dependency of Theano to our own fork, Theano-PyMC.
  • Removed non-NDArray (Text, SQLite, HDF5) backends and associated tests.
  • Use dill to serialize user defined logp functions in DensityDist. The previous serialization code fails if it is used in notebooks on Windows and Mac. dill is now a required dependency. (see #3844).
  • Fixed numerical instability in ExGaussian's logp by preventing logpow from returning -inf (see #4050).
  • Numerically improved stickbreaking transformation, e.g. for the Dirichlet distribution (see #4129).
  • Enabled the Multinomial distribution to handle batch sizes that have more than 2 dimensions (see #4169).
  • Test model logp before starting any MCMC chains (see #4211).
  • Fix bug in model.check_test_point that caused the test_point argument to be ignored (see PR #4211).
  • Refactored MvNormal.random method with better handling of sample, batch and event shapes (see #4207).
  • The InverseGamma distribution now implements a logcdf (see #3944).
  • Make starting jitter methods for NUTS sampling more robust by resampling values that lead to non-finite probabilities. A new optional argument jitter_max_retries can be passed to pm.sample() and pm.init_nuts() to control the maximum number of retries per chain (see #4298).

Documentation

  • Added a new notebook demonstrating how to incorporate sampling from a conjugate Dirichlet-multinomial posterior density in conjunction with other step methods (see #4199).
  • Mentioned the way to do any random walk with theano.tensor.cumsum() in GaussianRandomWalk docstrings (see #4048).

Release manager for 3.10.0: Eelke Spaak (@Spaak)

PyMC3 v3.9.3 (August 11, 2020)

11 Aug 03:30
63eba59

This release includes several fixes, including (but not limited to) the following:

  • Fix keep_size argument in Arviz data structures: #4006
  • Pin Theano 1.0.5: #4032
  • Comprehensively re-wrote radon modeling notebook using latest Arviz features: #3963

NB: The docs/* folder is still removed from the tarball due to an upload size limit on PyPI.

PyMC3 v3.9.2 (24 June 2020)

24 Jun 12:08
0790115

Maintenance

  • Warning added in GP module when input_dim is lower than the number of columns in X to compute the covariance function (see #3974).
  • Pass the tune argument from sample when using advi+adapt_diag_grad (see issue #3965, fixed by #3979).
  • Add simple test case for new coords and dims feature in pm.Model (see #3977).
  • Require ArviZ >= 0.9.0 (see #3977).

NB: The docs/* folder is still removed from the tarball due to an upload size limit on PyPI.