Add optional input arguments to to_funsor and to_data #316

Merged: 10 commits, Feb 18, 2020

@eb8680 eb8680 commented Feb 12, 2020

Overview

Based on our design discussion on Monday, this PR is a first step toward replacing much of Pyro's internals with Funsor and supporting enumeration in NumPyro. It adds a dim_to_name mapping argument to to_funsor and a name_to_dim mapping argument to to_data. Together these provide enough information to convert PyTorch tensors and distributions to Funsors with free variables, and back, without ambiguity.

The changes in this PR only support Tensors. In followup PRs, I will implement versions of to_funsor and to_data that apply to Distributions.

This code is ready for review, but before changing the design too much I'd like to have first versions of the plate/enum-aware wrappers (discussed below) implemented as well (update: pyro-ppl/pyro#2307).

Design

dim_to_name is a dictionary that maps batch dimensions to (name, domain) pairs. name_to_dim is a dictionary that maps input names to batch dimensions.

This extra information (beyond an ordered list of names) is necessary for unique conversion for two reasons: in Pyro, our unpacked tensor shapes often contain empty batch dimensions that do not mean anything, and eventually Funsor will support non-vector bint domains.

These arguments are optional, and to_funsor and to_data behave the same as before when they are not provided.
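The bookkeeping behind the two mappings can be sketched at the level of shapes alone. The following is a minimal illustration, not Funsor's implementation: the helper names batch_shape_to_inputs and inputs_to_batch_shape are hypothetical, each domain is reduced to a plain integer size, and "empty" batch dimensions are assumed to be dimensions of size 1.

```python
from collections import OrderedDict


def batch_shape_to_inputs(batch_shape, dim_to_name):
    """Map a batch shape to named inputs (name -> size).

    dim_to_name maps negative batch dims to (name, size) pairs. The integer
    size stands in for a domain here, which is an assumption of this sketch.
    Unnamed dims must have size 1 and are dropped as placeholders.
    """
    inputs = OrderedDict()
    for dim in range(-len(batch_shape), 0):
        size = batch_shape[dim]
        if dim not in dim_to_name:
            if size != 1:
                raise ValueError(f"dim {dim} has size {size} but no name")
            continue  # placeholder dimension that carries no meaning
        name, domain_size = dim_to_name[dim]
        if size != domain_size:
            raise ValueError(f"dim {dim}: size {size} != domain size {domain_size}")
        inputs[name] = size
    return inputs


def inputs_to_batch_shape(inputs, name_to_dim):
    """Inverse direction: place each named input at its batch dim.

    Every dim that no input maps to is filled with a size-1 placeholder.
    """
    if not inputs:
        return ()
    ndim = max(-name_to_dim[name] for name in inputs)
    shape = [1] * ndim
    for name, size in inputs.items():
        shape[name_to_dim[name]] = size
    return tuple(shape)


# Round trip: an ordered list of names ["i", "j"] alone could not tell
# (3, 1, 2) apart from (3, 2), but the explicit dims can.
dim_to_name = {-3: ("i", 3), -1: ("j", 2)}
name_to_dim = {name: dim for dim, (name, _) in dim_to_name.items()}
inputs = batch_shape_to_inputs((3, 1, 2), dim_to_name)
restored = inputs_to_batch_shape(inputs, name_to_dim)
```

In this sketch, restored equals the original shape (3, 1, 2), including the placeholder dimension in the middle that an ordered list of names would lose.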

Rationale

The primary use case for this new functionality is upstream in Pyro and NumPyro, where we would like to convert distributions and lazy sample values to and from funsors with shapes consistent with the global plate and enumeration dimension state. This code could plausibly live in Pyro, but I've put it here since it is intended to support an implementation of EnumMessenger and tensor variable elimination in NumPyro.

I am implementing wrappers of these two functions in Pyro that construct name_to_dim/dim_to_name automatically using the information in MarkovMessenger, and I do not expect that Pyro users would ever construct these manually or even know that they exist.

This design is very similar to the dim_to_symbol and symbol_to_dim arguments used in pyro.ops.packed.pack and pyro.ops.packed.unpack, but generalized to support event dimensions. As a consequence of this similarity, we should easily be able to replace pyro.ops.packed with Funsor.

Other changes

This design is also very similar to the conversion functions funsor_to_tensor and tensor_to_funsor in funsor/pyro/convert.py, which use a particular convention for constructing instances of dim_to_name and name_to_dim. I have modified the implementations of these functions to use to_data and to_funsor to reflect this similarity and to reuse their tests, and eventually I expect that much of funsor.pyro.convert will be deprecated in favor of the plate/enum-aware wrappers being implemented in Pyro.

I have also switched to_funsor to use functools.singledispatch rather than multipledispatch, which may provide a small performance boost and also avoids introducing a multipledispatch dependency in the event that we move this code out of Funsor.
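As a rough illustration of the dispatch pattern mentioned above, here is a minimal sketch of a converter built on functools.singledispatch with an optional keyword argument. The function convert and its return values are hypothetical stand-ins chosen for this example; they are not Funsor's to_funsor.

```python
from functools import singledispatch


@singledispatch
def convert(x, dim_to_name=None):
    """Fallback used when no handler is registered for type(x)."""
    raise ValueError(f"cannot convert {type(x).__name__}")


@convert.register(int)
@convert.register(float)
def _convert_number(x, dim_to_name=None):
    # Ground values have no batch dims, so the mapping is not needed.
    return ("number", x)


@convert.register(list)
def _convert_list(x, dim_to_name=None):
    # singledispatch selects a handler from the type of the first
    # positional argument only; the optional mapping is passed through.
    return ("list", tuple(x), dict(dim_to_name or {}))
```

Because dispatch looks only at the first argument, adding an optional mapping argument does not change which handler is selected, and the standard library is the only dependency.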

Tested

  • New design exercised by existing tests in test/pyro/test_convert.py
  • Ground to_funsor exercised by existing tests


@fritzo fritzo left a comment

Glad to see a Funsor replacement for pyro.ops.packed!

  1. I assume you'll add tests after sketching the Pyro integration? Or could you point out how the dim_to_name kwarg is exercised by existing tests? (Sorry, I don't see it immediately.)
  2. Do you know whether the reshaping logic is NumPy compatible? If not, consider adding a TODO. No need to support both backends in this PR.


fritzo commented Feb 12, 2020

Thanks for the clear PR description!

@fritzo fritzo mentioned this pull request Feb 15, 2020
13 tasks

fritzo commented Feb 18, 2020

Sorry I missed this last week. Feel free to "bump" if I take longer than a couple days to review.

@fritzo fritzo merged commit e5c7e77 into master Feb 18, 2020
@fritzo fritzo deleted the to-funsor-inputs branch February 18, 2020 17:58