
AdaptiveImportanceSampling


This page explains the use of the adaptive importance sampling tools that are included in Alto.

The evaluation data and results from our paper at the ACL 2016 Workshop on Statistical NLP and Weighted Automata are available on the page for those experiments. You can find a revised version of the paper (i.e. without the errors we caught after publication) here.

Adaptive Importance Sampler

The core class for adaptive importance sampling is de.up.ling.irtg.sampling.AdaptiveImportanceSampler. There are two methods which can be used to generate a sample: adaSample and adaSampleMinimal. Both implement the same sampling regime, but the former produces a list of all the samples that have been drawn, while the latter only returns the last sample that has been generated during the adaption process. This means that adaSampleMinimal usually uses much less memory.

Both methods return (a list of) de.up.ling.irtg.sampling.TreeSample instances of a given size (defined by the parameter populationSize). The sample returns the correct values for getLogPropWeight and getLogTargetWeight, and if the proposal automaton is not deterministic (which must be indicated by setting the parameter "deterministic" to false for adaSample and adaSampleMinimal), then getLogSumWeight also returns the correct sum of the proposal probabilities over the different ways of generating a tree.

The parameters for adaSample and adaSampleMinimal are:

  • rounds: how many rounds the adaption is run for
  • populationSize: how many samples to draw in each round
  • RuleWeighting: implements the adaption of the proposals and the target weights
  • deterministic: true if every tree can only be generated in a single way from the proposal automaton
  • reset: whether to call the reset method for the RuleWeighting, which removes all adaption that has been made
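
To make the roles of these parameters concrete, the following is a minimal, self-contained sketch of the general adaptive importance sampling loop on a toy discrete problem. It does not use the Alto API; the target weights, the initial proposal, and the mixing constant used for the adaption are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Random;

public class AdaptiveSamplingSketch {
    public static void main(String[] args) {
        Random rng = new Random(42);
        // Toy setting: an unnormalised target over four outcomes and an adaptable
        // categorical proposal, standing in for the proposal automaton's rule weights.
        double[] target = {8.0, 4.0, 2.0, 1.0};
        double[] proposal = {0.25, 0.25, 0.25, 0.25};

        int rounds = 10;           // corresponds to the "rounds" parameter
        int populationSize = 1000; // corresponds to the "populationSize" parameter

        for (int round = 0; round < rounds; round++) {
            double[] weightedCounts = new double[target.length];
            for (int i = 0; i < populationSize; i++) {
                int draw = sampleCategorical(proposal, rng);
                double importanceWeight = target[draw] / proposal[draw]; // p(x) / q(x)
                weightedCounts[draw] += importanceWeight;
            }
            // Adapt the proposal towards the self-normalised weighted sample,
            // mixed with the old proposal so no outcome ends up with probability 0.
            double total = 0.0;
            for (double v : weightedCounts) total += v;
            for (int k = 0; k < proposal.length; k++) {
                proposal[k] = 0.9 * (weightedCounts[k] / total) + 0.1 * proposal[k];
            }
        }
        // The proposal should now be close to the normalised target (8:4:2:1).
        System.out.println(Arrays.toString(proposal));
    }

    private static int sampleCategorical(double[] probs, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int k = 0; k < probs.length; k++) {
            cumulative += probs[k];
            if (u < cumulative) return k;
        }
        return probs.length - 1;
    }
}
```

Each round draws populationSize samples from the current proposal, weights them by the ratio of target weight to proposal probability, and moves the proposal towards the self-normalised weighted sample; in Alto, the RuleWeighting passed to adaSample and adaSampleMinimal is responsible for the analogous update on the rule weights of the proposal automaton.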

TreeSample

The class de.up.ling.irtg.sampling.TreeSample provides a container for importance samples. All values need to be set externally. This is generally done by the adaSample and adaSampleMinimal methods of the AdaptiveImportanceSampler.

The class provides some additional methods to calculate normalised weights and to resample values. Resampling might invalidate the entries in the TreeSample.

The log proposal weight is the logarithm of the probability of drawing the sample along the path on which it was actually drawn. The log sum proposal weight is the logarithm of the total probability of drawing the sample along any path through the automaton, i.e. summed over all ways of generating it. The log target weight is the logarithm of the value of the unnormalised target distribution for the sample.

The class provides a method expoNormalize which computes the importance weights and then self-normalises them, so that getSelfNormalizedWeight will return the correct values. If deterministic is set to true, the class uses the proposal weights; if it is false, it uses the log sum proposal weights. Determinism here is with respect to the proposal automaton; if the sampler assumed that the automaton is deterministic, then the values for the log sum proposal weight might not even have been set.
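
The following is a small sketch, independent of the actual TreeSample implementation, of what this self-normalisation amounts to in log space. The log weights are made up; each stands for a log target weight minus a log (sum) proposal weight.

```java
import java.util.Arrays;

public class SelfNormalizeSketch {
    public static void main(String[] args) {
        // Log importance weights, one per sampled tree (illustrative values).
        double[] logWeights = {-3.2, -1.7, -2.4, -5.0};

        // Subtract the maximum before exponentiating, which protects against
        // underflow (the same idea as dividing by a maximum value in makeMaxBase,
        // described below), then normalise so the weights sum to 1.
        double max = Arrays.stream(logWeights).max().getAsDouble();
        double[] selfNormalized = new double[logWeights.length];
        double sum = 0.0;
        for (int i = 0; i < logWeights.length; i++) {
            selfNormalized[i] = Math.exp(logWeights[i] - max);
            sum += selfNormalized[i];
        }
        for (int i = 0; i < selfNormalized.length; i++) {
            selfNormalized[i] /= sum;
        }
        System.out.println(Arrays.toString(selfNormalized)); // now sums to 1
    }
}
```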

For the purposes of population sampling it is also possible to resample entries in a TreeSample. This is done with resampleWithNormalize; note that this invalidates all entries in the TreeSample except the self-normalised weight. You can also call just resample, which assumes that the weights have already been normalised with expoNormalize.

flatten also resamples the trees and ensures that all samples are weighted by 1.0 / number of samples. It also invalidates all other settings.
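
As a rough illustration of what such a resampling step does, here is a self-contained sketch that redraws a population of the same size according to the self-normalised weights. The sample objects are plain strings standing in for trees; nothing here uses the actual TreeSample methods.

```java
import java.util.Arrays;
import java.util.Random;

public class ResampleSketch {
    public static void main(String[] args) {
        Random rng = new Random(7);
        String[] samples = {"t1", "t2", "t3", "t4"};              // stand-ins for sampled trees
        double[] selfNormalizedWeights = {0.1, 0.6, 0.25, 0.05};  // must already sum to 1

        // Fill each slot of the new population with sample i chosen with probability
        // equal to its self-normalised weight. Afterwards every surviving sample
        // counts with equal weight 1.0 / number of samples.
        String[] resampled = new String[samples.length];
        for (int slot = 0; slot < resampled.length; slot++) {
            double u = rng.nextDouble();
            double cumulative = 0.0;
            int chosen = samples.length - 1;
            for (int i = 0; i < samples.length; i++) {
                cumulative += selfNormalizedWeights[i];
                if (u < cumulative) { chosen = i; break; }
            }
            resampled[slot] = samples[chosen];
        }
        System.out.println(Arrays.toString(resampled));
    }
}
```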

In order to reduce the problem of underflow, it is also possible to compute unnormalised importance weights which are divided by a maximum value. This is done with makeMaxBase(boolean deterministic, double originalBase), where originalBase is used iff no unnormalised importance weight in the sample is larger than originalBase. If there is a larger value, then that value is used instead and is returned once the method has finished.

RuleWeighting

This interface is used to generate proposals and to compute the target weight. Since it creates the proposals, it also has to implement the adaption. The interface allows the probabilities for proposals not to be computed until they are needed: the method prepareProbability needs to be called before the probabilities for rule proposals for a given state are valid, and prepareStartProbability is required before the probabilities for start states are valid.
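
The following sketch, which does not use the Alto classes, illustrates the kind of lazy, per-state preparation this allows: weights for a state are normalised only when that state is first prepared, and cached afterwards. All names here are illustrative.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class LazyStateProbabilities {
    private final Map<Integer, double[]> rawRuleWeights = new HashMap<>();
    private final Map<Integer, double[]> normalised = new HashMap<>();

    public void setRuleWeights(int state, double[] weights) {
        rawRuleWeights.put(state, weights.clone());
        normalised.remove(state); // invalidate anything prepared earlier for this state
    }

    /** Normalises the weights for a state on first use and caches the result. */
    public double[] prepare(int state) {
        return normalised.computeIfAbsent(state, s -> {
            double[] w = rawRuleWeights.get(s); // assumes setRuleWeights was called for s
            double sum = 0.0;
            for (double v : w) sum += v;
            double[] p = new double[w.length];
            for (int i = 0; i < w.length; i++) p[i] = w[i] / sum;
            return p;
        });
    }

    public static void main(String[] args) {
        LazyStateProbabilities probs = new LazyStateProbabilities();
        probs.setRuleWeights(0, new double[]{2.0, 1.0, 1.0});
        System.out.println(Arrays.toString(probs.prepare(0))); // [0.5, 0.25, 0.25]
    }
}
```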

An abstract class that implements almost all of the methods of RuleWeighting is given by de.up.ling.irtg.sampling.rule_weighting.RegularizedKLRuleWeighting. This class implements the adaption techniques described in our adaptive importance sampling paper. In order to extend this class, it is only necessary to implement the method getLogTargetProbability, which gives the target probability. This is also the only method needed to implement a new problem setting (along with supplying the correct automaton for proposals).
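
As a conceptual illustration of the one ingredient such an extension has to supply, the sketch below defines an unnormalised log target weight over trees. The tree representation and the scoring rule are made up for illustration; they are not Alto's classes and not the actual signature of getLogTargetProbability.

```java
import java.util.List;
import java.util.Map;

public class LogTargetWeightSketch {
    // Minimal stand-in for a derivation tree.
    record Node(String label, List<Node> children) {}

    /** Unnormalised log target weight: sum of per-label log scores over all nodes. */
    static double logTargetWeight(Node tree, Map<String, Double> logScores) {
        double total = logScores.getOrDefault(tree.label(), Math.log(1e-6));
        for (Node child : tree.children()) {
            total += logTargetWeight(child, logScores);
        }
        return total;
    }

    public static void main(String[] args) {
        Node tree = new Node("S", List.of(new Node("a", List.of()), new Node("b", List.of())));
        Map<String, Double> scores = Map.of("S", Math.log(0.5), "a", Math.log(0.3), "b", Math.log(0.2));
        System.out.println(logTargetWeight(tree, scores)); // log(0.5) + log(0.3) + log(0.2)
    }
}
```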
