pymob.inference package#

Submodules#

pymob.inference.numpyro_backend module#

class pymob.inference.numpyro_backend.ErrorModelFunction(*args, **kwargs)#: Bases: Protocol

class pymob.inference.numpyro_backend.NumpyroBackend(simulation: SimulationBase)#

Bases: InferenceBackend

property adapt_state_size#

calculate_log_likelihood(model, posterior_samples)#

static cast_to_precision(value, precision='64')#

property chains#

check_gradients(theta: Dict[str, float | NDArray[Any, (float64, int64, float32, int32)]] | None = None, vectorize=False)#

check_log_likelihood(theta: Dict[str, float | NDArray[Any, (float64, int64, float32, int32)]] | None = None, vectorize=False)#

check_tolerance_and_jax_mode()#

combine_chains(chain_location='chains', drop_extra_vars=[], cluster_deviation='std')#

Combine chains if chains were computed in a fully parallelized manner (on different machines, jobs, etc.).

In addition, the method drops all data variables and ‘…_norm’ priors (i.e. helper priors with a normal base). This keeps the stored data objects slim.

Parameters:

chain_location (str, optional) – location of the chains, relative to simulation.output_path. This parameter is simultaneously the string appended to the saved posterior. By default “chains”
drop_extra_vars (List, optional) – any additional variables to drop from the posterior
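As a rough illustration of what combining independently computed chains involves, the sketch below stacks per-chain draws along a leading chain axis and drops helper variables whose names end in `_norm`, plus any extra variables. It uses plain NumPy dictionaries rather than the backend's actual data objects; the `combine` helper and the variable names are hypothetical and not part of pymob.

```python
import numpy as np

def combine(chains, drop_extra_vars=()):
    """Stack per-chain draws and drop helper variables (illustrative only)."""
    keep = [
        name for name in chains[0]
        if not name.endswith("_norm") and name not in drop_extra_vars
    ]
    # Each chain holds arrays of shape (draws,); stacking gives (chains, draws).
    return {name: np.stack([chain[name] for chain in chains]) for name in keep}

rng = np.random.default_rng(1)
# Three chains with five draws each; "k_norm" and "aux" are made-up variables.
chains = [
    {"k": rng.normal(size=5), "k_norm": rng.normal(size=5), "aux": rng.normal(size=5)}
    for _ in range(3)
]
posterior = combine(chains, drop_extra_vars=["aux"])
```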

create_log_likelihood(seed=1, return_type: Literal['joint-log-likelihood', 'full', 'summed-by-site', 'summed-by-prior-data', 'custom'] = 'joint-log-likelihood', check=True, custom_return_fn: Callable | None = None, scaled=True, vectorize=False, gradients=False) → Tuple[Errorfunction, ErrorModelFunction]#

The log density relies heavily on the substitute utility.

The log density is the scaled log-likelihood. If the scale handler is used, log_density reflects this. Usually the scaled log density should be returned, because it is the loss used by the optimizer/sampler.

The general method is quite simple: the values of all sample sites are replaced according to the key: value pairs in theta.

Then the model is evaluated and the trace is obtained. Everything else is post-processing of the sites: the log_prob function of each site in the trace is evaluated at the value of that site.

Note that the log density can fluctuate randomly if not all sites are replaced.

Note that the data log-likelihood can be used to calculate a maximum-likelihood estimate, because it is independent of the prior.

The method is equivalent to using the log_likelihood method, but returns only the likelihood of the data given the model parameters.

Parameters:

return_type (str) –

The information to return, in order of increasing computational effort:

joint-log-likelihood – returns a single value, the entire log-likelihood of the model given the values in theta
full – joint log-likelihood, prior log-likelihood of each site and value, data log-likelihood of each site and value
summed-by-site – joint log-likelihood, prior log-likelihood of sites, data log-likelihood of sites
summed-by-prior-data – joint log-likelihood, prior log-likelihood, data log-likelihood
custom – uses the full log
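To make the difference between some of the return types concrete, here is a minimal sketch with a toy model (a normal prior on `mu` and normally distributed data). It does not use NumPyro or pymob; the function names, the toy model, and the exact shape of the returned tuples are assumptions for illustration, and the scaling mentioned above is left out.

```python
import math

def normal_logpdf(x, loc, scale):
    """Log density of Normal(loc, scale) at x."""
    return -0.5 * math.log(2 * math.pi * scale**2) - (x - loc) ** 2 / (2 * scale**2)

def log_likelihood(theta, data, return_type="joint-log-likelihood"):
    # Assumed toy model: mu ~ Normal(0, 10) and y_i ~ Normal(mu, 1).
    prior = {"mu": normal_logpdf(theta["mu"], 0.0, 10.0)}
    data_ll = {"y": sum(normal_logpdf(y, theta["mu"], 1.0) for y in data)}
    joint = sum(prior.values()) + sum(data_ll.values())
    if return_type == "joint-log-likelihood":
        return joint  # a single value
    if return_type == "summed-by-site":
        return joint, prior, data_ll  # one entry per site
    if return_type == "summed-by-prior-data":
        return joint, sum(prior.values()), sum(data_ll.values())
    raise ValueError(f"unsupported return_type: {return_type}")

theta = {"mu": 0.5}
observed = [0.1, 0.9]
joint = log_likelihood(theta, observed)
joint_again, prior_ll, data_ll = log_likelihood(theta, observed, "summed-by-prior-data")
```

In this sketch `data_ll` does not depend on the prior, which is why maximizing it over `mu` would give a maximum-likelihood estimate.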

property draws#

drop_vars_from_posterior(posterior, drop_extra_vars)#: drops extra variables if they are included in the posterior

property gaussian_base_distribution#

static generate_transform(expression: Expression)#

static get_dict(group: Dataset)#

property init_strategy#

property kernel#

load_results(file='numpyro_posterior.nc', cluster: int | None = None)#

model()#

nuts_posterior(mcmc, model, key, obs)#

observation_parser() → Tuple[Dict, Dict]#

Transforms an xarray.Dataset into a dictionary of jnp.Arrays and creates boolean mask arrays for NaN values (missing values are tagged False).

Returns:: Dictionaries of observations (data) and masks (missing values)
Return type:: Tuple[Dict,Dict]
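The masking convention can be sketched as follows. This is a simplified stand-in that works on a plain dictionary and NumPy arrays instead of an xarray.Dataset and jnp arrays; `parse_observations` and the variable name `survival` are hypothetical.

```python
import numpy as np

def parse_observations(observations):
    """Return (data, masks); a mask entry is False where the value is missing."""
    data = {
        name: np.asarray(values, dtype=float)
        for name, values in observations.items()
    }
    # ~isnan marks present values as True and missing values as False.
    masks = {name: ~np.isnan(values) for name, values in data.items()}
    return data, masks

data, masks = parse_observations({"survival": [10.0, np.nan, 7.0]})
```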

parse_deterministic_model() → Callable#

Parses an evaluation function from the Simulation object, which takes a single argument theta and defaults to passing no seed to the deterministic evaluator.

Returns:: The evaluation function
Return type:: callable
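The idea of an evaluation function that takes only theta and passes no seed by default can be sketched with a closure. The names `make_evaluator` and `exponential_decay` are hypothetical and only stand in for the simulation's deterministic evaluator.

```python
import math

def make_evaluator(solve):
    """Wrap a solver so that it can be called with theta alone."""
    def evaluator(theta, seed=None):
        # seed defaults to None, so no seed reaches the deterministic solver
        return solve(theta, seed=seed)
    return evaluator

def exponential_decay(theta, seed=None):
    # Toy deterministic model: y(t) = y0 * exp(-k * t) for t = 0, 1, 2
    return [theta["y0"] * math.exp(-theta["k"] * t) for t in range(3)]

evaluate = make_evaluator(exponential_decay)
result = evaluate({"y0": 2.0, "k": 0.0})
```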

parse_probabilistic_model()#

property posterior#

posterior_draws_from_svi(guide, svi_result, n, key)#

posterior_predictions(n: int | None = None, seed=1)#

predict_observations(model, posterior_samples, key, n=100)#: A brief remark in the NumPyro documentation explains that if the data passed for observed variables is None, values are sampled from the distributions instead of the input data being returned: https://num.pyro.ai/en/stable/getting_started.html#a-simple-example-8-schools
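The behaviour described in that remark can be mimicked in plain Python. The sketch below is not NumPyro code; `sample_site` is a hypothetical stand-in for a sample statement with an optional `obs` argument.

```python
import random

def sample_site(rng, loc, scale, obs=None):
    """Return obs when data are given, otherwise draw from Normal(loc, scale)."""
    if obs is not None:
        return obs
    return rng.gauss(loc, scale)

rng = random.Random(1)
conditioned = sample_site(rng, loc=0.0, scale=1.0, obs=2.5)  # returns the data
predicted = sample_site(rng, loc=0.0, scale=1.0)  # draws a new value
```

Passing None for the observations therefore turns the same model into a generator of predictions.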

static preprocessing(**kwargs)#

prior: Dict[str, DistributionMeta]#

prior_predictions(n=None, seed=1)#

run(print_debug=True, render_model=True)#

run_mcmc(model, keys, kernel)#

static run_svi(model, keys, learning_rate, iterations, kernel)#

static select_cluster(idata: InferenceData, cluster: int)#

store_results(output=None)#

property svi_iterations#

property svi_learning_rate#

svi_posterior(svi_result, model, guide, key, n=1000)#

property thinning#

to_arviz_idata(prior: Dict[str, NDArray[Any, (float64, int64, float32, int32)]] = {}, posterior: Dict[str, NDArray[Any, (float64, int64, float32, int32)]] = {}, log_likelihood: Dict[str, NDArray[Any, (float64, int64, float32, int32)]] = {}, prior_predictive: Dict[str, NDArray[Any, (float64, int64, float32, int32)]] = {}, posterior_predictive: Dict[str, NDArray[Any, (float64, int64, float32, int32)]] = {}, observed_data: Dict[str, NDArray[Any, (float64, int64, float32, int32)]] = {}, n_draws: int | None = None, n_chains: int | None = None, **kwargs)#: Create an ArviZ idata object from samples. TODO: Outsource to base.InferenceBackend

property user_defined_error_model#

property user_defined_preprocessing#

property user_defined_probability_model#

property warmup#

class pymob.inference.numpyro_backend.NumpyroDistribution(name: str, random_variable: RandomVariable, dims: Tuple[str, ...], shape: Tuple[int, ...])#

Bases: Distribution

property dist_name#

distribution_map: Dict[str, Tuple[DistributionMeta, Dict[str, str]]] = {'bernoulli': (<function Bernoulli>, {'p': 'probs'}), 'beta': (<class 'numpyro.distributions.continuous.Beta'>, {'a': 'concentration1', 'b': 'concentration0'}), 'binom': (<function Binomial>, {'n': 'total_count', 'p': 'probs'}), 'binomial': (<function Binomial>, {'n': 'total_count', 'p': 'probs'}), 'categorical': (<function Categorical>, {'p': 'probs'}), 'cauchy': (<class 'numpyro.distributions.continuous.Cauchy'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'chi2': (<class 'numpyro.distributions.continuous.Chi2'>, {'df': 'df', 'high': 'high', 'low': 'low'}), 'deterministic': (<function deterministic>, {'value': 'value'}), 'dirichlet': (<class 'numpyro.distributions.continuous.Dirichlet'>, {'alpha': 'concentration'}), 'expon': (<class 'numpyro.distributions.continuous.Exponential'>, {'high': 'high', 'scale': 'rate'}), 'exponential': (<class 'numpyro.distributions.continuous.Exponential'>, {'high': 'high', 'scale': 'rate'}), 'gamma': (<class 'numpyro.distributions.continuous.Gamma'>, {'a': 'concentration', 'high': 'high', 'low': 'low', 'scale': 'rate'}), 'geom': (<function Geometric>, {'p': 'probs'}), 'gumbel_l': (<function <lambda>>, {}), 'gumbel_r': (<class 'numpyro.distributions.continuous.Gumbel'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'halfnorm': (<class 'numpyro.distributions.continuous.HalfNormal'>, {'high': 'high', 'scale': 'scale'}), 'halfnormal': (<class 'numpyro.distributions.continuous.HalfNormal'>, {'high': 'high', 'scale': 'scale'}), 'laplace': (<class 'numpyro.distributions.continuous.Laplace'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'logistic': (<class 'numpyro.distributions.continuous.Logistic'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'lognorm': (<class 'numpyro.distributions.continuous.LogNormal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 's': 'scale', 'scale': 'loc'}), 
'lognormal': (<class 'numpyro.distributions.continuous.LogNormal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 's': 'scale', 'scale': 'loc'}), 'multinomial': (<function Multinomial>, {'n': 'total_count', 'p': 'probs'}), 'multivariate_normal': (<class 'numpyro.distributions.continuous.MultivariateNormal'>, {'cov': 'covariance_matrix', 'mean': 'loc'}), 'nbinom': (<class 'numpyro.distributions.conjugate.NegativeBinomialProbs'>, {'n': 'total_count', 'p': 'probs'}), 'norm': (<class 'numpyro.distributions.continuous.Normal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'normal': (<class 'numpyro.distributions.continuous.Normal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'pareto': (<class 'numpyro.distributions.continuous.Pareto'>, {'b': 'scale', 'high': 'high', 'low': 'low', 'scale': 'alpha'}), 'poisson': (<class 'numpyro.distributions.discrete.Poisson'>, {'mu': 'rate'}), 't': (<class 'numpyro.distributions.continuous.StudentT'>, {'df': 'df', 'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'uniform': (<class 'numpyro.distributions.continuous.Uniform'>, {'loc': 'low', 'scale': 'high'})}#
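Each entry of distribution_map pairs a distribution with a dictionary that maps SciPy-style parameter names to NumPyro parameter names. The sketch below copies two of those name maps and applies them with a hypothetical `rename_parameters` helper. It only renames keys; whether pymob also transforms the parameter values is not shown here.

```python
# Excerpt of the parameter-name maps listed above (names only).
parameter_maps = {
    "beta": {"a": "concentration1", "b": "concentration0"},
    "poisson": {"mu": "rate"},
}

def rename_parameters(dist_name, params):
    """Rename SciPy-style keyword arguments to the NumPyro names."""
    mapping = parameter_maps[dist_name]
    return {mapping[key]: value for key, value in params.items()}

numpyro_kwargs = rename_parameters("beta", {"a": 2.0, "b": 5.0})
```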

static parameter_converter(x)#

pymob.inference.numpyro_backend.catch_patterns(expression_str)#

pymob.inference.scipy_backend module#

class pymob.inference.scipy_backend.ProbabilisticModel(prior_model, error_model, indices, observations, simulation, eps, seed)#

Bases: object

Combined prior, transformation, and error model for inference.

The class orchestrates three components:

Prior model - draws parameter samples and evaluates their log-probability.
Transformation model - runs the deterministic simulation to obtain latent system states.
Error model - generates synthetic observations or evaluates the likelihood of observed data.

The __call__ method implements four usage modes (prior predictive, likelihood, sampling, posterior predictive) as described in the docstring of ScipyBackend.
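The three components can be chained as in the following sketch, which covers only the prior predictive mode. It uses the standard library instead of SciPy and pymob; the uniform prior, the exponential decay model, and the Gaussian noise are assumptions chosen for illustration.

```python
import math
import random

def prior_model(rng):
    # Draw a rate parameter from an assumed uniform prior
    return {"k": rng.uniform(0.1, 1.0)}

def transformation_model(theta, times=(0, 1, 2)):
    # Deterministic simulation: exponential decay starting at 100
    return [100.0 * math.exp(-theta["k"] * t) for t in times]

def error_model(rng, y, sigma=1.0):
    # Generate synthetic observations by adding Gaussian noise
    return [rng.gauss(value, sigma) for value in y]

def prior_predictive(seed):
    rng = random.Random(seed)
    theta = prior_model(rng)          # 1. sample parameters
    y = transformation_model(theta)   # 2. latent system states
    y_obs = error_model(rng, y)       # 3. noisy observations
    return theta, y, y_obs

theta, y, y_obs = prior_predictive(seed=1)
```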

class pymob.inference.scipy_backend.ScipyBackend(simulation: SimulationBase)#

Bases: InferenceBackend

Backend that uses SciPy distributions for inference.

The backend implements the abstract InferenceBackend interface using SciPy probability distributions. It parses the model priors and error models from the simulation configuration, builds a ProbabilisticModel that can generate prior predictive samples, compute likelihoods, and sample from the prior distribution.

inference_model#

The assembled probabilistic model used for all inference operations.

Type:: ProbabilisticModel

random_state#

Random number generator seeded from the simulation configuration.

Type:: numpy.random.Generator

create_log_likelihood() → Tuple[Errorfunction, Errorfunction]#

Create log-likelihood and (optional) gradient functions.

Returns:: A pair (log_likelihood, gradient) where each element is a callable conforming to the Errorfunction protocol. The current backend does not implement a gradient, so both callables simply return None. This stub satisfies the type checker and can be extended in the future.
Return type:: tuple

distribution: rv_continuous | rv_discrete#

parse_deterministic_model()#

parse_probabilistic_model()#

posterior_predictions()#

prior_predictions()#

run()#

sample_distribution()#

class pymob.inference.scipy_backend.ScipyDistribution(name: str, random_variable: RandomVariable, dims: Tuple[str, ...], shape: Tuple[int, ...])#

Bases: Distribution

Distribution wrapper for SciPy random variables.

This subclass of Distribution provides a thin wrapper around SciPy’s continuous and discrete distributions. It maps a distribution name to the corresponding SciPy rv_continuous or rv_discrete object using the scipy_to_scipy dictionary. The class also defines a parameter_converter that converts parameter arrays to NumPy ndarray objects.

The primary purpose of this class is to expose a dist_name property that returns the name of the underlying SciPy distribution.

property dist_name: str#

distribution_map: Dict[str, Tuple[rv_continuous | rv_discrete | multi_rv_generic, Dict[str, str]]] = {'bernoulli': (<scipy.stats._discrete_distns.bernoulli_gen object>, {}), 'beta': (<scipy.stats._continuous_distns.beta_gen object>, {}), 'betabinom': (<scipy.stats._discrete_distns.betabinom_gen object>, {}), 'binom': (<scipy.stats._discrete_distns.binom_gen object>, {}), 'boltzmann': (<scipy.stats._discrete_distns.boltzmann_gen object>, {}), 'cauchy': (<scipy.stats._continuous_distns.cauchy_gen object>, {}), 'chi2': (<scipy.stats._continuous_distns.chi2_gen object>, {}), 'dlaplace': (<scipy.stats._discrete_distns.dlaplace_gen object>, {}), 'expon': (<scipy.stats._continuous_distns.expon_gen object>, {}), 'exponential': (<scipy.stats._continuous_distns.expon_gen object>, {}), 'exponpow': (<scipy.stats._continuous_distns.exponpow_gen object>, {}), 'exponweib': (<scipy.stats._continuous_distns.exponweib_gen object>, {}), 'fatiguelife': (<scipy.stats._continuous_distns.fatiguelife_gen object>, {}), 'gamma': (<scipy.stats._continuous_distns.gamma_gen object>, {}), 'genextreme': (<scipy.stats._continuous_distns.genextreme_gen object>, {}), 'geom': (<scipy.stats._discrete_distns.geom_gen object>, {}), 'gompertz': (<scipy.stats._continuous_distns.gompertz_gen object>, {}), 'gumbel_l': (<scipy.stats._continuous_distns.gumbel_l_gen object>, {}), 'gumbel_r': (<scipy.stats._continuous_distns.gumbel_r_gen object>, {}), 'halfnorm': (<scipy.stats._continuous_distns.halfnorm_gen object>, {}), 'halfnormal': (<scipy.stats._continuous_distns.halfnorm_gen object>, {}), 'hypergeom': (<scipy.stats._discrete_distns.hypergeom_gen object>, {}), 'kstwobign': (<scipy.stats._continuous_distns.kstwobign_gen object>, {}), 'laplace': (<scipy.stats._continuous_distns.laplace_gen object>, {}), 'levy': (<scipy.stats._continuous_distns.levy_gen object>, {}), 'levy_stable': (<scipy.stats._levy_stable.levy_stable_gen object>, {}), 'loggamma': (<scipy.stats._continuous_distns.loggamma_gen object>, {}), 
'logistic': (<scipy.stats._continuous_distns.logistic_gen object>, {}), 'lognorm': (<scipy.stats._continuous_distns.lognorm_gen object>, {}), 'lognormal': (<scipy.stats._continuous_distns.lognorm_gen object>, {}), 'logser': (<scipy.stats._discrete_distns.logser_gen object>, {}), 'nakagami': (<scipy.stats._continuous_distns.nakagami_gen object>, {}), 'nbinom': (<scipy.stats._discrete_distns.nbinom_gen object>, {}), 'norm': (<scipy.stats._continuous_distns.norm_gen object>, {}), 'normal': (<scipy.stats._continuous_distns.norm_gen object>, {}), 'norminvgauss': (<scipy.stats._continuous_distns.norminvgauss_gen object>, {}), 'pareto': (<scipy.stats._continuous_distns.pareto_gen object>, {}), 'planck': (<scipy.stats._discrete_distns.planck_gen object>, {}), 'poisson': (<scipy.stats._discrete_distns.poisson_gen object>, {}), 'powerlaw': (<scipy.stats._continuous_distns.powerlaw_gen object>, {}), 'randint': (<scipy.stats._discrete_distns.randint_gen object>, {}), 'rayleigh': (<scipy.stats._continuous_distns.rayleigh_gen object>, {}), 'rice': (<scipy.stats._continuous_distns.rice_gen object>, {}), 'semicircular': (<scipy.stats._continuous_distns.semicircular_gen object>, {}), 'skellam': (<scipy.stats._discrete_distns.skellam_gen object>, {}), 't': (<scipy.stats._continuous_distns.t_gen object>, {}), 'triang': (<scipy.stats._continuous_distns.triang_gen object>, {}), 'truncexpon': (<scipy.stats._continuous_distns.truncexpon_gen object>, {}), 'truncnorm': (<scipy.stats._continuous_distns.truncnorm_gen object>, {}), 'truncnormal': (<scipy.stats._continuous_distns.truncnorm_gen object>, {}), 'tukeylambda': (<scipy.stats._continuous_distns.tukeylambda_gen object>, {}), 'uniform': (<scipy.stats._continuous_distns.uniform_gen object>, {}), 'vonmises': (<scipy.stats._continuous_distns.vonmises_gen object>, {}), 'wald': (<scipy.stats._continuous_distns.wald_gen object>, {}), 'weibull_max': (<scipy.stats._continuous_distns.weibull_max_gen object>, {}), 'weibull_min': 
(<scipy.stats._continuous_distns.weibull_min_gen object>, {}), 'wrapcauchy': (<scipy.stats._continuous_distns.wrapcauchy_gen object>, {}), 'yulesimon': (<scipy.stats._discrete_distns.yulesimon_gen object>, {}), 'zipf': (<scipy.stats._discrete_distns.zipf_gen object>, {})}#

static parameter_converter(x)#

class pymob.inference.scipy_backend.ScipyErrorModel(eps, error_model, indices, observations, seed)#

Bases: ErrorModel

Error model that generates observation noise using SciPy distributions.

The class builds a set of random variables based on the user-specified error model expressions. It can draw synthetic noisy observations from the model (forward) or compute the log-probability of observed data given a set of latent variables (reverse).

Parameters:

eps (float) – Small constant added to scales to avoid division by zero.
error_model (dict) – Mapping of data variable names to error model Distribution objects.
indices (dict) – Index arrays for the simulation.
observations (xarray.Dataset) – Observed data used for likelihood evaluation.
seed (int) – Seed for the random number generator.

forward(Y)#: Obtain a realization of Y that depends on the chosen error function

reverse(Y, Y_obs)#: Obtain an error estimate of the difference between Y and Y_obs. This difference depends on the chosen error function.
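The two directions can be illustrated with a minimal Gaussian error model. The class below is a hypothetical stand-in written with the standard library; it does not reproduce the constructor or internals of ScipyErrorModel.

```python
import math
import random

class GaussianErrorModel:
    """Toy error model with a fixed standard deviation."""

    def __init__(self, sigma, seed):
        self.sigma = sigma
        self.rng = random.Random(seed)

    def forward(self, Y):
        # Draw a noisy realization around the latent values Y
        return [self.rng.gauss(y, self.sigma) for y in Y]

    def reverse(self, Y, Y_obs):
        # Sum of Gaussian log-probabilities of Y_obs given Y
        return sum(
            -0.5 * math.log(2 * math.pi * self.sigma**2)
            - (y_obs - y) ** 2 / (2 * self.sigma**2)
            for y, y_obs in zip(Y, Y_obs)
        )

model = GaussianErrorModel(sigma=1.0, seed=1)
Y = [1.0, 2.0]
noisy = model.forward(Y)
perfect_fit = model.reverse(Y, Y)
poor_fit = model.reverse(Y, [3.0, 5.0])
```

Observations that match the latent values exactly receive a higher log-probability than observations far from them.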

class pymob.inference.scipy_backend.ScipyPriorModel(prior_model, indices, observations, seed)#

Bases: object

Helper class for sampling from prior distributions and computing log-probabilities.

Instances maintain a reference to the prior model definition, the index variables, and observations. The __call__ method forwards to either forward (when no parameters are supplied) or reverse (when a parameter dictionary is provided), enabling a uniform callable interface.

Parameters:

prior_model (dict) – Mapping of parameter names to Distribution objects.
indices (dict) – Index arrays for each indexed dimension of the simulation.
observations (xarray.Dataset) – Observed data used for conditioning (currently not used in the prior).
seed (int) – Seed for the underlying NumPy random generator.

forward()#

reverse(theta)#
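The callable interface described above, forward without parameters and reverse with a parameter dictionary, can be sketched like this. `UniformPrior` is a hypothetical single-parameter example and not the pymob implementation.

```python
import math
import random

class UniformPrior:
    """Toy prior over one parameter k ~ Uniform(low, high)."""

    def __init__(self, low, high, seed):
        self.low, self.high = low, high
        self.rng = random.Random(seed)

    def forward(self):
        # Draw a parameter sample from the prior
        return {"k": self.rng.uniform(self.low, self.high)}

    def reverse(self, theta):
        # Log-probability of the supplied parameter value
        inside = self.low <= theta["k"] <= self.high
        return -math.log(self.high - self.low) if inside else -math.inf

    def __call__(self, theta=None):
        # No parameters: sample; parameters given: evaluate log-probability
        return self.forward() if theta is None else self.reverse(theta)

prior = UniformPrior(low=0.0, high=2.0, seed=1)
draw = prior()
logp_inside = prior({"k": 1.0})
logp_outside = prior({"k": 3.0})
```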

class pymob.inference.scipy_backend.ScipyTransModel(simulation)#

Bases: object

Transformation model that runs the deterministic simulation.

This lightweight wrapper forwards the theta (parameter values) to the simulation’s dispatch method, executes the model, and returns the resulting simulated state Y. It is used by the probabilistic model to generate latent system states from sampled parameters.

transform_prior_to_error_model(theta, y0={}, x_in={}, seed=None)#

pymob.inference.pyabc_backend module#

class pymob.inference.pyabc_backend.Posterior(samples)#

Bases: object

draw(i)#

mean()#

to_dict()#

class pymob.inference.pyabc_backend.PyabcBackend(simulation: SimulationBase)#

Bases: InferenceBackend

static array_param_to_1d(name, distribution, dist_param_dict)#

property database#

distance_function_parser()#

load_results()#

static map_parameters(theta, parameter_map)#

property max_nr_populations#

property min_eps_diff#

property minimum_epsilon#

model_parser()#

static param_to_prior(par)#

plot()#

plot_chains()#

plot_predictions(data_variable: str, x_dim: str, ax=None, subset={})#

property population_size#

property posterior_coordinates#

property posterior_data_structure#

posterior_predictions(n=50, seed=1)#

prior_parser(free_model_parameters: list)#

run()#

property sampler#

store_results()#: Results are stored in the database by default.

pymob.inference.pymoo_backend module#

class pymob.inference.pymoo_backend.OptimizationProblem(backend: PymooBackend, **kwargs)#: Bases: Problem

class pymob.inference.pymoo_backend.PymooBackend(simulation: SimulationBase)#

Bases: object

distance_function_parser()#

idata: PymobInferenceData#

load_results()#

optimize()#

plot_predictions(data_variable: str, x_dim: str, ax=None, subset={}, upscale_x=True)#

post_processing(pop)#

run()#: Implements the parallelization in pymoo

store_results(results)#

variable_mapper(x)#

variable_parser()#

Module contents#