pymob.inference package#
Submodules#
pymob.inference.numpyro_backend module#
- class pymob.inference.numpyro_backend.ErrorModelFunction(*args, **kwargs)#
Bases:
Protocol
- class pymob.inference.numpyro_backend.NumpyroBackend(simulation: SimulationBase)#
Bases:
InferenceBackend
- property adapt_state_size#
- calculate_log_likelihood(model, posterior_samples)#
- static cast_to_precision(value, precision='64')#
- property chains#
- check_gradients(theta: ~typing.Dict[str, float | ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] | None = None, vectorize=False)#
- check_log_likelihood(theta: ~typing.Dict[str, float | ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] | None = None, vectorize=False)#
- check_tolerance_and_jax_mode()#
- combine_chains(chain_location='chains', drop_extra_vars=[], cluster_deviation='std')#
Combine chains if chains were computed in a fully parallelized manner (on different machines, jobs, etc.).
In addition, the method drops all data variables and ‘…_norm’ priors (i.e. helper priors with a normal base). This keeps the stored data objects slim.
- Parameters:
chain_location (str, optional) – location of the chains, relative to simulation.output_path. This parameter is simultaneously the string appended to the name of the saved posterior. Defaults to “chains”.
drop_extra_vars (List, optional) – any additional variables to drop from the posterior
- create_log_likelihood(seed=1, return_type: Literal['joint-log-likelihood', 'full', 'summed-by-site', 'summed-by-prior-data', 'custom'] = 'joint-log-likelihood', check=True, custom_return_fn: Callable | None = None, scaled=True, vectorize=False, gradients=False) Tuple[Errorfunction, ErrorModelFunction]#
The log density relies heavily on the substitute utility.
The log density is the scaled log-likelihood. If the scale handler is used, log_density reflects this. Usually, the scaled log density should be returned, because it is the loss used by the optimizer/sampler.
The general method is quite simple. The values of all sample sites are replaced according to the key: value pairs in theta.
Then the model is evaluated and the trace is obtained. Everything else is post-processing of the sites: the log_prob function of each site in the trace is evaluated at the value of that site.
Note that the log density can fluctuate randomly if not all sites are replaced.
Note that the data log-likelihood (data-loglik) can be used to calculate a maximum-likelihood estimate, because it is independent of the prior.
The method is equivalent to using the log_likelihood method, but returns only the likelihood of the data given the model parameters.
- Parameters:
return_type (str) –
The information which should be returned. With increasing level of computation:
- joint-log-likelihood: returns a single value, the entire log likelihood of the model, given the values in theta
- full: joint-loglik, loglik-prior of each site and value, loglik-data of each site and value
- summed-by-site: joint-loglik, loglik-prior of sites, loglik-data of sites
- summed-by-prior-data: joint-loglik, prior-loglik, data-loglik
- custom: uses the full log
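The return types differ only in how much of the per-site breakdown is kept. The following plain-Python sketch illustrates this, assuming a toy model with one normal prior site `mu` and one normal observed site `y`. All names are hypothetical; this is not the pymob or NumPyro implementation, and the `custom` return type is omitted.

```python
import math

def normal_logpdf(x, loc, scale):
    """Log density of a normal distribution evaluated at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)

def toy_log_likelihood(theta, observations, return_type="joint-log-likelihood"):
    # Prior site: mu ~ Normal(0, 1). Its value is taken from theta,
    # which is the analogue of substituting a sample site.
    loglik_prior = {"mu": normal_logpdf(theta["mu"], 0.0, 1.0)}
    # Observed site: y_i ~ Normal(mu, 1), one term per observation.
    loglik_data = {"y": [normal_logpdf(y, theta["mu"], 1.0) for y in observations]}

    prior_total = sum(loglik_prior.values())
    data_total = sum(sum(terms) for terms in loglik_data.values())
    joint = prior_total + data_total

    if return_type == "joint-log-likelihood":
        return joint
    if return_type == "summed-by-prior-data":
        return joint, prior_total, data_total
    if return_type == "summed-by-site":
        return joint, loglik_prior, {k: sum(v) for k, v in loglik_data.items()}
    if return_type == "full":
        return joint, loglik_prior, loglik_data
    raise ValueError(f"unsupported return_type: {return_type}")

joint, prior_ll, data_ll = toy_log_likelihood(
    {"mu": 0.5}, [0.2, 0.9, 0.4], return_type="summed-by-prior-data"
)
```

In this sketch `data_ll` does not involve the prior density, which is why maximizing it over `mu` would give a maximum-likelihood estimate.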
- property draws#
- drop_vars_from_posterior(posterior, drop_extra_vars)#
Drops extra variables if they are included in the posterior.
- property gaussian_base_distribution#
- static generate_transform(expression: Expression)#
- static get_dict(group: Dataset)#
- property init_strategy#
- property kernel#
- load_results(file='numpyro_posterior.nc', cluster: int | None = None)#
- model()#
- nuts_posterior(mcmc, model, key, obs)#
- observation_parser() Tuple[Dict, Dict]#
Transform an xarray.Dataset into a dictionary of jnp.Arrays. Creates boolean mask arrays for NaN values (missing values are tagged False).
- Returns:
Dictionaries of observations (data) and masks (missing values)
- Return type:
Tuple[Dict,Dict]
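The masking convention can be illustrated with a short sketch. It assumes a plain dictionary of values in place of the xarray.Dataset and uses NumPy arrays as a stand-in for jnp arrays; `parse_observations` is a hypothetical name, not the pymob method.

```python
import numpy as np

def parse_observations(data_vars):
    """Split a mapping of arrays into observations and validity masks."""
    observations, masks = {}, {}
    for name, values in data_vars.items():
        arr = np.asarray(values, dtype=float)
        observations[name] = arr
        # True where a value is present, False where it is missing (NaN).
        masks[name] = ~np.isnan(arr)
    return observations, masks

obs, masks = parse_observations({"y": [1.0, float("nan"), 3.0]})
```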
- parse_deterministic_model() Callable#
Parses an evaluation function from the Simulation object, which takes a single argument theta and defaults to passing no seed to the deterministic evaluator.
- Returns:
The evaluation function
- Return type:
callable
- parse_probabilistic_model()#
- property posterior#
- posterior_draws_from_svi(guide, svi_result, n, key)#
- posterior_predictions(n: int | None = None, seed=1)#
- predict_observations(model, posterior_samples, key, n=100)#
The NumPyro documentation notes briefly that if the data passed for an observed variable is None, values are sampled from the distribution instead of the input data being returned. See https://num.pyro.ai/en/stable/getting_started.html#a-simple-example-8-schools
- static preprocessing(**kwargs)#
- prior: Dict[str, DistributionMeta]#
- prior_predictions(n=None, seed=1)#
- run(print_debug=True, render_model=True)#
- run_mcmc(model, keys, kernel)#
- static run_svi(model, keys, learning_rate, iterations, kernel)#
- static select_cluster(idata: InferenceData, cluster: int)#
- store_results(output=None)#
- property svi_iterations#
- property svi_learning_rate#
- svi_posterior(svi_result, model, guide, key, n=1000)#
- property thinning#
- to_arviz_idata(prior: ~typing.Dict[str, ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] = {}, posterior: ~typing.Dict[str, ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] = {}, log_likelihood: ~typing.Dict[str, ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] = {}, prior_predictive: ~typing.Dict[str, ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] = {}, posterior_predictive: ~typing.Dict[str, ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] = {}, observed_data: ~typing.Dict[str, ~numpydantic.vendor.nptyping.base_meta_classes.NDArray[~typing.Any, (<class 'numpy.float64'>, <class 'numpy.int64'>, <class 'numpy.float32'>, <class 'numpy.int32'>)]] = {}, n_draws: int | None = None, n_chains: int | None = None, **kwargs)#
Create an Arviz idata object from samples. TODO: Outsource to base.InferenceBackend
- property user_defined_error_model#
- property user_defined_preprocessing#
- property user_defined_probability_model#
- property warmup#
- class pymob.inference.numpyro_backend.NumpyroDistribution(name: str, random_variable: RandomVariable, dims: Tuple[str, ...], shape: Tuple[int, ...])#
Bases:
Distribution
- property dist_name#
- distribution_map: Dict[str, Tuple[DistributionMeta, Dict[str, str]]] = {'bernoulli': (<function Bernoulli>, {'p': 'probs'}), 'beta': (<class 'numpyro.distributions.continuous.Beta'>, {'a': 'concentration1', 'b': 'concentration0'}), 'binom': (<function Binomial>, {'n': 'total_count', 'p': 'probs'}), 'binomial': (<function Binomial>, {'n': 'total_count', 'p': 'probs'}), 'categorical': (<function Categorical>, {'p': 'probs'}), 'cauchy': (<class 'numpyro.distributions.continuous.Cauchy'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'chi2': (<class 'numpyro.distributions.continuous.Chi2'>, {'df': 'df', 'high': 'high', 'low': 'low'}), 'deterministic': (<function deterministic>, {'value': 'value'}), 'dirichlet': (<class 'numpyro.distributions.continuous.Dirichlet'>, {'alpha': 'concentration'}), 'expon': (<class 'numpyro.distributions.continuous.Exponential'>, {'high': 'high', 'scale': 'rate'}), 'exponential': (<class 'numpyro.distributions.continuous.Exponential'>, {'high': 'high', 'scale': 'rate'}), 'gamma': (<class 'numpyro.distributions.continuous.Gamma'>, {'a': 'concentration', 'high': 'high', 'low': 'low', 'scale': 'rate'}), 'geom': (<function Geometric>, {'p': 'probs'}), 'gumbel_l': (<function <lambda>>, {}), 'gumbel_r': (<class 'numpyro.distributions.continuous.Gumbel'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'halfnorm': (<class 'numpyro.distributions.continuous.HalfNormal'>, {'high': 'high', 'scale': 'scale'}), 'halfnormal': (<class 'numpyro.distributions.continuous.HalfNormal'>, {'high': 'high', 'scale': 'scale'}), 'laplace': (<class 'numpyro.distributions.continuous.Laplace'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'logistic': (<class 'numpyro.distributions.continuous.Logistic'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'lognorm': (<class 'numpyro.distributions.continuous.LogNormal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 's': 'scale', 'scale': 'loc'}), 
'lognormal': (<class 'numpyro.distributions.continuous.LogNormal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 's': 'scale', 'scale': 'loc'}), 'multinomial': (<function Multinomial>, {'n': 'total_count', 'p': 'probs'}), 'multivariate_normal': (<class 'numpyro.distributions.continuous.MultivariateNormal'>, {'cov': 'covariance_matrix', 'mean': 'loc'}), 'nbinom': (<class 'numpyro.distributions.conjugate.NegativeBinomialProbs'>, {'n': 'total_count', 'p': 'probs'}), 'norm': (<class 'numpyro.distributions.continuous.Normal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'normal': (<class 'numpyro.distributions.continuous.Normal'>, {'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'pareto': (<class 'numpyro.distributions.continuous.Pareto'>, {'b': 'scale', 'high': 'high', 'low': 'low', 'scale': 'alpha'}), 'poisson': (<class 'numpyro.distributions.discrete.Poisson'>, {'mu': 'rate'}), 't': (<class 'numpyro.distributions.continuous.StudentT'>, {'df': 'df', 'high': 'high', 'loc': 'loc', 'low': 'low', 'scale': 'scale'}), 'uniform': (<class 'numpyro.distributions.continuous.Uniform'>, {'loc': 'low', 'scale': 'high'})}#
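Each entry of distribution_map pairs a NumPyro distribution with a dictionary that maps parameter names as written in the model configuration to NumPyro keyword names. The sketch below shows only the key renaming for three entries copied from the map above. `PARAMETER_MAPS` and `rename_parameters` are hypothetical names, and whether pymob also converts parameter values is not shown here.

```python
# Excerpt of the name maps listed in distribution_map.
PARAMETER_MAPS = {
    "uniform": {"loc": "low", "scale": "high"},
    "poisson": {"mu": "rate"},
    "beta": {"a": "concentration1", "b": "concentration0"},
}

def rename_parameters(dist_name, params):
    """Rename keyword arguments according to the map of dist_name."""
    name_map = PARAMETER_MAPS[dist_name]
    unknown = set(params) - set(name_map)
    if unknown:
        raise KeyError(f"{dist_name} has no mapping for {sorted(unknown)}")
    # Only the keys change; the values are passed through untouched.
    return {name_map[key]: value for key, value in params.items()}

renamed = rename_parameters("uniform", {"loc": 0.0, "scale": 2.0})
```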
- static parameter_converter(x)#
- pymob.inference.numpyro_backend.catch_patterns(expression_str)#
pymob.inference.scipy_backend module#
- class pymob.inference.scipy_backend.ProbabilisticModel(prior_model, error_model, indices, observations, simulation, eps, seed)#
Bases:
object
Combined prior, transformation, and error model for inference.
The class orchestrates three components:
Prior model - draws parameter samples and evaluates their log-probability.
Transformation model - runs the deterministic simulation to obtain latent system states.
Error model - generates synthetic observations or evaluates the likelihood of observed data.
The __call__ method implements four usage modes (prior predictive, likelihood, sampling, posterior predictive) as described in the docstring of ScipyBackend.
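The three components can be chained as in the following sketch of a prior predictive draw. It assumes a toy exponential-decay simulation and additive normal noise. Every function name and the model itself are hypothetical, and the sketch uses only the standard library rather than the pymob classes.

```python
import math
import random

def prior_model(rng):
    # Prior model: draw one parameter sample.
    return {"rate": rng.uniform(0.1, 1.0)}

def transformation_model(theta, times):
    # Transformation model: deterministic simulation of the latent state.
    return [math.exp(-theta["rate"] * t) for t in times]

def error_model(y, rng, sigma=0.05):
    # Error model: generate synthetic observations around the latent state.
    return [rng.gauss(value, sigma) for value in y]

def prior_predictive(times, seed=1):
    rng = random.Random(seed)
    theta = prior_model(rng)
    y = transformation_model(theta, times)
    y_obs = error_model(y, rng)
    return theta, y, y_obs

theta, y, y_obs = prior_predictive([0.0, 1.0, 2.0])
```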
- class pymob.inference.scipy_backend.ScipyBackend(simulation: SimulationBase)#
Bases:
InferenceBackend
Backend that uses SciPy distributions for inference.
The backend implements the abstract InferenceBackend interface using SciPy probability distributions. It parses the model priors and error models from the simulation configuration and builds a ProbabilisticModel that can generate prior predictive samples, compute likelihoods, and sample from the prior distribution.
- inference_model#
The assembled probabilistic model used for all inference operations.
- Type:
- random_state#
Random number generator seeded from the simulation configuration.
- Type:
numpy.random.Generator
- create_log_likelihood() Tuple[Errorfunction, Errorfunction]#
Create log-likelihood and (optional) gradient functions.
- Returns:
A pair (log_likelihood, gradient) where each element is a callable conforming to the Errorfunction protocol. The current backend does not implement a gradient, so both callables simply return None. This stub satisfies the type checker and can be extended in the future.
- Return type:
tuple
- distribution: rv_continuous | rv_discrete#
- parse_deterministic_model()#
- parse_probabilistic_model()#
- posterior_predictions()#
- prior_predictions()#
- run()#
- sample_distribution()#
- class pymob.inference.scipy_backend.ScipyDistribution(name: str, random_variable: RandomVariable, dims: Tuple[str, ...], shape: Tuple[int, ...])#
Bases:
Distribution
Distribution wrapper for SciPy random variables.
This subclass of Distribution provides a thin wrapper around SciPy’s continuous and discrete distributions. It maps a distribution name to the corresponding SciPy rv_continuous or rv_discrete object using the scipy_to_scipy dictionary. The class also defines a parameter_converter that converts parameter arrays to NumPy ndarray objects.
The primary purpose of this class is to expose a dist_name property that returns the name of the underlying SciPy distribution.
- property dist_name: str#
- distribution_map: Dict[str, Tuple[rv_continuous | rv_discrete | multi_rv_generic, Dict[str, str]]] = {'bernoulli': (<scipy.stats._discrete_distns.bernoulli_gen object>, {}), 'beta': (<scipy.stats._continuous_distns.beta_gen object>, {}), 'betabinom': (<scipy.stats._discrete_distns.betabinom_gen object>, {}), 'binom': (<scipy.stats._discrete_distns.binom_gen object>, {}), 'boltzmann': (<scipy.stats._discrete_distns.boltzmann_gen object>, {}), 'cauchy': (<scipy.stats._continuous_distns.cauchy_gen object>, {}), 'chi2': (<scipy.stats._continuous_distns.chi2_gen object>, {}), 'dlaplace': (<scipy.stats._discrete_distns.dlaplace_gen object>, {}), 'expon': (<scipy.stats._continuous_distns.expon_gen object>, {}), 'exponential': (<scipy.stats._continuous_distns.expon_gen object>, {}), 'exponpow': (<scipy.stats._continuous_distns.exponpow_gen object>, {}), 'exponweib': (<scipy.stats._continuous_distns.exponweib_gen object>, {}), 'fatiguelife': (<scipy.stats._continuous_distns.fatiguelife_gen object>, {}), 'gamma': (<scipy.stats._continuous_distns.gamma_gen object>, {}), 'genextreme': (<scipy.stats._continuous_distns.genextreme_gen object>, {}), 'geom': (<scipy.stats._discrete_distns.geom_gen object>, {}), 'gompertz': (<scipy.stats._continuous_distns.gompertz_gen object>, {}), 'gumbel_l': (<scipy.stats._continuous_distns.gumbel_l_gen object>, {}), 'gumbel_r': (<scipy.stats._continuous_distns.gumbel_r_gen object>, {}), 'halfnorm': (<scipy.stats._continuous_distns.halfnorm_gen object>, {}), 'halfnormal': (<scipy.stats._continuous_distns.halfnorm_gen object>, {}), 'hypergeom': (<scipy.stats._discrete_distns.hypergeom_gen object>, {}), 'kstwobign': (<scipy.stats._continuous_distns.kstwobign_gen object>, {}), 'laplace': (<scipy.stats._continuous_distns.laplace_gen object>, {}), 'levy': (<scipy.stats._continuous_distns.levy_gen object>, {}), 'levy_stable': (<scipy.stats._levy_stable.levy_stable_gen object>, {}), 'loggamma': (<scipy.stats._continuous_distns.loggamma_gen object>, 
{}), 'logistic': (<scipy.stats._continuous_distns.logistic_gen object>, {}), 'lognorm': (<scipy.stats._continuous_distns.lognorm_gen object>, {}), 'lognormal': (<scipy.stats._continuous_distns.lognorm_gen object>, {}), 'logser': (<scipy.stats._discrete_distns.logser_gen object>, {}), 'nakagami': (<scipy.stats._continuous_distns.nakagami_gen object>, {}), 'nbinom': (<scipy.stats._discrete_distns.nbinom_gen object>, {}), 'norm': (<scipy.stats._continuous_distns.norm_gen object>, {}), 'normal': (<scipy.stats._continuous_distns.norm_gen object>, {}), 'norminvgauss': (<scipy.stats._continuous_distns.norminvgauss_gen object>, {}), 'pareto': (<scipy.stats._continuous_distns.pareto_gen object>, {}), 'planck': (<scipy.stats._discrete_distns.planck_gen object>, {}), 'poisson': (<scipy.stats._discrete_distns.poisson_gen object>, {}), 'powerlaw': (<scipy.stats._continuous_distns.powerlaw_gen object>, {}), 'randint': (<scipy.stats._discrete_distns.randint_gen object>, {}), 'rayleigh': (<scipy.stats._continuous_distns.rayleigh_gen object>, {}), 'rice': (<scipy.stats._continuous_distns.rice_gen object>, {}), 'semicircular': (<scipy.stats._continuous_distns.semicircular_gen object>, {}), 'skellam': (<scipy.stats._discrete_distns.skellam_gen object>, {}), 't': (<scipy.stats._continuous_distns.t_gen object>, {}), 'triang': (<scipy.stats._continuous_distns.triang_gen object>, {}), 'truncexpon': (<scipy.stats._continuous_distns.truncexpon_gen object>, {}), 'truncnorm': (<scipy.stats._continuous_distns.truncnorm_gen object>, {}), 'truncnormal': (<scipy.stats._continuous_distns.truncnorm_gen object>, {}), 'tukeylambda': (<scipy.stats._continuous_distns.tukeylambda_gen object>, {}), 'uniform': (<scipy.stats._continuous_distns.uniform_gen object>, {}), 'vonmises': (<scipy.stats._continuous_distns.vonmises_gen object>, {}), 'wald': (<scipy.stats._continuous_distns.wald_gen object>, {}), 'weibull_max': (<scipy.stats._continuous_distns.weibull_max_gen object>, {}), 'weibull_min': 
(<scipy.stats._continuous_distns.weibull_min_gen object>, {}), 'wrapcauchy': (<scipy.stats._continuous_distns.wrapcauchy_gen object>, {}), 'yulesimon': (<scipy.stats._discrete_distns.yulesimon_gen object>, {}), 'zipf': (<scipy.stats._discrete_distns.zipf_gen object>, {})}#
- static parameter_converter(x)#
- class pymob.inference.scipy_backend.ScipyErrorModel(eps, error_model, indices, observations, seed)#
Bases:
ErrorModel
Error model that generates observation noise using SciPy distributions.
The class builds a set of random variables based on the user-specified error model expressions. It can draw synthetic noisy observations from the model (forward) or compute the log-probability of observed data given a set of latent variables (reverse).
- Parameters:
eps (float) – Small constant added to scales to avoid division by zero.
error_model (dict) – Mapping of data variable names to error model Distribution objects.
indices (dict) – Index arrays for the simulation.
observations (xarray.Dataset) – Observed data used for likelihood evaluation.
seed (int) – Seed for the random number generator.
- forward(Y)#
Obtain a realization of Y that depends on the chosen error function.
- reverse(Y, Y_obs)#
Obtain an error estimate of the difference between Y and Y_obs. This difference depends on the chosen error function.
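The two directions can be sketched for the simplest case of additive normal noise. `NormalErrorSketch` is a hypothetical stand-in, not the ScipyErrorModel implementation. It assumes a single fixed `sigma` and returns one summed log-probability from `reverse`, which may differ from what pymob returns.

```python
import numpy as np

class NormalErrorSketch:
    """Hypothetical error model with additive normal noise."""

    def __init__(self, sigma, seed):
        self.sigma = sigma
        self.rng = np.random.default_rng(seed)

    def forward(self, Y):
        # Draw one noisy realization around the simulated values Y.
        Y = np.asarray(Y, dtype=float)
        return self.rng.normal(loc=Y, scale=self.sigma)

    def reverse(self, Y, Y_obs):
        # Summed normal log density of the observations given Y.
        Y = np.asarray(Y, dtype=float)
        Y_obs = np.asarray(Y_obs, dtype=float)
        resid = (Y_obs - Y) / self.sigma
        logp = -0.5 * resid**2 - np.log(self.sigma) - 0.5 * np.log(2.0 * np.pi)
        return float(logp.sum())

error_sketch = NormalErrorSketch(sigma=0.1, seed=1)
Y = [1.0, 2.0]
draw = error_sketch.forward(Y)
close = error_sketch.reverse(Y, Y)
far = error_sketch.reverse(Y, [1.5, 2.5])
```

Observations that coincide with the simulated values (`close`) receive a higher log-probability than observations that are far away (`far`).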
- class pymob.inference.scipy_backend.ScipyPriorModel(prior_model, indices, observations, seed)#
Bases:
object
Helper class for sampling from prior distributions and computing log-probabilities.
Instances maintain a reference to the prior model definition, the index variables, and observations. The __call__ method forwards to either forward (when no parameters are supplied) or reverse (when a parameter dictionary is provided), enabling a uniform callable interface.
- Parameters:
prior_model (dict) – Mapping of parameter names to Distribution objects.
indices (dict) – Index arrays for each indexed dimension of the simulation.
observations (xarray.Dataset) – Observed data used for conditioning (currently not used in the prior).
seed (int) – Seed for the underlying NumPy random generator.
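The dispatch performed by __call__ can be sketched as follows, assuming a single standard-normal prior on a parameter `mu`. `PriorSketch` is a hypothetical class built on the standard library, not the ScipyPriorModel implementation.

```python
import math
import random

class PriorSketch:
    """Hypothetical prior model with one standard-normal parameter."""

    def __init__(self, seed):
        self.rng = random.Random(seed)

    def forward(self):
        # Sample a parameter set from the prior.
        return {"mu": self.rng.gauss(0.0, 1.0)}

    def reverse(self, theta):
        # Log-probability of the supplied parameter values under the prior.
        x = theta["mu"]
        return {"mu": -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)}

    def __call__(self, theta=None):
        # No parameters: sample. Parameter dictionary: evaluate.
        if theta is None:
            return self.forward()
        return self.reverse(theta)

prior_sketch = PriorSketch(seed=1)
sample = prior_sketch()
log_prob = prior_sketch({"mu": 0.0})
```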
- forward()#
- reverse(theta)#
- class pymob.inference.scipy_backend.ScipyTransModel(simulation)#
Bases:
object
Transformation model that runs the deterministic simulation.
This lightweight wrapper forwards theta (parameter values) to the simulation’s dispatch method, executes the model, and returns the resulting simulated state Y. It is used by the probabilistic model to generate latent system states from sampled parameters.
- transform_prior_to_error_model(theta, y0={}, x_in={}, seed=None)#
pymob.inference.pyabc_backend module#
- class pymob.inference.pyabc_backend.PyabcBackend(simulation: SimulationBase)#
Bases:
InferenceBackend
- static array_param_to_1d(name, distribution, dist_param_dict)#
- property database#
- distance_function_parser()#
- load_results()#
- static map_parameters(theta, parameter_map)#
- property max_nr_populations#
- property min_eps_diff#
- property minimum_epsilon#
- model_parser()#
- static param_to_prior(par)#
- plot()#
- plot_chains()#
- plot_predictions(data_variable: str, x_dim: str, ax=None, subset={})#
- property population_size#
- property posterior_coordinates#
- property posterior_data_structure#
- posterior_predictions(n=50, seed=1)#
- prior_parser(free_model_parameters: list)#
- run()#
- property sampler#
- store_results()#
Results are stored in the database by default.
pymob.inference.pymoo_backend module#
- class pymob.inference.pymoo_backend.OptimizationProblem(backend: PymooBackend, **kwargs)#
Bases:
Problem
- class pymob.inference.pymoo_backend.PymooBackend(simulation: SimulationBase)#
Bases:
object
- distance_function_parser()#
- idata: PymobInferenceData#
- load_results()#
- optimize()#
- plot_predictions(data_variable: str, x_dim: str, ax=None, subset={}, upscale_x=True)#
- post_processing(pop)#
- run()#
Implements the parallelization in pymoo
- store_results(results)#
- variable_mapper(x)#
- variable_parser()#