# Pymob Introduction
## Overview
**Pymob** is a Python-based platform for parameter estimation across a wide range of models. It abstracts repetitive tasks in the modeling process so that you can focus on building models, asking questions of the real world, and learning from observations.
The idea for pymob grew out of the frustration of fitting complex models to complicated datasets (missing observations, non-uniform data structure, non-linear models, ODE models). In such scenarios, a lot of time is spent matching observations with model results.
One of Pymob's key strengths is its streamlined model definition workflow. This not only simplifies the process of building models but also lets you apply a range of advanced optimization and inference algorithms, giving you the flexibility to iterate and discover solutions more effectively.
### What's the focus of this introduction?
This introduction gives you an overview of the pymob package and a simple example of how to use it. Afterwards, you can explore more advanced tutorials and deepen your pymob knowledge.
First, the general structure of the pymob package is explained, so that you get to know the function of each component. Subsequently, you will be guided through your first parameter estimation with pymob, using a simple example.
### How pymob is structured:
Here you can see the structure of the pymob package:

The Pymob package consists of several elements:
1) __Simulation__
First, we need to initialize a Simulation object by calling the {class}`pymob.simulation.SimulationBase` class from the simulation module.
Optionally, we can configure the simulation object with {attr}`pymob.simulation.SimulationBase.config.case_study.name` = "linear-regression", {attr}`pymob.simulation.SimulationBase.config.case_study.scenario` = "test" and many more options.
2) __Model__
The model is a Python function that you define. With the model you try to describe the data you observed. A classical example is the Lotka-Volterra model, which describes the interactions of predators and prey. In this tutorial, the model will be a simple linear function.
The model is added to the simulation by using {class}`pymob.simulation.SimulationBase.model`.
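As a rough illustration of what such a model function can look like, here is a minimal sketch of a linear function. The function name `linear_model` and the parameter names `a` (intercept) and `b` (slope) are assumptions made for this example; the exact signature that pymob expects is not specified in this section.

```python
import numpy as np

def linear_model(x, a, b):
    """Straight line with intercept a and slope b (illustrative names)."""
    return a + b * x

# Evaluate the model on a few points between 0 and 10
x = np.linspace(0, 10, 5)
y = linear_model(x, a=1.0, b=3.0)
print(y)
```

The function is plain Python and has no dependency on pymob, which makes it easy to test in isolation before attaching it to a simulation.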
3) __Observations__
The observations are the data points to which we want to fit our model. The observation data needs to be an `xarray.Dataset` ([learn more here](https://docs.xarray.dev/en/stable/getting-started-guide/quick-overview.html)).
We assign it to our Simulation object via {attr}`pymob.simulation.SimulationBase.observations`.
{attr}`pymob.simulation.SimulationBase.config.data_structure` gives us some information about the layout of our data.
4) __Solver__
A solver is required for many models, e.g. models that contain differential equations. Solvers in pymob are callables that need to return a dictionary of results mapped to the data variables.
The solver is assigned to the Simulation object by {class}`pymob.simulation.SimulationBase.solver`.
These solvers are currently implemented in pymob:
- analytic module
  - solve_analytic_1d
- base module
  - curve_jumps
  - jump_interpolation
  - mappar
  - radius_interpolation
  - rect_interpolation
  - smoothed_interpolation
- diffrax module
  - JaxSolver
- scipy module
  - solve_ivp_1d

The documentation can be found [here](https://pymob.readthedocs.io/en/stable/api/pymob.solvers.html).
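To make the "callable that returns a dictionary of results mapped to the data variables" idea concrete, here is a minimal sketch. The names `toy_solver` and `linear_model` and the argument layout are hypothetical and chosen only for illustration; this is not the signature of any solver shipped with pymob.

```python
import numpy as np

def linear_model(x, a, b):
    # Straight line with intercept a and slope b (illustrative names)
    return a + b * x

def toy_solver(model, x, parameters, data_variable="data"):
    """Hypothetical solver: evaluate the model and key the result by variable name."""
    y = np.asarray(model(x, **parameters))
    return {data_variable: y}

x = np.linspace(0, 10, 5)
results = toy_solver(linear_model, x, {"a": 1.0, "b": 3.0})
print(results)
```

The key of the returned dictionary matches the name of the data variable in the observations, which is what allows model results and observations to be compared variable by variable.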
5) __Inferer__
The inferer serves as the parameter estimator. Pymob provides various backends. You can find detailed information [here](https://pymob.readthedocs.io/en/stable/user_guide/framework_overview.html).
Currently, supported inference backends are:
* interactive (interactive backend in Jupyter notebooks with parameter sliders)
* numpyro (Bayesian inference and stochastic variational inference)
* pyabc (approximate Bayesian inference)
* pymoo (experimental multi-objective optimization)
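Whatever the backend, parameter estimation compares model output with observations. The following sketch shows one common way to score that comparison, a Gaussian log-likelihood, written in plain NumPy. It is an illustration only and does not reproduce how any of the pymob backends are implemented; the function name and the test data are assumptions.

```python
import numpy as np

def gaussian_log_likelihood(y_obs, y_model, sigma):
    """Log-likelihood of observations under independent Gaussian noise."""
    resid = y_obs - y_model
    n = y_obs.size
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * np.sum(resid**2) / sigma**2

# Synthetic observations from a line with intercept 1 and slope 3
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y_obs = 1.0 + 3.0 * x + rng.normal(0, 1.0, x.size)

# Parameters close to the truth score higher than clearly wrong ones
ll_good = gaussian_log_likelihood(y_obs, 1.0 + 3.0 * x, sigma=1.0)
ll_bad = gaussian_log_likelihood(y_obs, 1.0 + 1.0 * x, sigma=1.0)
print(ll_good > ll_bad)
```

An inference backend explores the parameter space guided by a score of this kind, instead of evaluating just two hand-picked parameter sets.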
6) __Evaluator__
The Evaluator manages model evaluations. It sets up tasks, coordinates parallel runs of the simulation, and keeps track of the results from each simulation or parameter inference process.
7) __Config__
Pymob uses `pydantic` models to validate configuration files, with the configuration organized into separate sections. You can modify these configurations either by editing the files before initializing a simulation from a config file, or directly within the script. During parameter estimation setup, all configuration settings are stored in a config object, which can later be exported as a `.cfg` file.
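The sectioned layout of such a configuration file can be illustrated with Python's standard `configparser` module. This sketch assumes an INI-style `.cfg` layout and uses a hypothetical section name `case-study` with the `name` and `scenario` options mentioned above; it does not use pymob's `pydantic`-based config classes, and the real section and option names may differ.

```python
import configparser
import io

# Hypothetical section and option names, loosely following the
# case_study.name / case_study.scenario settings mentioned above
config = configparser.ConfigParser()
config["case-study"] = {"name": "linear-regression", "scenario": "test"}

# Write the settings in INI format to an in-memory buffer
buffer = io.StringIO()
config.write(buffer)
cfg_text = buffer.getvalue()
print(cfg_text)

# Read them back, as one would from a file on disk
restored = configparser.ConfigParser()
restored.read_string(cfg_text)
name = restored["case-study"]["name"]
```

Unlike this sketch, a validated configuration also checks types and allowed values when the file is loaded, which is the part that `pydantic` contributes.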
### Let's get started
You will need several packages during this introduction:
```python
# imports from pymob
from pymob.simulation import SimulationBase
from pymob.sim.solvetools import solve_analytic_1d
from pymob.sim.config import Param
# other imports
import numpy as np
import xarray as xr
from matplotlib import pyplot as plt
import os
from numpy import random
```
In the following tutorial, you'll notice some import statements included as comments. These are provided to indicate which package is required for each step.
## Generate artificial data
In the real world, you will have measured a dataset. For demonstration, we generate some artificial data. Later we will fit the model to this artificial data.
$y_{obs}$ represents the observed data over the time interval $t \in [0, 10]$.
```python
# Parameter for the artificial data generation
rng = np.random.default_rng(seed=1) # for reproducibility
slope = rng.uniform(1,4)
intercept = 1.0
num_points = 100
noise_level = 1.7
# generating x-values
x = np.linspace(0, 10, num_points)
# generating y-values with noise
noise = rng.normal(0, noise_level, num_points)
y_obs = slope * x + intercept + noise
data = np.array(y_obs)
# visualising our data
plt.scatter(x, y_obs, label='Datapoints')
plt.xlabel('t [-]')
plt.ylabel('y_obs [-]')
plt.title('Artificial Data')
plt.legend()
plt.show()
```

Above you can see your generated artificial data. At the moment it is stored in a plain NumPy array, as you can see below:
```python
# our artificial data is now in the variable data
print(data)
```
[ 2.39675084 1.81785059 -0.70315217 3.30742766 2.78326703 1.36771732
3.52454616 3.41252601 3.54888575 3.35328588 4.49048771 2.56521125
3.79634384 3.50979549 5.60354444 4.90914103 4.60054453 4.02458419
5.17270933 5.8798854 5.65362632 8.57816731 8.34579772 2.28149774
3.93525899 7.10557652 6.94107294 8.2780973 8.54045905 12.02744521
6.79279159 8.29740594 12.66815375 10.55094467 10.83486488 9.08995387
7.41814448 10.7606699 10.91741134 8.90169647 10.0828172 11.37793583
10.15043989 11.84556627 12.43105392 12.58533694 11.92025208 14.04642718
14.80814685 14.09471271 12.41438677 15.3052946 13.46514525 16.06827389
13.0077698 16.64051021 15.30791566 13.47525798 15.32060955 16.20232009
16.83019906 14.95284153 14.99613473 17.47407018 16.59740969 18.04735114
19.19428235 15.3562682 18.84777408 20.75332169 18.42173378 17.80525218
20.71855905 20.12671118 21.47496089 19.62120052 17.94508373 20.53326405
20.21848206 22.55054798 21.81778089 18.97226891 19.96904293 23.75936909
23.66863583 21.68072914 23.02346747 24.03883303 24.33375292 25.28318484
24.48570624 24.14458006 24.12185409 26.61276612 21.24765866 25.09450444
25.64242623 23.41934038 26.66432432 25.24747102]
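Before moving on, a quick sanity check of the generated data can be useful. The sketch below regenerates the data with the same seed and settings as above and fits a straight line with ordinary least squares via `np.polyfit`. This is a plain NumPy shortcut used here only for checking the data, not the pymob inference workflow.

```python
import numpy as np

# Regenerate the artificial data with the same seed and settings as above
rng = np.random.default_rng(seed=1)
slope = rng.uniform(1, 4)
intercept = 1.0
x = np.linspace(0, 10, 100)
y_obs = slope * x + intercept + rng.normal(0, 1.7, 100)

# Ordinary least squares fit of a degree-1 polynomial;
# np.polyfit returns coefficients from highest to lowest degree
slope_hat, intercept_hat = np.polyfit(x, y_obs, deg=1)
print(f"true slope={slope:.3f}, estimated slope={slope_hat:.3f}")
print(f"true intercept={intercept:.3f}, estimated intercept={intercept_hat:.3f}")
```

If the estimates land close to the true values, the data generation behaves as intended and later parameter estimates have a reference to be compared against.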
The pymob package operates on `xarray.Dataset` objects. Using `xarray` as a common input/output format avoids most of the mess of matching observations with model results. We therefore have to transform our data into an `xarray.Dataset`.
```python
# one dimension "t" with the x-values as its coordinate; the variable is named "data"
obs_data = xr.DataArray(data, dims=("t",), coords={"t": x}).to_dataset(name="data")
```
Note: If you rename the data variable (here `data`), you also have to replace every occurrence of {class}`sim.config.data_structure.data` with the new name!
It can be helpful to look at the data before going forward, especially if you have never worked with *xarray Datasets*. In the section 'Data variables' you'll find the data you just generated.
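The following sketch shows how the variable name and the dimension name are used to address the contents of such a dataset. It uses a small stand-in dataset with five noise-free points instead of the tutorial's 100 observations; the names `ds`, `values`, `dims` and `y_at_5` are chosen for this example only.

```python
import numpy as np
import xarray as xr

# Small stand-in for the tutorial's data: 5 noise-free points
t = np.linspace(0, 10, 5)
y = 2.0 * t + 1.0

ds = xr.DataArray(y, dims=("t",), coords={"t": t}).to_dataset(name="data")

# The data variable and its dimension are addressed by name
values = ds["data"].values              # plain NumPy array
dims = ds["data"].dims                  # tuple of dimension names
y_at_5 = float(ds["data"].sel(t=5.0))   # label-based lookup on the coordinate
print(dims, y_at_5)
```

Because everything is looked up by name, renaming the variable or the dimension means that all code referring to the old name has to be updated as well.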
```python
obs_data
```
<xarray.Dataset>
Dimensions: (t: 100)
Coordinates:
* t (t) float64 0.0 0.101 0.202 0.303 0.404 ... 9.697 9.798 9.899 10.0
Data variables:
data (t) float64 2.397 1.818 -0.7032 3.307 ... 25.64 23.42 26.66 25.25

<xarray.Dataset>
Dimensions: (t: 100)
Coordinates:
* t (t) float64 0.0 0.101 0.202 0.303 0.404 ... 9.697 9.798 9.899 10.0
Data variables:
data (t) float64 0.0 0.303 0.6061 0.9091 1.212 ... 29.09 29.39 29.7 30.0

<xarray.Dataset>
Dimensions: (chain: 1, draw: 2000)
Coordinates:
* chain (chain) int64 0
* draw (draw) int64 0 1 2 3 4 5 6 7 ... 1993 1994 1995 1996 1997 1998 1999
cluster (chain) int64 0
Data variables:
b (chain, draw) float32 2.614 2.701 2.686 2.663 ... 2.631 2.626 2.639
sigma_y (chain, draw) float32 1.68 1.447 1.514 1.4 ... 1.417 1.656 1.532
Attributes:
created_at: 2025-12-03T12:00:50.959713+00:00
arviz_version: 0.21.0