Basic Usage of TemporalMixtureModels.jl
The main functionality revolves around the fit_mixture function, which fits a temporal mixture model to the provided time series data, using the EM algorithm.
First, we need to import the package:
using TemporalMixtureModels, Random
Random.seed!(27052023) # For reproducibilityRandom.TaskLocalRNG()Data Preparation
First of all, you'll need some data that is in the right format. The basic format for the data consists of three elements:
t: A vector of time points. This should be a one-dimensional array of real numbers representing the time axis for the time series data.Y: A vector or matrix of observed values. Each row corresponds to a time point int, and each column corresponds to a different outcome variable (for multivariate time series). For univariate time series, this can be a one-dimensional array.ids: A vector of identifiers for each independent time series. The length of this vector should match the number of rows inYand indicates which time series each observation belongs to.
Optionally, you can also provide:
inputs: A matrix of input variables (covariates) that may influence the observed values. Each row corresponds to a time point, and each column corresponds to a different input variable. Inputs are currently not used in the built-in component models, but the support is there for custom models.
Say, we have a dataset with blood pressure measurements over time for multiple patients while they have been given either a placebo or a drug. Our table of data might look like this:
| id | time | systolic_pressure | diastolic_pressure | treatment |
|---|---|---|---|---|
| 1 | 0.0 | 120 | 80 | drug |
| 1 | 1.0 | 118 | 78 | drug |
| 1 | 2.0 | 115 | 75 | drug |
| 2 | 0.0 | 130 | 85 | placebo |
| 2 | 1.0 | 128 | 83 | placebo |
| 2 | 2.0 | 125 | 80 | placebo |
| ... | ... | ... | ... | ... |
| 43 | 2.0 | 110 | 70 | placebo |
For the examples in the manual, the package contains a function called example_bp_data() that can be used to generate a dataset like this.
TemporalMixtureModels.example_bp_data — Function
example_bp_data(;n_subjects_drug=50, n_subjects_placebo=50, n_timepoints=5, rng::AbstractRNG=Random.GLOBAL_RNG)Generate example blood pressure data for testing and examples.
Arguments
n_subjects_drug: Number of subjects in the drug group (default: 50)n_subjects_placebo: Number of subjects in the placebo group (default: 50)n_timepoints: Number of time points per subject (default: 5)rng: Random number generator (default:Random.GLOBAL_RNG)
Returns
t: Vector of time pointsy: Matrix of blood pressure measurements (systolic and diastolic)ids: Vector of subject IDsclass_labels: Vector of class labels (1 for drug, 0 for placebo)
Example
using TemporalMixtureModels: example_bp_data
t, y, ids, class_labels = example_bp_data(n_subjects_drug=30, n_subjects_placebo=30, n_timepoints=4)Let's generate the example blood pressure dataset:
t, y, ids, class_labels = example_bp_data(;n_subjects_drug=21, n_subjects_placebo=22, n_timepoints=5)([0.0, 1.25, 2.5, 3.75, 5.0, 0.0, 1.25, 2.5, 3.75, 5.0 … 0.0, 1.25, 2.5, 3.75, 5.0, 0.0, 1.25, 2.5, 3.75, 5.0], [125.48572170434635 82.70734137033764; 103.06459556000243 71.3956312347923; … ; 109.19769689115391 71.89320129586659; 108.0637238450581 77.0854240014355], [1, 1, 1, 1, 1, 2, 2, 2, 2, 2 … 42, 42, 42, 42, 42, 43, 43, 43, 43, 43], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])We can quickly visualize the systolic blood pressure measurements over time for all subjects:

It may occur that some time series have missing values at certain time points. The current implementation of fit_mixture handles missing data by ignoring those specific observations during the fitting process.
This may for example also happen if you measure two variables that are not always both measured at the same time points. In that case, you can fill the missing values with missing and the fitting process will skip those values. TemporalMixtureModels.jl is designed to handle such cases gracefully.
An example of such a dataset, where we may measure systolic pressure at the start and after 1 hour, while we diastolic pressure at the start and at 2 hours, would look like this:
| id | time | systolic_pressure | diastolic_pressure | treatment |
|---|---|---|---|---|
| 1 | 0.0 | 120 | 80 | drug |
| 1 | 1.0 | 118 | missing | drug |
| 1 | 2.0 | missing | 75 | drug |
| 2 | 0.0 | 130 | 85 | placebo |
| 2 | 1.0 | 128 | missing | placebo |
| 2 | 2.0 | missing | 80 | placebo |
| ... | ... | ... | ... | ... |
| 43 | 2.0 | missing | 70 | placebo |
As long as we have enough data points overall, the fitting process will still work.
Fitting a Temporal Mixture Model
Once you have your data prepared, you can fit a temporal mixture model using the fit_mixture function. You need to specify the number of components (clusters) you want to fit, as well as the component model to use (e.g., PolynomialRegression). For now, we are interested in the systolic pressure only, so we will extract that column from Y for fitting.
component_model = PolynomialRegression(2)
num_components = 2 # For example, we want to fit 2 clusters
model = fit_mixture(component_model, num_components, t, y[:, 1], ids)MixtureResult(PolynomialRegression(2), 2, [[119.34891353034865, -7.2868884094666875, 0.5764883810632598], [122.7913507650051, 0.2949062907715054, -0.08341428634647943]], [[94.96960000162001], [59.68091478886761]], [0.5259997562454423, 0.4740002437545577], [0.9999979104182807 2.0895817192977623e-6; 0.9999998337442878 1.6625571223359278e-7; … ; 0.0004310743373957102 0.9995689256626042; 0.9393909635620004 0.06060903643799958], -797.0558338777277, true, 30, NormalError())This will fit a temporal mixture model with 2 polynomial regression components of degree 2 to the systolic blood pressure data.
The Polynomial Regression Component
The PolynomialRegression component model fits a polynomial regression of a specified degree to the time series data. In this example, we used a polynomial of degree 2, which means that each component will fit a quadratic curve to the data. You can choose different degrees based on the complexity of the temporal patterns you expect in your data.
TemporalMixtureModels.PolynomialRegression — Type
PolynomialRegression(degree::Int)Polynomial regression component model.
Arguments
degree: Polynomial degree (e.g., 2 for quadratic)
Customizing the fitting process
The fit_mixture function provides several optional keyword arguments to customize the fitting process:
TemporalMixtureModels.fit_mixture — Function
fit_mixture(component, n_components, t, y, ids;
n_repeats=5, error_model=NormalError(), inputs=nothing,
max_iter=100, tol=1e-6, verbose=true, initial_assignments=nothing)Fit a mixture model using the Expectation-Maximization (EM) algorithm. By default, fit_mixture will run 5 EM restarts and return the best fitting model based on log-likelihood.
Arguments
component: A Component model (e.g., PolynomialRegression(2))n_components: Number of mixture components/clusterst: Time points (vector)y: Measurements (vector for single measurement, matrix for multiple)ids: Subject identifiers (vector)
Keyword Arguments
n_repeats: Number of EM restarts (default: 5)error_model: Error distribution (currently only NormalError() is supported, included for possible future extensions)inputs: Additional input variables to be passed to the component model (default: nothing)max_iter: Maximum EM iterations (default: 100)tol: Convergence tolerance (default: 1e-6)verbose: Print progress (default: true)initial_assignments: Optional vector of initial cluster assignments (1 to n_components) for each subject (default: nothing, uses random initialization)
Returns
MixtureResultcontaining fitted parameters and cluster assignments, with fields:component: The component model usedn_clusters: Number of clustersparameters: Fitted parameters for each clusterparams_error: Estimated error model parameters for each measurement and clustercluster_probs: Mixing proportions for each clusterresponsibilities: Posterior responsibilities for each subject and clusterloglikelihood: Final log-likelihood of the fitted modelconverged: Boolean indicating if the EM algorithm convergedn_iterations: Number of iterations performederror_model: The error model used
Examples
# Single measurement with normal errors
result = fit_mixture(PolynomialRegression(2), 3, t, y, ids)
# With initial cluster assignments
initial_clusters = [1, 1, 2, 2, 3, 3] # subjects 1-2 in cluster 1, 3-4 in cluster 2, etc.
result = fit_mixture(PolynomialRegression(2), 3, t, y, ids; initial_assignments=initial_clusters)Making Predictions
After fitting the model, you can make predictions for new time points using the predict function. You can specify the time points at which you want predictions, and the function will return the predicted values for each component.
new_time_points = 0:0.1:2.0
predictions = predict(model, new_time_points)An example of the resulting mixture model is shown below:
