2. Testing a Dinf model

This page explains how to test a Dinf model using the dinf and dinf-plot command line interfaces. See the Creating a Dinf model page for how to write a model file.

2.1. Dinf model files

When Dinf reads a model file (a .py file), it looks for a dinf_model variable, which must be an instance of the DinfModel class. This object contains all the information needed to train a discriminator network.
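For orientation, a minimal skeleton of such a file is sketched below. This is a hypothetical example, not the library's documented API: the dinf.Parameters/dinf.Param helpers, the DinfModel arguments, and the generator signature shown here are assumptions, and the generator just returns random data. See the Creating a Dinf model page for a working model.

# my_model.py -- hypothetical skeleton of a Dinf model file.
# The Parameters/Param helpers, DinfModel arguments, and generator signature
# below are assumptions made for illustration; see "Creating a Dinf model"
# for the authoritative API.
import dinf
import numpy as np

parameters = dinf.Parameters(
    # A truth value is needed for simulation-only models (target_func=None).
    N0=dinf.Param(low=100, high=30_000, truth=10_000),
)

def generator_func(seed, N0):
    """Simulate one replicate with population size N0 and return its features."""
    # A real model would run a simulator (e.g. msprime) here; this placeholder
    # returns a random genotype-like matrix so the skeleton is self-contained.
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(64, 36), dtype=np.int8)

# Dinf looks for this variable when it reads the model file.
dinf_model = dinf.DinfModel(
    parameters=parameters,
    generator_func=generator_func,
    target_func=None,  # simulation-only model
)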

2.2. Checking a model file

The dinf command line interface provides a check subcommand for performing basic sanity checks on the model.

dinf check --model examples/bottleneck/model.py

This will sample parameters from the prior distribution, then call the generator_func and target_func functions to confirm that their outputs have matching feature shapes. If the model is a simulation-only model (i.e. target_func is None), the parameters will be checked to ensure that each has a truth value.
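Conceptually, the shape comparison amounts to something like the sketch below. This is illustrative only, not dinf's actual implementation; it assumes the feature output is either a single array or a dict of named feature matrices.

# Illustrative sketch of the feature-shape comparison performed by dinf check
# (not dinf's actual code).
import numpy as np

def feature_shapes(features):
    """Return the shape of a feature array, or of each array in a dict of them."""
    if isinstance(features, dict):
        return {name: np.shape(arr) for name, arr in features.items()}
    return np.shape(features)

def shapes_match(generator_features, target_features):
    """True if the generator and the target produce identically shaped features."""
    return feature_shapes(generator_features) == feature_shapes(target_features)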

2.3. Visualising feature matrices

To get some intuition about the features given to the discriminator, we can use dinf-plot with the features subcommand. This samples a feature using the generator (with parameter values drawn from the prior), and plots the result as a heatmap. See Feature matrices for how to interpret feature matrices.

dinf-plot features --seed 1 --model examples/bottleneck/model.py
[Figure: heatmap of a feature matrix simulated by the generator]

Similarly, features extracted from the target dataset can be inspected by using the --target option.

dinf-plot features --seed 1 --model examples/bottleneck/model.py --target
[Figure: heatmap of a feature matrix extracted from the target dataset]
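Feature matrices can also be inspected from Python with matplotlib. The sketch below draws a heatmap of a randomly generated stand-in matrix; in practice you would substitute a matrix returned by your generator_func or target_func, and the axis labels are only a plausible interpretation of the rows and columns.

# Hedged sketch of drawing a feature-matrix heatmap with matplotlib.
# mat is a random stand-in; replace it with a matrix returned by your
# generator_func or target_func.
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
mat = rng.integers(0, 3, size=(64, 36))  # row/column interpretation is illustrative

fig, ax = plt.subplots()
im = ax.imshow(mat, aspect="auto", interpolation="none", cmap="viridis")
ax.set_xlabel("column (e.g. genomic bin)")
ax.set_ylabel("row (e.g. haplotype)")
fig.colorbar(im, ax=ax, label="value")
fig.savefig("features.png")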

2.4. Training a discriminator

As an additional check of the model, it’s useful to train a discriminator with a modest number of replicates to confirm that it can learn from the training data. We’ll use dinf with the train subcommand to train the discriminator for 10 epochs (i.e. 10 full passes over the training data). Note that short options are available (e.g. -S for --seed), but the documentation uses long options for clarity.

dinf train \
    --seed 1 \
    --epochs 10 \
    --training-replicates 1000 \
    --test-replicates 1000 \
    --model examples/bottleneck/model.py \
    --discriminator /tmp/discriminator.nn
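For intuition, the outline below sketches what epoch-based training does: each epoch is one full pass over the training replicates, with metrics also recorded on the held-out test replicates. This is conceptual pseudocode, not dinf's implementation, and the fit_one_epoch and evaluate methods are hypothetical.

# Conceptual outline of epoch-based discriminator training (illustrative only;
# fit_one_epoch and evaluate are hypothetical methods, not dinf's API).
def train_discriminator(network, train_data, train_labels,
                        test_data, test_labels, epochs=10):
    metrics = []
    for epoch in range(epochs):
        # One full pass over the training replicates.
        network.fit_one_epoch(train_data, train_labels)
        metrics.append({
            "epoch": epoch,
            "train": network.evaluate(train_data, train_labels),
            "test": network.evaluate(test_data, test_labels),
        })
    return metrics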

The msprime simulations are quite fast: on an 8-core i7-8665U laptop with CPU-only training, this completes in about 40 seconds. Loss and accuracy metrics can be plotted using dinf-plot with the metrics subcommand.

dinf-plot metrics /tmp/discriminator.nn
[Figure: discriminator loss and accuracy metrics over training epochs]

The plot shows that the test loss decreases and the test accuracy increases over time, which suggests that the discriminator is capable of learning from the model. For larger models, particularly those with multiple feature matrices, more epochs may be needed before a trend emerges. Obtaining higher accuracy will require additional replicates. Other ways to improve accuracy are discussed on the Improving discriminator accuracy page.
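For reference, these are the usual binary-classification metrics (dinf's exact definitions may differ in detail): the loss is a cross-entropy between the discriminator's predicted probabilities and the class labels, and the accuracy is the fraction of replicates classified correctly. A minimal numpy sketch:

# Minimal sketch of binary cross-entropy loss and accuracy for discriminator
# predictions (illustrative; dinf's exact metric definitions may differ).
import numpy as np

def binary_cross_entropy(p, labels, eps=1e-7):
    """Mean cross-entropy between predicted probabilities and 0/1 labels."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))

def accuracy(p, labels):
    """Fraction of replicates whose predicted class matches its label."""
    return np.mean((p >= 0.5) == labels.astype(bool))

# Example with four replicates; the 0/1 label assignment is arbitrary here.
p = np.array([0.9, 0.2, 0.7, 0.4])
labels = np.array([1, 0, 1, 1])
print(binary_cross_entropy(p, labels), accuracy(p, labels))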