2. Testing a Dinf model
This page explains how to test a Dinf model using the dinf and dinf-plot command line interfaces. See the Creating a Dinf model page for how to write a model file.
2.1. Dinf model files
When Dinf reads a model file (a .py file), it looks for a dinf_model variable, which must be an instance of the DinfModel class. This object contains all the information needed to train a discriminator network.
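To make that structure concrete, the sketch below shows the overall shape of a model file. It is only an illustration, not the bottleneck example: the parameter names are hypothetical, the toy generator returns a random array rather than simulating anything, and it assumes the DinfModel, Parameters, and Param classes described on the Creating a Dinf model page.
import numpy as np
import dinf

# Hypothetical parameter names and ranges; a real model would use parameters
# appropriate to the demographic history being inferred.
parameters = dinf.Parameters(
    N0=dinf.Param(low=100, high=30_000),
    N1=dinf.Param(low=100, high=30_000),
)

def generator_func(seed, N0, N1):
    # A real generator would run a simulation (e.g. with msprime) and extract
    # a feature matrix from it. A toy array with a fixed shape is returned
    # here just to keep the sketch self-contained.
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(32, 64, 1)).astype(np.float32)

dinf_model = dinf.DinfModel(
    parameters=parameters,
    generator_func=generator_func,
    target_func=None,  # None indicates a simulation-only model
)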
2.2. Checking a model file
The Dinf command line interface can perform basic checks on a model file.
dinf check --model examples/bottleneck/model.py
This will sample parameters from the prior distribution, then call the generator_func and target_func functions to confirm that their outputs have matching feature shapes. If the model is a simulation-only model (i.e. the target_func is None), then the parameters will be checked to ensure they each have a truth value.
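For a simulation-only model, this means each parameter must be declared with a truth value, i.e. the value used to generate the simulated target dataset. The snippet below is illustrative (the parameter names and values are hypothetical) and assumes the Param class accepts a truth argument, as described on the Creating a Dinf model page.
import dinf

# Hypothetical parameters for a simulation-only model: each Param carries a
# truth value so that checks and downstream plots know the values used to
# generate the simulated target data.
parameters = dinf.Parameters(
    N0=dinf.Param(low=100, high=30_000, truth=10_000),
    N1=dinf.Param(low=100, high=30_000, truth=2_000),
)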
2.3. Visualising feature matrices
To get some intuition about the features given to the discriminator, we can use dinf-plot with the features subcommand. This samples a feature using the generator (with parameter values drawn from the prior) and plots the result as a heatmap. See Feature matrices for how to interpret feature matrices.
dinf-plot features --seed 1 --model examples/bottleneck/model.py
Similarly, features extracted from the target dataset can be inspected by using the --target option.
dinf-plot features --seed 1 --model examples/bottleneck/model.py --target
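The same features can also be inspected by hand. The snippet below is not how dinf-plot works internally; it is a rough sketch that loads the model file, calls generator_func directly, and draws the result with matplotlib. It assumes the DinfModel instance exposes generator_func as an attribute, that generator_func takes a seed followed by parameter values as keyword arguments (the names N0 and N1 are hypothetical), and that it returns a numpy array, possibly with a trailing channel dimension.
import runpy

import matplotlib.pyplot as plt
import numpy as np

# Load the model file and call its generator directly. The parameter names
# (N0, N1) are hypothetical and should match those defined in the model.
dinf_model = runpy.run_path("examples/bottleneck/model.py")["dinf_model"]
features = np.asarray(dinf_model.generator_func(1, N0=10_000, N1=2_000))
if features.ndim == 3:
    features = features[..., 0]  # show only the first channel
plt.imshow(features, aspect="auto", interpolation="none", cmap="viridis")
plt.xlabel("loci / bins")
plt.ylabel("individuals")
plt.colorbar()
plt.savefig("features.png", dpi=150)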
2.4. Training a discriminator
As an additional check of the model, it’s useful to train a discriminator with a modest number of replicates to confirm that the discriminator can learn from the training data. We’ll use dinf with the train subcommand to train the model for 10 epochs (i.e. 10 full passes over the training data). Note that short options are available (e.g. -S for --seed), but the documentation uses long options for clarity.
dinf train \
--seed 1 \
--epochs 10 \
--training-replicates 1000 \
--test-replicates 1000 \
--model examples/bottleneck/model.py \
--discriminator /tmp/discriminator.nn
The msprime simulations are quite fast, and on an 8-core i7-8665U laptop with CPU-only training, this completes in about 40 seconds.
Loss and accuracy metrics can be plotted using dinf-plot with the metrics subcommand.
dinf-plot metrics /tmp/discriminator.nn
The plot shows that the test loss is decreasing over time and the test accuracy is increasing, which suggests that the discriminator is capable of learning from the model. For larger models, particularly those with multiple feature matrices, more epochs may be needed before a trend becomes visible. To obtain higher accuracy, additional training replicates will be needed. Other ways to improve the accuracy are discussed on the Improving discriminator accuracy page.