CLI reference
Dinf provides three command line programs: dinf, dinf-plot, and dinf-tabulate.
dinf provides subcommands for running analyses, dinf-plot provides subcommands
for plotting results, and dinf-tabulate provides subcommands for printing results as text.
When invoked with the -h/--help option, each subcommand prints a concise
description and lists its available options.
This help output is reproduced below.
Once Dinf is installed, the commands can be run by typing dinf, dinf-plot,
or dinf-tabulate. In addition, the commands can be run using
Python module hooks by typing python -m dinf, python -m dinf.plot,
or python -m dinf.tabulate.
The module hooks can be useful for running the commands from a
cloned git repository without requiring installation (e.g. during development).
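As a small illustration of the module hooks, the following Python sketch builds the argument list for running a Dinf command through python -m. The helper name module_hook_command is hypothetical and not part of Dinf; only the module names dinf, dinf.plot, and dinf.tabulate are taken from the text above.

```python
import sys


def module_hook_command(module, args):
    """Return an argv list that runs a module with the current interpreter."""
    # sys.executable is the Python interpreter running this script, so the
    # command uses the same environment (e.g. a development virtualenv).
    return [sys.executable, "-m", module, *args]


# Equivalent to typing "python -m dinf.plot --help" in a shell.
argv = module_hook_command("dinf.plot", ["--help"])
print(" ".join(argv))
```

When run from the root of a cloned repository, the resulting list could be passed to subprocess.run(argv, check=True) to execute the command.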
Analysis/inference
usage: dinf [-h] [-V] {check,train,predict,mc,mcmc,pg-gan} ...
Discriminator-based inference of population parameters.
positional arguments:
{check,train,predict,mc,mcmc,pg-gan}
check Basic dinf_model health checks.
train Train a discriminator.
predict Make predictions using a trained discriminator.
mc Adversarial Monte Carlo.
mcmc Adversarial MCMC.
pg-gan PG-GAN style simulated annealing.
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
dinf check
usage: dinf check [-h] [-v | -q] -m model.py
Basic dinf_model health checks.
Checks that the target and generator functions work and return the
same feature shapes and dtypes.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
dinf train
usage: dinf train [-h] [-v | -q] [-S SEED] [-j PARALLELISM]
[-r TRAINING_REPLICATES] [-R TEST_REPLICATES] [-e EPOCHS] -m
model.py -d discriminator.nn
Train a discriminator.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
common arguments:
-S SEED, --seed SEED Seed for the random number generator. CPU-based
training is expected to produce deterministic results.
Results may differ between CPU and GPU trained
networks for the same seed value. Also note that
operations on a GPU are not fully deterministic, so
training or applying a neural network twice with the
same seed value will not produce identical results.
(default: None)
-j PARALLELISM, --parallelism PARALLELISM
Number of processes to use for parallelising calls to
the DinfModel's generator_func and target_func. If not
specified, all CPU cores will be used. The number of
cores used for CPU-based neural networks is not set
with this parameter---instead use the `taskset`
command. See https://github.com/google/jax/issues/1539
(default: None)
training arguments:
-r TRAINING_REPLICATES, --training-replicates TRAINING_REPLICATES
Size of the dataset used to train the discriminator.
(default: 1000)
-R TEST_REPLICATES, --test-replicates TEST_REPLICATES
Size of the test dataset used to evaluate the
discriminator after each training epoch. (default:
1000)
-e EPOCHS, --epochs EPOCHS
Number of full passes over the training dataset when
training the discriminator. (default: 1)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
-d discriminator.nn, --discriminator discriminator.nn
Output file where the discriminator will be saved.
(default: None)
dinf predict
usage: dinf predict [-h] [-v | -q] [-S SEED] [-j PARALLELISM] [-r REPLICATES]
[--target] -m model.py -d discriminator.nn [-o output.npz]
Make predictions using a trained discriminator.
By default, features will be obtained by sampling replicates from
the generator (using parameters from the prior distribution).
To instead sample features from the target dataset, use the
--target option.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
--target Sample features from the target dataset. (default:
False)
common arguments:
-S SEED, --seed SEED Seed for the random number generator. CPU-based
training is expected to produce deterministic results.
Results may differ between CPU and GPU trained
networks for the same seed value. Also note that
operations on a GPU are not fully deterministic, so
training or applying a neural network twice with the
same seed value will not produce identical results.
(default: None)
-j PARALLELISM, --parallelism PARALLELISM
Number of processes to use for parallelising calls to
the DinfModel's generator_func and target_func. If not
specified, all CPU cores will be used. The number of
cores used for CPU-based neural networks is not set
with this parameter---instead use the `taskset`
command. See https://github.com/google/jax/issues/1539
(default: None)
predict arguments:
-r REPLICATES, --replicates REPLICATES
Number of theta replicates to generate and predict
with the discriminator. (default: 1000)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
-d discriminator.nn, --discriminator discriminator.nn
File containing discriminator network weights.
(default: None)
-o output.npz, --output-file output.npz
Output data, matching thetas to discriminator
predictions. (default: None)
dinf mc
usage: dinf mc [-h] [-v | -q] [-S SEED] [-j PARALLELISM]
[-r TRAINING_REPLICATES] [-R TEST_REPLICATES] [-e EPOCHS]
[--top N] [-P PROPOSAL_REPLICATES] [-i ITERATIONS]
[-o OUTPUT_FOLDER] -m model.py
Adversarial Monte Carlo.
In the first iteration, p[0] is the prior distribution.
The following steps are taken for iteration j:
- sample training and proposal datasets from distribution p[j],
- train the discriminator,
- make predictions with the discriminator on the proposal dataset,
- construct distribution p[j+1] as a weighted KDE of the proposals,
where the weights are given by the discriminator predictions.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
common arguments:
-S SEED, --seed SEED Seed for the random number generator. CPU-based
training is expected to produce deterministic results.
Results may differ between CPU and GPU trained
networks for the same seed value. Also note that
operations on a GPU are not fully deterministic, so
training or applying a neural network twice with the
same seed value will not produce identical results.
(default: None)
-j PARALLELISM, --parallelism PARALLELISM
Number of processes to use for parallelising calls to
the DinfModel's generator_func and target_func. If not
specified, all CPU cores will be used. The number of
cores used for CPU-based neural networks is not set
with this parameter---instead use the `taskset`
command. See https://github.com/google/jax/issues/1539
(default: None)
training arguments:
-r TRAINING_REPLICATES, --training-replicates TRAINING_REPLICATES
Size of the dataset used to train the discriminator.
(default: 1000)
-R TEST_REPLICATES, --test-replicates TEST_REPLICATES
Size of the test dataset used to evaluate the
discriminator after each training epoch. (default:
1000)
-e EPOCHS, --epochs EPOCHS
Number of full passes over the training dataset when
training the discriminator. (default: 1)
SMC arguments:
--top N In each iteration, accept only the N top proposals,
ranked by discriminator prediction. (default: None)
-P PROPOSAL_REPLICATES, --proposal-replicates PROPOSAL_REPLICATES
Number of replicates for Monte Carlo proposals.
(default: 1000)
GAN arguments:
-i ITERATIONS, --iterations ITERATIONS
Number of iterations. (default: 1)
-o OUTPUT_FOLDER, --output-folder OUTPUT_FOLDER
Folder to output results. If not specified, the
current directory will be used. (default: None)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
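The iteration described in the dinf mc help can be illustrated with a toy one-dimensional sketch. This is not Dinf's implementation: the uniform prior on [0, 10], the fixed KDE bandwidth, and the stand_in_discriminator function (which replaces a trained neural network) are all assumptions made for illustration. Sampling from a weighted Gaussian KDE is done by picking a proposal with probability proportional to its weight and adding Gaussian noise.

```python
import math
import random


def stand_in_discriminator(theta, target=3.0):
    """Hypothetical score in (0, 1]; higher means 'more like the target'."""
    return math.exp(-0.5 * (theta - target) ** 2)


def sample_weighted_kde(points, weights, bandwidth, n, rng):
    """Draw n values from a Gaussian KDE centred on the weighted points."""
    centres = rng.choices(points, weights=weights, k=n)
    return [c + rng.gauss(0.0, bandwidth) for c in centres]


def adversarial_mc(iterations, replicates, rng):
    # p[0]: proposals drawn from an assumed uniform prior on [0, 10].
    proposals = [rng.uniform(0.0, 10.0) for _ in range(replicates)]
    for _ in range(iterations):
        # A real run would train a discriminator here; the stand-in
        # function plays the role of its predictions on the proposals.
        weights = [stand_in_discriminator(t) for t in proposals]
        # p[j+1]: weighted KDE of the proposals, sampled for the next round.
        proposals = sample_weighted_kde(proposals, weights, 0.2, replicates, rng)
    return proposals


rng = random.Random(1)
final = adversarial_mc(iterations=5, replicates=1000, rng=rng)
mean = sum(final) / len(final)
print(f"mean of final proposals: {mean:.2f}")
```

In this toy setting the proposals concentrate around the value favoured by the stand-in discriminator as the iterations proceed.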
dinf mcmc
usage: dinf mcmc [-h] [-v | -q] [-S SEED] [-j PARALLELISM]
[-r TRAINING_REPLICATES] [-R TEST_REPLICATES] [-e EPOCHS]
[-w WALKERS] [-s STEPS] [--Dx-replicates DX_REPLICATES]
[-i ITERATIONS] [-o OUTPUT_FOLDER] -m model.py
Adversarial MCMC.
In the first iteration, p[0] is the prior distribution.
The following steps are taken for iteration j:
- sample training dataset from the distribution p[j],
- train the discriminator,
- run the MCMC,
- obtain distribution p[j+1] as a KDE of the MCMC sample.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
common arguments:
-S SEED, --seed SEED Seed for the random number generator. CPU-based
training is expected to produce deterministic results.
Results may differ between CPU and GPU trained
networks for the same seed value. Also note that
operations on a GPU are not fully deterministic, so
training or applying a neural network twice with the
same seed value will not produce identical results.
(default: None)
-j PARALLELISM, --parallelism PARALLELISM
Number of processes to use for parallelising calls to
the DinfModel's generator_func and target_func. If not
specified, all CPU cores will be used. The number of
cores used for CPU-based neural networks is not set
with this parameter---instead use the `taskset`
command. See https://github.com/google/jax/issues/1539
(default: None)
training arguments:
-r TRAINING_REPLICATES, --training-replicates TRAINING_REPLICATES
Size of the dataset used to train the discriminator.
(default: 1000)
-R TEST_REPLICATES, --test-replicates TEST_REPLICATES
Size of the test dataset used to evaluate the
discriminator after each training epoch. (default:
1000)
-e EPOCHS, --epochs EPOCHS
Number of full passes over the training dataset when
training the discriminator. (default: 1)
MCMC arguments:
-w WALKERS, --walkers WALKERS
Number of independent MCMC chains. (default: 64)
-s STEPS, --steps STEPS
The chain length for each MCMC walker. (default: 1000)
--Dx-replicates DX_REPLICATES
Number of generator replicates for approximating
E[D(G(θ))]. (default: 32)
GAN arguments:
-i ITERATIONS, --iterations ITERATIONS
Number of iterations. (default: 1)
-o OUTPUT_FOLDER, --output-folder OUTPUT_FOLDER
Folder to output results. If not specified, the
current directory will be used. (default: None)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
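To make the roles of --steps and --Dx-replicates in dinf mcmc more concrete, here is a toy sketch of a single chain. It uses a plain random-walk Metropolis sampler as a stand-in and does not reproduce Dinf's sampler. The generator, discriminator, uniform prior on [0, 10], and proposal width are all hypothetical. The quantity E[D(G(θ))] is approximated by averaging the discriminator's output over dx_replicates simulations, and that average is treated as an unnormalised density.

```python
import math
import random


def generator(theta, rng):
    """Hypothetical generator: one noisy 'feature' simulated under theta."""
    return theta + rng.gauss(0.0, 0.5)


def discriminator(feature, target=3.0):
    """Hypothetical stand-in for a trained network's prediction in (0, 1]."""
    return math.exp(-0.5 * (feature - target) ** 2)


def mean_prediction(theta, dx_replicates, rng):
    """Monte Carlo estimate of E[D(G(theta))] from dx_replicates simulations."""
    total = sum(discriminator(generator(theta, rng)) for _ in range(dx_replicates))
    return total / dx_replicates


def metropolis_chain(steps, dx_replicates, rng, low=0.0, high=10.0):
    """Run one random-walk Metropolis chain under a uniform prior on [low, high]."""
    theta = rng.uniform(low, high)
    score = mean_prediction(theta, dx_replicates, rng)
    chain = []
    for _ in range(steps):
        proposal = theta + rng.gauss(0.0, 0.5)
        if low <= proposal <= high:  # moves outside the prior are rejected
            new_score = mean_prediction(proposal, dx_replicates, rng)
            # Accept with probability min(1, new_score / score).
            if rng.random() * score < new_score:
                theta, score = proposal, new_score
        chain.append(theta)
    return chain


rng = random.Random(42)
chain = metropolis_chain(steps=2000, dx_replicates=32, rng=rng)
kept = chain[500:]  # discard early steps as burn-in
posterior_mean = sum(kept) / len(kept)
print(f"mean of retained samples: {posterior_mean:.2f}")
```

Running several such chains from different starting points would correspond loosely to the --walkers option, and a KDE of the retained samples would then play the role of p[j+1].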
dinf pg-gan
usage: dinf pg-gan [-h] [-v | -q] [-S SEED] [-j PARALLELISM]
[-r TRAINING_REPLICATES] [-R TEST_REPLICATES] [-e EPOCHS]
[--Dx-replicates DX_REPLICATES]
[--num-proposals NUM_PROPOSALS]
[--max-pretraining-iterations MAX_PRETRAINING_ITERATIONS]
[-i ITERATIONS] [-o OUTPUT_FOLDER] -m model.py
PG-GAN style simulated annealing.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
common arguments:
-S SEED, --seed SEED Seed for the random number generator. CPU-based
training is expected to produce deterministic results.
Results may differ between CPU and GPU trained
networks for the same seed value. Also note that
operations on a GPU are not fully deterministic, so
training or applying a neural network twice with the
same seed value will not produce identical results.
(default: None)
-j PARALLELISM, --parallelism PARALLELISM
Number of processes to use for parallelising calls to
the DinfModel's generator_func and target_func. If not
specified, all CPU cores will be used. The number of
cores used for CPU-based neural networks is not set
with this parameter---instead use the `taskset`
command. See https://github.com/google/jax/issues/1539
(default: None)
training arguments:
-r TRAINING_REPLICATES, --training-replicates TRAINING_REPLICATES
Size of the dataset used to train the discriminator.
(default: 1000)
-R TEST_REPLICATES, --test-replicates TEST_REPLICATES
Size of the test dataset used to evaluate the
discriminator after each training epoch. (default:
1000)
-e EPOCHS, --epochs EPOCHS
Number of full passes over the training dataset when
training the discriminator. (default: 1)
PG-GAN arguments:
--Dx-replicates DX_REPLICATES
Number of generator replicates for approximating
E[D(G(θ))]. (default: 32)
--num-proposals NUM_PROPOSALS
Number of proposals for each parameter in a given
iteration. (default: 10)
--max-pretraining-iterations MAX_PRETRAINING_ITERATIONS
Maximum number of pretraining rounds. (default: 100)
GAN arguments:
-i ITERATIONS, --iterations ITERATIONS
Number of iterations. (default: 1)
-o OUTPUT_FOLDER, --output-folder OUTPUT_FOLDER
Folder to output results. If not specified, the
current directory will be used. (default: None)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
Plotting results
usage: dinf.plot [-h] [-V] {demes,features,metrics,hist,hist2d,gan} ...
Dinf plotting tools.
positional arguments:
{demes,features,metrics,hist,hist2d,gan}
demes Plot a demes-as-tubes demographic model using
DemesDraw.
features Plot feature matrices as heatmaps.
metrics Plot loss and accuracy of discriminator(s).
hist Plot marginal histograms.
hist2d Plot 2d marginal histograms.
gan Plot GAN things.
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
dinf-plot metrics
usage: dinf.plot metrics [-h] [-v | -q] [-o output.pdf]
discriminator.nn [discriminator.nn ...]
Plot loss and accuracy of discriminator(s).
Each metric is plotted as a function of the training epoch,
and the resulting multipanel plot shows:
- training loss,
- training accuracy,
- test loss, and
- test accuracy.
If multiple discriminator files are provided, the training metrics for
each file are indicated by a different colour. The legend shows the
corresponding filename.
positional arguments:
discriminator.nn The discriminator network(s) to plot.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-o output.pdf, --output-file output.pdf
Output file for the figure. The file extension
determines the filetype, which can be any format
supported by Matplotlib (e.g. pdf, svg, png). If no
output file is specified, an interactive plot window
will be opened. (default: None)
dinf-plot features
usage: dinf.plot features [-h] [-v | -q] [-o output.pdf] [-S SEED] [--target]
-m model.py
Plot feature matrices as heatmaps.
By default, one simulation will be performed with the generator to obtain
a set of features for plotting. To instead extract features from the
target dataset, use the --target option.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-o output.pdf, --output-file output.pdf
Output file for the figure. The file extension
determines the filetype, which can be any format
supported by Matplotlib (e.g. pdf, svg, png). If no
output file is specified, an interactive plot window
will be opened. (default: None)
-S SEED, --seed SEED Seed for the random number generator (default: None)
--target Extract feature(s) from the target dataset. (default:
False)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
dinf-plot hist
usage: dinf.plot hist [-h] [-v | -q] [-o output.pdf] [--top N] [-W] [-c]
[-x X_PARAM] [--kde] [-m model.py]
data.npz [data.npz ...]
Plot marginal histograms.
One plot is produced for the discriminator predictions,
plus one plot for each model parameter. The choice of which
parameter to plot can be specified using the -x option,
with the special value "_Pr" indicating the discriminator
predictions.
If a pdf is requested with the -o option, a multipage pdf is
created. If another format is requested, then one file is
created for each figure (the requested filename will be
modified to include the parameter name).
The resulting figure is a histogram. If the data correspond
to a simulation-only model (provided via the -m option),
then the parameter's truth value will be shown as a vertical
red line. A 95% interval is shown at the bottom of
the figure. By default, all values in the data file contribute
equally to the histogram. For parameter values drawn from the
sampling distribution, this will therefore show the sampling
distribution. The subsequent distribution can be obtained by
weighting parameter values by the discriminator predictions
using the -W option, and/or rejection sampling using the --top
option to accept only the top N samples as ranked by the
discriminator predictions.
positional arguments:
data.npz Data file containing discriminator predictions.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-o output.pdf, --output-file output.pdf
Output file for the figure. The file extension
determines the filetype, which can be any format
supported by Matplotlib (e.g. pdf, svg, png). If no
output file is specified, an interactive plot window
will be opened. (default: None)
--top N Filter data to retain top N samples, ranked by
discriminator prediction. (default: None)
-W, --weighted Weight the parameter contributions by their
discriminator prediction. (default: False)
-c, --cumulative Plot cumulative distribution. (default: False)
-x X_PARAM, --x-param X_PARAM
Name of parameter to plot. The special name "_Pr" is
recognised to plot the predictions obtained from the
discriminator. (default: None)
--kde Also draw a 1-dimensional marginal kernel density
estimate. (default: False)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
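The effect of the -W and --top options described in the dinf-plot hist help can be illustrated with a small sketch. The helper names, bin edges, and data values below are hypothetical, and the code is not taken from Dinf. It only shows the idea: an unweighted histogram counts every sample equally, a weighted histogram sums the discriminator predictions in each bin, and top-N filtering keeps the N samples with the highest predictions.

```python
def top_n(thetas, predictions, n):
    """Keep the n samples with the highest discriminator predictions."""
    ranked = sorted(zip(predictions, thetas), reverse=True)[:n]
    return [t for _, t in ranked], [p for p, _ in ranked]


def weighted_histogram(values, weights, edges):
    """Sum the weights of values falling in each [edges[i], edges[i+1]) bin."""
    counts = [0.0] * (len(edges) - 1)
    for v, w in zip(values, weights):
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += w
                break
    return counts


# Hypothetical parameter values and matching discriminator predictions.
thetas = [1.0, 2.5, 2.7, 4.0, 3.1]
predictions = [0.1, 0.8, 0.9, 0.2, 0.6]
edges = [0.0, 2.0, 3.0, 5.0]

# Default behaviour: every sample contributes equally.
unweighted = weighted_histogram(thetas, [1.0] * len(thetas), edges)
# Analogue of -W: samples contribute in proportion to their prediction.
weighted = weighted_histogram(thetas, predictions, edges)
# Analogue of --top 3: keep only the three best-ranked samples.
kept, kept_predictions = top_n(thetas, predictions, 3)
print(kept)
```

In this example the weighted histogram shifts mass towards the middle bin, where the predictions are highest, and the top-3 filter discards the two samples with the lowest predictions.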
dinf-plot hist2d
usage: dinf.plot hist2d [-h] [-v | -q] [-o output.pdf] [--top N] [-W]
[-x X_PARAM] [-y Y_PARAM] [-m model.py]
data.npz
Plot 2d marginal histograms.
One plot is produced for each unique pair of parameters.
As this may lead to a large number of plots (particularly
for interactive use!), the choice of which parameters to
plot can be specified using the -x and -y options.
If a pdf is requested with the -o option, a multipage pdf is
created. If another format is requested, then one file is
created for each figure (the requested filename will be
modified to include the parameter names).
The resulting figure is a 2d histogram, with darker squares
indicating higher densities. If the data correspond to a
simulation-only model, then the parameters' truth values will
be indicated by red lines. By default, all values in the
data file contribute equally to the histogram. For parameter
values drawn from the sampling distribution, this will therefore
show the sampling distribution. The subsequent distribution can
be obtained by weighting parameter values by the discriminator
predictions using the -W option, and/or rejection sampling
using the --top option to accept only the top N samples as ranked
by the discriminator predictions.
positional arguments:
data.npz Data file containing discriminator predictions.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-o output.pdf, --output-file output.pdf
Output file for the figure. The file extension
determines the filetype, which can be any format
supported by Matplotlib (e.g. pdf, svg, png). If no
output file is specified, an interactive plot window
will be opened. (default: None)
--top N Filter data to retain top N samples, ranked by
discriminator prediction. (default: None)
-W, --weighted Weight the parameter contributions by their
discriminator prediction. (default: False)
-x X_PARAM, --x-param X_PARAM
Name of parameter to plot on horizontal axis.
(default: None)
-y Y_PARAM, --y-param Y_PARAM
Name of parameter to plot on vertical axis. (default:
None)
-m model.py, --model model.py
Python script from which to import the variable
"dinf_model". This is a dinf.DinfModel object that
describes the model components. See the examples/
folder of the git repository for example models.
https://github.com/RacimoLab/dinf (default: None)
Tabulating results
usage: dinf.tabulate [-h] [-V] {metrics,data,quantiles} ...
Tabulate Dinf output.
positional arguments:
{metrics,data,quantiles}
metrics Print discriminator metrics.
data Print .npz data---predictions from a discriminator.
quantiles Calculate quantiles of the data.
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
dinf-tabulate metrics
usage: dinf.tabulate metrics [-h] [-v | -q] [-o output.txt]
[--separator SEPARATOR] [--format FORMAT]
discriminator.nn [discriminator.nn ...]
Print discriminator metrics.
positional arguments:
discriminator.nn The discriminator network(s) from which to tabulate
metrics.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-o output.txt, --output-file output.txt
Output file for the tabulated data. If not specified,
output will be printed to stdout. (default: None)
--separator SEPARATOR
The string that separates columns. (default: )
--format FORMAT Printf-style format specifier for float values.
(default: None)
dinf-tabulate data
usage: dinf.tabulate data [-h] [-v | -q] [-o output.txt]
[--separator SEPARATOR] [--format FORMAT]
data.npz
Print .npz data---predictions from a discriminator.
positional arguments:
data.npz Data file in numpy .npz format.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-o output.txt, --output-file output.txt
Output file for the tabulated data. If not specified,
output will be printed to stdout. (default: None)
--separator SEPARATOR
The string that separates columns. (default: )
--format FORMAT Printf-style format specifier for float values.
(default: None)
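The --separator and --format options of the dinf-tabulate subcommands combine in a simple way, which the following sketch illustrates. The function format_row is hypothetical and not part of Dinf, and the tab separator and "%.4g" format used as defaults here are chosen only for illustration. Python's % operator accepts the same printf-style specifiers.

```python
def format_row(values, separator="\t", fmt="%.4g"):
    """Format each float with a printf-style specifier and join the columns."""
    return separator.join(fmt % v for v in values)


row = [0.5, 1234.5678]
print(format_row(row))                              # tab-separated, 4 significant digits
print(format_row(row, separator=",", fmt="%.2f"))   # comma-separated, 2 decimals
```

A specifier such as "%.2f" fixes the number of decimal places, whereas "%.4g" fixes the number of significant digits.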
dinf-tabulate quantiles
usage: dinf.tabulate quantiles [-h] [-v | -q] [-o output.txt]
[--separator SEPARATOR] [--format FORMAT]
[--top N] [-W] [--quantiles QUANTILES]
data.npz
Calculate quantiles of the data.
positional arguments:
data.npz Data file in numpy .npz format.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase verbosity. Specify once for INFO messages and
twice for DEBUG messages. (default: 0)
-q, --quiet Disable output. Only ERROR and CRITICAL messages are
printed. (default: False)
-o output.txt, --output-file output.txt
Output file for the tabulated data. If not specified,
output will be printed to stdout. (default: None)
--separator SEPARATOR
The string that separates columns. (default: )
--format FORMAT Printf-style format specifier for float values.
(default: None)
--top N Filter data to retain top N samples, ranked by
probability. (default: None)
-W, --weighted Weight the parameter contributions by their
probability. (default: False)
--quantiles QUANTILES
Comma separated list of quantiles to calculate.
(default: 0.025,0.5,0.975)
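As an illustration of what quantiles of weighted data can mean, the sketch below parses a comma separated list such as the --quantiles default and computes weighted quantiles from the cumulative weights. The function names are hypothetical, and this is only one of several possible conventions (no interpolation is performed). Dinf's own calculation may differ.

```python
def parse_quantiles(text):
    """Turn a comma separated string such as '0.025,0.5,0.975' into floats."""
    return [float(q) for q in text.split(",")]


def weighted_quantiles(values, weights, quantiles):
    """For each q, return the smallest value whose cumulative weight fraction is >= q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    results = []
    for q in quantiles:
        cumulative = 0.0
        for value, weight in pairs:
            cumulative += weight
            if cumulative / total >= q:
                results.append(value)
                break
        else:
            # Guard against floating point rounding for q close to 1.
            results.append(pairs[-1][0])
    return results


quantiles = parse_quantiles("0.025,0.5,0.975")
values = [1.0, 2.0, 3.0, 4.0]

# Equal weights: the unweighted case.
print(weighted_quantiles(values, [1, 1, 1, 1], quantiles))
# Weights that ignore the two smallest values (analogue of -W).
print(weighted_quantiles(values, [0, 0, 1, 1], quantiles))
```

With equal weights every sample counts the same, while unequal weights move the quantiles towards the samples that carry more weight.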