Improving discriminator accuracy

This page explores how the accuracy of the discriminator changes as various hyperparameters are modified.

Training replicates

In general, more training replicates will improve the accuracy of the discriminator. It’s often reasonable to train for multiple epochs (one epoch is a full pass over the training data), although overfitting is a possibility, particularly with fewer replicates. Below, we’ll train the bottleneck model for 20 epochs while increasing the number of training replicates.

%%bash
mkdir -p out/accuracy
for reps in 10000 100000 1000000; do
    echo $reps training replicates
    dinf train \
        --seed 1 \
        --epochs 20 \
        --training-replicates $reps \
        --test-replicates 10000 \
        ../../examples/bottleneck/model.py \
        out/accuracy/replicates-${reps}.pkl
    echo
done
10000 training replicates
[epoch 1|10000] train loss 0.2648, accuracy 0.8912; test loss 0.4547, accuracy 0.8448
[epoch 2|10000] train loss 0.1312, accuracy 0.9683; test loss 0.1865, accuracy 0.9331
[epoch 3|10000] train loss 0.0923, accuracy 0.9770; test loss 0.0713, accuracy 0.9805
[epoch 4|10000] train loss 0.0612, accuracy 0.9873; test loss 0.0845, accuracy 0.9696
[epoch 5|10000] train loss 0.0492, accuracy 0.9900; test loss 0.0634, accuracy 0.9816
[epoch 6|10000] train loss 0.0386, accuracy 0.9918; test loss 0.0597, accuracy 0.9823
[epoch 7|10000] train loss 0.0292, accuracy 0.9945; test loss 0.0575, accuracy 0.9826
[epoch 8|10000] train loss 0.0241, accuracy 0.9957; test loss 0.0586, accuracy 0.9809
[epoch 9|10000] train loss 0.0250, accuracy 0.9953; test loss 0.0713, accuracy 0.9752
[epoch 10|10000] train loss 0.0155, accuracy 0.9979; test loss 0.1352, accuracy 0.9563
[epoch 11|10000] train loss 0.0123, accuracy 0.9980; test loss 0.1489, accuracy 0.9515
[epoch 12|10000] train loss 0.0113, accuracy 0.9982; test loss 0.1116, accuracy 0.9645
[epoch 13|10000] train loss 0.0123, accuracy 0.9982; test loss 0.1230, accuracy 0.9607
[epoch 14|10000] train loss 0.0159, accuracy 0.9965; test loss 0.0851, accuracy 0.9705
[epoch 15|10000] train loss 0.0132, accuracy 0.9977; test loss 0.0586, accuracy 0.9827
[epoch 16|10000] train loss 0.0068, accuracy 0.9990; test loss 0.0591, accuracy 0.9838
[epoch 17|10000] train loss 0.0051, accuracy 0.9996; test loss 0.0638, accuracy 0.9804
[epoch 18|10000] train loss 0.0052, accuracy 0.9993; test loss 0.1433, accuracy 0.9554
[epoch 19|10000] train loss 0.0136, accuracy 0.9966; test loss 0.0605, accuracy 0.9832
[epoch 20|10000] train loss 0.0121, accuracy 0.9972; test loss 0.0828, accuracy 0.9750

100000 training replicates
[epoch 1|100000] train loss 0.1061, accuracy 0.9670; test loss 0.0622, accuracy 0.9861
[epoch 2|100000] train loss 0.0529, accuracy 0.9862; test loss 0.0448, accuracy 0.9889
[epoch 3|100000] train loss 0.0431, accuracy 0.9886; test loss 0.0414, accuracy 0.9909
[epoch 4|100000] train loss 0.0353, accuracy 0.9906; test loss 0.0435, accuracy 0.9900
[epoch 5|100000] train loss 0.0307, accuracy 0.9916; test loss 0.0397, accuracy 0.9909
[epoch 6|100000] train loss 0.0269, accuracy 0.9924; test loss 0.0421, accuracy 0.9908
[epoch 7|100000] train loss 0.0239, accuracy 0.9933; test loss 0.0562, accuracy 0.9876
[epoch 8|100000] train loss 0.0213, accuracy 0.9935; test loss 0.0779, accuracy 0.9798
[epoch 9|100000] train loss 0.0184, accuracy 0.9947; test loss 0.0547, accuracy 0.9883
[epoch 10|100000] train loss 0.0168, accuracy 0.9950; test loss 0.0493, accuracy 0.9878
[epoch 11|100000] train loss 0.0145, accuracy 0.9957; test loss 0.0429, accuracy 0.9909
[epoch 12|100000] train loss 0.0144, accuracy 0.9956; test loss 0.0847, accuracy 0.9834
[epoch 13|100000] train loss 0.0131, accuracy 0.9957; test loss 0.0534, accuracy 0.9881
[epoch 14|100000] train loss 0.0110, accuracy 0.9967; test loss 0.0500, accuracy 0.9917
[epoch 15|100000] train loss 0.0108, accuracy 0.9967; test loss 0.0573, accuracy 0.9873
[epoch 16|100000] train loss 0.0116, accuracy 0.9963; test loss 0.0566, accuracy 0.9901
[epoch 17|100000] train loss 0.0089, accuracy 0.9973; test loss 0.0759, accuracy 0.9809
[epoch 18|100000] train loss 0.0093, accuracy 0.9971; test loss 0.0528, accuracy 0.9915
[epoch 19|100000] train loss 0.0083, accuracy 0.9977; test loss 0.0521, accuracy 0.9908
[epoch 20|100000] train loss 0.0076, accuracy 0.9976; test loss 0.0583, accuracy 0.9880

1000000 training replicates
[epoch 1|1000000] train loss 0.0480, accuracy 0.9873; test loss 0.0303, accuracy 0.9931
[epoch 2|1000000] train loss 0.0321, accuracy 0.9922; test loss 0.0699, accuracy 0.9745
[epoch 3|1000000] train loss 0.0282, accuracy 0.9932; test loss 0.0277, accuracy 0.9941
[epoch 4|1000000] train loss 0.0258, accuracy 0.9939; test loss 0.0296, accuracy 0.9946
[epoch 5|1000000] train loss 0.0240, accuracy 0.9941; test loss 0.0268, accuracy 0.9946
[epoch 6|1000000] train loss 0.0226, accuracy 0.9946; test loss 0.0309, accuracy 0.9938
[epoch 7|1000000] train loss 0.0215, accuracy 0.9948; test loss 0.0306, accuracy 0.9945
[epoch 8|1000000] train loss 0.0207, accuracy 0.9949; test loss 0.0260, accuracy 0.9942
[epoch 9|1000000] train loss 0.0198, accuracy 0.9951; test loss 0.0287, accuracy 0.9942
[epoch 10|1000000] train loss 0.0191, accuracy 0.9952; test loss 0.0286, accuracy 0.9929
[epoch 11|1000000] train loss 0.0184, accuracy 0.9954; test loss 0.0280, accuracy 0.9942
[epoch 12|1000000] train loss 0.0179, accuracy 0.9954; test loss 0.0290, accuracy 0.9934
[epoch 13|1000000] train loss 0.0174, accuracy 0.9955; test loss 0.0322, accuracy 0.9913
[epoch 14|1000000] train loss 0.0170, accuracy 0.9956; test loss 0.0341, accuracy 0.9945
[epoch 15|1000000] train loss 0.0166, accuracy 0.9957; test loss 0.0289, accuracy 0.9930
[epoch 16|1000000] train loss 0.0160, accuracy 0.9959; test loss 0.0294, accuracy 0.9939
[epoch 17|1000000] train loss 0.0155, accuracy 0.9959; test loss 0.0322, accuracy 0.9947
[epoch 18|1000000] train loss 0.0152, accuracy 0.9959; test loss 0.0340, accuracy 0.9943
[epoch 19|1000000] train loss 0.0151, accuracy 0.9959; test loss 0.0293, accuracy 0.9934
[epoch 20|1000000] train loss 0.0146, accuracy 0.9960; test loss 0.0325, accuracy 0.9933
import dinf
import matplotlib.pyplot as plt

fig, axs = plt.subplots(nrows=2, ncols=2, sharex="all", sharey="row", tight_layout=True)

for reps in (10000, 100000, 1000000):
    discriminator = dinf.Discriminator.from_file(f"out/accuracy/replicates-{reps}.pkl")
    for ax, metric in zip(
        axs.flat, ("train_loss", "test_loss", "train_accuracy", "test_accuracy")
    ):
        y = discriminator.train_metrics[metric]
        epoch = range(1, len(y) + 1)
        ax.plot(epoch, y, label=str(reps))
        ax.set_title(metric)

axs[0, 0].legend(title="training replicates")
axs[0, 0].set_ylabel("loss")
axs[1, 0].set_ylabel("accuracy")
axs[1, 0].set_xlabel("epoch")
axs[1, 1].set_xlabel("epoch")
plt.show()

[Figure: train_loss, test_loss, train_accuracy and test_accuracy versus epoch, with one line per number of training replicates.]

From this, we see overfitting with 10,000 replicates after just a few epochs (indicated by an increasing test loss). For 100,000 replicates, the test loss has at best plateaued, and is at worst overfitting. But for 1,000,000 replicates, the training has not overfit after 20 epochs, although the test loss and accuracy may have plateaued.
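The reading of the plot above can be checked directly against the logged numbers. The following is a minimal sketch, not part of Dinf: the helper names `best_epoch` and `worst_loss_after` are hypothetical, and the test-loss values are copied by hand from the training logs above. The same lists could presumably be taken from `discriminator.train_metrics["test_loss"]`, as the plotting code does.

```python
# Test loss per epoch, copied from the training logs above.
test_loss = {
    10_000: [
        0.4547, 0.1865, 0.0713, 0.0845, 0.0634, 0.0597, 0.0575, 0.0586, 0.0713, 0.1352,
        0.1489, 0.1116, 0.1230, 0.0851, 0.0586, 0.0591, 0.0638, 0.1433, 0.0605, 0.0828,
    ],
    1_000_000: [
        0.0303, 0.0699, 0.0277, 0.0296, 0.0268, 0.0309, 0.0306, 0.0260, 0.0287, 0.0286,
        0.0280, 0.0290, 0.0322, 0.0341, 0.0289, 0.0294, 0.0322, 0.0340, 0.0293, 0.0325,
    ],
}


def best_epoch(losses):
    """Return the 1-based epoch with the lowest test loss (hypothetical helper)."""
    return min(range(len(losses)), key=lambda i: losses[i]) + 1


def worst_loss_after(losses, epoch):
    """Return the highest test loss recorded after the given 1-based epoch."""
    return max(losses[epoch:], default=losses[-1])


for reps, losses in test_loss.items():
    epoch = best_epoch(losses)
    print(
        f"{reps:>9,} replicates: lowest test loss {losses[epoch - 1]:.4f} "
        f"at epoch {epoch}, later rising as high as {worst_loss_after(losses, epoch):.4f}"
    )
```

A large rise in test loss after the best epoch, while the training loss keeps falling, is the overfitting signature described above. A small rise is more consistent with a plateau.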

With fewer training replicates, the network can memorise the training dataset (leading to overfitting). By default, Dinf uses a network with ~30,000 parameters. With more training data it’s harder for the network to memorise the data, but the threshold at which this happens will vary depending on the network architecture and capacity.
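One rough way to see the memorisation effect is to compare the final-epoch train and test losses with the number of training replicates per network parameter. The sketch below is illustrative only: the losses are copied from epoch 20 of the logs above, `NUM_PARAMETERS` is the approximate figure quoted in the text, and `generalisation_gap` is a hypothetical helper, not a Dinf function.

```python
# Final-epoch (epoch 20) losses, copied from the training logs above.
final_losses = {
    10_000: {"train": 0.0121, "test": 0.0828},
    100_000: {"train": 0.0076, "test": 0.0583},
    1_000_000: {"train": 0.0146, "test": 0.0325},
}

# Approximate parameter count quoted in the text (an assumption, not measured here).
NUM_PARAMETERS = 30_000


def generalisation_gap(loss):
    """Test loss minus train loss; larger values suggest more memorisation."""
    return loss["test"] - loss["train"]


gaps = {reps: generalisation_gap(loss) for reps, loss in final_losses.items()}

for reps, gap in gaps.items():
    per_parameter = reps / NUM_PARAMETERS
    print(f"{reps:>9,} replicates ({per_parameter:5.1f} per parameter): gap {gap:.4f}")
```

In these three runs the gap shrinks as the number of replicates per parameter grows. That matches the explanation above, although three runs with a single seed are only suggestive.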

Information content of features

  • Bigger feature arrays hold more information.

  • Does the feature matrix constitute a sufficient statistic for the parameter(s) in question?

%%bash

mkdir -p out/accuracy
for nind in 16 32 64 96 128; do
    echo $nind individuals in the feature matrix
    sed "s/num_individuals = [0-9]\+/num_individuals = $nind/" \
        ../../examples/bottleneck/model.py \
        > /tmp/model.py
    dinf train \
        --seed 1 \
        --epochs 20 \
        --training-replicates 10000 \
        --test-replicates 10000 \
        /tmp/model.py \
        out/accuracy/num_individuals-${nind}.pkl
    echo
done
16 individuals in the feature matrix
[epoch 1|10000] train loss 0.2648, accuracy 0.8912; test loss 0.4547, accuracy 0.8448
[epoch 2|10000] train loss 0.1312, accuracy 0.9683; test loss 0.1865, accuracy 0.9331
[epoch 3|10000] train loss 0.0923, accuracy 0.9770; test loss 0.0713, accuracy 0.9805
[epoch 4|10000] train loss 0.0612, accuracy 0.9873; test loss 0.0845, accuracy 0.9696
[epoch 5|10000] train loss 0.0492, accuracy 0.9900; test loss 0.0634, accuracy 0.9816
[epoch 6|10000] train loss 0.0386, accuracy 0.9918; test loss 0.0597, accuracy 0.9823
[epoch 7|10000] train loss 0.0292, accuracy 0.9945; test loss 0.0575, accuracy 0.9826
[epoch 8|10000] train loss 0.0241, accuracy 0.9957; test loss 0.0587, accuracy 0.9809
[epoch 9|10000] train loss 0.0250, accuracy 0.9953; test loss 0.0713, accuracy 0.9751
[epoch 10|10000] train loss 0.0155, accuracy 0.9979; test loss 0.1352, accuracy 0.9563
[epoch 11|10000] train loss 0.0123, accuracy 0.9980; test loss 0.1489, accuracy 0.9514
[epoch 12|10000] train loss 0.0113, accuracy 0.9982; test loss 0.1116, accuracy 0.9645
[epoch 13|10000] train loss 0.0123, accuracy 0.9982; test loss 0.1230, accuracy 0.9607
[epoch 14|10000] train loss 0.0159, accuracy 0.9965; test loss 0.0851, accuracy 0.9705
[epoch 15|10000] train loss 0.0132, accuracy 0.9977; test loss 0.0586, accuracy 0.9827
[epoch 16|10000] train loss 0.0068, accuracy 0.9990; test loss 0.0591, accuracy 0.9838
[epoch 17|10000] train loss 0.0051, accuracy 0.9996; test loss 0.0638, accuracy 0.9804
[epoch 18|10000] train loss 0.0052, accuracy 0.9993; test loss 0.1433, accuracy 0.9554
[epoch 19|10000] train loss 0.0136, accuracy 0.9966; test loss 0.0605, accuracy 0.9831
[epoch 20|10000] train loss 0.0121, accuracy 0.9972; test loss 0.0828, accuracy 0.9750

32 individuals in the feature matrix
[epoch 1|10000] train loss 0.2364, accuracy 0.9175; test loss 0.4923, accuracy 0.7615
[epoch 2|10000] train loss 0.0948, accuracy 0.9824; test loss 0.2063, accuracy 0.9191
[epoch 3|10000] train loss 0.0628, accuracy 0.9875; test loss 0.0533, accuracy 0.9893
[epoch 4|10000] train loss 0.0413, accuracy 0.9939; test loss 0.0418, accuracy 0.9903
[epoch 5|10000] train loss 0.0332, accuracy 0.9942; test loss 0.0443, accuracy 0.9897
[epoch 6|10000] train loss 0.0252, accuracy 0.9961; test loss 0.0388, accuracy 0.9893
[epoch 7|10000] train loss 0.0192, accuracy 0.9974; test loss 0.0588, accuracy 0.9830
[epoch 8|10000] train loss 0.0138, accuracy 0.9983; test loss 0.0397, accuracy 0.9879
[epoch 9|10000] train loss 0.0138, accuracy 0.9979; test loss 0.1013, accuracy 0.9666
[epoch 10|10000] train loss 0.0143, accuracy 0.9971; test loss 0.0348, accuracy 0.9901
[epoch 11|10000] train loss 0.0103, accuracy 0.9984; test loss 0.0414, accuracy 0.9900
[epoch 12|10000] train loss 0.0067, accuracy 0.9994; test loss 0.0452, accuracy 0.9885
[epoch 13|10000] train loss 0.0084, accuracy 0.9983; test loss 0.0383, accuracy 0.9886
[epoch 14|10000] train loss 0.0051, accuracy 0.9993; test loss 0.0357, accuracy 0.9916
[epoch 15|10000] train loss 0.0044, accuracy 0.9994; test loss 0.0422, accuracy 0.9900
[epoch 16|10000] train loss 0.0031, accuracy 0.9999; test loss 0.0363, accuracy 0.9900
[epoch 17|10000] train loss 0.0095, accuracy 0.9981; test loss 0.0510, accuracy 0.9878
[epoch 18|10000] train loss 0.0042, accuracy 0.9997; test loss 0.0477, accuracy 0.9859
[epoch 19|10000] train loss 0.0031, accuracy 0.9998; test loss 0.0629, accuracy 0.9830
[epoch 20|10000] train loss 0.0022, accuracy 1.0000; test loss 0.0394, accuracy 0.9894

64 individuals in the feature matrix
[epoch 1|10000] train loss 0.2151, accuracy 0.9312; test loss 0.4580, accuracy 0.8275
[epoch 2|10000] train loss 0.0866, accuracy 0.9863; test loss 0.1446, accuracy 0.9574
[epoch 3|10000] train loss 0.0538, accuracy 0.9913; test loss 0.0464, accuracy 0.9928
[epoch 4|10000] train loss 0.0363, accuracy 0.9945; test loss 0.0349, accuracy 0.9940
[epoch 5|10000] train loss 0.0277, accuracy 0.9954; test loss 0.0385, accuracy 0.9906
[epoch 6|10000] train loss 0.0193, accuracy 0.9972; test loss 0.0316, accuracy 0.9924
[epoch 7|10000] train loss 0.0151, accuracy 0.9978; test loss 0.0325, accuracy 0.9941
[epoch 8|10000] train loss 0.0094, accuracy 0.9994; test loss 0.0290, accuracy 0.9950
[epoch 9|10000] train loss 0.0076, accuracy 0.9991; test loss 0.0537, accuracy 0.9818
[epoch 10|10000] train loss 0.0066, accuracy 0.9995; test loss 0.0385, accuracy 0.9916
[epoch 11|10000] train loss 0.0070, accuracy 0.9992; test loss 0.0281, accuracy 0.9938
[epoch 12|10000] train loss 0.0040, accuracy 0.9999; test loss 0.0299, accuracy 0.9943
[epoch 13|10000] train loss 0.0037, accuracy 0.9999; test loss 0.0357, accuracy 0.9906
[epoch 14|10000] train loss 0.0079, accuracy 0.9983; test loss 0.0337, accuracy 0.9931
[epoch 15|10000] train loss 0.0081, accuracy 0.9983; test loss 0.0402, accuracy 0.9921
[epoch 16|10000] train loss 0.0041, accuracy 0.9994; test loss 0.0325, accuracy 0.9941
[epoch 17|10000] train loss 0.0030, accuracy 0.9999; test loss 0.0439, accuracy 0.9862
[epoch 18|10000] train loss 0.0083, accuracy 0.9983; test loss 0.0328, accuracy 0.9937
[epoch 19|10000] train loss 0.0036, accuracy 0.9996; test loss 0.0305, accuracy 0.9942
[epoch 20|10000] train loss 0.0026, accuracy 0.9995; test loss 0.0338, accuracy 0.9915

96 individuals in the feature matrix
[epoch 1|10000] train loss 0.2107, accuracy 0.9334; test loss 0.4476, accuracy 0.8284
[epoch 2|10000] train loss 0.0794, accuracy 0.9877; test loss 0.1253, accuracy 0.9729
[epoch 3|10000] train loss 0.0468, accuracy 0.9940; test loss 0.0428, accuracy 0.9946
[epoch 4|10000] train loss 0.0339, accuracy 0.9959; test loss 0.0347, accuracy 0.9936
[epoch 5|10000] train loss 0.0246, accuracy 0.9965; test loss 0.0367, accuracy 0.9939
[epoch 6|10000] train loss 0.0197, accuracy 0.9969; test loss 0.0552, accuracy 0.9805
[epoch 7|10000] train loss 0.0143, accuracy 0.9982; test loss 0.0999, accuracy 0.9728
[epoch 8|10000] train loss 0.0110, accuracy 0.9987; test loss 0.0890, accuracy 0.9657
[epoch 9|10000] train loss 0.0184, accuracy 0.9968; test loss 0.0449, accuracy 0.9859
[epoch 10|10000] train loss 0.0102, accuracy 0.9989; test loss 0.0352, accuracy 0.9896
[epoch 11|10000] train loss 0.0054, accuracy 0.9997; test loss 0.0262, accuracy 0.9956
[epoch 12|10000] train loss 0.0045, accuracy 0.9996; test loss 0.0241, accuracy 0.9953
[epoch 13|10000] train loss 0.0042, accuracy 0.9998; test loss 0.0297, accuracy 0.9936
[epoch 14|10000] train loss 0.0055, accuracy 0.9993; test loss 0.0534, accuracy 0.9808
[epoch 15|10000] train loss 0.0039, accuracy 0.9993; test loss 0.0258, accuracy 0.9956
[epoch 16|10000] train loss 0.0042, accuracy 0.9995; test loss 0.0281, accuracy 0.9951
[epoch 17|10000] train loss 0.0054, accuracy 0.9996; test loss 0.0427, accuracy 0.9867
[epoch 18|10000] train loss 0.0026, accuracy 0.9997; test loss 0.0287, accuracy 0.9949
[epoch 19|10000] train loss 0.0038, accuracy 0.9995; test loss 0.0357, accuracy 0.9944
[epoch 20|10000] train loss 0.0028, accuracy 0.9997; test loss 0.0297, accuracy 0.9951

128 individuals in the feature matrix
[epoch 1|10000] train loss 0.2190, accuracy 0.9306; test loss 0.5093, accuracy 0.7101
[epoch 2|10000] train loss 0.0868, accuracy 0.9870; test loss 0.1055, accuracy 0.9823
[epoch 3|10000] train loss 0.0535, accuracy 0.9922; test loss 0.0443, accuracy 0.9943
[epoch 4|10000] train loss 0.0363, accuracy 0.9949; test loss 0.0460, accuracy 0.9866
[epoch 5|10000] train loss 0.0292, accuracy 0.9950; test loss 0.0336, accuracy 0.9926
[epoch 6|10000] train loss 0.0204, accuracy 0.9971; test loss 0.0310, accuracy 0.9932
[epoch 7|10000] train loss 0.0157, accuracy 0.9976; test loss 0.0339, accuracy 0.9945
[epoch 8|10000] train loss 0.0153, accuracy 0.9986; test loss 0.0929, accuracy 0.9786
[epoch 9|10000] train loss 0.0130, accuracy 0.9985; test loss 0.0487, accuracy 0.9828
[epoch 10|10000] train loss 0.0089, accuracy 0.9986; test loss 0.0437, accuracy 0.9923
[epoch 11|10000] train loss 0.0120, accuracy 0.9978; test loss 0.0301, accuracy 0.9940
[epoch 12|10000] train loss 0.0079, accuracy 0.9993; test loss 0.0276, accuracy 0.9944
[epoch 13|10000] train loss 0.0037, accuracy 0.9999; test loss 0.0268, accuracy 0.9949
[epoch 14|10000] train loss 0.0031, accuracy 0.9999; test loss 0.0349, accuracy 0.9941
[epoch 15|10000] train loss 0.0056, accuracy 0.9989; test loss 0.0540, accuracy 0.9907
[epoch 16|10000] train loss 0.0048, accuracy 0.9994; test loss 0.0282, accuracy 0.9936
[epoch 17|10000] train loss 0.0050, accuracy 0.9989; test loss 0.0311, accuracy 0.9926
[epoch 18|10000] train loss 0.0063, accuracy 0.9990; test loss 0.0476, accuracy 0.9922
[epoch 19|10000] train loss 0.0026, accuracy 0.9999; test loss 0.0254, accuracy 0.9946
[epoch 20|10000] train loss 0.0025, accuracy 0.9997; test loss 0.0282, accuracy 0.9927
import dinf
import matplotlib.pyplot as plt

fig, axs = plt.subplots(nrows=2, ncols=2, sharex="all", sharey="row", tight_layout=True)

for nind in (16, 32, 64, 96, 128):
    discriminator = dinf.Discriminator.from_file(
        f"out/accuracy/num_individuals-{nind}.pkl"
    )
    for ax, metric in zip(
        axs.flat, ("train_loss", "test_loss", "train_accuracy", "test_accuracy")
    ):
        y = discriminator.train_metrics[metric]
        epoch = range(1, len(y) + 1)
        ax.plot(epoch, y, label=str(nind))
        ax.set_title(metric)

axs[0, 0].legend(title="num_individuals")
axs[0, 0].set_ylabel("loss")
axs[1, 0].set_ylabel("accuracy")
axs[1, 0].set_xlabel("epoch")
axs[1, 1].set_xlabel("epoch")
plt.show()

[Figure: train_loss, test_loss, train_accuracy and test_accuracy versus epoch, with one line per value of num_individuals.]

The results above suggest that the number of individuals has very little effect on the accuracy of the discriminator. This shouldn’t be a big surprise, as Dinf uses an exchangeable neural network that summarises information across individuals using (by default) the max function.
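To illustrate why a max-based summary is insensitive to the individuals axis, here is a toy sketch in plain Python. It is not Dinf's network: in a real exchangeable network the max would presumably be taken over learned features rather than raw genotypes, and the function name `max_over_individuals` is hypothetical.

```python
import random


def max_over_individuals(matrix):
    """Collapse an (individuals x loci) matrix to one value per locus using max."""
    return [max(column) for column in zip(*matrix)]


rng = random.Random(1)

# Toy feature matrix: 4 individuals (rows) by 6 loci (columns), with 0/1 entries.
features = [[rng.randint(0, 1) for _ in range(6)] for _ in range(4)]

# Reordering the individuals does not change the summary (exchangeability).
shuffled = features[:]
rng.shuffle(shuffled)
summary = max_over_individuals(features)
summary_shuffled = max_over_individuals(shuffled)

# Adding an individual that repeats an existing row adds nothing to the summary.
summary_with_duplicate = max_over_individuals(features + [features[0]])

print(summary == summary_shuffled, summary == summary_with_duplicate)
```

Because the max only changes when a new individual exceeds the current maximum at some position, additional individuals can contribute progressively less to the summary.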

The bottleneck model uses feature matrices with maf_thresh=0.05, which excludes low-frequency alleles. In empirical datasets, genotyping errors can produce false-positive detection of rare alleles, which is mitigated by MAF filtering. With no MAF filter (and high-quality data) there may be a more pronounced increase in available information as the number of individuals is increased.
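The effect of a MAF threshold can be sketched with a few lines of arithmetic. This is a simplified illustration with stated assumptions: each sample is treated as a single haplotype with a 0/1 allele call, and a site is kept when its minor allele frequency is greater than or equal to the threshold. Dinf's actual filtering may differ in both respects, and the function names here are hypothetical.

```python
def minor_allele_frequency(alleles):
    """MAF of one biallelic site, given a 0/1 allele call per sampled haplotype."""
    derived = sum(alleles) / len(alleles)
    return min(derived, 1 - derived)


def passes_maf(alleles, maf_thresh=0.05):
    """Assumed rule: keep the site when its MAF is at least maf_thresh."""
    return minor_allele_frequency(alleles) >= maf_thresh


def site(num_samples, minor_count):
    """Build a toy site with minor_count copies of the minor allele."""
    return [1] * minor_count + [0] * (num_samples - minor_count)


# A singleton passes the filter in a small sample but not in a large one.
print(passes_maf(site(16, 1)))   # 1/16 = 0.0625
print(passes_maf(site(128, 1)))  # 1/128 is about 0.0078
print(passes_maf(site(128, 6)))  # 6/128 is about 0.0469
print(passes_maf(site(128, 7)))  # 7/128 is about 0.0547
```

Under these assumptions, a larger sample does expose rarer variants, but the threshold removes them again, which is one plausible reason the filter limits the information gained from adding individuals.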

%%bash

mkdir -p out/accuracy
for seqlen in 50000 500000 5000000; do
    for nloci in 16 32 64 96 128; do
        echo num_loci=$nloci sequence_length=$seqlen
        sed -e "s/num_loci=[0-9]\+/num_loci=$nloci/" \
            -e "s/sequence_length = [_0-9]\+/sequence_length = $seqlen/" \
            ../../examples/bottleneck/model.py \
            > /tmp/model.py
        dinf train \
            --seed 1 \
            --epochs 20 \
            --training-replicates 10000 \
            --test-replicates 10000 \
            /tmp/model.py \
            out/accuracy/num_loci-${nloci}_sequence_length-${seqlen}.pkl
        echo
    done
done
num_loci=16 sequence_length=50000
[epoch 1|10000] train loss 0.4598, accuracy 0.7529; test loss 0.6227, accuracy 0.6169
[epoch 2|10000] train loss 0.4080, accuracy 0.8019; test loss 0.4735, accuracy 0.7639
[epoch 3|10000] train loss 0.3758, accuracy 0.8242; test loss 0.4023, accuracy 0.8136
[epoch 4|10000] train loss 0.3430, accuracy 0.8523; test loss 0.3919, accuracy 0.8349
[epoch 5|10000] train loss 0.3096, accuracy 0.8725; test loss 0.3945, accuracy 0.8326
[epoch 6|10000] train loss 0.2818, accuracy 0.8827; test loss 0.4213, accuracy 0.8205
[epoch 7|10000] train loss 0.2573, accuracy 0.8959; test loss 0.4017, accuracy 0.8341
[epoch 8|10000] train loss 0.2370, accuracy 0.9049; test loss 0.4006, accuracy 0.8373
[epoch 9|10000] train loss 0.2260, accuracy 0.9083; test loss 0.4015, accuracy 0.8351
[epoch 10|10000] train loss 0.2007, accuracy 0.9211; test loss 0.4281, accuracy 0.8395
[epoch 11|10000] train loss 0.1856, accuracy 0.9283; test loss 0.4410, accuracy 0.8403
[epoch 12|10000] train loss 0.1735, accuracy 0.9318; test loss 0.4928, accuracy 0.8361
[epoch 13|10000] train loss 0.1606, accuracy 0.9379; test loss 0.4587, accuracy 0.8370
[epoch 14|10000] train loss 0.1388, accuracy 0.9483; test loss 0.4821, accuracy 0.8333
[epoch 15|10000] train loss 0.1292, accuracy 0.9538; test loss 0.6307, accuracy 0.8235
[epoch 16|10000] train loss 0.1249, accuracy 0.9549; test loss 0.5678, accuracy 0.8345
[epoch 17|10000] train loss 0.1117, accuracy 0.9593; test loss 0.5563, accuracy 0.8258
[epoch 18|10000] train loss 0.0954, accuracy 0.9662; test loss 0.5903, accuracy 0.8214
[epoch 19|10000] train loss 0.0935, accuracy 0.9673; test loss 0.6018, accuracy 0.8320
[epoch 20|10000] train loss 0.0830, accuracy 0.9729; test loss 0.5853, accuracy 0.8363

num_loci=32 sequence_length=50000
[epoch 1|10000] train loss 0.4567, accuracy 0.7565; test loss 0.5677, accuracy 0.6507
[epoch 2|10000] train loss 0.3869, accuracy 0.8187; test loss 0.4946, accuracy 0.7494
[epoch 3|10000] train loss 0.3432, accuracy 0.8507; test loss 0.4011, accuracy 0.8201
[epoch 4|10000] train loss 0.2969, accuracy 0.8732; test loss 0.3814, accuracy 0.8384
[epoch 5|10000] train loss 0.2605, accuracy 0.8907; test loss 0.3766, accuracy 0.8403
[epoch 6|10000] train loss 0.2305, accuracy 0.9050; test loss 0.4252, accuracy 0.8198
[epoch 7|10000] train loss 0.2045, accuracy 0.9167; test loss 0.4151, accuracy 0.8430
[epoch 8|10000] train loss 0.1840, accuracy 0.9262; test loss 0.4078, accuracy 0.8451
[epoch 9|10000] train loss 0.1658, accuracy 0.9369; test loss 0.4144, accuracy 0.8405
[epoch 10|10000] train loss 0.1417, accuracy 0.9486; test loss 0.5886, accuracy 0.8268
[epoch 11|10000] train loss 0.1260, accuracy 0.9569; test loss 0.4993, accuracy 0.8415
[epoch 12|10000] train loss 0.1148, accuracy 0.9585; test loss 0.5670, accuracy 0.8363
[epoch 13|10000] train loss 0.0990, accuracy 0.9669; test loss 0.5235, accuracy 0.8401
[epoch 14|10000] train loss 0.0857, accuracy 0.9713; test loss 0.6919, accuracy 0.8188
[epoch 15|10000] train loss 0.0743, accuracy 0.9775; test loss 0.5811, accuracy 0.8268
[epoch 16|10000] train loss 0.0649, accuracy 0.9841; test loss 0.6994, accuracy 0.8267
[epoch 17|10000] train loss 0.0644, accuracy 0.9801; test loss 0.6622, accuracy 0.8195
[epoch 18|10000] train loss 0.0563, accuracy 0.9835; test loss 0.7056, accuracy 0.8130
[epoch 19|10000] train loss 0.0443, accuracy 0.9893; test loss 0.6773, accuracy 0.8242
[epoch 20|10000] train loss 0.0380, accuracy 0.9912; test loss 0.6819, accuracy 0.8300

num_loci=64 sequence_length=50000
[epoch 1|10000] train loss 0.4630, accuracy 0.7520; test loss 0.5321, accuracy 0.7803
[epoch 2|10000] train loss 0.3954, accuracy 0.8144; test loss 0.4738, accuracy 0.7740
[epoch 3|10000] train loss 0.3467, accuracy 0.8468; test loss 0.3803, accuracy 0.8300
[epoch 4|10000] train loss 0.3023, accuracy 0.8710; test loss 0.3858, accuracy 0.8234
[epoch 5|10000] train loss 0.2668, accuracy 0.8897; test loss 0.3652, accuracy 0.8485
[epoch 6|10000] train loss 0.2367, accuracy 0.9037; test loss 0.3743, accuracy 0.8346
[epoch 7|10000] train loss 0.2041, accuracy 0.9222; test loss 0.4202, accuracy 0.8456
[epoch 8|10000] train loss 0.1818, accuracy 0.9344; test loss 0.4043, accuracy 0.8387
[epoch 9|10000] train loss 0.1628, accuracy 0.9384; test loss 0.4084, accuracy 0.8484
[epoch 10|10000] train loss 0.1364, accuracy 0.9534; test loss 0.5590, accuracy 0.8309
[epoch 11|10000] train loss 0.1186, accuracy 0.9605; test loss 0.5519, accuracy 0.8095
[epoch 12|10000] train loss 0.1044, accuracy 0.9661; test loss 0.5378, accuracy 0.8424
[epoch 13|10000] train loss 0.0957, accuracy 0.9669; test loss 0.6044, accuracy 0.8012
[epoch 14|10000] train loss 0.0836, accuracy 0.9730; test loss 0.5276, accuracy 0.8318
[epoch 15|10000] train loss 0.0714, accuracy 0.9791; test loss 0.5913, accuracy 0.8323
[epoch 16|10000] train loss 0.0601, accuracy 0.9842; test loss 0.6678, accuracy 0.8324
[epoch 17|10000] train loss 0.0530, accuracy 0.9850; test loss 0.6291, accuracy 0.8343
[epoch 18|10000] train loss 0.0497, accuracy 0.9857; test loss 0.6552, accuracy 0.8116
[epoch 19|10000] train loss 0.0411, accuracy 0.9911; test loss 0.7043, accuracy 0.8186
[epoch 20|10000] train loss 0.0400, accuracy 0.9891; test loss 0.6458, accuracy 0.8341

num_loci=96 sequence_length=50000
[epoch 1|10000] train loss 0.4639, accuracy 0.7498; test loss 0.5382, accuracy 0.7824
[epoch 2|10000] train loss 0.3982, accuracy 0.8123; test loss 0.4951, accuracy 0.7499
[epoch 3|10000] train loss 0.3516, accuracy 0.8443; test loss 0.4203, accuracy 0.7982
[epoch 4|10000] train loss 0.3068, accuracy 0.8690; test loss 0.3988, accuracy 0.8089
[epoch 5|10000] train loss 0.2761, accuracy 0.8838; test loss 0.3536, accuracy 0.8432
[epoch 6|10000] train loss 0.2477, accuracy 0.8983; test loss 0.4010, accuracy 0.8123
[epoch 7|10000] train loss 0.2178, accuracy 0.9105; test loss 0.3743, accuracy 0.8350
[epoch 8|10000] train loss 0.1980, accuracy 0.9234; test loss 0.3746, accuracy 0.8462
[epoch 9|10000] train loss 0.1749, accuracy 0.9336; test loss 0.4019, accuracy 0.8437
[epoch 10|10000] train loss 0.1550, accuracy 0.9424; test loss 0.3824, accuracy 0.8508
[epoch 11|10000] train loss 0.1377, accuracy 0.9528; test loss 0.5820, accuracy 0.7885
[epoch 12|10000] train loss 0.1268, accuracy 0.9520; test loss 0.4564, accuracy 0.8478
[epoch 13|10000] train loss 0.1100, accuracy 0.9650; test loss 0.4788, accuracy 0.8188
[epoch 14|10000] train loss 0.0972, accuracy 0.9703; test loss 0.6715, accuracy 0.8181
[epoch 15|10000] train loss 0.0944, accuracy 0.9688; test loss 0.5528, accuracy 0.8208
[epoch 16|10000] train loss 0.0772, accuracy 0.9766; test loss 0.5863, accuracy 0.8296
[epoch 17|10000] train loss 0.0692, accuracy 0.9794; test loss 0.5198, accuracy 0.8357
[epoch 18|10000] train loss 0.0608, accuracy 0.9831; test loss 0.5479, accuracy 0.8386
[epoch 19|10000] train loss 0.0611, accuracy 0.9847; test loss 0.6241, accuracy 0.8051
[epoch 20|10000] train loss 0.0491, accuracy 0.9874; test loss 0.9259, accuracy 0.8051

num_loci=128 sequence_length=50000
[epoch 1|10000] train loss 0.4634, accuracy 0.7494; test loss 0.5339, accuracy 0.7930
[epoch 2|10000] train loss 0.3971, accuracy 0.8154; test loss 0.4935, accuracy 0.7524
[epoch 3|10000] train loss 0.3573, accuracy 0.8429; test loss 0.4399, accuracy 0.7891
[epoch 4|10000] train loss 0.3163, accuracy 0.8663; test loss 0.4973, accuracy 0.7410
[epoch 5|10000] train loss 0.2845, accuracy 0.8821; test loss 0.3487, accuracy 0.8442
[epoch 6|10000] train loss 0.2626, accuracy 0.8924; test loss 0.4651, accuracy 0.7742
[epoch 7|10000] train loss 0.2323, accuracy 0.9050; test loss 0.3463, accuracy 0.8495
[epoch 8|10000] train loss 0.2157, accuracy 0.9150; test loss 0.3661, accuracy 0.8466
[epoch 9|10000] train loss 0.1948, accuracy 0.9217; test loss 0.4012, accuracy 0.8436
[epoch 10|10000] train loss 0.1793, accuracy 0.9309; test loss 0.5248, accuracy 0.8312
[epoch 11|10000] train loss 0.1567, accuracy 0.9446; test loss 0.4515, accuracy 0.8254
[epoch 12|10000] train loss 0.1437, accuracy 0.9465; test loss 0.4768, accuracy 0.8429
[epoch 13|10000] train loss 0.1313, accuracy 0.9542; test loss 0.4622, accuracy 0.8212
[epoch 14|10000] train loss 0.1189, accuracy 0.9591; test loss 0.8318, accuracy 0.7892
[epoch 15|10000] train loss 0.1139, accuracy 0.9583; test loss 0.4777, accuracy 0.8411
[epoch 16|10000] train loss 0.0962, accuracy 0.9693; test loss 0.9666, accuracy 0.7635
[epoch 17|10000] train loss 0.0970, accuracy 0.9674; test loss 0.4790, accuracy 0.8462
[epoch 18|10000] train loss 0.0805, accuracy 0.9775; test loss 0.5138, accuracy 0.8303
[epoch 19|10000] train loss 0.0779, accuracy 0.9773; test loss 0.7205, accuracy 0.7651
[epoch 20|10000] train loss 0.0634, accuracy 0.9844; test loss 0.5770, accuracy 0.8385

num_loci=16 sequence_length=500000
[epoch 1|10000] train loss 0.3337, accuracy 0.8284; test loss 0.4803, accuracy 0.7790
[epoch 2|10000] train loss 0.2431, accuracy 0.9145; test loss 0.2797, accuracy 0.9075
[epoch 3|10000] train loss 0.1858, accuracy 0.9406; test loss 0.1726, accuracy 0.9384
[epoch 4|10000] train loss 0.1450, accuracy 0.9481; test loss 0.1511, accuracy 0.9467
[epoch 5|10000] train loss 0.1276, accuracy 0.9547; test loss 0.1462, accuracy 0.9495
[epoch 6|10000] train loss 0.1038, accuracy 0.9654; test loss 0.1458, accuracy 0.9474
[epoch 7|10000] train loss 0.0986, accuracy 0.9658; test loss 0.1395, accuracy 0.9524
[epoch 8|10000] train loss 0.0796, accuracy 0.9745; test loss 0.1377, accuracy 0.9532
[epoch 9|10000] train loss 0.0687, accuracy 0.9791; test loss 0.1612, accuracy 0.9501
[epoch 10|10000] train loss 0.0630, accuracy 0.9795; test loss 0.1560, accuracy 0.9525
[epoch 11|10000] train loss 0.0502, accuracy 0.9850; test loss 0.1417, accuracy 0.9528
[epoch 12|10000] train loss 0.0556, accuracy 0.9830; test loss 0.2030, accuracy 0.9435
[epoch 13|10000] train loss 0.0519, accuracy 0.9827; test loss 0.1501, accuracy 0.9481
[epoch 14|10000] train loss 0.0349, accuracy 0.9903; test loss 0.1480, accuracy 0.9521
[epoch 15|10000] train loss 0.0319, accuracy 0.9908; test loss 0.1597, accuracy 0.9530
[epoch 16|10000] train loss 0.0267, accuracy 0.9932; test loss 0.2297, accuracy 0.9417
[epoch 17|10000] train loss 0.0326, accuracy 0.9894; test loss 0.1667, accuracy 0.9523
[epoch 18|10000] train loss 0.0244, accuracy 0.9931; test loss 0.1588, accuracy 0.9566
[epoch 19|10000] train loss 0.0253, accuracy 0.9924; test loss 0.1746, accuracy 0.9547
[epoch 20|10000] train loss 0.0221, accuracy 0.9952; test loss 0.2006, accuracy 0.9419

num_loci=32 sequence_length=500000
[epoch 1|10000] train loss 0.2998, accuracy 0.8724; test loss 0.4990, accuracy 0.7342
[epoch 2|10000] train loss 0.1736, accuracy 0.9468; test loss 0.3010, accuracy 0.8664
[epoch 3|10000] train loss 0.1290, accuracy 0.9604; test loss 0.1282, accuracy 0.9565
[epoch 4|10000] train loss 0.0910, accuracy 0.9747; test loss 0.1143, accuracy 0.9629
[epoch 5|10000] train loss 0.0769, accuracy 0.9792; test loss 0.1202, accuracy 0.9632
[epoch 6|10000] train loss 0.0645, accuracy 0.9829; test loss 0.1256, accuracy 0.9576
[epoch 7|10000] train loss 0.0655, accuracy 0.9820; test loss 0.1179, accuracy 0.9598
[epoch 8|10000] train loss 0.0506, accuracy 0.9879; test loss 0.1165, accuracy 0.9622
[epoch 9|10000] train loss 0.0418, accuracy 0.9894; test loss 0.1201, accuracy 0.9625
[epoch 10|10000] train loss 0.0312, accuracy 0.9932; test loss 0.1512, accuracy 0.9560
[epoch 11|10000] train loss 0.0264, accuracy 0.9937; test loss 0.1246, accuracy 0.9638
[epoch 12|10000] train loss 0.0231, accuracy 0.9956; test loss 0.1606, accuracy 0.9566
[epoch 13|10000] train loss 0.0178, accuracy 0.9968; test loss 0.1302, accuracy 0.9660
[epoch 14|10000] train loss 0.0297, accuracy 0.9920; test loss 0.1571, accuracy 0.9521
[epoch 15|10000] train loss 0.0234, accuracy 0.9948; test loss 0.1614, accuracy 0.9635
[epoch 16|10000] train loss 0.0163, accuracy 0.9963; test loss 0.1604, accuracy 0.9625
[epoch 17|10000] train loss 0.0118, accuracy 0.9977; test loss 0.1683, accuracy 0.9651
[epoch 18|10000] train loss 0.0293, accuracy 0.9927; test loss 0.2430, accuracy 0.9282
[epoch 19|10000] train loss 0.0241, accuracy 0.9938; test loss 0.1680, accuracy 0.9594
[epoch 20|10000] train loss 0.0130, accuracy 0.9974; test loss 0.1357, accuracy 0.9615

num_loci=64 sequence_length=500000
[epoch 1|10000] train loss 0.3091, accuracy 0.8598; test loss 0.4604, accuracy 0.8014
[epoch 2|10000] train loss 0.1902, accuracy 0.9394; test loss 0.2033, accuracy 0.9276
[epoch 3|10000] train loss 0.1379, accuracy 0.9602; test loss 0.1247, accuracy 0.9589
[epoch 4|10000] train loss 0.1010, accuracy 0.9737; test loss 0.1109, accuracy 0.9622
[epoch 5|10000] train loss 0.0806, accuracy 0.9804; test loss 0.1368, accuracy 0.9480
[epoch 6|10000] train loss 0.0668, accuracy 0.9823; test loss 0.1047, accuracy 0.9632
[epoch 7|10000] train loss 0.0506, accuracy 0.9899; test loss 0.1054, accuracy 0.9646
[epoch 8|10000] train loss 0.0396, accuracy 0.9918; test loss 0.2988, accuracy 0.8877
[epoch 9|10000] train loss 0.0377, accuracy 0.9914; test loss 0.1105, accuracy 0.9635
[epoch 10|10000] train loss 0.0270, accuracy 0.9949; test loss 0.1804, accuracy 0.9417
[epoch 11|10000] train loss 0.0240, accuracy 0.9962; test loss 0.1047, accuracy 0.9685
[epoch 12|10000] train loss 0.0199, accuracy 0.9967; test loss 0.1146, accuracy 0.9688
[epoch 13|10000] train loss 0.0211, accuracy 0.9957; test loss 0.1631, accuracy 0.9471
[epoch 14|10000] train loss 0.0137, accuracy 0.9980; test loss 0.0977, accuracy 0.9706
[epoch 15|10000] train loss 0.0120, accuracy 0.9982; test loss 0.2470, accuracy 0.9354
[epoch 16|10000] train loss 0.0099, accuracy 0.9987; test loss 0.1344, accuracy 0.9583
[epoch 17|10000] train loss 0.0097, accuracy 0.9991; test loss 0.1280, accuracy 0.9660
[epoch 18|10000] train loss 0.0076, accuracy 0.9993; test loss 0.1212, accuracy 0.9682
[epoch 19|10000] train loss 0.0181, accuracy 0.9945; test loss 0.1312, accuracy 0.9621
[epoch 20|10000] train loss 0.0132, accuracy 0.9970; test loss 0.1622, accuracy 0.9584

num_loci=96 sequence_length=500000
[epoch 1|10000] train loss 0.3153, accuracy 0.8555; test loss 0.4379, accuracy 0.8448
[epoch 2|10000] train loss 0.2097, accuracy 0.9240; test loss 0.1986, accuracy 0.9415
[epoch 3|10000] train loss 0.1582, accuracy 0.9489; test loss 0.1416, accuracy 0.9513
[epoch 4|10000] train loss 0.1153, accuracy 0.9699; test loss 0.1446, accuracy 0.9468
[epoch 5|10000] train loss 0.0943, accuracy 0.9747; test loss 0.1222, accuracy 0.9558
[epoch 6|10000] train loss 0.0774, accuracy 0.9799; test loss 0.1223, accuracy 0.9551
[epoch 7|10000] train loss 0.0569, accuracy 0.9877; test loss 0.1539, accuracy 0.9435
[epoch 8|10000] train loss 0.0478, accuracy 0.9891; test loss 0.2981, accuracy 0.8906
[epoch 9|10000] train loss 0.0395, accuracy 0.9918; test loss 0.1095, accuracy 0.9605
[epoch 10|10000] train loss 0.0303, accuracy 0.9939; test loss 0.3223, accuracy 0.9033
[epoch 11|10000] train loss 0.0274, accuracy 0.9951; test loss 0.1167, accuracy 0.9608
[epoch 12|10000] train loss 0.0210, accuracy 0.9964; test loss 0.1595, accuracy 0.9551
[epoch 13|10000] train loss 0.0260, accuracy 0.9944; test loss 0.1298, accuracy 0.9595
[epoch 14|10000] train loss 0.0141, accuracy 0.9977; test loss 0.1230, accuracy 0.9561
[epoch 15|10000] train loss 0.0096, accuracy 0.9993; test loss 0.4079, accuracy 0.9100
[epoch 16|10000] train loss 0.0103, accuracy 0.9984; test loss 0.2377, accuracy 0.9365
[epoch 17|10000] train loss 0.0167, accuracy 0.9961; test loss 0.2932, accuracy 0.9308
[epoch 18|10000] train loss 0.0184, accuracy 0.9945; test loss 0.3326, accuracy 0.9282
[epoch 19|10000] train loss 0.0138, accuracy 0.9969; test loss 0.1238, accuracy 0.9616
[epoch 20|10000] train loss 0.0095, accuracy 0.9983; test loss 0.1559, accuracy 0.9559

num_loci=128 sequence_length=500000
[epoch 1|10000] train loss 0.3144, accuracy 0.8543; test loss 0.4008, accuracy 0.8622
[epoch 2|10000] train loss 0.2237, accuracy 0.9126; test loss 0.2123, accuracy 0.9376
[epoch 3|10000] train loss 0.1755, accuracy 0.9433; test loss 0.1600, accuracy 0.9407
[epoch 4|10000] train loss 0.1309, accuracy 0.9629; test loss 0.2177, accuracy 0.9060
[epoch 5|10000] train loss 0.1059, accuracy 0.9694; test loss 0.1340, accuracy 0.9501
[epoch 6|10000] train loss 0.0857, accuracy 0.9749; test loss 0.2583, accuracy 0.8958
[epoch 7|10000] train loss 0.0687, accuracy 0.9823; test loss 0.1400, accuracy 0.9472
[epoch 8|10000] train loss 0.0529, accuracy 0.9883; test loss 0.2982, accuracy 0.8868
[epoch 9|10000] train loss 0.0420, accuracy 0.9908; test loss 0.1333, accuracy 0.9549
[epoch 10|10000] train loss 0.0331, accuracy 0.9937; test loss 0.7273, accuracy 0.8466
[epoch 11|10000] train loss 0.0291, accuracy 0.9941; test loss 0.1327, accuracy 0.9558
[epoch 12|10000] train loss 0.0283, accuracy 0.9935; test loss 0.2020, accuracy 0.9493
[epoch 13|10000] train loss 0.0239, accuracy 0.9952; test loss 0.1363, accuracy 0.9605
[epoch 14|10000] train loss 0.0177, accuracy 0.9969; test loss 0.1245, accuracy 0.9619
[epoch 15|10000] train loss 0.0127, accuracy 0.9977; test loss 0.3098, accuracy 0.9256
[epoch 16|10000] train loss 0.0116, accuracy 0.9980; test loss 0.3644, accuracy 0.9192
[epoch 17|10000] train loss 0.0165, accuracy 0.9955; test loss 0.2072, accuracy 0.9343
[epoch 18|10000] train loss 0.0118, accuracy 0.9976; test loss 0.1496, accuracy 0.9578
[epoch 19|10000] train loss 0.0145, accuracy 0.9961; test loss 0.1567, accuracy 0.9568
[epoch 20|10000] train loss 0.0163, accuracy 0.9963; test loss 0.1612, accuracy 0.9543

num_loci=16 sequence_length=5000000
[epoch 1|10000] train loss 0.3869, accuracy 0.8529; test loss 0.5314, accuracy 0.7218
[epoch 2|10000] train loss 0.2479, accuracy 0.9107; test loss 0.3764, accuracy 0.8263
[epoch 3|10000] train loss 0.1600, accuracy 0.9426; test loss 0.1744, accuracy 0.9292
[epoch 4|10000] train loss 0.1316, accuracy 0.9509; test loss 0.1512, accuracy 0.9399
[epoch 5|10000] train loss 0.1114, accuracy 0.9575; test loss 0.1593, accuracy 0.9356
[epoch 6|10000] train loss 0.0994, accuracy 0.9625; test loss 0.1353, accuracy 0.9472
[epoch 7|10000] train loss 0.0871, accuracy 0.9676; test loss 0.1758, accuracy 0.9336
[epoch 8|10000] train loss 0.0730, accuracy 0.9755; test loss 0.2157, accuracy 0.9179
[epoch 9|10000] train loss 0.0649, accuracy 0.9783; test loss 0.1447, accuracy 0.9450
[epoch 10|10000] train loss 0.0560, accuracy 0.9808; test loss 0.1831, accuracy 0.9424
[epoch 11|10000] train loss 0.0495, accuracy 0.9834; test loss 0.1704, accuracy 0.9377
[epoch 12|10000] train loss 0.0431, accuracy 0.9868; test loss 0.1481, accuracy 0.9496
[epoch 13|10000] train loss 0.0407, accuracy 0.9864; test loss 0.1471, accuracy 0.9489
[epoch 14|10000] train loss 0.0385, accuracy 0.9871; test loss 0.1564, accuracy 0.9487
[epoch 15|10000] train loss 0.0310, accuracy 0.9904; test loss 0.1821, accuracy 0.9466
[epoch 16|10000] train loss 0.0281, accuracy 0.9914; test loss 0.1835, accuracy 0.9476
[epoch 17|10000] train loss 0.0302, accuracy 0.9906; test loss 0.1714, accuracy 0.9464
[epoch 18|10000] train loss 0.0361, accuracy 0.9882; test loss 0.1649, accuracy 0.9499
[epoch 19|10000] train loss 0.0314, accuracy 0.9901; test loss 0.2039, accuracy 0.9456
[epoch 20|10000] train loss 0.0188, accuracy 0.9946; test loss 0.1580, accuracy 0.9508

num_loci=32 sequence_length=5000000
[epoch 1|10000] train loss 0.3250, accuracy 0.9152; test loss 0.3425, accuracy 0.9451
[epoch 2|10000] train loss 0.1555, accuracy 0.9591; test loss 0.1557, accuracy 0.9608
[epoch 3|10000] train loss 0.0876, accuracy 0.9749; test loss 0.0900, accuracy 0.9706
[epoch 4|10000] train loss 0.0614, accuracy 0.9815; test loss 0.0679, accuracy 0.9734
[epoch 5|10000] train loss 0.0478, accuracy 0.9865; test loss 0.0605, accuracy 0.9763
[epoch 6|10000] train loss 0.0451, accuracy 0.9856; test loss 0.0561, accuracy 0.9790
[epoch 7|10000] train loss 0.0309, accuracy 0.9926; test loss 0.0613, accuracy 0.9784
[epoch 8|10000] train loss 0.0268, accuracy 0.9927; test loss 0.0483, accuracy 0.9841
[epoch 9|10000] train loss 0.0244, accuracy 0.9931; test loss 0.0938, accuracy 0.9651
[epoch 10|10000] train loss 0.0189, accuracy 0.9956; test loss 0.0738, accuracy 0.9732
[epoch 11|10000] train loss 0.0219, accuracy 0.9939; test loss 0.0582, accuracy 0.9802
[epoch 12|10000] train loss 0.0184, accuracy 0.9950; test loss 0.0584, accuracy 0.9806
[epoch 13|10000] train loss 0.0112, accuracy 0.9978; test loss 0.0489, accuracy 0.9846
[epoch 14|10000] train loss 0.0101, accuracy 0.9983; test loss 0.0500, accuracy 0.9847
[epoch 15|10000] train loss 0.0215, accuracy 0.9926; test loss 0.1290, accuracy 0.9670
[epoch 16|10000] train loss 0.0111, accuracy 0.9975; test loss 0.0470, accuracy 0.9856
[epoch 17|10000] train loss 0.0086, accuracy 0.9980; test loss 0.0523, accuracy 0.9830
[epoch 18|10000] train loss 0.0077, accuracy 0.9988; test loss 0.0532, accuracy 0.9838
[epoch 19|10000] train loss 0.0133, accuracy 0.9959; test loss 0.0560, accuracy 0.9828
[epoch 20|10000] train loss 0.0065, accuracy 0.9984; test loss 0.0506, accuracy 0.9834

num_loci=64 sequence_length=5000000
[epoch 1|10000] train loss 0.3173, accuracy 0.7906; test loss 0.2455, accuracy 0.8789
[epoch 2|10000] train loss 0.1758, accuracy 0.9500; test loss 0.1672, accuracy 0.9755
[epoch 3|10000] train loss 0.0865, accuracy 0.9829; test loss 0.0745, accuracy 0.9885
[epoch 4|10000] train loss 0.0554, accuracy 0.9881; test loss 0.0482, accuracy 0.9895
[epoch 5|10000] train loss 0.0425, accuracy 0.9905; test loss 0.0485, accuracy 0.9913
[epoch 6|10000] train loss 0.0400, accuracy 0.9906; test loss 0.0299, accuracy 0.9951
[epoch 7|10000] train loss 0.0297, accuracy 0.9930; test loss 0.0321, accuracy 0.9935
[epoch 8|10000] train loss 0.0254, accuracy 0.9950; test loss 0.0271, accuracy 0.9945
[epoch 9|10000] train loss 0.0193, accuracy 0.9950; test loss 0.0448, accuracy 0.9851
[epoch 10|10000] train loss 0.0168, accuracy 0.9962; test loss 0.0324, accuracy 0.9921
[epoch 11|10000] train loss 0.0153, accuracy 0.9964; test loss 0.0253, accuracy 0.9950
[epoch 12|10000] train loss 0.0160, accuracy 0.9963; test loss 0.0272, accuracy 0.9943
[epoch 13|10000] train loss 0.0116, accuracy 0.9973; test loss 0.0251, accuracy 0.9944
[epoch 14|10000] train loss 0.0089, accuracy 0.9984; test loss 0.0232, accuracy 0.9942
[epoch 15|10000] train loss 0.0104, accuracy 0.9974; test loss 0.0226, accuracy 0.9942
[epoch 16|10000] train loss 0.0063, accuracy 0.9989; test loss 0.0259, accuracy 0.9937
[epoch 17|10000] train loss 0.0089, accuracy 0.9978; test loss 0.0327, accuracy 0.9896
[epoch 18|10000] train loss 0.0072, accuracy 0.9982; test loss 0.0308, accuracy 0.9935
[epoch 19|10000] train loss 0.0051, accuracy 0.9993; test loss 0.0476, accuracy 0.9847
[epoch 20|10000] train loss 0.0113, accuracy 0.9972; test loss 0.0361, accuracy 0.9917

num_loci=96 sequence_length=5000000
[epoch 1|10000] train loss 0.3055, accuracy 0.8654; test loss 0.3013, accuracy 0.9235
[epoch 2|10000] train loss 0.1030, accuracy 0.9850; test loss 0.1194, accuracy 0.9738
[epoch 3|10000] train loss 0.0568, accuracy 0.9903; test loss 0.0523, accuracy 0.9906
[epoch 4|10000] train loss 0.0436, accuracy 0.9927; test loss 0.0367, accuracy 0.9929
[epoch 5|10000] train loss 0.0332, accuracy 0.9941; test loss 0.0317, accuracy 0.9950
[epoch 6|10000] train loss 0.0284, accuracy 0.9947; test loss 0.0251, accuracy 0.9957
[epoch 7|10000] train loss 0.0224, accuracy 0.9954; test loss 0.0248, accuracy 0.9957
[epoch 8|10000] train loss 0.0190, accuracy 0.9961; test loss 0.0228, accuracy 0.9953
[epoch 9|10000] train loss 0.0143, accuracy 0.9975; test loss 0.0397, accuracy 0.9876
[epoch 10|10000] train loss 0.0139, accuracy 0.9976; test loss 0.0214, accuracy 0.9947
[epoch 11|10000] train loss 0.0092, accuracy 0.9986; test loss 0.0223, accuracy 0.9956
[epoch 12|10000] train loss 0.0061, accuracy 0.9996; test loss 0.0204, accuracy 0.9962
[epoch 13|10000] train loss 0.0065, accuracy 0.9990; test loss 0.0332, accuracy 0.9912
[epoch 14|10000] train loss 0.0064, accuracy 0.9986; test loss 0.0369, accuracy 0.9944
[epoch 15|10000] train loss 0.0068, accuracy 0.9981; test loss 0.0357, accuracy 0.9883
[epoch 16|10000] train loss 0.0064, accuracy 0.9982; test loss 0.0309, accuracy 0.9925
[epoch 17|10000] train loss 0.0080, accuracy 0.9986; test loss 2.2114, accuracy 0.7420
[epoch 18|10000] train loss 0.0136, accuracy 0.9974; test loss 0.0641, accuracy 0.9885
[epoch 19|10000] train loss 0.0053, accuracy 0.9991; test loss 0.0272, accuracy 0.9918
[epoch 20|10000] train loss 0.0050, accuracy 0.9992; test loss 0.0207, accuracy 0.9944

num_loci=128 sequence_length=5000000
[epoch 1|10000] train loss 0.2239, accuracy 0.9321; test loss 0.3089, accuracy 0.9575
[epoch 2|10000] train loss 0.0811, accuracy 0.9872; test loss 0.0948, accuracy 0.9875
[epoch 3|10000] train loss 0.0490, accuracy 0.9914; test loss 0.0464, accuracy 0.9927
[epoch 4|10000] train loss 0.0404, accuracy 0.9927; test loss 0.0309, accuracy 0.9948
[epoch 5|10000] train loss 0.0312, accuracy 0.9949; test loss 0.0309, accuracy 0.9954
[epoch 6|10000] train loss 0.0240, accuracy 0.9956; test loss 0.0244, accuracy 0.9952
[epoch 7|10000] train loss 0.0193, accuracy 0.9971; test loss 0.0579, accuracy 0.9783
[epoch 8|10000] train loss 0.0187, accuracy 0.9960; test loss 0.0270, accuracy 0.9952
[epoch 9|10000] train loss 0.0146, accuracy 0.9976; test loss 0.0270, accuracy 0.9929
[epoch 10|10000] train loss 0.0184, accuracy 0.9961; test loss 0.0461, accuracy 0.9843
[epoch 11|10000] train loss 0.0115, accuracy 0.9986; test loss 0.0214, accuracy 0.9960
[epoch 12|10000] train loss 0.0074, accuracy 0.9988; test loss 0.0210, accuracy 0.9962
[epoch 13|10000] train loss 0.0065, accuracy 0.9990; test loss 0.0245, accuracy 0.9955
[epoch 14|10000] train loss 0.0120, accuracy 0.9975; test loss 0.0255, accuracy 0.9960
[epoch 15|10000] train loss 0.0098, accuracy 0.9979; test loss 0.0238, accuracy 0.9956
[epoch 16|10000] train loss 0.0054, accuracy 0.9994; test loss 0.0200, accuracy 0.9954
[epoch 17|10000] train loss 0.0052, accuracy 0.9992; test loss 0.0237, accuracy 0.9962
[epoch 18|10000] train loss 0.0038, accuracy 0.9995; test loss 0.0299, accuracy 0.9918
[epoch 19|10000] train loss 0.0041, accuracy 0.9993; test loss 0.0207, accuracy 0.9953
[epoch 20|10000] train loss 0.0022, accuracy 1.0000; test loss 0.0230, accuracy 0.9960
import dinf
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler

fig, axs = plt.subplots(
    nrows=2,
    ncols=2,
    sharex="all",
    sharey="row",
    tight_layout=True,
    subplot_kw=dict(yscale="log"),
)
# One colour per num_loci value and one line style per sequence length.
cmap = plt.get_cmap("Set2")
cycle = cycler(linestyle=["-", "--", ":"]) * cycler(color=[cmap(i) for i in range(5)])
for ax in axs.flat:
    ax.set_prop_cycle(cycle)

# Plot each training metric of every saved discriminator in its own panel.
for seqlen, seqlenlabel in zip((50000, 500000, 5000000), ("50kb", "500kb", "5Mb")):
    for nloci in (16, 32, 64, 96, 128):
        discriminator = dinf.Discriminator.from_file(
            f"out/accuracy/num_loci-{nloci}_sequence_length-{seqlen}.pkl"
        )
        for ax, metric in zip(
            axs.flat, ("train_loss", "test_loss", "train_accuracy", "test_accuracy")
        ):
            y = discriminator.train_metrics[metric]
            epoch = range(1, len(y) + 1)
            ax.plot(epoch, y, label=f"{nloci} / {seqlenlabel}")
            ax.set_title(metric)

handles, labels = axs[0, 0].get_legend_handles_labels()
fig.legend(
    handles,
    labels,
    title="num_loci / sequence_length",
    bbox_to_anchor=(0.5, 1.15),
    loc="upper center",
    borderaxespad=0.0,
    ncol=3,
)
axs[0, 0].set_ylabel("loss")
axs[1, 0].set_ylabel("accuracy")
axs[1, 0].set_xlabel("epoch")
axs[1, 1].set_xlabel("epoch")
plt.show()
[Figure: train_loss, test_loss, train_accuracy and test_accuracy by epoch for each num_loci / sequence_length combination.]

Discriminator network capacity

Assuming the features contain sufficient information, the discriminator network must still be able to extract it. The capacity of the network can be increased by increasing the number of trainable parameters, for example with:

  • Deeper network

  • More filters in convolution layers

  • More neurons in fully connected (dense) layers
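
To give a rough sense of how each option above changes the number of trainable parameters, here is a minimal sketch that counts weights and biases for a hypothetical convolutional network. The layer sizes, the kernel shape, and the assumption of global pooling before the dense layers are illustrative choices only; they are not taken from dinf's built-in network, and `count_params` is a helper made up for this example.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output filter for a 2D convolution."""
    kh, kw = kernel_size
    return in_channels * out_channels * kh * kw + out_channels


def dense_params(in_features, out_features):
    """Weights plus one bias per output neuron for a fully connected layer."""
    return in_features * out_features + out_features


def count_params(conv_filters, dense_units, in_channels=1, kernel_size=(1, 5)):
    """Count trainable parameters of a hypothetical CNN.

    Assumption: the feature map is pooled to one value per filter before
    the dense layers, so the count does not depend on the input size.
    """
    total = 0
    channels = in_channels
    for filters in conv_filters:
        total += conv2d_params(channels, filters, kernel_size)
        channels = filters
    features = channels
    for units in dense_units:
        total += dense_params(features, units)
        features = units
    # Final single-output layer producing the discriminator's prediction.
    total += dense_params(features, 1)
    return total


# Hypothetical configurations: (filters per conv layer, units per dense layer).
configs = {
    "baseline": ([32, 64], [64]),
    "deeper": ([32, 64, 64], [64]),
    "more filters": ([64, 128], [64]),
    "more dense neurons": ([32, 64], [256]),
}
counts = {name: count_params(conv, dense) for name, (conv, dense) in configs.items()}
for name, n in counts.items():
    print(f"{name:>20}: {n} trainable parameters")
```

In this toy setting, each of the three changes increases the parameter count relative to the baseline, with the convolution layers dominating because their cost scales with the product of input and output filters.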

Increased network capacity comes at a cost:

  • The network needs more training (more training replicates, and possibly more epochs).

  • The network can overfit more easily.
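
One way to watch for overfitting is to compare train and test metrics epoch by epoch. The sketch below parses log lines in the format printed by the training runs on this page, then reports the epoch with the lowest test loss and the epoch with the largest gap between train and test accuracy. The sample lines are copied from the num_loci=96, sequence_length=5000000 run; `parse_log` and the regular expression are hypothetical helpers written for this example, not part of dinf.

```python
import re

# A few lines copied from the num_loci=96, sequence_length=5000000 output.
LOG = """\
[epoch 10|10000] train loss 0.0139, accuracy 0.9976; test loss 0.0214, accuracy 0.9947
[epoch 12|10000] train loss 0.0061, accuracy 0.9996; test loss 0.0204, accuracy 0.9962
[epoch 17|10000] train loss 0.0080, accuracy 0.9986; test loss 2.2114, accuracy 0.7420
[epoch 20|10000] train loss 0.0050, accuracy 0.9992; test loss 0.0207, accuracy 0.9944
"""

# Matches one "[epoch N|REPS] train loss ..., accuracy ...; test loss ..., accuracy ..." line.
LINE_RE = re.compile(
    r"\[epoch (\d+)\|\d+\] train loss ([\d.]+), accuracy ([\d.]+); "
    r"test loss ([\d.]+), accuracy ([\d.]+)"
)


def parse_log(text):
    """Return one dict of metrics per epoch line found in the text."""
    rows = []
    for match in LINE_RE.finditer(text):
        epoch, train_loss, train_acc, test_loss, test_acc = match.groups()
        rows.append(
            {
                "epoch": int(epoch),
                "train_loss": float(train_loss),
                "train_accuracy": float(train_acc),
                "test_loss": float(test_loss),
                "test_accuracy": float(test_acc),
            }
        )
    return rows


def accuracy_gap(row):
    """Train accuracy minus test accuracy; large values suggest overfitting."""
    return row["train_accuracy"] - row["test_accuracy"]


rows = parse_log(LOG)
best = min(rows, key=lambda r: r["test_loss"])
worst = max(rows, key=accuracy_gap)
print(f"lowest test loss at epoch {best['epoch']}: {best['test_loss']:.4f}")
print(f"largest accuracy gap at epoch {worst['epoch']}: {accuracy_gap(worst):.4f}")
```

A train loss that keeps falling while the test loss stalls or jumps, as at epoch 17 in this run, is the kind of pattern to look for when deciding how many epochs to train.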