Model | WDNN | DeepAMR | CNNGWP | TB-DROP |
---|---|---|---|---|
Features | 1. Wide: memorization; 2. Deep: generalization; 3. Custom loss and metric functions; 4. Removes rare variants | 1. Denoising autoencoder; 2. Cyclical learning rate | 1. CNN; 2. Input normalization; 3. MAF (minor allele frequency) filtering | Fully connected network |
Advantages | 1. Allows missing labels; 2. Batch normalization | 1. Non-linear dimensionality reduction; 2. Faster convergence | 1. Convolution: sparse interactions, parameter sharing, and equivariant representations; 2. Pooling: approximately invariant to small input changes | Explores all interactions |
Drawbacks | Too many neurons | Does not allow missing labels | Test sets were also used as validation sets | Too many neurons |
Regularization | 1. Multi-task learning; 2. Dropout; 3. Parameter norm penalty | 1. Multi-task learning; 2. Early stopping | 1. Single-task learning; 2. Model-averaged ensemble predictions | Multi-task learning |
Hyperparameter tuning | Bayesian optimization | Grid search | Bayesian optimization | Manual search |
Speed | ~3 s/epoch | Pretraining: ~76 s/epoch; training: ~110 s/epoch | ~40 s/epoch | ~10 s/epoch |
Threshold | max(TPR + TNR) | max(TPR - TNR) | max(TPR + TNR) | max(TPR + TNR) |
CV strategy | KFold | MSSS | MSSS | MSSS |
Softwareᵃ | No | No | No | Yes |
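The techniques in the table are listed compactly, so minimal Python sketches of several of them follow. All layer sizes, noise levels, and variable names below are illustrative assumptions, not the published architectures or hyperparameters.

First, the WDNN column pairs a wide path (the raw variants, for memorization) with a deep path (stacked non-linear layers, for generalization), regularized with the batch normalization and dropout noted in the table. A minimal sketch of that pattern:

```python
from tensorflow import keras
from tensorflow.keras import layers

n_variants, n_drugs = 10_000, 13  # assumed input/output sizes

inputs = keras.Input(shape=(n_variants,))
# Deep path: non-linear combinations of variants (generalization).
deep = layers.Dense(256, activation="relu")(inputs)
deep = layers.BatchNormalization()(deep)
deep = layers.Dropout(0.5)(deep)
deep = layers.Dense(256, activation="relu")(deep)
# Wide path: the raw variants themselves (memorization), concatenated back in.
merged = layers.concatenate([inputs, deep])
# One sigmoid output per drug = multi-task learning across drugs.
outputs = layers.Dense(n_drugs, activation="sigmoid")(merged)
wdnn_like = keras.Model(inputs, outputs)
```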
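"Custom loss functions" and "allows missing labels" in the WDNN column typically come down to masking unknown phenotypes out of the multi-task loss. A sketch of that idea, assuming missing labels are encoded as -1 (the encoding is an assumption, not taken from the paper):

```python
import tensorflow as tf

def masked_binary_crossentropy(y_true, y_pred):
    """Binary cross-entropy that skips missing labels (assumed encoded as -1)."""
    eps = 1e-7
    mask = tf.cast(tf.not_equal(y_true, -1.0), tf.float32)
    y_true = tf.clip_by_value(y_true, 0.0, 1.0)          # neutralize the -1 placeholders
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    bce = -(y_true * tf.math.log(y_pred) + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    # Average only over the labels that are actually observed.
    return tf.reduce_sum(bce * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)
```

Such a loss plugs in directly, e.g. `wdnn_like.compile(optimizer="adam", loss=masked_binary_crossentropy)`.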
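DeepAMR's denoising autoencoder performs the non-linear dimensionality reduction noted in the table: the input is corrupted, the network is trained to reconstruct the clean version, and the bottleneck then serves as a compact representation for the downstream classifier. A sketch with assumed sizes and dropout-style corruption:

```python
from tensorflow import keras
from tensorflow.keras import layers

n_variants = 10_000  # assumed input size

inputs = keras.Input(shape=(n_variants,))
corrupted = layers.Dropout(0.2)(inputs)                    # masking noise on the input
encoded = layers.Dense(128, activation="relu")(corrupted)  # bottleneck representation
decoded = layers.Dense(n_variants, activation="sigmoid")(encoded)

dae = keras.Model(inputs, decoded)
dae.compile(optimizer="adam", loss="binary_crossentropy")
# dae.fit(X, X, ...)  # target is the clean input itself
```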
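DeepAMR's cyclical learning rate sweeps the rate between two bounds, which is what the table credits for faster convergence; early stopping is its second regularizer. A per-epoch triangular schedule in the style of Smith (2017), with bounds and step size chosen arbitrarily here:

```python
import numpy as np
from tensorflow import keras

def triangular_clr(base_lr=1e-4, max_lr=1e-2, step_size=8):
    """Per-epoch triangular cyclical learning rate; bounds are illustrative."""
    def schedule(epoch, lr):
        cycle = np.floor(1 + epoch / (2 * step_size))
        x = abs(epoch / step_size - 2 * cycle + 1)
        return float(base_lr + (max_lr - base_lr) * max(0.0, 1 - x))
    return keras.callbacks.LearningRateScheduler(schedule)

callbacks = [
    triangular_clr(),
    keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True),
]
# model.fit(X, y, epochs=100, validation_split=0.1, callbacks=callbacks)
```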
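For the hyperparameter-tuning row: grid search enumerates a fixed lattice of settings, while Bayesian optimization fits a surrogate model to past trials and proposes the next point to evaluate. A sketch with scikit-optimize; the search space and the toy objective (standing in for "train the model, return validation loss") are assumptions:

```python
import math

from skopt import gp_minimize
from skopt.space import Integer, Real

space = [
    Real(1e-5, 1e-2, prior="log-uniform", name="learning_rate"),
    Integer(64, 512, name="hidden_units"),
]

def objective(params):
    learning_rate, hidden_units = params
    # Stand-in for "train and return validation loss" (toy quadratic here).
    return (math.log10(learning_rate) + 3.0) ** 2 + (hidden_units - 256) ** 2 / 1e4

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print(result.x, result.fun)  # best hyperparameters and best objective value
```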
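The threshold rule max(TPR + TNR) used by three of the four models is equivalent to maximizing Youden's J statistic (J = TPR + TNR - 1) over the ROC curve:

```python
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, y_score):
    """Decision threshold that maximizes TPR + TNR (Youden's J)."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    j = tpr + (1.0 - fpr) - 1.0    # TNR = 1 - FPR
    return thresholds[np.argmax(j)]
```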
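MSSS in the CV row presumably abbreviates multilabel stratified shuffle split, which keeps the per-drug label proportions balanced across splits; plain KFold ignores the labels entirely. A sketch using the iterative-stratification package (an assumption about the implementation, which the table does not name):

```python
import numpy as np
from iterstrat.ml_stratifiers import MultilabelStratifiedShuffleSplit

X = np.random.rand(200, 50)                  # toy feature matrix
y = np.random.randint(0, 2, size=(200, 5))   # toy multi-drug resistance labels

msss = MultilabelStratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
for train_idx, test_idx in msss.split(X, y):
    X_train, y_train = X[train_idx], y[train_idx]
    X_test, y_test = X[test_idx], y[test_idx]
    # train on (X_train, y_train); evaluate on the held-out split
```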