Projection layers improve deep learning models of regulatory DNA function [version 1; peer review: 1 approved, 1 approved with reservations]

Bibliographic Details
Published in	F1000Research Vol. 8; p. 151
Main Authors	Hawkins-Hooker, Alex; Kenlay, Henry; Reid, John E
Format	Journal Article
Language	English
Published	2019
Subjects	deep learning; gene regulation; sequence analysis

Summary:	With the increasing application of deep learning methods to the modelling of regulatory DNA sequences has come an interest in exploring what types of architecture are best suited to the domain. Networks designed to predict many functional characteristics of noncoding DNA in a multitask framework have to recognise a large number of motifs and as a result benefit from large numbers of convolutional filters in the first layer. The use of large first layers in turn motivates an exploration of strategies for addressing the resulting sparsity of output and potential for overfitting. To this end we propose the use of a dimensionality-reducing linear projection layer after the initial motif-recognising convolutions. In experiments with a reduced version of the DeepSEA dataset we find that inserting this layer, in combination with dropout, into convolutional and convolutional-recurrent architectures can improve predictive performance across a range of first-layer sizes. We further validate our approach by incorporating the projection layer into a new convolutional-recurrent architecture which achieves state-of-the-art performance on the full DeepSEA dataset. Analysis of the learned projection weights shows that the inclusion of this layer simplifies the network's internal representation of the occurrence of motifs, notably by projecting features representing forward and reverse-complement motifs to similar positions in the lower-dimensional feature space output by the layer.
ISSN:	2046-1402
DOI:	10.12688/f1000research.17125.1
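
The projection layer described in the summary can be illustrated with a minimal NumPy sketch: a wide first convolution scans one-hot encoded DNA for motifs, and a linear map then reduces the number of channels at each sequence position. This is not the authors' implementation. The function names, the filter count (32), filter width (8), projection size (8), and the random weights are all assumptions chosen for illustration, and the sketch assumes the projection has no bias or nonlinearity. Dropout and the downstream convolutional or recurrent layers are omitted.

```python
import numpy as np


def one_hot(seq):
    """Encode a DNA string as a (length, 4) array with columns A, C, G, T."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, index[base]] = 1.0
    return x


def conv1d_relu(x, filters):
    """Valid 1D convolution followed by ReLU.

    x: (length, 4) one-hot sequence.
    filters: (n_filters, width, 4) motif detectors.
    Returns an array of shape (length - width + 1, n_filters).
    """
    n_filters, width, _ = filters.shape
    n_pos = x.shape[0] - width + 1
    # Stack every window of the sequence: (n_pos, width, 4).
    windows = np.stack([x[i:i + width] for i in range(n_pos)])
    # Dot each window with each filter.
    out = np.einsum("pwc,fwc->pf", windows, filters)
    return np.maximum(out, 0.0)


def project(h, weights):
    """Linear projection applied independently at each position.

    h: (n_pos, n_filters) motif activations.
    weights: (n_filters, n_proj) with n_proj < n_filters.
    Assumed here to have no bias and no nonlinearity.
    """
    return h @ weights


# Illustrative sizes only; these are not taken from the paper.
rng = np.random.default_rng(0)
n_filters, width, n_proj = 32, 8, 8
filters = rng.normal(size=(n_filters, width, 4))
weights = rng.normal(size=(n_filters, n_proj))

x = one_hot("ACGTACGTACGTACGTACGT")  # 20 bases
h = conv1d_relu(x, filters)          # wide, sparse motif activations
z = project(h, weights)              # lower-dimensional features per position
```

Because the projection acts on the channel axis only, it is equivalent to a convolution of width 1, so layers that follow it see fewer input channels while the sequence length is unchanged.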