Abstract:
A multiscale input strategy for multiview deep
learning is proposed for supervised multispectral land-use classification
and validated on the UC Merced dataset. The hypothesis
that simultaneous multiscale views can improve composition-based
inference of classes containing size-varying objects compared
to single-scale multiview is investigated. The end-to-end
learning system learns a hierarchical feature representation with
the aid of convolutional layers to shift the burden of feature
determination from hand-engineering to a deep convolutional
neural network. This allows the classifier to obtain problem-specific
features that are optimal for minimizing the multinomial
logistic regression objective, as opposed to user-defined features,
which trade optimality for generality. A heuristic approach to
the optimization of the deep convolutional neural network hyperparameters
is used, based on empirical performance evidence.
It is shown that a single deep convolutional neural network
can be trained simultaneously with multiscale views to improve
prediction accuracy over multiple single-scale views. Competitive
performance is achieved on the UC Merced dataset, where
the 93.48% accuracy of multiview deep learning outperforms
the 85.37% accuracy of SIFT-based methods and the 90.26%
accuracy of unsupervised feature learning.
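
For illustration, the multinomial logistic regression objective referred to above can be written as the mean negative log-likelihood of the true class under a softmax over the network outputs. The following is a minimal NumPy sketch of that standard loss only, not the training code used in this work; the function name `multinomial_logistic_loss` and the argument layout (one row of class scores per view) are assumptions made for the example.

```python
import numpy as np

def multinomial_logistic_loss(logits, labels):
    """Mean negative log-likelihood of the true classes under a softmax.

    logits: array of shape (n_samples, n_classes) with raw class scores
            (assumed layout; e.g. one row per input view).
    labels: integer class indices of shape (n_samples,).
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Subtract the row-wise maximum so the exponentials cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    # Log-softmax: log p(class | sample) for every class.
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each sample and average.
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With uniform scores over K classes the loss equals log K, which is a convenient sanity check before training begins.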