Unsupervised Learning of Dense Optical Flow and Depth from Sparse Event Data


Chengxi Ye*
Anton Mitrokhin*
Chethan M. Parameshwara
Cornelia Fermüller
James A. Yorke
Yiannis Aloimonos
*Equal Contribution

Perception and Robotics Group
at
University of Maryland, College Park



Abstract


Fig. 1. Different motion representations acquired from the DAVIS sensor: (a) Grayscale image from a frame-based camera (red bounding box denotes a moving object). (b) Motion-compensated projected event cloud. Color denotes inconsistency in motion. (c) The 3D representation of the event cloud in (x, y, t) coordinate space. Color represents the timestamp with [red - blue] corresponding to [0.0 - 0.5] seconds. The separately moving object (a quadrotor) is clearly visible as a trail of events passing through the entire 3D event cloud.


In this work we present unsupervised learning of depth and motion from sparse event data generated by a Dynamic Vision Sensor (DVS). To tackle this low-level vision task, we use a novel encoder-decoder neural network architecture that aggregates multi-level features and addresses the problem at multiple resolutions. A feature decorrelation technique is introduced to improve the training of the network, and a non-local sparse smoothness constraint is used to alleviate the challenge of data sparsity. Ours is the first work to generate dense depth and optical flow from sparse event data. Our results show significant improvements over previous deep learning approaches to flow estimation from both images and events.
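To make the multi-resolution idea concrete, below is a minimal PyTorch sketch of an encoder-decoder that fuses multi-level features through skip connections and emits a prediction at every decoder scale. It is an illustrative approximation only, not the architecture from the paper: the layer sizes, the 4-channel event-image input, and the class and head names are all assumptions.

```python
# Minimal sketch of a multi-scale encoder-decoder with per-scale prediction
# heads. NOT the paper's exact network; all sizes and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch, stride=1):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
        nn.ReLU(inplace=True),
    )


class MultiScaleEncoderDecoder(nn.Module):
    """Encoder-decoder with skip connections and a head at each scale."""

    def __init__(self, in_ch=4, out_ch=3, base=32):
        super().__init__()
        # Encoder: progressively downsample the event representation.
        self.enc1 = conv_block(in_ch, base, stride=2)          # 1/2 resolution
        self.enc2 = conv_block(base, base * 2, stride=2)       # 1/4 resolution
        self.enc3 = conv_block(base * 2, base * 4, stride=2)   # 1/8 resolution
        # Decoder: upsample and fuse with the matching encoder features.
        self.dec2 = conv_block(base * 4 + base * 2, base * 2)
        self.dec1 = conv_block(base * 2 + base, base)
        # Prediction heads (e.g. 2-channel flow + 1-channel depth) per scale.
        self.head3 = nn.Conv2d(base * 4, out_ch, 3, padding=1)
        self.head2 = nn.Conv2d(base * 2, out_ch, 3, padding=1)
        self.head1 = nn.Conv2d(base, out_ch, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        p3 = self.head3(e3)                                    # coarsest output
        d2 = self.dec2(torch.cat([F.interpolate(e3, scale_factor=2), e2], dim=1))
        p2 = self.head2(d2)
        d1 = self.dec1(torch.cat([F.interpolate(d2, scale_factor=2), e1], dim=1))
        p1 = self.head1(d1)                                    # finest output
        return [p3, p2, p1]                                    # supervise all scales


if __name__ == "__main__":
    # A 4-channel event image (e.g. per-pixel counts and timestamps) is a
    # common input encoding; the exact encoding here is an assumption.
    net = MultiScaleEncoderDecoder()
    preds = net(torch.randn(1, 4, 256, 256))
    print([p.shape for p in preds])
```

Predicting at every decoder scale and supervising all of them is a common way to stabilize training of coarse-to-fine flow and depth networks; the paper's feature decorrelation and non-local sparse smoothness terms are not shown here.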




Paper

Chengxi Ye*, Anton Mitrokhin*, Chethan M. Parameshwara, Cornelia Fermüller, James A. Yorke, Yiannis Aloimonos.

* Equal Contribution




[pdf]
[Bibtex]