Publications
Feature Selection and Inferential Procedures for Video Data
Chen, Maximillian G.; Bapst, Aleksander B.; Busche, Kirk B.; Do, Minh D.; Matzen, Laura E.; McNamara, Laura A.; Yeh, Raymond Y.
With the rise of electronic and high-dimensional data, new feature detection and statistical methods are required to perform accurate and meaningful analysis of these datasets, which pose unique statistical challenges. In the area of feature detection, much of the recent research in the computer vision community has focused on deep learning methods, which require large amounts of labeled training data. However, in many application areas, training data is very limited and often difficult to obtain. We develop methods for fast, unsupervised, precise feature detection for video data based on optical flows, edge detection, and clustering methods. We also use pretrained neural networks and interpretable linear models to extract features using very limited training data. In the area of statistics, while high-dimensional data analysis has been a main focus of recent methodological research, much of that focus has been on populations of high-dimensional vectors rather than populations of high-dimensional tensors. Here, tensors are three-dimensional arrays that can be used to model dependent images, such as images taken of the same person or frames extracted from a video. Our feature detection method is a non-model-based method that fuses information from dense optical flow, raw image pixels, and frame differences to generate detections. Our hypothesis testing methods are based on the assumption that dependent images are concatenated into a tensor that follows a tensor normal distribution, and from this assumption, we derive likelihood-ratio, score, and regression-based tests for one- and multiple-sample testing problems. We illustrate our methods on simulated and real datasets. We conclude this report with comments on the relationship between feature detection and hypothesis testing methods.
Acknowledgements: This work was funded by the Sandia National Laboratories Laboratory Directed Research and Development (LDRD) program.