Agency: Department of Defense | Branch: Defense Advanced Research Projects Agency | Program: STTR | Phase: Phase II | Award Amount: 750.00K | Year: 2010
The increasing use of video in force protection, autonomous vehicle reconnaissance, and surveillance in general has created great demand for automated analysis and monitoring. It is infeasible for humans to monitor and analyze the vast amount of video collected in such surveillance, and automated analysis promises to ease that burden and increase the amount of information extracted from collected video. Motivated by this and by ongoing advances in the state of the art in machine vision, we have been developing a basic system for automatically extracting useful information from video and making relevant data quickly available to a user. In particular, we aimed to progress beyond the typical scenario in automated video analysis: a small set of predetermined, simple activities in high-resolution data. In completing the first phase we answered many key questions and identified many promising avenues for developing a complete system. From this, we have developed a second-phase research plan that builds on and extends the completed work. We plan to develop a complete operational prototype by focusing on more realistic data collections, increasingly complex activities, and difficult tracking cases, and finally by designing a fully integrated system.
Agency: Department of Defense | Branch: Navy | Program: STTR | Phase: Phase I | Award Amount: 149.99K | Year: 2014
Humans remain unsurpassed in their ability to search for objects in visual scenes. Object detection, meanwhile, continues to be one of the most active areas of computer vision, which attests both to the fundamental importance of the problem and to the fact that existing state-of-the-art algorithms still fall short in practice. We propose to utilize a human-vision model of a foveated view, along with saccadic movements of the eye, to quickly and accurately find and track objects of interest. We will develop a framework that combines a state-of-the-art, commercial-grade video object detection system with foveated visual field and saccade fixation-point modeling for better and faster object detection in UAS video feeds.
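The core idea of saccadic search can be sketched in a few lines: repeatedly fixate the most promising image location, examine a small high-resolution foveal window around it, and suppress already-examined regions ("inhibition of return") before the next saccade. This is a minimal illustrative sketch, not the proposed system; the `saccadic_search` function, the saliency-map input, and all parameter names are hypothetical stand-ins for the detector-driven modeling described above.

```python
import numpy as np

def saccadic_search(saliency, target_pos, fovea_radius=2, max_fixations=20):
    """Simulate saccadic visual search over a 2-D saliency map.

    Repeatedly fixate the most salient unexamined location; the search
    succeeds when the target position falls inside the foveal window.
    Returns the sequence of fixation points (hypothetical sketch).
    """
    sal = saliency.astype(float)
    fixations = []
    for _ in range(max_fixations):
        fix = np.unravel_index(np.argmax(sal), sal.shape)
        fixations.append(fix)
        # Target "detected" if it lies within the current fovea.
        if (abs(fix[0] - target_pos[0]) <= fovea_radius
                and abs(fix[1] - target_pos[1]) <= fovea_radius):
            return fixations
        # Inhibition of return: suppress the region just examined.
        r0, c0 = max(fix[0] - fovea_radius, 0), max(fix[1] - fovea_radius, 0)
        sal[r0:fix[0] + fovea_radius + 1, c0:fix[1] + fovea_radius + 1] = -np.inf
    return fixations
```

A real system would derive the saliency map from the detector's confidence scores and re-run the full-resolution detector only inside each foveal window, which is where the speed advantage comes from.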
Agency: Department of Defense | Branch: Navy | Program: STTR | Phase: Phase I | Award Amount: 149.97K | Year: 2012
In this proposal, we present Motion Imagery Exploitation with Compressive Sensing (MIECS), a system for improving the flow and exploitation of data gathered from wide-area surveillance. The proposed system exploits techniques from the recently developed field of compressive sensing to solve this problem. It exploits the significant temporal redundancy in captured data to reduce the bandwidth required to transmit wide-area surveillance data gathered from aerial platforms, while maintaining the quality of object and activity recognition. By the end of Phase I we plan to have a prototype of the proposed system capable of object recognition and tracking, as well as activity analysis, using a compressively sensed version of the visual data.
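The principle behind compressive sensing is that a signal which is sparse in some basis can be recovered from far fewer linear measurements than its ambient dimension. The sketch below, which is purely illustrative and not the MIECS pipeline, takes random Gaussian measurements of a sparse vector and recovers it with a textbook greedy solver (Orthogonal Matching Pursuit); all names and dimensions are assumptions for the demo.

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal Matching Pursuit: greedily estimate a k-sparse x from y = Phi @ x."""
    m, n = Phi.shape
    residual = y.copy()
    support = []
    for _ in range(k):
        # Pick the sensing-matrix column most correlated with the residual.
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares fit restricted to the current support.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat

# Demo: 256-dim signal, only 4 nonzeros, observed via 64 random measurements.
rng = np.random.default_rng(0)
n, m, k = 256, 64, 4
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(size=k)
Phi = rng.normal(size=(m, n)) / np.sqrt(m)  # random Gaussian sensing matrix
y = Phi @ x                                 # m << n compressive measurements
x_hat = omp(Phi, y, k)                      # typically recovers x when m is large enough
```

In the surveillance setting, the temporal redundancy mentioned above plays the role of sparsity: frame-to-frame differences are sparse, so far fewer measurements than pixels need to be transmitted.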
Agency: Department of Defense | Branch: Navy | Program: STTR | Phase: Phase II | Award Amount: 747.61K | Year: 2013
Ramaswamy S., Mayachitra, Inc. | Rose K., University of California at Santa Barbara
IEEE Transactions on Knowledge and Data Engineering | Year: 2011
We consider approaches for similarity search in correlated, high-dimensional data sets, which are derived within a clustering framework. We note that indexing by "vector approximation" (VA-File), which was proposed as a technique to combat the "Curse of Dimensionality," employs scalar quantization, and hence necessarily ignores dependencies across dimensions, which represents a source of suboptimality. Clustering, on the other hand, exploits interdimensional correlations and is thus a more compact representation of the data set. However, existing methods to prune irrelevant clusters are based on bounding hyperspheres and/or bounding rectangles, whose lack of tightness compromises their efficiency in exact nearest-neighbor search. We propose a new cluster-adaptive distance bound based on separating hyperplane boundaries of Voronoi clusters to complement our cluster-based index. This bound enables efficient spatial filtering with relatively small preprocessing storage overhead, and is applicable to Euclidean and Mahalanobis similarity measures. Experiments in exact nearest-neighbor set retrieval, conducted on real data sets, show that our indexing method is scalable with data set size and data dimensionality and outperforms several recently proposed indexes. Relative to the VA-File, over a wide range of quantization resolutions, it is able to reduce random IO accesses, given (roughly) the same amount of sequential IO operations, by factors reaching 100X and more. © 2011 IEEE.
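The geometric intuition behind a separating-hyperplane bound can be shown concretely. All points assigned to the Voronoi cell of centroid c_j lie on c_j's side of the hyperplane bisecting c_i and c_j, so the query's distance to that half-space lower-bounds its distance to every point in that cluster; if the bound exceeds the best distance found so far, the cluster is pruned without reading its points. This is a minimal sketch of the general idea for the Euclidean case, not the paper's exact bound; the function name and setup are illustrative.

```python
import numpy as np

def hyperplane_bound(q, c_i, c_j):
    """Lower bound on the distance from query q (in the Voronoi cell of c_i)
    to any point assigned to the cell of c_j.

    Cell-j points satisfy w @ x <= b for the bisecting hyperplane
    w = c_i - c_j, b = (|c_i|^2 - |c_j|^2) / 2, so the distance from q to
    that half-space is a valid lower bound (illustrative sketch).
    """
    w = c_i - c_j
    b = (c_i @ c_i - c_j @ c_j) / 2.0
    return max(0.0, (w @ q - b) / np.linalg.norm(w))
```

During search, a cluster whose bound already exceeds the current k-th-nearest distance is skipped entirely, which is what converts the tighter bound into fewer random IO accesses.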
Agency: Department of Defense | Branch: Defense Advanced Research Projects Agency | Program: SBIR | Phase: Phase II | Award Amount: 750.00K | Year: 2010
We propose to develop ViSearch, an interactive search and retrieval tool for massive video archives that will enable seamless articulation of complex queries and fast retrieval of relevant clips. The approach, which leverages our extensive experience with large aerial image and video databases, involves building a multi-layered indexing structure that can seamlessly support interactive query articulation and refinement.
Agency: Department of Defense | Branch: Navy | Program: SBIR | Phase: Phase II | Award Amount: 303.29K | Year: 2013
The primary goal of this SBIR Phase II is to build a software system capable of automated scene understanding and labeling based on advanced image processing and computer vision algorithms. The system allows a user to perform content-based search and retrieval queries in a context-aware fashion. The system seeks to answer the following questions in a fully automated manner: (i) What objects are present in the scene? (ii) Where are they located in the scene with respect to each other? Rather than labeling every item in the image or video frames, the system reviews an image or video and automatically forms a semantic, easily understandable description of it. In this Phase II.5 effort, we plan to mature this technology into a commercially ready product, to the level that it is applicable to an actual system in a mission.
Agency: Department of Defense | Branch: Navy | Program: SBIR | Phase: Phase I | Award Amount: 100.00K | Year: 2011
In this Small Business Innovation Research (SBIR) proposal, we present M-ROCK (Mobile Retrieval Of Contextual Knowledge), a system that will enable warfighters to access relevant information from remote knowledge stores in a hands-free fashion. Allowing warfighters to seamlessly access contextual information can vastly improve the safety and efficacy of missions.
Agency: Department of Defense | Branch: Navy | Program: SBIR | Phase: Phase I | Award Amount: 80.00K | Year: 2015
We propose MalSee, which leverages recent research by principals at Mayachitra to recast suspect software binaries as images and exploit computer vision techniques to automatically classify malware. This approach offers the following advantages: robustness to variations, speed and scalability, and a route for further exploration.
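The binary-to-image step described above amounts to reinterpreting the raw bytes of an executable as pixel intensities. The sketch below shows one straightforward way to do this; the fixed row width and function name are assumptions for illustration, and a real classifier would then extract visual features (e.g., texture descriptors) from the resulting image.

```python
import numpy as np

def binary_to_image(data: bytes, width: int = 256) -> np.ndarray:
    """Reinterpret a raw binary as a 2-D grayscale image.

    Each byte becomes one pixel (0-255); bytes are laid out in rows of a
    fixed width, and trailing bytes that do not fill a row are dropped.
    Illustrative sketch of the binaries-as-images idea.
    """
    buf = np.frombuffer(data, dtype=np.uint8)
    rows = len(buf) // width
    return buf[: rows * width].reshape(rows, width)

# Usage: image of a (here, synthetic) binary, ready for visual feature extraction.
img = binary_to_image(bytes(range(256)) * 16, width=128)
```

Because compilers and packers leave characteristic byte-level texture, binaries from the same malware family tend to produce visually similar images, which is what makes standard image classification applicable.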
Agency: Department of Defense | Branch: Navy | Program: SBIR | Phase: Phase I | Award Amount: 79.97K | Year: 2015
Intelligence cues in multimedia data are the result of multiple sources: the inherent knowledge of multiple end-users (analysts), feature-rich digital data content (co-occurrence of specific behaviors and scenes in video, audio, and other sensors), and intelligence context (where, when, why, how). Analysts need to fully access video and acoustic data content (when multiple constraints are present), formulate complex queries across feature modalities, and visualize patterns from retrieved results in multiple contextual spaces. Doing this in real time requires a sophisticated back-end: storage, common representation, search, annotation, and tagging schemes to manage the rich and diverse information contained in sensor feeds (video metadata, acoustic files, analyst comments, spatial and temporal localization, context). Doing it accurately requires sophisticated data retrieval that relies on fusing information from multiple sources. Analysts expect the system to produce time-critical, actionable intelligence and insights beyond simple querying. Single-domain techniques are not applicable here, as they solve only part of the problem (high-dimensional descriptor search for video and audio content, or text search for transcripts). To do this effectively, the project will explore deep learning techniques to capture the underlying dynamics of useful insights. The project will develop an end-to-end solution that supports (a) back-end development and integration of a wide range of video and audio descriptions at different semantic levels, through a unified representation of content description and inference over the stored semantic knowledge at retrieval time; (b) fast and versatile access (security- and bandwidth-wise) and addition of rich semantic description in collaborative environments (back-end and front-end feature annotation and tagging); and (c) sequencing and discovery of information contained in distributed networked sensor data files at the frame level.