Ramaswamy S., Mayachitra, Inc. | Rose K., University of California at Santa Barbara
IEEE Transactions on Knowledge and Data Engineering | Year: 2011
We consider approaches for similarity search in correlated, high-dimensional data sets, which are derived within a clustering framework. We note that indexing by "vector approximation" (VA-File), which was proposed as a technique to combat the "Curse of Dimensionality," employs scalar quantization, and hence necessarily ignores dependencies across dimensions, which represents a source of suboptimality. Clustering, on the other hand, exploits interdimensional correlations and is thus a more compact representation of the data set. However, existing methods to prune irrelevant clusters are based on bounding hyperspheres and/or bounding rectangles, whose lack of tightness compromises their efficiency in exact nearest-neighbor search. We propose a new cluster-adaptive distance bound based on separating hyperplane boundaries of Voronoi clusters to complement our cluster-based index. This bound enables efficient spatial filtering, with a relatively small preprocessing storage overhead, and is applicable to Euclidean and Mahalanobis similarity measures. Experiments in exact nearest-neighbor set retrieval, conducted on real data sets, show that our indexing method is scalable with data set size and data dimensionality and outperforms several recently proposed indexes. Relative to the VA-File, over a wide range of quantization resolutions, it is able to reduce random I/O accesses by factors reaching 100X and more, given (roughly) the same amount of sequential I/O operations. © 2011 IEEE.
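The hyperplane-based bound described above can be illustrated with a minimal sketch. The boundary between Voronoi cells i and j is the hyperplane of points equidistant from centroids c_i and c_j; when the query lies on the far side of such a hyperplane, its distance to the hyperplane lower-bounds its distance to every point in cell i, so the cell can be pruned whenever the bound exceeds the current nearest-neighbor distance. The function name and signature below are illustrative, not taken from the paper:

```python
import numpy as np

def hyperplane_lower_bound(q, centroids, i):
    """Lower-bound the Euclidean distance from query q to any point
    in Voronoi cell i, using the cell's separating hyperplanes.

    The boundary between cells i and j has normal (c_i - c_j) and
    passes through the midpoint (c_i + c_j) / 2. The signed distance
    from q to that hyperplane is positive when q lies on c_j's side,
    i.e. outside cell i; the tightest such distance is the bound.
    """
    c_i = centroids[i]
    bound = 0.0
    for j, c_j in enumerate(centroids):
        if j == i:
            continue
        n = c_i - c_j                  # hyperplane normal, points toward c_i
        m = 0.5 * (c_i + c_j)          # a point on the hyperplane
        # signed distance from q to the hyperplane; > 0 when q is on c_j's side
        d = np.dot(m - q, n) / np.linalg.norm(n)
        bound = max(bound, d)
    return bound                       # 0 if q lies inside cell i
```

With two centroids at (0, 0) and (4, 0), the boundary is the line x = 2, so a query at the origin gets a bound of exactly 2 on its distance to the second cell.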
Agency: Department of Defense | Branch: Navy | Program: SBIR | Phase: Phase I | Award Amount: 79.97K | Year: 2015
Intelligence cues in multimedia data are the result of multiple sources: the inherent knowledge of multiple end users (analysts), feature-rich digital data content (co-occurrence of specific behaviors and scenes in video, audio, and other sensors), and intelligence context (where, when, why, how). Analysts need to fully access video and acoustic data content (when multiple constraints are present), formulate complex queries across feature modalities, and visualize patterns from retrieved results in multiple contextual spaces. Doing this in real time requires a sophisticated back end: storage, common representation, search, annotation, and tagging schemes to manage the rich and diverse information contained in sensor feeds (video metadata, acoustic files, analyst comments, spatial and temporal localization, context). Doing it accurately requires sophisticated data retrieval that relies on fusing information from various sources. Analysts expect the system to produce time-critical actionable intelligence and insights beyond simple querying. Single-domain techniques are not applicable here, as they solve only part of the problem (high-dimensional descriptor search for video and audio content, or text search for transcripts). To do this effectively, the project will explore deep learning techniques to capture the underlying dynamics of useful insights. The project will develop an end-to-end solution that supports (a) back-end development and integration of a wide range of video and audio descriptions at different semantic levels through a unified representation of content description, and inference over the stored semantic knowledge at retrieval time; (b) fast and versatile access (security- and bandwidth-wise) and addition of rich semantic description in collaborative environments (back-end and front-end feature annotation and tagging); and (c) sequencing and discovery of information contained in distributed networked sensor data files at the frame level.
Agency: Department of Defense | Branch: Defense Advanced Research Projects Agency | Program: SBIR | Phase: Phase I | Award Amount: 98.90K | Year: 2009
We present ViSearch, an interactive video search and retrieval tool that will enable seamless articulation of complex queries and fast retrieval of relevant video clips from large video repositories. The approach, leveraging our extensive experience with large aerial image and video databases, involves building a multi-layered indexing structure that can seamlessly support interactive query articulation and refinement.
Agency: Department of Defense | Branch: Navy | Program: SBIR | Phase: Phase I | Award Amount: 100.00K | Year: 2011
In this Small Business Innovation Research (SBIR) proposal, we present M-ROCK (Mobile Retrieval Of Contextual Knowledge), a system that will enable warfighters to access relevant information from remote knowledge stores in a hands-free fashion. Allowing warfighters to seamlessly access contextual information can vastly improve the safety and efficacy of missions.
Agency: Department of Defense | Branch: Navy | Program: SBIR | Phase: Phase I | Award Amount: 80.00K | Year: 2015
We propose MalSee, which leverages recent research performed by principals at Mayachitra to recast suspect software binaries as images and exploit computer vision techniques to automatically classify malware. This approach offers the following advantages: robustness to variations, speed and scalability, and a route for further exploration.
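The core recasting step can be sketched simply: each byte of the binary becomes one grayscale pixel, and the byte stream is laid out row by row at a fixed width, so that section structure in the binary shows up as visual texture a standard image classifier can learn. This is a generic illustration of the binaries-as-images idea, not MalSee's actual pipeline; the function name and the fixed-width layout are assumptions:

```python
import numpy as np

def binary_to_image(data: bytes, width: int = 256) -> np.ndarray:
    """Reinterpret a raw binary as a 2-D grayscale image.

    Each byte maps to one pixel intensity (0-255). Bytes fill rows of
    a fixed width; the trailing partial row is zero-padded so the
    result is rectangular.
    """
    arr = np.frombuffer(data, dtype=np.uint8)
    rows = -(-len(arr) // width)                 # ceiling division
    padded = np.zeros(rows * width, dtype=np.uint8)
    padded[:len(arr)] = arr
    return padded.reshape(rows, width)
```

The resulting array can be fed to any off-the-shelf image classifier; because the transform needs no disassembly or execution, it is fast and robust to small byte-level variations between malware variants.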