Agency: Department of Defense | Branch: Defense Advanced Research Projects Agency | Program: SBIR | Phase: Phase II | Award Amount: 747.49K | Year: 2010
We propose to implement the ideas on parallelization of the Mapper visualization methodology developed under our SBIR Phase I effort. Specifically, we will use the MapReduce model within the Hadoop framework. This development will permit the construction of Mapper outputs for very large data sets. Such methods can then be used to understand the massive data sets arising from the study of internet traffic and advertising, financial market time series, monitoring of consumer behavior in retail, and many other settings.
Ayasdi Inc. | Date: 2014-09-09
An example method includes receiving text from a plurality of documents, segmenting the received text of the plurality of documents, calculating a frequency statistic for each segment of each document, determining segments of potential interest of each document based on the calculated frequency statistic, calculating distances between each document of the plurality of documents based on a text metric, and storing the segments of potential interest of each document and the distances in a search database. The method may further include receiving a search query and performing a search of information contained in the search database, partitioning documents of the search results using the distances, for each partition, determining labels of segments of potential interest for documents of that particular partition, the labels being determined based on a plurality of frequency statistics, and providing the determined labels of segments of potential interest for documents of each partition.
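The first half of the method above (segment, compute a frequency statistic, pick segments of interest, compute document distances) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name the segmentation rule, the frequency statistic, or the text metric, so whitespace tokens, relative term frequency, and cosine distance are used here as stand-ins, and all function names are hypothetical.

```python
import math
from collections import Counter


def segment(text):
    # Assumption: a "segment" is a lowercase whitespace-delimited token.
    return text.lower().split()


def term_frequencies(text):
    # Assumption: the frequency statistic is relative term frequency.
    tokens = segment(text)
    if not tokens:
        return {}
    counts = Counter(tokens)
    return {tok: n / len(tokens) for tok, n in counts.items()}


def top_segments(freqs, k=3):
    # Segments of potential interest: the k most frequent, ties broken alphabetically.
    return sorted(freqs, key=lambda tok: (-freqs[tok], tok))[:k]


def cosine_distance(a, b):
    # Assumption: the text metric is cosine distance between frequency vectors.
    dot = sum(value * b.get(tok, 0.0) for tok, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


docs = ["apple apple banana", "apple banana banana", "cherry"]
freqs = [term_frequencies(d) for d in docs]

# A stand-in for the "search database": per-document segments plus pairwise distances.
database = {
    "segments": [top_segments(f, k=1) for f in freqs],
    "distances": [[cosine_distance(a, b) for b in freqs] for a in freqs],
}
```

The stored distances could then drive the partitioning of search results described in the second half of the abstract, with labels for each partition drawn from the stored segments.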
Ayasdi Inc. | Date: 2014-03-20
In various embodiments, a system comprises a map and a patient data assessment module. The map includes a plurality of groupings and interconnections of the groupings, each grouping having one or more patient members that share biological similarities, each interconnection interconnecting groupings that share at least one common patient member, the map identifying a set of groupings and a set of interconnections having a medical characteristic of a set of medical characteristics. The patient data assessment module may be configured to receive sensor data from a user's mobile device and to assess the sensor data to generate user medical attributes, to determine whether the user shares the biological similarities with the one or more patient members of each grouping based, at least in part, on the user medical attributes, thereby enabling association of the user with one or more of the set of medical characteristics.
Ayasdi Inc. | Date: 2014-11-04
An exemplary method may comprise receiving a matrix for a set of documents, each cell of the matrix including a frequency value indicating a number of instances of a corresponding text segment in a corresponding document, receiving an indication of a relationship between two text segments, each of the two text segments associated with a first column and a second column, respectively, of the matrix, adjusting, for each document, a frequency value of the second column based on the frequency value of the first column, projecting each frequency value into a reference space to generate a set of projection values, identifying a plurality of subsets of the reference space, clustering, for each subset of the plurality of subsets, at least some documents that correspond to projection values, and generating a graph of nodes, each of the nodes identifying one or more of the documents corresponding to each cluster.
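The pipeline in this abstract (adjust related columns, project, cover the reference space, cluster per subset, emit nodes) can be sketched in a few lines. This is a rough sketch under several assumptions that the abstract does not specify: the adjustment adds the first column's count to the second, the projection is the row sum, the subsets are overlapping intervals, and clustering is single linkage with a fixed threshold `eps`. All names and parameter values are hypothetical.

```python
import math


def adjust(matrix, first, second):
    # Assumption: related segments are merged by adding the first column's
    # count to the second column for every document (row).
    return [
        [v + row[first] if j == second else v for j, v in enumerate(row)]
        for row in matrix
    ]


def lens(row):
    # Assumption: the reference space is one-dimensional (total count per document).
    return float(sum(row))


def cover(lo, hi, n_intervals, overlap):
    # Overlapping intervals covering [lo, hi]; overlap is a fraction of the width.
    width = (hi - lo) / n_intervals
    pad = width * overlap
    return [(lo + i * width - pad, lo + (i + 1) * width + pad)
            for i in range(n_intervals)]


def cluster(indices, matrix, eps):
    # Single linkage: connected components of documents within eps of each other.
    clusters, unseen = [], set(indices)
    while unseen:
        stack = [unseen.pop()]
        component = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unseen if math.dist(matrix[i], matrix[j]) <= eps}
            unseen -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_nodes(matrix, n_intervals=2, overlap=0.25, eps=1.5):
    # Each node is the set of document indices in one cluster of one interval.
    values = [lens(row) for row in matrix]
    nodes = []
    for a, b in cover(min(values), max(values), n_intervals, overlap):
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(cluster(members, matrix, eps))
    return nodes


counts = [[1, 0], [1, 1], [5, 5], [6, 5]]   # rows: documents, columns: text segments
adjusted = adjust(counts, first=0, second=1)
nodes = mapper_nodes(adjusted)
```

Because the intervals overlap, a document can appear in more than one node; linking nodes that share a document would give the edges of the graph.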
Ayasdi Inc. | Date: 2015-10-15
An exemplary method comprises receiving data points, selecting a first subset of the data points to generate an initial set of landmarks, each data point of the first subset defining a landmark point, and, for each non-landmark data point: calculating first data point distances between a respective non-landmark data point and each landmark point of the initial set of landmarks, identifying a first shortest data point distance from among the first data point distances between the respective non-landmark data point and each landmark point of the initial set of landmarks, and storing the first shortest data point distance as a first landmark distance for the respective non-landmark data point. The method further comprises identifying a non-landmark data point with a longest first landmark distance in comparison with other first landmark distances and adding the identified non-landmark data point as a first landmark point to the initial set of landmarks.
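The landmark step described above can be sketched as follows: for every non-landmark point, record the distance to its nearest landmark, then promote the point whose recorded distance is largest. This is a minimal sketch assuming Euclidean distance (the abstract does not fix a metric); the function names are hypothetical.

```python
import math


def landmark_distance(point, landmarks):
    # Shortest distance from a point to any current landmark.
    return min(math.dist(point, lm) for lm in landmarks)


def add_farthest_landmark(points, landmark_idx):
    """Return landmark_idx extended by the non-landmark point farthest from all landmarks."""
    chosen = set(landmark_idx)
    landmarks = [points[i] for i in landmark_idx]
    best_index, best_distance = None, -1.0
    for i, p in enumerate(points):
        if i in chosen:
            continue
        d = landmark_distance(p, landmarks)  # the "landmark distance" of point i
        if d > best_distance:
            best_index, best_distance = i, d
    if best_index is None:  # every point is already a landmark
        return list(landmark_idx)
    return list(landmark_idx) + [best_index]


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
landmarks = add_farthest_landmark(points, [0])        # adds point 3
landmarks = add_farthest_landmark(points, landmarks)  # adds point 2
```

Repeating the step spreads landmarks across the data set; this sketch recomputes all distances on each call, whereas a larger implementation would likely cache the stored landmark distances and update them only against the newly added landmark.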