A method for following a topic in an electronic textual conversation includes: selecting, by a computing device, at least one primary term related to the topic; sending, by the computing device, to at least one communication service, a first query containing the at least one primary term; receiving, by the computing device, from the at least one communication service, at least one first set of messages responsive to the first query; for each first set, extracting, by the computing device, from the first set of messages, a first plurality of additional terms; and, for each term of the first plurality of additional terms, counting, by the computing device, the messages of the first set in which the term appears and adding the term to a list of secondary terms if the count exceeds a threshold amount.
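
The final step above, counting the messages in which each extracted term appears and keeping those above a threshold, can be sketched roughly as follows. This is a minimal illustration rather than the claimed method: the function name `secondary_terms`, the regular-expression tokenizer, and the small stopword list are all assumptions, and the messages are passed in directly instead of being fetched from a communication service.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "is", "on", "this", "a", "an", "of", "and"}

def secondary_terms(messages, primary_terms, threshold):
    """Return terms that appear in more than `threshold` messages."""
    primary = {t.lower() for t in primary_terms}
    message_counts = Counter()
    for message in messages:
        # Use a set so each term is counted at most once per message.
        terms = set(re.findall(r"[a-z']+", message.lower()))
        message_counts.update(terms - primary - STOPWORDS)
    return sorted(t for t, n in message_counts.items() if n > threshold)

messages = [
    "The new phone battery is great",
    "Battery life on this phone is poor",
    "Phone camera beats the battery",
]
print(secondary_terms(messages, ["phone"], threshold=2))
```

Counting messages rather than raw occurrences keeps a single repetitive message from promoting a term on its own.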

A method for eliminating spam in a data stream according to information density includes receiving, by a computing device, a stream of messages. The method includes directing, by the computing device, the stream into at least one buffer. The method includes repeatedly compressing, by the computing device, data in the buffer using a lossless compression algorithm. The method includes identifying, by the computing device, at least one first message in the buffer as spam, by determining that the at least one first message has been compressed below a threshold level.
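
One way to picture the compression test is to measure how many compressed bytes each incoming message adds to the buffer. The sketch below is an assumption-laden illustration, not the patented procedure: it uses `zlib` as the lossless compressor, and the names `marginal_ratio` and `flag_spam`, the 0.3 threshold, and the buffer size are hypothetical choices.

```python
import zlib

def marginal_ratio(buffer: bytes, message: bytes) -> float:
    """Compressed bytes the message adds, per byte of the message."""
    before = len(zlib.compress(buffer))
    after = len(zlib.compress(buffer + message))
    return (after - before) / max(len(message), 1)

def flag_spam(messages, threshold=0.3, max_buffer=16384):
    """Flag messages that compress below `threshold` against the buffer."""
    buffer = b""
    flags = []
    for text in messages:
        data = text.encode("utf-8") + b"\n"
        # A message that repeats buffered content adds very few bytes.
        flags.append(marginal_ratio(buffer, data) < threshold)
        # Keep only the most recent bytes so the buffer stays bounded.
        buffer = (buffer + data)[-max_buffer:]
    return flags

spam = "Buy cheap watches now at our store, best prices guaranteed today!"
stream = [
    "Quarterly revenue grew eight percent, driven by cloud subscriptions.",
    "Heavy rain flooded the northern bridge; commuters should expect delays.",
    spam,
    "Volunteers planted two hundred oak saplings along the river path.",
    spam,
]
flags = flag_spam(stream)
```

In this sketch the first copy of a repeated message is judged like any other text; only later copies, which the compressor can encode as references to buffered data, fall below the threshold.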

Disclosed herein are a method and system for producing a term association vector space on demand for a client, given a document set in electronic form. The method extracts terms from the document set, stripping out words that do not convey meaning and adding to the terms phrases that are important within the context of the document set. Associations between terms are calculated, subjected to further analytical processes, and collected in a matrix whose rows are vectors defining the vector space. Additional associational data can be added by matrix arithmetic, and documents can be rendered as further vectors in the space.
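
As a rough illustration of a matrix whose rows are term vectors, the sketch below counts how often two terms share a document. It assumes plain co-occurrence as the association measure and whitespace tokenization, and it omits the phrase detection and the further analytical processes mentioned above; the function name `association_matrix` is hypothetical.

```python
from itertools import combinations

def association_matrix(docs, stopwords=frozenset()):
    """Build a symmetric term-by-term co-occurrence matrix."""
    tokenized = [
        [w for w in doc.lower().split() if w not in stopwords]
        for doc in docs
    ]
    vocab = sorted({w for doc in tokenized for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for doc in tokenized:
        # Count each unordered pair of distinct terms once per document.
        for a, b in combinations(sorted(set(doc)), 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix

docs = ["coffee tastes bitter", "coffee smells good", "tea tastes bitter"]
vocab, matrix = association_matrix(docs)
row = dict(zip(vocab, matrix[vocab.index("bitter")]))
print(row)  # the vector for "bitter", keyed by term
```

Because every association sits in one matrix of fixed shape, data from another source with the same vocabulary could be folded in by element-wise addition.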

A system and related method are disclosed for rendering a set of words linked to an n-dimensional vector space as a word cloud drawn from a two-dimensional projection of the vector space. The user can click and drag a word; the subspace and the projection onto it shift to place the word where the user has dragged it, and the other words in the cloud shift correspondingly, offering the user new insights. The importance of words in a document set is represented by word size, and relatedness between words is indicated by color similarity.
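
The abstract does not say how the projection is recomputed after a drag. One simple possibility, shown here purely as an assumption, is to nudge each of the two projection axes along the dragged word's vector by the smallest amount that lands the word on its new screen position. The names `project` and `drag` are hypothetical, and the resulting axes are not kept orthonormal.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(vec, basis):
    """Map an n-dimensional vector to 2-D screen coordinates."""
    u, v = basis
    return (dot(u, vec), dot(v, vec))

def drag(basis, word_vec, target):
    """Return a new basis under which word_vec projects onto target."""
    norm_sq = dot(word_vec, word_vec)
    if norm_sq == 0:
        return basis  # a zero vector always projects to the origin
    new_basis = []
    for axis, goal in zip(basis, target):
        # Smallest change to this axis that moves the word to `goal`.
        step = (goal - dot(axis, word_vec)) / norm_sq
        new_basis.append([a + step * w for a, w in zip(axis, word_vec)])
    return tuple(new_basis)

words = {"coffee": [1.0, 2.0, 2.0], "tea": [2.0, 0.0, 1.0]}
basis = ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
new_basis = drag(basis, words["coffee"], target=(3.0, 0.0))
moved = {w: project(vec, new_basis) for w, vec in words.items()}
```

Words whose vectors overlap with the dragged word move along with it, while words orthogonal to it stay where they were.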

Luminoso | Date: 2013-03-15

A system and related method are disclosed for searching a data set made up of a set of documents, a set of terms, and a vector associated with each term and each document. The method involves converting a search query to a vector in the vector space spanned by the term and document vectors, and combining vector-proximity searching and term searching to produce a set of results, which may be ranked according to various measures of relatedness to the query. For each document in the result set, the excerpts containing the terms of greatest importance may be displayed.
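
A minimal sketch of combining the two kinds of search might look like the following. It assumes the query vector is the sum of its known term vectors, uses cosine similarity for proximity, and adds a fixed bonus for literal term matches; the function name `search`, the `term_weight` parameter, and the toy vectors are all hypothetical, and excerpt selection is left out.

```python
import math

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def search(query, term_vectors, doc_vectors, doc_texts, term_weight=0.5):
    """Rank documents by vector proximity plus literal term overlap."""
    words = query.lower().split()
    known = [term_vectors[w] for w in words if w in term_vectors]
    if not known:
        return []
    # Query vector: component-wise sum of the known term vectors.
    query_vec = [sum(column) for column in zip(*known)]
    scored = []
    for doc_id, doc_vec in doc_vectors.items():
        doc_words = set(doc_texts[doc_id].lower().split())
        overlap = sum(w in doc_words for w in words) / len(words)
        score = cosine(query_vec, doc_vec) + term_weight * overlap
        scored.append((score, doc_id))
    return [doc_id for score, doc_id in sorted(scored, reverse=True)]

term_vectors = {"coffee": [1.0, 0.0], "tea": [0.8, 0.6], "rain": [0.0, 1.0]}
doc_vectors = {"d1": [1.0, 0.1], "d2": [0.7, 0.7], "d3": [0.0, 1.0]}
doc_texts = {
    "d1": "strong coffee beans",
    "d2": "green tea leaves",
    "d3": "heavy rain today",
}
ranking = search("coffee", term_vectors, doc_vectors, doc_texts)
```

The proximity score lets a document about tea rank above one about rain for the query "coffee" even though neither contains the word.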

