Porway J., The New York Times |
Zhu S.-C., University of California at Los Angeles
IEEE Transactions on Pattern Analysis and Machine Intelligence | Year: 2011
This paper presents a novel Markov Chain Monte Carlo (MCMC) inference algorithm called C4 (Clustering with Cooperative and Competitive Constraints) for computing multiple solutions from posterior probabilities defined on graphical models, including Markov random fields (MRF), conditional random fields (CRF), and hierarchical models. The graphs may have both positive and negative edges for cooperative and competitive constraints. C4 is a probabilistic clustering algorithm in the spirit of Swendsen-Wang. By turning the positive edges on/off probabilistically, C4 partitions the graph into a number of connected components (ccps), and each ccp is a coupled subsolution with nodes connected by positive edges. Then, by turning the negative edges on/off probabilistically, C4 obtains composite ccps (called cccps) with competing ccps connected by negative edges. At each step, C4 flips the labels of all nodes in a cccp so that nodes in each ccp keep the same label while different ccps are assigned different labels to observe both positive and negative constraints. Thus, the algorithm can jump between multiple competing solutions (or modes of the posterior probability) in one or a few steps. It computes multiple distinct solutions to preserve the intrinsic ambiguities and avoids premature commitments to a single solution that may not be valid given later context. C4 achieves a mixing rate faster than existing MCMC methods, such as various Gibbs samplers and Swendsen-Wang cuts. It is also more dynamic than common optimization methods such as ICM, LBP, and graph cuts. We demonstrate the C4 algorithm in line drawing interpretation, scene labeling, and object recognition. © 2011 IEEE.
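The clustering move described in this abstract can be illustrated with a heavily simplified sketch. It assumes binary labels, assumes that an edge can only be switched on when the current labeling already satisfies it (as in Swendsen-Wang), and omits any acceptance step, so it is not a valid sampler and not the authors' implementation. The function name `c4_step` and the probabilities `q_pos` and `q_neg` are hypothetical.

```python
import random

def c4_step(labels, pos_edges, neg_edges, q_pos=0.8, q_neg=0.8, rng=random):
    """One simplified C4-style move on binary labels (0 or 1).

    Assumption: an edge may be switched on only if the current labels
    satisfy it. No acceptance probability is computed here.
    """
    parent = {v: v for v in labels}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        parent[find(a)] = find(b)

    # Satisfied positive edges (same label) switch on and merge nodes into ccps.
    for u, v in pos_edges:
        if labels[u] == labels[v] and rng.random() < q_pos:
            union(u, v)

    # Satisfied negative edges (different labels) switch on and link
    # competing ccps into one composite component (cccp).
    for u, v in neg_edges:
        if labels[u] != labels[v] and rng.random() < q_neg:
            union(u, v)

    # Pick the cccp containing a random seed node and flip every label in it.
    # With binary labels, each ccp stays uniform and linked ccps stay different.
    root = find(rng.choice(sorted(labels)))
    return {v: (1 - lab if find(v) == root else lab) for v, lab in labels.items()}
```

With both probabilities set to 1.0, the labeling `{"a": 0, "b": 0, "c": 1}` with a positive edge a-b and a negative edge b-c forms a single cccp, and one move flips it to the competing labeling in which a and b take label 1 and c takes label 0.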
Stafford T., University of Sheffield |
Dewar M., The New York Times
Psychological Science | Year: 2014
In the present study, we analyzed data from a very large sample (N = 854,064) of players of an online game involving rapid perception, decision making, and motor responding. Use of game data allowed us to connect, for the first time, rich details of training history with measures of performance from participants engaged for a sustained amount of time in effortful practice. We showed that lawful relations exist between practice amount and subsequent performance, and between practice spacing and subsequent performance. Our methodology allowed an in situ confirmation of results long established in the experimental literature on skill acquisition. Additionally, we showed that greater initial variation in performance is linked to higher subsequent performance, a result we link to the exploration/exploitation trade-off from the computational framework of reinforcement learning. We discuss the benefits and opportunities of behavioral data sets with very large sample sizes and suggest that this approach could be particularly fecund for studies of skill acquisition. © The Author(s) 2013.
The New York Times | Date: 2015-12-04
A system of generating advertising inventory by marketers sharing content with others via a social network or other electronic communication across a network. This embodiment may include allowing one or more content providers to provide links to content items. The content items may be provided directly by the content provider or by a clearinghouse entity or other intermediary. A subscriber may then search for relevant content items from one or more content providers. The subscriber may provide ancillary content for association with the selected content item. Thus, when the subscriber shares a URL identifying the selected content item via a social network, the URL may be encoded with a unique identifier identifying the subscriber. When the URL is clicked by a user, an ad server on the publisher's side may recognize the unique identifier and display the content item with the ancillary content provided by the subscriber.
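One way the URL encoding step above might look is a query parameter carrying the subscriber's identifier. This is a minimal sketch only: the abstract does not say how the identifier is encoded, and the parameter name `sid` and the functions `tag_url` and `read_subscriber` are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def tag_url(url, subscriber_id, param="sid"):
    """Return url with the subscriber identifier added as a query parameter.

    The parameter name "sid" is an assumption made for illustration.
    """
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier identifier.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, subscriber_id))
    return urlunsplit(parts._replace(query=urlencode(query)))

def read_subscriber(url, param="sid"):
    """Recover the identifier on the serving side, or None if absent."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)
```

A server receiving the tagged URL could call `read_subscriber` to decide which subscriber's ancillary content to display alongside the content item.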
The New York Times | Date: 2011-09-23
A system and method for automatically detecting and extracting semantically significant text from an HTML document associated with a plurality of HTML documents is disclosed. The method may include receiving an HTML document, parsing the HTML document into a parse tree, segmenting the parse tree into one or more segments of one or more unique paths, processing the one or more segments based at least on the HTML document, and extracting one or more processed segments from at least the HTML document based on a predetermined number.
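The parse-and-segment idea can be sketched with Python's standard `html.parser` module. The sketch groups text by the tag path under which it appears and keeps the `n` paths holding the most text. That ranking rule, the class `PathSegmenter`, and the function `top_segments` are assumptions for illustration; the abstract does not specify how segments are processed or what the predetermined number controls.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements without closing tags; they never enter the path stack.
VOID = {"br", "img", "hr", "meta", "link", "input"}

class PathSegmenter(HTMLParser):
    """Group text nodes by their tag path, e.g. 'html/body/div/p'."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.segments = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.segments["/".join(self.stack)].append(text)

def top_segments(html, n=1):
    """Return the n (path, texts) pairs with the most characters of text."""
    parser = PathSegmenter()
    parser.feed(html)
    parser.close()
    ranked = sorted(parser.segments.items(),
                    key=lambda kv: sum(len(t) for t in kv[1]),
                    reverse=True)
    return ranked[:n]
```

On a page with a short navigation link and two body paragraphs, the paragraph path outranks the link path, which is the intuition behind separating article text from page furniture.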
The New York Times | Date: 2011-04-08
A system for and method of generating and visualizing one or more sharing event cascade structures associated with one or more content sharing events that occur across a network may include generating a plurality of sharing event nodes in the one or more sharing event cascade structures based on data associated with at least one of a system log and a database, wherein each sharing event cascade structure graphically represents a history of one or more content items being shared among a plurality of users of the network, and presenting, on a display device, a content sharing visualization diagram that illustrates the one or more sharing event cascade structures and enables a user to analyze sharing patterns associated with the plurality of users of the network.
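A cascade structure of this kind can be sketched as a forest built from share records. The record layout `(event_id, parent_id, user)` and the functions `build_cascades` and `cascade_stats` are hypothetical; the abstract does not describe the log or database format.

```python
from collections import defaultdict

def build_cascades(events):
    """Build cascade trees from share events.

    events: iterable of (event_id, parent_id, user) tuples, where parent_id
    is None for an original share. This layout is assumed for illustration.
    Returns (roots, children), with children mapping each event to its reshares.
    """
    children = defaultdict(list)
    roots = []
    for event_id, parent_id, _user in events:
        if parent_id is None:
            roots.append(event_id)
        else:
            children[parent_id].append(event_id)
    return roots, children

def cascade_stats(root, children):
    """Return (size, depth) of the cascade rooted at root."""
    size, depth = 0, 0
    stack = [(root, 1)]
    while stack:
        node, level = stack.pop()
        size += 1
        depth = max(depth, level)
        stack.extend((child, level + 1) for child in children.get(node, []))
    return size, depth
```

Size and depth per root are the kind of summary a visualization could use to lay out each cascade or to compare sharing patterns across content items.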