Key Laboratory on Machine Perception

Beijing, China

Zhao S.,Peking University | Zhang Y.,Key Laboratory on Machine Perception
EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference | Year: 2014

Knowledge graphs have recently been used to enrich query representations in an entity-aware way, thanks to the rich facts they organize around entities. However, few methods pay attention to non-entity words and clicked websites in queries, which also help convey user intent. In this paper, we tackle the problem of intent understanding by representing entity words, refiners and clicked URLs as intent topics in a unified knowledge-graph-based framework, in a way that exploits and expands the knowledge graph, which we call 'tailor'. We jointly exploit global knowledge in knowledge graphs and local contexts in the query log to initialize intent representations, then propagate the enriched features in a graph consisting of intent topics using an unsupervised algorithm. The experiments show that intent topics with knowledge-graph-enriched features significantly enhance intent understanding. © 2014 Association for Computational Linguistics.


Xiao R.,Peking University | Xiao R.,Key Laboratory on Machine Perception | Kong L.,Peking University | Kong L.,Key Laboratory on Machine Perception | And 3 more authors.
Proceedings - 2011 8th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2011 | Year: 2011

The development of information technology brings numerous online news stories and events into our daily life. One big problem with this information explosion is that there are often diverse descriptions of a single incident, which confuses people. Although previous research has provided various algorithms to detect and track events, few of them focus on uncovering the diversified versions of an event. In this paper, we propose a novel algorithm which is capable of discovering different versions of one event from the news reports. We map documents to the topic layer to get the information of each topic. Then we extract the highly differentiated words of each topic to cluster the documents. Experiments conducted on two data sets show that our algorithm is effective and achieves higher accuracy than various related algorithms, including classical methods such as K-means and LDA. © 2011 IEEE.


Lin P.,Peking University | Lin P.,Key Laboratory on Machine Perception | Xu S.,Peking University | Xu S.,Key Laboratory on Machine Perception | And 2 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2014

In this paper, we propose a framework to produce topic-focused summarization of news events, based on biased snippet extraction and selection. Through our approach, a summarization only retaining information related to a predefined topic (e.g. economy or politics) can be generated for a given news event to satisfy users with specific interests. To better balance coherence and coverage of the summarization, snippets rather than sentences or paragraphs are used as textual components. Topic signature is employed in snippet extraction and selection in order to emphasize the topic-biased information. Experiments conducted on real data demonstrate a good coverage, topic-relevancy, and content coherence of the summaries generated by our approach. © Springer International Publishing Switzerland 2014.


Zhu J.,Peking University | Zhu J.,Key Laboratory on Machine Perception | Tan S.,Peking University | Tan S.,Key Laboratory on Machine Perception
INES 2011 - 15th International Conference on Intelligent Engineering Systems, Proceedings | Year: 2011

Interpolative reasoning in sparse rule bases has been an important research topic in the field of artificial intelligence. To effectively solve the problem of reasoning in a multivariable sparse rule base whose consequences are restricted to a finite set, this paper develops a new interpolative reasoning approach and presents its algorithm. The approach deduces consequent results by converting the domains of the antecedent and consequent variables into ternary qualitative spaces and building a ternary qualitative function among these spaces as the system model used for calculation. By applying this approach to an example, the paper illustrates that the new approach is more accurate and simpler than existing interpolative reasoning methods for this problem. © 2011 IEEE.


Wang R.,Peking University | Wang R.,Key Laboratory on Machine Perception | Wang R.,Ucap Corporation | Jiang S.,Peking University | And 8 more authors.
Proceedings - 2011 8th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2011 | Year: 2011

In this paper, we propose a re-ranking method which employs semantic similarity to improve the quality of search results. We fetch the top N results returned by a search engine and use the semantic similarity between each candidate and the query to re-rank the results. We first convert the ranking position of each candidate to an importance score. Then we combine the semantic similarity score with this initial importance score to obtain the new ranks. In the experiment, we use NDCG to evaluate the re-ranking results, and the experimental results validate that our proposed method can indeed improve search performance and meet users' needs to a certain extent. © 2011 IEEE.
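The re-ranking idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the position-to-importance mapping or the way the two scores are combined, so the linear decay, the weighted sum, the `alpha` parameter and the function name `rerank` below are all assumptions made for illustration, not the authors' method. Semantic similarity scores are taken as precomputed inputs in [0, 1].

```python
from typing import List, Tuple


def rerank(candidates: List[Tuple[str, float]], alpha: float = 0.5) -> List[str]:
    """Re-rank search results by mixing original rank with semantic similarity.

    candidates: (doc_id, semantic_similarity) pairs in the original
                search-engine order, best result first.
    alpha:      hypothetical weight on the original ranking (0.0 to 1.0).
    """
    n = len(candidates)
    scored = []
    for position, (doc_id, similarity) in enumerate(candidates):
        # Assumed mapping: importance decays linearly from 1.0 (top) to 1/n (last).
        importance = (n - position) / n
        # Assumed combination: weighted sum of importance and similarity.
        combined = alpha * importance + (1.0 - alpha) * similarity
        # -position breaks ties in favour of the original order.
        scored.append((combined, -position, doc_id))
    scored.sort(reverse=True)
    return [doc_id for _, _, doc_id in scored]
```

With `alpha=1.0` the original order is preserved, and with `alpha=0.0` the results are ordered purely by semantic similarity; values in between trade off the two signals.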


Zhao L.,Peking University | Zhao L.,Key Laboratory on Machine Perception | Wang Y.,Peking University | Wang Y.,Key Laboratory on Machine Perception | And 4 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2010

Wikis are currently used to provide knowledge management systems for individual enterprises. The initial explanations of word entries (entities) in such a system can be generated from the pages on the Intranet of an enterprise. However, the information on such internal pages cannot cover all aspects of the entities. To solve this problem, this paper tries to enrich the explanations of entities by exploiting Web pages on the Internet. This task consists of three steps. First, it obtains pages from the Internet for each entity as an initial page set with the help of search engines. Second, it locates the pages in the page set which have a high correlation with the entity. Finally, it produces new snippets from such pages, keeps those which can enhance the explanation, and discards the redundant ones. Each candidate snippet is evaluated on two aspects: its correlation with the entity, and its ability to enhance the existing explanation. The experimental results based on a real data set show that our proposed method works effectively in supplementing the existing explanation by exploiting web pages from outside the enterprise. © 2010 Springer-Verlag.


Kong L.,Peking University | Kong L.,Key Laboratory on Machine Perception | Yan R.,Peking University | Jiang H.,Peking University | And 4 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2011

Currently a flood of news spreads throughout the web. The techniques of Event Detection and Tracking make it feasible to gather and structure text information into events which are constructed online automatically and updated over time. Users are usually eager to browse the whole evolution of an event, but with the huge quantity of documents, it is almost impossible for them to read everything. In this paper, we formally define the problem of event evolution phases discovery. We introduce a novel and principled model (called EPD), aiming at temporally outlining the entire news development. A news document is usually not atomic but consists of independent news segments related to the same event. Therefore we first employ a latent ingredients extraction method to extract event snippets. Unlike traditional clustering methods, we propose a novel metric integrating content, temporal, distribution and bursty features to measure the correlation between snippets along the timeline of a specific event. Using the bursty feature, we also introduce a novel method to compute word weights. We employ HAC to group the news snippets into diversified phases. An optimization problem is used to decide the number of phases, which makes EPD applicable in practice. With our novel evaluation method, empirical experiments on two real datasets show that EPD is effective and outperforms various related algorithms. Automatic event chronicle generation is introduced as a typical application of EPD. © 2011 Springer-Verlag.
