Extreme Data Laboratory DEXL
Fontes C.A., Brazilian Military Institute of Engineering |
Cavalcanti M.C., Brazilian Military Institute of Engineering |
De Moura A.M.C., Extreme Data Laboratory DEXL
Proceedings - 2013 IEEE 7th International Conference on Semantic Computing, ICSC 2013 | Year: 2013
Information management has become an important challenge, especially since most relevant and strategic documents are available on the Web only for human interpretation. Annotating documents has emerged as an interesting strategy to ease the hard task of retrieving important documents from the Web. Annotation consists of associating metadata with text segments of a document in order to facilitate its retrieval by search engines. Besides improving search engine performance, annotations enable optimized indexing of documents. However, given the huge number of existing documents, generating document annotations is not a trivial task. This paper presents a proposal for automatically enriching documents with semantic annotations, where document terms are annotated according to a domain ontology. Some document annotation tools already exist to automate this process. The main contribution of this work, however, lies in exploiting the ontology's inference capability and the meta-annotation concept, which aim to provide users and automatic agents with a more powerful mechanism for retrieving information. © 2013 IEEE.
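As a rough illustration of the idea in this abstract, the sketch below annotates document terms with classes from a toy domain ontology and uses the subclass hierarchy to infer more general classes. It is not the authors' tool: the ontology, the lexicon, and the function names `ancestors` and `annotate` are all hypothetical, and the meta-annotation concept is not covered.

```python
import re

# Hypothetical toy ontology: class -> parent class (None for a root).
SUBCLASS_OF = {
    "Laptop": "Computer",
    "Computer": "Device",
    "Device": None,
}

# Hypothetical lexicon mapping surface terms to ontology classes.
LEXICON = {"laptop": "Laptop", "notebook": "Laptop", "computer": "Computer"}


def ancestors(cls):
    """Return the superclasses of cls, nearest first (simple subclass inference)."""
    result = []
    parent = SUBCLASS_OF.get(cls)
    while parent is not None:
        result.append(parent)
        parent = SUBCLASS_OF.get(parent)
    return result


def annotate(text):
    """Attach ontology metadata to each known term found in text."""
    annotations = []
    for match in re.finditer(r"[A-Za-z]+", text):
        cls = LEXICON.get(match.group().lower())
        if cls is None:
            continue  # term is not covered by the toy lexicon
        annotations.append({
            "span": (match.start(), match.end()),  # character offsets in text
            "term": match.group(),
            "class": cls,
            "inferred": ancestors(cls),  # classes implied by the hierarchy
        })
    return annotations
```

With the inferred classes stored alongside each annotation, a query for "Device" could match a document that only mentions "notebook", which is the kind of retrieval gain the abstract attributes to ontology inference.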
De Carvalho Moura A.M., Extreme Data Laboratory DEXL |
Porto F., Extreme Data Laboratory DEXL |
Vidal V., Federal University of Ceará |
Magalhaes R.P., Federal University of Ceará |
And 3 more authors.
International Journal of Web Information Systems | Year: 2014
Purpose - The purpose of this paper is to present a four-level architecture that aims at integrating, publishing and retrieving ecological data making use of linked data (LD). It allows scientists to explore taxonomical, spatial and temporal ecological information, access trophic chain relations between species and complement this information with other data sets published on the Web of data. The development of ecological information repositories is a crucial step to organize and catalog natural reserves. However, such repositories face challenges in providing a shared and global view of biodiversity data, such as data heterogeneity, lack of metadata standardization and data interoperability. LD has emerged as an interesting technology to address some of these challenges. Design/methodology/approach - Ecological data, which are produced and collected from different media resources, are stored in distinct relational databases and published as RDF triples, using a relational-to-RDF (Resource Description Framework) mapping language. An application ontology reflects a global view of these data sets and shares the same vocabulary with them. Scientists specify their data views by selecting their objects of interest in a user-friendly way. A data view is internally represented as an algebraic scientific workflow that applies data transformation operations to integrate data sources. Findings - Despite years of investment, data integration continues to challenge scientists who need consolidated data views of a large number of heterogeneous scientific data sources. The semantic integration approach presented in this paper simplifies this process both in terms of mappings and query answering through data views. Social implications - This work provides knowledge about the Guanabara Bay ecosystem and serves as a source of answers about the anthropic and climatic impacts on the bay ecosystem. Additionally, this work will enable evaluating the adequacy of actions that are being taken to clean up Guanabara Bay, regarding the marine ecology. Originality/value - Mapping complexity is traded for the process of generating the exported ontology. The approach reduces the problem of integration to that of mappings between homogeneous ontologies. As a byproduct, data views are easily rewritten into queries over data sources. The architecture is general and, although applied to the ecological context, can be extended to other domains.
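The step of publishing relational data as RDF triples can be sketched as follows. This is a minimal illustration under stated assumptions, not the mapping language used in the paper: the `species` table, its made-up row, the `http://example.org/` namespace, and the helper `rows_to_triples` are all hypothetical, and triples are represented as plain Python tuples rather than through an RDF library.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for generated URIs


def rows_to_triples(conn, table, key, columns):
    """Map each row of a table to (subject, predicate, object) triples.

    The subject URI is built from the table name and the key value; each
    non-null column becomes one triple. Table and column names are trusted
    inputs here, since they are interpolated into the SQL text.
    """
    triples = []
    cursor = conn.execute(f"SELECT {key}, {', '.join(columns)} FROM {table}")
    for row in cursor:
        subject = f"{BASE}{table}/{row[0]}"
        for column, value in zip(columns, row[1:]):
            if value is not None:
                predicate = f"{BASE}{table}#{column}"
                triples.append((subject, predicate, str(value)))
    return triples


# Illustrative in-memory database with one made-up row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT, habitat TEXT)")
conn.execute("INSERT INTO species VALUES (1, 'Mugil liza', 'estuary')")
triples = rows_to_triples(conn, "species", "id", ["name", "habitat"])
```

Using one predicate per column keeps the mapping mechanical; aligning those predicates with a shared application ontology, as the abstract describes, would be a separate step not shown here.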