Su W.,BNUHKBU United International College |
Su W.,PKU HKUST Shenzhen Hong Kong Institution |
Wang J.,City University of Hong Kong |
Lochovsky F.H.,Hong Kong University of Science and Technology
IEEE Transactions on Knowledge and Data Engineering | Year: 2010
Record matching, which identifies the records that represent the same real-world entity, is an important step for data integration. Most state-of-the-art record matching methods are supervised and require the user to provide training data. These methods are not applicable to the Web database scenario, where the records to match are query results generated on the fly. Such records are query-dependent, and a prelearned method using training examples from previous query results may fail on the results of a new query. To address the problem of record matching in the Web database scenario, we present an unsupervised, online record matching method, UDD, which, for a given query, can effectively identify duplicates from the query result records of multiple Web databases. After removal of the same-source duplicates, the presumed nonduplicate records from the same source can be used as training examples, alleviating the burden on users of manually labeling training examples. Starting from the nonduplicate set, we use two cooperating classifiers, a weighted component similarity summing classifier and an SVM classifier, to iteratively identify duplicates in the query results from multiple Web databases. Experimental results show that UDD works well for the Web database scenario where existing supervised methods do not apply. © 2010 IEEE.
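The idea of a weighted component similarity summing classifier can be illustrated with a minimal Python sketch. This is not the UDD implementation: the field similarity measure (`difflib.SequenceMatcher`), the hand-set weights, the 0.85 threshold, and the function names are all assumptions chosen for illustration, whereas the abstract describes learning from the presumed nonduplicate set.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Similarity of two field values in [0, 1] (stand-in measure, an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def weighted_similarity(r1, r2, weights) -> float:
    """Sum the per-field similarities of two records, each scaled by its weight."""
    return sum(w * field_similarity(x, y) for w, x, y in zip(weights, r1, r2))


def is_duplicate(r1, r2, weights, threshold: float = 0.85) -> bool:
    """Call a record pair a duplicate when the weighted sum reaches the threshold."""
    return weighted_similarity(r1, r2, weights) >= threshold


# Hypothetical records with fields (title, author, year) and hand-set weights.
weights = (0.5, 0.3, 0.2)
rec_a = ("The Hobbit", "J.R.R. Tolkien", "1937")
rec_b = ("the hobbit", "J.R.R. Tolkien", "1937")
rec_c = ("Dune", "Frank Herbert", "1965")

print(is_duplicate(rec_a, rec_b, weights))
print(is_duplicate(rec_a, rec_c, weights))
```

In this sketch the weights sum to 1, so the combined score stays in [0, 1] and the threshold can be read as a fraction of a perfect match.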
Gong W.,University of Science and Technology of China |
Zhang Y.,University of Science and Technology of China |
Huang X.,University of Science and Technology of China |
Luan S.,University of Science and Technology of China |
Luan S.,PKU HKUST Shenzhen Hong Kong Institution
Atmospheric Environment | Year: 2013
Loss of ammonia (NH3) as a result of intensive N fertilization, especially due to agronomic practices in South China, is not well characterized. To investigate the mechanisms and characteristics of NH3 volatilization after urea application, an on-line monitoring system with 30-min data resolution was used to study vegetable and rice fields from January 2009 to September 2010. Ammonia emissions and concurrent meteorological conditions were monitored for up to 20 days after fertilization in 12 experiments. Standard recovery test results indicated that the on-line measurement system was both stable and accurate. The NH3 emission factors (EFs) for broadcast (soil surface) basal dressing and top dressing of Brassica rapa L. were 23.6% and 21.3%, respectively. The NH3 EFs for holing basal dressing and broadcast top dressing of lettuce were 17.6% and 24.0%, respectively. For early rice, the NH3 EFs were 10.7% and 14.2% in the parallel broadcast basal dressing processes and 24.0% and 22.6% in the parallel top dressing processes. For late rice, the NH3 EFs were 15.4% and 21.0% in the parallel broadcast basal dressing processes and 13.2% and 17.6% in the parallel top dressing processes. Emission of NH3 from vegetable and rice fields occurred mainly in the first 2-3 weeks after fertilization. Ammonia emission flux was positively correlated with air temperature and soil temperature in the majority of the experiments. Relationships between NH3 emissions and humidity, soil moisture, or wind speed were also explored but were not consistent across tests. Ammonia emission in vegetable and rice fields was primarily associated with temperature. High-resolution data, such as those gathered in the current investigation, will contribute to a more thorough quantitative understanding of the relationship between fertilizer application, environmental conditions, and NH3 volatilization, which, in turn, will improve the accuracy of atmospheric modeling on local, regional, and global scales. © 2012 Elsevier Ltd.
Huang X.-F.,University of Science and Technology of China |
Sun T.-L.,University of Science and Technology of China |
Zeng L.-W.,University of Science and Technology of China |
Yu G.-H.,PKU HKUST Shenzhen Hong Kong Institution |
And 2 more authors.
Atmospheric Environment | Year: 2012
Black carbon (BC) is the dominant light-absorbing aerosol component in the atmosphere and plays an important role in atmospheric pollution and climate change. The light-absorbing properties of BC depend on particle size, shape, and composition, as well as on the mixing state of BC with other aerosol components, so a more thorough exploration of BC aerosol characteristics is critical to understanding its atmospheric sources and effects. In this study, a newly developed Single Particle Soot Photometer (SP2) was deployed in Shenzhen, China, for continuous BC measurements to obtain information about the size distribution and mixing state of BC under the severe air pollution conditions in China. The mean BC mass concentrations were found to be 6.0 and 4.1 μg m-3 at an urban site (UT) in the fall and winter, respectively, while the concentration was much lower (2.6 μg m-3) at a rural site (BG) in the fall. The mass size distributions of BC in volume equivalent diameter (VED) at the three sites showed a similar lognormal pattern, with the peak diameter at BG (222 nm) slightly larger than at the UT (210 nm) site. As to mixing state, the average percentage of internally mixed BC at the UT site was 40% and 46% in the fall and winter, respectively, while that at the BG site in the fall was only slightly higher (47%), which implies that fresh local fossil fuel combustion was still significant at this rural site. The analysis of extremely high BC concentrations (>20 μg m-3) at UT indicates that they resulted from comparable contributions of local fresh emissions and regional transport under unfavorable meteorology. Other characteristics of BC aerosol and their influencing factors in Shenzhen are also discussed. © 2012 Elsevier Ltd.
Zhao Z.-Y.,PKU HKUST Shenzhen Hong Kong Institution |
Zhao Z.-Y.,University of Hong Kong |
Chu Y.-L.,PKU HKUST Shenzhen Hong Kong Institution |
Gu J.-D.,University of Hong Kong
Ecotoxicology | Year: 2012
The concentrations of total polycyclic aromatic hydrocarbons (∑PAHs) and the 16 US EPA priority individual PAH compounds were analyzed in surface sediments from the Mai Po Inner Deep Bay Ramsar Site of Hong Kong from December 2001 to June 2005 to investigate the spatial variability of anthropogenic pollutants. ∑PAHs concentrations ranged from 36.5 to 256.3 ng g-1 dry weight with an average of 148.9 ng g-1, comparable to other urbanized areas of the world, and there was little difference among the sampling times from December 2001 to June 2005. Based on comparison with the results of an earlier study, it appears that total PAHs concentration has decreased since 1992. Meanwhile, the concentrations of ∑PAHs were positively correlated with total organic carbon content except at sites F and G, suggesting that the characteristics of the sediment influence the distribution and concentration of PAHs. There was a relatively good relationship among the individual PAHs, and fluorene, phenanthrene, anthracene, fluoranthene, pyrene, benzo[a]pyrene and indeno[cd]pyrene each correlated well (r2>0.5) with total PAHs. Principal component analysis and specific PAH compound ratios (Phe/Ant vs. Flt/Pyr) indicate that pyrogenic origins, especially traffic exhausts, are the dominant sources of PAHs in the Mai Po Inner Deep Bay Nature Reserve. © Springer Science+Business Media, LLC 2012.
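The Phe/Ant vs. Flt/Pyr ratio test mentioned above can be sketched in a few lines of Python. The thresholds used here (Phe/Ant of 10 and Flt/Pyr of 1) are cutoffs commonly cited in the PAH source-apportionment literature, not values stated in this abstract, and the function name and sample concentrations are hypothetical.

```python
def classify_pah_source(phe: float, ant: float, flt: float, pyr: float) -> str:
    """Classify a sediment sample from two PAH diagnostic ratios.

    Inputs are concentrations (e.g. ng g-1 dry weight) of phenanthrene,
    anthracene, fluoranthene and pyrene. Thresholds are assumed, commonly
    cited cutoffs rather than values taken from the study.
    """
    phe_ant = phe / ant
    flt_pyr = flt / pyr
    if phe_ant < 10 and flt_pyr > 1:
        return "pyrogenic"   # combustion-derived signature
    if phe_ant > 10 and flt_pyr < 1:
        return "petrogenic"  # petroleum-derived signature
    return "mixed"           # the two ratios disagree or sit on a cutoff


# Hypothetical sample: Phe/Ant = 5.0, Flt/Pyr = 1.25
print(classify_pah_source(phe=30.0, ant=6.0, flt=40.0, pyr=32.0))
```

Samples where the two ratios point in different directions are reported as "mixed" in this sketch rather than forced into either class.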
Zhao Z.,PKU HKUST Shenzhen Hong Kong Institution |
Zhao Z.,University of Hong Kong |
Zhuang Y.-X.,PKU HKUST Shenzhen Hong Kong Institution |
Gu J.-D.,University of Hong Kong
Ecotoxicology | Year: 2012
The distribution and changes of polycyclic aromatic hydrocarbon (PAH) contamination in mangrove sediments of the Mai Po Inner Deep Bay Ramsar Site of Hong Kong SAR were investigated. Surface sediments (10 cm) collected from four sampling sites (SZ, SP, MF and M) exhibited significant spatial variations in concentrations of total PAHs (with ∑PAHs ranging from 161.7 to 383.7 ng g-1 dry wt), as well as in the composition of the 16 US EPA priority PAH compounds. The highest PAH concentrations were found in the mangrove sediments. Moreover, a sediment core extracted from the mangrove area was used to reconstruct the high-resolution depositional record of PAHs by 210Pb isotope analysis, showing that the amounts of PAHs remained relatively constant over the past 41 years. Urbanization of the Shenzhen Economic Zone and the rapid increase in vehicle numbers and energy consumption in the last two decades contributed to the PAHs detected in sediments. The source-diagnostic ratios indicated that pyrogenic inputs are important throughout the record and in the surface sediments, and suggest that diesel fuel combustion, and hence traffic of heavier vehicles, is the most probable cause of PAHs. © Springer Science+Business Media, LLC 2012.