Cornell, United States

Panda B.,Google | Riedewald M.,Northeastern University | Fink D.,Cornell Laboratory of Ornithology
Proceedings - International Conference on Data Engineering | Year: 2010

Modern science is collecting massive amounts of data from sensors, instruments, and through computer simulation. It is widely believed that analysis of this data will hold the key for future scientific breakthroughs. Unfortunately, deriving knowledge from large high-dimensional scientific datasets is difficult. One emerging answer is exploratory analysis using data mining; but data mining models that accurately capture natural processes tend to be very complex and are usually not intelligible. Scientists therefore generate model summaries to find the most important patterns learned by the model. We formalize the model-summary problem and introduce it as a novel problem to the database community. Generating model summaries creates serious data management challenges: Scientists usually want to analyze patterns in different "slices" and "dices" of the data space, comparing the effects of various input variables on the output. We propose novel techniques for efficiently generating such summaries for the popular class of tree-based models. Our techniques leverage workload structure on multiple levels. We also propose a scalable implementation of our techniques in MapReduce. For both sequential and parallel implementation, we achieve speedups of one or more orders of magnitude over the naive algorithm, while guaranteeing the exact same results. © 2010 IEEE.
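As an illustration only, one common kind of model summary for a tree-based model is a partial-dependence curve: fix one input variable at a grid value, leave the other inputs as observed, and average the model's predictions. The abstract does not say which summaries the paper computes, so the choice of partial dependence, the toy tree, and the names `TREE`, `predict`, and `partial_dependence` are all assumptions made for this sketch. It shows the straightforward nested-loop approach, whose cost grows with grid size times data size.

```python
from statistics import mean

# Hypothetical toy regression tree (not from the paper).
# Internal nodes: (feature_index, threshold, left_subtree, right_subtree).
# Leaves: float predictions.
TREE = (0, 5.0,
        (1, 2.0, 1.0, 3.0),
        (1, 2.0, 4.0, 8.0))


def predict(tree, x):
    """Route one point down the tree and return the leaf value."""
    while not isinstance(tree, float):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree


def partial_dependence(tree, data, feature, grid):
    """For each grid value, overwrite `feature` in every data point,
    predict, and average. Cost: len(grid) * len(data) predictions."""
    curve = []
    for value in grid:
        preds = []
        for row in data:
            modified = list(row)
            modified[feature] = value  # hold the chosen input fixed
            preds.append(predict(tree, modified))
        curve.append(mean(preds))
    return curve


# Four made-up observations with two input variables each.
data = [(1.0, 1.0), (2.0, 3.0), (6.0, 1.0), (7.0, 3.0)]
curve = partial_dependence(TREE, data, feature=0, grid=[0.0, 10.0])
print(curve)  # [2.0, 6.0]
```

Repeating this loop for many variables and many "slices" of the data space multiplies the work, which is the kind of workload where sharing computation across summaries could pay off.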


Morrison S.A.,California Chapter | Sillett T.S.,Smithsonian Conservation Biology Institute | Ghalambor C.K.,Colorado State University | Fitzpatrick J.W.,Cornell Laboratory of Ornithology | And 19 more authors.
BioScience | Year: 2011

Biodiversity conservation in an era of global change and scarce funding benefits from approaches that simultaneously solve multiple problems. Here, we discuss conservation management of the island scrub-jay (Aphelocoma insularis), the only island-endemic passerine species in the continental United States, which is currently restricted to 250-square-kilometer Santa Cruz Island, California. Although the species is not listed as threatened by state or federal agencies, its viability is nonetheless threatened on multiple fronts. We discuss management actions that could reduce extinction risk, including vaccination, captive propagation, biosecurity measures, and establishing a second free-living population on a neighboring island. Establishing a second population on Santa Rosa Island may have the added benefit of accelerating the restoration and enhancing the resilience of that island's currently highly degraded ecosystem. The proactive management framework for island scrub-jays presented here illustrates how strategies for species protection, ecosystem restoration, and adaptation to and mitigation of climate change can converge into an integrated solution. © 2011 by American Institute of Biological Sciences. All rights reserved.


Grant
Agency: Department of Defense | Branch: Air Force | Program: STTR | Phase: Phase I | Award Amount: 99.75K | Year: 2002

"Bird strikes and ingestion of birds into engines pose serious threats to aircraft during takeoff and landing operations at many air bases. AAC and Cornell propose to mitigate these threats by developing an acoustic bird monitoring system that provides both real-time snapshots and historical summaries of bird flight activity. This system would utilize a low cost, high gain array in association with acoustic Detection, Classification, and Localization (DCL) techniques designed to monitor bird vocalizations in potentially noisy environments. The distribution (map coordinates and altitude) and body masses of birds would be measured, and predictive models would be developed that relate these data to diurnal, seasonal, and meteorological factors. Alerts will be generated to help aircraft avoid problematic areas that are known or predicted to contain a critical mass of birds. This process will be achieved with a modest number of sensors and sensor sites, and must provide a high probability of detection while generating a very small number of false alarms. Acoustic DCL of birds at useful distances will be facilitated by the use of a multi-element Sparsely Populated Volumetric Array (SPVA). The SPVA uses interferometric processing to provide spatial gain, source localization, and cancellation of interfering sources. Underwater SPVA arrays are currently being deployed on Navy platforms for use in undersea warfare and marine mammal detection applications. Each SPVA system provides an accurate line of bearing. The intersection of lines of bearing from two or more SPVA systems can be used to map the bird's location in map coordinates and altitude. SPVAs can adaptively cancel high intensity noise sources, such as nearby aircraft or ground equipment, which might otherwise mask the bird signals of interest.
Outputs of the SPVA will be fed to two sets of detectors that will estimate signal parameters for several distinct classes of detected bird vocalizations and will identify manmade airport noises, including those from engines, spinning propellers


Van Horn G.,California Institute of Technology | Branson S.,California Institute of Technology | Farrell R.,BYU | Haber S.,Cornell Laboratory of Ornithology | And 3 more authors.
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | Year: 2015

We introduce tools and methodologies to collect high quality, large scale fine-grained computer vision datasets using citizen scientists - crowd annotators who are passionate and knowledgeable about specific domains such as birds or airplanes. We worked with citizen scientists and domain experts to collect NABirds, a new high quality dataset containing 48,562 images of North American birds with 555 categories, part annotations and bounding boxes. We find that citizen scientists are significantly more accurate than Mechanical Turkers at zero cost. We worked with bird experts to measure the quality of popular datasets like CUB-200-2011 and ImageNet and found class label error rates of at least 4%. Nevertheless, we found that learning algorithms are surprisingly robust to annotation errors and this level of training data corruption can lead to an acceptably small increase in test error if the training set has sufficient size. At the same time, we found that an expert-curated high quality test set like NABirds is necessary to accurately measure the performance of fine-grained computer vision systems. We used NABirds to train a publicly available bird recognition service deployed on the web site of the Cornell Lab of Ornithology. © 2015 IEEE.


Ross J.C.,Cornell Laboratory of Ornithology | Allen P.E.,Cornell Laboratory of Ornithology
Ecological Informatics | Year: 2014

Passive acoustic monitoring often leads to large quantities of sound data which are burdensome to process, such that the availability and cost of expert human analysts can be a bottleneck and make ecosystem or landscape-scale projects infeasible. This manuscript presents a method for rapidly analyzing the results of band-limited energy detectors, which are commonly used for the detection of passerine nocturnal flight calls, but which typically are beset by high false positive rates. We first manually classify a subset of the detected events as signals of interest or false detections. From that subset, we build a Random Forest model to eliminate most of the remaining events as false detections without further human inspection. The overall reduction in the labor required to separate signals of interest from false detections can be 80% or more. Additionally, we present an R package, flightcallr, containing functions which can be used to implement this new workflow. © 2013 Elsevier B.V.
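The label-a-subset, then auto-discard workflow described above can be sketched roughly as follows. This is not the flightcallr package (which is written in R) and not a full Random Forest: it uses bagged one-feature decision stumps as a simplified stand-in, and it assumes that larger feature values indicate a signal. The two numeric features per event, the vote cutoff of 0.2, and all function names are hypothetical.

```python
import random


def fit_stump(rows, labels, feature):
    """Choose the threshold on one feature with the fewest training errors.
    The stump votes "signal" when the feature value exceeds the threshold."""
    best_threshold, best_errors = None, None
    for candidate in sorted({row[feature] for row in rows}):
        errors = sum((row[feature] > candidate) != bool(label)
                     for row, label in zip(rows, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return feature, best_threshold


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging with one random feature per stump (stand-in for a Random Forest)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in picks],
                                [labels[i] for i in picks], feature))
    return forest


def signal_score(forest, row):
    """Fraction of stumps that vote "signal" for this event."""
    votes = sum(row[feature] > threshold for feature, threshold in forest)
    return votes / len(forest)


def triage(forest, events, cutoff=0.2):
    """Split events into (needs_human_review, auto_discarded) by vote fraction."""
    review, discarded = [], []
    for event in events:
        target = review if signal_score(forest, event) >= cutoff else discarded
        target.append(event)
    return review, discarded


# Step 1: a manually classified subset (1 = signal of interest, 0 = false detection).
labeled_rows = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
                (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
labels = [0, 0, 0, 1, 1, 1]

# Step 2: fit on the labeled subset, then triage the remaining detector events.
unlabeled = [(0.05, 0.1), (0.95, 0.95), (0.1, 0.05), (0.92, 0.97), (0.08, 0.02)]
forest = fit_forest(labeled_rows, labels)
review, discarded = triage(forest, unlabeled)
print(len(review), "events left for human review;", len(discarded), "discarded")
```

A low cutoff is a deliberate choice in this sketch: an event is discarded only when few stumps vote "signal", which favors keeping real calls at the cost of sending more events to review.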
