Chapel Hill, NC, United States

Snow K.Z.,University of North Carolina at Chapel Hill | Rogowski R.,University of North Carolina at Chapel Hill | Werner J.,Renaissance Computing Institute RENCI | Koo H.,State University of New York at Stony Brook | And 2 more authors.
Proceedings - 2016 IEEE Symposium on Security and Privacy, SP 2016 | Year: 2016

The concept of destructive code reads is a new defensive strategy that prevents code reuse attacks by coupling fine-grained address space layout randomization with a mitigation for online knowledge gathering that destroys potentially useful gadgets as they are disclosed by an adversary. The intuition is that by destroying code as it is read, an adversary is left with no usable gadgets to reuse in a control-flow hijacking attack. In this paper, we examine the security of this new mitigation. We show that while the concept initially appeared promising, there are several unforeseen attack tactics that render destructive code reads ineffective in practice. Specifically, we introduce techniques for leveraging constructive reloads, wherein multiple copies of native code are loaded into a process' address space (either side-by-side or one-after-another). Constructive reloads allow the adversary to disclose one code copy, destroying it in the process, then use another code copy for their code reuse payload. For situations where constructive reloads are not viable, we show that an alternative, and equally powerful, strategy exists: leveraging code association via implicit reads, which allows an adversary to undo in-place code randomization by inferring the layout of code that follows already disclosed bytes. As a result, the implicitly learned code is not destroyed, and can be used in the adversary's code reuse attack. We demonstrate the effectiveness of our techniques with concrete instantiations of these attacks against popular applications. In light of our successes, we argue that the code inference strategies presented herein paint a cautionary tale for defensive approaches whose security blindly rests on the perceived inability to undo the application of in-place randomization. © 2016 IEEE.
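
As a point of reference for the side-by-side loading idea, the sketch below (ours, not the paper's exploit code) only shows that two independent copies of the same native code can coexist at different addresses in one process, here using glibc's dlmopen; the library name and the use of dlmopen are illustrative assumptions, not necessarily the loading primitive the attacks rely on.

    /* Illustration only: two copies of the same native code mapped side by
     * side in one address space, the premise behind constructive reloads.
     * Requires glibc; link with -ldl on older toolchains. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        void *copy1 = dlopen("libm.so.6", RTLD_NOW | RTLD_LOCAL);
        /* dlmopen with LM_ID_NEWLM loads a second copy into a fresh
         * link-map namespace instead of reusing the first mapping. */
        void *copy2 = dlmopen(LM_ID_NEWLM, "libm.so.6", RTLD_NOW | RTLD_LOCAL);
        if (!copy1 || !copy2) {
            fprintf(stderr, "load failed: %s\n", dlerror());
            return 1;
        }
        printf("cos in copy 1: %p\n", dlsym(copy1, "cos"));
        printf("cos in copy 2: %p\n", dlsym(copy2, "cos"));
        /* The two addresses differ: disclosing (and thereby destroying) code
         * in one copy leaves the other copy usable. */
        return 0;
    }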


Ward J.H.,University of North Carolina at Chapel Hill | Xu H.,DICE Inc | Conway M.C.,DICE Inc | Russell T.G.,Renaissance Computing Institute RENCI | de Torcy A.,Renaissance Computing Institute RENCI
Communications in Computer and Information Science | Year: 2013

Developers of preservation repositories need to provide internal audit mechanisms to verify their assertions about how the recommendations outlined in the Open Archival Information System (OAIS) Reference Model are applied. They must also verify the consistent application of preservation policies to both the digital objects and the preservation system itself. We developed a method for mapping the OAIS Reference Model's Functional Model to a data grid implementation, which facilitates such tasks. We performed a preliminary gap analysis to determine the current state of computer task-oriented functions and procedures in support of preservation, and constructed a method for abstracting state transition systems from preservation policies. Our approach facilitates certifying properties of a preservation repository and bridges the gap between computer code and abstract preservation repository standards such as the OAIS Reference Model. © Springer International Publishing Switzerland 2013.
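
To give a flavor of what abstracting a state transition system from a preservation policy can look like, here is a deliberately toy sketch; the states, the checksum rule, and the workflow are hypothetical and are not drawn from the authors' data grid implementation.

    /* Hypothetical sketch: a preservation policy ("every object must pass a
     * checksum audit before it is archived") abstracted as a small state
     * transition system. States and transitions are illustrative only. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { INGESTED, AUDITED, ARCHIVED, REJECTED } State;

    static State step(State s, bool checksum_ok) {
        switch (s) {
        case INGESTED: return checksum_ok ? AUDITED : REJECTED;
        case AUDITED:  return ARCHIVED;   /* archiving allowed only after audit */
        default:       return s;          /* ARCHIVED and REJECTED are terminal */
        }
    }

    int main(void) {
        State s = INGESTED;
        s = step(s, true);   /* audit passes -> AUDITED  */
        s = step(s, true);   /*              -> ARCHIVED */
        printf("final state: %d\n", s);
        return 0;
    }

Properties of the repository (for example, "no object reaches ARCHIVED without passing through AUDITED") can then be checked against the transition system rather than against the code directly.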


Porterfield A.,Renaissance Computing Institute RENCI | Rountree B.,Lawrence Livermore National Laboratory | Fowler R.,Renaissance Computing Institute RENCI | Deb D.,University of North Carolina at Chapel Hill | And 2 more authors.
Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2015 - In conjunction with HPDC 2015 | Year: 2015

Exascale computing will be as much about solving problems with the least power/energy as about solving them quickly. In the future, systems are likely to be over-provisioned with processors and will rely on hardware-enforced power bounds to allow operation within power budget/thermal design limits. Variability in High Performance Computing (HPC) makes scheduling and application optimization difficult. For three HPC applications (ADCIRC, WRF, and LQCD), we show the effects of heterogeneity on run-to-run execution consistency, with and without a power limit applied. A 4% hardware run-to-run variation is seen in the case of a perfectly balanced compute-bound application, while up to 6% variation is observed for an application with minor load imbalances. A simple model based on Dynamic Duty Cycle Modulation (DDCM) is introduced. The model was implemented within the MPI profiling interface for collective operations and used with the three applications without source code changes. Energy savings of over 10% are obtained for ADCIRC with only a 1-3% slowdown. With a power limit in place, the energy saving is partially translated into performance improvement. With a power limit of 50 W, one version of the model executes 3% faster while saving 6% in energy, and a second version executes 1% faster while saving over 10% in energy. Copyright 2015 ACM.
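
The model is implemented inside the MPI profiling interface (PMPI), which is what lets it wrap collective operations without source changes. The fragment below sketches that interposition pattern only; set_duty_cycle() and its 0.5/1.0 settings are hypothetical placeholders for the actual DDCM mechanism and policy described in the paper.

    /* Sketch of a PMPI wrapper in the spirit of the paper's runtime: slow a
     * core down while it sits in a collective, then restore full speed.
     * set_duty_cycle() is a hypothetical stand-in for DDCM control. */
    #include <mpi.h>

    static void set_duty_cycle(double fraction) {
        /* Placeholder: a real DDCM runtime would program the per-core
         * clock-modulation hardware here; 1.0 means full speed. */
        (void)fraction;
    }

    int MPI_Allreduce(const void *sendbuf, void *recvbuf, int count,
                      MPI_Datatype datatype, MPI_Op op, MPI_Comm comm) {
        set_duty_cycle(0.5);   /* assumed policy: wait in the collective at reduced duty cycle */
        int rc = PMPI_Allreduce(sendbuf, recvbuf, count, datatype, op, comm);
        set_duty_cycle(1.0);   /* resume full speed for computation */
        return rc;
    }

A real implementation would choose the modulation level per call, for example from measured slack at the collective, rather than applying a fixed setting.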


Kim N.,Gwangju Institute of Science and Technology | Kim J.,Gwangju Institute of Science and Technology | Heermann C.,Renaissance Computing Institute RENCI | Baldine I.,Renaissance Computing Institute RENCI
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering | Year: 2012

Large research programs have been launched to address both the development of Future Internet architectures and suitable experimental facilities for testing them. Recent research activities on experimental facilities aim to share resources across organizational boundaries. This paper introduces an international cooperation effort on interconnecting the network substrates of two Future Internet testbed projects, FIRST@PC (Future Internet Research on Sustainable Testbed based on PC) in Korea and ORCA-BEN in the United States. To build a collaborative research infrastructure available to both projects, we first present how the two network substrates are interconnected. We then present how experiments are supported on the interconnected substrate and report the results of a demonstration performed on it. © 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering.


Heard J.,Renaissance Computing Institute RENCI | Thakur S.,Renaissance Computing Institute RENCI | Losego J.,University of North Carolina at Chapel Hill | Galluppi K.,Arizona State University
Computer Supported Cooperative Work: CSCW: An International Journal | Year: 2014

Collaborative technologies for information sharing are an invaluable resource for emergency managers to respond to and manage highly dynamic events such as natural disasters and other emergencies. However, many standard collaboration tools can be limited either because they provide passive presentation and dissemination of information, or because they are targeted towards highly specific usage scenarios that require considerable training to use. We present a real-time gather-and-share system called "Big Board" that facilitates collaboration over maps. The Big Board is an open-source, web-based, real-time visual collaborative environment that runs on all modern web browsers and uses open web standards developed by the Open Geospatial Consortium (OGC) and the World Wide Web Consortium (W3C). An evaluation of the Big Board was conducted by school representatives in North Carolina for use in situational understanding for school closure decisions during winter weather events. The decision to close schools has major societal impacts and is one that is usually made based on how well a teenage driver could handle wintry precipitation on a road. Collecting information on road conditions is especially critical; however, gathering and sharing this information within a county can be difficult. Participants in the study found the Big Board intuitive and useful for sharing real-time information, such as road conditions and temperatures, leading up to and during a winter storm scenario. We have adapted the Big Board to manage risks and hazards during other types of emergencies such as tropical storm conditions. © 2013 The Author(s).


Olivier S.L.,University of North Carolina at Chapel Hill | Porterfield A.K.,Renaissance Computing Institute RENCI | Wheeler K.B.,Sandia National Laboratories | Spiegel M.,Renaissance Computing Institute RENCI | Prins J.F.,University of North Carolina at Chapel Hill
International Journal of High Performance Computing Applications | Year: 2012

The recent addition of task parallelism to the OpenMP shared memory API allows programmers to express concurrency at a high level of abstraction and places the burden of scheduling parallel execution on the OpenMP run-time system. Efficient scheduling of tasks on modern multi-socket multicore shared memory systems requires careful consideration of an increasingly complex memory hierarchy, including shared caches and non-uniform memory access (NUMA) characteristics. In order to evaluate scheduling strategies, we extended the open-source Qthreads threading library to implement different scheduler designs, accepting OpenMP programs through the ROSE compiler. Our comprehensive performance study of diverse OpenMP task-parallel benchmarks compares seven different task-parallel run-time scheduler implementations on an Intel Nehalem multi-socket multicore system: our proposed hierarchical work-stealing scheduler, a per-core work-stealing scheduler, a centralized scheduler, and LIFO and FIFO versions of the Qthreads round-robin scheduler. In addition, we compare our results against the Intel and GNU OpenMP implementations. Our hierarchical scheduling strategy leverages different scheduling methods at different levels of the hierarchy. By allowing one thread to steal work on behalf of all of the threads within a single chip that share a cache, the scheduler limits the number of costly remote steals. For cores on the same chip, a shared LIFO queue allows exploitation of cache locality between sibling tasks as well as between a parent task and its newly created child tasks. In the performance evaluation, our Qthreads hierarchical scheduler is competitive on all benchmarks tested. On five of the seven benchmarks, it demonstrates speedup and absolute performance superior to both the Intel and GNU OpenMP run-time systems. Our run-time also demonstrates similar performance benefits on AMD Magny-Cours and SGI Altix systems, enabling several benchmarks to successfully scale to 192 CPUs of an SGI Altix. © The Author(s) 2012.
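
The two-level design can be pictured with a small, self-contained sketch (ours, not the Qthreads scheduler): every chip owns one LIFO queue shared by its cores, and a single representative thread performs batched cross-chip steals on behalf of its siblings. All names, sizes, and locking choices are illustrative.

    /* Illustrative two-level task queue, not the Qthreads implementation. */
    #include <pthread.h>
    #include <string.h>

    #define MAX_TASKS 1024

    typedef void (*task_fn)(void *);
    typedef struct { task_fn fn; void *arg; } Task;

    typedef struct {
        pthread_mutex_t lock;   /* assumed initialized with PTHREAD_MUTEX_INITIALIZER */
        Task tasks[MAX_TASKS];
        int top;                /* LIFO end: local pushes and pops happen here */
    } ChipQueue;

    /* Any core on the chip pushes and pops at the LIFO end, which keeps a
     * parent task and its freshly created children hot in the shared cache. */
    static int chip_push(ChipQueue *q, Task t) {
        pthread_mutex_lock(&q->lock);
        int ok = (q->top < MAX_TASKS);
        if (ok) q->tasks[q->top++] = t;
        pthread_mutex_unlock(&q->lock);
        return ok;
    }

    static int chip_pop(ChipQueue *q, Task *out) {
        pthread_mutex_lock(&q->lock);
        int ok = (q->top > 0);
        if (ok) *out = q->tasks[--q->top];
        pthread_mutex_unlock(&q->lock);
        return ok;
    }

    /* Cross-chip steal, executed by a single representative thread: grab a
     * batch of the oldest tasks from the victim chip and deposit them in the
     * local queue, so one costly remote steal feeds every sibling core. */
    static int chip_steal(ChipQueue *victim, ChipQueue *local, int batch) {
        Task buf[MAX_TASKS];
        pthread_mutex_lock(&victim->lock);
        int n = (victim->top < batch) ? victim->top : batch;
        memcpy(buf, victim->tasks, n * sizeof(Task));  /* oldest tasks first */
        memmove(victim->tasks, victim->tasks + n, (victim->top - n) * sizeof(Task));
        victim->top -= n;
        pthread_mutex_unlock(&victim->lock);
        for (int i = 0; i < n; i++)
            chip_push(local, buf[i]);
        return n;
    }

In a worker loop, a thread would first try chip_pop on its own chip's queue and fall back to chip_steal only when the local queue is empty and it is the chip's designated thief.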


Olivier S.L.,University of North Carolina at Chapel Hill | Porterfield A.K.,Renaissance Computing Institute RENCI | Wheeler K.B.,Sandia National Laboratories | Prins J.F.,University of North Carolina at Chapel Hill
Proceedings of the 1st International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2011 | Year: 2011

The recent addition of task parallelism to the OpenMP shared memory API allows programmers to express concurrency at a high level of abstraction and places the burden of scheduling parallel execution on the OpenMP run-time system. This is a welcome development for scientific computing as supercomputer nodes grow "fatter" with multicore and manycore processors. But efficient scheduling of tasks on modern multi-socket multicore shared memory systems requires careful consideration of an increasingly complex memory hierarchy, including shared caches and NUMA characteristics. In this paper, we propose a hierarchical scheduling strategy that leverages different methods at different levels of the hierarchy. By allowing one thread to steal work on behalf of all of the threads within a single chip that share a cache, our scheduler limits the number of costly remote steals. For cores on the same chip, a shared LIFO queue allows exploitation of cache locality between sibling tasks as well as between a parent task and its newly created child tasks. We extended the open-source Qthreads threading library to implement our scheduler, accepting OpenMP programs through the ROSE compiler. We also present a comprehensive performance study of diverse OpenMP task-parallel benchmarks, comparing seven different task-parallel run-time scheduler implementations on current-generation multi-socket multicore systems: our hierarchical work-stealing scheduler, a fully-distributed work-stealing scheduler, a centralized scheduler, and LIFO and FIFO versions of the original Qthreads fully-distributed scheduler. In addition, we compare our results against OpenMP implementations from Intel and GCC. Hierarchical scheduling in Qthreads is competitive on all benchmarks. On several benchmarks, hierarchical scheduling in Qthreads demonstrates speedup and absolute performance superior to both the Intel and GCC OpenMP run-time systems. © 2011 ACM.
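
For readers unfamiliar with the OpenMP tasking constructs both of these papers target, the fragment below shows the general shape of a task-parallel program whose scheduling is at stake; it is a generic textbook-style example, not one of the paper's benchmarks.

    /* Generic OpenMP task example (not a benchmark from the paper):
     * recursive Fibonacci, where every call spawns child tasks and the
     * run-time scheduler places them on threads. Compile with -fopenmp. */
    #include <omp.h>
    #include <stdio.h>

    static long fib(int n) {
        if (n < 2) return n;
        long x, y;
        #pragma omp task shared(x) firstprivate(n)
        x = fib(n - 1);
        #pragma omp task shared(y) firstprivate(n)
        y = fib(n - 2);
        #pragma omp taskwait        /* wait for both child tasks */
        return x + y;
    }

    int main(void) {
        long result;
        #pragma omp parallel
        #pragma omp single          /* one thread creates the root of the task tree */
        result = fib(30);
        printf("fib(30) = %ld\n", result);
        return 0;
    }

Each call spawns two child tasks, and it is entirely up to the run-time scheduler, not the programmer, which thread executes each one; that freedom is exactly what the schedulers compared above exercise differently.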
