IMDEA Software Institute

News Article | May 22, 2017
Site: www.eurekalert.org

By analyzing network traffic going to suspicious domains, security administrators could detect malware infections weeks or even months before they're able to capture a sample of the invading malware, a new study suggests. The findings point toward the need for new malware-independent detection strategies that will give network defenders the ability to identify network security breaches in a more timely manner.

The strategy would take advantage of the fact that malware invaders need to communicate with their command and control computers, creating network traffic that can be detected and analyzed. Having an earlier warning of developing malware infections could enable quicker responses and potentially reduce the impact of attacks, the study's researchers say.

"Our study shows that by the time you find the malware, it's already too late because the network communications and domain names used by the malware were active weeks or even months before the actual malware was discovered," said Manos Antonakakis, an assistant professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology. "These findings show that we need to fundamentally change the way we think about network defense."

Traditional defenses depend on the detection of malware in a network. While analyzing malware samples can identify suspicious domains and help attribute network attacks to their sources, relying on samples to drive defensive actions gives malicious actors a critical time advantage to gather information and cause damage. "What we need to do is minimize the amount of time between the compromise and the detection event," Antonakakis added.

The research, which will be presented May 24 at the 38th IEEE Security and Privacy Symposium in San Jose, California, was supported by the U.S. Department of Commerce, the National Science Foundation, the Air Force Research Laboratory and the Defense Advanced Research Projects Agency. The project was done in collaboration with EURECOM in France and the IMDEA Software Institute in Spain, whose work was supported by the regional government of Madrid and the government of Spain.

In the study, Antonakakis, Graduate Research Assistant Chaz Lever and colleagues analyzed more than five billion network events from nearly five years of network traffic carried by a major U.S. internet service provider (ISP). They also studied domain name system (DNS) requests made by nearly 27 million malware samples, and examined the timing for the re-registration of expired domains, which often provide the launch sites for malware attacks.

"There were certain networks that were more prone to abuse, so looking for traffic into those hot spot networks was potentially a good indicator of abuse underway," said Lever, the first author of the paper and a student in Georgia Tech's School of Electrical and Computer Engineering. "If you see a lot of DNS requests pointing to hot spots of abuse, that should raise concerns about potential infections."

The researchers also found that requests for dynamic DNS services related to bad activity, as these often correlate with services used by bad actors because they provide free domain registrations and the ability to quickly add domains.

The researchers had hoped that the registration of previously expired domain names might provide a warning of impending attacks. But Lever found there was often a lag of months between when expired domains were re-registered and attacks from them began.
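The hot-spot observation lends itself to a simple monitoring rule. The sketch below only illustrates that idea and is not the detection system built in the study; the event format, the prefix watchlist, and the alert threshold are invented for the example (the prefixes are IETF documentation ranges used as placeholders).

```typescript
// Illustrative only: flag internal hosts whose DNS lookups keep resolving
// into network prefixes with a history of abuse ("hot spots").

interface DnsEvent {
  timestamp: Date;
  client: string;     // internal host that issued the query
  qname: string;      // queried domain name
  answers: string[];  // resolved IPv4 addresses
}

// Toy watchlist of abuse-prone prefixes (documentation ranges as placeholders).
const hotSpotPrefixes = ["203.0.113.", "198.51.100."];

const inHotSpot = (ip: string): boolean =>
  hotSpotPrefixes.some((p) => ip.startsWith(p));

// Return clients whose queries resolved into hot-spot space at least
// `threshold` times; these would be candidates for early investigation.
function flagSuspiciousClients(events: DnsEvent[], threshold = 5): string[] {
  const counts = new Map<string, number>();
  for (const e of events) {
    if (e.answers.some(inHotSpot)) {
      counts.set(e.client, (counts.get(e.client) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .filter(([, n]) => n >= threshold)
    .map(([client]) => client);
}
```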
The research required development of a filtering system to separate benign network traffic from malicious traffic in the ISP data. The researchers also conducted what they believe is the largest malware classification effort to date to differentiate the malicious software from potentially unwanted programs (PUPs). To study similarities, they assigned the malware to specific "families."

By studying malware-related network traffic seen by the ISPs prior to detection of the malware, the researchers were able to determine that malware signals were present weeks and even months before new malicious software was found. Relating that to human health, Antonakakis compares the network signals to the fever or general feeling of malaise that often precedes identification of the microorganism responsible for an infection.

"You know you are sick when you have a fever, before you know exactly what's causing it," he said. "The first thing the adversary does is set up a presence on the internet, and that first signal can indicate an infection. We should try to observe that symptom first on the network because if we wait to see the malware sample, we are almost certainly allowing a major infection to develop."

In all, the researchers found more than 300,000 malware domains that were active for at least two weeks before the corresponding malware samples were identified and analyzed.

But as with human health, detecting a change indicating infection requires knowledge of the baseline activity, he said. Network administrators must have information about normal network traffic so they can detect the abnormalities that may signal a developing attack. While many aspects of an attack can be hidden, malware must always communicate back to those who sent it.

"If you have the ability to detect traffic in a network, regardless of how the malware may have gotten in, the action of communicating through the network will be observable," Antonakakis said. "Network administrators should minimize the unknowns in their networks and classify their appropriate communications as much as possible so they can see the bad activity when it happens."

Antonakakis and Lever hope their study will lead to development of new strategies for defending computer networks. "The choke point is the network traffic, and that's where this battle should be fought," said Antonakakis. "This study provides a fundamental observation of how the next generation of defense mechanisms should be designed. As more complicated attacks come into being, we will have to become smarter at detecting them earlier."

In addition to those already mentioned, the study included Davide Balzarotti from EURECOM, and Platon Kotzias and Juan Caballero from the IMDEA Software Institute.

This material is based upon work supported in part by the U.S. Department of Commerce grant 2106DEK, National Science Foundation (NSF) grant 2106DGX and Air Force Research Laboratory/Defense Advanced Research Projects Agency grant 2106DTX. This research was also partially supported by the Regional Government of Madrid through the N-GREENS Software-CM S2013/ICE-2731 project and by the Spanish Government through the DEDETIS grant TIN2015-7013-R. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Department of Commerce, National Science Foundation, Air Force Research Laboratory, or Defense Advanced Research Projects Agency.
CITATION: Chaz Lever, et al., "A Lustrum of Malware Network Communication: Evolution and Insights," (38th IEEE Security and Privacy Symposium, 2017).


Swamy N., Microsoft | Fournet C., Microsoft | Rastogi A., University of Maryland College Park | Bhargavan K., French Institute for Research in Computer Science and Automation | And 3 more authors.
Conference Record of the Annual ACM Symposium on Principles of Programming Languages | Year: 2014

JavaScript's flexible semantics makes writing correct code hard and writing secure code extremely difficult. To address the former problem, various forms of gradual typing have been proposed, such as Closure and TypeScript. However, supporting all common programming idioms is not easy; for example, TypeScript deliberately gives up type soundness for programming convenience. In this paper, we propose a gradual type system and implementation techniques that provide important safety and security guarantees. We present TS*, a gradual type system and source-to-source compiler for JavaScript. In contrast to prior gradual type systems, TS* features full runtime reflection over three kinds of types: (1) simple types for higher-order functions, recursive datatypes and dictionary-based extensible records; (2) the type any, for dynamically type-safe TS* expressions; and (3) the type un, for untrusted, potentially malicious JavaScript contexts in which TS* is embedded. After type-checking, the compiler instruments the program with various checks to ensure the type safety of TS* despite its interactions with arbitrary JavaScript contexts, which are free to use eval, stack walks, prototype customizations, and other offensive features. The proof of our main theorem employs a form of type-preserving compilation, wherein we prove all the runtime invariants of the translation of TS* to JavaScript by showing that translated programs are well-typed in JS*, a previously proposed dependently typed language for proving functional correctness of JavaScript programs. We describe a prototype compiler, a secure runtime, and sample applications for TS*. Our examples illustrate how web security patterns that developers currently program in JavaScript (with much difficulty and still with dubious results) can instead be programmed naturally in TS*, retaining a flavor of idiomatic JavaScript, while providing strong safety guarantees by virtue of typing. © 2014 ACM.
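The soundness gap mentioned above can be seen in a few lines of plain TypeScript (not TS*); this is a standard illustration, not an example taken from the paper: array covariance lets code type-check and still fail at runtime.

```typescript
// Plain TypeScript, compiles without errors under the standard checker.
class Animal { name = "generic animal"; }
class Dog extends Animal { bark(): string { return "woof"; } }

const dogs: Dog[] = [new Dog()];
const animals: Animal[] = dogs;  // accepted: arrays are treated covariantly
animals.push(new Animal());      // statically fine, but `dogs` now holds a plain Animal

// The checker believes dogs[1] is a Dog, yet the call fails at runtime:
dogs[1].bark();                  // TypeError: dogs[1].bark is not a function
```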


Burckhardt S., Microsoft | Gotsman A., IMDEA Software Institute | Yang H., University of Oxford | Zawirski M., French Institute for Research in Computer Science and Automation
Conference Record of the Annual ACM Symposium on Principles of Programming Languages | Year: 2014

Geographically distributed systems often rely on replicated eventually consistent data stores to achieve availability and performance. To resolve conflicting updates at different replicas, researchers and practitioners have proposed specialized consistency protocols, called replicated data types, that implement objects such as registers, counters, sets or lists. Reasoning about replicated data types has however not been on par with comparable work on abstract data types and concurrent data types, lacking specifications, correctness proofs, and optimality results. To fill in this gap, we propose a framework for specifying replicated data types using relations over events and verifying their implementations using replication-aware simulations. We apply it to 7 existing implementations of 4 data types with nontrivial conflict-resolution strategies and optimizations (last-writer-wins register, counter, multi-value register and observed-remove set). We also present a novel technique for obtaining lower bounds on the worst-case space overhead of data type implementations and use it to prove optimality of 4 implementations. Finally, we show how to specify consistency of replicated stores with multiple objects axiomatically, in analogy to prior work on weak memory models. Overall, our work provides foundational reasoning tools to support research on replicated eventually consistent stores. © 2014 ACM.
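For a flavor of the kind of object the framework specifies, here is a minimal last-writer-wins register sketch; the timestamp scheme and method names are illustrative assumptions, not the paper's formal model or any of the seven verified implementations.

```typescript
// Minimal last-writer-wins (LWW) register sketch. Each write carries a
// (logical clock, replica id) stamp; merge keeps the later write, with
// ties broken by replica id so all replicas converge to the same value.
type Stamp = { clock: number; replica: string };

const later = (a: Stamp, b: Stamp): boolean =>
  a.clock !== b.clock ? a.clock > b.clock : a.replica > b.replica;

class LwwRegister<T> {
  constructor(private value: T, private stamp: Stamp) {}

  read(): T {
    return this.value;
  }

  // Local update: advance the logical clock and record the writing replica.
  write(value: T, replica: string): void {
    this.value = value;
    this.stamp = { clock: this.stamp.clock + 1, replica };
  }

  // Conflict resolution when replicas exchange state.
  merge(other: LwwRegister<T>): void {
    if (later(other.stamp, this.stamp)) {
      this.value = other.value;
      this.stamp = other.stamp;
    }
  }
}
```

Merging in either direction picks the same winner, which is the convergence property that conflict resolution relies on.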


Barthe G., IMDEA Software Institute | Fournet C., Microsoft | Grégoire B., French Institute for Research in Computer Science and Automation | Strub P.-Y., IMDEA Software Institute | And 2 more authors.
Conference Record of the Annual ACM Symposium on Principles of Programming Languages | Year: 2014

Relational program logics have been used for mechanizing formal proofs of various cryptographic constructions. With an eye towards scaling these successes towards end-to-end security proofs for implementations of distributed systems, we present RF*, a relational extension of F*, a general-purpose higher-order stateful programming language with a verification system based on refinement types. The distinguishing feature of RF* is a relational Hoare logic for a higher-order, stateful, probabilistic language. Through careful language design, we adapt the F* typechecker to generate both classic and relational verification conditions, and to automatically discharge their proofs using an SMT solver. Thus, we are able to benefit from the existing features of F*, including its abstraction facilities for modular reasoning about program fragments. We evaluate RF* experimentally by programming a series of cryptographic constructions and protocols, and by verifying their security properties, ranging from information flow to unlinkability, integrity, and privacy. Moreover, we validate the design of RF* by formalizing in Coq a core probabilistic λ-calculus and a relational refinement type system and proving the soundness of the latter against a denotational semantics of the probabilistic λ-calculus. © 2014 ACM.
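For orientation, relational judgments have roughly the following shape; this is a schematic rendering, not RF*'s concrete syntax.

```latex
% Two runs of (possibly different) programs related by a precondition \Phi
% and a postcondition \Psi over pairs of states:
\[
  \vdash c_1 \sim c_2 \;:\; \Phi \Longrightarrow \Psi
\]
% Information-flow properties arise as the diagonal case: relate a program
% to itself, with \Phi = \Psi = ``agreement on public (low) data'':
\[
  \vdash c \sim c \;:\; =_{\mathrm{low}} \;\Longrightarrow\; =_{\mathrm{low}}
\]
```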


Barthe G., IMDEA Software Institute | Kunz C., IMDEA Madrid Institute for Advanced Studies
ACM Transactions on Programming Languages and Systems | Year: 2011

A certificate is a mathematical object that can be used to establish that a piece of mobile code satisfies some security policy. In general, certificates cannot be generated automatically. There is thus an interest in developing methods to reuse certificates generated for source code to provide strong guarantees of the correctness of the compiled code. Certificate translation is a method to transform certificates of program correctness along semantically justified program transformations. These methods have been developed in previous work, but they were strongly dependent on particular programming and verification settings. This article provides a more general development in the setting of abstract interpretation, showing the scalability of certificate translation. © 2011 ACM.
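Schematically, and only as an informal sketch rather than the article's formalism, certificate translation maps a certificate along a program transformation T, with T^♯ the corresponding transformation of the policy:

```latex
% If c certifies that the source program p satisfies the policy \varphi,
% certificate translation produces a certificate for the transformed
% program T(p) and the correspondingly transformed policy:
\[
  c \;:\; p \models \varphi
  \quad\longmapsto\quad
  \hat{T}(c) \;:\; T(p) \models T^{\sharp}(\varphi)
\]
```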


Attiya H., Technion - Israel Institute of Technology | Gotsman A., IMDEA Software Institute | Hans S., Technion - Israel Institute of Technology | Rinetzky N., Tel Aviv University
Proceedings of the Annual ACM Symposium on Principles of Distributed Computing | Year: 2013

Transactional memory (TM) has been hailed as a paradigm for simplifying concurrent programming. While several consistency conditions have been suggested for TM, they fall short of formalizing the intuitive semantics of atomic blocks, the interface through which a TM is used in a programming language. To close this gap, we formalize the intuitive expectations of a programmer as observational refinement between TM implementations: a concrete TM observationally refines an abstract one if every user-observable behavior of a program using the former can be reproduced if the program uses the latter. This allows the programmer to reason about the behavior of a program using the intuitive semantics formalized by the abstract TM; the observational refinement relation implies that the conclusions will carry over to the case when the program uses the concrete TM. We show that, for a particular programming language and notions of observable behavior, a variant of the well-known consistency condition of opacity is sufficient for observational refinement, and its restriction to complete histories is furthermore necessary. Our results suggest a new approach to evaluating and comparing TM consistency conditions. They can also reduce the effort of proving that a TM implements its programming language interface correctly, by only requiring its developer to show that it satisfies the corresponding consistency condition. Copyright 2013 ACM.
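The refinement relation described informally above can be written compactly as follows; here obs(P[T]) denotes the user-observable behaviors of a client program P linked with TM implementation T, a notational choice made for this sketch rather than taken from the paper.

```latex
% A concrete TM C observationally refines an abstract TM A when every
% observable behavior of any client program P using C can be reproduced
% when P instead uses A:
\[
  C \sqsubseteq A \;\iff\; \forall P.\;\; \mathrm{obs}\bigl(P[C]\bigr) \;\subseteq\; \mathrm{obs}\bigl(P[A]\bigr)
\]
```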


Gotsman A., IMDEA Software Institute | Yang H., University of Oxford
Proceedings of the ACM SIGPLAN International Conference on Functional Programming, ICFP | Year: 2011

Most major OS kernels today run on multiprocessor systems and are preemptive: it is possible for a process running in the kernel mode to get descheduled. Existing modular techniques for verifying concurrent code are not directly applicable in this setting: they rely on scheduling being implemented correctly, and in a preemptive kernel, the correctness of the scheduler is interdependent with the correctness of the code it schedules. This interdependency is even stronger in mainstream kernels, such as Linux, FreeBSD or XNU, where the scheduler and processes interact in complex ways. We propose the first logic that is able to decompose the verification of preemptive multiprocessor kernel code into verifying the scheduler and the rest of the kernel separately, even in the presence of complex interdependencies between the two components. The logic hides the manipulation of control by the scheduler when reasoning about preemptable code and soundly inherits proof rules from concurrent separation logic to verify it thread-modularly. This is achieved by establishing a novel form of refinement between an operational semantics of the real machine and an axiomatic semantics of OS processes, where the latter assumes an abstract machine with each process executing on a separate virtual CPU. The refinement is local in the sense that the logic focuses only on the relevant state of the kernel while verifying the scheduler. We illustrate the power of our logic by verifying an example scheduler, modelled on the one from Linux 2.6.11. Copyright © 2011 ACM.


Barthe G., IMDEA Software Institute | Köpf B., IMDEA Software Institute
Proceedings - IEEE Computer Security Foundations Symposium | Year: 2011

There are two active and independent lines of research that aim at quantifying the amount of information that is disclosed by computing on confidential data. Each line of research has developed its own notion of confidentiality: on the one hand, differential privacy is the emerging consensus guarantee used for privacy-preserving data analysis. On the other hand, information-theoretic notions of leakage are used for characterizing the confidentiality properties of programs in language-based settings. The purpose of this article is to establish formal connections between both notions of confidentiality, and to compare them in terms of the security guarantees they deliver. We obtain the following results. First, we establish upper bounds for the leakage of every ε-differentially private mechanism in terms of ε and the size of the mechanism's input domain. We achieve this by identifying and leveraging a connection to coding theory. Second, we construct a class of ε-differentially private channels whose leakage grows with the size of their input domains. Using these channels, we show that there cannot be domain-size-independent bounds for the leakage of all ε-differentially private mechanisms. Moreover, we perform an empirical evaluation that shows that the leakage of these channels almost matches our theoretical upper bounds, demonstrating the accuracy of these bounds. Finally, we show that the question of providing optimal upper bounds for the leakage of ε-differentially private mechanisms in terms of rational functions of ε is in fact decidable. © 2011 IEEE.
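For orientation, the two guarantees being compared have standard formulations, reproduced here in their common textbook forms; the paper's exact definitions may differ in presentation.

```latex
% \varepsilon-differential privacy: for all adjacent inputs d, d' and every
% set S of outputs of the mechanism K,
\[
  \Pr[K(d) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[K(d') \in S].
\]
% Min-entropy leakage of a channel from secrets s to observations o,
% under a uniform prior on the secrets:
\[
  \mathcal{L} \;=\; \log_2 \sum_{o} \max_{s}\, \Pr[o \mid s].
\]
```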
