Publications

45 Results
Skip to search filters

Toward the analysis of embedded firmware through automated re-hosting

RAID 2019 Proceedings - 22nd International Symposium on Research in Attacks, Intrusions and Defenses

Gustafson, Eric D.; Muench, Marius; Spensky, Chad; Redini, Nilo; Machiry, Aravind; Fratantonio, Yanick; Francillon, Aurélien; Balzarotti, Davide; Choe, Yung R.; Kruegel, Christopher; Vigna, Giovanni

The recent paradigm shift introduced by the Internet of Things (IoT) has brought embedded systems into focus as a target for both security analysts and malicious adversaries. Typified by their lack of standardized hardware, diverse software, and opaque functionality, IoT devices present unique challenges to security analysts due to the tight coupling between their firmware and the hardware for which it was designed. In order to take advantage of modern program analysis techniques, such as fuzzing or symbolic execution, with any kind of scale or depth, analysts must have the ability to execute firmware code in emulated (or virtualized) environments. However, these emulation environments are rarely available and are cumbersome to create through manual reverse engineering, greatly limiting the analysis of binary firmware. In this work, we explore the problem of firmware re-hosting, the process by which firmware is migrated from its original hardware environment into a virtualized one. We show that an approach capable of creating virtual, interactive environments in an automated manner is a necessity to enable firmware analysis at scale. We present the first proof-of-concept system aiming to achieve this goal, called PRETENDER, which uses observations of the interactions between the original hardware and the firmware to automatically create models of peripherals, and allows for the execution of the firmware in a fully-emulated environment. Unlike previous approaches, these models are interactive, stateful, and transferable, meaning they are designed to allow the program to receive and process new input, a requirement of many analyses. We demonstrate our approach on multiple hardware platforms and firmware samples, and show that the models are flexible enough to allow for virtualized code execution, the exploration of new code paths, and the identification of security vulnerabilities.

More Details

Using loops for malware classification resilient to feature-unaware perturbations

ACM International Conference Proceeding Series

Machiry, Aravind; Fratantonio, Yanick; Redini, Nilo; Choe, Yung R.; Vigna, Giovanni; Gustafson, Eric D.; Kruegel, Christopher

In the past few years, both the industry and the academic communities have developed several approaches to detect malicious Android apps. State-of-the-art research approaches achieve very high accuracy when performing malware detection on existing datasets. These approaches perform their malware classification tasks in an "offline" scenario, where malware authors cannot learn from and adapt their malicious apps to these systems. In real-world deployments, however, adversaries get feedback about whether their app was detected, and can react accordingly by transforming their code until they are able to influence the classification. In this work, we propose a new approach for detecting Android malware that is designed to be resilient to feature-unaware pertur¬ bations without retraining. Our work builds on two key ideas. First, we consider only a subset of the codebase of a given app, both for precision and performance aspects. For this paper, our implementation focuses exclusively on the loops contained in a given app. We hypothesize, and empirically verify, that the code contained in apps' loops is enough to precisely detect malware. This provides the additional benefits of being less prone to noise and errors, and being more performant. The second idea is to build a feature space by extracting a set of labels for each loop, and by then considering each unique combination of these labels as a different feature: The combinatorial nature of this feature space makes it prohibitively difficult for an attacker to influence our feature vector and avoid detection, without access to the speciic model used for classiication. We assembled these techniques into a prototype, called L O O P M C, which can locate loops in applications, extract features, and perform classification, without requiring source code. We used L O O P M C to classify about 20,000 benign and malicious applications. While focusing on a smaller portion of the program may seem counter-intuitive, the results of these experiments are surprising: our system achieves a classification accuracy of 99.3% and 99.1% for the Malware Genome Project and VirusShare datasets respectively, which outperforms previous approaches. We also evaluated L O O P M C, along with the related work, in the context of various evasion techniques, and show that our system is more resilient to evasion.

More Details

Intrusion Detection with Unsupervised Heterogeneous Ensembles Using Cluster-Based Normalization

Proceedings - 2017 IEEE 24th International Conference on Web Services, ICWS 2017

Ruoti, Scott; Heidbrink, Scott H.; Oneill, Mark; Gustafson, Eric D.; Choe, Yung R.

Outlier detection has been shown to be a promising machine learning technique for a diverse array of felds and problem areas. However, traditional, supervised outlier detection is not well suited for problems such as network intrusion detection, where proper labelled data is scarce. This has created a focus on extending these approaches to be unsupervised, removing the need for explicit labels, but at a cost of poorer performance compared to their supervised counterparts. Recent work has explored ways of making up for this, such as creating ensembles of diverse models, or even diverse learning algorithms, to jointly classify data. While using unsupervised, heterogeneous ensembles of learning algorithms has been proposed as a viable next step for research, the implications of how these ensembles are built and used has not been explored.

More Details

Physically Unclonable Digital ID

Proceedings - 2015 IEEE 3rd International Conference on Mobile Services, MS 2015

Choi, Sung; Zage, David; Choe, Yung R.; Wasilow, Brent

The Center for Strategic and International Studies estimates the annual cost from cyber crime to be more than $400 billion. Most notable is the recent digital identity thefts that compromised millions of accounts. These attacks emphasize the security problems of using clonable static information. One possible solution is the use of a physical device known as a Physically Unclonable Function (PUF). PUFs can be used to create encryption keys, generate random numbers, or authenticate devices. While the concept shows promise, current PUF implementations are inherently problematic: inconsistent behavior, expensive, susceptible to modeling attacks, and permanent. Therefore, we propose a new solution by which an unclonable, dynamic digital identity is created between two communication endpoints such as mobile devices. This Physically Unclonable Digital ID (PUDID) is created by injecting a data scrambling PUF device at the data origin point that corresponds to a unique and matching descrambler/hardware authentication at the receiving end. This device is designed using macroscopic, intentional anomalies, making them inexpensive to produce. PUDID is resistant to cryptanalysis due to the separation of the challenge response pair and a series of hash functions. PUDID is also unique in that by combining the PUF device identity with a dynamic human identity, we can create true two-factor authentication. We also propose an alternative solution that eliminates the need for a PUF mechanism altogether by combining tamper resistant capabilities with a series of hash functions. This tamper resistant device, referred to as a Quasi-PUDID (Q-PUDID), modifies input data, using a black-box mechanism, in an unpredictable way. By mimicking PUF attributes, Q-PUDID is able to avoid traditional PUF challenges thereby providing high-performing physical identity assurance with or without a low performing PUF mechanism. Three different application scenarios with mobile devices for PUDID and Q-PUDID have been analyzed to show their unique advantages over traditional PUFs and outline the potential for placement in a host of applications.

More Details

EMBERS: EpheMeral biometrically enhanced real-time location System

Proceedings - International Carnahan Conference on Security Technology

Choi, Sung N.; Bierma, Michael B.; Choe, Yung R.; Zage, David J.

In nuclear facilities, having efficient accountability of critical assets, personnel locations, and activities is essential for productive, safe, and secure operations. Such accountability tracked through standard manual procedures is highly inefficient and prone to human error. The ability to actively and autonomously monitor both personnel and critical assets can significantly enhance security and safety operations while removing significant levels of human reliability issues and reducing insider threat concerns. A Real-Time Location System (RTLS) encompasses several technologies that use wireless signals to determine the precise location of tagged critical assets or personnel. RTLS systems include tags that either transmit or receive signals at regular intervals, location sensors/beacons that receive/transmit signals, and a location appliance that collects and correlates the data. Combined with ephemeral biometrics (EB) to validate the live-state of a user, an ephemeral biometrically-enhanced RTLS (EMBERS) can eliminate time-consuming manual searches and audits by providing precise location data. If critical assets or people leave a defined secured area, EMBERS can automatically trigger an alert and function as an access control mechanism and/or ingress/egress monitoring tool. Three different EMBERS application scenarios for safety and security have been analyzed and the heuristic results of this study are outlined in this paper along with areas of technological improvements and innovations that can be made if EMBERS is to be used as safety and security tool.

More Details

Experimental evaluation of the impact of packet capturing tools for web services

GLOBECOM - IEEE Global Telecommunications Conference

Chen, Chao C.; Choe, Yung R.; Chuah, Chen N.; Mohapatra, Prasant

Network measurement is a discipline that provides the techniques to collect data that are fundamental to many branches of computer science. While many capturing tools and comparisons have made available in the literature and elsewhere, the impact of these packet capturing tools on existing processes have not been thoroughly studied. While not a concern for collection methods in which dedicated servers are used, many usage scenarios of packet capturing now requires the packet capturing tool to run concurrently with operational processes. In this paper we perform experimental evaluations of the performance impact that packet capturing process have on webbased services; in particular, we observe the impact on web servers. We find that packet capturing processes indeed impact the performance of web servers, but on a multi- core system the impact varies depending on whether the packet capturing and web hosting processes are co-located or not. In addition, the architecture and behavior of the web server and process scheduling is coupled with the behavior of the packet capturing process, which in turn also affect the web server's performance. © 2011 IEEE.

More Details

Experimental evaluation of the impact of packet capturing tools for web services

Choe, Yung R.

Network measurement is a discipline that provides the techniques to collect data that are fundamental to many branches of computer science. While many capturing tools and comparisons have made available in the literature and elsewhere, the impact of these packet capturing tools on existing processes have not been thoroughly studied. While not a concern for collection methods in which dedicated servers are used, many usage scenarios of packet capturing now requires the packet capturing tool to run concurrently with operational processes. In this work we perform experimental evaluations of the performance impact that packet capturing process have on web-based services; in particular, we observe the impact on web servers. We find that packet capturing processes indeed impact the performance of web servers, but on a multi-core system the impact varies depending on whether the packet capturing and web hosting processes are co-located or not. In addition, the architecture and behavior of the web server and process scheduling is coupled with the behavior of the packet capturing process, which in turn also affect the web server's performance.

More Details

Scientific data analysis on data-parallel platforms

Roe, Diana C.; Choe, Yung R.; Ulmer, Craig D.

As scientific computing users migrate to petaflop platforms that promise to generate multi-terabyte datasets, there is a growing need in the community to be able to embed sophisticated analysis algorithms in the computing platforms' storage systems. Data Warehouse Appliances (DWAs) are attractive for this work, due to their ability to store and process massive datasets efficiently. While DWAs have been utilized effectively in data-mining and informatics applications, they remain largely unproven in scientific workloads. In this paper we present our experiences in adapting two mesh analysis algorithms to function on five different DWA architectures: two Netezza database appliances, an XtremeData dbX database, a LexisNexis DAS, and multiple Hadoop MapReduce clusters. The main contribution of this work is insight into the differences between these DWAs from a user's perspective. In addition, we present performance measurements for ten DWA systems to help understand the impact of different architectural trade-offs in these systems.

More Details

Exploring data warehouse appliances for mesh analysis applications

Proceedings of the Annual Hawaii International Conference on System Sciences

Ulmer, Craig; Bayer, Gregory B.; Choe, Yung R.; Roe, Diana C.

As scientific computing users migrate to petaflop platforms that promise to generate multi-terabyte datasets, there is a growing need in the community to be able to embed sophisticated data analysis algorithms in the storage systems for the computing platforms. Data Warehouse Appliances (DWAs) are an attractive option for this work, due to their ability to process massive datasets efficiently. While DWAs have been proven effective in data mining and informatics applications, there are relatively few examples of how DWAs can be integrated into the scientific computing workflow. In this paper we present our experiences in adapting two mesh analysis algorithms to function on two different DWAs: a SQL-based Netezza database appliance and a Map/Reduce-based Hadoop cluster. The main contribution of this work is insight into the differences between the two platforms' programming environments. In addition, we present performance measurements for entry-level DWAs to help provide a first-order comparison of the hardware. © 2010 IEEE.

More Details
45 Results
45 Results