Student Projects Completed in 2019-2020

Fall 2019 Student Projects

  • Students: Yuqing Wang, Zhanyuan Gao, Zhengyuan Wang

    Faculty Mentor: Tim Leschke

    Abstract: Quick Response (QR) codes, a form of two-dimensional barcode, are widely used in many fields. As their popularity grows, they are also becoming an attractive tool for cybercriminals, and QR-code-related crime has trended upward in recent years. To address this trend, we believe a study is needed that explains the general theory behind crimes involving QR codes; the lack of such research also makes digital forensics more difficult. In this paper, we analyze several types of crime based on the features of QR codes. Building on that analysis, we examine why related digital forensic work is difficult and propose a feasible solution: a third-party module that can be used as a digital forensic tool, which we developed and tested with data.

  • Students: Minjie Fu, Jingyi Li, Qin Gao

    Faculty Mentor: Yinzhi Cao

    Abstract: Node.js, the open-source JavaScript runtime, has gained its reputation from the popularity of JavaScript, and numerous developers depend on it for their applications. Consequently, the security of Node.js applications has also drawn developers’ attention. Because Node.js applications typically have a large number of dependencies, they are vulnerable to malicious code and application-layer attacks. Vulnerability analysis tools have become the common resort for organizations that want to identify coding flaws and fix them at an early stage of development, and various Node.js-based analysis tools have gradually emerged in the marketplace. It is therefore important to compare and evaluate the performance of typical vulnerability analysis tools and to provide useful information for web developers and scanner research teams.

    In this project, we selected four vulnerability scanners as our research subjects: Snyk, Retire.js, Node Security Platform, and NodeJsScan. We compared their functionality and, based on their different mechanisms, divided them into two groups to evaluate their performance. We selected features from the collected data and devised suitable algorithms to calculate each tool’s efficiency and accuracy. We then created graphs to visualize the results and summarized the main features of each tool so that users can find the one best suited to their needs.

  • Students: Nihal Shah, Shivani Deshpande, Beckett Browning

    Faculty Mentor: Lei Ding

    Abstract: As cyberspace expands and new cyber threats arise, Intrusion Detection Systems (IDS) become increasingly valuable in reducing the impact of malicious actors. The challenge for a successful detection mechanism is differentiating benign traffic from malicious traffic. This project focuses on identifying and analyzing the performance of a variety of machine learning algorithms in classifying attacks such as DoS, DDoS, SQL injection, and Heartbleed. We selected the CICIDS2017 dataset, which contains the latest and most commonly used attacks. We conduct a comparative analysis of various machine learning algorithms by evaluating the accuracy, precision, recall, and F1-score values observed when processing the cleaned CICIDS2017 dataset.
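
    As a reminder of how the four metrics named above relate to one another, the following minimal sketch computes them for a single attack class from confusion counts. The function name and the counts are hypothetical and are not results from the project.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives for one class (for example, "DDoS" vs. the rest).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=880)
print(metrics)
```

    With heavily imbalanced traffic, accuracy can stay high even when many attacks are missed, which is one reason precision, recall, and F1 are usually reported alongside it.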

  • Students: Minzhi He, Jiaming Zhou, Chunxiao Liu

    Faculty Mentor: Reuben Johnston

    Abstract: With the development of reverse engineering tools, the options for analyzing a program are on the rise. Ghidra is a state-of-the-art reverse engineering tool that was released in March 2019. However, like most other tools, Ghidra mostly provides text-based analysis results that are hard to understand, and its visualization functionality is still at an early stage. To help both professionals and novices better understand reverse engineering results, we proposed using a graph database to visualize them, and we chose Neo4j for this purpose. We first extracted data from Ghidra, selected features that contain valuable information, and exported them to a local file. We then imported the file into Neo4j and generated the Neo4j graph database. The graph shows the function calls of the entire program and supports different types of queries. For external calls, users can be redirected to the function call graphs of the external library. The program improved the efficiency of the reverse engineering process.
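
    The export-then-import flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the call edges are made up rather than extracted from Ghidra, and the file name calls.csv, the Function label, and the CALLS relationship type are hypothetical choices.

```python
import csv
import io


def write_call_edges(edges, handle):
    """Write (caller, callee) pairs as CSV with a header row."""
    writer = csv.writer(handle)
    writer.writerow(["caller", "callee"])
    for caller, callee in edges:
        writer.writerow([caller, callee])


# Cypher statement that could load such a file into Neo4j. It assumes
# the CSV has been placed where Neo4j is allowed to read it.
LOAD_CALLS_CYPHER = """
LOAD CSV WITH HEADERS FROM 'file:///calls.csv' AS row
MERGE (a:Function {name: row.caller})
MERGE (b:Function {name: row.callee})
MERGE (a)-[:CALLS]->(b)
"""

# Example query: which functions does main call directly?
CALLEES_OF_MAIN_CYPHER = """
MATCH (:Function {name: 'main'})-[:CALLS]->(f:Function)
RETURN f.name
"""

# Made-up call edges standing in for data extracted from Ghidra.
edges = [("main", "parse_args"), ("main", "run"), ("run", "printf")]
buffer = io.StringIO()
write_call_edges(edges, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

    Using MERGE rather than CREATE keeps one node per function name even when a function appears in many edges.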

  • Students: Abdullah Alghofaili, Atheer Almogbil, Chelsea Deane

    Faculty Mentor: Tim Leschke

    Abstract: Fitbit devices are among the most popular wearable technologies for health-conscious people. They record information such as logged activity, heart rate, calories burned, and sleep activity. Digital forensic investigators can use this information as evidence in a crime, to support either a suspect’s innocence or guilt. It is important for an investigator to find and analyze every piece of data for accuracy and integrity; however, there is no standard for conducting a forensic investigation of wearable technology. In this paper, the authors take on the role of two characters: the common criminal and the forensic investigator. As the common criminal, the authors create scenarios in which the Fitbit device is manipulated in such a way that the device becomes an alibi for a crime. As the forensic investigator, the authors use publicly available forensic tools and public APIs to extract and analyze the data. It is the responsibility of the investigator to show how the data was obtained and to ensure that the data was not modified during the analysis. The results of this paper show the effects of physically altering the device; of manually adding, modifying, or deleting health data; and the artifacts left by using certain services. In addition, the authors demonstrate the type of information that can be found with Forensic Toolkit (FTK), Autopsy, Bulk Extractor, and a custom Fitbit API client written in Python.

  • Students: Tianhao Ma, Tianyi Wei, GuanLong Wu

    Faculty Mentor: Xiangyang Li

    Abstract: A Security Operations Center (SOC) is essential for enterprise-level security. Most SOCs on the market are designed for large companies and are overly complex for small companies or educational use. In our approach, we designed a SOC infrastructure that integrates common open-source tools to provide functionality similar to that of common SOCs. Our system contains several layers, including a physical layer, a database layer, and an application layer, to collect, store, and visualize data efficiently. Our project shows that by implementing our SOC, small companies could lower their budget to a certain extent while still obtaining basic security protections.

  • Students: Karol Pierre, Yu Qiu, Cheng Xu, Zichao Yang

    Faculty Mentor: Xiangyang Li

    External Mentors: Matthew Elder (JHU/APL), William Cholter (JHU/APL)

    Abstract: The primary objective of this research is to implement and evaluate a variety of approaches for malware family classification. We analyze and evaluate the following machine learning models: Logistic Regression, Random Forest, XGBoost, and Neural Networks. The models use static features, dynamic features, and a hybrid combination of the two, to highlight their ability to detect underlying relationships and distinguish among five malware families. The questions we aim to answer are: “How accurate is this model in malware family classification? How does it compare with other popular models? Which features have the greatest impact on performance and why?” Based on our research, we determined that the best model for static analysis is Logistic Regression and that the best model for hybrid analysis is Neural Networks. Answering these questions provides an opportunity to improve malware mitigation and control efforts.

  • Students: Benfang Wang, Zuo Wang, Honghao Guo

    Faculty Mentor: Xiangyang Li

    External Mentor: Devu Shila (unknot.id)

    Abstract: Behavioral biometrics are the unique behavioral features of a person, such as the way a person walks, sits, or types on a computer. Machine learning algorithms can be applied to such data, gathered from end-user devices, to classify a subject of interest. However, these algorithms can be vulnerable to spoofing attacks in which adversarial noise is added to data from a different subject so that the classifier is fooled into attributing the data to a legitimate user. In past years, researchers have developed mature techniques such as the Fast Gradient Sign Method (FGSM) to aid white-box attacks on machine learning systems, especially convolutional neural networks. In this project, we explored, for the first time, the possibility of applying FGSM to launch a black-box attack on a behavioral biometric authentication system. This system is essentially a recurrent neural network, and we assume very limited knowledge about the data of the desired class. Our primary goal is not to launch a successful attack and break the biometric system; rather, it is to understand the feasibility of applying FGSM in black-box attacks on recurrent neural networks, and the extent to which FGSM contributes to the attack. From our experiments, we conclude that the performance of FGSM in such a black-box attack depends highly on the quality of our shadow model, an intermediate model trained on the results of queries to the target authentication model to help us launch the attack. Performance is also affected by other important parameters in our approach, including the number of queries allowed for training the shadow model and the criteria that guide FGSM. Our experimentation revealed strong relationships between the shadow model and FGSM performance, as well as the critical effect of the number of FGSM iterations used to create an attack input, and suggested a potential explanation for this effect.
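
    For readers unfamiliar with FGSM, the core step can be sketched on a toy model. The example below is an assumption-laden illustration, not the project's attack: it uses a tiny logistic-regression classifier as a stand-in for a shadow model (the project targeted a recurrent neural network), and the weights, input, and step size are made up. It shows the targeted variant, which moves the input a fixed step against the sign of the loss gradient so that the model's score for the target class rises.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x) -> float:
    """Probability that x belongs to the target (legitimate) user."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm_targeted(w, b, x, target: float, eps: float):
    """One targeted FGSM step on a logistic-regression stand-in model.

    For cross-entropy loss, the gradient with respect to the input is
    (p - target) * w. Stepping against its sign lowers the loss for the
    target label, and each feature changes by at most eps.
    """
    p = predict(w, b, x)
    grad = [(p - target) * wi for wi in w]
    return [xi - eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical stand-in model and input, for illustration only.
w, b = [2.0, -1.0, 0.5], -0.5
x = [0.1, 0.8, 0.2]
eps = 0.3

x_adv = fgsm_targeted(w, b, x, target=1.0, eps=eps)
print(predict(w, b, x), predict(w, b, x_adv))
```

    In a black-box setting the gradient of the real target model is unavailable, which is why the perturbation has to be computed on a shadow model and why the quality of that model matters.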

  • Students: Rishabh Singh, Nithin Sade, Yu-Chien Hung

    Faculty Mentor: Chris Monson

    Abstract: Leakage of secrets from public collaborative platforms like GitHub has become a huge problem. Developers negligently hardcoding secrets into code is often the reason for accidental leakage. These secrets can sometimes act as entry points to internal infrastructure, making them a high-value target. A few existing tools hunt for secrets in GitHub repositories using regular-expression and high-entropy-based models, which tend to generate many false positives and false negatives. In this paper, we present a machine-learning-based tool that scans GitHub repositories and searches for secrets. The tool uses features extracted from the code to reduce the number of false positives and false negatives in the final result.
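
    The regular-expression and entropy heuristics mentioned above can be sketched as follows. This illustrates the baseline approach that existing tools take, not the project's machine learning tool; the token pattern, the 4.0-bit threshold, and the sample key are all made-up assumptions.

```python
import math
import re
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


# Hypothetical pattern: runs of 20+ characters typical of keys and tokens.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def find_candidates(text: str, threshold: float = 4.0):
    """Return long tokens whose entropy meets the (assumed) threshold."""
    return [t for t in TOKEN_RE.findall(text)
            if shannon_entropy(t) >= threshold]


# Made-up source snippet; the key below is not a real credential.
sample = (
    'api_key = "q8Zr1Lx0PvT7mK3sWd9YbN5c"\n'
    "result = calculate_monthly_interest_rate(principal)\n"
)
candidates = find_candidates(sample)
print(candidates)
```

    A fixed threshold is a blunt instrument: random-looking identifiers or hashes can exceed it, and short or structured secrets can fall below it, which is consistent with the false positives and false negatives the abstract describes.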

  • Students: Anuraag Baishya, Logan Kostick

    Faculty Mentor: Avi Rubin

    Abstract: To perform their tasks, Internet of Things (IoT) devices need to authenticate themselves with a hub. The hub essentially manages the operation of these devices and sends commands for the devices to run. Current authentication methods include state-of-the-art encryption and network security concepts. However, certain parts of the communication, such as addresses, cannot be encrypted, allowing adversaries to sniff such information over the air and then impersonate devices. Being able to impersonate an IoT device may allow the attacker to receive sensitive information from the router that was intended for a legitimate device.

    In this project, we explore using transmission power as an indirect measure of the distance between the device and the router. We develop a protocol wherein a device and a router negotiate a minimum transmission power to communicate effectively, such that devices closer to the router transmit at lower transmission powers. This thwarts the adversaries’ ability to intercept communication or impersonate devices without being at a similar distance from the router as a device. We also develop a prototype implementation of this protocol in Python.
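
    The idea of negotiating a minimum transmission power can be illustrated with a small simulation. This is a sketch under stated assumptions and not the authors' protocol or prototype: it assumes a log-distance path-loss model, a receiver sensitivity of -80 dBm, and a hypothetical set of power levels, all chosen only for illustration.

```python
import math


def path_loss_db(distance_m: float, exponent: float = 3.0,
                 ref_loss_db: float = 40.0) -> float:
    """Log-distance path loss relative to 1 m (assumed model)."""
    return ref_loss_db + 10.0 * exponent * math.log10(max(distance_m, 1.0))


def received_dbm(tx_dbm: float, distance_m: float) -> float:
    """Signal strength seen by a receiver at the given distance."""
    return tx_dbm - path_loss_db(distance_m)


def negotiate_min_power(distance_m, levels_dbm, sensitivity_dbm=-80.0):
    """Return the lowest power level the router can still decode.

    Models the device stepping up through its power levels until the
    router would acknowledge; returns None if no level is sufficient.
    """
    for tx in sorted(levels_dbm):
        if received_dbm(tx, distance_m) >= sensitivity_dbm:
            return tx
    return None


levels = range(-20, 21, 5)                 # hypothetical levels in dBm
near = negotiate_min_power(2.0, levels)    # device 2 m from the router
far = negotiate_min_power(20.0, levels)    # device 20 m from the router
# Signal from the near device as seen by an eavesdropper 20 m away.
eavesdropper_dbm = received_dbm(near, 20.0)
print(near, far, eavesdropper_dbm)
```

    Under these assumptions the nearby device settles on a lower power than the distant one, and its signal reaches the distant eavesdropper below the assumed sensitivity, which is the property the protocol relies on.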

  • Students: Jennifer Li, Muskaan Kalra

    Faculty Mentor: Terry Thompson

    External Mentor: Faye Francy (Auto-ISAC)

    Abstract: Researchers have published studies and recommendations to address cybersecurity issues in autonomous vehicles (AVs), but only a small number of studies have explored the challenges in L4 and L5 AVs. As manufacturers and technology companies around the world race to bring L4 and L5 AVs to market, and given the rapid advances in digital technology that expand the cyber-attack surface, it is crucial to study and tackle these issues now so that the vehicles are ready for mass-market consumption. In this report, we analyze potential cyber threats, apply a threat model that relates to the CIS Controls, and provide best practices for automotive industry stakeholders through a case study of the Apollo Open Driving Platform. Specifically, we believe that the private sector and policymakers should prioritize AV measures that promote human safety first, vehicular integrity second, and other factors third.

  • Students: Alaa Jadullah, Kalyani Pawar, Xiaoyu Shi

    Faculty Mentor: Lanier Watkins

    Abstract: 5G is proclaimed to be the future, and that vision calls for a focus on the security of 5G devices. This project’s goal is to examine a 5G-enabled device currently on the market and to attempt to discover and exploit vulnerabilities in its code. The project performed several experiments and attacks, both black-box and white-box, to test the security of the target device. The project then lists and analyzes the results of these attacks and their implications.

  • Students: Jay Chow, Kevin Hamilton, James Ballard

    Faculty Mentor: Lanier Watkins

    Abstract: This research explores the viability of using biologically inspired design decisions for experimental firewalls. Most firewall designs in use today are signature-based and tend to struggle against never-before-seen attacks and encrypted traffic. By applying cellular biology concepts such as endocytosis, ligand-gated ion channels, and active sites, we created a firewall prototype capable of classifying ingested network traffic as benign or malicious. Using PCAPs (packet captures) generated by Stratosphere Lab, we extracted data about the streams into CSV files. With those CSV files, we tested and evaluated both supervised and unsupervised machine learning algorithms to determine which may be best suited to the bio-inspired firewall.

JHU Information Security Institute