ESEC/FSE 2022
Mon 14 - Fri 18 November 2022 Singapore
Wed 16 Nov 2022 11:00 - 11:15 at SRC LT 50 - Program Analysis II Chair(s): Marsha Chechik

Determining whether multiple instructions can access the same memory location is a critical task in binary analysis. It is challenging because statically computing precise alias information is undecidable in theory. The problem is exacerbated at the binary level by compiler optimizations and the absence of symbols and types. Existing approaches either produce many spurious dependencies due to conservative analysis or scale poorly to complex binaries.

We present a new machine-learning-based approach to predict memory dependencies by exploiting the model's learned knowledge about how binary programs execute. Our approach features (i) a self-supervised procedure that pretrains a neural net to reason over binary code and its dynamic value flows through memory addresses, followed by (ii) supervised finetuning to infer the memory dependencies statically. To facilitate efficient learning, we develop dedicated neural architectures to encode the heterogeneous inputs (i.e., code, data values, and memory addresses from traces) with specific modules and fuse them with a composition learning strategy.

We implement our approach in NeuDep and evaluate it on 41 popular software projects compiled by 2 compilers, 4 optimizations, and 4 obfuscation passes. We demonstrate that NeuDep is more precise (1.5x) and faster (3.5x) than the current state of the art. Extensive probing studies on security-critical reverse engineering tasks suggest that NeuDep understands memory access patterns, learns function signatures, and is able to match indirect calls. All these tasks either assist or benefit from inferring memory dependencies. Notably, NeuDep also outperforms the current state of the art on these tasks.
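
To make the notion of a memory dependency concrete, the following is a minimal sketch, not NeuDep itself and not taken from the paper. It assumes a common working definition: two instructions are dependent if, in an observed execution trace, they touch the same memory address and at least one of them writes. The `Access` record, the instruction addresses, and the memory addresses are all hypothetical. Labels of this kind can only be derived from concrete runs, whereas the abstract describes predicting dependencies statically.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Access(NamedTuple):
    """One memory access observed in a (hypothetical) execution trace."""
    insn: int       # address of the instruction performing the access
    addr: int       # memory address touched at run time
    is_write: bool  # True for a store, False for a load


def dependent_pairs(trace):
    """Return instruction pairs that touch the same address, with at least one write.

    This is an assumed, simplified definition of memory dependence used
    only for illustration.
    """
    # Group the distinct (instruction, kind) accesses by memory address.
    by_addr = defaultdict(set)
    for access in trace:
        by_addr[access.addr].add((access.insn, access.is_write))

    pairs = set()
    for accesses in by_addr.values():
        # sorted() orders by instruction address, so i1 <= i2 in every pair.
        for (i1, w1), (i2, w2) in combinations(sorted(accesses), 2):
            if i1 != i2 and (w1 or w2):
                pairs.add((i1, i2))
    return pairs


# Hypothetical trace: two stores followed by three loads.
trace = [
    Access(insn=0x10, addr=0x7FFC00, is_write=True),
    Access(insn=0x14, addr=0x7FFC08, is_write=True),
    Access(insn=0x18, addr=0x7FFC00, is_write=False),
    Access(insn=0x1C, addr=0x7FFC08, is_write=False),
    Access(insn=0x20, addr=0x7FFC08, is_write=False),
]

for first, second in sorted(dependent_pairs(trace)):
    print(f"{first:#x} <-> {second:#x}")
```

In this toy trace the two loads at `0x1c` and `0x20` read the same address but are not reported as dependent on each other, because neither writes. A purely static analysis has no such trace available and must either over-approximate these pairs or, as the abstract proposes, predict them with a learned model.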

Wed 16 Nov

Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi

11:00 - 12:30
Program Analysis II (Research Papers / Demonstrations / Ideas, Visions and Reflections) at SRC LT 50
Chair(s): Marsha Chechik University of Toronto
11:00
15m
Talk
NeuDep: Neural Binary Memory Dependence Analysis
Research Papers
Kexin Pei Columbia University, Dongdong She Columbia University, Michael Wang Massachusetts Institute of Technology, Scott Geng Columbia University, Zhou Xuan Purdue University, Yaniv David Columbia University, Junfeng Yang Columbia University, Suman Jana Columbia University, Baishakhi Ray Columbia University
11:15
15m
Talk
DynaPyt: A Dynamic Analysis Framework for Python
Research Papers
Aryaz Eghbali University of Stuttgart, Michael Pradel University of Stuttgart
11:30
15m
Talk
Language-Agnostic Dynamic Analysis of Multilingual Code: Promises, Pitfalls, and Prospects
Ideas, Visions and Reflections
Haoran Yang Washington State University, Wen Li Washington State University, Haipeng Cai Washington State University
11:45
15m
Talk
Cross-Language Android Permission Specification
Research Papers
Chaoran Li Swinburne University of Technology, Xiao Chen Monash University, Ruoxi Sun The University of Adelaide, Minhui (Jason) Xue University of Adelaide, Sheng Wen Swinburne University of Technology, Muhammad Ejaz Ahmed Data61, CSIRO, Seyit Camtepe CSIRO Data61, Yang Xiang Digital Research & Innovation Capability Platform, Swinburne University of Technology
12:00
15m
Talk
Peahen: Fast and Precise Static Deadlock Detection via Context Reduction
Research Papers
Yuandao Cai Hong Kong University of Science and Technology, Chengfeng Ye Hong Kong University of Science and Technology, Qingkai Shi Purdue University, Charles Zhang Hong Kong University of Science and Technology
12:15
7m
Talk
FIM: Fault Injection and Mutation for Simulink
Demonstrations
Ezio Bartocci TU Wien, Leonardo Mariani University of Milano-Bicocca, Dejan Nickovic Austrian Institute of Technology, Drishti Yadav Technische Universität Wien
12:23
7m
Talk
JSIMutate: Understanding Performance Results through Mutations
Demonstrations
Thomas Laurent Lero & University College Dublin, Paolo Arcaini National Institute of Informatics, Catia Trubiani Gran Sasso Science Institute, Anthony Ventresque University College Dublin & Lero, Ireland