SigRec: Automatic Recovery of Function Signatures in Smart Contracts

Millions of smart contracts have been deployed onto Ethereum for providing various services, whose functions can be invoked. For this purpose, the caller needs to know the function signature of a callee, which includes its function id and parameter types. Such signatures are critical to many applica...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on software engineering Vol. 48; no. 8; pp. 3066 - 3086
Main Authors Chen, Ting, Li, Zihao, Luo, Xiapu, Wang, Xiaofeng, Wang, Ting, He, Zheyuan, Fang, Kezhao, Zhang, Yufei, Zhu, Hang, Li, Hongwei, Cheng, Yan, Zhang, Xiaosong
Format Journal Article
LanguageEnglish
Published New York IEEE 01.08.2022
IEEE Computer Society
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Millions of smart contracts have been deployed onto Ethereum for providing various services, whose functions can be invoked. For this purpose, the caller needs to know the function signature of a callee, which includes its function id and parameter types. Such signatures are critical to many applications focusing on smart contracts, e.g., reverse engineering, fuzzing, attack detection, and profiling. Unfortunately, it is challenging to recover the function signatures from contract bytecode, since neither debug information nor type information is present in the bytecode. To address this issue, prior approaches rely on source code, or a collection of known signatures from incomplete databases or incomplete heuristic rules, which, however, are far from adequate and cannot cope with the rapid growth of new contracts. In this paper, we propose a novel solution that leverages how functions are handled by Ethereum virtual machine (EVM) to automatically recover function signatures. In particular, we exploit how smart contracts determine the functions to be invoked to locate and extract function ids, and propose a new approach named type-aware symbolic execution (TASE) that utilizes the semantics of EVM operations on parameters to identify the number and the types of parameters. Moreover, we develop SigRec , a new tool for recovering function signatures from contract bytecode without the need of source code and function signature databases. The extensive experimental results show that SigRec outperforms all existing tools, achieving an unprecedented 98.7 percent accuracy within 0.074 seconds. We further demonstrate that the recovered function signatures are useful in attack detection, fuzzing and reverse engineering of EVM bytecode.
ISSN:0098-5589
1939-3520
DOI:10.1109/TSE.2021.3078342