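Title: Fuzzing Error Handling Code in Device Drivers Based on Software Fault Injection
Author: 蒋mingzhu
Affiliation: Tsinghua University
Country: #China
Year: #2019
Venue: #IEEE_ISSRE_CCFB
Keywords: #fuzzing #fault-injection
Code:
Note created: 2023-06-27 17:15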

Abstract

Much error handling code in drivers is triggered by occasional errors (such as insufficient memory and hardware malfunctions) that are not related to inputs.
The method is based on software fault injection (SFI).
First, at compile time, FIZZER uses static analysis to recommend possible error sites that can trigger error handling code. Then, during driver execution, it analyzes runtime information and automatically fuzzes error-site sequences for fault injection to improve code coverage.

Background

Current fuzzing approaches for drivers have two limitations:

They cannot generate original driver inputs.
- PeriScope can simulate and fuzz the driver inputs coming from the hardware device to perform runtime testing.
They cannot effectively cover error handling code.
- This paper uses SFI to cover error handling code at runtime.

The authors distinguish two kinds of errors: input-related errors and occasional errors. They surveyed the error handling code in the Linux kernel and the bugs reported by syzkaller to find out how much of this error handling code, and how many of these bugs, are related to occasional errors.

Methodology

basic idea

Error site: a function call that can fail and thereby trigger error handling code.
Error-site sequence: multiple error sites ordered by their static positions in the driver source code.
The authors regard error-site sequences as the "inputs" for errors that may be encountered, and fuzz these sequences to cover the error handling code.

method

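Statically identify error sites in the driver code.
Run the driver; then, based on its runtime information, use a coverage-based mutation method to generate error-site sequences that cover error handling code.
1. Initial mutation: the initial error-site sequence is all zeros. The driver is fuzzed once with it, and some error sites may be executed during that run (the 0s marked Y in the middle of the paper's figure). The initial mutation targets only these executed error sites (right side of the figure): each new sequence is generated by making one executed error site fail (0→1), because each error site may trigger different error handling code.
2. Subsequent mutation: after each later execution of the driver, if code coverage has increased (new branches or basic blocks are covered), the mutation method selects that run's error-site sequence as a seed; otherwise the sequence is discarded. When a seed is mutated, only one executed error site is changed at a time (0→1 or 1→0), again because each error site may trigger different error handling code. The newly generated sequences are then compared with the sequences already used, and duplicates are removed.
Inject faults at the error sites according to the generated error-site sequences.
Run the driver and apply the mutation method again to generate new error-site sequences, forming a fuzzing loop.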
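
The loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not FIZZER's implementation: an error-site sequence is modeled as a tuple of 0/1 flags (1 means "inject a fault at this site"), and `mutate`, `fuzz`, and the toy `run_driver` with its three error sites are hypothetical names invented for this note.

```python
def mutate(seed, executed, seen):
    """Flip one executed error site at a time (0->1 or 1->0), dropping duplicates."""
    children = []
    for i in sorted(executed):
        child = list(seed)
        child[i] ^= 1                      # change exactly one executed site
        child = tuple(child)
        if child not in seen:              # skip sequences that were already generated
            seen.add(child)
            children.append(child)
    return children


def fuzz(run_driver, n_sites, max_runs=100):
    """Coverage-guided fuzzing loop over error-site sequences."""
    initial = tuple([0] * n_sites)         # the initial sequence is all zeros
    queue, seen = [initial], {initial}
    covered, runs = set(), 0
    while queue and runs < max_runs:
        seq = queue.pop(0)
        executed, coverage = run_driver(seq)
        runs += 1
        if coverage - covered:             # new coverage: keep this sequence as a seed
            covered |= coverage
            queue.extend(mutate(seq, executed, seen))
        # no new coverage: the sequence is discarded
    return covered, runs


def run_driver(seq):
    """Toy stand-in for a driver run (hypothetical): three error sites in order."""
    executed, coverage = set(), {"entry"}
    for i, handler in enumerate(["err_alloc", "err_map", "err_irq"]):
        executed.add(i)
        if seq[i]:                         # fault injected at site i
            coverage.add(handler)
            return executed, coverage      # the error path returns early
        coverage.add("ok_%d" % i)
    coverage.add("done")
    return executed, coverage


covered, runs = fuzz(run_driver, 3)
print(sorted(covered), runs)
```

In this toy driver, failing every site at once (`(1, 1, 1)`) reaches only `err_alloc`, because the first failure returns before the later sites run; flipping one executed site at a time reaches all three handlers.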

IMPLEMENTATION

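Error-site analyzer. It performs a static analysis of the driver source code to recommend possible error sites, from which the user should select realistic ones that can actually fail and trigger error handling code.
- The criterion for identifying an error site is whether an if statement appears together with a return value that is a NULL pointer or a non-zero integer. The user then manually picks, from the statically reported error sites, those that can really trigger error handling code.
Driver generator. It instruments the identified error sites in the driver code and generates a loadable driver.
- Fault injection is done through code instrumentation: error_probe() checks the current error-site sequence to decide whether the current error site should fail. The paper's figure gives an example, with the instrumented code shown in blue.
Runtime fuzzer. It uses the SFI-based fuzzing strategy to perform runtime testing. During driver execution, it collects runtime information about the driver.
Bug checkers. They check the information collected by the runtime fuzzer to detect bugs.
- Two checkers are implemented to detect resource leaks and double-lock bugs, and two third-party checkers are used: KASAN [23] for memory corruption bugs and Kmemleak [24] for memory leaks.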
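
The instrumentation idea can be modeled in a few lines of Python. This is only a sketch of the concept: the real instrumentation is inserted into C driver code, and apart from the name `error_probe`, which comes from the note above, everything here (`FaultInjector`, `alloc_buffer`, `probe_device`, the two sites) is a hypothetical example.

```python
class FaultInjector:
    """Holds the current error-site sequence and records which sites were executed."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.executed = set()

    def error_probe(self, site_id):
        """Record that the site ran; return True if a fault should be injected here."""
        self.executed.add(site_id)
        return bool(self.sequence[site_id])


def alloc_buffer(size):
    """Stand-in for an allocation call that can fail in a real driver."""
    return bytearray(size)


def probe_device(fi, log):
    """Hypothetical driver init function with two instrumented error sites."""
    # Site 0: the allocation "fails" (returns None) when the probe says so.
    buf = None if fi.error_probe(0) else alloc_buffer(16)
    if buf is None:
        log.append("err_alloc")
        return -12                         # -ENOMEM
    # Site 1: the registration "fails" with a non-zero code when the probe says so.
    ret = -5 if fi.error_probe(1) else 0   # -EIO
    if ret != 0:
        log.append("err_register")
        log.append("free_buf")             # error handling code releases the buffer
        return ret
    log.append("ok")
    return 0


log = []
fi = FaultInjector((0, 1))                 # fail only site 1
print(probe_device(fi, log), log, sorted(fi.executed))
```

The set of executed sites recorded by the probe is the runtime information the mutation step needs: only sites that actually ran are worth flipping.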

Key Information

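When the authors first surveyed error handling code, they manually searched for goto statements and returned error codes, the two constructs commonly used for error handling in the Linux kernel.
The authors report that 51% of the sites that can trigger error handling code are related to occasional errors.
During mutation, only one error site's fail/no-fail state is changed at a time.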

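Questions

What is the point of fuzzing error-site sequences? If the error sites have already been identified, wouldn't failing all of them maximize coverage?
How is the automatic instrumentation implemented?
What is a double-lock bug?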
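
On the last question: a double-lock bug is acquiring a lock that the same execution path already holds, which deadlocks with a non-recursive lock. A minimal sketch of how a checker could spot this in a recorded trace follows. It assumes a single-threaded trace of `("lock", name)` and `("unlock", name)` events; `find_double_locks` and the trace format are hypothetical and not taken from the paper.

```python
def find_double_locks(trace):
    """Return (index, lock_name) for every acquisition of a lock that is already held."""
    held = set()
    bugs = []
    for i, (op, name) in enumerate(trace):
        if op == "lock":
            if name in held:               # acquired again without an unlock in between
                bugs.append((i, name))
            held.add(name)
        elif op == "unlock":
            held.discard(name)
    return bugs


# An error path that forgets it already holds lock "a" and takes it again.
buggy = [("lock", "a"), ("lock", "b"), ("unlock", "b"), ("lock", "a"), ("unlock", "a")]
clean = [("lock", "a"), ("unlock", "a"), ("lock", "a"), ("unlock", "a")]
print(find_double_locks(buggy), find_double_locks(clean))
```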

Improvements

Automatically identify error sites.

Purpose: Method: Significance: Results:
