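Title: Towards the Detection of Inconsistencies in Public Security Vulnerability Reports Author: Ying Dong Affiliation: University of Chinese Academy of Sciences (UCAS) Country: #China Year: #2019 Source: #USENIX conference Keywords: #information-extraction Code: pinkymm/inconsistency_detection: Towards the Detection of Inconsistencies in Public Security Vulnerability Reports (github.com) Note created: 2023-05-15 14:53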

Abstract

  • We propose an automated system, VIEM, to detect inconsistent information between the fully standardized NVD database and the unstructured CVE descriptions and their referenced vulnerability reports.
  • VIEM extracts vulnerable software names and vulnerable versions from unstructured text.
  • We customize deep-learning-based named entity recognition (NER) and relation extraction (RE) so that VIEM can recognize previously unseen software names and versions based on sentence structure and context.
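
To make the goal concrete, here is a minimal sketch of the final comparison step, assuming the extraction has already produced a mapping from software name to vulnerable versions for both the NVD entry and a referenced report. The function name `compare_versions`, the data shape, and the sample software name and versions are all invented for illustration; they are not taken from the paper or its repository.

```python
from typing import Dict, Set

# Hypothetical data shape: software name -> set of vulnerable version strings.
VersionMap = Dict[str, Set[str]]


def compare_versions(nvd: VersionMap, report: VersionMap) -> Dict[str, Dict[str, Set[str]]]:
    """Return, per software name, the versions that only one side lists."""
    diff: Dict[str, Dict[str, Set[str]]] = {}
    for name in sorted(set(nvd) | set(report)):
        nvd_versions = nvd.get(name, set())
        report_versions = report.get(name, set())
        only_nvd = nvd_versions - report_versions
        only_report = report_versions - nvd_versions
        # Record the software only when the two sources disagree.
        if only_nvd or only_report:
            diff[name] = {"only_in_nvd": only_nvd, "only_in_report": only_report}
    return diff


# Made-up example: the report lists one version that the NVD entry omits.
nvd_entry = {"examplesoft": {"1.0.1", "1.0.2"}}
report_entry = {"examplesoft": {"1.0.1", "1.0.2", "1.0.3"}}
result = compare_versions(nvd_entry, report_entry)
```

An empty result means the two sources agree; a non-empty one lists which versions each side is missing.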

Design

Named Entity Recognition Model

Uses a state-of-the-art NER model to identify the entities of interest, i.e., the names and versions of the vulnerable software, of its vulnerable components, and of the underlying software systems that the vulnerable software depends upon.

  • use a standard word embedding approach to encode each word as a vector representation.
  • use a Bi-GRU to encode the text at the character level.
  • use another Bi-GRU to assign a label to every word: SN for software name, SV for software version, O for other.
  • build a dictionary of 81,551 software names to rectify the output of the NER model.
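
The dictionary step can be sketched as a post-processing pass over the predicted labels. This is one plausible reading only: the notes do not say exactly how the dictionary is applied, so the rule here (relabel an O token as SN when it appears in the dictionary) is an assumption. The function name `rectify_labels`, the sample sentence, and the one-entry dictionary are made up.

```python
from typing import List, Set


def rectify_labels(tokens: List[str], labels: List[str], known_names: Set[str]) -> List[str]:
    """Relabel tokens tagged O as SN when they match a known software name.

    Assumed rule for illustration; the real system may apply the
    dictionary differently.
    """
    fixed = list(labels)
    for i, token in enumerate(tokens):
        if labels[i] == "O" and token.lower() in known_names:
            fixed[i] = "SN"
    return fixed


# Made-up sentence in which the model missed the software name.
tokens = ["Buffer", "overflow", "in", "ExampleSoft", "before", "2.4.1"]
predicted = ["O", "O", "O", "O", "O", "SV"]
dictionary = {"examplesoft"}  # stand-in for the 81,551-entry dictionary
rectified = rectify_labels(tokens, predicted, dictionary)
```

Matching is lowercased here so that capitalization differences do not cause misses; multi-word names would need span matching, which this sketch leaves out.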

Relation Extraction Model

Uses a Relation Extraction (RE) model to pair each identified software name with its corresponding versions.

  • step 1: encode the occurrences of the SN and SV entities, yielding a group of position embeddings that represent the relative distance from the current word to each of the two named entities in the same sentence.
  • step 2: generate word embeddings in the same way as the NER model, then append each group of position embeddings to the word embeddings, one group at a time.
  • step 3: pass each combination of word embeddings and position embeddings through an attention network, then use another attention network to predict which SN-SV pair is correct.
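
Step 1 can be illustrated with plain integers. The sketch below enumerates candidate SN-SV pairs in a sentence and computes, for one pair, each word's signed distance to the two entities. In a real model these offsets would index into a learned position embedding table, which is omitted here. The function names and the sample sentence are invented for illustration.

```python
from typing import List, Tuple


def candidate_pairs(labels: List[str]) -> List[Tuple[int, int]]:
    """List every (SN index, SV index) combination in one sentence."""
    names = [i for i, label in enumerate(labels) if label == "SN"]
    versions = [i for i, label in enumerate(labels) if label == "SV"]
    return [(n, v) for n in names for v in versions]


def relative_positions(num_tokens: int, sn_index: int, sv_index: int) -> List[Tuple[int, int]]:
    """For each word, return its signed distance to the SN and to the SV."""
    return [(i - sn_index, i - sv_index) for i in range(num_tokens)]


# Made-up sentence with two software names and two versions.
tokens = ["ExampleSoft", "1.2", "and", "OtherSoft", "3.0", "are", "affected"]
labels = ["SN", "SV", "O", "SN", "SV", "O", "O"]

pairs = candidate_pairs(labels)
# One group of position features per candidate pair, as in step 1.
features = {pair: relative_positions(len(tokens), *pair) for pair in pairs}
```

Each candidate pair gets its own group of distances, so the same sentence is encoded once per pair before the classifier decides which pairs are correct.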

Transfer Learning

Trains the aforementioned NER and RE models on vulnerability reports from one primary category, then transfers their capability to the other vulnerability categories.