Title: EVIL: Exploiting Software via Natural Language
Author: Pietro Liguori
Affiliation: University of Naples Federico II
Country: #Italy
Year: #2021
Source: #IEEE_ISSRE_CCFB
Keywords: #NLP_AGE
Code: dessertlab/EVIL: EVIL (Exploiting software VIa natural Language) is an approach to automatically generate software exploits in assembly/Python language from descriptions in natural language. The approach leverages Neural Machine Translation (NMT) techniques and a dataset that we developed for this work. (github.com)
Note created: 2023-05-15 22:29

Luckily this paper came out in 2021; had it been published this year, it would have had a hard time.

ABSTRACT

  • EVIL automatically generates exploits in assembly/Python from natural-language descriptions.
  • EVIL leverages Neural Machine Translation (NMT) techniques and a dataset that the authors developed for this work.

METHODOLOGY

image.png

  • pre-processing
    • tokenization
    • standardization: keeps non-English tokens from being transformed during the learning process
      • intent parser: takes the natural-language description (the intent) as input and outputs a dictionary of standardizable tokens, such as specific values, label names, and parameters
      • standardizer: takes the intent parser's output and replaces the selected tokens with symbolic values in both the intent and the snippet (steps 3 and 4 in Figure 1)
    • embedding
  • NMT models
    • Seq2Seq
      • bi-directional LSTM as the encoder
    • CodeBERT
  • Post-Processing
    • the inverse of standardization: it replaces each symbolic value with the real value
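
The standardization and post-processing steps above can be sketched as a round trip. This is a minimal illustration, not the EVIL implementation: the regex in `TOKEN_PATTERN`, the `var0`, `var1`, ... placeholder names, and the function names `parse_intent`, `standardize`, and `destandardize` are all assumptions made for this note, and the real intent parser is likely more elaborate.

```python
import re

# Assumed notion of a "standardizable" token: hex literals, decimal numbers,
# and underscore-prefixed label names. Illustrative only.
TOKEN_PATTERN = re.compile(r"0x[0-9a-fA-F]+|\b\d+\b|\b_\w+\b")


def parse_intent(intent):
    """Intent parser sketch: map each standardizable token to a placeholder."""
    mapping = {}
    for match in TOKEN_PATTERN.finditer(intent):
        token = match.group(0)
        if token not in mapping:
            mapping[token] = f"var{len(mapping)}"
    return mapping


def _swap(text, pairs):
    """Replace whole tokens only, longest first, so 'var1' never hits 'var10'."""
    for old, new in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        pattern = rf"(?<!\w){re.escape(old)}(?!\w)"
        text = re.sub(pattern, lambda _m, new=new: new, text)
    return text


def standardize(intent, snippet):
    """Standardizer sketch: apply the same replacements to intent and snippet."""
    mapping = parse_intent(intent)
    pairs = list(mapping.items())
    return _swap(intent, pairs), _swap(snippet, pairs), mapping


def destandardize(generated, mapping):
    """Post-processing sketch: put the real values back into the model output."""
    inverse = [(placeholder, token) for token, placeholder in mapping.items()]
    return _swap(generated, inverse)


# Hypothetical intent/snippet pair, made up for this note.
intent = "jump to the _exit label if al is not equal to 0x0b"
snippet = "cmp al, 0x0b\njne _exit"

std_intent, std_snippet, mapping = standardize(intent, snippet)
restored = destandardize(std_snippet, mapping)
```

Here `std_intent` and `std_snippet` are what the NMT model would see during training, while `destandardize` is applied to the model's output at inference time. Matching whole tokens rather than substrings keeps a short value such as `1` from being rewritten inside a longer one such as `0x11`.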

Although it can translate natural language into code, the natural-language descriptions it requires are extremely detailed.