Paper

Authors: Jesia Yuki 1; Mohammadhossein Amouei 1; Benjamin C. M. Fung 1; Philippe Charland 2 and Andrew Walenstein 3

Affiliations: 1 School of Information Studies, McGill University, Montreal, QC, Canada; 2 Mission Critical Cyber Security Section, Defence R&D Canada, Quebec, QC, Canada; 3 BlackBerry Limited, Waterloo, ON, Canada

Keyword(s): Assembly Code, Reverse Engineering, CodeBERT, Transformers, Code Summarization.

Abstract: This study explores the field of software reverse engineering through the lens of code summarization, which involves generating informative and concise summaries of code functionality. A significant aspect of this research is the application of assembly code summarization in malware analysis, highlighting its critical role in understanding and mitigating potential security threats. Although there have been recent efforts to develop code summarization techniques for high-level programming languages, to the best of our knowledge, this study is the first attempt to generate comments for assembly code. For this purpose, we first built a carefully curated dataset of assembly function-comment pairs. We then focused on automatic assembly code summarization using transfer learning with pre-trained natural language processing (NLP) models, including BERT, DistilBERT, RoBERTa, and CodeBERT. The results of our experiments show a notable advantage of CodeBERT: despite its initial training on high-level programming languages alone, it excels in learning assembly language, outperforming other pre-trained NLP models.

CC BY-NC-ND 4.0

Paper citation in several formats:
Yuki, J., Amouei, M., Fung, B. C. M., Charland, P. and Walenstein, A. (2024). AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code. In Proceedings of the 19th International Conference on Software Technologies - ICSOFT; ISBN 978-989-758-706-1; ISSN 2184-2833, SciTePress, pages 35-45. DOI: 10.5220/0012761400003753

@conference{icsoft24,
author={Jesia Yuki and Mohammadhossein Amouei and Benjamin C. M. Fung and Philippe Charland and Andrew Walenstein},
title={AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code},
booktitle={Proceedings of the 19th International Conference on Software Technologies - ICSOFT},
year={2024},
pages={35-45},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012761400003753},
isbn={978-989-758-706-1},
issn={2184-2833},
}

TY - CONF

JO - Proceedings of the 19th International Conference on Software Technologies - ICSOFT
TI - AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code
SN - 978-989-758-706-1
IS - 2184-2833
AU - Yuki, J.
AU - Amouei, M.
AU - Fung, B. C. M.
AU - Charland, P.
AU - Walenstein, A.
PY - 2024
SP - 35
EP - 45
DO - 10.5220/0012761400003753
PB - SciTePress
