AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code

Jesia Yuki; Mohammadhossein Amouei; Benjamin C. M. Fung; Philippe Charland; Andrew Walenstein

Topics: Automated Software Engineering; Cybersecurity Technologies; Data Mining and Data Analysis; Natural Language Technologies; Software Engineering Tools

In Proceedings of the 19th International Conference on Software Technologies (ICSOFT) - Volume 1, pages 35-45, 2024, Dijon, France

Authors: Jesia Yuki ¹; Mohammadhossein Amouei ¹; Benjamin C. M. Fung ¹; Philippe Charland ² and Andrew Walenstein ³

Affiliations: ¹ School of Information Studies, McGill University, Montreal, QC, Canada; ² Mission Critical Cyber Security Section, Defence R&D Canada, Quebec, QC, Canada; ³ BlackBerry Limited, Waterloo, ON, Canada

Keyword(s): Assembly Code, Reverse Engineering, CodeBERT, Transformers, Code Summarization.

Abstract: This study explores the field of software reverse engineering through the lens of code summarization, which involves generating informative and concise summaries of code functionality. A significant aspect of this research is the application of assembly code summarization in malware analysis, highlighting its critical role in understanding and mitigating potential security threats. Although there have been recent efforts to develop code summarization techniques for high-level programming languages, to the best of our knowledge, this study is the first attempt to generate comments for assembly code. For this purpose, we first built a carefully curated dataset of assembly function-comment pairs. We then focused on automatic assembly code summarization using transfer learning with pre-trained natural language processing (NLP) models, including BERT, DistilBERT, RoBERTa, and CodeBERT. The results of our experiments show a notable advantage of CodeBERT: despite its initial training on high-level programming languages alone, it excels in learning assembly language, outperforming other pre-trained NLP models.

CC BY-NC-ND 4.0

Paper citation in several formats:

Yuki, J., Amouei, M., Fung, B. C. M., Charland, P. and Walenstein, A. (2024). AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code. In Proceedings of the 19th International Conference on Software Technologies - ICSOFT; ISBN 978-989-758-706-1; ISSN 2184-2833, SciTePress, pages 35-45. DOI: 10.5220/0012761400003753

@conference{icsoft24,
author={Jesia Yuki and Mohammadhossein Amouei and Benjamin C. M. Fung and Philippe Charland and Andrew Walenstein},
title={AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code},
booktitle={Proceedings of the 19th International Conference on Software Technologies - ICSOFT},
year={2024},
pages={35-45},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012761400003753},
isbn={978-989-758-706-1},
issn={2184-2833},
}

TY - CONF
JO - Proceedings of the 19th International Conference on Software Technologies - ICSOFT
TI - AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code
SN - 978-989-758-706-1
IS - 2184-2833
AU - Yuki, J.
AU - Amouei, M.
AU - Fung, B. C. M.
AU - Charland, P.
AU - Walenstein, A.
PY - 2024
SP - 35
EP - 45
DO - 10.5220/0012761400003753
PB - SciTePress