Analyzing Dataset (Email) by Using Classification Approach to Authorship Identification

Authors

  • Khaled Ahmed Adbeib, Department of Computer Science, Faculty of Education, Bani Waleed University, Libya

Keywords:

RapidMiner, K-Nearest Neighbor (K-NN), Authorship Analysis, Stylometric Features, Classification Analysis, Authorship Categorization

Abstract

With the widespread adoption of internet technologies and applications, the misuse of email for illicit purposes has become a significant concern. Authorship identification is a text analysis task that determines the most likely author of a given document. Applied to email, it attributes a message to its originator based on writing style, word choice, and other linguistic features. Classification analysis is a common approach to authorship identification: a machine learning model is trained on texts by known authors and then used to predict the authorship of unknown texts.
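The classification idea can be illustrated with a minimal sketch. The keywords name K-Nearest Neighbor (K-NN), so the sketch uses a plain K-NN majority vote, but it is not the study's actual pipeline: the function name `knn_predict`, the two-feature vectors, the author labels, and all numbers are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority author among the k training vectors nearest to query."""
    # train is a list of (feature_vector, author_label) pairs.
    # Euclidean distance is an assumption; other metrics are possible.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors: (average sentence length, type-token ratio).
train = [
    ((12.0, 0.62), "author_a"),
    ((11.5, 0.60), "author_a"),
    ((13.0, 0.65), "author_a"),
    ((24.0, 0.41), "author_b"),
    ((22.5, 0.44), "author_b"),
    ((25.0, 0.39), "author_b"),
]

# Predict the author of an unseen email from its feature vector.
predicted = knn_predict(train, (12.4, 0.61), k=3)
print(predicted)
```

In this toy setup the features are left unscaled, so the sentence-length value dominates the distance; a real experiment would likely normalize each feature before measuring distances.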

The anonymity of email makes senders hard to trace, which compounds the problem. Cybercriminals use the internet for activities ranging from simple spamming to sophisticated phishing attacks, and authorship analysis is an important countermeasure against such activity. This study addresses authorship identification on a dataset of emails, with the aim of determining whether an anonymous email was written by a given suspect [1].

The primary objective of this project is to identify the authors of anonymous emails using stylometric features such as vocabulary richness, sentence length, and writing style. By examining a dataset of emails with known authors, the study aims to distinguish and confirm the authorship of anonymous emails. Authorship analysis has proven effective both in countering illegal cyber activities and in revealing the true authors of anonymous emails. This research contributes to ongoing efforts to strengthen cybersecurity and to address the misuse of online communication.
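As a rough illustration of what stylometric features can look like in practice, the sketch below computes three simple measures from an email body. The function name `stylometric_features`, the choice of type-token ratio as a stand-in for vocabulary richness, and the regular-expression tokenization are all assumptions made here, not details taken from the study.

```python
import re

def stylometric_features(text):
    """Compute a few simple stylometric features for one email body."""
    # Split on runs of sentence-ending punctuation (a rough heuristic).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters and apostrophes as words, case-insensitively.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {"avg_sentence_length": 0.0,
                "type_token_ratio": 0.0,
                "avg_word_length": 0.0}
    return {
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Distinct words divided by total words (vocabulary richness proxy).
        "type_token_ratio": len(set(words)) / len(words),
        # Mean number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }

sample = "Please send the report. Please send it today!"
features = stylometric_features(sample)
print(features)
```

Each email would yield one such feature dictionary, and the values could then be arranged as a numeric vector for a classifier. Type-token ratio is sensitive to text length, so comparing emails of very different sizes may call for a length-adjusted measure.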

Published

2023-11-30

How to Cite

Khaled Ahmed Adbeib. (2023). Analyzing Dataset (Email) by Using Classification Approach to Authorship Identification. North African Journal of Scientific Publishing (NAJSP), 1(4), 100–108. Retrieved from https://najsp.com/index.php/home/article/view/98

Issue

Section

Applied and Natural Sciences