Email is one of the most widely used forms of written communication over the Internet, and its use has increased tremendously for both personal and professional purposes. The increase in email traffic also comes with an increase in the use of email for illegitimate purposes to commit all sorts of crimes. Phishing, spamming, email bombing, threats, cyberbullying, racial vilification, child pornography, virus and malware propagation, and sexual harassment are common examples of email abuse. Terrorist groups and criminal gangs also use email systems as a safe channel for their communication. The alarming increase in the number of cybercrime incidents involving email is mostly due to the fact that email can be easily anonymized. The problem of email authorship attribution is to identify the most plausible author of an anonymous email from a group of potential suspects. Most previous contributions employed traditional classification approaches, such as decision trees and Support Vector Machines (SVMs), to identify the author and studied the effects of different writing style features on classification accuracy. However, little attention has been given to ensuring the quality of the evidence. In this work, we introduce an innovative data mining method that captures the write-print of every suspect and models it as combinations of features that occur frequently in the suspect's emails. This notion is called a frequent pattern; it has proven effective in many data mining applications but has not been applied to the problem of authorship attribution. Unlike traditional approaches, the write-print extracted by our method is unique among the suspects and therefore provides convincing and credible evidence for presentation in a court of law. Experiments on real-life emails suggest that the proposed method can effectively identify the author and that the results are supported by strong evidence.
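
To make the notion of a frequent-pattern write-print concrete, the following is a minimal sketch in Python and not the implementation evaluated in this work. It assumes each email has already been reduced to a set of discretized stylometric features. The feature names, the 0.6 support threshold, the brute-force pattern enumeration, and the match-count scoring are all illustrative assumptions. A suspect's write-print is taken to be the feature combinations that are frequent in that suspect's emails and not frequent for any other suspect.

```python
from itertools import combinations


def frequent_patterns(emails, min_support):
    """Return feature combinations found in at least min_support of the emails.

    Brute-force enumeration, adequate only for small illustrative feature sets.
    """
    counts = {}
    for features in emails:
        items = sorted(features)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] = counts.get(combo, 0) + 1
    threshold = min_support * len(emails)
    return {frozenset(c) for c, n in counts.items() if n >= threshold}


def write_prints(emails_by_suspect, min_support=0.6):
    """Keep, per suspect, only the frequent patterns no other suspect shares."""
    frequent = {
        s: frequent_patterns(e, min_support) for s, e in emails_by_suspect.items()
    }
    prints = {}
    for suspect, patterns in frequent.items():
        others = set().union(*(p for s, p in frequent.items() if s != suspect))
        prints[suspect] = patterns - others
    return prints


def attribute(anonymous_features, prints):
    """Score each suspect by how many write-print patterns the email contains."""
    scores = {
        s: sum(1 for p in pats if p <= anonymous_features)
        for s, pats in prints.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical, hand-made feature sets; real features would be extracted from text.
emails_by_suspect = {
    "A": [
        {"short_sentences", "no_greeting", "many_commas"},
        {"short_sentences", "no_greeting"},
        {"short_sentences", "many_commas", "no_greeting"},
    ],
    "B": [
        {"long_sentences", "formal_greeting"},
        {"long_sentences", "formal_greeting", "many_commas"},
        {"long_sentences", "many_commas"},
    ],
}

prints = write_prints(emails_by_suspect)
author, scores = attribute({"short_sentences", "no_greeting", "long_words"}, prints)
print(author, scores)
```

In this toy data the pattern `{many_commas}` is frequent for both suspects, so it is dropped from both write-prints. The patterns that remain are the ones that could be shown as evidence, because each of them points to exactly one suspect.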

