This example interrogates the Enron e-mail corpus, which contains about 500,000 e-mail messages from about 150 users, mostly senior Enron management, organised into folders.
In the first example we took approximately 3,000 messages from one Enron employee and processed their content to extract the key topics. Part of the resulting analysis is displayed here.
Rows correspond to individual messages and column headings to the key topics. In this heat map the red columns quickly reveal the most frequently discussed subject matter: in this case Enron, trade, globe, risk, management, operation, business, transaction, London and Houston.
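As a rough illustration of the message-by-topic matrix behind such a heat map, the sketch below counts how often each candidate topic term appears in each message and then totals each column. It is not the method used for the analysis above: the stop-word list, the function names, the choice of document frequency to pick topics, and the three toy messages are all assumptions made for illustration only.

```python
from collections import Counter
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "for", "is"}


def tokenize(text):
    """Lower-case a message and return its alphabetic tokens, minus stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def topic_matrix(messages, n_topics=10):
    """Return (topics, rows) where rows[i][j] counts topic j in message i."""
    counts = [Counter(tokenize(m)) for m in messages]
    # Document frequency: in how many messages does each term occur?
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    # Keep the most widely used terms; ties are broken alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    topics = [term for term, _ in ranked[:n_topics]]
    rows = [[c[t] for t in topics] for c in counts]
    return topics, rows


def column_totals(topics, rows):
    """Sum each topic column; high totals correspond to the 'hot' columns."""
    return {t: sum(r[j] for r in rows) for j, t in enumerate(topics)}


# Toy messages invented for this sketch; they are not taken from the corpus.
messages = [
    "Risk management review for the London trade desk",
    "Houston trade operation update",
    "Trade risk report for Houston",
]
topics, rows = topic_matrix(messages, n_topics=3)
totals = column_totals(topics, rows)
```

Each row of `rows` can then be drawn as one line of the heat map, with cell colour scaled by the count. A production analysis would more likely use stemming and a weighting scheme rather than raw counts.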