Ride the Lightning

Cybersecurity and Future of Law Practice Blog
by Sharon D. Nelson Esq., President of Sensei Enterprises, Inc.

EDRM's Enron PST Data Set Cleansed of Personal Information

May 16, 2013

Yesterday, I received a press release from Nuix (and a similar release was sent out by EDRM) saying that Nuix and EDRM had republished the EDRM Enron PST Data Set after cleansing it of private, health and personal financial information.

A portion of the Nuix release said:

"The EDRM Enron data set is an industry-standard collection of email data that the legal profession has used for many years for eDiscovery training and testing. However, it was well known to contain large amounts of personal information about the company’s former employees."

The only part of that paragraph I quibble with is "it was well known." It was certainly well known to those who used the data and to certain others in the EDD sector. But as this blog has indicated in previous posts, the extent of personal information in the data set was unknown to many.

Nonetheless, I applaud the Nuix folks and EDRM for cleansing the data set of more than 10,000 e-mails and attachments containing such things as credit card numbers, social security numbers, dates of birth and other personal information.

To download the cleansed data set and the case study that explains the methodology used, visit here.

Nuix will host a Twitter chat to discuss the release of the cleansed EDRM Enron PST Data Set on Thursday, May 23rd, 2pm–3pm ET. Its experts will describe the process of identifying unsecured financial, health and personally identifiable information in corporate data. You can follow the hashtag #NuixChat and send in your questions beforehand to @nuix.