InBoxer's Unique, Powerful Language Technology
When most people first try to find harassing email messages within a large body of messages, they usually start by searching for dirty words and phrases. They quickly realize that additional types of words and phrases, such as ethnic slurs, need to be added. Eventually, though, four problems emerge:
They cannot think of every possible word and phrase combination.
They realize that some offensive words also have non-offensive meanings. As a result, the search returns many messages that are not actually harassment.
They discover that as the list grows, so does the processing time needed to compare each message against it.
They discover that some messages that do not have offensive words could be used as evidence of a hostile work environment.
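The second and fourth problems can be illustrated with a minimal sketch of a word-list search. The word list and the function name `flag_by_word_list` are hypothetical and chosen only for illustration; real lists are far longer.

```python
import re

# Hypothetical word list for illustration only; real lists run to thousands of entries.
WORD_LIST = {"damn", "hell", "screw"}

def flag_by_word_list(message, word_list=WORD_LIST):
    """Return the listed words that appear in the message (case-insensitive)."""
    tokens = set(re.findall(r"[a-z']+", message.lower()))
    return tokens & word_list

# False positive: "screw" has an innocent meaning in this sentence.
false_positive = flag_by_word_list("Please tighten the screw on the server rack.")

# Miss: no listed word appears, so the word list reports nothing.
miss = flag_by_word_list("She says she is tired of always having her legs in the air.")

print(false_positive, miss)
```

The first message is flagged even though it is harmless, and the second, taken from the Enron conversation shown later on this page, passes unnoticed.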
InBoxer approaches the problem differently, using an advanced, proprietary technique to find potentially harassing email. The methods are based primarily on statistical language models. InBoxer assembled tens of thousands of emails from many companies and categorized them, then built statistical models of the emails to find which words and other elements appear more often in risky messages and which appear more often in messages that are not risky. To analyze a new email, the InBoxer Anti-Risk Appliance compares it to the language models and performs a complex analysis to determine whether it is potentially harassing.
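InBoxer's models are proprietary, so the following is not the product's method. It is a generic textbook sketch of the same general idea: count how often each word appears in hand-labeled risky and non-risky messages, then score a new message by summing per-word log-odds with add-one smoothing. The function names and the tiny training set are assumptions made up for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(risky_msgs, ok_msgs):
    """Count word occurrences in each hand-labeled category."""
    risky = Counter(t for m in risky_msgs for t in tokenize(m))
    ok = Counter(t for m in ok_msgs for t in tokenize(m))
    vocab = set(risky) | set(ok)
    return risky, ok, vocab

def risk_score(message, model):
    """Sum of per-word log-odds (add-one smoothed); a score above 0 leans risky."""
    risky, ok, vocab = model
    risky_total = sum(risky.values()) + len(vocab)
    ok_total = sum(ok.values()) + len(vocab)
    score = 0.0
    for tok in tokenize(message):
        if tok not in vocab:
            continue  # unseen words carry no evidence either way
        p_risky = (risky[tok] + 1) / risky_total
        p_ok = (ok[tok] + 1) / ok_total
        score += math.log(p_risky / p_ok)
    return score

# Toy, made-up training data; a real model would need thousands of labeled emails.
model = train(
    risky_msgs=["did you hear the joke about the blonde",
                "forward this joke to the guys"],
    ok_msgs=["please review the attached contract",
             "the meeting moved to friday"],
)
print(risk_score("another blonde joke", model) > 0)
print(risk_score("review the contract before the meeting", model) < 0)
```

Unlike a word list, this kind of scorer needs no explicitly offensive term: any word that is statistically more common in the risky category contributes evidence. The sketch ignores class priors and every non-word element of a message.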
To demonstrate, InBoxer analyzed 500,000 messages sent and received by executives and professionals at Enron Corporation.
The Enron message below is similar to those that are common in harassment cases. It contains a potentially offensive joke. The InBoxer Anti-Risk Appliance correctly identified this email as one that could be used as evidence to support a hostile work environment claim. Other techniques would not have identified this message because it does not contain any offensive words.
|
FROM: E*********@ENRON
TO: R********/Corp/Enron@ENRON, K*******/HOU/ECT@ECT
DATE = 03/06/2001
TIME : 09:14:00
SUBJECT : Leaving Early...
Three women all worked in the same office with the same female boss. Each day, they noticed the boss left work early. One day, the women decided that, when the boss left, they would leave right behind her.
The brunette was thrilled to be home early. She did a little gardening, spent playtime with her son, and went to bed early. The redhead was elated to be able to get in a quick workout at the spa before meeting a dinner date.
The blonde was happy to get home early and surprise her husband, but when she got to her bedroom, she heard a muffled noise from inside. Slowly and quietly, she cracked open the door and was mortified to see her husband in bed with her boss! Gently, she closed the door and crept out of her house.
The next day, at their coffee break, the brunette and redhead planned to leave early again, and they asked the blond if she was going to go with them. “No way,” the blonde exclaimed. “I almost got caught yesterday!” |
Figure 1. One of many jokes circulated via Enron corporate e-mail. Note: Names were removed.
Below is an email discussion from the Enron email collection. The two participants may not believe the content to be offensive, as it does not contain any dirty words or slurs.
However, this message could be offensive to many people. It could also provide supporting evidence in a case that does not involve the sender or recipient of the message. An attorney may discover the message in an email search. It could then be used as an example of the prevailing attitudes towards women, women who wish to become pregnant, or women who have children.
|
FROM : B*******
06/09/2000 02:28 PM
To: R******/HOU/ECT@ECT
Subject: Re: Kids
Either they have spare time or they are doing it in their sleep. I really don’t want to think of anyone I know here working on having babies. I say that and yet I know Tracy is trying to get pg. She says she is tired of always having her legs in the air. I know she doesn’t have any spare time.
Maybe she utilizes her time by doing two things at once. Like eating dinner and you know...... Or like, heck I don’t know. My brain is mush. See ya. B
FROM: R*****
06/09/2000 12:54 PM
To: B*****/HOU/ECT@ECT
Subject: Re: Kids
You mean there are Enron employees with spare time??
FROM: B*****
06/09/2000 10:57 AM
To: Robin R/HOU/ECT@ECT
Subject: Kids
Are you here yet?
There are thousands of kids here today. They are in every nook and cranny. Dang, I’ll be glad to get out of here today. Are there thousands of kids on your floor too? We now know what Enron employees do in their spare time!! B |
Figure 2. A conversation that does not contain dirty words, but might support a hostile work environment claim.
As with the other example, systems that depend upon lexicons or word lists would not detect this message. The InBoxer Anti-Risk Appliance gave it a high ranking as potentially inappropriate mail.
When evaluating products, be sure to look beyond the claims. Be especially skeptical of companies that say they have spent years working on lexicons, word lists, and phrases. Ask for proof that the solution would catch these examples and others like them.
One way to test products is to use the examples above or others that you might find at www.EnronEmail.com. Ask vendors to run these messages through their systems and examine the results.
InBoxer Domains (Partial List)
Personal, identity theft, and PHI (Protected Health Information)
Medical terminology for HIPAA
Social Security numbers
Driver's license numbers
Credit card numbers
Customer or patient lists
Account Numbers (patterns you define)
Risky recipient email addresses
External domains
Free mail services
Competitive domains
Addresses from a list (for "Chinese Wall" applications)
Personal versus business mail
May not need to be archived
Indicator of wasting time or resources |
|
Custom parameters
Exact or partial matches for words and phrases
Matches from match lists (customer lists, patient lists, etc.)
Partial match using “regular expressions” for complex text that fits a particular pattern, such as bank account numbers
Offensive content / Hostile work environment (Harassment)
Adult content
Hate mail
Profanity
Racial, ethnic or religious slurs
Attachment types
Images, videos, or music
Spreadsheet, word processing, presentation, and database
ZIP files
By type, size, or name

|
|
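The pattern-based domains listed above (Social Security numbers, credit card numbers, and regular-expression matches for account numbers) can be sketched as follows. This is not InBoxer's implementation. The formats are assumptions: a US Social Security number written as 123-45-6789, and a card number of 13 to 16 digits with optional spaces or hyphens, confirmed with the standard Luhn checksum. The names `SSN_RE`, `CARD_RE`, `luhn_valid`, and `find_sensitive` are hypothetical.

```python
import re

# Assumed formats, for illustration only.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

def luhn_valid(digits):
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_sensitive(text):
    """Return (kind, matched_text) pairs for SSN-like and card-like strings."""
    hits = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if luhn_valid(digits):  # drop digit runs that fail the checksum
            hits.append(("card", m.group()))
    return hits

sample = "SSN 123-45-6789, card 4111 1111 1111 1111, ref 1234 5678 9012 3456"
print(find_sensitive(sample))
```

The checksum step matters because a bare digit pattern also matches order numbers and reference codes. In the sample, the 16-digit reference number fits the pattern but fails the Luhn check, so it is not reported.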