Far from being a mere annoyance, e-mail spam is beginning to have more serious effects. Penn State students and faculty, however, are beginning to fight back.
Gerald Santoro, assistant professor of information sciences and technology, said there are various ways to block or filter out unwanted e-mail messages. One of the most common techniques is to identify terms in the messages' subject lines or header information. Messages containing the term "Viagra," for example, will most likely be spam.
"Filters can easily be set up to grab those things and move them to another location, such as the trash," Santoro said.
A problem with a keyword-based technique, however, is that a filter set to block the word "sex" may also block messages that mention Middlesex, Pa. Blocking a legitimate message this way is called a false positive. To catch such errors, Santoro said, students should periodically check the e-mail messages that have been filtered out to make sure nothing legitimate was blocked.
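The Middlesex problem can be illustrated with a minimal sketch. The keyword list and function names below are hypothetical, chosen only for illustration, and do not describe any particular mail program. The first function flags a subject line if a blocked term appears anywhere in it; the second flags it only if the term appears as a separate word.

```python
import re

# Hypothetical keyword list, for illustration only.
BLOCKED_TERMS = ["sex", "viagra"]

def substring_filter(subject):
    """Flag a subject line if a blocked term appears anywhere inside it."""
    lowered = subject.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

def whole_word_filter(subject):
    """Flag a subject line only if a blocked term appears as a separate word."""
    words = re.findall(r"[a-z0-9]+", subject.lower())
    return any(term in words for term in BLOCKED_TERMS)

subject = "Directions to Middlesex, Pa."
print(substring_filter(subject))   # the naive match flags this legitimate message
print(whole_word_filter(subject))  # the whole-word match lets it through
```

Matching whole words avoids this particular false positive, but it is also easier for a spammer to evade with a deliberate misspelling, which is one reason checking the filtered messages remains worthwhile.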
A cat-and-mouse game is taking place between spammers and those trying to stop them, Santoro said. "People who are sending spam are getting good at fooling spam software," he said.
One particularly devious technique for sending spam is to make it appear that the message comes from someone you already know. To do that, the message scans your computer's mailbox to look for people with whom you correspond.
Santoro said he gets about 400 e-mail messages daily, about 100 of which are spam. He estimates that just deleting his spam messages takes him 15 minutes every day. "Each note takes a precious bit of your time before you delete it," he said, adding that spam can be a major productivity drain.
One way to reduce the chance that your outgoing messages are marked as spam is to write specific, relevant subject lines. A vague subject line such as "The information you requested," for example, may be marked as spam.
What makes spam particularly hard to stop, Santoro said, is that it is often very hard to define exactly what spam is.
"Spam is in the eye of the beholder," he said.
Two different people receiving the exact same message may have differing opinions over whether it is spam or not. E-mail messages from an academic department at Penn State may be welcomed by one student and immediately deleted by another.
What is needed is a way for spam filters to more efficiently learn what types of messages the user wishes to receive, Santoro said.
"It's an artificial intelligence (AI) issue," he said.
Santoro said a combination of anti-spam legislation and advancements in technology will ultimately be required to solve the problem. Legislation, however, may not come easily.
"Anti-spam legislation is running into the same problems as the do-not-call list. How do you restrict it without restricting free speech?" Santoro said.
Even if legislation is passed, spammers can simply move their operations overseas and continue to send messages to the United States, he said.
Kevin Morooney, senior director for academic services and emerging technology in Information Technology Services (ITS), said the amount of spam students receive is based largely on their e-mail habits and how often their e-mail address appears on the Web.
ITS doesn't look at the e-mail messages that students send and receive, Morooney said, but estimates that between 20 and 40 percent of student e-mail messages are spam.
"We get lots of complaints," he said, "but spam is a distributed problem. It happens globally."
A new spamming technique comes out every two weeks or so, Morooney said, and ITS is constantly scrambling to block it.
"We call it the spam arms race," he said.
For example, one of the latest spamming trends is to hijack many machines and have each one send a few spam messages, rather than having a few machines send many messages. Thousands of smaller attacks are harder to trace, he said.
"We have spent a tremendous amount of energy just to keep things running. All you can hope to do is reduce it," Morooney said.
The best way to reduce spam is to use an e-mail client with built-in spam filtering. The e-mail programs Microsoft Outlook, Eudora, and Mozilla use learning mail filters, which attempt to guess what is spam based on the kinds of words in a message and the way it is presented.
Duncan Fong, professor of statistics, said a new technique called Bayesian filtering may provide a solution. Bayesian statistics uses probability to deal with uncertainty. By analyzing how often certain words appear together in messages, a filter can estimate the likelihood that a message is spam. He said the Bayesian approach is popular in various fields including AI, and would definitely be helpful for blocking unwanted e-mail messages.
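A rough sketch of the idea follows. It uses a naive Bayes classifier, a common simplification that treats each word as independent rather than modeling word combinations directly, so it is one possible form of Bayesian filtering and not necessarily the one Fong has in mind. The class name, the tiny training set and the "spam"/"ham" labels are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Split a message into lowercase words."""
    return text.lower().split()

class NaiveBayesSpamFilter:
    """Toy naive Bayes filter; train on at least one message of each label."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Start from the prior: how common this label was in training.
            log_score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in vocab:
                    continue  # words never seen in training carry no evidence
                # Add-one (Laplace) smoothing avoids zero probabilities.
                count = self.word_counts[label][word] + 1
                log_score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = log_score
        # Normalize the two log scores into a probability of spam.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)

spam_filter = NaiveBayesSpamFilter()
spam_filter.train("cheap viagra offer", "spam")
spam_filter.train("viagra offer now", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("lecture notes for class", "ham")
print(spam_filter.spam_probability("cheap viagra"))
print(spam_filter.spam_probability("lecture notes"))
```

Because each user supplies the training messages, two people can end up with different filters for the same mail, which fits Santoro's point that spam is in the eye of the beholder.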
ITS is exploring using a system where e-mail would be scored based on the possibility that it is spam before it reaches the students. Morooney stressed that this does not involve having ITS read the e-mail.
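One way such scoring could work, sketched here purely as an assumption and not as a description of what ITS is building, is for the mail server to attach a numeric score to each message as a header and leave the decision about what to do with it to the recipient's mail program. The term weights and the "X-Spam-Score" header name below are hypothetical.

```python
from email.message import EmailMessage

# Hypothetical weights, for illustration only.
SUSPICIOUS_TERMS = {"viagra": 0.6, "free": 0.2, "winner": 0.3}

def score_message(msg):
    """Return a score from 0.0 to 1.0 based on terms in the subject line."""
    subject = (msg["Subject"] or "").lower()
    score = sum(weight for term, weight in SUSPICIOUS_TERMS.items()
                if term in subject)
    return min(score, 1.0)

def tag_message(msg):
    """Attach the score as a header without altering or dropping the message."""
    msg["X-Spam-Score"] = f"{score_message(msg):.2f}"
    return msg

message = EmailMessage()
message["Subject"] = "Free Viagra"
message.set_content("...")
tag_message(message)
print(message["X-Spam-Score"])
```

In this sketch the score is computed automatically and the message is still delivered, so a student could choose a personal cutoff, much like setting a junk filter to a higher or lower level.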
Such a system would be particularly useful for students using Penn State WebMail, since it currently lacks built-in spam filtering. ITS' research shows that half of students use WebMail and an e-mail client each week.
"I get tons of spam. It's annoying that there's nothing to do about that in WebMail. With Hotmail I just keep my junk filter set to the highest level," said Kevin Walker (sophomore-American studies).
Morooney also said that spam is a growing problem. "It's gone beyond vandalism. The infrastructure of the Internet itself is challenged every day. We need help from lawmakers," he said.

