Spam Four Ways: Making Sense of Text Data

The world is full of data, and much of it is unstructured. An example is text data, which forms a critical part of our lives through books, magazines, and the internet. Surprisingly, despite the key role of language arts in all aspects of education, text analysis has not traditionally played a major part in statistical education.

While there are many interesting literary analyses one might consider, we explore a more mundane but familiar example: deciding whether a text string taken from an email subject line is spam (an unwanted or inappropriate email message).
