SOC 553 Introduction to Text Mining and Statistical Natural Language Processing

This course introduces the statistical processing of natural language texts, in particular counting words and phrases and measuring the associations between them with correlations and other statistics. Goals of text mining include document classification, information retrieval, source authentication, and stylistic categorization. Typical document sources are newspaper stories, email captures, and Internet pages, as well as collections of non-fiction and fiction such as the Federalist Papers and Edgar Allan Poe's short stories.

Credits

3

Prerequisite

Graduate student or at least junior standing

Distribution

Computer Science Program

Offered

Fall Semester, Spring Semester