The past few years have witnessed the rapid rise of social media Web sites such as Flickr, del.icio.us, YouTube, Myspace, and Facebook, as well as the proliferation of “mashup” applications created when users combine services from multiple sources. These sites contain user-generated content in various forms, from plain text to rich multimedia. In fact, most publicly available text content created during the next 24 hours will be generated by end users, rather than professional writers, journalists, corporate communications departments, or others whose job it is to create and publish content. Furthermore, end users will generate two orders of magnitude more text than that, sending it privately to other users through communication channels such as email. The emergence of user content as the dominant content form on the Web raises various questions about the most effective approach to processing it.