The Yahoo Webscope Program is a reference library of interesting and scientifically useful datasets for non-commercial use by academics and other scientists. Datasets fall into several categories, including Advertising, Competition data (Learning to Rank), Computing (Hardware Usage/File Access), Social (Search marketing), Images (Flickr), Language (Ngrams, Wikipedia Data), and Ratings and Classification (music, movies, web pages, click logs, news, images). It also includes a very large Yahoo News Feed dataset containing ~110B events (13.5TB uncompressed) of anonymized user-news item interaction data.
The datasets are available only for academic use by faculty and university researchers who agree to the Data Sharing Agreement. Commercial use is not permitted.