LargeDataSets Data
AHPCRC Research Project at CSU Stanislaus: links to useful data sets
SNAP Library: Stanford Network Analysis Platform -- data sets from many kinds of real-world networks
Data gathered by the U.S. federal government -- data sets on a wide range of subjects
Lahman Baseball Database -- baseball statistics (found by Juan)
The Modesto Bee's collection of maps based on recent Census data
Microsoft Learning to Rank Datasets: research data sets extracted from Bing queries, for use in learning to rank web query results (data sets courtesy of Microsoft)
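
As a starting point for working with the learning to rank data, here is a minimal sketch of parsing one record. It assumes the files use the SVMlight-style layout common to learning to rank collections (a relevance label, a `qid:<id>` field, then `index:value` feature pairs); check the documentation that ships with the download before relying on it. The function name `parse_ltr_line` and the sample line are made up for illustration and are not taken from the data sets.

```python
from typing import Dict, Tuple


def parse_ltr_line(line: str) -> Tuple[int, str, Dict[int, float]]:
    """Parse one 'label qid:<id> <index>:<value> ...' record.

    Assumed layout (SVMlight style); not confirmed by this page.
    """
    # Drop an optional trailing '# comment', then split on whitespace.
    tokens = line.split("#", 1)[0].split()
    label = int(tokens[0])

    key, qid = tokens[1].split(":", 1)
    if key != "qid":
        raise ValueError(f"expected a qid field, got {tokens[1]!r}")

    # Remaining tokens are sparse features: feature index -> value.
    features: Dict[int, float] = {}
    for tok in tokens[2:]:
        idx, val = tok.split(":", 1)
        features[int(idx)] = float(val)
    return label, qid, features


# Hypothetical record, for illustration only.
sample = "2 qid:10 1:0.5 2:0.0 3:1.25"
label, qid, features = parse_ltr_line(sample)
print(label, qid, features)
```

Grouping the parsed records by `qid` gives one list of candidate documents per query, which is the unit most ranking methods train on.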