27 April 2016

Public Data Sources for Machine Learning

Source | Description | Link
--- | --- | ---
UCI | Collection of benchmark datasets for regression and classification tasks | UCI Machine Learning Repository
KDD | Extended version of UCI datasets | UCI KDD Extended Version
DELVE | Platform for comparative assessment of regression and classification tasks | DELVE
DMOZ | Collection of links for different datasets | DMOZ Directory
KDNuggets | Collection of links for different datasets | Further Datasets
ChemDB | Chemical data that can be used as datasets for machine learning | ChemDB
Golem | Trying to learn rules for prediction | Golem Datasets
NDR | Datasets for nonlinear dimensionality reduction | Nonlinear Dimensionality Reduction
General | A list of dataset links by category | Further datasets
AWS Public | Public list of datasets via S3 | Large dataset repository
Datahub | Public list of datasets | Datahub datasets
BigML | Curated list of datasets | BigML datasets
Curated GitHub | Curated, categorized list of datasets on GitHub | Public datasets on GitHub
Wikipedia list | Curated, categorized list of datasets on Wikipedia | Datasets of ML
Data Science | Data Science Projects | 19 free public data sources
Data Science | Data Science Projects | Data science datasets