
Datasets


UCI Machine Learning Repository: Breast Cancer Wisconsin (Original) Data Set: Support.

Wisconsin breast cancer data: benchmark dataset.

Data: Where can I get large datasets open to the public?

Machine Learning - Course website. Chris Thornton. This course teaches the theory and practice of machine learning using a mixture of demos, lectures and labs.

Instructions for lab sessions. Assessment is based on one programming assignment and an unseen exam. Most of the syllabus material is in the online lecture notes, but note-taking and additional reading are strongly advised. The first meeting for the course will be the first lecture in week 1. Your first lab will be your first scheduled lab session after the lecture on k-means clustering.

Week 2: k-means clustering, agglomerative clustering, cluster hierarchies, centroids.

If you have questions about the material, the best thing is to put a question to me during a lecture. If you prefer, you can approach me at the end of a lecture. If that doesn't work, you can talk to me (or a lab tutor) in your next lab.

Don't send me questions by email. There is no single course text.

UCI Machine Learning Repository: Dermatology Data Set.

Source: Original Owners: 1. Nilsel Ilter, M.D., Ph.D., Gazi University, School of Medicine 06510 Ankara, Turkey Phone: +90 (312) 214 1080 2. Donor: H.

Data Set Information: This database contains 34 attributes, 33 of which are linear valued and one of which is nominal.

The differential diagnosis of erythemato-squamous diseases is a real problem in dermatology. In the dataset constructed for this domain, the family history feature has the value 1 if any of these diseases has been observed in the family, and 0 otherwise. The names and id numbers of the patients were recently removed from the database.

Attribute Information:

Clinical Attributes (take values 0, 1, 2, 3, unless otherwise indicated):
1: erythema
2: scaling
3: definite borders
4: itching
5: koebner phenomenon
6: polygonal papules
7: follicular papules
8: oral mucosal involvement
9: knee and elbow involvement
10: scalp involvement
11: family history (0 or 1)
34: Age (linear)

Relevant Papers: G.
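The value ranges given for the clinical attributes can be turned into a simple sanity check. The following is only a minimal sketch: the function name `check_clinical_values` and the assumption that a record is a list of 34 integers in attribute order are illustrative and do not come from the repository page. Attributes 12 to 33 are not checked, because their ranges are not listed here.

```python
from typing import Dict, List

# Clinical attributes named in the text, keyed by 1-based attribute number.
CLINICAL_ATTRIBUTES: Dict[int, str] = {
    1: "erythema",
    2: "scaling",
    3: "definite borders",
    4: "itching",
    5: "koebner phenomenon",
    6: "polygonal papules",
    7: "follicular papules",
    8: "oral mucosal involvement",
    9: "knee and elbow involvement",
    10: "scalp involvement",
    11: "family history",
    34: "age",
}


def check_clinical_values(record: List[int]) -> List[str]:
    """Return a list of problems found in the listed clinical attributes.

    Assumption: `record` holds all 34 attribute values in order,
    so index 0 is attribute 1 and index 33 is attribute 34 (age).
    """
    if len(record) != 34:
        return [f"expected 34 attributes, got {len(record)}"]
    problems = []
    for number, name in CLINICAL_ATTRIBUTES.items():
        value = record[number - 1]
        if number == 11:
            ok = value in (0, 1)        # family history is binary
        elif number == 34:
            ok = value >= 0             # age is a linear value
        else:
            ok = value in (0, 1, 2, 3)  # graded clinical attributes
        if not ok:
            problems.append(f"attribute {number} ({name}): unexpected value {value}")
    return problems
```

An empty result means the listed attributes fall inside the documented ranges; it says nothing about the attributes that are not described in this excerpt.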

Gisele L. Rafael S. Rafael S.

'arff' - Search results | TunedIT.

Mushroom.

Machine Learning Repository: URL Reputation Data Set.

Machine Learning Repository: Amazon Commerce reviews set Data Set.

Source: Dataset creator and donor: ZhiLiu, e-mail: liuzhi8673 '@' gmail.com, institution: National Engineering Research Center for E-Learning, Hubei Wuhan, China

Data Set Information: The dataset is derived from customer reviews on the Amazon Commerce website and is intended for authorship identification. Most previous studies conducted the identification experiments for two to ten authors.

But in the online context, reviews to be identified usually have more potential authors, and classification algorithms are normally not adapted to a large number of target classes. To examine the robustness of classification algorithms, we identified 50 of the most active users (represented by a unique ID and username) who frequently posted reviews in these newsgroups.

Attribute Information: The attributes capture the authors' linguistic style, such as usage of digits and punctuation, word and sentence length, and usage frequency of words.

Machine Learning Repository: Poker Hand Data Set.
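The kinds of stylistic attributes described for the Amazon Commerce reviews set (digits, punctuation, word and sentence length, word frequency) can be illustrated with a small feature extractor. This is a hedged sketch only: the function `style_features`, the feature names and the tokenisation rules are assumptions made for illustration, and they are not the attributes actually stored in the data set.

```python
import re
import string
from collections import Counter
from typing import Dict


def style_features(review: str, top_words: int = 3) -> Dict[str, float]:
    """Compute a few simple stylistic features for one review text."""
    # Words are runs of letters and apostrophes; digits are counted separately.
    words = re.findall(r"[A-Za-z']+", review)
    # Sentences are split on runs of ., ! or ?; empty pieces are ignored.
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    n_chars = max(len(review), 1)
    n_words = max(len(words), 1)

    features = {
        "digit_ratio": sum(c.isdigit() for c in review) / n_chars,
        "punctuation_ratio": sum(c in string.punctuation for c in review) / n_chars,
        "mean_word_length": sum(len(w) for w in words) / n_words,
        "mean_sentence_length": len(words) / max(len(sentences), 1),
    }
    # Relative frequency of the most common words (case-insensitive).
    counts = Counter(w.lower() for w in words)
    for word, count in counts.most_common(top_words):
        features[f"freq_{word}"] = count / n_words
    return features


features = style_features("Great phone. Great price, 2 days!")
```

Feature vectors of this kind, one per review, could then be passed to any multi-class classifier with the author ID as the target.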