
Page history last edited by mike@mbowles.com 4 years, 6 months ago

 

Machine Learning 102

 

 

To join this class, please register at http://www.linkedin.com/osview/canvas?_ch_page_id=1&_ch_panel_id=1&appParams=%7B%22go_to%22%3A%22events%2F557127%22%7D&_ch_app_id=7083120&_applicationId=2000&_ownerId=0

or on the class meetup page: 

http://www.meetup.com/HandsOnProgrammingEvents/events/dmcsdcypnbdc/

 

Instructors: Dr. Michael Bowles & Dr. Patricia Hoffman

 

Overview of the course

Machine Learning 102 covers unsupervised learning and fault detection. 

 

The class begins at the level of elementary probability and statistics and, from that background, surveys unsupervised learning and fault detection.  The class will give participants a working knowledge of these techniques and will leave them prepared to apply them to real problems.  To get the most out of the class, participants will need to work through the homework assignments.

 

Prerequisites

This class assumes a moderate level of computer programming proficiency.  We will use R (the open-source statistics language) for the homework and for the examples in class.  We will cover some of the basics of R and do not assume any prior knowledge of it.  You can find references on how to use R on this website, and we will give out sample code during classes to help get you started.

 

You'll need an undergraduate-level background in probability, calculus, linear algebra, and vector calculus.  We will cover most of what is required during the lectures.  The appendices at the back of the Tan text cover this background at a more than sufficient level for this class.

 

The second five-week session (Machine Learning 102) will culminate in the students giving presentations on papers they have read.

 

Why use R?

We're going to use R as our lingua franca for working homework problems, discussing them, and comparing different solution approaches.  Load R onto your laptop or desktop computer before you come to the first class: http://cran.r-project.org/  We will include some descriptive material on using R in the first two lectures in order to get everyone up to speed on it.  To integrate R with Eclipse, click here.  References for R are here: References for R.  Comment on these references here: Reference for R Comments.  More R references.
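As a taste of the R basics covered in those first two lectures, here is a minimal sketch; the data and variable names are purely illustrative:

```r
# A few R fundamentals: vectors, summary statistics, data frames, plotting
x <- c(2.1, 3.5, 4.8)                  # build a numeric vector
mean(x)                                # arithmetic mean: 3.466667
df <- data.frame(height = x,
                 label  = c("a", "b", "c"))
summary(df$height)                     # min, quartiles, mean, max
plot(df$height)                        # quick base-graphics plot
```

Typing a variable's name at the R prompt prints it, which makes interactive exploration of homework data sets very fast.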

 

Please note that anyone can read this web site; however, only the instructors have permission to write on the site.  We welcome new members to the class, but we are not granting permission to edit this site.

 

General Sequence of Classes:

 

Machine Learning 101:   Supervised learning

Machine Learning 102:   Unsupervised Learning and Fault Detection

Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

 

Machine Learning 201:    Advanced Regression Techniques, Generalized Linear Models, and Generalized Additive Models    

Machine Learning 202:   Collaborative Filtering, Bayesian Belief Networks, and Advanced Trees

Text:  "The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

 

Machine Learning Big Data:  Adaptation and execution of machine learning algorithms in the MapReduce framework

 

Machine Learning Text Processing:  Machine learning applied to natural-language text documents using statistical algorithms, including indexing, automatic classification (e.g., spam filtering), part-of-speech identification, topic modeling, and sentiment extraction

 

Future Topics 

     Data Mining Social Networks

     Text Mining

 

 

Machine Learning 102 Syllabus:  

 

1st Week:  Hierarchical, Density, & K-Means Clustering
           Homework: ML102Homework02.pdf   Links: Basic Clustering

2nd Week:  Expectation Maximization Algorithms & Discriminant Analysis
           Homework: Homework2.txt   Links: BasicClustering2, AkaikeInfoCriterion

3rd Week:  Cluster Validity, Using the Akaike Info Criterion to Determine the Number of Clusters, Gaussian Mixture Models
           Homework: Homework3.txt   Links: GaussianMixtureModel

4th Week:  Outliers, Extreme Values, Convex Hull, One-Class SVM
           Links: Outliers_Fraud

5th Week:  Association Rules
           Links: AssociationRules
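As a preview of the first week's clustering material, here is a minimal k-means sketch using R's built-in kmeans() function from the stats package; the two-cluster data set is simulated, not taken from the homework:

```r
# Simulate two well-separated 2-D Gaussian clusters, then run k-means with k = 2
set.seed(42)
pts <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),   # 20 points near (0, 0)
             matrix(rnorm(40, mean = 5), ncol = 2))   # 20 points near (5, 5)
fit <- kmeans(pts, centers = 2, nstart = 10)          # 10 random restarts
table(fit$cluster)    # cluster sizes, roughly 20 and 20
fit$centers           # estimated centers, near (0, 0) and (5, 5)
```

The nstart argument reruns the algorithm from several random initializations and keeps the best result, which guards against k-means's sensitivity to starting centers.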

 
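The third week's use of the Akaike Information Criterion can be previewed with R's built-in AIC() function.  As a deliberately simpler stand-in for counting clusters, this sketch compares two nested linear models on simulated data; the criterion trades goodness of fit against parameter count in the same way:

```r
# AIC = 2k - 2 * log-likelihood; the lower value is preferred
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 2 + 3 * x + rnorm(50)       # data with a genuine linear trend
m0 <- lm(y ~ 1)                  # intercept-only model: misses the trend
m1 <- lm(y ~ x)                  # linear model: captures the trend
AIC(m0)                          # large: penalized for its poor fit
AIC(m1)                          # smaller: the better model wins
```

The extra slope parameter in m1 costs 2 units of AIC, but the improvement in log-likelihood is far larger, so AIC correctly selects the linear model.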

           
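The fourth week covers multivariate outlier detection (convex hulls, one-class SVMs).  As a much simpler univariate stand-in, the 1.5 * IQR boxplot rule conveys the same idea of fencing off extreme values; the data here are simulated with one injected outlier:

```r
# Flag extreme values with the 1.5 * IQR boxplot rule
set.seed(7)
v <- c(rnorm(100), 8)                      # standard normal data plus one outlier at 8
q <- quantile(v, c(0.25, 0.75))            # first and third quartiles
fence <- 1.5 * (q[2] - q[1])               # 1.5 times the interquartile range
outliers <- v[v < q[1] - fence | v > q[2] + fence]
outliers                                   # includes the injected value 8
```

The multivariate methods in the lectures generalize this notion of "far outside the bulk of the data" to higher dimensions, where no single pair of quartiles suffices.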

 
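The support and confidence measures behind the fifth week's association rules can be computed directly in a few lines of R; the market baskets below are made up for illustration:

```r
# Toy market-basket data
baskets <- list(c("milk", "bread"),
                c("milk", "eggs"),
                c("milk", "bread", "eggs"),
                c("bread"))
# support(X): fraction of baskets containing every item in X
support <- function(items)
  mean(sapply(baskets, function(b) all(items %in% b)))
# confidence(X -> Y) = support(X and Y) / support(X)
confidence <- function(lhs, rhs) support(c(lhs, rhs)) / support(lhs)
support(c("milk", "bread"))     # 2 of 4 baskets: 0.5
confidence("milk", "bread")     # 0.5 / 0.75 = 0.6667
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds, pruning the exponential space of itemsets as they go.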

 

There are more machine learning references on Patricia's web site: http://patriciahoffmanphd.com/

 


If you haven't already filled out the class survey form on the meetup page, please fill it out now.  If you haven't already signed up on the meetup page, please do so now.
