layout | title |
---|---|
page | Teaching |
Provides an intensive introduction to applied statistics and data analysis. Trains students to become data scientists capable of both applied data analysis and critical evaluation of the next generation of statistical methods. Since both data analysis and methods development require substantial hands-on experience, focuses on hands-on data analysis.
- JHSPH 150.711 - Advanced Data Science I [2015 Site]
These classes are core classes in our Ph.D. program. As I view them, the purpose of these classes is: (1) to teach students how to handle data at an advanced level, (2) to teach students which methods are good and how they work (asymptotic properties, small-sample properties, etc.), and (3) to teach students how to create their own methods.
- JHSPH 150.754 - Advanced Methods III [2011 site] [2012 site] [2013 site] [2014 site]
- JHSPH 150.753 - Advanced Methods IV [2013 site] [2014 site]
This class is a graduate-level introduction to genomic technologies and the most common statistical methods used to analyze the data they produce. We often have guest lecturers, and the focus of the class is getting students to perform a real project at the end of the course.
- JHSPH 150.688 Statistics for Genomics [2011 site]
This class was an 8-week introduction to data analysis, starting from very basic concepts, such as what types of data analysis questions you can ask, and moving on to how to get data and do a basic analysis. The course ran twice, in January 2013 and October 2013, and enrolled 185,000+ students with 6,000+ completers. More students completed the class than the number of master's degrees in statistics awarded across all departments of statistics and biostatistics in the United States in that same time frame. All the content from this class has been folded into our Data Science Specialization, but you can still find the old videos from the course on YouTube and the lecture notes on GitHub.
With Roger Peng, Brian Caffo, Nick Carchedi, and Sean Kross, we created a 9-course specialization in data science. Some cool things about the program are:
- Every class runs every month.
- The course material is all open source and on GitHub.
- It is designed from the ground up to cover modern version control, R programming, statistics, machine learning, and data products.
You can sign up here or here.
With Steven Salzberg, James Taylor, Ela Pertea, Liliana Florea, Ben Langmead, and Kasper Hansen, we created a 7-course specialization in genomic data science. Some cool things about the program are:
- Every class runs every month.
- It is designed from the ground up to cover modern genomics including Galaxy, Python, R, Bioconductor, statistics, computing, and genomic technologies.
- All of my lecture materials for Statistics for Genomic Data Science are open source and available from the website and the genstats R package.
You can sign up for the specialization here.