About This Course
This course is now taught virtually, on GoToMeeting, by an Enthought trainer in real-time.
The virtual version of this class will be taught over 5 half-days instead of 3 full days. The class will be taught in two 2-hour sessions each day, from 9-11AM and 1-3PM MDT, with a 2-hour break from 11AM-1PM MDT.
We endeavour to deliver these virtual programs as we would a face-to-face program. Interaction with the trainer is encouraged.
Machine learning models provide a fast and flexible way to build predictive models of the world, and are used for tasks ranging from predicting supply chain availability to optimizing the placement of advertisements. The tools discussed in this class are fast becoming industry standards in bioscience, finance, geology, manufacturing, and marketing.
The Machine Learning Mastery Workshop is 3 days of individualized coaching in the use of scikit-learn to predict country-specific risk of famine using satellite imagery, intelligence reports, and historical climate records. Students will return to work the same week ready to apply advanced learning algorithms to business cases in their own industries.
The course begins with a conceptual introduction to machine learning algorithms. This is followed by an introduction to the implementation of estimators in scikit-learn and best practices for using them.
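As a rough illustration of what "estimators" means here: scikit-learn objects learn from data in `fit`, reshape data in `transform`, and produce outputs in `predict`. The sketch below imitates that convention in plain Python so it runs without any dependencies. It is not scikit-learn code and not course material; the class names `ToyScaler` and `ToyNearestCentroid` and the sample data are made up for illustration.

```python
from statistics import mean, pstdev


class ToyScaler:
    """Hypothetical transformer: learns per-column mean and std in fit()."""

    def fit(self, X):
        columns = list(zip(*X))
        self.means_ = [mean(col) for col in columns]
        # Fall back to 1.0 for constant columns so transform() never divides by zero.
        self.stds_ = [pstdev(col) or 1.0 for col in columns]
        return self

    def transform(self, X):
        return [
            [(value - m) / s for value, m, s in zip(row, self.means_, self.stds_)]
            for row in X
        ]


class ToyNearestCentroid:
    """Hypothetical classifier: predicts the label of the closest class centroid."""

    def fit(self, X, y):
        grouped = {}
        for row, label in zip(X, y):
            grouped.setdefault(label, []).append(row)
        # One centroid (column-wise mean) per class label.
        self.centroids_ = {
            label: [mean(col) for col in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, X):
        def squared_distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(
                self.centroids_,
                key=lambda label: squared_distance(row, self.centroids_[label]),
            )
            for row in X
        ]


# Made-up training data: two features on very different scales, two classes.
X_train = [[1.0, 200.0], [2.0, 180.0], [9.0, 20.0], [10.0, 40.0]]
y_train = ["low", "low", "high", "high"]

# Fit on training data only, then reuse the fitted objects on new data.
scaler = ToyScaler().fit(X_train)
model = ToyNearestCentroid().fit(scaler.transform(X_train), y_train)

X_new = [[1.5, 190.0], [9.5, 30.0]]
predictions = model.predict(scaler.transform(X_new))
print(predictions)
```

The point of the pattern is that parameters learned in `fit` (here the stored means, standard deviations, and centroids) are reused unchanged on new data, which keeps preprocessing and prediction consistent.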
The rest of the course is organized around specific feature sources. Each one is covered by a short introductory lecture followed by three exercises of increasing difficulty, starting with standard, well-behaved cases and ending with realistically problematic real-world case studies.
Throughout, the course focuses on building deep conceptual understanding, providing extensive practical experience, and covering common mistakes and edge cases. Short discussions of helpful diagnostic data visualizations are interspersed with the machine learning material.
At the end of this course, participants will be able to:
- Leverage the full power of the scikit-learn API
- Use specific regression, classification, and clustering models skillfully to model their data and solve problems
- Denoise and segment imaging data with scikit-image
- Construct OODA loops with selection and prediction pipelines
- Efficiently search over hyperparameter spaces
- Extract lexical and semantic information from natural language data
- Engineer numeric features to maximize predictive power
- Visualize interactions and non-linear distributions of data
- Validate models with the appropriate success metrics
- Troubleshoot common issues like unbalanced labels and high-dimensional data
- Build deep insight by retrieving model parameters
Onsite corporate classes are also available. Discounts are available for 3 or more attendees and academics currently at a degree-granting institution. Contact us using the form on this page to learn more.
NOTE: This class assumes previous experience with Python. Make sure you meet the prerequisites before purchasing. Contact us using the form on this page if you have any questions or need help determining the best class for you.
Need the prerequisites? Take Python for Machine Learning instead!
|Where|When|Price (per person)|Reserve a Seat|
|Online - Live Virtual|February 1-5, 2021, 9-11AM and 1-3PM MST daily|$1500|Register Online|
Course Syllabus & Topics
Due to social distancing measures currently in place to slow the spread of COVID-19, we will be teaching this course online, in real-time on GoToMeeting, with an Enthought trainer. The content and prerequisites for the virtual course do not differ from the face-to-face program.
Knowledge of programming in the standard Python language (data structures, control flow, assignment, functions, and package access) and familiarity with array programming in NumPy are required. Familiarity with the Pandas and matplotlib libraries (DataFrames, indexing, plot grids) is also required. Knowledge of general data analysis techniques and basic statistics (mean, standard deviation, correlation, etc.) is strongly recommended.
Individuals who have taken Enthought’s Python Foundations, Python for Scientists and Engineers, Python for Data Science, or Python for Data Analysis classes will have met the prerequisites for the course.
- Linear and nonlinear models
- Constant and variable learning-rates
- Cost functions, regularization methods, and other constraints
- Fitting, transforming, and predicting
- Logarithmic and curvilinear transforms
- Data scaling
- Linear regressors
- l1 and l2 normalization
- Support vector machines (SVM)
- Contrast encoding
- Missing values
- Categorical rebinning
- Linear classifiers
- Tree-based classifiers
- Ensemble methods
- Boosting methods
- Unbalanced designs
- Image storage formats
- Smoothing and denoising
- Edge detection
- Feature-based segmentation
- K-means clustering
- Orthographic measurement
- Lexical vectorizers
- Semantic embeddings
- The “long tail” problem
- Dimensionality reduction
- Underspecified models