About This Course

Machine learning provides a fast and flexible way to build predictive models of the world, and is used for tasks ranging from predicting supply chain availability to optimizing the placement of advertisements. The tools discussed in this class are fast becoming industry standards in bioscience, finance, geology, manufacturing, and marketing.

The Machine Learning Mastery Workshop is 3 days of individualized coaching in the use of scikit-learn to predict country-specific risk of famine using satellite imagery, intelligence reports, and historical climate records. Students will return to work the same week ready to apply advanced learning algorithms to business cases in their own industries.


Course Overview

The course begins with a conceptual introduction to machine learning algorithms. This is followed by an introduction to the implementation of estimators in scikit-learn and best practices for using them.

The rest of the course is organized around specific feature sources. Each one begins with a short introductory lecture, followed by three exercises of increasing difficulty that start with standard, well-behaved cases and end with realistically problematic real-world case studies.

Throughout, the course focuses on building deep conceptual understanding, providing exhaustive practical experience, and covering common mistakes and edge cases. Short discussions of helpful and diagnostic data visualizations are interspersed with the machine learning material.

At the end of this course, participants will be able to:
  • Leverage the full power of the scikit-learn API
  • Use specific regression, classification, and clustering models skillfully to model their data and solve problems
  • Denoise and segment imaging data with scikit-image
  • Construct OODA loops with selection and prediction pipelines
  • Efficiently search over hyperparameter spaces
  • Extract lexical and semantic information from natural language data
  • Engineer numeric features to maximize predictive power
  • Visualize interactions and non-linear distributions of data
  • Validate models with the appropriate success metrics
  • Troubleshoot common issues like unbalanced labels and high-dimensional data
  • Build deep insight by retrieving model parameters
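
As a rough illustration of several of these outcomes (pipelines, hyperparameter search, metric selection, unbalanced labels, and parameter retrieval), here is a minimal sketch. It is not taken from the course materials: the synthetic dataset, the choice of logistic regression, and the parameter grid are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, deliberately unbalanced data (roughly 90% / 10%); not course data.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Chain scaling and classification so both are refit inside each CV fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Search over the regularization strength, scoring with a metric that is
# not dominated by the majority class.
search = GridSearchCV(
    pipeline,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0]},
    scoring="balanced_accuracy",
    cv=5,
)
search.fit(X_train, y_train)

best_C = search.best_params_["clf__C"]
test_score = balanced_accuracy_score(y_test, search.predict(X_test))

# Retrieve fitted parameters from the best pipeline for inspection.
coefficients = search.best_estimator_.named_steps["clf"].coef_
print(best_C, round(test_score, 3), coefficients.shape)
```

Keeping the scaler inside the pipeline means the search never sees statistics computed from held-out folds, which is one of the common mistakes this kind of setup is meant to avoid.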

Contact Us

Questions or need help registering? Call us at 512.536.1057 or fill out the form:


Course Instructors

Enthought instructors have doctorates in scientific fields such as physics, engineering, computer science, and mathematics. All have extensive research and consulting experience applying Python to complex problems across a range of industries, and they bring that real-world experience to the classroom every day. Enthought instructors possess professional, first-hand experience with the tools and technologies covered in our courses.

Course Syllabus & Topics

Course Prerequisites

Knowledge of programming in standard Python (data structures, control flow, assignment, functions, and package access) and familiarity with array programming in NumPy are required. Familiarity with the Pandas and matplotlib libraries (DataFrames, indexing, plot grids) is also required. Knowledge of general data analysis techniques and basic statistics (mean, standard deviation, correlation, etc.) is strongly recommended.

Individuals who have taken Enthought’s Python Foundations, Python for Scientists and Engineers, Python for Data Science, or Python for Data Analysis classes will have met the prerequisites for the course.

I. Introduction to Machine Learning
  • Linear and nonlinear models
  • Constant and variable learning-rates
  • Cost functions, regularization methods, and other constraints
  • Fitting, transforming, and predicting
II. Numeric Data
  • Logarithmic and curvilinear transforms
  • Data scaling
  • Outliers
  • Linear regressors
  • l1 and l2 normalization
  • Support vector machines (SVM)
III. Categorical Data
  • Contrast encoding
  • Missing values
  • Categorical rebinning
  • Linear classifiers
  • Tree-based classifiers
  • Ensemble methods
  • Boosting methods
  • Unbalanced designs
IV. Image Data
  • Image storage formats
  • Scikit-image
  • Smoothing and denoising
  • Edge detection
  • Feature-based segmentation
  • K-means clustering
V. Language Data
  • Orthographic measurement
  • Lexical vectorizers
  • Semantic embeddings
  • The “long tail” problem
  • Dimensionality reduction
  • Underspecified models
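
To give a flavor of the "fitting, transforming, and predicting" pattern applied to numeric and categorical data, the following is a minimal sketch rather than an excerpt from the course. The toy table, its column names (`rainfall_mm`, `region`, `at_risk`), and the choice of logistic regression are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

# Hypothetical toy table; column names and values are invented.
data = pd.DataFrame({
    "rainfall_mm": [12.0, 340.0, 55.0, 410.0, 8.0, 290.0, 31.0, 365.0],
    "region": ["north", "south", "north", np.nan, "east", "south", "east", "south"],
    "at_risk": [1, 0, 1, 0, 1, 0, 1, 0],
})
X = data[["rainfall_mm", "region"]]
y = data["at_risk"]

# Numeric column: logarithmic transform, then scaling.
numeric = Pipeline([
    ("log", FunctionTransformer(np.log1p)),
    ("scale", StandardScaler()),
])

# Categorical column: fill missing values, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="constant", fill_value="unknown")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric, ["rainfall_mm"]),
    ("cat", categorical, ["region"]),
])

model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression()),  # a linear classifier
])

model.fit(X, y)                                     # fitting
features = model.named_steps["prep"].transform(X)   # transforming
predictions = model.predict(X)                      # predicting
print(features.shape, predictions)
```

The same three verbs apply whether the final step is a linear model, a tree-based classifier, or an ensemble, which is why swapping the `clf` step requires no other changes.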

Open Class Schedule

Onsite corporate classes are also available. Discounts are available for 3 or more attendees and academics currently at a degree-granting institution. Contact us to learn more.

Where | When | Price (per person) | Register
Austin, TX | February 21-23, 2018 | $1800 |
Houston, TX | April 18-20, 2018 (NOTE: this class will include a special 1/2 day module on oil & gas applications of machine learning on the 3rd day) | $1800 | Contact us with the form to the right
Cambridge, UK | May 9-11, 2018 | £1476 | Contact us with the form to the right
Albuquerque, NM | May 21-23, 2018 | $1800 |

Frequently Asked Questions


  • Is a class completion certificate provided?
    • Yes, a class completion certificate is provided for the Machine Learning Mastery Workshop.
  • Do I need to have taken a class from Enthought before to enroll in the Machine Learning Mastery Workshop?
    • No, but students should already be proficient in scientific Python before attending. We will be working extensively with both NumPy ndarrays and Pandas DataFrames, and will not have time to review these data structures during the class.
  • What’s the difference between Enthought’s Python for Data Science and Machine Learning Mastery Workshop?
    • Enthought’s Python for Data Science is a five-day class designed to introduce Python, NumPy, Pandas, Matplotlib, and scikit-learn. One previous attendee called it “the most concise data science primer you can find”. Machine Learning Mastery is a three-day, workshop-style course, which means that it largely consists of guided, hands-on practice applying machine learning algorithms to real data. Where Python for Data Science is a primer, the Mastery Workshop is more of a “deep-dive”.
  • Is Deep Learning (with Keras, TensorFlow, or PyTorch) covered in the course?
    • Not in this particular course, no. Deep learning is a very exciting and promising field of research, but one which requires specialized hardware and whose use cases are relatively limited. This course covers learning algorithms that are both broadly applicable and also usable on common workstations and laptops.
  • I am worried that your training is only useful to people who are committed to using Enthought software products. How much of your training is usable without Enthought software?
    • 100%. Our training teaches students how to write software with Python and solve problems using its scientific packages, not how to use proprietary software. Everything you will learn uses free and open source software.
      We provide Enthought Canopy (our integrated analysis environment and Python distribution) to training participants to ensure they have all of the tools and Python packages they need to complete the training and that the tools are as easy as possible to install. While participants sometimes do use other editors, package managers, and Python distributions, we strongly recommend participants use Canopy during the training. With Canopy we can ensure that you can easily install everything you need for the course out of the box and we can provide technical support (which we unfortunately cannot provide for other tool sets).
  • I use / will be using Anaconda Python. Will I still benefit from this course?
    • Absolutely. Our training materials work with any Python distribution (such as Anaconda), as long as you also have all of the necessary packages, a text or code editor, package manager, interactive IPython shell, and Jupyter notebooks installed.
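
If you are using a distribution other than Canopy, a quick way to see which of the libraries named on this page are already installed might look like the sketch below. The package list is an assumption based on the tools mentioned above, not an official requirements file, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Distribution names as published on PyPI. This list is an assumption
# drawn from the libraries named on this page, not an official list.
REQUIRED = [
    "numpy", "pandas", "matplotlib",
    "scikit-learn", "scikit-image",
    "ipython", "notebook",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(REQUIRED)
for name, version in report.items():
    print(f"{name:15s} {version or 'MISSING'}")
```

Any package reported as MISSING can be installed with your distribution's own package manager before the class.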

Have a question that isn’t answered here? Contact us or call 512.536.1057.