About This Course
This course is now taught virtually, on GoToMeeting, by an Enthought trainer in real-time.
- The virtual version of this class will be taught over 5 half-days instead of 3 full days. The class will be taught in two 2-hour sessions each day, from 9-11AM and 1-3PM MDT, with a 2-hour break from 11AM-1PM MDT.
We endeavour to deliver these virtual programs as we would a face-to-face program. Interaction with the trainer is encouraged.
This 3-day intensive Python training class provides practical, hands-on experience and foundational working knowledge of Python for data analysis, science, engineering, and other technical applications. Whether you are new to Python or a long-time enthusiast, you’ll benefit from this focused series of topics and best practices taught by experts who create Python software for notable companies in finance, oil and gas, scientific research, aerospace, biotechnology, marketing analysis and more.
The Python Foundations class will get you up to speed quickly on making effective use of the core Python language and key Python packages for data exploration, modeling, and analysis.
- It begins with a one-day introduction to the Python language focusing on standard data structures, control constructs, and code organization.
- We then cover object-oriented programming in Python.
- After a brief overview of the Scientific Python ecosystem, we dive into techniques for numeric data processing, including efficiently manipulating and processing large data sets using NumPy arrays and data visualization with 2D plots using Matplotlib.
- Next up is an introduction to Pandas to efficiently load, clean, normalize, aggregate, transform, and visualize data.
"Tim is an excellent instructor. I was very impressed with his ability to code 'on-the-fly' as he gave us illuminating examples."
- Gene B.
"Dr. Diller is a great teacher! The way he presents is very clear to understand, and the pace he teaches at is a perfect balance; it suits both those who know a lot about programming and those who know less. I am impressed by his knowledge, and inspired by his excitement on the subject's application to everyone's work."
- Robert S.
Onsite corporate classes are also available. Discounts are available for 3 or more attendees and academics currently at a degree-granting institution. Contact us using the form on this page to learn more.
| Where | When | Price (per person) | Reserve a Seat |
| Online - Live Virtual | November 2-6, 2020, 9-11AM and 1-3PM MDT daily | $999.00 | Register Online |
Course Syllabus & Topics
Due to social distancing measures currently in place to slow the spread of COVID-19, we will be teaching this course online, in real-time on GoToMeeting, with an Enthought trainer. The content and prerequisites for the virtual course do not differ from the face-to-face program.
What You’ll Learn
The class will give you the initial building blocks to effectively use Python in your daily work, while setting the foundation for additional skill building in areas of specific interest.
Experience with Python is helpful (but not required). However, programming experience in some language (such as R, MATLAB, SAS, Mathematica, Java, C, C++, VB, or FORTRAN) is expected. In particular, participants need to be comfortable with general programming concepts like variables, loops, and functions.
We kick off the class by exploring the functionality of the IPython Shell, an enhanced interactive science-centric console. Next we review the Jupyter Notebook, a cell-based environment that renders scripts, plots, and rich media in a web-like interface, making it ideal for sharing and publishing analysis with peers. You’ll leave with a mastery of these tools that will accelerate your productivity and facilitate collaboration.
1. Building a Solid Infrastructure to Go From Exploratory Analysis to Reproducible Workflows
A. Introduction and Setting Up Your Integrated Analysis Environment
- Using the Enthought Deployment Manager (EDM) with the Visual Studio Code (VS Code) IDE
- IPython Shell
- Custom environment settings
- Jupyter (IPython) Notebooks
- Script editor
- Packages: NumPy, Pandas, matplotlib, etc.
Next we move into an introduction to Python’s core language features that form part of your universal toolkit for tasks ranging from initial data exploration to extensible application development. We’ll introduce Python’s built-in data structures, including how and where each might be used and what trade-offs are present, and we’ll cover Python’s looping and control flow constructs. Along the way we’ll provide insight into Python’s design choices that will help you understand why Python works the way it does.
1. Using Python to Control and Document Your Workflow
- Data types and objects
- Loading packages, namespaces
- Reading and writing data
- Simple plotting
- Control flow
- Code profiling
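As a small illustration of the topics in this part of the syllabus, the sketch below uses only the Python standard library to contrast a few built-in data structures, loop over data, and time a membership test. The data and the names `readings`, `count_by_sensor`, and `total_by_sensor` are hypothetical, chosen for this example and not taken from the course materials. The timing step uses `timeit`, which measures elapsed time for a snippet and is only a first step toward full profiling.

```python
import timeit
from collections import Counter

# Hypothetical sample data: (sensor name, measured value) tuples in a list.
readings = [("a", 1.5), ("b", 2.0), ("a", 3.5), ("c", 0.5), ("b", 4.0)]

# A set keeps only unique items.
unique_sensors = {name for name, _ in readings}


def count_by_sensor(pairs):
    """Count how many readings each sensor produced."""
    counts = Counter()
    for name, _ in pairs:  # loop with tuple unpacking
        counts[name] += 1
    return dict(counts)


def total_by_sensor(pairs):
    """Sum the measured values per sensor using a plain dict."""
    totals = {}
    for name, value in pairs:
        totals[name] = totals.get(name, 0.0) + value
    return totals


# Trade-off: a membership test scans a list but hashes into a set.
big_list = list(range(100_000))
big_set = set(big_list)
list_time = timeit.timeit(lambda: 99_999 in big_list, number=50)
set_time = timeit.timeit(lambda: 99_999 in big_set, number=50)

print(sorted(unique_sensors))
print(count_by_sensor(readings))
print(total_by_sensor(readings))
print(f"list lookup: {list_time:.4f}s, set lookup: {set_time:.6f}s")
```

On most machines the set lookup is far faster than the list lookup, which is the kind of trade-off to weigh when choosing a data structure.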
There are a number of “must-have” packages for scientific computing and data analysis with Python. This class covers three of them, giving you the underpinnings you need to expand into additional packages that fit your area of specialization. If you are coming from a background in MATLAB® or R, you’ll find these libraries essential.
Chief among these packages is NumPy, a tool for rapidly manipulating and processing large data sets. Whether you are a scientist writing short scripts to analyze and plot your analytical results or an analyst writing large-scale quantitative finance applications for Wall Street, NumPy should be part of your toolbox. We give you a jump start with the basics in the classroom, then provide you additional curated lectures to extend your understanding.
Once you’ve crunched your data, you’ll want to visualize it, which is where matplotlib comes in. Matplotlib is a versatile 2D plotting library that allows you to generate plots, histograms, power spectra, bar charts, error charts, scatter plots, and more with just a few lines of code.
Finally, we do a deep dive into the Python Data Analysis Library (Pandas), a package for working with labeled, multi-dimensional datasets. Pandas’ data aggregation and reorganization capabilities, including support for labeling data along each dimension, handling missing values, and manipulating time series, have made Python an indispensable tool for data exploration and analysis.
1. Numerical Analysis, Data Exploration, and Data Visualization with NumPy Arrays and Matplotlib
- The NumPy array
- Indexing and slicing arrays
- Array operations and manipulations
- 2D plotting with matplotlib
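To make the NumPy topics above concrete, here is a minimal sketch of array indexing, slicing, and element-wise operations. The temperature values and variable names are hypothetical and are not drawn from the course exercises.

```python
import numpy as np

# Hypothetical data: 3 sensors (rows) x 4 time steps (columns), in Celsius.
temps = np.array([
    [20.0, 21.5, 23.0, 22.0],
    [18.5, 19.0, 20.5, 21.0],
    [25.0, 24.5, 26.0, 27.5],
])

first_sensor = temps[0]         # indexing: the first row
last_two_steps = temps[:, -2:]  # slicing: all rows, last two columns
warm = temps[temps > 22.0]      # boolean mask: elements above 22.0

# Vectorized arithmetic applies element-wise, with no explicit loop.
fahrenheit = temps * 9 / 5 + 32

# Reduction along axis=1 collapses the columns: one mean per sensor.
mean_per_sensor = temps.mean(axis=1)

# Broadcasting: subtract each sensor's mean from its own row.
anomalies = temps - mean_per_sensor[:, np.newaxis]

print(last_two_steps.shape)
print(warm)
print(mean_per_sensor)
```

A row such as `temps[0]` could then be passed to matplotlib's `plt.plot` to draw a simple 2D line plot.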
2. Data Wrangling, Exploration, and Analysis with Pandas
- 1D and 2D data structures: Series and DataFrame
- Pandas I/O
- Data visualization
- Data manipulation (alignment, aggregation, and summarization)
- Statistical analysis with Pandas
- Date and time series analysis with Pandas
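The Pandas topics listed above can be sketched in a few lines. The example below builds a small DataFrame with a date index, fills a missing value, aggregates by group, and resamples the time series. The data, column names (`site`, `flow`), and fill strategy are assumptions made for illustration, not part of the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical daily measurements with one missing value.
dates = pd.date_range("2020-11-02", periods=6, freq="D")
df = pd.DataFrame(
    {
        "site": ["north", "south", "north", "south", "north", "south"],
        "flow": [10.0, 12.0, np.nan, 14.0, 11.0, 13.0],
    },
    index=dates,
)

# A single column of a DataFrame is a Series.
flow = df["flow"]

# Missing values: fill the gap with the column mean (one possible strategy).
filled = df.assign(flow=flow.fillna(flow.mean()))

# Aggregation: mean flow per site.
mean_by_site = filled.groupby("site")["flow"].mean()

# Time series: resample the daily data into 2-day bins and sum each bin.
two_day_totals = filled["flow"].resample("2D").sum()

# Summary statistics for the numeric column.
print(filled.describe())
print(mean_by_site)
print(two_day_totals)
```

Filling with the mean is only one option; depending on the data, dropping the row or interpolating between neighbors may be more appropriate.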