About This Course
This course is now taught virtually, on GoToMeeting, by an Enthought trainer in real-time.
- The virtual version of this class will be taught over 5 half-days instead of 3 full days. The class will be taught in two 2-hour sessions each day, from 9-11AM and 1-3PM MDT, with a 2-hour break from 11AM-1PM MDT.
We endeavour to deliver these virtual programs as we would a face-to-face program. Interaction with the trainer is encouraged.
This intensive 3-day class is designed for students to gain proficiency using the Python Pandas library for data analysis. It is perfect for someone who uses or plans to use Python and Pandas regularly in their day-to-day work and wants to achieve a high level of proficiency rapidly. With a hands-on, exercise-intensive design and individualized instructor coaching, students will leave this class able to apply what they have learned to their day-to-day work immediately.
Pandas (the Python Data Analysis library) provides a powerful and comprehensive toolset for working with data, including tools for reading and writing diverse files, data cleaning and wrangling, analysis and modeling, and visualization. Fields with widespread use of Pandas include data science, finance, neuroscience, economics, advertising, web analytics, statistics, social science, and many areas of engineering. Quantitative analysts, data scientists, and business analysts will find this class particularly beneficial.
The class progresses step-by-step through a repeatable data analysis workflow using the Python Pandas library, including reading in data from multiple sources and databases, cleaning, merging, and munging data to prepare it for analysis, and data exploration and visualization.
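As a rough illustration of the workflow described above (read, clean, merge, explore), here is a minimal sketch. It is not taken from the course materials: the sample data, column names, and variable names are all hypothetical, and in-memory text stands in for real files or database tables.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for files or database tables.
orders_csv = io.StringIO(
    "order_id,customer_id,amount\n"
    "1,10,25.0\n"
    "2,11,\n"
    "3,10,40.0\n"
)
customers_csv = io.StringIO(
    "customer_id,region\n"
    "10,West\n"
    "11,East\n"
)

# Read: load each source into a DataFrame.
orders = pd.read_csv(orders_csv)
customers = pd.read_csv(customers_csv)

# Clean: drop orders whose amount is missing.
orders_clean = orders.dropna(subset=["amount"])

# Merge: attach each customer's region to their orders.
merged = orders_clean.merge(customers, on="customer_id", how="left")

# Explore: total order amount per region.
totals = merged.groupby("region")["amount"].sum()
print(totals)
```

Dropping incomplete rows is only one possible cleaning choice; filling missing values with `fillna` would be an equally reasonable alternative, depending on the data.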
Topics covered include:
- Accessing Data From Multiple Sources
- Cleaning and Preparing Data
- Database Access and Data Wrangling
- Data Visualization
- Data Analysis
- Real-World Modeling and Problem Solving
"Terrific course! Perfect foundation to train my entire team of data scientists. We now have a common language and common set of tools for our daily research. I'm looking forward to seeing the full impact of this workshop over the next several months."
- Carrie M.
"After finishing Enthought's 'Python for Data Science' and 'Pandas Mastery Workshop' course series, I feel confident and prepared to tackle even the ugliest datasets around. Their teachers are very knowledgeable and do a great job explaining tricky topics with ease and clarity. I highly recommend their training to anyone whose workflow revolves around data."
- William C.
Onsite corporate classes are also available. Discounts are available for 3 or more attendees and academics currently at a degree-granting institution. Contact us using the form on this page to learn more.
|Where|When|Price (per person)|Reserve a Seat|
|Online - Live Virtual|June 14-18, 2021; 2 sessions per day from 9-11AM and 1-3PM MDT|$1500|Register Online|
Course Syllabus & Topics
Due to social distancing measures currently in place to slow the spread of COVID-19, we will be teaching this course online, in real-time on GoToMeeting, with an Enthought trainer. The content and prerequisites for the virtual course do not differ from the face-to-face program.
Knowledge of programming in standard Python (data structures, control flow, assignment, functions, and package access) and familiarity with array programming in NumPy are required. Knowledge of general data analysis techniques and basic statistics (mean, standard deviation, correlation, etc.) is strongly recommended.
- Reading and writing data from local files (.txt, .csv, .xls, .json, etc.)
- Reading data from remote files
- Scraping tables from web pages (.html)
- Making the most of the powerful read_table function
- Working with Pandas data structures: Series and DataFrame
- Accessing your data: indexing, slicing, fancy indexing, boolean indexing
- Data wrangling, including dealing with dates and times and missing data
- Adding, dropping, selecting, creating, and combining rows and columns
- Database access with DB-API2 and SQLAlchemy
- Executing SQL commands from Pandas
- Loading database data into a DataFrame
- Combining and manipulating DataFrames: merge, join, concatenate
- Understanding the structure of a Figure
- Data visualization: scatter plots, line plots, box plots, bar charts, and histograms with matplotlib
- Customizing plots: important attributes and arguments
- Split-apply-combine with DataFrames
- Data summarization and aggregation methods
- Pandas' powerful groupby method
- Reshaping, pivoting, and transforming your data
- Simple and rolling statistics
- In-depth learning of the data analysis tools through lectures, Q&A, and hands-on exercises
- Develop transferable skills through application to authentic data sets
- Predict the future with time series analysis
- And more!
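To give a flavor of a few of the analysis topics listed above (split-apply-combine with groupby, pivoting, and rolling statistics), here is a small sketch. It is an illustration only, not an excerpt from the course: the sensor data and all names are hypothetical.

```python
import pandas as pd

# Hypothetical daily readings for two sensors, "a" and "b".
df = pd.DataFrame(
    {
        "date": pd.date_range("2021-06-14", periods=4, freq="D").repeat(2),
        "sensor": ["a", "b"] * 4,
        "value": [1.0, 10.0, 2.0, 20.0, 3.0, 30.0, 4.0, 40.0],
    }
)

# Split-apply-combine: group by sensor, then aggregate each group.
summary = df.groupby("sensor")["value"].agg(["mean", "max"])

# Reshape: pivot from long format to one column per sensor.
wide = df.pivot(index="date", columns="sensor", values="value")

# Rolling statistics: 2-day rolling mean per sensor.
# The first row is NaN because the window is not yet full.
rolling = wide.rolling(window=2).mean()

print(summary)
print(rolling)
```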