About This Course

This intensive 3-day class is designed to help students gain proficiency with the Python Pandas library for data analysis. It is ideal for anyone who uses, or plans to use, Python and Pandas regularly in their work and wants to reach a high level of proficiency quickly. With a hands-on, exercise-intensive design and individualized instructor coaching, students leave the class ready to apply what they have learned to their day-to-day work immediately.

Pandas (the Python Data Analysis library) provides a powerful and comprehensive toolset for working with data, including tools for reading and writing diverse files, data cleaning and wrangling, analysis and modeling, and visualization. Fields with widespread use of Pandas include data science, finance, neuroscience, economics, advertising, web analytics, statistics, social science, and many areas of engineering. Quantitative analysts, data scientists, and business analysts will find this class particularly beneficial.

This course is currently taught virtually on GoToMeeting, in real time, by an Enthought trainer.

  • The virtual version of this class is taught over 5 half days instead of 3 full days. Each day has two sessions, 9-11 AM and 1-3:30 PM MT, with a 2-hour break from 11 AM to 1 PM MT.


Course Overview

The class progresses step by step through a repeatable data analysis workflow using the Python Pandas library: reading data from multiple sources and databases; cleaning, merging, and munging data to prepare it for analysis; and exploring and visualizing the data.

Topics covered include:

  • Accessing Data From Multiple Sources
  • Cleaning and Preparing Data
  • Database Access and Data Wrangling
  • Data Visualization
  • Data Analysis
  • Real-World Modeling and Problem Solving


Testimonials

"Terrific course! Perfect foundation to train my entire team of data scientists. We now have a common language and common set of tools for our daily research. I'm looking forward to seeing the full impact of this workshop over the next several months."

- Carrie M.

"After finishing Enthought's 'Python for Data Science' and 'Pandas Mastery Workshop' course series, I feel confident and prepared to tackle even the ugliest datasets around. Their teachers are very knowledgeable and do a great job explaining tricky topics with ease and clarity. I highly recommend their training to anyone whose workflow revolves around data."

- William C.

 

Class Schedule

The virtual version of this class is taught over 5 half days instead of 3 full days. Each day has two sessions, 9-11 AM and 1-3:30 PM MT, with a 2-hour break from 11 AM to 1 PM MT. The course is held on GoToMeeting.

Onsite corporate classes are also available. Discounts are available for groups of 3 or more attendees and for academics currently at a degree-granting institution. Contact us using the form on this page to learn more.

Contact Us

Questions or need help registering? Call us at 512.536.1057 or fill out the form:


    Course Syllabus & Topics


    Course Prerequisites

    Knowledge of programming in standard Python (data structures, control flow, assignment, functions, and package access) and familiarity with array programming in NumPy are required. Knowledge of general data analysis techniques and basic statistics (mean, standard deviation, correlation, etc.) is strongly recommended.

    Individuals who have taken Enthought’s Python Foundations, Python for Scientists and Engineers, Python for Data Science, or Python for Machine Learning classes will have the prerequisite knowledge for the course.

    • Reading and writing data from local files (.txt, .csv, .xls, .json, etc.)
    • Reading data from remote files
    • Scraping tables from web pages (.html)
    • Making the most of the powerful read_table function
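
The file-reading topics above might look roughly like the following in practice. This is a minimal sketch, not course material: the sample text, column names, and the "missing" marker are made up for illustration, and in-memory buffers stand in for real file paths or URLs.

```python
import io

import pandas as pd

# Hypothetical sample data; in practice these would be file paths or URLs.
csv_text = "date,city,temp_c\n2021-01-01,Austin,12.5\n2021-01-02,Austin,missing\n"
tsv_text = "city\tpopulation\nAustin\t100\nDenver\t200\n"

# read_csv: parse the date column and treat the string "missing" as NaN.
temps = pd.read_csv(
    io.StringIO(csv_text), parse_dates=["date"], na_values=["missing"]
)

# read_table behaves like read_csv but defaults to a tab separator.
cities = pd.read_table(io.StringIO(tsv_text))

# Write to JSON text and read it back into a DataFrame.
json_text = cities.to_json(orient="records")
cities_again = pd.read_json(io.StringIO(json_text))
```

The same functions accept a local path or a URL in place of the `io.StringIO` buffers used here.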

    • Working with Pandas data structures: Series and DataFrame
    • Accessing your data: indexing, slicing, fancy indexing, boolean indexing
    • Data wrangling, including dealing with dates and times and missing data
    • Adding, dropping, selecting, creating, and combining rows and columns
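
As a rough illustration of the indexing and wrangling topics above, the sketch below builds a small DataFrame and applies each access style once. The data and column names are hypothetical, chosen only to make the example self-contained.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame(
    {
        "day": ["2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04"],
        "units": [10, np.nan, 7, 12],
        "price": [2.5, 2.5, 3.0, 3.0],
    }
)

# Dates: convert strings to datetimes and use them as the index.
df["day"] = pd.to_datetime(df["day"])
df = df.set_index("day")

# Label-based access with .loc (row label, column label).
first_units = df.loc[pd.Timestamp("2021-03-01"), "units"]

# Label slicing on a DatetimeIndex includes both endpoints.
early = df.loc["2021-03-01":"2021-03-02"]

# Fancy indexing: pick rows by a list of integer positions.
picked = df.iloc[[0, 2]]

# Boolean indexing: keep rows where the condition holds.
expensive = df[df["price"] > 2.5]

# Missing data: replace NaN with a default value.
filled = df["units"].fillna(0)

# Adding and dropping columns.
df["revenue"] = df["units"] * df["price"]
trimmed = df.drop(columns=["price"])
```

Note that the missing `units` value propagates into `revenue` as NaN unless it is filled first.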

    • Database access with DB-API2 and SQLAlchemy
    • Executing SQL commands from Pandas
    • Loading database data into a DataFrame
    • Combining and manipulating DataFrames: merge, join, concatenate
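
The database topics above can be sketched with Python's built-in sqlite3 module, which implements DB-API 2.0. The table names, columns, and rows here are invented for illustration; the course may well use a different database, and a SQLAlchemy engine could be passed to Pandas in place of the raw connection.

```python
import sqlite3

import pandas as pd

# An in-memory SQLite database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5)],
)
conn.commit()

# Run SQL from Pandas and load each result into a DataFrame.
customers = pd.read_sql_query("SELECT id, name FROM customers", conn)
orders = pd.read_sql_query("SELECT customer_id, amount FROM orders", conn)
conn.close()

# merge: SQL-style join on key columns.
merged = orders.merge(customers, left_on="customer_id", right_on="id", how="left")
totals = merged.groupby("name")["amount"].sum()

# concat: stack DataFrames with the same columns on top of each other.
stacked = pd.concat([orders, orders], ignore_index=True)
```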

    • Understanding the structure of a Figure
    • Data visualization: scatter plots, line plots, box plots, bar charts, and histograms with matplotlib
    • Customizing plots: important attributes and arguments
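
To make the Figure structure concrete, here is a small sketch that puts two Axes on one Figure and draws a line plot and a histogram from a DataFrame. The data is hypothetical, and the non-interactive "Agg" backend and the in-memory buffer are choices made only so the example runs without a display or a file on disk.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2.0, 4.1, 5.9, 8.2, 9.8]})

# A Figure holds one or more Axes; here, two side by side.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot drawn through the Pandas plotting interface onto a chosen Axes.
df.plot(x="x", y="y", ax=ax_line, marker="o", legend=False)
ax_line.set_title("Line plot")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")

# Histogram of a single column on the second Axes.
df["y"].plot.hist(ax=ax_hist, bins=5)
ax_hist.set_title("Histogram")

# Render the Figure to PNG bytes in memory.
fig.tight_layout()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Passing `ax=` is what ties each Pandas plot to a specific Axes, which is the hook for further customization.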

    • Split-apply-combine with DataFrames
    • Data summarization and aggregation methods
    • Pandas' powerful groupby method
    • Reshaping, pivoting, and transforming your data
    • Simple and rolling statistics
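
The analysis topics above (split-apply-combine, pivoting, rolling statistics) can be sketched on a tiny, made-up sales table. The column names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
sales = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "east", "west"],
        "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1", "Q2"],
        "amount": [100, 150, 80, 120, 50, 30],
    }
)

# Split-apply-combine: split by region, aggregate, combine into one table.
by_region = sales.groupby("region")["amount"].agg(["sum", "mean"])

# Reshape: regions as rows, quarters as columns, summed amounts as values.
table = sales.pivot_table(
    index="region", columns="quarter", values="amount", aggfunc="sum"
)

# Rolling statistic: mean over a sliding window of 3 rows.
# The first two results are NaN because the window is not yet full.
rolling_mean = sales["amount"].rolling(window=3).mean()
```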

    • In-depth learning of the data analysis tools through lectures, Q&A, and hands-on exercises
    • Develop transferable skills through application to authentic data sets
    • Predict the future with time series analysis
    • And more!