About This Course

This 5-day class gets your group up to speed quickly on the core Python language and the key Python packages for data exploration, modeling, and analysis.

Course Overview

The Python for Data Analysis class provides a survey of the Python language and its capabilities for working with data, along with intensive exposure to NumPy and Pandas, the core workhorse tools of data analysis in Python.

Contact Us

Questions or need help registering? Call us at 512.536.1057.

Course Instructors

Enthought instructors have doctorates in scientific fields such as physics, engineering, computer science, and mathematics. All have extensive research and consulting experience applying Python to complex problems across a range of industries, and they bring that real-world, first-hand experience with the tools and technologies covered in our courses to the classroom every day.

Course Syllabus & Topics

Course Prerequisites

Programming experience in some language (such as R, MATLAB, SAS, Mathematica, Java, C, C++, VB, or FORTRAN) is expected. In particular, participants need to be comfortable with general programming concepts like variables, loops, and functions. Previous Python experience is helpful, but not required.

Python Essentials

Participants gain an understanding of how to use the Python standard library to write programs, access various tools, and document and automate analytical processes.

  • Types (strings, lists, dictionaries, and more)
  • Control Flow (if/else statements, looping)
  • Organizing code (functions, modules, packages)
  • Reading and writing files
  • Overview of Object-Oriented Programming (OOP)
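
As a rough illustration of how these essentials fit together, the sketch below uses a dictionary of lists, a loop with a conditional, a small function, and a file round trip. The sensor names, values, and file name are hypothetical and chosen only for this example.

```python
import tempfile
from pathlib import Path


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical readings, stored as a dictionary that maps names to lists.
readings = {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0]}

# Control flow: loop over the dictionary and build one report line per entry.
lines = []
for name, values in readings.items():
    if len(values) >= 3:
        note = "ok"
    else:
        note = "few samples"
    lines.append(f"{name},{mean(values):.1f},{note}")

# Reading and writing files: save the report to a temporary directory,
# then read it back as a list of lines.
report_path = Path(tempfile.mkdtemp()) / "report.csv"
report_path.write_text("\n".join(lines) + "\n")
loaded = report_path.read_text().splitlines()
print(loaded)
```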

NumPy and 2D Plotting

Introduction to NumPy and 2D plotting. The NumPy package is presented as a tool for rapidly manipulating and processing large data sets. 2D plotting is introduced with matplotlib.

  • Understanding the N-dimensional data structure
  • Creating arrays
  • Indexing arrays by slicing or more generally with indices or masks
  • Basic operations and manipulations on N-dimensional arrays
  • Plotting with matplotlib
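
The three indexing styles listed above (slices, index lists, and masks) can be sketched on a small array. This is a minimal example on made-up data, not material taken from the course.

```python
import numpy as np

# A small 2-D array: 3 rows by 4 columns holding the values 0..11.
a = np.arange(12).reshape(3, 4)

# Slicing: every row, columns 1 and 2.
sliced = a[:, 1:3]

# Indexing with a list of indices: rows 0 and 2.
picked = a[[0, 2]]

# Indexing with a boolean mask: all elements greater than 8.
masked = a[a > 8]

# A basic elementwise operation, then a sum down each column (axis=0).
col_sums = (a * 2).sum(axis=0)

print(sliced.shape, picked.shape, masked, col_sums)
```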

Pandas: Python's Workhorse Toolkit for All Things Data Analysis

Built on top of NumPy arrays, the Python Data Analysis Library (Pandas) is a powerful and convenient package for working with tabular datasets. Participants will learn about its data aggregation and reorganization capabilities for exploring data sets, including support for labeling data along each dimension, dealing with missing values, and time series manipulations.

An expert instructor will support students as they work through a typical real-world data analysis project step-by-step using Pandas. This course develops the deep knowledge and skills that will enable students to tackle their own projects with Pandas immediately when they get back to work on Monday morning.

Accessing Data From Multiple Sources

  • Reading and writing data from local files (.txt, .csv, .xls, .json, etc.)
  • Reading data from remote files
  • Scraping tables from web pages (.html)
  • Making the most of the powerful read_table function

Cleaning and Preparing Data

  • Working with Pandas data structures: Series and DataFrames
  • Accessing your data: indexing, slicing, fancy indexing, boolean indexing
  • Data wrangling, including dealing with dates and times and missing data
  • Adding, dropping, selecting, creating, and combining rows and columns
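
A few of these cleaning steps might look like the following in Pandas. The column names and values are invented for illustration, and filling a gap with the column mean is only one of several reasonable strategies.

```python
import pandas as pd

# Hypothetical daily temperatures with one missing value.
df = pd.DataFrame(
    {
        "date": ["2017-11-06", "2017-11-07", "2017-11-08"],
        "temp": [20.5, None, 22.0],
    }
)

# Dates and times: convert the string column to datetime values.
df["date"] = pd.to_datetime(df["date"])

# Missing data: add a column where the gap is filled with the observed mean.
df["temp_filled"] = df["temp"].fillna(df["temp"].mean())

# Boolean indexing: keep the rows whose filled temperature exceeds 21.
warm = df[df["temp_filled"] > 21]

# Dropping a column returns a new DataFrame and leaves df unchanged.
cleaned = df.drop(columns=["temp"])

print(warm)
print(cleaned.columns.tolist())
```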

Database Access and Data Wrangling

  • Database access with DB-API2 and SQLAlchemy
  • Executing SQL commands from Pandas
  • Loading database data into a DataFrame
  • Combining and manipulating DataFrames: merge, join, concatenate
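
One possible shape for this workflow is sketched below, using Python's built-in sqlite3 module (a DB-API 2.0 driver) with an in-memory database rather than SQLAlchemy. The table, columns, and customer names are hypothetical.

```python
import sqlite3

import pandas as pd

# Build a throwaway in-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 9.5), (2, 20.0), (1, 5.5)],
)
conn.commit()

# Execute a SQL query from Pandas and load the result into a DataFrame.
orders = pd.read_sql_query("SELECT customer_id, amount FROM orders", conn)
conn.close()

# Combine with a second DataFrame using a left merge on the shared key.
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ada", "Grace"]})
merged = orders.merge(customers, on="customer_id", how="left")

print(merged)
```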

Data Visualization

  • Understanding the structure of a Figure
  • Data visualization: scatter plots, line plots, box plots, bar charts, and histograms with matplotlib
  • Customizing plots: important attributes and arguments

Data Analysis

  • Split-apply-combine with DataFrames
  • Data summarization and aggregation methods
  • Pandas' powerful groupby method
  • Reshaping, pivoting, and transforming your data
  • Simple and rolling statistics
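
To make the split-apply-combine idea concrete, here is a minimal sketch on invented sales data: groupby splits the rows by region, an aggregation is applied to each group, and the results are combined into one object. A rolling mean is included as an example of a rolling statistic.

```python
import pandas as pd

# Hypothetical sales records.
sales = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west", "east"],
        "units": [10, 4, 6, 8, 2],
    }
)

# Split by region, apply a sum to each group, combine into a Series.
totals = sales.groupby("region")["units"].sum()

# Several aggregations at once, returned as a DataFrame.
summary = sales.groupby("region")["units"].agg(["mean", "max"])

# Rolling statistic: mean over a sliding window of two rows.
# The first entry is NaN because its window is incomplete.
rolling_mean = sales["units"].rolling(window=2).mean()

print(totals)
print(summary)
print(rolling_mean)
```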

Real-World Modeling and Problem Solving

  • In-depth learning of the data analysis tools through lectures, Q&A, and hands-on exercises
  • Develop transferable skills through application to authentic data sets
  • Predict the future with time series analysis
  • And more!

Open Class Schedule

Onsite corporate classes are also available. Discounts are available for 3 or more attendees and academics currently at a degree-granting institution. Contact us to learn more.

Where: San Jose, CA
When: November 6-10, 2017
Price (per person): $2750
Reserve a Seat: Contact us to reserve a seat

The 3-day Pandas Mastery Workshop is an alternative course for those who already have both (1) current working knowledge of programming in the core Python language (data structures, control flow, assignment, functions, and package access) and (2) familiarity with array programming in NumPy.

FAQs

  • Is a class completion certificate provided?
    • Yes, a class completion certificate is provided for the Python for Data Analysis class.

Have a question that isn’t answered here? Contact us or call 512.536.1057.

Testimonials

You could tell from the demos, examples and exercises that this course was designed and taught by someone who has first hand experience of using the tools on real world and real life data.

Business Technologist, Glasgow Caledonian University

The depth and breadth of this course will provide the foundation to efficient data manipulation and encapsulation.