Python for Data Science

Course Overview

This fast-paced class is intended for practicing data scientists, data analysts, and business intelligence experts interested in using Python for their day-to-day work. The primary focus is on learning to use Python tools for data science, data analysis, and machine learning efficiently and effectively.

What You'll Learn

Participants in this course will take away:

  • Hands-on experience setting up an integrated analysis environment for doing data science with Python.
  • An understanding of how to use the Python standard library to write programs, access the various data science tools, and document and automate analytic processes.
  • Orientation to some of the most powerful and popular Python libraries for data science including Pandas (data preparation, analysis, and modeling; time series analysis), scikit-learn (machine learning), and Matplotlib and Seaborn (data visualization).
  • Working knowledge of the Python tools ideally suited for data science tasks, including:
    • Accessing data (e.g., text files, databases)
    • Cleansing and normalizing data
    • Exploring data (e.g., simple statistics, correlation matrices, visualization)
    • Modeling data (e.g., machine learning)

Course Syllabus

I. Introduction and Setting Up Your Integrated Analysis Environment

Setting Up Your Integrated Analysis Environment & Tools Overview

  • IPython Shell
  • Custom environment settings
  • Jupyter Notebooks
  • Script editor
  • Packages: NumPy, SciPy, scikit-learn, Pandas, Matplotlib, Seaborn, etc.

Once you complete this module, you will understand which features make Python particularly well suited to data science, you will be able to set up a fully functioning Python-based analysis environment, and you will know what each tool is used for in the data science workflow.

II. Using Python to Control and Document Your Data Science Processes

Python Essentials

  • Data types and objects
  • Loading packages, namespaces
  • Reading and writing data
  • Simple plotting
  • Control flow
  • Debugging
  • Code profiling

Once you complete this module, you will be able to use the Python standard library plus Canopy tools to write, run, debug, and profile programs that control your data science processes (which draw on the scientific packages).
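As an illustration of the code profiling topic above, here is a minimal sketch that uses only the standard library modules cProfile and pstats. The function slow_sum_of_squares is a hypothetical example written for this sketch, not part of the course material.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop so the profiler has something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Profile a single call; runcall returns the function's own return value.
profiler = cProfile.Profile()
result = profiler.runcall(slow_sum_of_squares, 100_000)

# Write the five most expensive entries (by cumulative time) to a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time is one reasonable default for finding the call path that dominates a run; sorting by "tottime" instead highlights the individual functions that are slow in their own right.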

III. Accessing and Preparing Data

Data, Data, Everywhere...

Acquiring Data with Python

  • Loading from CSV files
  • Accessing SQL databases

Cleansing Data with Python

  • Stripping out extraneous information
  • Normalizing data
  • Formatting data

Once you complete this module, you will know how to load data from common types of data sources, including structured text files and SQL databases, and you will know some of the common tools used in Python to cleanse and prepare your data for analysis.
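The acquire-and-cleanse workflow in this module might look something like the following sketch. It assumes only the standard library (csv and sqlite3); the course itself may use other tools such as Pandas for the same steps. The sample data, the sales table, and the clean_row helper are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a CSV file on disk.
RAW_CSV = """name,city,revenue
  Alice ,austin,"1,200"
Bob,HOUSTON,950
,Austin,300
"""


def clean_row(row):
    """Strip whitespace, normalize case, and convert revenue to a number."""
    name = row["name"].strip()
    if not name:
        return None  # drop records that have no name
    city = row["city"].strip().title()
    revenue = float(row["revenue"].replace(",", ""))
    return (name, city, revenue)


# Load: parse the CSV text into dictionaries keyed by the header row.
reader = csv.DictReader(io.StringIO(RAW_CSV))
cleaned = [r for r in map(clean_row, reader) if r is not None]

# Store the cleaned rows in an in-memory SQLite database and query them back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name TEXT, city TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)
totals = conn.execute(
    "SELECT city, SUM(revenue) FROM sales GROUP BY city ORDER BY city"
).fetchall()
conn.close()
print(totals)
```

The parameterized INSERT (the ? placeholders) keeps data values separate from the SQL text, which is the usual way to avoid quoting problems when loading untrusted input.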

IV. Numerical Analysis, Data Exploration, and Data Visualization with NumPy Arrays, Matplotlib, and Seaborn

NumPy Essentials

  • The NumPy array
  • N-dimensional array operations and manipulations
  • Memory mapped files

Data Visualization

  • 2D plotting with Matplotlib
  • Advanced data visualization with Seaborn

Once you complete this module, you will understand how to use NumPy arrays for efficient numerical processing and how to use NumPy features such as slicing to write code that is both compact and easy to read. You will know how to use Matplotlib, Seaborn, and NumPy together to explore and visualize your data.
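To give a flavor of the array operations covered here, the sketch below shows slicing, vectorized arithmetic, and boolean masking on a small NumPy array. The array contents are made up for illustration, and plotting is left out to keep the example short.

```python
import numpy as np

# Hypothetical 3x4 table of measurements: rows are samples, columns are sensors.
data = np.arange(12, dtype=float).reshape(3, 4)

# Slicing selects sub-arrays without writing loops.
first_two_columns = data[:, :2]   # every row, columns 0 and 1
last_row = data[-1]               # the final sample
every_other = data[::2, ::2]      # rows 0 and 2, columns 0 and 2

# Vectorized arithmetic replaces an explicit Python loop.
column_means = data.mean(axis=0)
centered = data - column_means    # broadcasting subtracts each column's mean

# A boolean mask selects the elements that satisfy a condition.
large_values = data[data > 8]

print(every_other)
print(column_means)
print(large_values)
```

Basic slices such as data[:, :2] return views onto the original array rather than copies, so assigning into a slice also changes data.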

V. Exploring Data with Pandas

Searching for Gold in a Pile of Pyrite

  • Data manipulation with Pandas
  • Statistical analysis with Pandas
  • Time series analysis with Pandas

At the end of this module, you will know how to access some of the core tools used for statistical analysis and data exploration in Python.
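The three Pandas topics listed above (manipulation, statistics, and time series) can be sketched in a few lines. The store names and sales figures below are invented for this example.

```python
import pandas as pd

# Hypothetical daily sales for two stores, indexed by date.
dates = pd.date_range("2017-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {
        "store": ["A", "B", "A", "B", "A", "B"],
        "sales": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    },
    index=dates,
)

# Data manipulation: derive a column, then filter rows.
df["sales_k"] = df["sales"] / 1000
store_a = df[df["store"] == "A"]

# Statistical analysis: summary statistic per group.
mean_by_store = df.groupby("store")["sales"].mean()

# Time series analysis: total sales in two-day bins.
two_day_totals = df["sales"].resample("2D").sum()

print(mean_by_store)
print(two_day_totals)
```

Resampling works here because the DataFrame has a DatetimeIndex; with dates stored in an ordinary column, one option is to call set_index on that column first.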

VI. Machine Learning with scikit-learn

Predicting the Future Can Be Good for Business

  • Input: 2D, samples, and features
  • Estimator, predictor, transformer interfaces
  • Pre-processing data
  • Regression
  • Classification
  • Model selection

At the end of this module you will have a working understanding of what machine learning tools are available in scikit-learn and how to use them.
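The estimator, transformer, and predictor interfaces listed above fit together roughly as in this sketch. The choice of the bundled iris dataset, StandardScaler, and LogisticRegression is an assumption made for illustration; the course may use different data and models.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Input: X is a 2D array of shape (n_samples, n_features); y holds the labels.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A transformer (scaling) chained with a classifier; the pipeline itself
# exposes the same fit / predict / score interface as a single estimator.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

predictions = model.predict(X_test)
test_accuracy = model.score(X_test, y_test)

# Model selection: 5-fold cross-validation on the training data only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

print(f"held-out accuracy: {test_accuracy:.2f}")
print(f"mean CV accuracy: {cv_scores.mean():.2f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold during cross-validation, so no information from the validation fold leaks into pre-processing.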


Prerequisites

The course assumes a working knowledge of key data science topics (statistics, machine learning, and general data analytic methods). Programming experience in some language (such as R, MATLAB, SAS, Mathematica, Java, C, C++, VB, or FORTRAN) is expected. In particular, participants need to be comfortable with general programming concepts like variables, loops, and functions. Experience with Python is helpful (but not required).


FAQs

  • Is a class completion certificate provided?
  • Yes, a class completion certificate is provided for the Python for Data Science class.

    Have a question that isn't answered here? Contact us or call 512.536.1057.




For inquiries or to register call 512.536.1057


San Jose, CA: Aug 21-25, 2017
Houston, TX: Oct 2-6, 2017
Albuquerque, NM: Oct 16-20, 2017
$2750
Washington, DC: Nov 13-17, 2017
London, UK: Nov 20-24, 2017
New York City, NY: Dec 4-8, 2017
Austin, TX: Dec 11-15, 2017

Discounts are available for 3+ attendees, and corporate training options are also available.

A 20% discount is available for academics currently at a degree-granting institution.

Contact us or call 512.536.1057 for more information.

Questions or want to reserve a seat in an upcoming class?

Call 512.536.1057 or fill out the form below.

Testimonials

“Excellently taught course. Coming from a background of R and SAS, I finally understand Python beyond the basics and have necessary tools to really harness its power. This wouldn't have been possible without the instructor's deep knowledge, and the patience and willingness to share it!” —Statistician, Pharmaceutical Industry
“Excellent, important, succinct way to build a knowledge base in Python, critical to helping me develop the skill to successfully implement my ideas in new ways.” —Tyler C., Marketing Analyst, Best Buy
“A very insightful course, delivered by a true expert. I left the course with hundreds of ideas upon which I can now act.” —Neal M., Research Manager, Shell Oil
“Highly recommended if you want to learn or improve your Python. The trainer was the best trainer I have ever encountered. He had a nice style of presenting and was very intelligent and knowledgeable. Easily answered even the most complex questions thrown at him.” —Ahmad H., Software Engineer, Financial Services
“I did not think that I would learn Python programming in one week given that I do not have strong background in programming, but with the instructor and the whole course I am ready to implement my reports to Python and effectively enhance their efficiency and quality.” —Yacoub N., Senior Manager, Investments, Abu Dhabi Investment Authority
“It was a fantastic class. I cannot believe how much I learned in five days. The instructor was excellent.” —Lee I., Ziften Software
“The course and program were great. Right pace, right breadth of material to cover. Also, the instructor was very knowledgeable and really did a good job of walking the fine line between providing enough support/guidance to us and letting us get ourselves into trouble before throwing us the life raft.” —Aubra A., Analyst, International Development
“Great course, Python is demystified in less than a week. I'm ready to apply it to my projects.” —Javier P., Data Scientist
