The Latest and Greatest Pandas Features (since v 0.11)

On May 28, 2014, Phillip Cloud, a core contributor to the Pandas data analytics Python library, spoke at a joint meetup of the New York Quantitative Python User’s Group (NY QPUG) and the NY Finance PUG. Enthought hosted, and about 60 people joined us to hear Phillip present some of the lesser-known but really useful features that have come out since Pandas version 0.11, along with some that are coming soon. We all learned more about how to take full advantage of the Pandas Python library, and got a better sense of how excited Phillip was to discover Pandas during his graduate work.

After a fairly comprehensive overview of Pandas, Phillip got into the new features. In version 0.11 he covered:

  • indexers loc/at, iloc/iat,
  • dtypes,
  • using numexpr to evaluate arithmetic expressions for large objects.

He focused mainly on numexpr. Then, for version 0.12, he went into some depth on read_html. In the process he read data from a website and re-created a plot from the website. His examples are valuable as a way to see how an expert uses the Pandas package. He also went over read_json and other new features, again with some really interesting examples.
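As a rough illustration of the indexers, dtypes, and read_json mentioned above, here is a minimal sketch. It is not taken from Phillip's notebooks: the column names, index labels, and JSON text are made up for the example, and it is written against a recent Pandas release rather than 0.11 or 0.12.

```python
import io

import pandas as pd

# Hypothetical data, invented for this sketch.
df = pd.DataFrame(
    {"price": [10.5, 11.0, 9.75], "volume": [100, 250, 175]},
    index=["a", "b", "c"],
)

# loc selects by label; label slices include both endpoints.
by_label = df.loc["a":"b", "price"]

# iloc selects by integer position; the end of the slice is excluded.
by_position = df.iloc[0:2, 0]

# at and iat are the scalar counterparts of loc and iloc.
scalar_by_label = df.at["b", "volume"]
scalar_by_position = df.iat[1, 1]

# Each column carries its own dtype.
column_dtypes = df.dtypes

# read_json parses JSON into a DataFrame; here it reads from an in-memory
# buffer instead of a URL or file.
json_text = '[{"ticker": "X", "close": 1.5}, {"ticker": "Y", "close": 2.5}]'
from_json = pd.read_json(io.StringIO(json_text))
```

The label-based and position-based selections return the same two rows here, which makes the difference in slice endpoints easy to see.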

Phillip covered some experimental features in version 0.13, including query/eval, msgpack IO and Google BigQuery IO. He then wrapped up with a sneak peek at some features in version 0.14 (soon to be released), including MultiIndex slicing. His MultiIndex slicing example comes from his work on neuroscience (his cool data collection system is in the figure below).
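To give a feel for query/eval and MultiIndex slicing, here is a small sketch. The data and the names (subject, trial, signal) are hypothetical and are not Phillip's neuroscience example. When numexpr is installed, query and eval use it by default to evaluate the expression; otherwise Pandas falls back to plain Python evaluation.

```python
import pandas as pd

# Hypothetical data, invented for this sketch.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]})

# query filters rows using a string expression over column names.
filtered = df.query("a > b")

# eval computes an expression over columns and returns the result.
total = df.eval("a + b")

# A two-level index built from every (subject, trial) combination.
# from_product yields a sorted index, which label slicing relies on.
index = pd.MultiIndex.from_product(
    [["s1", "s2"], [1, 2, 3]], names=["subject", "trial"]
)
recordings = pd.DataFrame(
    {"signal": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]}, index=index
)

# IndexSlice lets loc slice each index level separately:
# every subject, trials 2 through 3 inclusive.
idx = pd.IndexSlice
late_trials = recordings.loc[idx[:, 2:3], :]
```

Without IndexSlice, the same selection would need a boolean mask built from the trial level, so the slicing syntax mostly buys readability.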

You can watch his presentation below, and you can get his IPython Notebooks from the talk as well.

The Latest and Greatest Pandas Features (since v 0.11) from NYQPUG.
