The Latest and Greatest Pandas Features (since v 0.11)

On May 28, 2014, Phillip Cloud, a core contributor to the Pandas data analytics library for Python, spoke at a joint meetup of the New York Quantitative Python User’s Group (NY QPUG) and the NY Finance PUG. Enthought hosted, and about 60 people joined us to hear Phillip present some of the lesser-known but really useful features that have come out since Pandas version 0.11, along with some that are coming soon. We all learned more about how to take full advantage of Pandas, and got a better sense of how excited Phillip was to discover it during his graduate work.

After a fairly comprehensive overview of Pandas, Phillip got into the new features. In version 0.11 he covered:

  • indexers loc/at, iloc/iat,
  • dtypes,
  • using numexpr to evaluate arithmetic expressions for large objects.

Of these, he focused mainly on numexpr. Then, for version 0.12, he went into some depth on read_html. In the process he read data from a website and re-created a plot from that site. His examples are valuable as a way to see how an expert uses the Pandas package. He also went over read_json and other new features, again with some really interesting examples.
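For readers who have not used these features, here is a minimal sketch of the 0.11 indexers and dtypes, plus a read_json round trip. It is not taken from Phillip's notebooks: the DataFrame, its column names, and its values are made up for illustration. The arithmetic line is ordinary pandas code; when numexpr is installed, pandas may use it behind the scenes for sufficiently large objects.

```python
from io import StringIO

import pandas as pd

# Illustrative data only (not from the talk).
df = pd.DataFrame(
    {"price": [10.0, 12.5, 11.0], "volume": [100, 250, 175]},
    index=["a", "b", "c"],
)

# Label-based access: loc for rows/slices, at for a single scalar.
row_b = df.loc["b"]
price_b = df.at["b", "price"]

# Position-based access: iloc for rows/slices, iat for a single scalar.
first_row = df.iloc[0]
volume_last = df.iat[2, 1]

# Each column carries its own dtype.
column_dtypes = df.dtypes

# Plain element-wise arithmetic; pandas can hand this to numexpr
# for large objects when numexpr is available.
notional = df["price"] * df["volume"]

# Round trip through JSON text with to_json / read_json.
json_text = df.to_json()
roundtrip = pd.read_json(StringIO(json_text))
```

read_html works along the same lines but returns a list of DataFrames, one per table found on the page, and needs an HTML parser such as lxml installed.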

Phillip covered some experimental features in version 0.13, including query/eval, msgpack IO and Google BigQuery IO. He then wrapped up with a sneak peek at some features in version 0.14 (soon to be released), including MultiIndex slicing. His MultiIndex slicing example comes from his work in neuroscience (his cool data collection system is in the figure below).
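As a rough sketch of what query/eval and MultiIndex slicing look like in practice, the snippet below uses small made-up data. The "subject", "trial" and "signal" names are hypothetical stand-ins and are not Phillip's neuroscience data set.

```python
import numpy as np
import pandas as pd

# Illustrative data only (not from the talk).
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]})

# query filters rows with a string expression over column names.
subset = df.query("a > b")

# eval computes a string expression over columns and returns the result.
total = df.eval("a + b")

# A two-level index; the level names here are hypothetical.
index = pd.MultiIndex.from_product(
    [["s1", "s2"], ["t1", "t2", "t3"]], names=["subject", "trial"]
)
data = pd.DataFrame({"signal": np.arange(6.0)}, index=index)

# MultiIndex slicing: every subject, but only trials t1 and t3.
idx = pd.IndexSlice
sliced = data.loc[idx[:, ["t1", "t3"]], :]
```

pd.IndexSlice is only a convenience for writing the per-level selectors; the index generally needs to be sorted for this kind of slicing, which from_product gives us here.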

You can watch his presentation below, and you can get his IPython Notebooks from the talk as well.

The Latest and Greatest Pandas Features (since v 0.11) from NYQPUG.
