Exploring Python Objects

Introduction


When we teach our foundational Python class, we make sure that our students know how to explore Python from the command line. This has two advantages. First, it reduces context switching: to figure out something new, students don’t have to keep toggling between writing Python code and searching for documentation on the web or in a book. Second, it encourages an experimental mindset: with a small set of simple tools, students can examine unfamiliar Python objects, figure out what they do and how to use them correctly, and discover new possibilities for what they could do.

Let’s take a look at some of the functions that Python provides for exploring objects: help(), dir(), and type().

help()

One of the main ways to explore objects is Python’s built-in help() function (and its IPython equivalents, ? and ??). Calling help() with a type name or an actual object pulls up the help for that type:

>>> help(list)
Help on class list in module builtins:

class list(object)
| list() -> new empty list
| list(iterable) -> new list initialized from iterable's items
|
| Methods defined here:
|
| __add__(self, value, /)
| Return self+value.
...

In [1]: # if in IPython
In [2]: l = [1, 2, 3]
In [3]: l?
Type: list
String form: [1, 2, 3]
Length: 3
Docstring:
list() -> new empty list
list(iterable) -> new list initialized from iterable's items

Both help() and ? draw on the object’s docstring (stored in the .__doc__ attribute) for the information to display. Both also let you ask about just one part of an object, such as a single method. For example:

>>> help(l.append)
Help on built-in function append:

append(...) method of builtins.list instance
L.append(object) -> None -- append object to end

In most cases, this will give you the information you need to proceed. However, it works best when you know what you are looking for, but just need to be reminded of the details. It also works best on smaller objects or specific object attributes. For example, most people will not have the requisite patience to read everything that comes up if you were to type:

>>> import numpy as np
>>> help(np)

Since help() works through every object in the NumPy package (there are 621 names at the top level at the time of writing), that is a lot of reading!
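When the full help() output is more than you need, one option is to read the docstring directly. The following is a minimal sketch rather than anything help() itself provides: doc_summary is a hypothetical helper name, and it relies on inspect.getdoc() from the standard library, which returns the cleaned-up docstring or None when there is none.

```python
import inspect


def doc_summary(obj, max_lines=3):
    """Return the first few lines of an object's docstring ('' if it has none)."""
    doc = inspect.getdoc(obj)  # normalizes indentation; None if no docstring
    if not doc:
        return ''
    return '\n'.join(doc.splitlines()[:max_lines])


numbers = [1, 2, 3]
print(doc_summary(numbers.append))      # short reminder of what append does
print(doc_summary(list, max_lines=2))   # just the top of the list docstring
```

The exact text printed depends on your Python version, since docstrings are reworded from release to release.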

dir()

The dir() function returns a directory of the type or object that you provide as an argument. For instance:

>>> dir(l)
['__add__', '__class__', '__contains__', '__delattr__',
'__delitem__', '__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__getitem__', '__gt__',
'__hash__', '__iadd__', '__imul__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__len__', '__lt__',
'__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__reversed__', '__rmul__', '__setattr__',
'__setitem__', '__sizeof__', '__str__', '__subclasshook__',
'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert',
'pop', 'remove', 'reverse', 'sort']

>>> len(dir(l))
46

Even a simple list has 46 attributes on it to sort out. In most cases, unless you are doing some object-oriented programming, you can ignore the “dunder” attributes (those that start and end with double underscores). That leaves the eleven names starting with “append” through “sort”.
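Rather than skipping the dunder names by eye, you can filter them out. This is a small sketch using only dir() and string methods; the variable names are illustrative.

```python
numbers = [1, 2, 3]

# Keep only the names that do not both start and end with a double underscore
public_names = [
    name for name in dir(numbers)
    if not (name.startswith('__') and name.endswith('__'))
]

print(public_names)
print(len(public_names))
```

The total returned by dir() varies a little between Python versions because dunder names are added over time, but on current Python 3 releases the filtered list should be the same eleven names, from append through sort.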

By reading (or scrolling through) those names, you can often figure out promising object attributes and methods for whatever you are trying to do. At the very least, this generally helps you find more precise items to pass into help() for more details.

Unfortunately, the list of names returned by dir() does not tell you what kind of object each name refers to. So, sometimes you just try them out:

>>> l.reverse
<built-in method reverse of list object at 0x0000029CA0581208>

If you get a message like this, you know the attribute is really an object method that you will need to call as l.reverse(), possibly with some arguments. In this case, it is probably a good idea to call help(l.reverse) and see what the documentation says.

Leveraging type(), getattr(), and callable()

Large objects like the top-level NumPy and Pandas modules are still a bit of a problem for dir(), as it is hard to sort out exactly what each name really represents. Still, you are getting a list back. With that list and some more of the tools that Python provides for object exploration, we can figure out a lot. These tools include:

  • type() – get an object’s type (this returns the type object itself, not just its name)
  • getattr() – get an object’s attribute from its name; this allows us to use the names listed in the output of dir() to retrieve an actual Python object that we can use.
  • callable() – test if an object is callable (usually we are trying to see if an attribute is a method that can be run or a class that can be instantiated)
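Here is a quick sketch of the three functions working together on a plain list. The names chosen for the loop are just examples taken from the dir() output.

```python
numbers = [1, 2, 3]

for name in ['append', 'sort', '__doc__']:
    attr = getattr(numbers, name)  # same as numbers.append, numbers.sort, ...
    print(name, type(attr).__name__, callable(attr))

# On CPython this prints:
# append builtin_function_or_method True
# sort builtin_function_or_method True
# __doc__ str False
```

Note that getattr() takes the attribute name as a string, which is what makes it a natural partner for the list of strings that dir() returns.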

With these tools and a little creative programming, you can filter out information that you consider irrelevant and focus on what you are actually interested in. Consider the following function:

import pandas as pd

def obj_explore(obj, dunders=False):
    """Summarize obj's attributes: type name and whether each is callable."""
    df = pd.DataFrame(columns=['Attribute', 'Type', 'Callable'])
    for attr_name in dir(obj):
        # Skip "dunder" names unless the caller asks for them
        if not dunders:
            if attr_name.startswith('__') and attr_name.endswith('__'):
                continue
        attr = getattr(obj, attr_name)
        # Append one row per attribute
        df.loc[len(df)] = [
            attr_name,
            type(attr).__name__,
            callable(attr)
        ]
    df = df.set_index('Attribute')
    return df


With Pandas imported and the function defined, I can explore NumPy:

>>> import numpy as np
>>> obj_explore(np)
                                     Type  Callable
Attribute
ALLOW_THREADS                         int     False
AxisError                            type      True
BUFSIZE                               int     False
CLIP                                  int     False
ComplexWarning                       type      True
...                                   ...       ...
warnings                           module     False
where                            function      True
who                              function      True
zeros          builtin_function_or_method      True
zeros_like                       function      True

[607 rows x 2 columns]


The function automatically filters out the dunder methods (unless you set the dunders argument to True), and provides you with a dataframe showing the type of each attribute and whether or not it is callable.
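If Pandas is not available, the same idea works with plain Python data structures. The sketch below is an assumed variant, not the function above: explore is a hypothetical name, it returns a list of tuples instead of a dataframe, and it skips any attribute whose lookup raises an exception, which can happen with some properties or lazily loaded attributes.

```python
import math


def explore(obj, dunders=False):
    """Return (name, type name, is callable) tuples for obj's attributes."""
    rows = []
    for name in dir(obj):
        if not dunders and name.startswith('__') and name.endswith('__'):
            continue
        try:
            attr = getattr(obj, name)
        except Exception:
            # Attribute access itself failed; leave this name out
            continue
        rows.append((name, type(attr).__name__, callable(attr)))
    return rows


# Try it on a small standard library module
for row in explore(math)[:5]:
    print(row)
```

A list of tuples is easy to filter with a list comprehension, though you lose the convenient column-based filtering that a dataframe gives you.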

On big packages like NumPy you can now take advantage of the Pandas DataFrame’s ability to filter on any of the columns to focus on what you are interested in. For instance, if I want to know what floating point constants NumPy has defined, I can:

>>> df = obj_explore(np)
>>> df[df.Type == 'float']
              Type  Callable
Attribute
Inf          float     False
Infinity     float     False
NAN          float     False
NINF         float     False
NZERO        float     False
NaN          float     False
PINF         float     False
PZERO        float     False
e            float     False
euler_gamma  float     False
inf          float     False
infty        float     False
nan          float     False
pi           float     False
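Before filtering on a particular type, it can help to see which types are present at all. With the dataframe you could call df.Type.value_counts(); the sketch below shows the same idea with only the standard library, using the math module as a small stand-in for NumPy. The variable names are illustrative.

```python
from collections import Counter
import math

# Count how many non-dunder attributes there are of each type
type_counts = Counter(
    type(getattr(math, name)).__name__
    for name in dir(math)
    if not (name.startswith('__') and name.endswith('__'))
)
print(type_counts.most_common())

# Then pull out the names of one type, here the floating point constants
float_names = [
    name for name in dir(math)
    if isinstance(getattr(math, name), float)
]
print(float_names)  # ['e', 'inf', 'nan', 'pi', 'tau'] on Python 3.6 and later
```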

 

Conclusion

While the obj_explore() function might not fit your needs, it is a reminder that Python provides a lot of built-in tools that we can use to explore Python objects. Many of them were designed to be used interactively. However, there is no reason why you can’t take advantage of your Python programming skills to facilitate and partially automate your exploration of the language and its many third-party packages.

About the Author

Eric Olsen holds a Ph.D. in history from the University of Pennsylvania, an M.S. in software engineering from Pennsylvania State University, and a B.A. in computer science from Utah State University. Eric spent three decades working in software development in a variety of fields, including atmospheric physics research, remote sensing and GIS, retail, and banking. In each of these fields, Eric focused on building software systems to automate and standardize the many repetitive, time-consuming, and unstable processes that he encountered.
