Introduction
When we teach our foundational Python class, one of the things we do is make sure that our students know how to explore Python from the command line. This has several advantages. First, it reduces context switching – to figure out new stuff, students don’t constantly have to toggle between writing Python code and searching for documentation on the web or in a book. Second, it encourages an experimental mindset – students can use a set of simple tools to examine unfamiliar Python objects, figure out what they do and how to use them correctly, and find new possibilities for what they could do.
Let’s take a look at some of the functions that Python provides for exploring objects: help(), dir(), and type().
help()
One of the main ways to explore objects is Python’s built-in help() function (and its IPython surrogates, ? and ??). Calling help() with a type name or an actual object will pull up help on that type:
>>> help(list)
Help on class list in module builtins:

class list(object)
 |  list() -> new empty list
 |  list(iterable) -> new list initialized from iterable's items
 |
 |  Methods defined here:
 |
 |  __add__(self, value, /)
 |      Return self+value.
...
In [1]: # if in IPython
In [2]: l = [1, 2, 3]
In [3]: l?
Type: list
String form: [1, 2, 3]
Length: 3
Docstring:
list() -> new empty list
list(iterable) -> new list initialized from iterable's items
Both help() and ? draw on the object’s docstring (stored in the .__doc__ attribute) for the information to display. Both facilities also let you ask about just one part of an object. For example:
>>> help(l.append)
Help on built-in function append:

append(...) method of builtins.list instance
    L.append(object) -> None -- append object to end
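To see where that text comes from, here is a minimal sketch using a made-up function, scale(), which is not part of the original example. Its docstring is what help() and ? would display, and inspect.getdoc() is a standard-library helper that returns the same text with the indentation cleaned up.

```python
import inspect

def scale(values, factor=2):
    """Return a new list with every item multiplied by factor."""
    return [v * factor for v in values]

# The raw docstring lives on the function object itself.
print(scale.__doc__)

# inspect.getdoc() returns the docstring with indentation normalized.
doc = inspect.getdoc(scale)
print(doc)

# Built-in methods carry docstrings too; this is the text behind help(l.append).
append_doc = [].append.__doc__
print(append_doc)
```

The exact wording of the built-in docstrings can differ between Python versions, so the last line may not match the session shown above character for character.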
In most cases, this will give you the information you need to proceed. However, it works best when you know what you are looking for, but just need to be reminded of the details. It also works best on smaller objects or specific object attributes. For example, most people will not have the requisite patience to read everything that comes up if you were to type:
>>> import numpy as np
>>> help(np)
Since help recurses into every object in the NumPy package (there are currently 621 of them at the top-level), that is a lot of reading!
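One way to gauge how much reading is in store before calling help() is to render roughly the same text with the standard library’s pydoc module and count the lines. This is a minimal sketch, not part of the original walkthrough; it uses the json module as a stand-in so that it runs without NumPy installed.

```python
import json
import pydoc

# render_doc() builds the documentation text that help() would page through;
# the plaintext renderer avoids terminal bold-formatting characters.
text = pydoc.render_doc(json, renderer=pydoc.plaintext)

n_lines = len(text.splitlines())
print(f"help(json) would show about {n_lines} lines")
```

Swapping json for a larger package gives a quick sense of whether help() on the whole thing is worth it, or whether it is better to narrow down to a single function first.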
dir()
The dir() function returns a directory (a list of attribute names) for the type or object that you provide as an argument. For instance:
>>> dir(l)
['__add__', '__class__', '__contains__', '__delattr__',
'__delitem__', '__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__getitem__', '__gt__',
'__hash__', '__iadd__', '__imul__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__len__', '__lt__',
'__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__reversed__', '__rmul__', '__setattr__',
'__setitem__', '__sizeof__', '__str__', '__subclasshook__',
'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert',
'pop', 'remove', 'reverse', 'sort']
>>> len(dir(l))
46
Even a simple list has 46 attributes on it to sort out. In most cases, unless you are doing some object-oriented programming, you can ignore the “dunder” attributes (those that start and end with double underscores). That leaves the eleven names starting with “append” through “sort”.
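Rather than skipping the dunder names by eye, you can filter them out. The following is a small sketch of that idea; the variable name public_names is our own choice for illustration. Note that the total count from dir() can differ between Python versions, while the public names of a list are much more stable.

```python
l = [1, 2, 3]

# Keep only names that do not both start and end with double underscores.
public_names = [
    name for name in dir(l)
    if not (name.startswith('__') and name.endswith('__'))
]

print(public_names)
print(len(public_names))
```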
By reading (or scrolling through) those names, you can often figure out promising object attributes and methods for whatever you are trying to do. At the very least, this generally helps you find more precise items to pass into help() for more details.
Unfortunately, the list of attributes returned by dir() does not tell you what kind of object each name refers to. So, sometimes you just try them out:
>>> l.reverse
<built-in method reverse of list object at 0x0000029CA0581208>
If you get a message like this, you know the attribute is really an object method that you will need to call as l.reverse(), possibly with some arguments. In this case, it is probably a good idea to call help(l.reverse) and see what the documentation says.
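The difference between naming a method and calling it can be sketched as follows. The variable names method and result are ours, chosen only for illustration.

```python
l = [1, 2, 3]

# Without parentheses we only get the bound method object back.
method = l.reverse
print(callable(method))

# With parentheses the method runs; list.reverse() works in place
# and returns None rather than a new list.
result = l.reverse()
print(result, l)
```

This is also a reminder of why a quick look at help(l.reverse) pays off: the documentation tells you the list is changed in place, which the method name alone does not.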
Leveraging type(), getattr(), and callable()
Large objects like the top level NumPy and Pandas objects are still a bit of a problem with dir() as it is hard to sort out exactly what each name really represents. Still, you are getting a list back. With the list and some more of the tools that Python provides for object exploration, we can figure out a lot. These tools include:
type() – get an object’s type (this is the type object, not just the name)
getattr() – get an object’s attribute from its name; this allows us to use the names listed in the output of dir() to retrieve an actual Python object that we can use
callable() – test if an object is callable (usually we are trying to see if an attribute is a method that can be run or a class that can be instantiated)
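Here is a short sketch of the three tools applied to a single attribute of a list. The variable names are ours and the list is just a convenient small object to experiment on.

```python
l = [1, 2, 3]

# getattr() turns a name from dir() into the actual attribute object.
attr = getattr(l, 'reverse')

# type() returns the type object; __name__ gives its printable name.
type_name = type(attr).__name__
print(type_name)

# callable() reports whether the attribute can be called.
print(callable(attr))

# A non-callable attribute for comparison: the docstring is a plain str.
doc_attr = getattr(l, '__doc__')
print(type(doc_attr).__name__, callable(doc_attr))
```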
With these tools and a little creative programming, you can filter out information that you consider irrelevant and focus on what you are actually interested in. Consider the following function:
import pandas as pd

def obj_explore(obj, dunders=False):
    """Summarize an object's attributes: type name and whether each is callable."""
    df = pd.DataFrame(columns=['Attribute', 'Type', 'Callable'])
    for attr_name in dir(obj):
        # Skip dunder names unless the caller asked for them
        if not dunders:
            if attr_name.startswith('__') and attr_name.endswith('__'):
                continue
        attr = getattr(obj, attr_name)
        # Append one row per attribute
        df.loc[len(df)] = [
            attr_name,
            type(attr).__name__,
            callable(attr)
        ]
    df = df.set_index('Attribute')
    return df
If I have Pandas imported and the function defined, I can run:
>>> import numpy as np
>>> obj_explore(np)
                                      Type  Callable
Attribute
ALLOW_THREADS                          int     False
AxisError                             type      True
BUFSIZE                                int     False
CLIP                                   int     False
ComplexWarning                        type      True
...                                    ...       ...
warnings                            module     False
where                             function      True
who                               function      True
zeros           builtin_function_or_method      True
zeros_like                        function      True

[607 rows x 2 columns]
The function automatically filters out the dunder methods (unless you set the dunders argument to True), and provides you with a dataframe showing the type of each attribute and whether or not it is callable.
On big packages like NumPy you can now take advantage of the Pandas DataFrame’s ability to filter on any of the columns to focus on what you are interested in. For instance, if I want to know what floating point constants NumPy has defined, I can filter on the Type column:
>>> df = obj_explore(np)
>>> df[df.Type == 'float']
              Type  Callable
Attribute
Inf          float     False
Infinity     float     False
NAN          float     False
NINF         float     False
NZERO        float     False
NaN          float     False
PINF         float     False
PZERO        float     False
e            float     False
euler_gamma  float     False
inf          float     False
infty        float     False
nan          float     False
pi           float     False
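The same kind of filtering can be tried without Pandas or NumPy installed. What follows is a rough standard-library sketch, not the function from this article: explore() is a hypothetical variant of obj_explore() that returns plain tuples, and the math module stands in for NumPy.

```python
import math

def explore(obj, dunders=False):
    """Return (name, type name, callable) tuples for the attributes of obj."""
    rows = []
    for name in dir(obj):
        # Skip dunder names unless the caller asked for them.
        if not dunders and name.startswith('__') and name.endswith('__'):
            continue
        attr = getattr(obj, name)
        rows.append((name, type(attr).__name__, callable(attr)))
    return rows

# Keep only the float constants, mirroring df[df.Type == 'float'].
float_names = [name for name, type_name, _ in explore(math)
               if type_name == 'float']
print(float_names)
```

A list of tuples is less convenient to slice than a DataFrame, but a list comprehension is often enough for a quick look at a small module.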
Conclusion
While the obj_explore() function might not fit your needs, it is a reminder that Python provides a lot of top-level tools that we can use to explore Python objects. Many of them were designed to be used interactively. However, there is no reason why you can’t take advantage of your Python programming skills to facilitate and partially automate your exploration of the language and its many third-party packages.
About the Author
Eric Olsen holds a Ph.D. in history from the University of Pennsylvania, a M.S. in software engineering from Pennsylvania State University, and a B.A. in computer science from Utah State University. Eric spent three decades working in software development in a variety of fields, including atmospheric physics research, remote sensing and GIS, retail, and banking. In each of these fields, Eric focused on building software systems to automate and standardize the many repetitive, time-consuming, and unstable processes that he encountered.