if __name__ == "__main__":
When I was new to Python, I ran into a mysterious block of code that looked something like:
def main():
    # do some stuff
    pass

if __name__ == "__main__":
    main()
Looking at the code, I could see that it ran the main() function after checking the status of the __name__ variable, but didn’t know what that variable was or how it was set. I asked a colleague (another former C and Java programmer who had recently made the leap to Python) what the “name equals main stuff” was all about. He replied that it was simply Python’s cumbersome way of using a main() function (like in a normal language) and that I should just copy the pattern. That was a singularly unsatisfying answer.
So, why should you put this in a Python program? Why does it work? And is it just Python’s cumbersome way of doing main()? Let’s start with the last question.
Python’s Entry Point is at the Beginning
If you are programming in a language like C, you typically create a source file for your program, compile it, link various external libraries, and then, if all went well, run the executable that results. When you run that executable, the system looks for a function named main() that should have been defined in your source file and starts execution there. This function is called the entry point. Without an entry point there is nowhere for execution to start; in C, a program with no main() normally fails at the link step.
This is not the way it works in Python. When you program in Python, you create a source file and then tell Python to run it. When this happens, Python converts your source to bytecode (essentially compiling it for you) and then starts running from the beginning of the bytecode. In effect, Python starts running your source code much the same way a person reads a document: starting at the beginning. No main() function is required. You can choose to define one, but it is just another, plain vanilla function. It is no different as far as Python is concerned than start(), calc_speed_of_light(), or something_else(). To run a function within your code, you, the programmer, have to call it.
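To see this for yourself, here is a minimal sketch. The `calls` list and the function bodies are my own illustration, not part of the original block of code. It records what actually runs, in order, and shows that a function named main() is never called unless I call it:

```python
calls = []  # records the order in which things actually run


def main():
    # Despite its name, Python never calls this function for us.
    calls.append("main")


def start():
    calls.append("start")


# Top-level statements run in source order, as soon as Python reaches them.
calls.append("top-level statement")
start()

print(calls)  # ['top-level statement', 'start']
```

Defining main() only created a function object; nothing ran it.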
So, that original block of code could just as well have been:
def start():
    # do some stuff
    pass

if __name__ == "__main__":
    start()
By convention, the name main() is the one typically used as the first function to run since a lot of programmers tend to look for that name. However, it has no special semantics in Python.
What is __name__?
In contrast, the "__main__" string that is being matched in the if statement does have special semantics. Each time a Python module is imported, Python automatically assigns a string name to the dunder name (__name__) variable in that module’s namespace; normally this is the name of the module being imported as defined by its source file name (or package hierarchy name). The outermost module (the one that is there every time you run Python and requires no import) is always assigned "__main__" as its name. It is the main module that is always present. (See the Python Execution Model documentation for more details.)
Let’s check this out in code. If I start my Python interpreter and print the value of __name__, I see:
C:\> python
>>> print(__name__)
__main__
>>>
Here, I am in the outermost module provided by starting Python.
I can also create a script (test.py) that prints out __name__ in the function run():
def run():
    print(f"This module is named: {__name__}")

run()
If I run test.py, I get:
C:\> python test.py
This module is named: __main__
Again, even running Python non-interactively places my script in the outermost __main__ module. However, if I enter Python and then import my test.py module, I get:
C:\> python
>>> import test
This module is named: test
>>>
Because of the import, my code from test.py resides in the test module, not the __main__ module.
Note that as the test module was imported, the function run() was automatically executed. I did not have to call test.run() after importing to see the output. Instead, as part of the import, Python interpreted my file from top to bottom, running each command. (Note, this only happens on the first import of test; if I try importing test again during the same Python session, Python will recognize that it was already imported and silently skip the import.)
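The run-once behavior can be checked with a small, self-contained sketch. The module name `demo_once` and the temporary directory are assumptions made for this demo; `importlib.import_module()` and `importlib.reload()` are the standard library calls doing the work:

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical one-line module that announces every time it is executed.
source = 'print(f"executing {__name__}")\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_once.py").write_text(source)
    sys.path.insert(0, tmp)
    try:
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            first = importlib.import_module("demo_once")   # executes the file
            second = importlib.import_module("demo_once")  # found in sys.modules, not re-run
            importlib.reload(first)                         # explicitly executes it again
    finally:
        # Clean up so the demo leaves no trace in this session.
        sys.path.remove(tmp)
        sys.modules.pop("demo_once", None)

lines = buffer.getvalue().splitlines()
print(lines)  # ['executing demo_once', 'executing demo_once']
```

Only two lines are printed: one for the first import and one for the explicit reload. The second import returned the already-loaded module object without running anything.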
Another thing to note is that the name I import the module as does not change the name that gets assigned by Python. For example:
C:\> python
>>> import test as t
This module is named: test
>>>
The module name as far as Python goes is defined by the filename of my module (without the .py extension), not any alias I assign during the import. If I have module test.py as part of a Python package hierarchy, I will get something like “package.test” as the name.
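The same thing can be observed with modules from the standard library, so no files of my own are needed. This is just a quick sketch; the aliases `j` and `ET` are arbitrary:

```python
import json as j
import xml.etree.ElementTree as ET

# The alias is only a local label; __name__ still reports the real module name.
json_name = j.__name__
etree_name = ET.__name__  # a module inside a package reports its dotted path

print(json_name)   # json
print(etree_name)  # xml.etree.ElementTree
```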
Interesting? Perhaps. But…what exactly can I do with this technical arcana? Is it useful?
Why is __name__ useful?
The reason that __name__ is useful has to do with the fact that Python runs the code that it is importing as a module. Doing an import populates the module’s namespace with the variables, functions, and classes defined in the module. With __name__ I have a way to control what actually runs and the context in which it runs.
As we saw earlier, when I imported test.py, the import ran the run() function. In that case we got the message “This module is named: test”. That is probably not that useful, but it does help illustrate the idea. Imagine if the code in the module did something more expensive (running an extensive calculation, loading a database, clearing a file system), perhaps something that I do not want to have happen every time the module gets imported. The pattern with which we started is the way you get to control that in Python:
def run():
    print(f"This module is named: {__name__}")

if __name__ == "__main__":
    run()
By inserting that, my import now defines everything I want defined, but does not actually run any of it:
C:\> python
>>> import test
>>>
At this point, if I want to run my code, I can interactively call run():
>>> test.run()
This module is named: test
>>>
However, if I run the code directly (as a script, rather than as a module I am importing), the guard is true and run() is called:
C:\> python test.py
This module is named: __main__
That simple if statement allows me to have my code understand the context in which it is running. So, I can make a single source code file that can be run as a script (i.e., “python test.py”) and do something useful; alternatively, it can be imported (i.e., “import test”), make its functions available to the importing program and not unexpectedly run a bunch of stuff during the import.
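Both run modes can be exercised from a single Python session. The sketch below is only an illustration under a few assumptions: it writes the guarded module to a temporary file named `demo_mod.py` (a made-up name), uses the standard library's `runpy.run_path()` to stand in for "python demo_mod.py", and uses `importlib.import_module()` to stand in for the import statement:

```python
import contextlib
import importlib
import io
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical module with the same guard as test.py above.
source = '''\
def run():
    print(f"This module is named: {__name__}")

if __name__ == "__main__":
    run()
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "demo_mod.py")
    path.write_text(source)

    # Case 1: execute the file as a script, so __name__ is "__main__".
    as_script = io.StringIO()
    with contextlib.redirect_stdout(as_script):
        runpy.run_path(str(path), run_name="__main__")

    # Case 2: import it; the guard is false, so nothing runs until I call run().
    sys.path.insert(0, tmp)
    try:
        on_import = io.StringIO()
        with contextlib.redirect_stdout(on_import):
            module = importlib.import_module("demo_mod")
        after_call = io.StringIO()
        with contextlib.redirect_stdout(after_call):
            module.run()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_mod", None)

print(repr(as_script.getvalue()))   # 'This module is named: __main__\n'
print(repr(on_import.getvalue()))   # ''
print(repr(after_call.getvalue()))  # 'This module is named: demo_mod\n'
```

The same source file behaves as a script in the first case and as a quiet library in the second.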
Not Python’s Cumbersome main()
Python’s if __name__ == "__main__" pattern is extremely flexible. You can have the pattern in multiple places in your source file, wherever you need to detect how that code is running. I have seen this pattern for multiple use cases:
- wrapping imports that are only needed when run as a script
- wrapping top-level code
- wrapping debug level settings (to force more verbosity when running as a script)
- wrapping command line usage messages
The pattern provides the programmer with a standard, easy mechanism to differentiate between run modes. Am I running code as a script? Or, did I import it as a module? In either case, this is not C’s or Java’s main(). It is something much more useful and, well, Pythonic.
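As a rough sketch of several of those use cases together, consider the module below. Everything in it is hypothetical: the file name convert.py in the usage message, the `fahrenheit_to_celsius()` helper, and the choice to print the exit code rather than exit. It keeps the reusable function importable and confines the script-only import, the debug verbosity, and the command line handling to the guarded block:

```python
import logging

logger = logging.getLogger(__name__)


def fahrenheit_to_celsius(degrees_f):
    """Hypothetical helper that importers of this module would reuse."""
    logger.debug("converting %s", degrees_f)
    return (degrees_f - 32) * 5 / 9


def main(argv):
    """Script entry point: convert each argument, or print a usage message."""
    try:
        values = [float(arg) for arg in argv]
    except ValueError:
        values = []
    if not values:
        print("usage: python convert.py DEGREES_F [DEGREES_F ...]")
        return 2
    for value in values:
        print(f"{value:.1f} F = {fahrenheit_to_celsius(value):.1f} C")
    return 0


if __name__ == "__main__":
    import sys  # only needed when run as a script

    logging.basicConfig(level=logging.DEBUG)  # more verbosity in script mode
    exit_code = main(sys.argv[1:])
    # A real script would usually pass this value to sys.exit().
    print(f"exit code: {exit_code}")
```

Importing this module defines the two functions and nothing else; running it as a script turns on debug logging and processes the command line.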
Author: Eric Olsen, Director, Training Solutions, holds a Ph.D. in history from the University of Pennsylvania, a M.S. in software engineering from Pennsylvania State University, and a B.A. in computer science from Utah State University. Eric spent three decades working in software development in a variety of fields, including atmospheric physics research, remote sensing and GIS, retail, and banking. In each of these fields, Eric focused on building software systems to automate and standardize the many repetitive, time-consuming, and unstable processes that he encountered.