Your first C API extension module¶
This tutorial will take you through creating a simple Python extension module written in C or C++.
It assumes basic knowledge about Python: you should be able to define functions in Python code before starting to write them in C. See The Python Tutorial for an introduction to Python itself.
The tutorial should be useful for anyone who can write a basic C library.
While we will mention several concepts that a C beginner would not be expected
to know, like static functions or linkage declarations, understanding these
is not necessary for success.
We will focus on giving you a “feel” of what Python’s C API is like. This tutorial will not teach you important concepts, like error handling and reference counting, which are covered in later chapters.
As you write the code, you will need to compile it. Prepare to spend some time choosing, installing and configuring a build tool, since CPython itself does not include one.
We will assume that you use a Unix-like system (including macOS and Linux), or Windows. On other systems, you might need to adjust some details – for example, a system command name.
Note
This tutorial uses API that was added in CPython 3.15. To create an extension that’s compatible with earlier versions of CPython, please follow an earlier version of this documentation.
This tutorial uses some syntax added in C11 and C++20. If your extension needs to be compatible with earlier standards, please follow tutorials in documentation for Python 3.14 or below.
What we’ll do¶
Let’s create an extension module called spam [1],
which will include a Python interface to the C
standard library function system().
This function is defined in stdlib.h.
It takes a C string as argument, runs the argument as a system
command, and returns a result value as an integer.
A manual page for system might summarize it this way:
#include <stdlib.h>
int system(const char *command);
Note that like many functions in the C standard library,
this function is already exposed in Python.
In production, use os.system() or subprocess.run()
rather than the module you’ll write here.
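Before wrapping system() for Python, it can help to see it used from plain C. The following is a minimal sketch, not part of the extension module: the helper names command_processor_available and run_command are made up for illustration. It relies on the standard behavior that system(NULL) reports whether a command processor (shell) exists at all.

```c
#include <stdlib.h>

/* Hypothetical helper: returns nonzero when the host has a command
 * processor (a shell). Passing a null pointer to system() is the
 * standard way to ask. */
static int
command_processor_available(void)
{
    return system(NULL) != 0;
}

/* Hypothetical helper: runs `command` through the command processor
 * and returns the raw, platform-specific status reported by system(). */
static int
run_command(const char *command)
{
    return system(command);
}
```

Calling run_command("whoami") from a C program would print the user name and return the status value, which is the behavior the rest of this tutorial exposes to Python.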
We want this function to be callable from Python as follows:
>>> import spam
>>> status = spam.system("whoami")
User Name
>>> status
0
Note
The system command whoami prints out your username.
It’s useful in tutorials like this one because it has the same name on
both Unix and Windows.
Warming up your build tool¶
Begin by creating a file named spammodule.c. [2]
Now, while the file is empty, we’ll compile it.
Choose a build tool such as Setuptools or Meson, and follow its instructions
to compile and install the empty spammodule.c as a C extension module.
This will ensure that your build tool works, so that you can make and test incremental changes as you follow the rest of the text.
Note
Workaround for missing PyInit
If your build tool output complains about missing PyInit_spam,
add the following function to your module for now:
// A workaround
void *PyInit_spam(void) { return NULL; }
This is a shim for an old-style initialization function, which was required in extension modules for CPython 3.14 and below. Current CPython will not call it, but some build tools may still assume that all extension modules need to define it.
If you use this workaround, you will get the exception
SystemError: initialization of spam failed without raising an exception
instead of an ImportError in the next step.
Note
Using a third-party build tool is heavily recommended, as it will take care of various details of your platform and Python installation, of naming the resulting extension, and, later, of distributing your work.
If you don’t want to use a tool, you can try to run your compiler directly.
The following command should work for many flavors of Linux, and generate
a spam.so file that you need to put in a directory
in sys.path:
gcc --shared spammodule.c -o spam.so
When your extension is compiled and installed, start Python and try to import your extension. This should fail with the following exception:
>>> import spam
Traceback (most recent call last):
...
ImportError: dynamic module does not define module export function (PyModExport_spam or PyInit_spam)
Including the Header¶
Now, add the first line to your file: include Python.h to pull in
all declarations of the Python C API:
#include <Python.h>
Next, include the header for the system() function:
#include <stdlib.h>
Be sure to put this, and any other standard library includes, after
Python.h.
On some systems, Python may define some pre-processor definitions
that affect the standard headers.
Tip
The <stdlib.h> include is technically not necessary.
Python.h includes several standard header files
for its own use and for backwards compatibility,
and stdlib.h is one of them.
However, it is good practice to explicitly include what you need.
With the includes in place, compile and import the extension again. You should get the same exception as with the empty file.
Note
Third-party build tools should handle pointing the compiler to the CPython headers and libraries, and setting appropriate options.
If you are running the compiler directly, you will need to do this yourself.
If your installation of Python comes with a corresponding python-config
command, you can run something like:
gcc --shared $(python-config --cflags --ldflags) spammodule.c -o spam.so
Module export hook¶
The exception you got when you tried to import the module told you that Python is looking for a “module export function”, also known as a module export hook. Let’s define one.
First, add a prototype below the #include lines:
PyMODEXPORT_FUNC PyModExport_spam(void);
Tip
The prototype is not strictly necessary, but some modern compilers emit warnings without it. It’s generally better to add the prototype than to disable the warning.
The PyMODEXPORT_FUNC macro declares the function’s
return type, and adds any special linkage declarations needed
to make the function visible and usable when CPython loads it.
After the prototype, add the function itself.
For now, make it return NULL:
PyMODEXPORT_FUNC
PyModExport_spam(void)
{
    return NULL;
}
Compile and load the module again. You should get a different error this time.
>>> import spam
Traceback (most recent call last):
...
SystemError: module export hook for module 'spam' failed without setting an exception
Simply returning NULL is not correct behavior for an export hook,
and CPython complains about it.
That’s good – it means that CPython found the function!
Let’s now make it do something useful.
The slot table¶
Rather than NULL, the export hook should return the information needed to
create a module.
Let’s start with the basics: the name and docstring.
The information should be defined in a static array of
PyModuleDef_Slot entries, which are essentially key-value pairs.
Define this array just before your export hook:
static PyModuleDef_Slot spam_slots[] = {
    {Py_mod_name, "spam"},
    {Py_mod_doc, "A wonderful module with an example function"},
    {0, NULL}
};
For both name and docstring, the values are C strings – that is, NUL-terminated UTF-8 encoded byte arrays.
Note the zero-filled sentinel entry at the end. If you forget it, you’ll trigger undefined behavior.
The array is defined as static – not visible outside this .c file.
This will be a common theme.
CPython only needs to access the export hook; all global variables
and all other functions should generally be static, so that they don’t
clash with other extensions.
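To see why the sentinel matters, here is a minimal, Python-free sketch. It uses a hypothetical demo_slot structure that only imitates the "ID plus value" shape of a slot entry; it is not the real PyModuleDef_Slot, and count_slots is an illustrative consumer, not something CPython provides. Because C arrays do not carry their length, code that receives only a pointer has to walk the entries until it finds the zero-filled one.

```c
#include <stddef.h>

/* Hypothetical stand-in for a slot entry: an integer ID plus a value. */
typedef struct {
    int id;
    const void *value;
} demo_slot;

/* `static`: this array is visible only inside this .c file. */
static const demo_slot demo_slots[] = {
    {1, "spam"},
    {2, "A wonderful module with an example function"},
    {0, NULL}   /* zero-filled sentinel marks the end */
};

/* Counts entries by walking the array until the sentinel.
 * Without the sentinel, the loop would read past the end of the array. */
static size_t
count_slots(const demo_slot *slots)
{
    size_t count = 0;
    while (slots[count].id != 0) {
        count++;
    }
    return count;
}
```

Here count_slots(demo_slots) stops at the third entry and reports two real entries.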
Return this array from your export hook instead of NULL:
PyMODEXPORT_FUNC
PyModExport_spam(void)
{
    return spam_slots;
}
Now, recompile and try it out:
>>> import spam
>>> print(spam)
<module 'spam' from '/home/encukou/dev/cpython/spam.so'>
You have an extension module!
Try help(spam) to see the docstring.
The next step will be adding a function.
Exposing a function¶
To expose the system C function directly to Python,
we’ll need to write a layer of glue code to convert arguments from Python
objects to C values, and the C return value back to Python.
One of the simplest kinds of glue code is a “METH_O” function,
which takes two Python objects and returns one.
All Python objects – regardless of the Python type – are represented in C
as pointers to the PyObject structure.
Add such a function above the slots array:
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
    Py_RETURN_NONE;
}
For now, we’ll ignore the arguments, and use the Py_RETURN_NONE
macro to properly return a Python None object.
Recompile your extension to make sure you don’t have syntax errors.
We haven’t yet added spam_system to the module, so you might get a
warning that spam_system is unused.
Method definitions¶
To expose the C function to Python, you will need to provide several pieces of
information in a structure called
PyMethodDef [3]:
- ml_name: the name of the Python function;
- ml_doc: a docstring;
- ml_meth: the C function to be called; and
- ml_flags: a set of flags describing details like how Python arguments are passed to the C function. We’ll use METH_O here – the flag that matches our spam_system function’s signature.
Because modules typically create several functions, these definitions
need to be collected in an array, with a zero-filled sentinel at the end.
Add this array just below the spam_system function:
static PyMethodDef spam_methods[] = {
    {
        .ml_name="system",
        .ml_meth=spam_system,
        .ml_flags=METH_O,
        .ml_doc="Execute a shell command.",
    },
    {NULL, NULL, 0, NULL}  /* Sentinel */
};
As with module slots, a zero-filled sentinel marks the end of the array.
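The method table combines three plain C features: designated initializers, function pointers, and a sentinel entry. The sketch below shows the same layout without any Python headers. The demo_method structure, demo_double, and find_method are all hypothetical names invented for this illustration; how CPython itself consumes a PyMethodDef array is not shown here.

```c
#include <stddef.h>
#include <string.h>

typedef int (*demo_func)(int);

/* Hypothetical stand-in for a method definition. */
typedef struct {
    const char *name;
    demo_func func;
    int flags;
    const char *doc;
} demo_method;

static int
demo_double(int x)
{
    return 2 * x;
}

static const demo_method demo_methods[] = {
    {
        /* Designated initializers may appear in any order;
         * members that are not named (here: flags) are zeroed. */
        .doc = "Double a number.",
        .name = "double",
        .func = demo_double,
    },
    {NULL, NULL, 0, NULL}  /* Sentinel */
};

/* Looks up a function by name, stopping at the sentinel.
 * Returns NULL when no entry matches. */
static demo_func
find_method(const demo_method *table, const char *name)
{
    for (const demo_method *m = table; m->name != NULL; m++) {
        if (strcmp(m->name, name) == 0) {
            return m->func;
        }
    }
    return NULL;
}
```

Naming each member makes the table readable even when members are listed in a different order than in the structure definition, which is presumably why the tutorial's table uses the same style.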
Next, we’ll add the method to the module.
Add a Py_mod_methods slot, pointing to your PyMethodDef array, to the slots array:
static PyModuleDef_Slot spam_slots[] = {
    {Py_mod_name, "spam"},
    {Py_mod_doc, "A wonderful module with an example function"},
    {Py_mod_methods, spam_methods},
    {0, NULL}
};
Recompile your extension again, and test it.
You should now be able to call the function, and get None back:
>>> import spam
>>> print(spam.system)
<built-in function system>
>>> print(spam.system('whoami'))
None
Returning an integer¶
Now, let’s take a look at the return value.
Instead of None, we’ll want spam.system to return a number – that is,
a Python int object.
Eventually this will be the exit code of a system command,
but let’s start with a fixed value, say, 3.
The Python C API provides a function to create a Python int object
from a C integer value: PyLong_FromLong(). [4]
To call it, replace Py_RETURN_NONE with the following three lines of the function body:
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
    int status = 3;
    PyObject *result = PyLong_FromLong(status);
    return result;
}
Recompile and run again, and check that the function now returns 3:
>>> import spam
>>> spam.system('whoami')
3
Accepting a string¶
Finally, let’s handle the function argument.
Our C function, spam_system(), takes two arguments.
The first one, PyObject *self, will be set to the spam module
object.
This isn’t useful in our case, so we’ll ignore it.
The other one, PyObject *arg, will be set to the object that the user
called the Python function with.
We expect that it should be a Python string.
In order to use the information in it, we will need
to convert it to a C value — in this case, a C string (const char *).
There’s a slight type mismatch here: Python’s str objects store
Unicode text, but C strings are arrays of bytes.
So, we’ll need to encode the data, and we’ll use the UTF-8 encoding for it.
(UTF-8 might not always be correct for system commands, but it’s what
str.encode() uses by default,
and the C API has special support for it.)
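The difference between text and bytes is easy to observe in plain C. The sketch below is an illustration only: utf8_code_point_count and demo_text are hypothetical names, and the counting function assumes its input is already valid UTF-8. It shows that a string with four characters can occupy five bytes once encoded.

```c
#include <stddef.h>
#include <string.h>

/* Counts Unicode code points in a NUL-terminated UTF-8 string.
 * Assumes the input is valid UTF-8: every byte of the form 10xxxxxx
 * is a continuation byte, and any other byte starts a new code point. */
static size_t
utf8_code_point_count(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}

/* "caf" followed by U+00E9 (e with acute accent),
 * which UTF-8 encodes as the two bytes C3 A9. */
static const char demo_text[] = "caf\xC3\xA9";
```

For demo_text, strlen() counts five bytes while utf8_code_point_count() counts four code points, so C code that receives the encoded buffer must not assume one byte per character.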
The function to encode a Python string into a UTF-8 buffer is named
PyUnicode_AsUTF8() [5].
Call it like this:
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
    const char *command = PyUnicode_AsUTF8(arg);
    int status = 3;
    PyObject *result = PyLong_FromLong(status);
    return result;
}
If PyUnicode_AsUTF8() is successful, command will point to the
resulting array of bytes.
This buffer is managed by the arg object, which means we don’t need to free
it, but we must follow some rules:
- We should only use the buffer inside the spam_system function. When spam_system returns, arg and the buffer it manages might be garbage-collected.
- We must not modify it. This is why we use const.
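Our function never needs the buffer after it returns, so it does not copy anything. If other code did need the bytes for longer, or needed to modify them, one general C approach is to make a private copy. The sketch below is an assumption-laden illustration, not part of the tutorial's module: copy_borrowed_string is a hypothetical helper, and it uses plain malloc() rather than any Python-specific allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a private, heap-allocated copy of a
 * borrowed C string, or NULL if memory allocation fails.
 * The caller owns the copy and must release it with free(). */
static char *
copy_borrowed_string(const char *borrowed)
{
    size_t size = strlen(borrowed) + 1;   /* include the NUL terminator */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, borrowed, size);
    return copy;
}
```

The copy outlives the original buffer and may be modified freely, at the cost of an allocation that the caller has to remember to free.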
If PyUnicode_AsUTF8() was not successful, it returns a NULL
pointer.
When calling any Python C API, we always need to handle such error cases.
The way to do this in general is left for later chapters of this documentation.
For now, be assured that we are already handling errors from
PyLong_FromLong() correctly.
For the PyUnicode_AsUTF8() call, the correct way to handle errors is
returning NULL from spam_system.
Add an if block for this:
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
    const char *command = PyUnicode_AsUTF8(arg);
    if (command == NULL) {
        return NULL;
    }
    int status = 3;
    PyObject *result = PyLong_FromLong(status);
    return result;
}
That’s it for the setup.
Now, all that is left is calling the C library function system with
the char * buffer, and using its result instead of the 3:
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
    const char *command = PyUnicode_AsUTF8(arg);
    if (command == NULL) {
        return NULL;
    }
    int status = system(command);
    PyObject *result = PyLong_FromLong(status);
    return result;
}
Compile your module, and test:
>>> import spam
>>> result = spam.system('whoami')
User Name
>>> result
0
You might also want to test error cases:
>>> import spam
>>> result = spam.system('nonexistent-command')
sh: line 1: nonexistent-command: command not found
>>> result
32512
>>> spam.system(3)
Traceback (most recent call last):
...
TypeError: bad argument type for built-in operation
>>> print(spam.system('too', 'many', 'arguments'))
Traceback (most recent call last):
...
TypeError: spam.system() takes exactly one argument (3 given)
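The value 32512 in the session above is not the shell's exit code itself. On POSIX systems, system() returns a "wait status" that packs the exit code together with other information; 32512 is 127 × 256, and 127 is the code a POSIX shell conventionally reports when it cannot find a command. The sketch below assumes a POSIX system (it will not compile as-is on Windows, where the return value has a different meaning), and exit_code_from_status is a hypothetical helper name.

```c
#include <sys/wait.h>   /* POSIX only: WIFEXITED, WEXITSTATUS */

/* Hypothetical helper: returns the command's exit code if it exited
 * normally, or -1 if it did not (for example, if a signal killed it). */
static int
exit_code_from_status(int status)
{
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}
```

The module in this tutorial deliberately returns the raw value from system() unchanged; decoding it like this would be a design choice for a more polished wrapper.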
The result¶
Congratulations! You have written a complete Python C API extension module, and completed this tutorial!
Here is the entire source file, for your convenience:
/// Includes

#include <Python.h>
#include <stdlib.h>

/// Implementation of spam.system

static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
    const char *command = PyUnicode_AsUTF8(arg);
    if (command == NULL) {
        return NULL;
    }
    int status = system(command);
    PyObject *result = PyLong_FromLong(status);
    return result;
}

/// Module method table

static PyMethodDef spam_methods[] = {
    {
        .ml_name="system",
        .ml_meth=spam_system,
        .ml_flags=METH_O,
        .ml_doc="Execute a shell command.",
    },
    {NULL, NULL, 0, NULL}  /* Sentinel */
};

/// Module slot table

static PyModuleDef_Slot spam_slots[] = {
    {Py_mod_name, "spam"},
    {Py_mod_doc, "A wonderful module with an example function"},
    {Py_mod_methods, spam_methods},
    {0, NULL}
};

/// Export hook prototype

PyMODEXPORT_FUNC PyModExport_spam(void);

/// Module export hook

PyMODEXPORT_FUNC
PyModExport_spam(void)
{
    return spam_slots;
}
Footnotes