Before you can experience the art of gray hat Python programming, you must work through the least exciting portion of this book: setting up your development environment. A solid development environment is essential, because it lets you spend your time absorbing the interesting information in this book rather than stumbling around trying to get your code to execute.
This chapter quickly covers installing Python 2.5, configuring your Eclipse development environment, and the basics of writing C-compatible code with Python. Once you have set up the environment and understand the basics, the world is your oyster; this book will show you how to crack it open.
1.1 Operating System Requirements
I assume that you are using a 32-bit Windows-based platform to do most of your coding. Windows has the widest array of tools and lends itself well to Python development. All of the chapters in this book are Windows-specific, and most examples will work only with a Windows operating system.
However, there are some examples that you can run from a Linux distribution. For Linux development, I recommend you download a 32-bit Linux distro as a VMware appliance. VMware’s appliance player is free, and it enables you to quickly move files from your development machine to your virtualized Linux machine. If you have an extra machine lying around, feel free to install a complete distribution on it. For the purpose of this book, use a Red Hat–based distribution like Fedora Core 7 or CentOS 5. Alternatively, you can run Linux and emulate Windows. It’s really up to you.
1.3.1 The Hacker’s Best Friend: ctypes
The Python module ctypes is by far one of the most powerful libraries available to the Python developer. The ctypes library enables you to call functions in dynamically linked libraries and has extensive capabilities for creating complex C datatypes and utility functions for low-level memory manipulation. It is essential that you understand the basics of how to use the ctypes library, as you will be relying on it heavily throughout the book.
1.3.2 Using Dynamic Libraries
The first step in utilizing ctypes is to understand how to resolve and access functions in a dynamically linked library. A dynamically linked library is a compiled binary that is linked at runtime to the main process executable. On Windows platforms these binaries are called dynamic link libraries (DLL), and on Linux they are called shared objects (SO). In both cases, these binaries expose functions through exported names, which get resolved to actual addresses in memory. Normally at runtime you have to resolve the function addresses in order to call the functions; however, with ctypes all of the dirty work is already done.
There are three different ways to load dynamic libraries in ctypes: cdll(), windll(), and oledll(). The difference among all three is in the way the functions inside those libraries are called and their resulting return values. The cdll() method is used for loading libraries that export functions using the standard cdecl calling convention. The windll() method loads libraries that export functions using the stdcall calling convention, which is the native convention of the Microsoft Win32 API. The oledll() method operates exactly like the windll() method; however, it assumes that the exported functions return a Windows HRESULT error code, which is used specifically for error messages returned from Microsoft Component Object Model (COM) functions.
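As a minimal sketch of how these loaders appear in practice, the snippet below lists which of the three the running platform provides and loads the C runtime through cdll. It assumes a current Python 3 interpreter rather than Python 2.5. The helper name load_c_runtime is hypothetical, and ctypes.util.find_library is used on non-Windows systems only to look up the C library by name.

```python
import ctypes
import ctypes.util
import os


def load_c_runtime():
    """Load the platform's C runtime with the cdecl loader (hypothetical helper)."""
    if os.name == "nt":
        # Attribute access on cdll loads msvcrt.dll using the cdecl convention.
        return ctypes.cdll.msvcrt
    # On Linux, find_library("c") typically resolves to libc.so.6.
    return ctypes.cdll.LoadLibrary(ctypes.util.find_library("c"))


# cdll exists on every platform; windll and oledll exist only on Windows.
available_loaders = [name for name in ("cdll", "windll", "oledll")
                     if hasattr(ctypes, name)]

c_runtime = load_c_runtime()
print(available_loaders)
print(c_runtime.abs(-5))  # abs() is exported by the C runtime on both platforms
```

On a Linux machine only cdll should appear in the list, which is one reason the Windows-specific examples cannot simply be copied across.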
For a quick example you will resolve the printf() function from the C runtime on both Windows and Linux and use it to output a test message. On Windows the C runtime is msvcrt.dll, located in C:\WINDOWS\system32\, and on Linux it is libc.so.6, which is located in /lib/ by default. Create a chapter1-printf.py script, either in Eclipse or in your normal Python working directory, and enter the following code.
chapter1-printf.py Code on Windows
from ctypes import *
msvcrt = cdll.msvcrt
message_string = "Hello world!\n"
msvcrt.printf("Testing: %s", message_string)
The following is the output of this script:
C:\Python25> python chapter1-printf.py
Testing: Hello world!
C:\Python25>
On Linux, this example will be slightly different but will net the same results. Switch to your Linux install, and create chapter1-printf.py inside your /root/ directory.
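The Linux listing does not appear at this point in the text, so the following is a minimal sketch rather than the original script. It assumes the C library can be loaded under the name libc.so.6, and it is written for a current Python 3 interpreter, where C strings must be passed as bytes (the b prefix). Under Python 2.5, plain string literals would be used, as in the Windows listing.

```python
from ctypes import CDLL

# Load the C runtime by its shared-object name (assumed to be libc.so.6).
libc = CDLL("libc.so.6")

message_string = b"Hello world!\n"

# printf() returns the number of characters it wrote.
chars_written = libc.printf(b"Testing: %s", message_string)
```

The only structural difference from the Windows version is how the library is located: by passing its file name to CDLL() instead of using attribute access on cdll.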
UNDERSTANDING CALLING CONVENTIONS
A calling convention describes how to properly call a particular function. This includes the order in which function parameters are passed, which parameters are pushed onto the stack or passed in registers, and how the stack is unwound when a function returns. You need to understand two calling conventions: cdecl and stdcall. In the cdecl convention, parameters are pushed from right to left, and the caller of the function is responsible for clearing the arguments from the stack. It’s used by most C systems on the x86 architecture.
Following is an example of a cdecl function call:
In C
int python_rocks(reason_one, reason_two, reason_three);
In x86 Assembly
push reason_three
push reason_two
push reason_one
call python_rocks
add esp, 12
You can clearly see how the arguments are passed. The last line increments the stack pointer by 12 bytes (the function takes three parameters, and each stack parameter is 4 bytes, for a total of 12 bytes), which essentially clears those parameters from the stack.
An example of the stdcall convention, which is used by the Win32 API, is shown here:
In C
int my_socks(color_one, color_two, color_three);
In x86 Assembly
push color_three
push color_two
push color_one
call my_socks
In this case you can see that the order of the parameters is the same, but the stack clearing is not done by the caller; rather the my_socks function is responsible for cleaning up before it returns.
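The difference in stack cleanup can be sketched with a toy model. The code below is not real x86 emulation; it only tracks a simulated stack pointer, assuming 4-byte stack slots and an arbitrary starting value of 0x1000. The names simulate_call and SLOT_SIZE are hypothetical and exist only for illustration.

```python
SLOT_SIZE = 4  # each pushed argument occupies 4 bytes on 32-bit x86


def simulate_call(convention, arg_count, esp=0x1000):
    """Return (esp right after the callee returns, esp after caller cleanup)."""
    # The caller pushes the arguments right to left; the stack grows downward.
    esp -= arg_count * SLOT_SIZE
    # 'call' pushes the return address; the callee's 'ret' pops it again.
    esp -= SLOT_SIZE
    esp += SLOT_SIZE
    if convention == "stdcall":
        # The callee removes its own arguments (for example, with 'ret 12').
        esp += arg_count * SLOT_SIZE
    esp_after_return = esp
    if convention == "cdecl":
        # The caller removes the arguments (for example, with 'add esp, 12').
        esp += arg_count * SLOT_SIZE
    return esp_after_return, esp


cdecl_result = simulate_call("cdecl", 3)
stdcall_result = simulate_call("stdcall", 3)
print(cdecl_result, stdcall_result)
```

In this model both conventions end with the stack pointer back at its starting value; they differ only in whether it is already restored at the moment the callee returns.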
For both conventions it’s important to note that return values are stored in the EAX register.
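From Python, the value a function leaves in the return register surfaces as the result of the ctypes call. By default ctypes treats that value as a C int; setting restype (and argtypes) on the function declares other types. The following is a minimal sketch, assuming a current Python 3 interpreter and using ctypes.util.find_library to locate the C library on non-Windows systems.

```python
import ctypes
import ctypes.util
import os

# Locate the C runtime: msvcrt on Windows, otherwise whatever find_library reports.
if os.name == "nt":
    c_runtime = ctypes.cdll.msvcrt
else:
    c_runtime = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the signature of strlen(): size_t strlen(const char *).
c_runtime.strlen.argtypes = [ctypes.c_char_p]
c_runtime.strlen.restype = ctypes.c_size_t

length = c_runtime.strlen(b"gray hat")
print(length)
```

Declaring restype matters most when a function returns something other than a plain integer, such as a pointer or a floating-point value, because the default int interpretation would otherwise misread it.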