PHP Manual [Zend API: Hacking the Core of PHP]


PHP Manual || Zend Engine 1

Introduction

Those who know don't talk.

Those who talk don't know.

Sometimes, PHP "as is" simply isn't enough. Although these cases are rare for the average user, professional applications will soon push PHP to the edge of its capabilities, in terms of either speed or functionality. New functionality cannot always be implemented natively, both because of language restrictions and because of the inconvenience of carrying around a huge library of default code appended to every single script. Another method is needed to overcome these shortcomings in PHP.

As soon as this point is reached, it's time to touch the heart of PHP and take a look at its core, the C code that makes PHP go.

Warning

This information is rather outdated; parts of it cover only early stages of the ZendEngine 1.0 API as it was used in early versions of PHP 4.

More recent information may be found in the various README files that come with the PHP source and in the Internals section on the Zend website.

Overview

"Extending PHP" is easier said than done. PHP has evolved into a full-fledged tool consisting of a few megabytes of source code, and hacking a system like this requires learning and considering quite a few things. When structuring this chapter, we finally decided on the "learn by doing" approach. It is not the most scientific or professional approach, but it is the most fun and gives the best end results. In the following sections, you'll quickly learn how to get the most basic extensions to work almost instantly. After that, you'll learn about Zend's advanced API functionality. The alternative would have been to impart the functionality, design, tips, tricks, etc. as a whole, all at once, giving a complete view of the big picture before doing anything practical. Although that is the "better" method, as no dirty hacks have to be made, it can be very frustrating as well as energy- and time-consuming, which is why we've decided on the direct approach.

Note that even though this chapter tries to impart as much knowledge as possible about the inner workings of PHP, it's impossible to really give a complete guide to extending PHP that works 100% of the time in all cases. PHP is such a huge and complex package that its inner workings can only be understood if you make yourself familiar with it by practicing, so we encourage you to work with the source.

What Is Zend? and What Is PHP?

The name Zend refers to the language engine, PHP's core. The term PHP refers to the complete system as it appears from the outside. This might sound a bit confusing at first, but it's not that complicated (see below). To implement a Web script interpreter, you need three parts:

  1. The interpreter part analyzes the input code, translates it, and executes it.

  2. The functionality part implements the functionality of the language (its functions, etc.).

  3. The interface part talks to the Web server, etc.

Zend takes part 1 completely and a bit of part 2; PHP takes parts 2 and 3. Together they form the complete PHP package. Zend itself really forms only the language core, implementing PHP at its very basics with some predefined functions. PHP contains all the modules that actually create the language's outstanding capabilities.

Figure: The internal structure of PHP.

The following sections discuss where PHP can be extended and how it's done.

Extension Possibilities

As shown above, PHP can be extended primarily at three points: external modules, built-in modules, and the Zend engine. The following sections discuss these options.

External Modules

External modules can be loaded at script runtime using the function dl(). This function loads a shared object from disk and makes its functionality available to the script to which it's being bound. After the script is terminated, the external module is discarded from memory. This method has both advantages and disadvantages, as described in the following table:

Advantages | Disadvantages
External modules don't require recompiling of PHP. | The shared objects need to be loaded every time a script is being executed (every hit), which is very slow.
The size of PHP remains small by "outsourcing" certain functionality. | External additional files clutter up the disk.
 | Every script that wants to use an external module's functionality has to specifically include a call to dl(), or the extension tag in php.ini needs to be modified (which is not always a suitable solution).

To sum up, external modules are great for third-party products, small additions to PHP that are rarely used, or just for testing purposes. To develop additional functionality quickly, external modules provide the best results. For frequent usage, larger implementations, and complex code, the disadvantages outweigh the advantages.

Third parties might consider using the extension tag in php.ini to create additional external modules to PHP. These external modules are completely detached from the main package, which is a very handy feature in commercial environments. Commercial distributors can simply ship disks or archives containing only their additional modules, without the need to create fixed and solid PHP binaries that don't allow other modules to be bound to them.

Built-in Modules

Built-in modules are compiled directly into PHP and carried around with every PHP process; their functionality is instantly available to every script that's being run. Like external modules, built-in modules have advantages and disadvantages, as described in the following table:

Advantages | Disadvantages
No need to load the module specifically; the functionality is instantly available. | Changes to built-in modules require recompiling of PHP.
No external files clutter up the disk; everything resides in the PHP binary. | The PHP binary grows and consumes more memory.

Built-in modules are best when you have a solid library of functions that remains relatively unchanged, requires better than poor-to-average performance, or is used frequently by many scripts on your site. The need to recompile PHP is quickly compensated by the benefit in speed and ease of use. However, built-in modules are not ideal when rapid development of small additions is required.

The Zend Engine

Of course, extensions can also be implemented directly in the Zend engine. This strategy is good if you need a change in the language behavior or require special functions to be built directly into the language core. In general, however, modifications to the Zend engine should be avoided. Changes here result in incompatibilities with the rest of the world, and hardly anyone will ever adapt to specially patched Zend engines. Modifications can't be detached from the main PHP sources and are overwritten by the next update from the "official" source repositories. Therefore, this method is generally considered bad practice and, due to its rarity, is not covered in this book.

Source Layout

Note:

Prior to working through the rest of this chapter, you should retrieve clean, unmodified source trees of your favorite Web server. We're working with Apache (available at http://httpd.apache.org/) and, of course, with PHP (available at https://www.php.net/ - does it need to be said?).

Make sure that you can compile a working PHP environment by yourself! We won't go into this issue here, however, as you should already have this most basic ability when studying this chapter.

Before we start discussing code issues, you should familiarize yourself with the source tree to be able to quickly navigate through PHP's files. This is a must-have ability to implement and debug extensions.

The following table describes the contents of the major directories.

Directory | Contents
php-src | Main PHP source files and main header files; here you'll find all of PHP's API definitions, macros, etc. (important). Everything else is below this directory.
php-src/ext | Repository for dynamic and built-in modules; by default, these are the "official" PHP modules that have been integrated into the main source tree. From PHP 4.0, it's possible to compile these standard extensions as dynamically loadable modules (at least, those that support it).
php-src/main | This directory contains the main PHP macros and definitions (important).
php-src/pear | Directory for the PHP Extension and Application Repository. This directory contains core PEAR files.
php-src/sapi | Contains the code for the different server abstraction layers.
TSRM | Location of the "Thread Safe Resource Manager" (TSRM) for Zend and PHP.
ZendEngine2 | Location of the Zend Engine files; here you'll find all of Zend's API definitions, macros, etc. (important).

Discussing all the files included in the PHP package is beyond the scope of this chapter. However, you should take a close look at the following files:

  • php-src/main/php.h, located in the main PHP directory. This file contains most of PHP's macro and API definitions.

  • php-src/Zend/zend.h, located in the main Zend directory. This file contains most of Zend's macros and definitions.

  • php-src/Zend/zend_API.h, also located in the Zend directory, which defines Zend's API.

You should also follow some sub-inclusions from these files; for example, the ones relating to the Zend executor, the PHP initialization file support, and such. After reading these files, take the time to navigate around the package a little to see the interdependencies of all files and modules - how they relate to each other and especially how they make use of each other. This also helps you to adapt to the coding style in which PHP is authored. To extend PHP, you should quickly adapt to this style.

Extension Conventions

Zend is built using certain conventions; to avoid breaking its standards, you should follow the rules described in the following sections.

Macros

For almost every important task, Zend ships predefined macros that are extremely handy. The tables and figures in the following sections describe most of the basic functions, structures, and macros. The macro definitions can be found mainly in zend.h and zend_API.h. We suggest that you take a close look at these files after having studied this chapter. (Although you can go ahead and read them now, not everything will make sense to you yet.)

Memory Management

Resource management is a crucial issue, especially in server software. One of the most valuable resources is memory, and memory management should be handled with extreme care. Memory management has been partially abstracted in Zend, and you should stick to this abstraction for obvious reasons: Due to the abstraction, Zend gets full control over all memory allocations. Zend is able to determine whether a block is in use, automatically freeing unused blocks and blocks with lost references, and thus prevent memory leaks. The functions to be used are described in the following table:

Function | Description
emalloc() | Serves as replacement for malloc().
efree() | Serves as replacement for free().
estrdup() | Serves as replacement for strdup().
estrndup() | Serves as replacement for strndup(). Faster than estrdup() and binary-safe. This is the recommended function to use if you know the string length prior to duplicating it.
ecalloc() | Serves as replacement for calloc().
erealloc() | Serves as replacement for realloc().

emalloc(), estrdup(), estrndup(), ecalloc(), and erealloc() allocate internal memory; efree() frees these previously allocated blocks. Memory handled by the e*() functions is considered local to the current process and is discarded as soon as the script executed by this process is terminated.

Warning

To allocate resident memory that survives termination of the current script, you can use malloc() and free(). This should only be done with extreme care, however, and only in conjunction with demands of the Zend API; otherwise, you risk memory leaks.

Zend also features a thread-safe resource manager to provide better native support for multithreaded Web servers. This requires you to allocate local structures for all of your global variables to allow concurrent threads to be run. Because the thread-safe mode of Zend was not finished back when this was written, it is not yet extensively covered here.
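
The idea behind the e*() functions can be illustrated with a small stand-alone sketch. This is not Zend's implementation; the names request_pool, pool_alloc(), pool_free(), pool_strndup(), and pool_shutdown() are hypothetical and exist only for this example. It assumes that the allocator keeps every block on a list, so that anything still allocated when the script ends can be released in one sweep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header placed in front of every block so the pool can find it again. */
typedef struct pool_block {
    struct pool_block *prev, *next;
} pool_block;

/* Hypothetical per-request pool: a circular list of all live blocks. */
typedef struct {
    pool_block head;     /* sentinel node */
    size_t live_blocks;  /* number of blocks not yet freed */
} request_pool;

void pool_init(request_pool *p) {
    p->head.prev = p->head.next = &p->head;
    p->live_blocks = 0;
}

/* Stand-in for emalloc(): allocates and registers the block with the pool. */
void *pool_alloc(request_pool *p, size_t size) {
    pool_block *b = malloc(sizeof(pool_block) + size);
    if (b == NULL) return NULL;
    b->next = p->head.next;
    b->prev = &p->head;
    p->head.next->prev = b;
    p->head.next = b;
    p->live_blocks++;
    return b + 1;  /* user memory starts right after the header */
}

/* Stand-in for efree(): unlinks the block, then releases it. */
void pool_free(request_pool *p, void *ptr) {
    if (ptr == NULL) return;
    pool_block *b = (pool_block *)ptr - 1;
    b->prev->next = b->next;
    b->next->prev = b->prev;
    p->live_blocks--;
    free(b);
}

/* Stand-in for estrndup(): copies exactly len bytes, so embedded NUL bytes
 * survive (binary-safe) and no strlen() pass is needed. */
char *pool_strndup(request_pool *p, const char *s, size_t len) {
    assert(s != NULL);
    char *copy = pool_alloc(p, len + 1);
    if (copy == NULL) return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}

/* Called when the script terminates: frees every block that is still
 * registered and returns how many had been left behind. */
size_t pool_shutdown(request_pool *p) {
    size_t leaked = 0;
    while (p->head.next != &p->head) {
        pool_free(p, p->head.next + 1);
        leaked++;
    }
    return leaked;
}
```

Because every block is registered, a forgotten pool_free() costs memory only until pool_shutdown() runs. Memory obtained directly from malloc() is invisible to such a pool, which is why it survives the script and why it can leak.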

Directory and File Functions

The following directory and file functions should be used in Zend modules. They behave exactly like their C counterparts, but provide virtual working directory support on the thread level.

Zend Function | Regular C Function
V_GETCWD() | getcwd()
V_FOPEN() | fopen()
V_OPEN() | open()
V_CHDIR() | chdir()
V_GETWD() | getwd()
V_CHDIR_FILE() | Takes a file path as an argument and changes the current working directory to that file's directory.
V_STAT() | stat()
V_LSTAT() | lstat()
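
To see why a virtual working directory matters, consider the following stand-alone sketch. It is not the code behind the V_*() macros; vcwd_state, vcwd_chdir(), vcwd_resolve(), and vcwd_fopen() are hypothetical names, and POSIX-style paths are assumed. Each thread keeps its own idea of the current directory and resolves relative paths against it, instead of calling chdir(), which would change the directory for every thread in the process.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VCWD_MAX 1024

/* Hypothetical per-thread state holding that thread's working directory. */
typedef struct {
    char cwd[VCWD_MAX];
} vcwd_state;

/* Counterpart to chdir(): only updates the per-thread state.
 * Returns 0 on success, -1 if dir is not absolute or is too long. */
int vcwd_chdir(vcwd_state *st, const char *dir) {
    assert(st != NULL && dir != NULL);
    size_t len = strlen(dir);
    if (dir[0] != '/' || len >= VCWD_MAX) return -1;
    memcpy(st->cwd, dir, len + 1);
    return 0;
}

/* Turns path into an absolute path, using the virtual cwd for relative
 * paths. Returns 0 on success, -1 if the result does not fit into out. */
int vcwd_resolve(const vcwd_state *st, const char *path,
                 char *out, size_t out_size) {
    int n;
    if (path[0] == '/')
        n = snprintf(out, out_size, "%s", path);
    else
        n = snprintf(out, out_size, "%s/%s", st->cwd, path);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Counterpart to fopen(): resolves the path first, then opens the file. */
FILE *vcwd_fopen(const vcwd_state *st, const char *path, const char *mode) {
    char full[VCWD_MAX * 2];
    if (vcwd_resolve(st, path, full, sizeof full) != 0) return NULL;
    return fopen(full, mode);
}
```

Two threads serving different sites can each hold their own vcwd_state, so the same relative path "inc/header.php" resolves to a different file in each thread without either one disturbing the other.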

String Handling

Strings are handled a bit differently by the Zend engine than other values such as integers, Booleans, etc., which don't require additional memory allocation for storing their values. If you want to return a string from a function, introduce a new string variable to the symbol table, or do something similar, you have to make sure that the memory the string will be occupying has previously been allocated, using the aforementioned e*() functions for allocation. (This might not make much sense to you yet; just keep it somewhere in your head for now - we'll get back to it shortly.)
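
The difference between simple values and strings can be sketched in plain C. The value type and its helper functions below are hypothetical and only loosely modeled on the idea of a typed value container; malloc() and free() stand in for the e*() functions so that the example compiles on its own. An integer is stored inside the container itself, while a string must be copied into allocated memory so that it outlives the buffer it came from.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { VAL_LONG, VAL_STRING } value_type;

/* Hypothetical value container: a type tag plus a union of payloads. */
typedef struct {
    value_type type;
    union {
        long lval;                               /* stored inline */
        struct { char *ptr; size_t len; } str;   /* points to heap memory */
    } u;
} value;

/* Integers need no allocation: the number lives inside the container. */
void value_set_long(value *v, long n) {
    v->type = VAL_LONG;
    v->u.lval = n;
}

/* Strings are copied into freshly allocated memory that the value owns.
 * Returns 0 on success, -1 if the allocation fails. */
int value_set_string(value *v, const char *s, size_t len) {
    assert(v != NULL && s != NULL);
    char *copy = malloc(len + 1);   /* stand-in for emalloc() */
    if (copy == NULL) return -1;
    memcpy(copy, s, len);
    copy[len] = '\0';
    v->type = VAL_STRING;
    v->u.str.ptr = copy;
    v->u.str.len = len;
    return 0;
}

/* Releases the string memory, if any. */
void value_destroy(value *v) {
    if (v->type == VAL_STRING) {
        free(v->u.str.ptr);         /* stand-in for efree() */
        v->u.str.ptr = NULL;
        v->u.str.len = 0;
    }
}

/* Builds a string in a stack buffer and hands back an allocated copy.
 * Storing a pointer to buf directly would leave a dangling pointer. */
int make_greeting(value *out, const char *name) {
    char buf[64];  /* gone as soon as this function returns */
    int n = snprintf(buf, sizeof buf, "Hello, %s", name);
    if (n < 0 || (size_t)n >= sizeof buf) return -1;
    return value_set_string(out, buf, (size_t)n);
}
```

The important line is the copy in value_set_string(): whoever later destroys the value can free the string unconditionally, because the value is known to own its memory.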

Complex Types

Complex types such as arrays and objects require different treatment. Zend features a single API for these types - they're stored using hash tables.

Note:

To reduce complexity in the following source examples, we're only working with simple types such as integers at first. A discussion about creating more advanced types follows later in this chapter.
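
For readers who have not worked with hash tables before, here is a minimal stand-alone sketch of the data structure. It is not Zend's hash table API; table, table_set(), table_get(), and table_destroy() are hypothetical names, the bucket count is fixed, and the hash function is the well-known djb2 string hash, chosen only for brevity. Values are plain long integers to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 8

/* One key/value pair; entries in the same bucket form a linked list. */
typedef struct entry {
    char *key;
    long value;
    struct entry *next;
} entry;

typedef struct {
    entry *buckets[BUCKETS];
    size_t count;
} table;

/* djb2 string hash (an assumption for this sketch). */
static unsigned long hash_key(const char *key) {
    unsigned long h = 5381;
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h;
}

void table_init(table *t) {
    for (size_t i = 0; i < BUCKETS; i++) t->buckets[i] = NULL;
    t->count = 0;
}

/* Inserts or overwrites key. Returns 0 on success, -1 on allocation failure. */
int table_set(table *t, const char *key, long value) {
    assert(t != NULL && key != NULL);
    entry **slot = &t->buckets[hash_key(key) % BUCKETS];
    for (entry *e = *slot; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->value = value;       /* existing key: overwrite */
            return 0;
        }
    }
    entry *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    size_t n = strlen(key) + 1;
    e->key = malloc(n);
    if (e->key == NULL) { free(e); return -1; }
    memcpy(e->key, key, n);         /* the table owns its copy of the key */
    e->value = value;
    e->next = *slot;
    *slot = e;
    t->count++;
    return 0;
}

/* Returns a pointer to the stored value, or NULL if key is absent. */
const long *table_get(const table *t, const char *key) {
    const entry *e = t->buckets[hash_key(key) % BUCKETS];
    for (; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) return &e->value;
    return NULL;
}

void table_destroy(table *t) {
    for (size_t i = 0; i < BUCKETS; i++) {
        entry *e = t->buckets[i];
        while (e != NULL) {
            entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
    t->count = 0;
}
```

In this sketch, a list-like array is simply a table whose keys happen to be "0", "1", "2", and so on, which hints at why one structure can back several script-level types.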

PHP's Automatic Build System

PHP 4 features an automatic build system that's very flexible. All modules reside in a subdirectory of the ext directory. In addition to its own sources, each module has a config.m4 file for extension configuration (for background on M4, see http://www.gnu.org/software/m4/).

All these stub files are generated automatically, along with .cvsignore, by a little shell script named ext_skel that resides in the ext directory. As its argument, it takes the name of the module that you want to create. The shell script then creates a directory of the same name, along with the appropriate stub files.

Step by step, the process looks like this:

:~/cvs/php4/ext:> ./ext_skel --extname=my_module
Creating directory my_module
Creating basic files: config.m4 .cvsignore my_module.c php_my_module.h CREDITS EXPERIMENTAL tests/001.phpt my_module.php [done].

To use your new extension, you will have to execute the following steps:

1.  $ cd ..
2.  $ vi ext/my_module/config.m4
3.  $ ./buildconf
4.  $ ./configure --[with|enable]-my_module
5.  $ make
6.  $ ./php -f ext/my_module/my_module.php
7.  $ vi ext/my_module/my_module.c
8.  $ make

Repeat steps 3-6 until you are satisfied with ext/my_module/config.m4 and
step 6 confirms that your module is compiled into PHP. Then, start writing
code and repeat the last two steps as often as necessary.

This instruction creates the aforementioned files. To include the new module in the automatic configuration and build process, you have to run buildconf, which regenerates the configure script by searching through the ext directory and including all found config.m4 files.

The default config.m4, shown in Example #1, is a bit more complex:

Example #1 The default config.m4.

dnl config.m4 for extension my_module

dnl Comments in this file start with the string 'dnl'.
dnl Remove where necessary. This file will not work
dnl without editing.

dnl If your extension references something external, use with:

dnl PHP_ARG_WITH(my_module, for my_module support,
dnl Make sure that the comment is aligned:
dnl [  --with-my_module             Include my_module support])

dnl Otherwise use enable:

dnl PHP_ARG_ENABLE(my_module, whether to enable my_module support,
dnl Make sure that the comment is aligned:
dnl [  --enable-my_module           Enable my_module support])

if test "$PHP_MY_MODULE" != "no"; then
  dnl Write more examples of tests here...

  dnl # --with-my_module -> check with-path
  dnl SEARCH_PATH="/usr/local /usr"     # you might want to change this
  dnl SEARCH_FOR="/include/my_module.h"  # you most likely want to change this
  dnl if test -r $PHP_MY_MODULE/$SEARCH_FOR; then # path given as parameter
  dnl   MY_MODULE_DIR=$PHP_MY_MODULE
  dnl else # search default path list
  dnl   AC_MSG_CHECKING([for my_module files in default path])
  dnl   for i in $SEARCH_PATH ; do
  dnl     if test -r $i/$SEARCH_FOR; then
  dnl       MY_MODULE_DIR=$i
  dnl       AC_MSG_RESULT(found in $i)
  dnl     fi
  dnl   done
  dnl fi
  dnl
  dnl if test -z "$MY_MODULE_DIR"; then
  dnl   AC_MSG_RESULT([not found])
  dnl   AC_MSG_ERROR([Please reinstall the my_module distribution])
  dnl fi

  dnl # --with-my_module -> add include path
  dnl PHP_ADD_INCLUDE($MY_MODULE_DIR/include)

  dnl # --with-my_module -> check for lib and symbol presence
  dnl LIBNAME=my_module # you may want to change this
  dnl LIBSYMBOL=my_module # you most likely want to change this 

  dnl PHP_CHECK_LIBRARY($LIBNAME,$LIBSYMBOL,
  dnl [
  dnl   PHP_ADD_LIBRARY_WITH_PATH($LIBNAME, $MY_MODULE_DIR/lib, MY_MODULE_SHARED_LIBADD)
  dnl   AC_DEFINE(HAVE_MY_MODULELIB,1,[ ])
  dnl ],[
  dnl   AC_MSG_ERROR([wrong my_module lib version or lib not found])
  dnl ],[
  dnl   -L$MY_MODULE_DIR/lib -lm -ldl
  dnl ])
  dnl
  dnl PHP_SUBST(MY_MODULE_SHARED_LIBADD)

  PHP_NEW_EXTENSION(my_module, my_module.c, $ext_shared)
fi

If you're unfamiliar with M4 files (now is certainly a good time to get familiar), this might be a bit confusing at first; but it's actually quite easy.

Note: Everything prefixed with dnl is treated as a comment and is not parsed.

The config.m4 file is responsible for parsing the command-line options passed to configure at configuration time. This means that it has to check for required external files and do similar configuration and setup tasks.

The default file creates two configuration directives in the configure script: --with-my_module and --enable-my_module. Use the first option when referring to external files (such as the --with-apache directive that refers to the Apache directory). Use the second option when the user simply has to decide whether to enable your extension. Whichever option you choose, uncomment its PHP_ARG_WITH or PHP_ARG_ENABLE block and remove the other, unnecessary one; that is, if you're using --enable-my_module, you should remove support for --with-my_module, and vice versa.

By default, the config.m4 file created by ext_skel accepts both directives and automatically enables your extension. Enabling the extension is done by using the PHP_NEW_EXTENSION macro. To change the default behavior so that your module is included in the PHP binary only when the user asks for it (by explicitly specifying --enable-my_module or --with-my_module), change the test for $PHP_MY_MODULE to = "yes":

if test "$PHP_MY_MODULE" = "yes"; then
  dnl Action...
  PHP_NEW_EXTENSION(my_module, my_module.c, $ext_shared)
fi
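
The results of these configuration checks eventually reach your C code as preprocessor symbols. As a rough sketch, assume the commented-out library check in the default config.m4 has been enabled, so that AC_DEFINE(HAVE_MY_MODULELIB, 1, ...) places the symbol in a generated configuration header when the library is found. The function my_module_backend() is hypothetical, and the symbol is left undefined here so that the example compiles on its own.

```c
/* In a real build this symbol would come from the configuration header
 * generated by configure; uncomment the line to simulate a successful check. */
/* #define HAVE_MY_MODULELIB 1 */

/* Hypothetical helper reporting which code path was compiled in. */
const char *my_module_backend(void) {
#ifdef HAVE_MY_MODULELIB
    /* Code in this branch may call into the external library. */
    return "external library";
#else
    /* Without the library, only self-contained code may be used. */
    return "built-in fallback";
#endif
}
```

Guarding library-specific code this way keeps the module compilable on systems where the configure check did not find the library.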

