Pragmatism in the real world

A View Stream with Zend_View

One of my biggest issues with using PHP as the templating engine in View scripts is that the easiest way to echo a variable is the least secure.

Consider:

<?= $this->var ?>

Perfectly legal, dead easy to understand, but doesn’t escape $var which is what you want more often than not. To resolve this you need something like:

<?= $this->escape($this->var) ?>

But who remembers to do that?!

I don’t and I have short-open-tags turned off too!

So, I decided to leverage a post by Mike Naberezny from a while ago about streams. The idea is all his; I just modified it to work with Zend Framework’s Zend_View the way I wanted it to.

This is what I want to happen:

My Code:

<?= @$var; ?>

is translated to

<?php echo $this->escape($this->var); ?>

As you can see, this significantly cuts down the amount of typing that we need to do and also makes view templates much easier to read!

PHP stream wrappers are a mechanism that allows us to write our own protocol handlers for files. In our case, we want to intercept the view script file and alter the code within short tags to escape the variable for us. We use the short tag so that if we decide that we do not want a variable to be escaped we can use:

<?php echo $this->var;?> 

and it will work as expected.

Let’s look at the code!

File: App/View/Stream.php:
Firstly we need a stream file that does the hard work. Most of this file is by Mike and Paul so I’ve left their attributions at the top of the file.

<?php
/**
 * Stream wrapper to convert markup of mostly-PHP templates into PHP prior to include().
 * 
 * Based in large part on the example at
 * http://www.php.net/manual/en/function.stream-wrapper-register.php
 * 
 * @author Mike Naberezny (@link http://mikenaberezny.com)
 * @author Paul M. Jones  (@link http://paul-m-jones.com)
 * @author Rob Allen (@link https://akrabat.com)
 * 
 */
class App_View_Stream {
    
    /**
     * Current stream position.
     *
     * @var int
     */
    private $pos = 0;

    /**
     * Data for streaming.
     *
     * @var string
     */
    private $data;

    /**
     * Stream stats.
     *
     * @var array
     */
    private $stat;

    
    /**
     * Opens the script file and converts markup.
     */
    public function stream_open($path, $mode, $options, &$opened_path) {
        
        // get the view script source
        $path = str_replace('view://', "", $path);
        $this->data = file_get_contents($path);
        
        /**
         * If reading the file failed, update our local stat store
         * to reflect the real stat of the file, then return on failure
         */
        if ($this->data===false) {
            $this->stat = stat($path);
            return false;
        }

        /**
         * Look for short open tag and change:
         * 		< ?= $test ?>
         * to
         * 		< ?php echo $this->escape{$test); ?>
         * 
         */ 
	        $find = '/< ?=[ ]*([^;>]*?|[^;?]*?)[; ]*?>/';
	        $replace = "< ?php echo $this->escape($1); ?>";
	        $this->data = preg_replace($find, $replace, $this->data);
        
        /**
         * Convert @$ to $this->
         * 
         * We could make a better effort at only finding @$ between < ?php ?>
         * but that's probably not necessary as @$ doesn't occur much in the wild
         * and there's a significant performance gain by using str_replace().
         */
        $this->data = str_replace('@$', '$this->', $this->data);
        
       /**
         * Convert @ to $this->
         * 
         */
        $this->data = str_replace('@', '$this->', $this->data);
        
        /**
         * file_get_contents() won't update PHP's stat cache, so performing
         * another stat() on it will hit the filesystem again.  Since the file
         * has been successfully read, avoid this and just fake the stat
         * so include() is happy.
         */
        $this->stat = array('mode' => 0100777,
                            'size' => strlen($this->data));

        return true;
    }

    
    /**
     * Reads from the stream.
     */
    public function stream_read($count) {
        $ret = substr($this->data, $this->pos, $count);
        $this->pos += strlen($ret);
        return $ret;
    }

    
    /**
     * Tells the current position in the stream.
     */
    public function stream_tell() {
        return $this->pos;
    }

    
    /**
     * Tells if we are at the end of the stream.
     */
    public function stream_eof() {
        return $this->pos >= strlen($this->data);
    }

    
    /**
     * Stream statistics.
     */
    public function stream_stat() {
        return $this->stat;
    }

    
    /**
     * Seek to a specific point in the stream.
     */
    public function stream_seek($offset, $whence) {
        switch ($whence) {
            case SEEK_SET:
                if ($offset < strlen($this->data) && $offset >= 0) {
                $this->pos = $offset;
                    return true;
                } else {
                    return false;
                }
                break;

            case SEEK_CUR:
                if ($offset >= 0) {
                    $this->pos += $offset;
                    return true;
                } else {
                    return false;
                }
                break;

            case SEEK_END:
                if (strlen($this->data) + $offset >= 0) {
                    $this->pos = strlen($this->data) + $offset;
                    return true;
                } else {
                    return false;
                }
                break;

            default:
                return false;
        }
    }
}

All the real work goes on in stream_open(). The rest of the file is housekeeping to make it all work. (Thanks Mike and Paul!)

The stream_open() function is called when our file is included and so we grab the contents using file_get_contents() and then fix the code using a preg_replace() and a couple of str_replace()s and that’s it:

These are interesting bits from the code above:

Look for short open tag and change:

            $find = '/< ?=[ ]*([^;>]*?|[^;?]*?)[; ]*?>/';
            $replace = "< ?php echo $this->escape($1); ?>";
            $this->data = preg_replace($find, $replace, $this->data);

This is a complicated regexp, but passes all my use-cases in my real-world code. There is one case that it doesn’t work on:

<= $var ? $var : '-unknown-'; ?>

So if anyone could suggest a better regexp, please let me know! I suspect the problem is related to looking for the semi-colon before the ?>, but I’m not a regexp expert. It may just be easier to mandate that you can’t use a ; in the short form, but I’m a creature of habit…

The other two conversions are trivial:

Convert @$ to $this->:

$this->data = str_replace('@$', '$this->', $this->data);

and convert @ to $this->:

$this->data = str_replace('@', '$this->', $this->data);

These are very simple, but the order is important.

All we need to do now is to integrate into the Zend Framework’s View system. This turns out to be very simple as all we need is our own App_View class:

<?php
/**
 * @copyright  Copyright (c) 2007 Rob Allen (https://akrabat.com)
 * @license    http://framework.zend.com/license/new-bsd     New BSD License
 */

require_once 'Zend/View/Abstract.php';
require_once 'App/View/Stream.php';

class App_View extends Zend_View_Abstract
{
	public function __construct($config = array()) {
	    // register our view stream to do automatic escaping of the <?= construct 
		if (!in_array('view', stream_get_wrappers())) {
			stream_wrapper_register('view', 'App_View_Stream');
		}
		parent::__construct($config);
	}    

    /**
     * Includes the view script in a scope with only public $this variables.
     *
     * @param string The view script to execute.
     */
    protected function _run()
    {

        include 'view://' . func_get_arg(0);
    }
}

We use the constructor to register our new stream using the prefix “view” and then we write our own _run() function that doesn’t do a lot other than prepend “view://” to the name of the view script to be included.

The “view://” causes our stream wrapper class to be called which does the necessary replacements and then the file is processed by the PHP engine. At this point the short tags are gone and replaced with <?php and the variables are escaped properly.

Finally, we need to tell the view renderer about our new view class. I do this in a Front Controller plugin’s dispatchLoopStartup():

class App_Controller_Plugin_Initialisation extends Zend_Controller_Plugin_Abstract
{
    public function dispatchLoopStartup(Zend_Controller_Request_Abstract $request)
    { 
        $viewRenderer = Zend_Controller_Action_HelperBroker::getStaticHelper('viewRenderer');
        $viewRenderer->setView(new App_View());

}

The view renderer now use our new App_View class and streaming goodness is ours!

Final thoughts:

What I like most about this implementation is that the easiest solution to echo a variable results in “safer” outputs. Obviously, if you don’t want a given variable escaped, you can use the long form and the stream won’t touch it:
<?php echo $var2; ?>

Be aware that this isn’t a panacea and you definitely need to be careful if you aren’t using your view script in an HTML context. It certainly makes for much less code in the average view script though!

26 thoughts on “A View Stream with Zend_View

  1. I tried to come up with a regex that matches all of your examples and also allows quoted strings. It's a bit more complicated, but you could give it a try:

    <?php

    $test = 'This is little – test" ?> ';
    echo preg_replace('%)|"(\"|[^"]*)"|'(\'[^']*)')*.)?>%', ", $test);

    ?>

    Also you should use preg_replace_callback() and put the two str_replace() statments in a callback (= only replace between ).

    I myself use Smarty with a prefilter to auto-escape variables, because the default modifiers do too much – and break the templates (i.e. escaping arrays in a foreach). The advantage is the replacing is only once, the disadvantage is you have to use Smarty (some people don't like it).

  2. Ben,

    It depends on what you are comparing to I suppose.

    Compared to straight Smarty, probably not. Compared to straight PHP, it's certainly more overhead.

    That's what Zend_Cache is for though :)

    Rob…

  3. Anders,

    Not sure. However there is one bug in the code above:

    $path = str_replace('view://', ", $path);

    should be:

    $path = str_replace('view://', '', $path);

    Regards,

    Rob…

  4. Why not use a custom View that escapes everything by default and add an unescape() method for special occasions?

  5. Rob,

    Even after correcting some typos I couldn't get your regex to work, so I came up with my own simple solution:

    /<?=(.*?);s*?>/is
    

    That one did the job even for something like <?= $var ? $var : '-unknown-'; ?>

    The next thing I asked myself was why would anybody want to use PHP short open tags and @$var when a Smarty-like {$var} could be used instead. So I ended up refactoring the App_View_Stream class to look like that:

    class App_View_Stream {
    private $pos = 0;
    private $data;
    private $stat;

    var $startTag = '<\?='; // '\{' change settings
    var $endTag = ';\s*\?>'; // '\}' for "Smarty" solution
    var $varIndicator = '@$'; // '$' ... or something totally different
    var $pattern;

    public function __construct()
    {
    $this->pattern = '/' . $this->startTag . '(.*?)' . $this->endTag . '/is';
    }

    public function stream_open($path, $mode, $options, &$opened_path)
    {
    $path = str_replace('view://', "", $path);
    $this->data = file_get_contents($path);

    if ($this->data===false) {
    $this->stat = stat($path);
    return false;
    }

    $this->data = preg_replace_callback($this->pattern, array($this, 'replaceVar'), $this->data);

    $this->stat = array('mode' => 0100777,
    'size' => strlen($this->data));

    return true;
    }

    private function replaceVar($pregResult) {
    return '<?php echo $this->escape(' . str_replace($this->varIndicator, '$this->', $pregResult[1]) . '); ?>';
    }

    ...
    }

    Thanks for coming up with that topic. I am now going to have a look at the latest MEAP version of your ZE book I downloaded today …

    Regards,
    Thomas

  6. Kyle idea would work like that:


    _data[$key] = $val;
    return;
    }

    require_once 'Zend/View/Exception.php';
    throw new Zend_View_Exception('Setting private or protected class members is not allowed', $this);
    }

    public function __get($key)
    {
    if (array_key_exists($key, $this->_data)) {
    return $this->escape($this->_data);
    }

    if ($this->_strictVars) {
    trigger_error('Key "' . $key . '" does not exist', E_USER_NOTICE);
    }

    return null;
    }

    public function unescape($key)
    {
    if (array_key_exists($key, $this->_data)) {
    return $this->_data;
    }

    if ($this->_strictVars) {
    trigger_error('Key "' . $key . '" does not exist', E_USER_NOTICE);
    }

    return null;
    }

    public function __isset($key)
    {
    if ('_' != substr($key, 0, 1)) {
    return isset($this->$key) || isset($_data[$key]);
    }

    return false;
    }

    public function __unset($key)
    {
    if ('_' != substr($key, 0, 1) && isset($this->$key)) {

    if (isset($this->$key)) {
    unset($this->$key);
    } else {
    if (array_key_exists($key, $this->_data)) {
    unset($this->_data[$key]);
    }
    }
    }
    }
    }

  7. I lost a part of it…

    class My_View extends Zend_View
    {
    private $_data = array();

    public function __set($key, $val)
    {
    if ('_' != substr($key, 0, 1)) {
    $this->_data[$key] = $val;
    return;
    }

    require_once 'Zend/View/Exception.php';
    throw new Zend_View_Exception('Setting private or protected class members is not allowed', $this);
    }

    public function __get($key)
    {
    if (array_key_exists($key, $this->_data)) {
    return $this->escape($this->_data);
    }

    if ($this->_strictVars) {
    trigger_error('Key "' . $key . '" does not exist', E_USER_NOTICE);
    }

    return null;
    }

    public function unescape($key)
    {
    if (array_key_exists($key, $this->_data)) {
    return $this->_data;
    }

    if ($this->_strictVars) {
    trigger_error('Key "' . $key . '" does not exist', E_USER_NOTICE);
    }

    return null;
    }

  8. There are some errors in your code, I blame WordPress for them, which eats backslashes.

    Here is the updated part, replace ### with a backsplash:

    $find = '/]*?|[^;?]*?)[; ]*###?>/';
    $replace = "escape($1); ?>";

  9. Argh… So WordPress also ate my post. So here's my next try:

    Find the lines starting with "$find" and "$replace". In the first line, there has to be a backslash between the opening angle bracket and the first question mark and one between the asterisk and the question mark at the end.

    In the second line, a backslash is needed in front of the second dollar sign (that's the one followed by "this").

  10. @rob: I don't think it can works because inside a loop same variables shouldn't be changed into an instance property (key and value of the array element).

  11. Nevermind, I had to test the solution to understand the details.

    ps: make it to work I had to use the following regex:

    $find = '//';
    $replace = "escape($1); ?>";
    $this->data = preg_replace($find, $replace, $this->data);

    The regex for a viriable name is from http://www.php.net/language.variables

  12. Try again…
    $find = '/<?=s*($[a-zA-Z_x7f-xff][a-zA-Z0-9_x7f-xff]*)s*;?s*?>/';
    $replace = "<?php echo $this->escape($1); ?>";
    $this->data = preg_replace($find, $replace, $this->data);

  13. We also need to escape result of functions:

    $find = '/<?=s*([^;]*);?s*?>/';
    $replace = "<?php echo $this->escape($1); ?>";
    $this->data = preg_replace($find, $replace, $this->data);

  14. Finally, about the ternary operator:

    <?= @$var ? @$var : '-unknown-'; ?>
    <?= $var ? $var : '-unknown-'; ?>

    become:

    <?php echo $this->escape($this->var ? $this->var : '-unknown-'); ?>
    <?php echo $this->escape($var ? $var : '-unknown-'); ?>

    Even if it is not really what you write, isn't it valid php?

  15. Nice thought but personally not a huge fan of this approach, feels like it's using regex where a parser is more appropriate… particularly as you're trying to solve a security issue, where arcane, missed details are often where holes are exploited…

    P.S. I didn't look much at the regex but I'm curious, how would you want to handle something like this?:

    < ?= @var; @var2 ?>

  16. Joe,

    I expect it would break :) But then I wouldn't be impressed if I saw that code in a view template anyway!

    Rob…

  17. I grant you that's a bad example, heh heh, what I meant is, <?= is the same as <?php echo, which can go multi-line legitimately.

    If you're trying to improve security then a robust solution would be preferred, which would require a parser instead of regex…

    But I suppose this'll do the trick if you make sure you keep it to a one liner.

    For your readers, I'll link to a cool thread (that you participated in) about how escaping is being planned for #ZF 2.0:
    http://framework.zend.com/wiki/display/ZFDEV2/Zend_View+2.0

  18. Over at the PiKe project we build a custom stream wrapper that automatically escapes all view variables, with a MINIMAL performance hit! You can still get the RAW value with:

    Notice the "~" character. Checkout http://code.google.com/p/php-pike/wiki/Pike_View_Stream

    I know you said that you want to avoid "tricky ways like output buffering and PREG replacing *.phtml files.", but I still think it's a very neat way to fix auto escaping in Zend Framework 1.

Comments are closed.