PHP Include Path Surprises

While diagnosing a potential, unconfirmed problem with a certain popular WordPress plugin, I did something that every good developer should do once in a while: humble yourself and read the most basic and obvious documentation for the simplest parts of a language or framework, the parts you think you know by heart.

If you aren’t a programmer, what I mean is the equivalent of reading a driver’s education book even though you’ve been driving for years or decades. And I don’t necessarily mean “what’s the yellow blinking arrow mean”, I mean basic like “how do I make the car go left?”

And when I say read, I don’t mean skim, I really mean thoroughly read. For instance, to make a car go left, you don’t “turn the wheel left” but instead, when going forward, you turn the wheel counter/anti-clockwise! The top of the wheel goes left but the bottom of the wheel goes right. Stupid and obvious, I know. But how about when going backward? Puzzle that through; I think there’s actually a two-part answer there (for how most people use reverse at least). And then, just for fun, work through the scenario with a trailer (although I don’t think they teach that in driver’s ed).

Anyhow, back to code.

I’ve always known that there have been peculiarities about PHP’s include system, and I’ve always tried to use absolute paths just to avoid the peculiarities, but I never really dove into what the specific nuances were until this morning.

The discussion below talks about include; however, it applies to require, include_once and require_once just the same.

From the documentation:

Files are included based on the file path given or, if none is given, the include_path specified. If the file isn’t found in the include_path, include will finally check in the calling script’s own directory and the current working directory before failing

That might be a “duh” moment for some people, and maybe you’ll stop reading this because you think I’m an idiot for writing about obvious stuff.

But here’s the problem: what do they mean by “file path” exactly? Going to the next paragraph in the documentation it says:

If a path is defined — whether absolute (starting with a drive letter or \ on Windows, or / on Unix/Linux systems) or relative to the current directory (starting with . or ..)

Do you get where I’m going yet?

Let’s look at a common bit of code:

require_once 'file.php';

The second quote above basically says that for the argument to this code to be considered path-based, the string must begin with either a slash (or, on Windows, a drive letter or backslash), making it absolute, or with one or two periods (./ or ../), making it relative. If neither of these two conditions is met, the current folder will not be looked at until after the include path is checked.

Read that again. Or let me say it again. If the argument to require/include doesn’t start with a slash or dot, multiple directories will be (potentially) scanned before the current directory.
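To make the rule concrete, here is a minimal sketch of the check as the documentation describes it. The function name looks_path_based is hypothetical (it is not part of PHP), and it is only an approximation: it accepts both Unix and Windows forms regardless of platform and ignores stream wrappers such as phar://.

```php
<?php
// Hypothetical helper (not part of PHP): approximates the documented rule for
// when include/require treats its argument as a path and skips include_path.
// Assumption: both Unix and Windows forms are accepted on any platform.
function looks_path_based(string $arg): bool
{
    // Absolute: leading slash (Unix) or leading backslash (Windows).
    if ($arg !== '' && ($arg[0] === '/' || $arg[0] === '\\')) {
        return true;
    }
    // Absolute on Windows: drive letter, e.g. C:\ or C:/
    if (preg_match('~^[A-Za-z]:[\\\\/]~', $arg) === 1) {
        return true;
    }
    // Explicitly relative: ./ or ../ (also .\ and ..\)
    return preg_match('~^\.{1,2}[\\\\/]~', $arg) === 1;
}

foreach (['file.php', 'lib/file.php', './file.php', '../file.php', '/srv/file.php'] as $arg) {
    echo str_pad($arg, 14) . (looks_path_based($arg) ? 'path-based' : 'include_path first') . "\n";
}
```

Note that lib/file.php is not path-based under this rule even though it contains a slash, so it is looked up through the include path as well.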

Let’s make a quick proof-of-concept demo:

<?php

//Make a subfolder if it doesn't exist
if(!is_dir('sub_folder')){mkdir('sub_folder');}

//The PHP script that we'll write to each file, just echoing the script's directory
$contents = '<?php echo __DIR__ . "\n";';

//Create two files, one "here" and one in a sub folder
file_put_contents('sub_folder/file.php', $contents);
file_put_contents('file.php', $contents);

//Unless you have this file in your own path, these two should be the same
require 'file.php';
require './file.php';

//Change the current include path
set_include_path(__DIR__ . '/sub_folder');

//Re-run the exact same code as above, the first result is now different than the second!
require 'file.php';
require './file.php';

You can save that somewhere as runner.php and then run php -f runner.php from that directory.

What do you get when you run it? Here’s what I get:

/home/demo_user
/home/demo_user
/home/demo_user/sub_folder
/home/demo_user

Did you notice that third one? That’s the call to just require 'file.php' which, because it doesn’t start with a slash or a period, searched the include path first looking for a match, not the current folder!
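If you’d rather check where a bare name would land without actually running the file, PHP’s built-in stream_resolve_include_path() can be used as a preview. The sketch below is an illustration only: the temporary directory and its name are assumptions made for the demo, not part of the original example.

```php
<?php
// Build a throwaway directory so the sketch does not touch real files.
$base = sys_get_temp_dir() . '/include_demo_' . uniqid();
mkdir($base . '/sub_folder', 0777, true);
file_put_contents($base . '/sub_folder/file.php', '<?php echo __DIR__ . "\n";');

// With only the sub folder on the include path, a bare name resolves there.
$previous = set_include_path($base . '/sub_folder');
$resolved = stream_resolve_include_path('file.php');
set_include_path($previous); // put the include path back the way it was

// Prints the full path of the file that require 'file.php' would have picked.
echo $resolved === false ? "not found\n" : $resolved . "\n";
```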

Now, if you’re running your own private code base on your own server, this might not be a big deal. The performance cost is probably negligible, too, to the point that I don’t think you’d ever notice.

But if you’re creating code for other people to consume, you cannot guarantee that they don’t have an include path set, and you also cannot guarantee that they don’t have a file matching your exact structure in that path. They probably don’t, or at least hopefully don’t.

But the fix is actually easy. Just always use absolute or relative paths.

If you want a file in the current directory, always prefix its name with ./. (However, scroll to the additional problem to see why this breaks, too!)

Or, better yet, in whatever you think should be your “entrance file(s)”, define a constant that holds the value of __DIR__ for that file and use that with concatenation for your files.

//In your entrance file
define('MY_APP_ROOT', __DIR__);

//In your other files
require_once MY_APP_ROOT . '/path/to/my/file.php';

That’s essentially what WordPress does with its ABSPATH constant.

Even better than all of this is to avoid require and include in general and let Composer’s autoloader handle things on your behalf, but that’s not always an option.

A thought from a security perspective

As I write this, I’m thinking about a pretty interesting exploit that would be hard-ish to debug. First, you’d need to get code onto someone’s server and second, you’d need to get it into the execution path. We’ll wave our hands and pretend that’s done somehow. Once that happens, that code could set the include path to point at its own directory and shadow legitimate code just by naming its files the same as files that get included without a true path. This exploit code could even then re-require the legitimate code, effectively making it a trojan that no one notices (depending on what the payload does).

Once again, they’d need access to start with, but once in, this would be almost impossible to notice if the payload is quiet enough, like logging passwords or something.
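One rough way to look for this kind of shadowing is to list every include path directory that contains a given bare file name. This is only a sketch: include_path_hits is a hypothetical helper name, and it inspects nothing beyond the directories currently returned by get_include_path().

```php
<?php
// Hypothetical helper: lists every include_path directory that contains $name.
// An unexpected hit means a bare require of that name could be shadowed.
function include_path_hits(string $name): array
{
    $hits = [];
    foreach (explode(PATH_SEPARATOR, get_include_path()) as $dir) {
        if ($dir === '') {
            continue; // skip empty entries
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate)) {
            $hits[] = $candidate;
        }
    }
    return $hits;
}

// Show every place a bare require 'file.php' could currently come from.
print_r(include_path_hits('file.php'));
```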

An additional problem

I’m going to take the initial code sample I gave and tweak it a little bit. This time, we’re going to create an extra sub folder that has the same file.php with the exact same contents as the others. Also in that folder, we’re going to create one additional file whose only job is to run our main original code via require.

(Also, I’m using absolute paths for creating things because, with the jumping around for demo purposes, things get weird otherwise. The actual spirit of the original demo is still here, just more specific.)

<?php

//Save this file as runner.php

//We're going to use two sub folders
$FOLDER_ALPHA  = __DIR__ . '/sub_folder';
$FOLDER_RUNNER = __DIR__ . '/sub_runner';

//Make them if they don't exist
if(!is_dir($FOLDER_ALPHA)){mkdir($FOLDER_ALPHA);}
if(!is_dir($FOLDER_RUNNER)){mkdir($FOLDER_RUNNER);}

//The PHP script that we'll write to each file, just echo'ing the script's directory
$contents = '<?php echo __DIR__ . "\n";';

//Create three files, one "here" and one in each sub folder
file_put_contents("{$FOLDER_ALPHA}/file.php",    $contents);
file_put_contents("{$FOLDER_RUNNER}/file.php",   $contents);
file_put_contents(__DIR__ . '/file.php',         $contents);

//A second runner inside sub_runner whose only job is to require this file
$second_runner_contents = '<?php require dirname(__DIR__) . "/runner.php";';
file_put_contents("{$FOLDER_RUNNER}/runner.php", $second_runner_contents);

//Unless you have this file in your own path, these two should be the same
require 'file.php';
require './file.php';

//Change the current include path
set_include_path($FOLDER_ALPHA);

//Re-run the exact same code as above, the first result is now different than the second!
require 'file.php';
require './file.php';

Now, from a command line in the folder where you saved the above, run php -f runner.php and you’ll get something like:

/home/demo_user
/home/demo_user
/home/demo_user/sub_folder
/home/demo_user

No surprises, the previous code talked about this. From that same folder now run php -f sub_runner/runner.php and you’ll get:

/home/demo_user
/home/demo_user
/home/demo_user/sub_folder
/home/demo_user

Once again, still the same.

Lastly, we’re going to cd into our new inner folder and run that runner:

cd sub_runner/
php -f runner.php

Which now gives:

/home/demo_user/sub_runner
/home/demo_user/sub_runner
/home/demo_user/sub_folder
/home/demo_user/sub_runner

Wow, that’s new. We’re invoking almost the exact same code, just in a slightly different way. This time, we’re tripping a different rule from the include documentation (the same passage as above, excerpted below):

include will finally check in the calling script’s own directory and the current working directory before failing

I’m not actually sure the documentation is correct here, or, more specifically, I can’t get “calling script’s own directory” to be any different than “current working directory”. If someone can get me an actual working difference I’d love to see that.
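One scenario that appears to separate the two: a file that was itself loaded by absolute path, from a working directory that doesn’t contain the file it asks for. The sketch below is an assumption-laden illustration (the lib and elsewhere folder names are made up for the demo), and it relies on my understanding that the fallback uses the directory of the file containing the require statement.

```php
<?php
// Throwaway tree: lib/ holds two files, elsewhere/ is empty.
$base = sys_get_temp_dir() . '/calling_dir_demo_' . uniqid();
mkdir($base . '/lib', 0777, true);
mkdir($base . '/elsewhere', 0777, true);

// helper.php returns its own directory; main.php pulls it in by bare name.
file_put_contents($base . '/lib/helper.php', '<?php return __DIR__;');
file_put_contents($base . '/lib/main.php', '<?php return require "helper.php";');

// Work from a directory that has no helper.php, and keep the include path
// at just "." so only the working directory is searched first.
chdir($base . '/elsewhere');
$previous = set_include_path('.');

// main.php is loaded by absolute path, so it is the calling script
// for the bare require inside it.
$found_in = require $base . '/lib/main.php';
set_include_path($previous);

echo getcwd() . "\n";   // the working directory (elsewhere)
echo $found_in . "\n";  // where helper.php was actually found (lib)
```

If the two printed directories differ, helper.php was found through the calling script’s directory rather than through the working directory.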

Back to the results: because our working directory was changed, and because there was a new magic-named file in that folder, our main script used that file for the non-path version. But notice that the explicitly relative paths broke, too!

The path ./file.php isn’t relative to the file it is written in, it is relative to the current working directory! This basically means that the only absolutely safe method to include files is to always use the absolute path format, unless you have 100% absolute control over everything.
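Here’s a small sketch contrasting the two forms side by side. The app and other folder names and the entry.php file are assumptions made for the demo; the point is only that the ./ form follows the working directory while the __DIR__ form stays anchored to the file it is written in.

```php
<?php
// Two folders, each with its own file.php that reports where it lives.
$base = sys_get_temp_dir() . '/abs_demo_' . uniqid();
mkdir($base . '/app', 0777, true);
mkdir($base . '/other', 0777, true);

$contents = '<?php return __DIR__;';
file_put_contents($base . '/app/file.php', $contents);
file_put_contents($base . '/other/file.php', $contents);

// entry.php asks for its sibling twice: once relative to the working
// directory, once anchored to its own location.
file_put_contents(
    $base . '/app/entry.php',
    '<?php $dot = require "./file.php"; $dir = require __DIR__ . "/file.php"; '
    . 'return ["dot" => $dot, "dir" => $dir];'
);

// Run entry.php while the working directory is somewhere else.
chdir($base . '/other');
$result = require $base . '/app/entry.php';

echo basename($result['dot']) . "\n"; // other
echo basename($result['dir']) . "\n"; // app
```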
