-
Notifications
You must be signed in to change notification settings - Fork 7.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #65069: GlobIterator fails to access files inside an open_basedir restricted dir #398
Conversation
Great to see ones hands on this bug, nice idea for the fix. Please take the comments above into account. Also i think you should add more tests, like with the empty glob result, '.' as basedir, pattern is completely outta basedir, ... may be you could see more in ext/standard/tests/file/glob_* ... One thing i wonder though - why do you need to deal with two indexes? You could save only valid indexes in the map and use those for iterator instead of the originals delivered by glob(). That would also spare recording the first element, only the map array and its size were needed. |
Thanks for comments! I will look into this a bit later (probably tonight). I planed to do some changes anyway. About the indexes. The thing is that if I don't want to check valid paths twice or use reallocations during the checking, I need to allocate pglob->glob.gl_pathc elements. In case I wanted to save indexes, I would have to use int and map would be 4 times or 8 times (64bit) bigger. I actually thought about binary map when I started with implementation but compare to data allocated by glob, it's just not worthy implementation... I know that memory is not an issue so using index map wouldn't be problem. I also know that iterating would be more performance friendly if I used int map with indexes. Because both of these things have minor influence (because the slowest and the memory most expensive operation is glob), I think that the best would be to choose a solution that is easier to understand when you read the code. I thought that masking is simpler but if you think that indexes would be better, I am happy to implement it? |
As for me, i'd be standing for a simpler code more than for sparse memory, at least as long as it's acceptable. Having something like pglob->valid_idx and checking only for pglob->valid_idx_size would also minimize error risks . IMHO, lets wait for other comments. |
I have fixed the things that you mentioned (if I forgot something let me know). It looks that saving of indexes is much more readable. It also is a bit shorter. Just need to do a proper testing. I will add more tests later (probably tomorrow). |
I have just added more tests and last fixes. I think that the patch is ok now. If there is anything else, let me know! |
Please could you provide an example when it fails? The test bug65069.phpt contains example ( There is a security reason why using |
That's in the test you mention, the place with That -1 is because php_check_open_basedir_ex() used with a pattern path
There is a specific check for some wildcards in the path when it enters virtual_file_ex(). Could you pls recheck with the glob implementation? It throws no warning and just silently return false, therefore no platform difference. Maybe it should be done the same way glob does, like an early return right after glob was NOMATCH? So like in the glob() function. |
I just double checked the glob implementation. glob logic the same as my implementation of the glob_wrapper. glob return values are following:
this patch deals with that in this way:
As you can see the problem is in both functions. glob shouldn't return false but an empty array. I just tested it on Linux and it really returns an empty array so there is a platform difference. As I said the logic is the same with one small difference (see bellow). There is the same test on the pattern http://lxr.php.net/xref/PHP_TRUNK/ext/standard/dir.c#495 that is executed if path_c is equal to 0 or result is equal to Anyway back to the our problem. I think that there is a bug in glob implementation on Windows. Possibly bug in |
Finally prepared the env to test on both linux and windows, please check this on linux `$ sapi/cli/php -n -d open_basedir=/home/anatol/dws/src/bukka/ext/spl/tests/bug65069/ -r 'var_dump(new GlobIterator("/home/anatol/dws/src/bukka/ext/spl/tests/bug65069/.php"));' $ sapi/cli/php -n -d open_basedir=/home/anatol/dws/src/bukka/ext/spl/tests/bug65069/ -r 'var_dump(glob("/home/anatol/dws/src/bukka/ext/spl/tests/bug65069/*.php"));' the same on windows `x64\Debug_TS\php -n -d open_basedir=C:\php-sdk\phpmaster\vc11\x64\bukka\ext\spl\tests\bug65069 -r "var_dump(new GlobIterator('C:\php-sdk\phpmaster\vc11\x64\bukka\ext\spl\tests\bug65069*.php'));" Fatal error: Uncaught exception 'UnexpectedValueException' with message 'GlobIterator::__construct(): open_basedir restriction in effect. File(C:\php-sdk\phpmaster\vc11\x64\bukka\ext\spl\tests\bug65069*.php) is not within the allowed path(s): (C:\php-sdk\phpmaster\vc11\x64\bukka\ext\spl\tests\bug65069)' in Command line code:1 x64\Debug_TS\php -n -d open_basedir=C:\php-sdk\phpmaster\vc11\x64\bukka\ext\spl\tests\bug65069 -r "var_dump(glob('C:\php-sdk\phpmaster\vc11\x64\bukka\ext\spl\tests\bug65069*.php'));" Ok, so that's probably not what i thought last time. However you see teh difference, there has to be something in the code flow causing it. Also note the behaviour is only reproduceable with the full paths and only if the pattern points to non existent files in the root of basepath. However, checking the same glob() call on PHP-5.5 it returns bool(false) on both windows and linux. I haven't checked the history yet, but master is clearly breaking BC. The question is if that's intended and accepted. Possibly the behaviour of glob() has to be fixes as well then. Aside that, i'd really speak for leaving the virtual_file_ex() and co alone. Not only it's not the topic of this ticket. The behaviour there, as you say, is based on many platform specific things, and even if not - any change will have beyond consequences. What you call bug might be a survival thing :), like '*' and '?' are inacceptable for the NTFS filesystem. Much better to solve it right in place using #ifdef if necessary. Thanks. |
hi! "The best place would be php_check_specific_open_basedir where we could check for wildcards. I could do that if you want?" That's not a good choice, this function only checks a given path and tells if it is inside the open base dirs or not. Adding glob-like support to it makes it over complicated and will slow down almost all file ops. "Possibly changing virtual_file_ex could solve the problem as well because its behavior is platform specific but this would need to be done by someone who can test it on win as I don't have win... What do you think?" Which kind of platform specific behaviors do you see exactly? |
@bukka, please see https://bugs.php.net/bug.php?id=47358 |
I think that the best solution would be create a new PHPAPI exported function in
that would use (when call I will send a patch later if you agree with the solution? |
hi! On Mon, Aug 19, 2013 at 1:28 PM, Jakub Zelenka [email protected]:
Not really, this function just like virtual_file_ex should care only about That being said, we found the actual source of a bug, Anatolyi is Pierre |
Hi, Ok great. That makes sense. Let me know if there is anything that should be done in this patch. |
@bukka please consider these patch to glob() and the test http://git.php.net/?p=php-src.git;a=commitdiff;h=f4df40108be641a4167f6f6c1c3989958dda438a The lesson learned is that we shouldn't bother for windows with the basedir check on the glob query. If that query contains illegal path chars, it's justified to get an error. As you can see from the test, in case of windows if glob() returned nomatch, return an empty array() without further checks. That way the test linked above passes on linux and windows. So in your patch, you could make an early return just using that five lines from the glob(), or #ifndef that basedir check on windows correspondingly. Cheers |
I am afraid that If you test only this:
the result is false for all calling of glob on Linux and I guess that it will be an empty array on Windows (That's because you don't test the pattern). The linux behaviour is correct because testing files that are not in open_basedir directory is error (not empty result). What you can actually do on Win now is finding existing file that is not in open_basedir. Try In this case the point of testing the pattern is not to find out whether the whole pattern is in open_basedir but whether its directory (in this case I have got an idea how to do that. I could implement it as a part of this patch to glob_wrapper and after testing on win, we can do similar implementation in glob implementation. What do you think? |
You can change it, but only make it more restrictive. See the doc for more
|
@bukka, ok, if the pattern looks like |
Yeah I was thinking about these cases and they are a bit tricky. We would have to parse the path when calling glob recursively. For example if you have something like this: dir =
The question is if this complex checking is not a bit too much..? Other and maybe better solution would be returning an empty array in any case. It means even if the open_basedir test fails - http://lxr.php.net/xref/PHP_TRUNK/ext/standard/dir.c#537 (That would prevent security risk of checking file existence out of open_basedir directory) We would just didn't consider searching out of open_basedir as an error but only return empty result...? That would actually resolve the initial request in bug 47358. |
On Tue, Aug 20, 2013 at 11:05 AM, Jakub Zelenka [email protected]:
Pierre @pierrejoye | http://blog.thepimp.net | http://www.libgd.org |
The problem is what to return if you don't have the result...? I haven't shown any example how this be a security risk yet. Please consider this: <?php
var_dump(glob('/var/test/index.html'));
var_dump(glob('/var/test/*.php')); then if don't test the pattern, the result is following
You can see that we shouldn't get any information if the file exists because open_basedir is /tmp. We need to agree how the empty results will be handled. Either we keep returning false if the testing is out of open_basedir (in that case we have to check the pattern and implement complex logic - example above), or we return an empty array (then we don't have to check the pattern). |
On Aug 20, 2013 11:40 AM, "Jakub Zelenka" [email protected] wrote:
|
But this is exactly what's happening on the current master branch when you try it on windows because there is no pattern check. My question is still the same. What do you want to do if there is no result? Just returning false or empty array leads to the leaking info about files that are out of open_basedir. However returning an empty array in any case except glob error would resolve the problem. Although it's an undocumented behaviour, it's small BC. This discussion is about windows because pattern check works on Linux, so it's entirely up to you. If you don't think that it's a security risk, I can just hide pattern check for windows in this patch. The only problem is that the behaviour will be different on each platform. |
Just small correction. I know that this directories would be different on win. The point of the example was just to show the logic. It would return an empty array for any non-existing files (even when you specify it without wildcard) and false for any existing file that is out of open basedir. |
No result or open basedir restriction are two different things. That is
|
Sorry I meant no result when there is an open basedir restriction. It means what to do if the open basedir is set and glob returns an empty result... :) As I said I see two possible solutions:
I just think that the current state needs to be changed. |
Well that's the dillema, there were two code paths 1 2 That means to be consistent whatever [WHAT?] is returned in the 1st code path, the same in the 2nd should be returned. But that's not gonna work good, in the first code path there is no reliable way to check whether the query was in the basedir, like In case of returning false it'd break the case foreach(glob('nothing found') ... ) if it was nomatch. In case of empty array it'll not break it, but it'll not indicate the basepath error case. So that's a dillema ( |
If one single element falls because of obd, return false.
|
Yep, that's exactly the question, as it's unknown in the 1st code path. You mean in that case false too? |
Do we really need to indicate that it's an open basedir restriction. If the reason for this is just to tell user that files have been filtered because there is an open basedir, then it doesn't cover all cases anyway. Consider this: dir structure
script: <?php
ini_set('open_basedir', '/var/www')
glob('/var/www/*'); The return array contains only test but The point is that in the case above you don't indicate that some files are hidden because of open basedir restriction. What's the point to indicate it in case that there are not any files found then? |
sorry the example in the previous comment contained few mistakes. I have just updated it so it should make sense now... :) |
I've filed a ticket about this https://bugs.php.net/bug.php?id=65489 |
Comment on behalf of krakjoe at php.net: Since this PR contains merge conflicts, and since the author seems to have abandoned working on it, I'm closing this PR. Please take this action as encouragement to work on the feature, open a clean PR, with satisfactory test coverage, and ensure those tests pass. |
Proper checking and filtering of glob stream paths when open_basedir set