Showing posts with label buffer overflow. Show all posts

Thursday, October 2, 2014


What's really wrong with systemd?



Unless you've been living under a rock or on a park bench during the past two years, you've probably heard of systemd and the numerous issues and controversies involved. You've probably also heard about some new alternatives.

Now instead of boring you with lengthy arguments, debatable concepts, and anecdotes, I'm going to boil the core problem with systemd down to two simple points:

  • The singular aim of systemd is to get other projects to depend on it.
  • The systemd developers have a track record of not knowing what they're doing, thereby creating fragile, unwieldy, and problematic software.
Now if you're looking for proof of these two points, you're free to read through the aforementioned pages I linked to, as well as search online for more. I hope you've enjoyed my summary.

Monday, June 30, 2014


Memory management in C and auto allocating sprintf() - asprintf()



Memory Management

Memory management in C is viewed by some to be quite tricky. One needs to work with pointers that can point anywhere in memory, and if misused, cause a program to misbehave, or worse.

The basic functions to allocate and deallocate memory in C are malloc() and free() respectively. The malloc() function takes a size in bytes of how much to allocate, and returns a pointer to the allocated memory to be used. Upon failure, a null pointer, which can be thought of as a pointer pointing to 0, is returned. The free() function takes the pointer returned by malloc(), and deallocates the memory, or frees up the memory it was once pointing to.

To work with malloc() in simple situations, typically, code along the following lines is used:
void *p = malloc(size);
if (p)
{
  ... work with p ...
  free(p);
}
else
{
  ... handle error scenario ...
}

Unfortunately many experienced programmers forget to handle the failure scenario. I've even heard some say they purposely don't, as they have no clue how to proceed, and just letting the program crash is good enough for them. If you meet someone who makes that argument, revoke their programming license. We don't need such nearsighted idiots writing libraries.

In any case, even the above can lead to some misuse. After this block of code runs, what is p now pointing to?

After the above code runs, in the case that malloc() succeeded, p is now pointing to memory in the middle of nowhere, and can't be used. This is known as a dangling pointer. Dangling pointers can be dangerous, as an if check like the one above will think the pointer is valid and may do something with it, leading to the infamous use-after-free bug. This becomes more likely to occur as the situation grows more complicated, loops get involved, and the interaction between malloc() and free() can take multiple paths.

Pointers involved with memory management should always be pointing at 0 or at allocated memory. Anything else is just asking for trouble. Therefore, I deem any direct use of free() dangerous, as it doesn't set the pointer to 0.

So if free() is considered harmful, what should one use?

In C++, I recommend the following:

static inline void insane_free(void *&p)
{
  free(p);
  p = 0;
}

This insane_free() is now a drop-in replacement for free(), and can be used instead. (Since C++ programs normally use new and delete/delete[] instead, I leave it as an exercise to the reader how to work with those.)

However, C doesn't support direct references. One can pass a pointer to the pointer to accomplish similar results, but that becomes clunky and is not a drop-in replacement. So in C, I recommend the following:
#define insane_free(p) { free(p); p = 0; }
It makes use of the preprocessor, so some may consider it messy, but it can be used wherever free() currently is. One could also name the macro free in order to automatically replace existing code, but it's best not to program that way, as you begin to rely on these semantics. This in turn means someone copying your code may think a call to free() is the normal free() and not realize something special is occurring when they copy it elsewhere without the macro.

Correct usage in simple cases is then:
void *p = malloc(size);
if (p)
{
  ... work with p ...
  insane_free(p);
}
else
{
  ... handle error scenario ...
}
If you think using a wrapper macro or function is overkill, and always manually assigning the pointer to 0 after freeing is the way to go, consider that it's unwieldy to constantly do so, and you may forget to. If the above technique were always used, use-after-free bugs would never occur in the first place.

Something else to be aware of is that there's nothing wrong with calling free(0). However, calling free() upon a pointer which is not null and not pointing to allocated memory is undefined behavior, and will likely crash your program. So stick to the advice here, and you may just find memory management becomes significantly easier.

If all this talk of pointers is beyond you, consider acquiring Understanding and Using C Pointers.

sprintf() and asprintf()

If you do a lot of C programming, at some point, you probably have used the sprintf() function, or its safer counterpart snprintf().

These two functions sprintf() and snprintf() act like printf(), but instead of printing to the standard output, they print to a fixed-length string buffer. Now a fixed-length string buffer is great and all, but what if you wanted something which automatically allocated the amount it needed?

Enter asprintf(), a function which acts like sprintf(), but is auto-allocating, and needs no buffer supplied. This function was invented ages ago by GLIBC, shortly thereafter copied to the modern BSDs, and found its way further still into all sorts of libraries, although it is not yet ubiquitous.

Let's compare the prototypes of the two:
int sprintf(char *buffer, const char *format, ...); 
int asprintf(char **ret, const char *format, ...);
The sanest approach would have been for a function like asprintf() to have the prototype of:
char *asprintf(const char *format, ...);
But its creators wanted to make it act like sprintf(), and its design can also be potentially more useful.

Instead of passing asprintf() a buffer, a pointer to a variable of type char * needs to be passed, like so:
char *buffer;
asprintf(&buffer, ...whatever...);
Now how asprintf() actually works is no big secret. The C99 standard specifies that when snprintf() is given a buffer too small to contain its output (including a size of 0), it should return the number of characters that would have been needed. Which means that conceptually, something along the following lines would be all that asprintf() needs to do:
char *buffer = malloc(snprintf(0, 0, format, data...)+1);
sprintf(buffer, format, data...);
Of course though, the above taken verbatim would be incorrect, because it mistakenly assumes that nothing can go wrong, such as the malloc() or snprintf() failing.

First let's better understand what the *printf() functions return. Upon success, they return the number of characters written to the string (which does not include the trailing null byte). In other words, the return value is equivalent to calling strlen() on the data being output, which can save you a strlen() call after sprintf() or similar functions in certain scenarios. Upon failure, for whatever reason, the return is -1. Then there's the above mentioned special case of snprintf(), which returns the number of characters needed to contain the full output when the buffer is too small. If during the output, the size overflows (exceeds INT_MAX), many implementations will return a large negative value (whether reporting the needed size with snprintf(), or success with any of the functions).
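To make these rules concrete, here's a small sketch; the helper names are mine, purely for illustration:

```c
#include <stdio.h>

/* sprintf() returns the length of what it wrote, sans the null byte */
int printed_length(void)
{
  char buf[16];
  return sprintf(buf, "%s %d", "id:", 42); /* 6, same as strlen("id: 42") */
}

/* C99 snprintf() with a size of 0 writes nothing, but still
   returns the length the full output would require */
int needed_length(void)
{
  return snprintf(0, 0, "%s %d", "id:", 42); /* 6 again */
}
```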

Like the other functions, asprintf() also returns an integer of the nature described above. Which means working with asprintf() should go something like this:
char *buffer;
if (asprintf(&buffer, ...whatever...) != -1)
{
  do_whatever(buffer);
  insane_free(buffer);
}
However, unlike the other functions, asprintf() has a second return value: its first argument, or what the function sees as *ret. To comply with the memory management discussion above, this should also be set to 0 upon failure. Unfortunately, many popular implementations, including those in GLIBC and MinGW, fail to do so.

Since I develop with the above systems, and I'm using asprintf() in loops with multiple paths, it becomes unwieldy to need to pass around the buffer and the returned int, so I'd of course want saner semantics which don't leave dangling pointers in my program.

In order to correct such mistakes, I would need to take code from elsewhere, or whip up my own function. Now I find developing functions such as these to be relatively simple, but even so, I always go to check other implementations to see if there's any important points I'm missing before I go implement one. Maybe, I'll even find one which meets my standards with a decent license which I can just copy verbatim.

In researching this, to my shock and horror, I came across implementations which properly ensure that *ret is set to 0 upon failure, but the returned int may be undefined in certain cases. That some of the most popular implementations get one half wrong, and that some of the less popular get the other half wrong is just downright terrifying. This means that there isn't any necessarily portable way to check for failure with the different implementations. I certainly was not expecting that, but with the amount of horrible code out there, I guess I really shouldn't be surprised anymore.

Also in the course of research, besides finding many implementations taking a non-portable approach, I found many with problems in all sorts of edge cases, such as mishandling overflow, or not realizing that two successive calls to a *printf() function with the same data may not necessarily yield the same results. Some try to calculate the length needed with extra logic so they only call sprintf() once, but this logic may not be portable, and always needs updating as new types are added to the format string as standards progress or the C library decides to offer new features. Other mistakes I found seem to be due to expecting a certain implementation of the underlying functions, where later the underlying functions were altered, or the code was copied verbatim to another library without noting that the underlying functions acted differently.

So, once again, I'm finding myself needing to supply the world with more usable implementations.

Let's dive into how to implement asprintf().

Each of these functions actually has two variants: the regular one, which takes an unlimited number of arguments, and the v variant, which takes a va_list (defined in stdarg.h) after the format argument instead. These va_lists are what ... gets turned into after use, and in fact, every non-v *printf() function is actually a wrapper around its counterpart v*printf() function. This makes implementing asprintf() itself quite straightforward:
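As a sketch of that wrapping, under hypothetical my_ names so nothing clashes with a system-provided asprintf(), and with a bare-bones v variant included just so the example stands alone:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Bare-bones v variant, included only so this sketch stands alone;
   doing it carefully is discussed further on. */
int my_vasprintf(char **strp, const char *format, va_list ap)
{
  int size;
  va_list ap2;

  va_copy(ap2, ap);                    /* The list gets consumed, so copy it */
  size = vsnprintf(0, 0, format, ap2); /* C99: returns the length needed */
  va_end(ap2);

  if ((size < 0) || !(*strp = malloc(size+1))) { *strp = 0; return -1; }
  return vsprintf(*strp, format, ap);
}

/* The non-v function just converts its ... into a va_list and delegates */
int my_asprintf(char **strp, const char *format, ...)
{
  int r;
  va_list ap;

  va_start(ap, format);
  r = my_vasprintf(strp, format, ap);
  va_end(ap);

  return r;
}
```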



To fix the aforementioned return problems, one could also easily throw in here a check upon the correct return variable used in the underlying vasprintf() implementation and use it to set the other. However, that's not a very portable fix, and the underlying implementation of vasprintf() can have other issues as described above.

A straight forward implementation of vasprintf() would be:
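A sketch of such an implementation, again under hypothetical my_ names, with the thin wrapper repeated so the example is complete (the r > size check is discussed in the points further on):

```c
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define insane_free(p) { free(p); p = 0; }

int my_vasprintf(char **strp, const char *format, va_list ap)
{
  int r = -1, size;
  va_list ap2;

  va_copy(ap2, ap);                    /* Deep copy, since the list is used twice */
  size = vsnprintf(0, 0, format, ap2); /* C99: the length the output needs */
  va_end(ap2);

  *strp = 0;                           /* Null on every failure path */
  if ((size >= 0) && (size < INT_MAX) && (*strp = malloc(size+1)))
  {
    r = vsnprintf(*strp, size+1, format, ap);
    if ((r < 0) || (r > size)) { insane_free(*strp); r = -1; }
  }

  return r;
}

int my_asprintf(char **strp, const char *format, ...)
{
  va_list ap;
  va_start(ap, format);
  int r = my_vasprintf(strp, format, ap);
  va_end(ap);
  return r;
}
```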



As long as you have a proper C99 implementation of stdarg.h and vsnprintf(), you should be good to go. However, some systems may have vsnprintf() but not va_copy(). The va_copy() macro is needed because a va_list may not just be a simple object, but have handles to elsewhere, and a deep copy is needed. Since vsnprintf() being passed the original va_list may modify its contents, a copy is needed because the function is called twice.

Microsoft Visual C++ (MSVC, or Microsoft Vs. C++ as I like to think of it) up until the latest versions has utterly lacked va_copy(). This and several other systems that lack it usually have simple va_lists that can be shallow copied. To gain compatibility with them, simply employ:


#ifndef va_copy 
#define va_copy(dest, src) dest = src 
#endif

Be warned though that if your system lacks va_copy(), and a deep copy is required, using the above is a recipe for disaster.

Once we're dealing with systems where a shallow copy works though, dropping va_copy() entirely and passing the va_list straight to vsnprintf() both times works just as well, as vsnprintf() receives its own copy of the va_list and won't be modifying the caller's data.



Before we go further, there are two points I'd like to make.
  • Some implementations of vsnprintf() are wrong, and always return -1 upon failure, not the size that would've been needed. On such systems, another approach will need to be taken to calculate the length required, and the implementations here of vasprintf() (and by extension asprintf()) will just always return -1 and *ret (or *strp) will be 0.
  • The code if ((r < 0) || (r > size)) could instead be if (r != size), more on that later.
Now on Windows, vsnprintf() always returns -1 upon failure, in violation of the C99 standard. However, in violation of Microsoft's own specifications, and undocumented, I found that vsnprintf() with the first two parameters being passed 0 as in the above code actually works correctly. It's only when you're passing data there that the Windows implementation violates the spec. But in any case, relying on undocumented behavior is never a good idea.

On certain versions of MinGW, if __USE_MINGW_ANSI_STDIO is defined before stdio.h is included, it'll cause the broken Windows *printf() functions to be replaced with C99 standards compliant ones.

In any case though, Windows actually provides a different function to retrieve the needed length, _vscprintf(). A simple implementation using it would be:
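A sketch of that simple implementation follows; the my_ names are mine, and the non-Windows shim for _vscprintf() is only there so the example compiles anywhere for demonstration (on MSVC the C runtime provides it):

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#ifndef _WIN32
/* Shim so this sketch also compiles outside of Windows */
static int _vscprintf(const char *format, va_list ap)
{
  va_list ap2;
  va_copy(ap2, ap);
  int r = vsnprintf(0, 0, format, ap2);
  va_end(ap2);
  return r;
}
#endif

int my_vasprintf(char **strp, const char *format, va_list ap)
{
  int size = _vscprintf(format, ap); /* Windows' own way to measure the output */

  if ((size < 0) || !(*strp = malloc(size+1))) { *strp = 0; return -1; }

  return vsprintf(*strp, format, ap); /* Trusts the measured size blindly */
}

int my_asprintf(char **strp, const char *format, ...)
{
  va_list ap;
  va_start(ap, format);
  int r = my_vasprintf(strp, format, ap);
  va_end(ap);
  return r;
}
```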



This however makes the mistake of assuming that vsnprintf() is implemented incorrectly as it currently is with MSVC. Meaning this will break if Microsoft ever fixes the function, or you're using MinGW with __USE_MINGW_ANSI_STDIO. So better to use:
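A sketch of such a version, which validates the write instead of assuming either behavior (my_ names again; the non-Windows _vscprintf() shim exists only so the example compiles anywhere):

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define insane_free(p) { free(p); p = 0; }

#ifndef _WIN32
/* Shim so this sketch also compiles outside of Windows */
static int _vscprintf(const char *format, va_list ap)
{
  va_list ap2;
  va_copy(ap2, ap);
  int r = vsnprintf(0, 0, format, ap2);
  va_end(ap2);
  return r;
}
#endif

int my_vasprintf(char **strp, const char *format, va_list ap)
{
  int r = -1, size = _vscprintf(format, ap);

  *strp = 0;
  if ((size >= 0) && (*strp = malloc(size+1)))
  {
    /* The buffer is known to be large enough, so this works whether
       vsnprintf() is C99 compliant or the old broken Windows variant */
    r = vsnprintf(*strp, size+1, format, ap);
    if ((r < 0) || (r > size)) { insane_free(*strp); r = -1; }
  }

  return r;
}

int my_asprintf(char **strp, const char *format, ...)
{
  va_list ap;
  va_start(ap, format);
  int r = my_vasprintf(strp, format, ap);
  va_end(ap);
  return r;
}
```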



Lastly, let me return to that second point from earlier. The second vsnprintf() call may fail because the system ran out of memory for its internal activities even though the call to malloc() succeeded, or because something else happened on the system to cause it to fail. Also, in a multi-threaded program, the various arguments being passed could have their data change between the two calls.

Now if you're calling functions while another thread is modifying the same variables you're passing to said functions, you're just asking for trouble. Personally, I think all the final check should do is ensure that r is equal to size, and if not, something went wrong: free the data (with insane_free() of course), and set r to -1. However, any value between 0 and size (inclusive), even when not equal to size, means the call succeeded for some definition of success, which the above implementations all allow for (except where Microsoft makes that impossible). Based on this idea, several of the implementations I looked at simply loop while vsnprintf() continues to indicate that the buffer needs to be larger. Therefore, I'll provide such an implementation as well:
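A sketch of such a looping implementation, under the same hypothetical my_ names, with insane_free() from earlier:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define insane_free(p) { free(p); p = 0; }

int my_vasprintf(char **strp, const char *format, va_list ap)
{
  int r;
  size_t size = 0;
  char *p = 0, *np;
  va_list ap2;

  for (;;)
  {
    va_copy(ap2, ap);                   /* Fresh copy for every attempt */
    r = vsnprintf(p, size, format, ap2);
    va_end(ap2);

    /* Stop on hard failure, or once the buffer turned out big enough */
    if ((r < 0) || ((size_t)r < size)) { break; }

    size = (size_t)r + 1;               /* vsnprintf() reported what it needs */
    if (!(np = realloc(p, size))) { r = -1; break; }
    p = np;
  }

  if (r < 0) { insane_free(p); }        /* Leave *strp null on failure */
  *strp = p;
  return r;
}

int my_asprintf(char **strp, const char *format, ...)
{
  va_list ap;
  va_start(ap, format);
  int r = my_vasprintf(strp, format, ap);
  va_end(ap);
  return r;
}
```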



Like the first implementation, if all you lacked was va_copy(), and shallow copy is fine, it's easy to get this to work on your platform as described above. But if vsnprintf() isn't implemented correctly (hello MSVC), this will always fail.

All the code here including the appropriate headers, along with notes and usage examples are all packed up and ready to use on my asprintf() implementation website. Between everything offered, you should hopefully find something that works well for you, and is better than what your platform provides, or alternative junk out there.

As always, I'm only human, so if you found any bugs, please inform me.

Saturday, May 17, 2014


Protecting private keys



Web servers use private keys which they alone have in order to secure connections with users. Private keys must be protected at all costs.

In order to protect private keys on disk, one generally encrypts them with a password, which is then needed by the web server upon launch in order to decrypt and use it in memory. However, if measures aren't taken to secure the memory containing the private key, it can be stolen from there too, which would be catastrophic.

Normally, one doesn't need to worry about outsiders getting a hold of data from memory unless the attackers have direct access to the server itself. But bugs like Heartbleed allow remote users to grab random data from memory. Once data is in memory, the application and all the libraries it uses could divulge secret data if there is a buffer overflow lurking somewhere.

To protect against exploiting such bugs, one should ensure that buffer overflows do not have access to memory containing private data. The memory containing private keys and similar kinds of data should be protected, meaning nothing should be allowed to read from them, not even the web server itself.

Now obviously a program needs some access to a private key in order to work with it, so it can't just prevent all access from it. Rather, once a private key or similar data is loaded into memory, that memory should have its read permissions removed. When, and only when some activity needs to be performed with the private key, read permissions can be restored, the activity performed, and then read permissions revoked. This will ensure the rest of the application cannot access what it does not need, nor should be allowed to access.

On UNIX systems, one can use mprotect() to change the permission on a page of memory. On Windows, one can use VirtualProtect().
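As a hypothetical sketch of that revoke-by-default idea on a UNIX system (all the names here are made up for illustration, and a real server would also want locking and error handling around this):

```c
#define _DEFAULT_SOURCE /* For MAP_ANONYMOUS on glibc */
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Park a secret in its own page(s), kept unreadable except during use */
static void *secret;
static size_t secret_size;

int secret_store(const void *data, size_t len)
{
  size_t page = (size_t)sysconf(_SC_PAGESIZE);
  secret_size = ((len + page - 1) / page) * page; /* Round up to whole pages */

  /* An anonymous mapping gives page-aligned memory mprotect() can operate on */
  secret = mmap(0, secret_size, PROT_READ|PROT_WRITE,
                MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
  if (secret == MAP_FAILED) { return -1; }

  memcpy(secret, data, len);
  return mprotect(secret, secret_size, PROT_NONE); /* Now nothing can read it */
}

/* Compare against the secret, restoring read access only for the duration */
int secret_matches(const void *data, size_t len)
{
  if (mprotect(secret, secret_size, PROT_READ)) { return -1; }
  int r = memcmp(secret, data, len);
  mprotect(secret, secret_size, PROT_NONE); /* Revoke access again */
  return r;
}
```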

The above however has a crucial flaw - multi-threading. In a threaded application, all threads have access to data of other threads. So while one thread may be performing some critical private key related code, and allows read access for the moment, another thread can read it outside of the critical portion of code too. Therefore, even more isolation is needed.

To truly isolate the code that uses a private key and similar data, all the code that handles that stuff should be placed into its own process. The rest of the application can then request that well defined activities be performed via a pipe or other form of inter-process communication. This will also ensure that other kinds of bugs in the application, such as buffer overflows that allow arbitrary code execution cannot reestablish read access to the secret data.

On UNIX systems, one can use fork() to create a process which is still part of the same application. On all systems, the separate process can be a separate application with a well defined restrictive IPC API with limited and secured access by the web server.
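Here's a hypothetical sketch of that process separation on UNIX, with a deliberately tiny one-request protocol; all the names and the example key are made up, and a real helper would answer signing requests rather than length queries:

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Only the child process ever holds the key; the rest of the application
   asks for well defined operations over a pair of pipes. */
int spawn_key_helper(int *to_child, int *from_child)
{
  int req[2], rep[2];
  if (pipe(req) || pipe(rep)) { return -1; }

  pid_t pid = fork();
  if (pid < 0) { return -1; }

  if (!pid) /* Child: the key is loaded here, and never exists in the parent */
  {
    const char key[] = "hypothetical-private-key";
    char cmd;

    close(req[1]); close(rep[0]);
    while (read(req[0], &cmd, 1) == 1)
    {
      if (cmd == 'L') /* One narrow request: report the key's length */
      {
        unsigned char len = (unsigned char)strlen(key);
        write(rep[1], &len, 1);
      }
    }
    _exit(0);
  }

  close(req[0]); close(rep[1]); /* Parent keeps only its ends of the pipes */
  *to_child = req[1];
  *from_child = rep[0];
  return 0;
}
```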

No insecure library or silly bug in your web server should ever allow the application to divulge such secrets. If such services are not utilizing the techniques above, then they're just biding their time until the next Heartbleed.

Sunday, November 11, 2007


Directory safety when working with paths



As we discussed in our previous two articles, we run into issues of thread safety when we try changing the path during a running program. Therefore, in addition to bypassing PATH_MAX problems and working around buffer issues thanks to implementing in C++, we would also love to implement our functions without ever calling chdir() and changing the working directory. We'll now look into how we can change our implementation to avoid a chdir() call, along with the pros and cons of the techniques UNIX operating systems provide us, as well as what they're lacking.

If one looks back at how we implemented getcwd(), they'll notice that we call chdir("..") for each step in our loop. We need this to sanitize the location for these three calls:

opendir(".")
lstat(entry->d_name, &sb)
stat(".", &sb)


To remove the need to change directories, we can place the ".." paths directly into our calls to opendir(), lstat(), and stat(). Thanks to C++ strings being dynamic, it's easy for us to manipulate paths for these situations properly. We'll also see that by being clever, we can even avoid string manipulation for some of these cases. And if our OSs supplied enough functionality, we could do it without any string manipulation at all!

Here's how we can implement getcwd() based off our previous implementation, but without calls to chdir(), with key details highlighted:

bool getcwd(std::string& path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  struct stat sb;
  if (!stat(".", &sb))
  {
    file_id current_id(sb.st_dev, sb.st_ino);
    if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
    {
      std::vector<std::string> path_components;
      file_id root_id(sb.st_dev, sb.st_ino);
      std::string up_path("..");

      while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
      {
        bool pushed = false;
        DIR *dir = opendir(up_path.c_str());
        if (dir)
        {
          dirent *entry;
          up_path += "/";
          std::string::size_type after_slash = up_path.size();
          while ((entry = readdir(dir))) //We loop through each entry trying to find where we came from
          {
            if (strcmp(entry->d_name, ".") && strcmp(entry->d_name, ".."))
            {
              up_path.replace(after_slash, std::string::npos, entry->d_name);
              if (!lstat(up_path.c_str(), &sb))
              {
                file_id child_id(sb.st_dev, sb.st_ino);
                if (child_id == current_id) //We found where we came from, add its name to the list
                {
                  path_components.push_back(entry->d_name);
                  pushed = true;
                  break;
                }
              }
            }
          }

          if (pushed && !fstat(dirfd(dir), &sb)) //If we have a reason to continue, we update the current dir id
          {
            current_id = file_id(sb.st_dev, sb.st_ino);
            up_path.replace(after_slash, std::string::npos, ".."); //Keep recursing towards root each iteration
          }

          closedir(dir);
        }
        if (!pushed) { break; } //If we didn't obtain any info this pass, no reason to continue
      }

      if (current_id == root_id) //Unless they're equal, we failed above
      {
        //Build the path, will always end with a slash
        path = "/";
        for (std::vector<std::string>::reverse_iterator i = path_components.rbegin(); i != path_components.rend(); ++i)
        {
          path += *i + "/";
        }
        success = true;
      }
    }
  }

  return(success);
}

First of all, we removed all the code related to saving the current directory when we started, since we don't need that information anymore. Our first major change is that we created a new C++ string called up_path. We'll use this variable to keep track of the path, initializing it to "..", then for each step through the loop making it "../..", "../../..", and so on, until we reach the root. We use this in our calls to opendir() in place of the "." we were using before.
At this point, we add a slash to the path, and keep track of the spot with the variable after_slash. Now in our read directory loop, we can replace whatever is after the slash with each filename in the directory to pass to lstat(), again bypassing the need to be in the same directory as the file when making the function call.
Now the stat() call on the directory itself is where things get a little interesting. Instead of doing a path manipulation trick again, we call fstat() on the file descriptor returned by dirfd() on the already open directory handle. Notice how the call to close the directory has been moved to after that block of code, so the directory is still open. And of course it's all wrapped up nicely by appending ".." after the slash.

Having eliminated path manipulation for the last part, it would be nice to eliminate more of it, especially if anyone wants to port this C++ code to C. The good news is that on Solaris and Linux we can, and as soon as the "at" functions get standardized, we'll be able to use them on the other UNIX OSs too. You can read more about them in one of our previous articles, File Descriptors and why we can't use them.

Here's how we can use the at functions to eliminate the path manipulation needed for calling lstat(), with the differences highlighted:

bool getcwd(std::string& path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  struct stat sb;
  if (!stat(".", &sb))
  {
    file_id current_id(sb.st_dev, sb.st_ino);
    if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
    {
      std::vector<std::string> path_components;
      file_id root_id(sb.st_dev, sb.st_ino);
      std::string up_path("..");

      while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
      {
        bool pushed = false;
        DIR *dir = opendir(up_path.c_str());
        if (dir)
        {
          int dir_fd = dirfd(dir);
          dirent *entry;
          while ((entry = readdir(dir))) //We loop through each entry trying to find where we came from
          {
            if (strcmp(entry->d_name, ".") && strcmp(entry->d_name, "..") && !fstatat(dir_fd, entry->d_name, &sb, AT_SYMLINK_NOFOLLOW))
            {
              file_id child_id(sb.st_dev, sb.st_ino);
              if (child_id == current_id) //We found where we came from, add its name to the list
              {
                path_components.push_back(entry->d_name);
                pushed = true;
                break;
              }
            }
          }

          if (pushed && !fstat(dir_fd, &sb)) //If we have a reason to continue, we update the current dir id
          {
            current_id = file_id(sb.st_dev, sb.st_ino);
            up_path += "/.."; //Keep recursing towards root each iteration
          }

          closedir(dir);
        }
        if (!pushed) { break; } //If we didn't obtain any info this pass, no reason to continue
      }

      if (current_id == root_id) //Unless they're equal, we failed above
      {
        //Build the path, will always end with a slash
        path = "/";
        for (std::vector<std::string>::reverse_iterator i = path_components.rbegin(); i != path_components.rend(); ++i)
        {
          path += *i + "/";
        }
        success = true;
      }
    }
  }

  return(success);
}

As should be easily discernible, the string manipulation got simpler, and mostly vanished. All the keeping track of a slash and the replace calls have been reduced to appending "/.." at the end of each loop.
Instead of manipulating a path string for our call to lstat(), we save the directory's file descriptor up top (and subsequently use that instead of recalculating it lower down for fstat()), then use it to get the lstat() information for each file in the directory, but now via fstatat().
fstatat() is the same as stat()/lstat(), but takes a directory descriptor as the first parameter, which the filename is taken relative to. The last parameter can be 0 or AT_SYMLINK_NOFOLLOW, which makes it act like stat() or lstat() respectively. Instead of a file descriptor, fstatat() can also take the special value AT_FDCWD to have it automatically work in the current directory.

This implementation should be much more elegant, but a bit less portable. If one would like to implement fstatat() themselves, it's not hard, here's how you can do it:

int fstatat(int dirfd, const char *pathname, struct stat *buf, int flags)
{
  int success = -1;

  if (!flags || (flags == AT_SYMLINK_NOFOLLOW))
  {
    int cwdfd = -1;
    if ((dirfd == AT_FDCWD) || (pathname && (*pathname == '/')) || (((cwdfd = open(".", O_RDONLY)) != -1) && !fchdir(dirfd)))
    {
      success = (!flags) ? stat(pathname, buf) : lstat(pathname, buf);
    }

    if (cwdfd != -1)
    {
      fchdir(cwdfd);
      close(cwdfd);
    }
  }
  else
  {
    errno = EINVAL;
  }

  return(success);
}

You'll also need the defines for AT_FDCWD and AT_SYMLINK_NOFOLLOW. You have to make sure that the value you choose for AT_FDCWD can't be a valid file descriptor or the normal failure return value. Therefore I chose -2 in my implementations, but choosing any negative value other than -1 should be okay. It shouldn't matter what AT_SYMLINK_NOFOLLOW is; a value of 1, or for all I care 42, should be fine.
If your OS supports the at functions (currently Linux 2.6.16+ and recent versions of Solaris), it's better not to use a custom implementation, as this isn't thread safe, since it calls fchdir() internally, and for our example, runs counter to the whole point of using fstatat(). It would also be faster to use the original getcwd() implementation from a few days ago than using an emulated fstatat(), since there's less overhead from repeated fchdir() calls.
It's also interesting to note that glibc on Linux now implements fstatat() and the whole slew of at functions, even on kernels older than 2.6.16 where they're not supported by the kernel. Its emulation is similar to ours, can be thread unsafe due to changing the working directory, and for some inexplicable reason segfaulted in certain circumstances when I ran a large battery of tests pitting it and my implementation against the kernel's to make sure mine was working properly.

Anyways, with that out of the way, one can't help but wonder if it would be possible to also eliminate the need for any path string manipulation in the call to opendir(). Mysteriously, neither Solaris nor Linux has an opendirat() call. If there were such a function, we could easily keep the previous directory open until we obtained a new handle for its parent directory.
Barring having an opendirat() function, it'd be nice to implement such a thing. Some UNIX OSs, I understand, have a function perhaps named fopendir() or fdopendir() to promote a file descriptor to a directory handle. With such a function, we could simply write an opendirat() which calls openat(), then promotes it, but Linux doesn't have a method of promotion, although Solaris does (fdopendir()). Let's hope the people who are standardizing the new at functions for POSIX seriously consider what they're doing, otherwise we won't be able to use them.

Now that we've gotten getcwd() to be safe, we still need realpath() to be. In our realpath() implementation, the only unsafe part was our function chdir_getcwd(), which called chdir() internally as well as getcwd(). getcwd() is now taken care of, but we still need to rewrite the rest of chdir_getcwd() to not actually chdir().
Doing so should now be easy: we simply do getcwd(), but with a new start path. We'd make a new function called getcwd_internal() which takes two additional parameters, one for the current directory, and another to initialize the up path, and wrap everything else around it. Basically, copy your favorite getcwd(), but modify the beginning like so:

bool getcwd_internal(std::string& path, const std::string& start_dir, std::string& up_path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  struct stat sb;
  if (!stat(start_dir.c_str(), &sb))
  {
    file_id current_id(sb.st_dev, sb.st_ino);
    if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
    {
      std::vector<std::string> path_components;
      file_id root_id(sb.st_dev, sb.st_ino);

      while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
      --SNIP--


Now we'd turn getcwd() and chdir_getcwd() into wrappers like so:

bool getcwd(std::string& path)
{
  std::string up_path("..");
  return(getcwd_internal(path, ".", up_path));
}

bool chdir_getcwd(const std::string& dir, std::string& path)
{
  std::string up_path(dir+"/..");
  return(getcwd_internal(path, dir, up_path));
}


Note that the name of chdir_getcwd() is now misleading, as it doesn't actually change directories anymore. For this reason, it's a good idea not to put implementation details into a function name, as a user shouldn't have to know about them, and over time, they can become inaccurate.

And there you have it folks, getcwd() and realpath() implemented safely. Hopefully in the process all you loyal readers learned a couple other things too.

Other improvements over all we discussed might be to run a smart pass over concatenated path names at key locations to remove extra slashes ("///"), unneeded current directory references ("/./"), and extra parent directories ("path1/path2/.." -> "path1/"), which is useful if certain functions like opendir() or the stat() family of calls have internal buffer issues. But doing such is a discussion for another time.

This wraps up our long discussion. All comments and suggestions are welcome.

Friday, November 9, 2007


Implementing realpath() in C++



Before reading this post, please read the background in the previous post PATH_MAX simply isn't, as it covers important background, and has some code we'll be building on over here.

Here's what realpath() is supposed to do:

realpath() expands all symbolic links and resolves references to '/./', '/../' and extra '/' characters in the null terminated string named by path and stores the canonicalized absolute pathname in the buffer of size PATH_MAX named by resolved_path. The resulting path will have no symbolic link, '/./' or '/../' components.


Instead of attacking such an implementation head on, we'll take a bottom up approach. Consider that being able to split a path into the components before the last slash and after the last slash is essential to figuring out which directory the file actually lies in.

The built in functions for this are called dirname() and basename(), but they modify their arguments, and they do some funny handling in special cases which we don't want.

So to implement our own for our purposes, we can write a function as follows:

void relative_dir_base_split(const std::string& path, std::string& dir, std::string& base)
{
  std::string::size_type slash_pos = path.rfind("/"); //Find the last slash
  if (slash_pos != std::string::npos) //If there is a slash
  {
    slash_pos++;
    dir = path.substr(0, slash_pos); //Directory is before slash
    base = path.substr(slash_pos); //And obviously, the file is after
  }
  else //Otherwise, there is no directory present
  {
    dir.clear();
    base = path;
  }
}

The function is simple enough, pass it a path, and two C++ strings for the return values, and have it split them up.

The next function which we'll use is one to get the realpath() of a directory. We can easily get such a thing by chdir()'ing to the directory, running our getcwd(), and then chdir() back.
This simple function is as follows:


bool chdir_getcwd(const std::string& dir, std::string& path)
{
  bool success = false;
  int start_fd = open(".", O_RDONLY); //Open current directory so we can save a handle to it
  if (start_fd != -1)
  {
    if (!chdir(dir.c_str())) //Change to directory
    {
      success = getcwd(path); //And get its path
      fchdir(start_fd); //And change back of course
    }
    close(start_fd);
  }
  return(success);
}

The use of this simple function to resolve where a directory is located should be obvious. It's also a core function in figuring out where a file is, since a file is located at the realpath() of its parent directory, plus the base name for the file itself.

Now to get the realpath() of a file, we have this simple wrapper over the above functions:

static inline bool realpath_file(const std::string& path, std::string& resolved_path)
{
  bool success = false;
  std::string dir;
  std::string base;
  relative_dir_base_split(path, dir, base);

  //If there is a directory, get the realpath() for it, otherwise the current directory
  if (dir.size() ? chdir_getcwd(dir, resolved_path) : getcwd(resolved_path))
  {
    resolved_path += base;
    success = true;
  }
  return(success);
}

Now we have chdir_getcwd(), which is basically a realpath() for a directory, and realpath_file() for files. Note however, that realpath_file() won't actually ensure that the file exists, so it's probably a bad idea to call it directly, unless you don't care whether it exists or not. It also won't handle symlinks. To handle symlinks we need quite a bit more code.

First, let us write a wrapper in C++ for reading links:

bool readlink_internal(const std::string& path, std::string& buffer, ssize_t length)
{
  bool success = false;
  if (length > 0)
  {
    char *buf = new(std::nothrow) char[length+1]; //Room for null
    if (buf)
    {
      ssize_t amount = ::readlink(path.c_str(), buf, length+1); //Request one extra byte so truncation is detectable
      if ((amount > 0) && (amount <= length)) //If > length, it was modified mid check
      {
        buf[amount] = 0;
        buffer = buf;
        success = true;
      }
      delete[] buf;
    }
  }
  return(success);
}

This function simply wraps up the built in one, and uses C++ strings for dynamic allocation, so we don't have to worry about it directly in our function to resolve symlinks.

Before diving into the actual details in resolving a symlink, let me present one more helper function:

void build_path_base_swap(std::string& path, const std::string& newbase)
{
  std::string dir;
  std::string base;
  relative_dir_base_split(path, dir, base);

  if (dir.size())
  {
    path = dir + newbase;
  }
  else
  {
    path = newbase;
  }
}


This above function will take a path, and replace the base part of it, meaning if you pass it "path/cows.txt" and "feet.txt", path will become "path/feet.txt".
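To make that concrete, here's a quick standalone sketch. Both helpers are repeated from above so the snippet compiles on its own, and the swapped() wrapper is mine, added purely for demonstration:

```cpp
#include <string>

//Both helpers repeated from the post so this sketch stands alone
void relative_dir_base_split(const std::string& path, std::string& dir, std::string& base)
{
  std::string::size_type slash_pos = path.rfind("/");
  if (slash_pos != std::string::npos)
  {
    slash_pos++;
    dir = path.substr(0, slash_pos);
    base = path.substr(slash_pos);
  }
  else
  {
    dir.clear();
    base = path;
  }
}

void build_path_base_swap(std::string& path, const std::string& newbase)
{
  std::string dir;
  std::string base;
  relative_dir_base_split(path, dir, base);
  path = dir.size() ? (dir + newbase) : newbase;
}

//Demonstration wrapper: returns the swapped path by value
std::string swapped(std::string path, const std::string& newbase)
{
  build_path_base_swap(path, newbase);
  return(path);
}
```

As described, swapped("path/cows.txt", "feet.txt") yields "path/feet.txt", while swapped("cows.txt", "feet.txt") yields just "feet.txt".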

Now onto the symlink discussion.

Resolving symlinks in itself is hard though, since one symlink could lead to another, meaning we'll need a loop. We also need to defend ourselves against an infinite loop if one symlink links to itself, or a symlink earlier on in our resolve loop.

It is not safe enough to call a regular stat() to check for the final link existing (and not a loop), since during our resolve, an attacker can modify a link, so we have to detect that.
Unfortunately, most OSs/libraries have a define called MAXSYMLINKS, which I've seen set anywhere from 8 to 30. They stop their loop once they've iterated MAXSYMLINKS times and return an errno of ELOOP. Doing so has two flaws, which I hope are apparent. First, if a link links to itself immediately, it takes MAXSYMLINKS passes until the function finally realizes it (I'm looking at you, OpenBSD). Second, a fixed iteration cap isn't loop detection at all: it misreports any legitimately long symlink chain as a loop.
Anyone who has taken a data structures course knows there are algorithms for detecting a loop in a linked list, so there's no need to resort to a method which doesn't actually detect one. A number as low as 8, which I've seen some use, is also far too small. On one of my Linux boxes, some binaries are symlinked a dozen levels deep. Like "upx" is symlinked to "upx-3.00", which is symlinked to "upx-3.00-nonfree", which is symlinked to, etc...

The optimal way to detect a loop in a linked list is explained here; see the last paragraph for the optimal O(N) solution. Unfortunately, the optimal solution means we need a second iterator to check on symlinks, resolving links ahead of the main loop, two at a time. While great in memory, it can be a pain on a hard drive, where we have to deal with slow access, and the links can change during the check. Therefore I settled on a variation of the first solution, where we'll keep track of the file IDs (described in the previous post) of each symlink, and see if we ever hit a match. We'll store this data in an std::set, which is usually implemented as a red-black tree with O(log(N)) time for insertion and searching. That makes our algorithm O(log(N))*2 (since we insert and find each time) * N, which reduces to O(N log(N)).
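For reference, the O(N) constant-memory technique on an in-memory linked list (the one deemed impractical for on-disk symlinks above) is only a few lines. This is a generic sketch of Floyd's algorithm, not anything symlink-specific:

```cpp
#include <cstddef>

struct node
{
  node *next;
};

//Floyd's "tortoise and hare": advance one iterator by one and another by two;
//if there's a loop, the fast iterator eventually laps the slow one.
bool has_loop(const node *head)
{
  const node *slow = head, *fast = head;
  while (fast && fast->next)
  {
    slow = slow->next;
    fast = fast->next->next;
    if (slow == fast) { return(true); }
  }
  return(false); //The fast iterator fell off the end, so the list terminates
}
```

On disk, each of those fast-iterator steps would be another round of lstat()/readlink(), which is exactly why I avoid it here.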

Here's the code:

bool symlink_resolve(const std::string& start, std::string& end)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  if (start.size())
  {
    std::string path = start; //Need a modifiable copy
    struct stat sb;
    std::set<file_id> seen_links;

    bool resolved_link;
    do //The symlink resolve loop
    {
      resolved_link = false;
      if (!lstat(path.c_str(), &sb))
      {
        file_id current_id(sb.st_dev, sb.st_ino);
        if (seen_links.find(current_id) == seen_links.end()) //Not a link we've seen
        {
          seen_links.insert(current_id); //Add to our set

          if (S_ISLNK(sb.st_mode)) //Another link
          {
            std::string newpath;
            if (readlink_internal(path, newpath, sb.st_size))
            {
              if (newpath[0] == '/') //Absolute
              {
                path = newpath;
              }
              else //We need to calculate the relative path in relation to the current
              {
                build_path_base_swap(path, newpath);
              }
              resolved_link = true;
            } //Else, link can't be read, time to quit
          }
          else //Yay, it's not a link! Got to the last part finally!
          {
            success = realpath_file(path, end);
          }
        } //Else, nice try, someone linked a link back into a previous link during the scan to try to trick us into an infinite loop
      } //Else, dangling link, can't resolve
    } while (resolved_link);
  }
  return(success);
}


And now with all those helper functions out of the way, we can finally write the actual realpath() function itself, and it's a heck of a lot simpler than other implementations you'll find out there.

Here it is:

bool realpath(const std::string& path, std::string& resolved_path, bool resolve_link = true)
{
  bool success = false;
  if (path.size())
  {
    struct stat sb;
    if (!stat(path.c_str(), &sb))
    {
      bool (*rp)(const std::string&, std::string&) = resolve_link ? symlink_resolve : realpath_file;
      success = S_ISDIR(sb.st_mode) ? chdir_getcwd(path, resolved_path) : rp(path, resolved_path);
    }
  }
  return(success);
}

It simply stat()s the path, then runs chdir_getcwd() on directories and symlink_resolve() on files. I also offer an added feature: if you don't want the file name itself resolved (quite often what people want when a program saves a new file based off the name of another file passed to it), you can pass false as the third parameter, and it'll call realpath_file() instead of symlink_resolve(). With this wrapper, realpath_file() also gets a stat() call before it, so the file is checked for existence like the built in realpath().

This implementation protects against the issues of PATH_MAX described in our previous posts. It's also written in C++ to avoid buffer issues. Unlike built in ones, we generally have a faster realpath() on directories, since it uses a getcwd() technique, while most others run a complicated set of steps on every component in the path until the last one is reached, which is especially slow if there are symlinks in the path. We also beat the built in ones on speed in the case of a link to itself, and support a deeper chain of links. Some implementations also have a fixed size buffer which can easily fill up and return failure, an annoyance our implementation won't run into. Lastly, we offer not resolving the links at all on the last component.

Drawbacks to our implementation is that it won't work on Windows as described in our last post. But you can #ifdef for that, and easily enough wrap around fullpath(). It also isn't thread safe due to the use of chdir() (which many built in implementations also do), see the previous post on how to work around that. Next time, we'll discuss methods to do all this without a chdir() at all, thus ensuring thread safety without any headaches.

Saturday, November 3, 2007


PATH_MAX simply isn't



Many C/C++ programmers at some point may run into a limit known as PATH_MAX. Basically, if you have to keep track of paths to files/directories, how big does your buffer have to be?
Most Operating Systems/File Systems I've seen, limit a filename or any particular path component to 255 bytes or so. But a full path is a different matter.

Many programmers will immediately tell you that if your buffer is PATH_MAX, or PATH_MAX+1 bytes, it's long enough. A good C++ programmer of course would use C++ strings (std::string or similar with a particular API) to avoid any buffer length issues. But even when dynamic strings take care of the nitty gritty question of how long your buffers need to be, they only solve half the problem.

Even a C++ programmer may at some point want to call the getcwd() or realpath() (fullpath() on Windows) functions, which take a pointer to a writable buffer, and not a C++ string, and according to the standard, they don't do their own allocation. Even ones that do their own allocation very often just allocate PATH_MAX bytes.

getcwd() is a function to return what the current working directory is. realpath() can take a relative or absolute path to any filename, containing .. or levels of /././. or extra slashes, and symlinks and the like, and return a full absolute path without any extra garbage. These functions have a flaw though.

The flaw is that PATH_MAX simply isn't. Each system can define PATH_MAX to whatever size it likes. On my Linux system, I see it's 4096, on my OpenBSD system, I see it's 1024, on Windows, it's 260.

Now performing a test on my Linux system, I noticed that it limits a path component to 255 characters on ext3, but it doesn't stop me from making as many nested ones as I like. I successfully created a path 6000 characters long. Linux does absolutely nothing to stop me from creating such a large path, nor from mounting one large path on another. Running getcwd() in such a large path, even with a huge buffer, fails, since it doesn't work with anything past PATH_MAX.

Even a commercial OS like Mac OS X defines it as 1024, but tests show you can create a path several thousand characters long. Interestingly enough, OSX's getcwd() will properly identify a path which is larger than its PATH_MAX if you pass it a large enough buffer with enough room to hold all the data. This is possible, because the prototype for getcwd() is:
char *getcwd(char *buf, size_t size);


So a smart getcwd() can work if there's enough room. But unfortunately, there is no way to determine how much space you actually need, so you can't allocate it in advance. You'd have to keep allocating larger and larger buffers, hoping one of them will finally work, which is absurd.
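For the record, the retry dance would look something like this (getcwd_by_retry is my own name for it, and the starting size and the cap are arbitrary choices):

```cpp
#include <unistd.h>
#include <cerrno>
#include <string>

//Keep retrying ::getcwd() with a doubled buffer while the failure is
//ERANGE, i.e. "buffer too small". May call getcwd() several times.
bool getcwd_by_retry(std::string& path)
{
  for (size_t size = 256; size <= 65536; size *= 2) //Arbitrary cap to give up at
  {
    std::string buf(size, '\0');
    if (::getcwd(&buf[0], size))
    {
      path = buf.c_str(); //Trim at the terminating null
      return(true);
    }
    if (errno != ERANGE) { break; } //A real error, not a short buffer
  }
  return(false);
}
```

Besides the repeated system calls, note it still needs an arbitrary upper bound to know when to give up, which is precisely the complaint above.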

Since a path can be longer than PATH_MAX, the define is useless, writing code based off of it is wrong, and the functions that require it are broken.

An exception to this is Windows. It doesn't allow any paths to be created larger than 260 characters. If the path was created on a partition from a different OS, Windows won't allow anything to access it. It sounds strange that such a small limit was chosen, considering that FAT has no such limit imposed, and NTFS allows paths to be 32768 characters long. I can easily imagine someone with a sizable audio collection having a 300+ character path like so:
"C:\Documents and Settings\Jonathan Ezekiel Cornflour\My Documents\My Music\My Personal Rips\2007\Technological\Operating System Symphony Orchestra\The GNOME Musical Men\I Married Her For Her File System\You Don't Appreciate Marriage Until You've Noticed Tax Pro's Wizard For Married Couples.Track 01.MP5"


Before we forget, here's the prototype for realpath:
char *realpath(const char *file_name, char *resolved_name);


Now looking at that prototype, you should immediately ask yourself: where's the size value for resolved_name? We don't want a buffer overflow! This is why OSs implement it based on the PATH_MAX define.
The resolved_name argument must refer to a buffer capable of storing at least PATH_MAX characters.

Which basically means, it can never work on a large path, and no clever OS can implement around it, unless it actually checks how much RAM is allocated on that pointer using an OS specific method - if available.

For these reasons, I've decided to implement getcwd() and realpath() myself. We'll discuss the exact specifics of realpath() next time, for now however, we will focus on how one can make their own getcwd().

The idea is to walk up the tree from the working directory, till we reach the root, along the way noting which path component we just went across.
Every modern OS has a stat() function which can take a path component and return information about it, such as when it was created, which device it is located on, and the like. All these OSs except for Windows return the fields st_dev and st_ino which together can uniquely identify any file or directory. If those two fields match the data retrieved in some other way on the same system, you can be sure they're the same file/directory.
To start, we'd determine the unique ID for . and /; once we have those, we can construct our loop. At each step, while the current ID doesn't equal the root's, we change directory to .., then scan the directory (using opendir()+readdir()+closedir()) for a component with the same ID. Once a matching ID is found, we denote that as the correct name for the current level, and move up one.

Code demonstrating this in C++ is as follows:


bool getcwd(std::string& path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  int start_fd = open(".", O_RDONLY); //Keep track of start directory, so can jump back to it later
  if (start_fd != -1)
  {
    struct stat sb;
    if (!fstat(start_fd, &sb))
    {
      file_id current_id(sb.st_dev, sb.st_ino);
      if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
      {
        std::vector<std::string> path_components;
        file_id root_id(sb.st_dev, sb.st_ino);

        while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
        {
          bool pushed = false;

          if (!chdir("..")) //Keep recursing towards root each iteration
          {
            DIR *dir = opendir(".");
            if (dir)
            {
              dirent *entry;
              while ((entry = readdir(dir))) //We loop through each entry trying to find where we came from
              {
                if ((strcmp(entry->d_name, ".") && strcmp(entry->d_name, "..") && !lstat(entry->d_name, &sb)))
                {
                  file_id child_id(sb.st_dev, sb.st_ino);
                  if (child_id == current_id) //We found where we came from, add its name to the list
                  {
                    path_components.push_back(entry->d_name);
                    pushed = true;
                    break;
                  }
                }
              }
              closedir(dir);

              if (pushed && !stat(".", &sb)) //If we have a reason to continue, we update the current dir id
              {
                current_id = file_id(sb.st_dev, sb.st_ino);
              }
            } //Else, uh oh, can't read information at this level
          }
          if (!pushed) { break; } //If we didn't obtain any info this pass, no reason to continue
        }

        if (current_id == root_id) //Unless they're equal, we failed above
        {
          //Build the path, will always end with a slash
          path = "/";
          for (std::vector<std::string>::reverse_iterator i = path_components.rbegin(); i != path_components.rend(); ++i)
          {
            path += *i+"/";
          }
          success = true;
        }
        fchdir(start_fd);
      }
    }
    close(start_fd);
  }

  return(success);
}


Before we accept that as the de facto method to use in your application, let us discuss the flaws.

As mentioned above, it doesn't work on Windows, but a simple #ifdef for Windows can just make it a wrapper around the built in getcwd() with a local buffer of size PATH_MAX, which is fine for Windows, and pretty much no other OS.

This function uses the name getcwd() which can conflict with the built in C based one which is a problem for certain compilers. The fix is to rename it, or put it in its own namespace.

Next, the built in getcwd() implementations I checked only have a trailing slash on the root directory. I personally like having the slash appended, since I'm usually concatenating a filename onto it, but note that if you're not using it for concatenation, but to pass to functions like access(), stat(), opendir(), chdir(), and the like, an OS may not like doing the call with a trailing slash. I've only noticed that being an issue with DJGPP and a few functions. So if it matters to you, the loop near the end of the function can easily be modified to not have the trailing slash, except in the case that the root directory is the entire path.

This function also changes the directory in the process, so it's not thread safe. But then again, many built in implementations aren't thread safe either. If you use threads, calculate all the paths you need prior to creating the threads. Which is probably a good idea, and keep using path names based off of your absolute directories in your program, instead of changing directories during the main execution elsewhere in the program. Otherwise, you'll have to use a mutex around the call, which is also a valid option.

There could also be the issue that some level of the path isn't readable. Which can happen on UNIX, where to enter a directory, one only needs execute permission, and not read permission. I'm not sure what one can do in that case, except maybe fall back on the built in one hoping it does some magical Kernel call to get around it. If anyone has any advice on this one, please post about it in the comments.

Lastly, this function is written in C++, which is annoying for C users. The std::vector can be replaced with a linked list keeping track of the components, and at the end, allocate the buffer size needed, and return the allocated buffer. This requires the user to free the buffer on the outside, but there really isn't any other safe way of doing this.
Alternatively, instead of a linked list, a buffer which is constantly reallocated can be used while building the path, constantly memmove()'ing the built components over to the higher part of the buffer.

During the course of the rest of the program, all path manipulation should be using safe allocation managing strings such as std::string, or should be based off of the above described auto allocating getcwd() and similar functions, and constantly handling the memory management, growing as needed. Be careful when you need to get any path information from elsewhere, as you can never be sure how large it will be.

I hope developers realize that when not on Windows, using the incorrect define PATH_MAX is just wrong, and fix their applications. Next time, we'll discuss how one can implement their own realpath().

Monday, March 26, 2007


Methods for safe string handling



Every now and then you hear about how a buffer overflow was discovered in some program. Immediately, everyone jumps on the story with their own take on the matter. Fans of a different programming language will say: "of course this wouldn't have happened if you'd have used my programming language". Secure library advocates will say: "you should have used that library instead". While experts of that language will say: "The guy was an idiot, he coded that all wrong".

I'd like to look at basic "C string" handling in C. We're talking about functions like strlen(), strcpy(), strcat(), and strcmp(). These functions report length, copy, concatenate, and compare C strings respectively. A C string is an array of characters, which is terminated with a null character to signify the end. strlen() would find the length by seeing how far from the beginning the null was. strcat() would find the null in the first C string, and then to that location copy characters from the second string, till it finds the null. So on and so forth with all the C string functions.

Now these functions are seen by some as inherently broken, since they can read/write data right off the end of the buffer, and the terminating nulls can sometimes vanish if one isn't careful. You see these kinds of issues with C++ programs too. Usually when a Java programmer hears about this, they tell you to use Java, since Java has a built in string class which can't get screwed up by any of these simple operations. Learned C++ programmers know that C++ also has a built in string class, just as good, if not better than Java's string class. Knowledgeable C++ programmers will use C++'s strings to avoid all these issues, and have really nice features, such as sane string comparison using ==.

Switching from C++ to Java, because people know that Java has a string class is kind of ridiculous. Yet I keep hearing from all kinds of people how programmers should switch to Java because of it. One using C++ should generally use the C++ class, unless they have a good reason otherwise. It's a shame most C++ programmers use C strings instead of C++ ones, probably due to them not knowing about it. However this doesn't help when programming in pure C.

For pure C, you can turn to one of the libraries for handling strings, such as SafeStr. But most people won't choose to go this route. Due to this, care has to be taken.

Now I won't kid you, if there's a buffer overflow in a C program, it is the programmer's fault. If s/he was more careful, it wouldn't have happened. However some areas of code are large, have many code paths, and are downright confusing. In those cases, it's easy to screw up. To help prevent screwing up, there exist some more C functions to handling strings properly, and some C libraries also provide extra non standard functions.

In answer to just overflowing a buffer when copying or concatenating, "n" versions are provided: strncpy() and strncat(). These take a third parameter telling them how much data they're dealing with. strncpy()'s n refers to how big the buffer is, and it'll only copy up to that many characters. strncat()'s n is how many characters may be stuck onto the end of the first string. In the case of strncpy(), if a null isn't found during the copy, the result is not null terminated, leaving us with the other problem. strncat(), on the other hand, will always null terminate, because it writes at most n characters plus a terminating null (n+1 bytes) whenever the second string's end isn't reached. Meaning that you have to tell strncat() length_of(remaining bytes in buffer)-1. This leads to confusion, because the two n's mean different things, and because strncpy() introduces another problem.
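The strncpy() termination pitfall is easy to demonstrate. In this small sketch, strncpy_terminated() is a made-up helper that copies into a fixed 4 byte buffer and reports whether any null landed in it:

```cpp
#include <cstring>

//Copies src into a fixed 4 byte buffer with strncpy(), then reports
//whether the result actually contains a terminating null anywhere.
bool strncpy_terminated(const char *src)
{
  char buf[4];
  std::strncpy(buf, src, sizeof(buf));
  return(std::memchr(buf, 0, sizeof(buf)) != 0);
}
```

strncpy_terminated("hi") is true, since the remainder of the buffer is null padded, but strncpy_terminated("overflow") is false: all four bytes are data, so you must follow up with buf[sizeof(buf)-1] = 0 before treating it as a C string.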

There's also strncmp(), for specifying the maximum amount of characters to compare, which you can use if one of the strings isn't null terminated. Surprisingly however, there is no strnlen(), to check how many characters a string is, without running off the end. Considering that strncpy() doesn't always null terminate, sounds like a useful feature to have.

Taking this into account, in some C libraries you'll find strnlen(), which returns the length, or the value of n in the case where no null was found. For those needing it, it's an easy function to implement yourself:
size_t strnlen(const char *s, size_t n)
{
  const char *p = (const char *)memchr(s, 0, n);
  return(p ? p-s : n);
}


Although it would be intelligent to follow up every call to "strncpy(s, n)" with a "s[n-1] = 0" to terminate it yourself. But this hardly helps the confusion. Also take into account that str[n][cpy/cat]() return the destination for their return value, so you'll sometimes see code like:
if (!strcmp(strncpy(buf, entered_text, sizeof(buf)), param))
{
  do_something();
}

However, this code is broken as buf may not be null terminated. A correct version would perhaps be:
if (!strncmp(strncpy(buf, entered_text, sizeof(buf)), param, min(sizeof(param), sizeof(buf))))
{
  do_something();
}

Which is ugly at best.

To solve and work around these issues, OpenBSD invented the strlcpy() and strlcat() functions, which have since been adopted by the other BSDs, as well as Solaris and Mac OS X. Manpage here.

Although I found the standard descriptions confusing at best. Here's my take on it after some study:
size_t strlcpy(char *dest, const char *src, size_t size);

Description:
    The strlcpy() function copies the C string pointed to by src (including the null) to
the array pointed to by dest. However, not more than size bytes of src are copied. Meaning
at most size-1 characters will be copied. The copy will always be null terminated, unless
size was 0, in which case nothing is done to dest.

Return Value:
    Return is always the amount of characters needed to hold the copy. Meaning strlen(src).
If the return value is <size, everything was copied.


With strlcpy(), you can run it once with a size of 0 if you want to find out how much you need to allocate, although that's pointless; you'd be better off with strlen(). Now this won't help much if src isn't null terminated, but it should avoid the issues you get from misusing the return value as in the case offered above, or when there wasn't enough room. If you always pass strlcpy() the sizeof() the buffer, or the value passed to malloc() as the case may be, you should be safe.
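As a sketch of that usage pattern (strlcpy() isn't available in every C library, so my_strlcpy below is a minimal stand-in based on the description above, not OpenBSD's actual code):

```cpp
#include <cstring>

//Minimal strlcpy() along the lines described above; prefer the system
//one where it exists. Always reports the full source length.
size_t my_strlcpy(char *dest, const char *src, size_t size)
{
  size_t needed = std::strlen(src);
  if (size)
  {
    size_t n = (needed < size-1) ? needed : size-1; //Leave room for the null
    std::memcpy(dest, src, n);
    dest[n] = 0; //Unlike strncpy(), always terminate when size is nonzero
  }
  return(needed);
}
```

The caller's truncation check is then simply: if (my_strlcpy(buf, s, sizeof(buf)) >= sizeof(buf)) { /* didn't fully fit */ }.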

If you read the manpage, you also see a usefulness to the return value. A problem with constantly using strcat() is that you have to keep iterating through the former strings, leading to a speed loss. With strlcpy(), you can do the following for concatenation:

if ((n1 = strlcpy(a, b, sizeof(a))) < sizeof(a))
{
  if ((n2 = strlcpy(a+n1, c, sizeof(a)-n1)) < sizeof(a)-n1)
  {
    if ((n3 = strlcpy(a+n1+n2, d, sizeof(a)-n1-n2)) < sizeof(a)-n1-n2)
    {
      etc...
    }
  }
}

A nice trick for mass concatenation, although as the manpage points out, ridiculous, and it negates strlcat().

Moving onwards, the manpage listed above for strlcpy() is also for strlcat(), yet as above, I found it a bit confusing too. Here's my take:

size_t strlcat(char *dest, const char *src, size_t size);

Description:
    The strlcat() function appends the src string to the dest string overwriting the ‘\0’
character at the end of dest, and then adds a terminating ‘\0’ character. However, not more
than size-strlen(dest) bytes of src are copied. Meaning a maximum of size-1 characters will
fill dest in the end. The copy will always be null terminated, unless size was less than the
length of dest, or dest is not null terminated, in which case nothing is done to dest.

Return Value:
    Return is the amount of characters needed to hold the copy when dest initially is null
terminated and its length is less than size. Otherwise the return is size+strlen(src). If
the return value is <size, everything was copied.


What's nice about strlcat() is that for the size param, you can pass it the sizeof() or the malloc() value like you do for strlcpy(). But beware the return value, the OpenBSD code is rightly commented as follows: "Returns strlen(src) + min(siz, strlen(initial dst))". Take a moment to comprehend that.

If you're not using a BSD and you want these functions, code is here and here. Be wary of some of the other implementations you find online. I looked at some of them, and they acted differently in some other corner cases. One I looked at even crashed in one of the corner cases.

However looking at that code there, it looks a bit messy. Reviewing our previous multiple concatenation case, which is also spoken about in the manpage, one sees these as a bit weak. If one wants nice multi concat without too much fuss, they'd normally use snprintf() (C99) with a bunch of "%s%s%s" as the format. I myself though prefer a more elegant solution to all of this.
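For comparison, the snprintf() approach just mentioned can be sketched like so (concat3 is a made-up helper for three strings); its return value is the length that would have been written, so a value >= size signals truncation:

```cpp
#include <cstdio>
#include <cstring>

//Multi-concatenation via snprintf(); always null terminates (size > 0),
//and returns the full length needed, so result >= size means truncation.
int concat3(char *dest, size_t size, const char *a, const char *b, const char *c)
{
  return(std::snprintf(dest, size, "%s%s%s", a, b, c));
}
```

This works, but the format string has to match the argument count, which is exactly the inflexibility a variadic merge function avoids.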

I have therefore created the following logical extension of OpenBSD's "l" functions, I give you strlmrg():
size_t strlmrg(char *dest, size_t size, ...)
{
  char *s, *end = dest + (size-1);
  size_t needed = 0;

  va_list ap;
  va_start(ap, size);

  while ((s = va_arg(ap, char *)))
  {
    if (s == dest)
    {
      size_t n = strnlen(s, (end+1)-s);
      needed += n;
      dest += n;
    }
    else
    {
      needed += strlen(s);
      if (dest && (dest < end))
      {
        while (*s && (dest < end))
        {
          *dest++ = *s++;
        }
        *dest = 0;
      }
    }
  }

  va_end(ap);
  return(needed);
}

Pass strlmrg() the destination buffer, its size (from sizeof() or the param to malloc()), and all the strings you want concatenated, followed by a null pointer.
Example 1:
printf("%zu; %s\n", strlmrg(line, sizeof(line), "I ", "Went ", "To ", "The ", "Park.", NULL), line);

It would print: "19; I Went To The Park."
Example 2:
n = strlmrg(buffer, sizeof(buffer), a, b, c, d, e, f, (void *)0);

Which would concatenate a to f inside buffer (given that it could fit), and return the amount of characters copied. Note, it returns how many characters would be copied, so you can use it to determine the size. See this example:
size_t n = strlmrg(0, 0, a, b, c, (void *)0);
char *p = malloc(n+1); //+1 for the null
strlmrg(p, n+1, a, b, c, (void *)0); //Again, +1 for the null

When strlmrg() returns less than size, everything was merged in. The result is always null terminated, except when dest is null, size is 0, or one of the source pointers matches the location it is currently trying to copy to.
You should avoid passing a source pointer that points into the destination buffer. If you do pass such an overlapping source pointer, and it isn't null terminated before reaching size, you will get size as the return value instead of the full length. Also, don't pass any non null terminated source pointer, and don't forget the trailing null pointer.

Once we have strlmrg() implemented, it also paves the way for a simple and straightforward implementation for strlcpy() and strlcat().
size_t strlcpy(char *dest, const char *src, size_t size)
{
  return(strlmrg(dest, size, src, (void *)0));
}

size_t strlcat(char *dest, const char *src, size_t size)
{
  return(strlmrg(dest, size, dest, src, (void *)0));
}
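As an aside, on systems that lack strlcpy() entirely, snprintf() can stand in for it, since it shares the truncate-but-report-full-length semantics. This is only a sketch: unlike the strlmrg()-based version above, it assumes dest and src are non-null, and my_strlcpy is my own name for it:

```c
#include <stdio.h>

/* A strlcpy()-style wrapper over snprintf(): copies at most size-1
   characters, always null terminates when size > 0, and returns the
   full length of src so the caller can detect truncation.
   Assumes dest and src are non-null. */
size_t my_strlcpy(char *dest, const char *src, size_t size)
{
  int n = snprintf(dest, size, "%s", src);
  return (n < 0) ? 0 : (size_t)n;
}
```

For example, my_strlcpy(buf, "Hello, world", 8) leaves "Hello, " in buf and returns 12, signaling that the source needed more room.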

And unlike the official ones, these won't crash if dest or src is null. I tested these wrappers, and they matched the official results in every regular and edge case I tried.

I also tested strlmrg() in a variety of cases, and it seems to be very good and secure. If you find a bug, or have an improvement to offer, feel free to post about it.

Thoughts?

Saturday, March 17, 2007


What to do about gets()



A friend of mine recently asked me: if I were to write my own C library, how would I implement gets()?
An interesting question indeed.

When I first learned C quite a few years back, I was reading a book to find out what I/O functions were available. I saw gets() and it looked like the function to use to receive a line of input, but looking at the prototype:

char *gets(char *s);

I wondered where one specified the length of the buffer s. Looking further in the chapter, I found the usable function fgets(), which seemed fine, except that I had to explicitly tell it to use stdin, which seemed unnecessary, but no big deal.

Some time after that, I read about how the 1988 Morris worm spread around the world by overflowing a gets() buffer in the UNIX finger daemon, and my first impression was: "Who in their right mind would use gets()?".

More recently, when I was teaching my C class at the local university, we reached the point where I felt we should cover various forms of user I/O. I figured that after explaining a couple of things, I would teach puts(), fputs(), gets(), and fgets(), and explain why one should never use gets(). As soon as I wrote the gets() prototype on the board and said it was used for input, two students immediately shouted out: "But how does one pass the buffer length?".

So I went on to explain that they were right, told them about the old finger worm, and how it was obvious to me too when I first saw gets(). The class wondered: who could blunder so badly? And indeed, I continue to wonder why it isn't obvious to everyone that the function is broken. Even students just learning to program saw it immediately.

Thankfully, the manpage for gets() today shows:

Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.


Interestingly though, when the C99 standard came out, the committee decided to leave gets() in, reasoning that it was already there, or that it's fine for a quick test program, or some such nonsense. Yet if I were to code my own library, I wouldn't want to break the standard either. So I wouldn't remove it, nor would I alter the prototype to take a size parameter. So what would I do?

Then the solution hit me. The return value is specified like so:

gets() returns s on success, and NULL on error or when end of file occurs while no characters have been read.

So now I proudly present to you my implementation of gets():

char *gets(char *s)
{
  return NULL;
}

Very simply, it just always returns an error, so it's standards compliant AND secure.

Of course someone is going to wonder: what about existing programs that don't check the return value? Those I must simply regard as broken, both for using gets() in the first place, and for calling a function that can fail without ever checking whether it did.

Now this may be annoying at first to beginners who use gets() and wonder why it always fails, but I figure this is a good way to exorcise a broken function from people's systems, and to teach beginners to think clearly when choosing their building blocks.
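And for anyone migrating off gets(), the fix the manpage recommends is easily wrapped into a small bounded reader. Here's a sketch built on fgets(); read_line is my own name for it, not a standard function:

```c
#include <stdio.h>
#include <string.h>

/* A bounded replacement for gets(), built on fgets(): reads at most
   size-1 characters into s, strips the trailing newline if one was
   read, and returns s on success, or NULL on end of file or error.
   Unlike gets(), it can never write past the end of the buffer. */
char *read_line(char *s, size_t size, FILE *fp)
{
  if (!fgets(s, (int)size, fp))
  {
    return NULL;
  }
  s[strcspn(s, "\n")] = 0; /* drop the newline, if any */
  return s;
}
```

Calling read_line(buf, sizeof(buf), stdin) gives the bounded behavior gets() never could, and its return value makes the error case just as easy to check.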