Insane Coding: Windows

Showing posts with label Windows. Show all posts

Monday, June 30, 2014

Memory management in C and auto allocating sprintf() - asprintf()

Posted by insane coder at Monday, June 30, 2014

Memory Management

Memory management in C is viewed by some to be quite tricky. One needs to work with pointers that can point anywhere in memory, and if misused, cause a program to misbehave, or worse.

The basic functions to allocate and deallocate memory in C are malloc() and free() respectively. The malloc() function takes a size in bytes of how much to allocate, and returns a pointer to the allocated memory to be used. Upon failure, a null pointer, which can be thought of as a pointer pointing to 0 is returned. The free() function takes the pointer returned by malloc(), and deallocates the memory, or frees up the memory it was once pointing to.

To work with malloc() in simple situations, typically, code along the following lines is used:

void *p = malloc(size);
if (p)
{
... work with p ...
free(p);
}
else
{
... handle error scenario ...
}

Unfortunately many experienced programmers forget to handle the failure scenario. I've even heard some say they purposely don't, as they have no clue how to proceed, and just letting the program crash is good enough for them. If you meet someone who makes that argument, revoke their programming license. We don't need such near sighted idiots writing libraries.

In any case, even the above can lead to some misuse. After this block of code runs, what is p now pointing to?

After the above code runs, in the case that malloc() succeeded, p is now pointing to memory in middle of nowhere, and can't be used. This is known as a dangling pointer. Dangling pointers can be dangerous, as an if clause as above will think the pointer is valid, and may do something with it, or lead to the infamous use after free bug. This becomes more likely to occur as the situation becomes more complicated and there are loops involved, and how malloc() and free() interact can take multiple paths.

Pointers involved with memory management should always be pointing at 0 or at allocated memory. Anything else is just asking for trouble. Therefore, I deem any direct use of free() dangerous, as it doesn't set the pointer to 0.

So if free() is considered harmful, what should one use?

In C++, I recommend the following:

static inline void insane_free(void *&p)
{
free(p);
p = 0;
}

This insane_free() is now a drop in replacement for free(), and can be used instead. (Since C++ programs normally use new and delete/delete[] instead, I leave it as an exercise to the reader how to work with those.)

However, C doesn't support direct references. One can pass a pointer by a pointer to accomplish similar results, but that becomes clunky and is not a drop in replacement. So in C, I recommend the following:

#define insane_free(p) { free(p); p = 0; }

It makes use of the preprocessor, so some may consider it messy, but it can be used wherever free() currently is. One could also name the macro free in order to automatically replace existing code, but it's best not to program that way, as you begin to rely on these semantics. This in turn means someone copying your code may think a call to free() is the normal free() and not realize something special is occurring when they copy it elsewhere without the macro.

Correct usage in simple cases is then:

void *p = malloc(size);
if (p)
{
... work with p ...
insane_free(p);
}
else
{
... handle error scenario ...
}

If you think using a wrapper macro or function is overkill, and just always manually assigning the pointer to 0 after freeing is the way to go, consider that it's unwieldy to constantly do so, and you may forget to. If the above technique was always used, all use after free bugs would never have occurred in the first place.

Something else to be aware of is that there's nothing wrong with calling free(0). However, calling free() upon a pointer which is not null and not pointing to allocated memory is forbidden and will crash your program. So stick to the advice here, and you may just find memory management became significantly easier.

If all this talk of pointers is beyond you, consider acquiring Understanding and Using C Pointers.

sprintf() and asprintf()

If you do a lot of C programming, at some point, you probably have used the sprintf() function, or its safer counterpart snprintf().

These two functions sprintf() and snprintf() act like printf(), but instead of printing to the standard output, they print to a fixed-length string buffer. Now a fixed-length string buffer is great and all, but what if you wanted something which automatically allocated the amount it needed?

Enter asprintf(), a function which acts like sprintf(), but is auto-allocating, and needs no buffer supplied. This function was invented ages ago by GLIBC, shortly thereafter copied to the modern BSDs, and found its way further still into all sorts of libraries, although is not yet ubiquitous.

Let's compare the prototypes of the two:

int sprintf(char *buffer, const char *format, ...);

int asprintf(char **ret, const char *format, ...);

The sanest approach would have been for a function like asprintf() to have the prototype of:

char *asprintf(const char *format, ...);

But its creators wanted to make it act like sprintf(), and its design can also be potentially more useful.

Instead of passing asprintf() a buffer, a pointer to a variable of type char * needs to be passed, like so:

char *buffer;
asprintf(&buffer, ...whatever...);

Now how asprintf() actually works is no big secret. The C99 standard specified that snprintf() upon failure should return the amount of characters that would be needed to contain its output. Which means that conceptually something along the following lines would be all that asprintf() needs to do:

char *buffer = malloc(snprintf(0, 0, format, data...)+1);
sprintf(buffer, format, data...);

Of course though, the above taken verbatim would be incorrect, because it mistakenly assumes that nothing can go wrong, such as the malloc() or snprintf() failing.

First let's better understand what the *printf() functions return. Upon success, they return the amount of characters written to the string (which does not include the trailing null byte). Or in other words, the return value is equivalent to calling strlen() on the data being output, which can save you needing to use a strlen() call with sprintf() or similar functions for certain scenarios. Upon failure, for whatever reason, the return is -1. Of course there's the above mentioned exception to this with snprintf(), where the amount of characters needed to contain the output would be returned instead. If during the output, the size overflows (exceeds INT_MAX), many implementations will return a large negative value (failure with snprintf(), or success with all the functions).

Like the other functions, asprintf() also returns an integer of the nature described above. Which means working with asprintf() should go something like this:

char *buffer;
if (asprintf(&buffer, ...whatever...) != -1)
{
do_whatever(buffer);
insane_free(buffer);
}

However, unlike the other functions, asprintf() has a second return value, its first argument, or what the function sees as *ret. To comply with the memory management discussion above, this should also be set to 0 upon failure. Unfortunately, many popular implementations, including those in GLIBC and MinGW fail to do so.

Since I develop with the above systems, and I'm using asprintf() in loops with multiple paths, it becomes unwieldy to need to pass around the buffer and the returned int, so I'd of course want saner semantics which don't leave dangling pointers in my program.

In order to correct such mistakes, I would need to take code from elsewhere, or whip up my own function. Now I find developing functions such as these to be relatively simple, but even so, I always go to check other implementations to see if there's any important points I'm missing before I go implement one. Maybe, I'll even find one which meets my standards with a decent license which I can just copy verbatim.

In researching this, to my shock and horror, I came across implementations which properly ensure that *ret is set to 0 upon failure, but the returned int may be undefined in certain cases. That some of the most popular implementations get one half wrong, and that some of the less popular get the other half wrong is just downright terrifying. This means that there isn't any necessarily portable way to check for failure with the different implementations. I certainly was not expecting that, but with the amount of horrible code out there, I guess I really shouldn't be surprised anymore.

Also in the course of research, besides finding many implementations taking a non-portable approach, many have problems in all sorts of edge cases. Such as mishandling overflow, or not realizing that two successive calls to a *printf() function with the same data may not necessarily yield the same results. Some try to calculate the length needed with some extra logic and only call sprintf() once, but this logic may not be portable, or always needs updating as new types are added to the format string as standards progress, or the C library decided to offer new features. Some of the mistakes I found seem to be due to expecting a certain implementation of underlying functions, and then later the underlying functions were altered, or the code was copied verbatim to another library, without noting the underlying functions acted differently.

So, once again, I'm finding myself needing to supply the world with more usable implementations.

Let's dive into how to implement asprinf().

Every one of these kind of functions actually has two variants, the regular which takes an unlimited amount of arguments, and the v variants which take a va_list (defined in stdarg.h) after the format argument instead. These va_lists are what ... gets turned into after use, and in fact, every non-v *printf() function is actually wrapped to a counterpart v*printf() function. This makes implementing asprintf() itself quite straight forward:

To fix the aforementioned return problems, one could also easily throw in here a check upon the correct return variable used in the underlying vasprintf() implementation and use it to set the other. However, that's not a very portable fix, and the underlying implementation of vasprintf() can have other issues as described above.

A straight forward implementation of vasprintf() would be:

As long as you have a proper C99 implementation of stdarg.h and vsnprintf(), you should be good to go. However, some systems may have vsnprintf() but not va_copy(). The va_copy() macro is needed because a va_list may not just be a simple object, but have handles to elsewhere, and a deep copy is needed. Since vsnprintf() being passed the original va_list may modify its contents, a copy is needed because the function is called twice.

Microsoft Visual C++ (MSVC, or Microsoft Vs. C++ as I like to think of it) up until the latest versions has utterly lacked va_copy(). This and several other systems that lack it though usually have simple va_lists that can be shallow copied. To gain compatibility with them, simply employ:

#ifndef va_copy

#define va_copy(dest, src) dest = src

#endif

Be warned though that if your system lacks va_copy(), and a deep copy is required, using the above is a recipe for disaster.

Once we're dealing with systems where shallow copy works though, the following works just as well, as vsnprintf() will be copying the va_list it receives and won't be modifying other data.

Before we go further, there's two points I'd like to make.

Some implementations of vsnprintf() are wrong, and always return -1 upon failure, not the size that would've been needed. On such systems, another approach will need to be taken to calculate the length required, and the implementations here of vasprintf() (and by extension asprintf()) will just always return -1 and *ret (or *strp) will be 0.
The code if ((r < 0) || (r > size)) could instead be if (r != size), more on that later.

Now on Windows, vsnprintf() always returns -1 upon failure, in violation of the C99 standard. However, in violation of Microsoft's own specifications, and undocumented, I found that vsnprintf() with the first two parameters being passed 0 as in the above code actually works correctly. It's only when you're passing data there that the Windows implementation violates the spec. But in any case, relying on undocumented behavior is never a good idea.

On certain versions of MinGW, if __USE_MINGW_ANSI_STDIO is defined before stdio.h is included, it'll cause the broken Windows *printf() functions to be replaced with C99 standards compliant ones.

In any case though, Windows actually provides a different function to retrieve the needed length, _vscprintf(). A simple implementation using it would be:

This however makes the mistake of assuming that vsnprintf() is implemented incorrectly as it currently is with MSVC. Meaning this will break if Microsoft ever fixes the function, or you're using MinGW with __USE_MINGW_ANSI_STDIO. So better to use:

Lastly, let me return to that second point from earlier. The vsnprintf() function call the second time may fail because the system ran out of memory to perform its activities once the call to malloc() succeeds, or something else happens on the system to cause it to fail. But also, in a multi-threaded program, the various arguments being passed could have their data change between the two calls.

Now if you're calling functions while another thread is modifying the same variables you're passing to said function, you're just asking for trouble. Personally, I think that all the final check should do is ensure that r is equal to size, and if not, something went wrong, free the data (with insane_free() of course), and set r to -1. However, any value between 0 and size (inclusive), even when not equal to size means the call succeeded for some definition of success, which the above implementations all allow for (except where not possible Microsoft). Based on this idea, several of the implementations I looked at constantly loop while vsnprintf() continues to indicate that the buffer needs to be larger. Therefore, I'll provide such an implementation as well:

Like the first implementation, if all you lacked was va_copy(), and shallow copy is fine, it's easy to get this to work on your platform as described above. But if vsnprintf() isn't implemented correctly (hello MSVC), this will always fail.

All the code here including the appropriate headers, along with notes and usage examples are all packed up and ready to use on my asprintf() implementation website. Between everything offered, you should hopefully find something that works well for you, and is better than what your platform provides, or alternative junk out there.

As always, I'm only human, so if you found any bugs, please inform me.

Wednesday, December 14, 2011

Progression and Regression of Desktop User Interfaces

Posted by insane coder at Wednesday, December 14, 2011

As this Gregorian year comes to a close, with various new interfaces out now, and some new ones on the horizon, I decided to recap my personal experiences with user interfaces on the desktop, and see what the future will bring.

When I was younger, there were a few desktop environments floating around, and I've seen a couple of them at school or a friend's house. But the first one I had on my own personal computer, and really played around with was Windows 3.

Windows 3 came with something called Program Manager. Here's what it looked like:

The basic idea was that you had a "start screen", where you had your various applications grouped by their general category. Within each of these "groups", you had shortcuts to the specific applications. Now certain apps like a calculator you only used in a small window, but most serious apps were only used maximized. If you wanted to switch to another running app, you either pressed the alt+tab keyboard shortcut to cycle through them, or you minimized everything, where you then saw a screen listing all the currently running applications.

Microsoft also shipped a very nice file manager, known as "File Manager", which alone made it useful to use Windows. It was rather primitive though, and various companies released various add-ons for it to greatly enhance its abilities. I particularly loved Wiz File Manager Pro.

Various companies also made add-ons for Program Manager, such as to allow nested groups, or shortcuts directly on the main screen outside of a group, or the ability to change the icons of selected groups. Microsoft would've probably built some of these add-ons in if it continued development of Program Manager.

Now not everyone used Windows all the time back then, but only selectively started it up when they wanted something from it. I personally did everything in DOS unless I wanted to use a particular Windows app, such as file management or painting, or copying and pasting stuff around. Using Windows all the time could be annoying as it slowed down some DOS apps, or made some of them not start at all due to lack of memory and other issues.

In the summer of 1995, Microsoft released Windows 4 to the world. It came bundled with MS-DOS 7, and provided a whole new user experience.

Now the back-most window, the "desktop", no longer contained a list of the running programs (in explorer.exe mode), but rather you could put your own shortcuts and groups and groups of groups there. Rather running programs would appear at the bottom of the screen in a "taskbar". The taskbar now also contained a clock, and a "start menu", to launch applications. Some always running applications which were meant to be background tasks appeared as a tiny icon next to the clock in an area known as a "tray".

This was a major step forward in usability. Now, no matter which application you were currently using, you could see all the ones that were currently running on the edge of your screen. You could also easily click on one of them to switch to it. You didn't need to minimize all to see them anymore. Thanks to the start menu, you could also easily launch all your existing programs without needing to minimize all back down to Program Manager. The clock always being visible was also a nice touch.

Now when this came out, I could appreciate these improvements, but at the same time, I also hated it. A lot of us were using 640x480 resolutions on 14" screens back then. Having something steal screen space was extremely annoying. Also with how little memory systems had back at the time (4 or 8 MB of RAM), you generally weren't running more than 2 or 3 applications at a time and could not really appreciate the benefits of having an always present taskbar. Some people played with taskbar auto-hide because of this.

The start menu was also a tad ridiculous. Lots of clicks were needed to get anywhere. The default appearance also had too many useless things on it.

Did anyone actually use help? Did most people launch things from documents? Microsoft released a nice collection of utilities called "PowerToys" which contained "TweakUI" which you could use to make things work closer to how you want.

The default group programs installed to within the start menu was quite messy though. Software installers would not organize their shortcuts into the groups that Windows came with, but each installed their apps into their own unique group. Having 50 submenus pop out was rather unwieldy, and I personally organized each app after I installed it. Grouping into "System Tools", "Games", "Internet Related Applications", and so on. It was annoying to manually do all this though, as when removing an app, you had to remove its group manually. On upgrades, one would also have to readjust things each time too.

Windows 4 also came with the well known Windows Explorer file manager to replace the older one. It was across the board better than the vanilla version of File Manager that shipped with Windows 3.

I personally started dual booting MS-DOS 6.22 + Windows for Workgroups 3.11 and Windows 95 (and later tri-booted with OS/2 Warp). Basically I used DOS and Windows for pretty much everything, and Windows 95 for those apps that required it. Although I managed to get most apps to work with Windows 3 using Win32s.

As I got a larger screen and more RAM though, I finally started to appreciate what Windows 4 offered, and started to use it almost exclusively. I still exited Windows into DOS 7 though for games that needed to use more RAM, or ran quicker that way on our dinky processors from back then.

Then Windows 4.10 / Internet Explorer 4 came out which offered a couple of improvements. First was "quick launch" which allowed you to put shortcuts directly on your taskbar. You could also make more than one taskbar and put all your shortcuts on it. I personally loved this feature, I put one taskbar on top of my screen, and loaded it with shortcuts to all my common applications, and had one on the bottom for classical use. Now I only had to dive into the start menu for applications I rarely used.

It also offered a feature called "active desktop" which made the background of the desktop not just an image, but a web page. I initially loved the feature, as I edited my own page, and stuck in an input line which I would use to launch a web browser to my favorite search engine at the time (which changed monthly) with my text already searched for. After a while active desktop got annoying though, as every time IE crashed, it disabled it, and you had to deal with extra error messages, and go turn it on manually.

By default this new version also made every Windows Explorer window have this huge sidebar stealing your precious screen space. Thankfully though, you could turn it off.

All in all though, as our CPUs got faster, RAM became cheaper, and large screens more available, this interface was simply fantastic. I stopped booting into other OSs, or exiting Windows itself.

Then Windows 5 came out for consumers, and UI wise, there weren't really any significant changes. The default look used these oversized bars and buttons on each window, but one could easily turn that off. The start menu got a bit of a rearrangement to now feature your most used programs up front, and various actions got pushed off to the side. Since I already put my most used programs on my quick launch on top, this start menu was a complete waste of space. Thankfully, it could also be turned off. I still kept using Windows 98 though, as I didn't see any need for this new Windows XP, and it was just a memory hog in comparison at the time.

What was more interesting to me however was that at work, all our machines ran Linux with GNOME and KDE. When I first started working there, they made me take a course on how to use Emacs, as every programmer needs a text editor. I was greatly annoyed by the thing however, where was my shift highlight with shift+delete and shift+insert or ctrl+x and ctrl+v cut and paste? Thankfully though I soon found gedit and kedit which was like edit/notepad/wordpad but for Linux.

Now I personally don't use a lot of windowsy type software often. My primary usage of a desktop consists of using a text editor, calculator, file manager, console/terminal/prompt, hex editor, paint, cd/dvd burner, and web browser. Only rarely do I launch anything else. Perhaps a PDF, CHM, or DJVU reader when I need to read something.

After using Linux with GNOME/KDE at work for a while, I started to bring more of my work home with me and wanted these things installed on my own personal computer. So dual booting Windows 98 + Linux was the way to go. I started trying to tweak my desktop a bit, and found that KDE was way more capable than GNOME, as were most of their similar apps that I was using. KDE basically offered me everything that I loved about Windows 98, but on steroids. KWrite/KATE was notepad/wordpad but on steroids. The syntax highlighting was absolutely superb. KCalc was a fine calc.exe replacement. Konqueror made Windows Explorer seem like a joke in comparison.

Konqueror offered network transparency, thumbnails of everything rather quickly, even of text files (no pesky thumbs.db either!). An info list view which was downright amazing:

This is a must have feature here. Too often are documents distributed under meaningless file names. With this, and many other small features, nothing else I've seen has even come close to Konqueror in terms of file management.

Konsole was just like my handy DOS Prompt, except with tab support, and better maximizing, and copy and paste support. KHexEdit was simply better than anything I had for free on Windows. KolourPaint is mspaint.exe with support for way more image formats. K3b was also head and shoulders above Easy CD Creator or Nero Burning ROM. For Web Browsers, I was already using Mozilla anyway on Windows, and had the same option on Linux too.

For the basic UI, not only did KDE have everything I liked about Windows, it came with an organized start menu. Which also popped out directly, instead from a "programs" submenu.

The taskbar was also enhanced that I could stick various applets on it. I could stick volume control directly on it. Not a button which popped out a slider, the sliders themselves could appear on the taskbar. Now for example, I could easily adjust my microphone volume directly, without popping up or clicking on anything extra. There was an eyedropper which you could just push to find out the HTML color of anything appearing on the screen - great for web developers. Another thing which I absolutely love, I can see all my removable devices listed directly on my taskbar. If I have a disk in my drive, I see an icon for that drive appearing directly on my taskbar, and I can browse it, burn to it, eject it, whatever. With this, everything I need is basically at my finger tips.

Before long I found myself using Linux/KDE all the time. On newer machines I started to dual boot Windows XP with Linux/KDE, so I could play the occasional Windows game when I wanted to, but for real work, I'd be using Linux.

Then KDE 4 comes out, and basically half the stuff I loved about KDE was removed. No longer is it Windows on steroids. Now KDE 4.0 was not intended for public consumption. Somehow all the distros except for Debian seemed to miss that. Everyone wants to blame KDE for miscommunicating this, but it's quite clear 4.0 was only for developers if you watched the KDE 4 release videos. Any responsible package maintainer with a brain in their skull should've also realized that 4.0 was not ready for prime time. Yet it seems the people managing most distros are idiots that just need the "latest" version of everything, ignoring if it's better or not, or even stable.

At the time, all these users were upset, and all started switching to GNOME. I don't know why anyone who truly loved the power KDE gave you would do that. If KDE 3 > GNOME 2 > KDE 4, why did people migrate to GNOME 2 when they could just not "upgrade" from KDE 3? It seems to me that people never really appreciated what KDE offers in the first place if they bothered moving to GNOME instead of keeping what was awesome.

Nowadays people tell me that KDE 4 has feature parity with KDE 3, but I have no idea what they're talking about. The Konqueror info list feature that I described above still doesn't seem to exist in KDE 4. You can no longer have applets directly on your taskbar. Now I have to click a button to pop up a list of my devices, and only then can I do something with them. No way to quickly glance to see what is currently plugged in. Konsole's tabs now stretch to the full width of your screen for some pointless reason. If you want to switch between tabs with your mouse, prepare for carpal tunnel syndrome. Who thought that icons should grow if they can? It's exactly like those idiotic theoretical professors who spout that CPU usage must be maximized at all times, and therefore use algorithms that schedule things for maximum power draw, despite that in normal cases performance does not improve by using these algorithms. I'd rather pack in more data if possible, having multiple columns of information instead of just huge icons.

KHexEdit has also been completely destroyed. No longer is the column count locked to hex (16). I can't imagine anyone who seriously uses a hex editor designed the new version. For some reason, KDE now also has to act like your mouse only has one button, and right click context menus vanished from all over the place. It's like the entire KDE development team got invaded by Mac and GNOME users who are too stupid to deal with anything other than a big button in the middle of the screen.

Over in the Windows world. Windows 6 (with 6.1 comically being consumerized as "Windows 7") came out with a bit of a revamp. The new start menu seems to fail basic quality assurance tests for anything other than default settings. Try to set things to offer a classic start menu, this is what you get:

If you use extra large icons for your grandparents, you also find that half of the Control Panel is now inaccessible. Ever since Bill Gates left, it seems Windows is going down the drain.

But hardly are problems solely for KDE and Windows, GNOME 3 is also a major step back according to what most people tell me. Many of these users are now migrating to XFCE. If you like GNOME 2, why are you migrating to something else for? And what is it with people trying to fix what isn't broken? If you want to offer an alternate interface, great, but why break or remove the one you already have?

Now a new version of Windows is coming out with a new interface being called "Metro". They should really be calling it "Retro". It's Windows 3 Program Manager with a bunch of those third party add-ons, with a more modern look to it. Gone is the Windows 4+ taskbar so you can see what was running, and easily switch applications via mouse. Now you'll need to press the equivalent of a minimize all to get to the actual desktop. Another type of minimize to get back to Program Manager to launch something else, or "start screen" as they're now calling it.

So say goodbye to all the usability and productivity advantages Windows 4 offered us, they want to take us back to the "dark ages" of modern computing. Sure a taskbar-less interface makes sense on handheld devices with tiny screens or low resolution, but on modern 19"+ screens? The old Windows desktop+taskbar in the upcoming version of Windows is now just another app in their Metro UI. So "Metro" apps won't appear on the classic taskbar, and "classic" applications won't appear on the Metro desktop where running applications are listed.

I'm amazed at how self destructive the entire market became over the last few years. I'm not even sure who to blame, but someone has to do something about it. It's nice to see that a small group of people took KDE 3.5, and are continuing to develop it, but they're rebranding everything with crosses everywhere and calling it "Trinity" desktop. Just what we need, to bring religious issues now into desktop environments. What next? The political desktop?

Saturday, October 30, 2010

This just in, 20% of enterprises and most IT people are idiots

Posted by insane coder at Saturday, October 30, 2010

So, who still runs Internet Explorer 6? I do, because sometimes, I'm a web developer.
Along with IE 6, I also run later versions of IE, as well as Firefox, Opera, Chrome, Safari, Arora, Maxthon, and Konqueror. So do my web developer friends and coworkers.

The reason why we do is simple. We want to test our products with every browser with any sort of popularity. Or is a browser that comes with some sort of OS or environment with any sort of popularity. Same goes for browsers with a specific engine.

By playing with so many browsers, we get a feel for things which seem to not be well known (even when documented well), or completely missed by the "pros" who write the most noise on the subject at hand. After work, sometimes my coworkers and I like to get together and joke about how Google security researchers put out a security memo on IE, citing no recourse, while the solution is clearly documented on MSDN, or any similar scenario.

Perhaps we're bad guys for not writing lengthy articles on every subject, and keeping knowledge to ourselves. On the other hand, we get to laugh at the general population on a regular basis. Something which I'm sure every geek at heart revels in.

Here's a small sampling of popular beliefs that we snicker at:

Internet Explorer 6 can't properly display transparent PNGs without resorting to fancy CSS+JS hacks.

JavaScript event handling code can't receive the event handle with IE.

Menus are required to be written in JavaScript to work with all browsers.

IE is unable to receive multiple cookies in a single Set-Cookie field.

Cookies is the (only) proper way to store HTTP state.

IE is not the most advanced browser when it comes to text and typography (because it is, by far).

SSL/TLS can never be used with multiple virtual hosts.

Common examples of how to write proper cross browser code is accurate. (Such as common methods to support embedded fonts for IE and other browsers break Konqueror, when supporting all of them is a piece of cake. Or use embed tags for flash.)

HTTP doesn't have native authentication abilities.

JavaScript should be included in HTML head.

SQL query parameters should be escaped.

Faith in standards committees, large corporations, open source projects, or security researchers.

That multiple versions of IE can't be run on Windows easily, especially older versions of IE on newer versions of Windows.

That last one is actually what prompted this article. 20% of enterprises say they can't upgrade to newer versions of Windows because they need to use IE 6. On top of this, almost every IT guy who had anything to say about this believe in this situation or mentions virtualization as an out. Heck, even Microsoft themselves are saying you need a special XP mode in Windows 7 for IE 6, as is every major article site on the net who comment on this situation.

Hilarious considering that you can install and run IE 6 just fine in Windows 7. There's plenty of solutions out there besides that one too. They've also been around for several years.

Anyways, experts, pros, designers, IT staff and average Internet surfers out there, just keep on being clueless on every single topic, some of us are having a real laugh.

Thursday, June 3, 2010

Undocumented APIs

Posted by insane coder at Thursday, June 03, 2010

This topic has been kicked around all over for at least a decade. It's kind of hard to read a computer programming magazine, third party documentation, technology article site, third party developer mailing list or the like without running into it.

Inevitably someone always mentions how some Operating System has undocumented APIs, and the creators of the OS are using those undocumented APIs, but telling third party developers not to use them. There will always be the rant about unfair competition, hypocrisy, and other evils.

Large targets are Microsoft and Apple, for creating undocumented APIs that are used by Internet Explorer, Microsoft Office, Safari, iPhone development, and other in-house applications. To the detriment of Mozilla, OpenOffice.org, third parties, and hobbyist developer teams.

If you wrote one of those articles/rants/complaints, or agreed with them, I'm asking you to turn in your Geek Membership card now. You're too stupid, too idiotic, too dumb, too nearsighted, and too vilifying (add other derogatory adjectives you can think of) to be part of our club. You have no business writing code or even discussing programming.

Operating Systems and other applications that communicate via APIs provide public interfaces that are supposed to be stable. If you find a function in official public documentation, that generally signifies it is safe to use the function in question, and the company will support it for the foreseeable future versions. If a function is undocumented, it's because the company doesn't think they will be supporting the function long term.

Now new versions which provide even the exact same API inevitably break things. It is so much worse for APIs which you're not supposed to be using in the first place. Yet people who complain about undocumented APIs, also complain when the new version removes a function they were using. They blame the company for purposely removing features they knew competition or hobbyists were using. That's where the real hypocrisy is.

As much as you hate a new version breaking your existing software, so does Microsoft, Apple, and others. The popularity of their products is tied to what third party applications exist. Having popular third party applications break on new versions is a bad thing. That's why they don't document APIs they plan on changing.

Now why do they use them then? Simple, you can write better software when you intimately understand the platform you're developing for, and write the most direct code possible. When issues occur because of their use, it's easy for the in-house teams to test their own software for compatibility with upcoming versions and fix them. If Microsoft sees a new version of Windows break Internet Explorer, they can easily patch the Internet Explorer provided with the upcoming version, or provide a new version altogether. But don't expect them to go patching third party applications, especially ones which they don't have the source to. They may have done so in the past, and introduced compatibility hacks, but this really isn't their problem, and you should be grateful when they go out of their way to do so.

So quit your complaining about undocumented APIs, or please go around introducing yourself as "The Dumb One".

Monday, October 19, 2009

Why online services suck

Posted by insane coder at Monday, October 19, 2009

Does anyone other than me think online services suck?

The thing that annoys me the most is language settings. Online service designers one day had this great idea to check the geographical IP address the user visited their site from, and use it to automatically set the language to the native one for the country they visited from. While this sounds nice in theory, most people only know their mother tongue, and also go on vacation now and then, or visit some other country for business purposes.

So here I am, on business in a foreign country, and I connect my laptop into the Ethernet jack in my hotel room which comes with free Internet access, so I can check my e-mail. What's the first thing I notice? The entire interface is no longer in English. Even worse is that the various menu items and buttons are moved around in this other language.

Even Google, known for being ahead of the curve when it comes to web services can't help but make the same mistakes. I'm sitting here looking at the menu on top of Blogger, wondering which one is login.

For Google this is a worse offense compared to other service providers, as I already was logged into their main site.

Google keeps their cookies set for all eternity (well, until the next time rollover disaster), and they know I always used Google in English. Now it sees me connecting from a different country than usual and thinks I want my language settings switched? Even after I set it to English on their main page, I have to figure out how to set it to English again on Blogger and YouTube?

What's really sad about all this is that every web browser sends each website as part of its request a "user agent", which tells the web server the name of the browser, a version number, operating system details, and language information. My browser is currently sending: "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11)". Notice the en-US? That tells the site the browser is in English, for the United States. If I downloaded a different version of Firefox, or installed a language package and switched Firefox to a different language, it would tell the web server that I did so. If one uses Windows in another language, Internet Explorer will also tell the web server the language Windows/Internet Explorer is in.

Why are these service providers ignoring browser information, and instead solely looking at geographical information? People travel all the times these days. Let us also not forget those in restrictive countries who use foreign proxy servers to access the internet.

However, common issues, such as annoying language support is hardly the end of the problems. In terms of online communication, virtually all of them suffer from variations of spam. Again, where is Google here? Every time I go read comments on Blogger, I see nothing but spam posts. Even when I go to cleanup my own site, the spam just fills up again a few days later.

Where's the flag as spam button? Where's the flag this user as solely a spammer button?

Sure Google as a site manager lets me block all comments on my site till I personally review them to see if they're spam, but in today's need for hi-speed communication is that really an option when you may have a hot topic on hand? Why can't readers flag posts on their own?

In terms of management, why doesn't Blogger's site management features include a list where I can check off posts and hit one mass delete, instead of having to click delete and "Yes I'm sure" on each and every spam post? Why can't I delete all posts from user X and ban that user from ever posting on my site again?

Okay, maybe this isn't so much an article why online services suck, but more about language and spam complaints, and mostly at Google for the moment. Jet-lag, and getting your E-mail interface in Gibberish does wonders for a friendly post. I'll try to come up something better for my next article.

Thursday, June 18, 2009

State of sound in Linux not so sorry after all

Posted by insane coder at Thursday, June 18, 2009

About two years ago, I wrote an article titled the "The Sorry State of Sound in Linux", hoping to get some sound issues in Linux fixed. Now two years later a lot has changed, and it's time to take another look at the state of sound in Linux today.

A quick summary of the last article for those that didn't read it:

Sound in Linux has an interesting history, and historically lacked sound mixing on hardware that was more software based than hardware.
Many sound servers were created to solve the mixing issue.
Many libraries were created to solve multiple back-end issues.
ALSA replaced OSS version 3 in the Kernel source, attempting to fix existing issues.
There was a closed source OSS update which was superb.
Linux distributions have been removing OSS support from applications in favor of ALSA.
Average sound developer prefers a simple API.
Portability is a good thing.
Users are having issues in certain scenarios.

Now much has changed, namely:

OSS is now free and open source once again.
PulseAudio has become widespread.
Existing libraries have been improved.
New Linux Distributions have been released, and some existing ones have attempted an overhaul of their entire sound stack to improve users' experience.
People read the last article, and have more knowledge than before, and in some cases, have become more opinionated than before.
I personally have looked much closer at the issue to provide even more relevant information.

Let's take a closer look at the pros and cons of OSS and ALSA as they are, not five years ago, not last year, not last month, but as they are today.

First off, ALSA.
ALSA consists of three components. First part is drivers in the Kernel with an API exposed for the other two components to communicate with. Second part is a sound developer API to allow developers to create programs which communicate with ALSA. Third part is a sound mixing component which can be placed between the other two to allow multiple programs using the ALSA API to output sound simultaneously.

To help make sense of the above, here is a diagram:

Note, the diagrams presented in this article are made by myself, a very bad artist, and I don't plan to win any awards for them. Also they may not be 100% absolutely accurate down to the last detail, but accurate enough to give the average user an idea of what is going on behind the scenes.

A sound developer who wishes to output sound in their application can take any of the following routes with ALSA:

Output using ALSA API directly to ALSA's Kernel API (when sound mixing is disabled)
Output using ALSA API to sound mixer, which outputs to ALSA's Kernel API (when sound mixing is enabled)
Output using OSS version 3 API directly to ALSA's Kernel API
Output using a wrapper API which outputs using any of the above 3 methods

As can be seen, ALSA is quite flexible, has sound mixing which OSSv3 lacked, but still provides legacy OSSv3 support for older programs. It also offers the option of disabling sound mixing in cases where the sound mixing reduced quality in any way, or introduced latency which the end user may not want at a particular time.

Two points should be clear, ALSA has optional sound mixing outside the Kernel, and the path ALSA's OSS legacy API takes lacks sound mixing.

An obvious con should be seen here, ALSA which was initially designed to fix the sound mixing issue at a lower and more direct level than a sound server doesn't work for "older" programs.

Obvious pros are that ALSA is free, open source, has sound mixing, can work with multiple sound cards (all of which OSS lacked during much of version 3's lifespan), and included as part of the Kernel source, and tries to cater to old and new programs alike.

The less obvious cons are that ALSA is Linux only, it doesn't exist on FreeBSD or Solaris, or Mac OS X or Windows. Also, the average developer finds ALSA's native API too hard to work with, but that is debatable.

Now let's take a look at OSS today. OSS is currently at version 4, and is a completely different beast than OSSv3 was.
Where OSSv3 went closed source, OSSv4 is open sourced today, under GPL, 3 clause BSD, and CDDL.
While a decade old OSS was included in the Linux Kernel source, the new greatly improved OSSv4 is not, and thus may be a bit harder for the average user to try out. Older OSSv3 lacked sound mixing and support for multiple sound cards, OSSv4 does not. Most people who discuss OSS or try OSS to see how it stacks up against ALSA unfortunately are referring to, or are testing out the one that is a decade old, providing a distortion of the facts as they are today.

Here's a diagram of OSSv4:

A sound developer wishing to output sound has the following routes on OSSv4:

Output using OSS API right into the Kernel with sound mixing
Output using ALSA API to the OSS API with sound mixing
Output using a wrapper API to any of the above methods

Unlike in ALSA, when using OSSv4, the end user always has sound mixing. Also because sound mixing is running in the Kernel itself, it doesn't suffer from the latency ALSA generally has.

Although OSSv4 does offer their own ALSA emulation layer, it's pretty bad, and I haven't found a single ALSA program which is able to output via it properly. However, this isn't an issue, since as mentioned above, ALSA's own sound developer API can output to OSS, providing perfect compatibility with ALSA applications today. You can read more about how to set that up in one of my recent articles.

ALSA's own library is able to do this, because it's actually structured as follows:

As you can see, it can output to either OSS or ALSA Kernel back-ends (other back-ends too which will be discussed lower down).

Since both OSS and ALSA based programs can use an OSS or ALSA Kernel back-end, the differences between the two are quite subtle (note, we're not discussing OSSv3 here), and boils down to what I know from research and testing, and is not immediately obvious.

OSS always has sound mixing, ALSA does not.
OSS sound mixing is of higher quality than ALSA's, due to OSS using more precise math in its sound mixing.
OSS has less latency compared to ASLA when mixing sound due to everything running within the Linux Kernel.
OSS offers per application volume control, ALSA does not.
ALSA can have the Operating System go into suspend mode when sound was playing and come out of it with sound still playing, OSS on the other hand needs the application to restart sound.
OSS is the only option for certain sound cards, as ALSA drivers for a particular card are either really bad or non existent.
ALSA is the only option for certain sound cards, as OSS drivers for a particular card are either really bad or non existent.
ALSA is included in Linux itself and is easy to get ahold of, OSS (v4) is not.

Now the question is where does the average user fall in the above categories? If the user has a sound card which only works (well) with one or the other, then obviously they should use the one that works properly. Of course a user may want to try both to see if one performs better than the other one.

If the user really needs to have a program output sound right until Linux goes into suspend mode, and then continues where it left off when resuming, then ALSA is (currently) the only option. I personally don't find this to be a problem, and furthermore I doubt it's a large percentage of users that even use suspend in Linux. Suspend in general in Linux isn't great, due to some rogue piece of hardware like a network or video card which screws it up.

If the user doesn't want a hassle, ALSA also seems the obvious choice, as it's shipped directly with the Linux Kernel, so it's much easier for the user to use a modern ALSA than it is a modern OSS. However it should be up to the Linux Distribution to handle these situations, and to the end user, switching from one to the other should be seamless and transparent. More on this later.

Yet we also see due to better sound mixing and latency when sound mixing is involved, that OSS is the better choice, as long as none of the above issues are present. But the better mixing is generally only noticed at higher volume levels, or rare cases, and latency as I'm referring to is generally only a problem if you play heavy duty games, and not a problem if you just want to listen to some music or watch a video.

But wait this is all about the back-end, what about the whole developer API issue?

Many people like to point fingers at the various APIs (I myself did too to some extent in my previous article). But they really don't get it. First off, this is how your average sound wrapper API works:

The program outputs sound using a wrapper, such as OpenAL, SDL, or libao, and then sound goes to the appropriate high level or low level back-end, and the user doesn't have to worry about it.

Since the back-ends can be various Operating Systems sound APIs, they allow a developer to write a program which has sound on Windows, Mac OS X, Linux, and more pretty easily.

Some like Adobe like to say how this is some kind of problem, and makes it impossible to output sound in Linux. Nothing could be further from the truth. Graphs like these are very misleading. OpenAL, SDL, libao, GStreamer, NAS, Allegro, and more all exist on Windows too. I don't see anyone complaining there.

I can make a similar diagram for Windows:

This above diagram is by no means complete, as there's XAudio, other wrapper libs, and even some Windows only sound libraries which I've forgotten the name of.

This by no means bothers anybody, and should not be made an issue.

In terms of usage, the libraries stack up as follows:
OpenAL - Powerful, tricky to use, great for "3D audio". I personally was able to get a lot done by following a couple of example and only spent an hour or two adding sound to an application.
SDL - Simplistic, uses a callback API, decent if it fits your program design. I personally was able to add sound to an application in half an hour with SDL, although I don't think it fits every case load.
libao - Very simplistic, incredibly easy to use, although problematic if you need your application to not do sound blocking. I added sound to a multitude of applications using libao in a matter of minutes. I just think it's a bit more annoying to do if you need to give your program its own sound thread, so again depends on the case load.

I haven't played with the other sound wrappers, so I can't comment on them, but the same ideas are played out with each and every one.

Then of course there's the actual OSS and ALSA APIs on Linux. Now why would anyone use them when there are lovely wrappers that are more portable, customized to match any particular case load? In the average case, this is in fact true, and there is no reason to use OSS or ALSA's API to output sound. In some cases, using a wrapper API can add latency which you may not want, and you don't need any of the advantages of using a wrapper API.

Here's a breakdown of how OSS and ALSA's APIs stack up.
OSSv3 - Easy to use, most developers I spoke to like it, exists on every UNIX but Mac OS X. I added sound to applications using OSSv3 in 10 minutes.
OSSv4 - Mostly backwards compatible with v3, even easier to use, exists on every UNIX except Mac OS X and Linux when using the ALSA back-end, has sound re-sampling, and AC3 decoding out of the box. I added sound to several applications using OSSv4 in 10 minutes each.
ALSA - Hard to use, most developers I spoke to dislike it, poorly documented, not available anywhere but Linux. Some developers however prefer it, as they feel it gives them more flexibility than the OSS API. I personally spent 3 hours trying to make heads or tails out of the documentation and add sound to an application. Then I found sound only worked on the machine I was developing on, and had to spend another hour going over the docs and tweaking my code to get it working on both machines I was testing on at the time. Finally, I released my application with the ALSA back-end, to find several people complaining about no sound, and started receiving patches from several developers. Many of those patches fixed sound on their machine, but broke sound on one of my machines. Here we are a year later, and my application after many hours wasted by several developers, ALSA now seems to output sound decently on all machines tested, but I sure don't trust it. We as developers don't need these kinds of issues. Of course, you're free to disagree, and even cite examples how you figured out the documentation, added sound quickly, and have it work flawlessly everywhere by everyone who tested your application. I must just be stupid.

Now I previously thought the OSS vs. ALSA API issue was significant to end users, in so far as what they're locked into, but really it only matters to developers. The main issue is though, if I want to take advantage of all the extra features that OSSv4's API has to offer (and I do), I have to use the OSS back-end. Users however don't have to care about this one, unless they use programs which take advantage of these features, which there are few of.

However regarding wrapper APIs, I did find a few interesting results when testing them in a variety of programs.
App -> libao -> OSS API -> OSS Back-end - Good sound, low latency.
App -> libao -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> libao -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> libao -> ALSA API -> ALSA Back-end - Bad sound, horrible latency.
App -> SDL -> OSS API -> OSS Back-end - Good sound, really low latency.
App -> SDL -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> SDL -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> SDL -> ALSA API -> ALSA Back-end - Good sound, minor latency.
App -> OpenAL -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OpenAL -> OSS API -> ALSA Back-end - Adequate sound, bad latency.
App -> OpenAL -> ALSA API -> OSS Back-end - Bad sound, bad latency.
App -> OpenAL -> ALSA API -> ALSA Back-end - Adequate sound, bad latency.
App -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> ALSA API -> OSS Back-end - Great sound, low latency.
App -> ALSA API -> ALSA Back-end - Good sound, bad latency.

If you're having a hard time trying to wrap your head around the above chart, here's a summary:

OSS back-end always has good sound, except when using OpenAL->ALSA to output to it.
ALSA generally sounds better when using the OSS API, and has lower latency (generally because that avoids any sound mixing as per an earlier diagram).
OSS related technology is generally the way to go for best sound.

But wait, where do sound servers fit in?

Sounds servers were initially created to deal with problems caused by OSSv3 which currently are non existent, namely sound mixing. The sound server stack today looks something like this:

As should be obvious, these sound servers today do nothing except add latency, and should be done away with. KDE 4 has moved away from the aRts sound server, and instead uses a wrapper API known as Phonon, which can deal with a variety of back-ends (which some in themselves can go through a particular sound server if need be).

However as mentioned above, ALSA's mixing is not of the same high quality as OSS's is, and ALSA also lacks some nice features such as per application volume control.

Now one could turn off ALSA's low quality mixer, or have an application do it's own volume control internally via modifying the sound wave its outputting, but these choices aren't friendly towards users or developers.

Seeing this, Fedora and Ubuntu has both stepped in with a so called state of the art sound server known as PulseAudio.

If you remember this:

As you can see, ALSA's API can also output to PulseAudio, meaning programs written using ALSA's API can output to PulseAudio and use PulseAudio's higher quality sound mixer seamlessly without requiring the modification of old programs. PulseAudio is also able to send sound to another PulseAudio server on the network to output sound remotely. PulseAudio's stack is something like this:

As you can see it looks very complex, and a 100% accurate breakdown of PulseAudio is even more complex.

Thanks to PulseAudio being so advanced, most of the wrapper APIs can output to it, and Fedora and Ubuntu ship with all that set up for the end user, it can in some cases also receive sound written for another sound server such as ESD, without requiring ESD to run on top of it. It also means that many programs are now going through many layers before they reach the sound card.

Some have seen PulseAudio as the new Voodoo which is our new savior, sound written to any particular API can be output via it, and it has great mixing to boot.

Except many users who play games for example are crying that this adds a TREMENDOUS amount of latency, and is very noticeable even in not so high-end games. Users don't like hearing enemies explode a full 3 seconds after they saw the enemy explode on screen. Don't let anyone kid you, there's no way a sound server, especially with this level of bloat and complexity ever work with anything approaching low latency acceptable for games.

Compare the insanity that is PulseAudio with this:

Which do you think looks like a better sound stack, considering that their sound mixing, per application volume control, compatibility with applications, and other features are on par?

And yes, lets not forget the applications. I'm frequently told about how some application is written to use a particular API, therefore either OSS or ALSA need to be the back-end they use. However as explained above, either API can be used on either back-end. If setup right, you don't have to have a lack of sound using newer version of Flash when using the OSS back-end.

So where are we today exactly?
The biggest issues I find is that the Distributions simply aren't setup to make the choice easy on the users. Debian and derivatives provide a Linux sound base package to select whether you want OSS or ALSA to be your back-end, except it really doesn't do anything. Here's what we do need from such a package:

On selecting OSS, it should install the latest OSS package, as well as ALSA's ALSA API->OSS back-end interface, and set it up.
Minimally configure an installed OpenAL to use OSS back-end, and preferably SDL, libao, and other wrapper libraries as well.
Recognize the setting when installing a new application or wrapper library and configure that to use OSS as well.
Do all the above in reverse when selecting ALSA instead.

Such a setup would allow users to easily switch between them if their sound card only worked with the one which wasn't the distribution's default. It would also easily allow users to objectively test which one works better for them if they care to, and desire to use the best possible setup they can. Users should be given this capability. I personally believe OSS is superior, but we should leave the choice up to the user if they don't like whichever is the system default.

Now I repeatedly hear the claim: "But, but, OSS was taken out of the Linux Kernel source, it's never going to be merged back in!"

Let's analyze that objectively. Does it matter what is included in the default Linux Kernel? Can we not use VirtualBox instead of KVM when KVM is part of the Linux Kernel and VirtualBox isn't? Can we not use KDE or GNOME when neither of them are part of the Linux Kernel?

What matters in the end is what the distributions support, not what's built in. Who cares what's built in? The only difference is that the Kernel developers themselves won't maintain anything not officially part of the Kernel, but that's the precise jobs that the various distributions fill, ensuring their Kernel modules and related packages work shortly after each new Kernel comes out.

Anyways, a few closing points.

I believe OSS is the superior solution over ALSA, although your mileage may vary. It'd be nice if OSS and ALSA just shared all their drivers, not having an issue where one has support for one sound card, but not the other.

OSS should get suspend support and anything else it lacks in comparison to ALSA even if insignificant. Here's a hint, why doesn't Ubuntu hire the OSS author and get it more friendly in these last few cases for the end user? He is currently looking for a job. Also throw some people at it to improve the existing volume controlling widgets to be friendlier with the new OSSv4, and maybe get stuff like HAL to recognize OSSv4 out of the box.

Problems should be fixed directly, not in a roundabout matter as is done with PulseAudio, that garbage needs to go. If users need remote sound (and few do), one should just be easily able to map /dev/dsp over NFS, and output everything to OSS that way, achieving network transparency on the file level as UNIX was designed for (everything is a file), instead of all these non UNIX hacks in place today in regards to sound.

The distributions really need to get their act together. Although in recent times Draco Linux has come out which is OSS only, and Arch Linux seems to treat OSSv4 as a full fledged citizen to the end user, giving them choice, although I'm told they're both bad in the the ALSA compatibility department not setting it up properly for the end user, and in the case of Arch Linux, requiring the user to modify the config files of each application/library that uses sound.

OSS is portable thanks to its OS abstraction API, being more relevant to the UNIX world as a whole, unlike ALSA. FreeBSD however uses their own take on OSS to avoid the abstraction API, but it's still mostly compatible, and one can install the official OSSv4 on FreeBSD if they so desire.

Sound in Linux really doesn't have to be that sorry after all, the distributions just have to get their act together, and stop with all the finger pointing, propaganda, and FUD that is going around, which is only relevant to ancient versions of OSS, if not downright irrelevant or untrue. Let's stop the madness being perpetrated by the likes of Adobe, PulseAudio propaganda machine, and whoever else out there. Let's be objective and use the best solutions instead of settling for mediocrity or hack upon hack.

Tuesday, May 12, 2009

Will Linux ever be mainstream?

Posted by insane coder at Tuesday, May 12, 2009

Constantly different sites and communities always discuss the possibility of Linux becoming mainstream and when the mainstreaming will take place. Often reasons are laid out where Linux is lacking. Most reasons don't seem to be in touch with reality. This will be an attempt to go over some of those reasons, cut out the fluff from the fact, and perhaps touch on a few areas that have not been gone over yet.

One could argue with today's routers that Linux is already mainstream, but let us focus more on full blown computer Linux, which runs on servers, workstation, and home computers.

When it comes to servers, the question really is who isn't running Linux? Practically every medium sized or larger company runs Linux on a couple of their servers. What makes Linux so compelling that many companies have at least one if not many Linux servers?

Servers are a very different breed of computer than the workstation or home computer. "Desktop Linux" as it's known is the type of OS for average everyday Joe. Joe is the kind of guy who wants to sit down and do a few specific tasks. He expects those tasks to be easy to do, and be mostly the same on every computer. He doesn't expect anything about the 'tasks' to scare him. He accepts the program may crash or go haywire in the middle, at which time it's just new cup of coffee time. Except Desktop Linux isn't for every day Joe ... yet.

Servers on the other hand are designed primarily for functionality. They have to have maximum up time. It doesn't matter if the server is hard to understand, and work with, and only two guys in the whole office can make heads or tails out of it. It's okay that the company needs to hire two guys with PhDs, who are complete recluses, and never attend a single office party.

Windows servers are primarily used by those that need special Windows functionality at the office, such as ActiveDirectory, or Exchange so everyone has integrated Outlook. Some even use Windows as HTTP Servers and the like. Windows is less known for working, but being great for those specialized tasks, or servers which don't need those two PhD recluses to manage. Even guys who have never written a piece of code in their entire life can manage a Windows server - usually. Microsoft always tries to press this latter point home with all their get the facts campaigns.

The real fact is though that companies on their servers need functionality, reliability, and countability. While larger companies would prefer to replace every man with a machine which is guaranteed to last forever and not require a single ounce of maintenance, they would rather rely on personnel than hardware. Sure, when I'm a really small business, I'd rather have a server I can manage myself and have a clue what I'm doing, but if I had the money, I'd rather have expert geeky Greg who I can count on to keep our hardware setup afloat. Even when geeky Greg is a bit more expensive than laid-back Larry, I'm happier knowing that I have the best people on the job.

Windows servers while being great in their niches, are also a pain in the neck in more generalized applications. We have a Windows HTTP/FTP server at work. One day it downloads security patches from Microsoft, and suddenly HTTP/FTP stop working entirely. Our expert laid-back Larry spent a few hours looking at the machine trying to find out what changed, and mostly had to resort to using Google as opposed to any knowledge of how Windows works. Finally he sees on some site that Microsoft changed some firewall settings to be extra restrictive, and managed to fix the problem.

Another time, part of the server got hacked into, and we have to reinstall most of it. For some reason, a subsection of our site just refused to work, apparently a permission problem somewhere. On Linux/Apache, permission problems are either a setting in Apache or on the file system, easy to find. Windows on the other hand, with their oh-so-much-better fine grained permission support seem to have dozens if not hundreds of places where to look for security settings. This took our Larry close to two weeks to fix.

Yet another time, a server application which we wrote in-house ran flawlessly on both Linux and Windows XP. However, when we installed it on our Windows Server 2003 server, it inexplicably didn't work. It's no wonder companies use Linux servers for many server tasks. There's also a decent amount of server applications a company can purchase from Red Hat, IBM, Oracle, and a couple of other companies. Linux on the server clearly rocks, even various statistical sites agree.

Now let us move on to the workstation and home computer segment, where we'll see a very different picture.

On the workstation, two features are key, manageability, and usability. Companies like to make sure that they can install new programs across the board, that they can easily update across the board, and change settings on every single machine in the office from one location. Granted on Linux one can log in as root to any machine and do what they want, but how many applications are there that allow me to automate management remotely? For example, apt-get (and its derivatives) are known as one of the best package managers for Desktop Linux, yet they don't have any way to send a call to update to every single machine on a network. Sure using NFS I can have an ActiveDirectory like setup where any user can log into any machine and get their settings and files, but how exactly do I push changes to the software on the machines themselves? Every place I asked this question to seems to have their own customized setup.

One place SSHs into every single machine individually, and then paste some huge command into the terminal. Another upgrades one machine, mirrors the hard drive, then goes to each machine in turn and re-images the installed hard disk. One place which employs a decent number of programmers wrote a series of scripts which every night download a file from a server and execute it. Another, also with an excellent programming task force, wrote their own SSH based application which logs into every machine on the network and runs whichever commands the admin puts in on all of them, allowing real time pushing of updates to all the machines at once.

Is it any wonder that a large company is scared to have Linux on all their machines or that it really is expensive to maintain? We keep toting/hearing how amazing X is because of the client/server setup, or these days in regards to PulseAudio, let us start hearing it for massive remote management. And remember not to limit this just to installing packages, we need to be able to change system files remotely and simultaneously, with a method which becomes standard.

The other aspect if of course usability, and by usability I mean being able to use the kind of software the company needs. Now for some companies, documents, spreadsheets, and web browsers are the extent of the applications they need, and for that we're already there. Unless of course they also need 100% compatibility with the office suites used by other companies.

What about specialized niches though? That's where real companies have their major work done. These companies are using software to manage medical history, other clientèle meta-data, stocks (both monetary and in-store), and multitudes of other specialized fields. All these applications more or less connect to some server somewhere and do database manipulation. We're really talking about webapps in desktop form. Why is every last one of these 3rd party applications only written for Windows?

The reasons are probably threefold. If these applications worked in any standard browser, we're really providing more functionality in them than should be exposed to the user. Do you want the user to hit stop or the close button in the corner of their browser in middle of a transaction? Sure, the database should be robust and atomic enough to handle these kinds of situations, but do we want to spoon-feed these situations to the users? We also certainly don't want general system upgrades which would install a newer version of the browser to break one of the key applications being used by the company. To solve this problem requires a custom browser, bringing us back to square one when it comes to making this a desktop application.

The next reason is known as catch-22. Why should a generic company making an application bother with anything than the most popular OS by a landslide? We need way more Desktop Linux users for a company to bother, but if the companies don't bother, it's unlikely that Desktop users will switch to Linux. Also, as I've said before, portability isn't difficult in most cases, but most won't bother unless we enlighten them.

Lastly, many of these applications are old, or at least most of their code base is. There's just no incentive to rewrite them. And when one of these applications is made in-house, it'll be made for what the rest of the company is already running.

To get Linux onto the workstation then, we need the following to take place:

Creation of standardized massive management tools
Near perfect interoperability of office suites
Get ports of Linux office suites to be mainstream on Windows too
Get work oriented applications on Windows to be written portably
Make Linux more popular on the Desktop in all aspects

We have to stop being scared of Open Source on closed sourced Operating Systems, if half the offices out there used Open Office on Windows, they wouldn't mind running Open Office on Linux, and they won't have any different interoperability issues that they don't already have.

We also need to make portability excellence more the norm. These companies could benefit a lot from using Qt for example. Qt has great SQL support. Qt contains a web browser so webapps can be made without providing anything unnecessary in the interface. Qt also has great easy to use client/server support, with SSL to boot. Also, Qt applications are probably the easiest type to make multilingual, and the language can be changed on the fly, which is very important for apps used world wide, or for companies looking to save money by hiring immigrants. Lastly, Qt is easier to use than the Win32 API for these relatively basic applications. If they used 100% Qt, the majority of the time, the program would work on Linux with just a simple recompile.

For the above to happen we really need a major Qt push in the developer community. The fight between GTK, wxWidgets, and Qt is going to be hurting us here. Sure, initially Qt was a lot more closed, and we needed GTK to push Qt in the right direction. But today, Qt is LGPL, offers support/maintenance contracts, and is a good 5-10 years ahead of GTK in breadth of features supplied. Even if you like GTK better for whatever reason, it really can't stand up objectively to Qt from the big business perspective. We need developers to get behind our best development libraries. We also need to get schools to teach the libraries we use as part of the mainstream curriculum. Fracturing the community on this point is only going to hurt us in the long run.

Lastly, we come to Linux on the home computer. What do we need on a home computer exactly? They're used for personal finances, homework, surfing the web, multimedia, creativity, and most importantly, gaming.

Are the finance applications available for Linux good enough? I really have no idea, perhaps someone can enlighten me in the comments. We'll get back to this point shortly.

For homework, I'd say Linux was there already. We have Google and Wikipedia available via the world wide web. Dictionaries and Thesauruses are available too. We got calculators and documents, nothing is really missing.

For surfing the web we're definitely there, no questions asked.

Multimedia we're also there aside from a few annoyances. I'll discuss this more below.

For creativity, I'm not sure where we are. Several years back, it seems all the kids used to love making greeting cards, posters, and the like using programs such as The Print Shop Deluxe or Print Artist. Do we have any decent equivalents on Linux?

Thing is, a company would have to be completely insane to port popular home publishing software to Linux. First there's all the reasons mentioned above regarding catch-22 and the like. Then there's nutjobs like Richard Stallman out there who will crucify the company attempting to port their software to Linux. For starters, see this article which says:

Some of the most important projects on our list are replacement projects. These projects are important because they address areas where users are continually being seduced into using non-free software by the lack of an adequate free replacement.

Notice how they're trying to crush Skype for example. Basically any time a company will port their application to Linux, and it becomes popular enough on Desktop Linux, you'll have these nutjobs calling for the destruction of said program by completely reimplementing it and giving it away for free. And reimplement it they do, even if not as effectively, but adequate enough to dissuade anyone from ever buying the application. Then the free application gets ported to Windows too, effectively destroying the company's business model and generally the company itself. Don't believe they'll take it that far? Look how far they went to stop Qt/KDE. Remember all those old office suites and related applications available for Linux a decade ago? How many of them are still around or in business? When free versions of voice chatting are available on all platforms, and can even interface with standard telephones, do you think Skype will still be around?

Basically, trying to port a popular application to Linux is a great way to get yourself a death sentence. If for example Adobe ever ported Photoshop to Linux, there'd be such a massive upsurge in getting the GIMP or a clone to have a sane interface, and get in some of those last features, Photoshop would probably be dead in a year.

And unless some of these applications are ported to Linux, we'll probably never see niche applications as good as their Windows counterparts. Many programmers just don't care enough to develop these to the extent needed, and some only do so when they feel it's part of a holy war. Thus giving us a whole new dimension to the catch-22.

Finally, we come to gaming. Is Linux good enough for companies to develop for? First glance, and you think a resounding yes. A deeper look reveals otherwise. First off, there's good ol` video. For the big games today, it's all about graphics. How many video cards provide full modern OpenGL support on Linux? The problem is basically as follows. X Windows a system designed way back when with all sorts of cool ideas in mind, where the current driver API is simply not enough to take full advantage of accelerated OpenGL. You can easily search online and find tons of results on why X is really bad, but it really stands out when it comes to video.

NVidia has for several years now put out "evil drivers" which get the job done, and provide fantastic OpenGL support on top of Linux. The drivers though are viewed as evil, since they bypass the bottom 1/3 of X and talk straight to the Kernel, and don't fully follow the X driver API. And of course, they're also closed source. All the other drivers today for the most part communicate with the system via the X API, especially the open sourced drivers. Yet they'll never measure up, because X prevents them from measuring up. But they'll continue to stick to what little X does provide. NVidia keeps citing they can't open source their drivers because they'll lose their competitive advantage. Many have questioned this, as for the most part, the basic principals are the same on all cards, what is so secret in their drivers? When in reality, if they open sourced their drivers, the core functionality would probably be merged into X as a new driver API, allowing ATI and Intel to compete on equal footing, losing their competitive advantage. It's not the card per sè they're trying to hide, but the actual driver API that would allow all cards to take advantage of themselves, bypassing any stupidity in X. At the very least, ATI or Intel could grab a lot of that code and make it easier for themselves to make an X-free driver that works for X well.

When it comes down to it, as tiny as the market share is that Linux already has, it becomes even smaller if you want to release an application that needs good video support. On the other hand, those same video cards work just fine in Windows.

Next comes sound, which I have discussed before. The main sound issue for games is latency, and ALSA (the default in Linux) is really bad in that regard. This gets compounded when sound has to run through a sound server on its way to the drivers that talk to the sound card. For playing music, ALSA seems just fine to everybody, you don't notice or care that the sound starts or stops a moment or two after you press the button. For videos as well, it's generally a non-issue. In most video formats, the video takes longer to decode than it does to process sound, so they're ready at the same time. It also doesn't have to be synced for input. So everything seems fine. In the worst case scenario, you just tell your video player to alter the video/audio sync slightly, and everything is great.

When it comes to games, it's an entirely different ballpark. For the game not to appear laggy, the video has to be synced to the input. You want the gun to fire immediately after the user presses the button, without a lag. Once the bullet hits the enemy and the user sees the enemy explode, you want them to hear that enemy explode. The audio has to be synched to the video. Players will not accept having the sound a second or two late. Now means now. There's no room for all the extra overhead that is currently required.

I find it mind boggling that Ubuntu, a distribution designed for average Joe, decided to make the entire system routed through PulseAudio, and see it as a good thing. The main advantage of PulseAudio is that it has a client/server architecture so that sound generated on one machine can be output on another. How many home users know of this feature, let alone have a reason to use it? The whole system makes sound lag like crazy.

I once wrote a game with a few other developers which uses SDL or libao to output sound. Users way back when used to enjoy it. Nowadays with ALSA, and especially with PulseAudio which SDL and libao default to outputting to in Ubuntu, users keep complaining that the sound lags two or more seconds behind the video. It's amazing this somehow became the default system setup.

Next is input. This one is easy right? Linux surely supports input. Now let me ask you this, how many KDE or GNOME games have you seen that allow you to control them via a Joystick/Gamepad? The answer is quite simply, none of them do. Neither Qt nor GTK provide any input support other than keyboard or mouse. That's right, our premier application framework libraries don't even support one of the most popular inventions of the 80s and 90s for PC gamers.

Basically, here you'll be making a game and using your library to handle both keyboard and mouse support, when you want to add on joystick support, you'll have to switch to a different library, and possibly somehow merge a completely different event loop into the main one your program uses for everything else. Isn't it so much easier on Windows where they provide a unified input API which is part of the rest of the API you're already using?

Modern games tend to include a lot of sound, and more often than not, video as well. It'd be nice to be able to use standard formats for these, right? The various APIs out there, especially Phonon (part of Qt/KDE) is great at playing sound or video for you. But which formats should you be putting your media in? Which formats are you ensured will be available on the system you're deploying on? Basically all these libraries have multiple backends where support can be drastically different, and the most popular formats, such as those based on MPEG standards don't come standard on most Linux distributions, thanks to them being "non free". Next you'll think fine, let us just ship the game with uncompressed media. This actually works fine for audio, but is a mess when it comes to video. Try making a pure uncompressed AVI and running it in Xine, MPlayer, and anything else that can be used as a Phonon back end. No two video players can agree on what the uncompressed AVI format is. Some display the picture upside down, some have different visions of which byte signifies red, blue, and green, and so on.

For all of these reasons, the game market, currently the largest in home software, has difficultly designing and properly deploying games on Linux. The only companies which have managed to do it in the past are those that made major games for DOS back in the day, where there also was no good APIs or solutions for doing anything.

Now that we wrapped it all up from the actual applications side of things, let us have a look at actual usability for the home user.

We're taken back to average Joe who wants to setup his machine. He's not sure what to do. But he hears there are great Ubuntu forums where he can ask for help. He goes and asks, and gets a response similar to the following:

Open a terminal, then type:
sudo /etc/init.d/d restart
ln -s /alt/override /bin/appstart
cd /etc/app
sudo nano b.conf

Preload=yes
ctrl+x
yes

Does anyone realize how intimidating this is? Even if average Joe was really Windows Power User Joe, does he really feel safe entering commands with which he is unfamiliar?

In the Windows world, we'd tell such a user to open up Windows Explorer, navigate to certain directories, copy files, edit files with notepad and the like. Is it really so hard to tell a user to open up Nautilus or Dolphin or whatever their file manager is, and navigate to a certain location and edit a file with gedit/kwrite?

Sure, it is faster to paste a few quick commands into the terminal, but we're turning away potential users. The home user should never be told he has to open a terminal. In 98% of the cases he really doesn't and what he wants/needs can be done via the GUI. Let us start helping these users appropriately.

Next is the myth about compiling. I saw an article written recently that Linux sucks because users have to compile their own software. I haven't compiled any software on Linux in years, except for those applications that I work on myself. Who in the world is still perpetuating this myth?

It's actually sad to see some distributions out there that force users to recompile stuff. No I'm not talking about Gentoo, but Red Hat actually. We have a server running Red Hat at work, we needed mod_rewrite added to it the other day. Guess what? We had to recompile the server to add that module. On Debian based distros one just runs "a2enmod rewrite", and presto the module is installed. Why the heck are distros forcing these archaic design principals on us?

Then there's just the overall confusion, which many others point out. Do I use KDE or GNOME? Pidgin or Kopete? Firefox or Konqueror? X-Chat or Konversation? VLC or MPlayer? RPM or DEB? The question is, is this a problem? So what if we have a lot of choices.

The issue arises when the choice breaks down the field. When deploying applications this can get especially nightmarish. We need to really focus more on providing just the best solution, and improving the best solution if it's lacking in an area, as opposed to having multiple versions of everything. OSS vs. ALSA, RPM vs. DEB, and a bunch of others which are base to the system shouldn't really be around these days.

The other end of the spectrum is less important to providing a coherent system for deploying on. But it does confuse some users. When I want to help someone, do I just assume they use Krusader as a file manager? Should I try to be generic about file managers? Should I have them install Krusader so I can help them? This theme is played over in many variations on most Linux help forums.
"Oh yes, go open that in gedit."
"Gedit? What's Gedit?"
"Are you running KDE or GNOME?"
"GNOME"
"Are you sure?"
"Wait, are GNOME and XFCE the same thing?"

What's really bad though is when users understand there's multiple applications, but can't find one to satisfy them. It's easy when the choices are simple or advanced, you choose the one more suited to your tastes for that kind of application. But it gets really annoying when one of those apps tries to be like the other. Do we need two apps that behave exactly the same but are different? If you started different, and you each have your own communities, then stay different. We don't need variety when there is no real difference. KDE 4 trying to be more like GNOME is just retarded. Trying to steal GNOME's user base by making a design which appeals more to GNOME users but has a couple of flashy features isn't a way to grow your user base, it's just a way to swap one for another.

Nintendo in the past couple of years was faced with losing much of their user base to Sony. For example, back in the late 90s, all the cool RPGs for which Nintendo was known, had all sequels moved to Sony hardware. However, instead of trying to win back old gamers, they took an entirely different approach. Nintendo realized the largest market of gamers weren't those on the other systems, but those that weren't on any systems. The largest market available for targeting is generally those users not yet in the market, unless the market in question is already ubiquitous.

That said, redesigning an existing program to target those who are currently non-users can sometimes have the potential to alienate loyal users, depending on what sort of changes are necessary, so unless pulling in all the non-users is guaranteed, one should be careful with this strategy. Although a user base with non paying customers is more likely to have success with such a drastic change, as they aren't dependent on their users either way. Balance is required, so many new users are acquired while a minimal amount of existing users are alienated.

To get Linux on home computers the following needs to take place:

We need to stop fighting every company that ports a decent product to Linux
We should write good programs even if there is nothing else to compete with on Linux
We shouldn't leave programs as adequate
We need a real solution to the X fiasco
We need a real solution to the sound mixing/latency fiasco, and clunky APIs and more sound servers isn't it
We need to offer tools to the gaming market and try to capture it
Support has to be geared towards the users, not the developers
Stop the myths, and prevent new users installing distros that perpetuate them
Stop competition between almost identical programs
Let programs that are similar but very different go their own ways
Bring in users that don't use anything else
Keep as many old users as possible and not alienate them

Linux being so versatile is great, and hopefully it will break into new markets. As I said before, many routers use Linux. To be popular on the desktop though, it either has to get users to desktops that currently don't have any, or manage to steal users from other desktops while keeping the ones they have. Becoming Windows isn't the answer, as others like to point out, but competing toe to toe in all existing areas while obtaining new ones is. In some areas, we may just have to throw away existing users to an extent (possibly eliminate X), if we want to grab everyone else out there.

Speaking of versatility, has everyone seen this? Linux grown technology does in many ways have the potential to beat Windows and the rest in a large way.

Everyone remember Duke Nukem Forever? Supposedly the best game ever, since it has unprecedented levels of world operability within the game? Such as being able to go to a soda machine, put in some money, press buttons, and buy a drink. With Qt, we can provide a game with panels throughout it where a user can control many things within the game, and there'd be a rich set of controls developers can easily add. Imagine playing a game where you go checkout the enemy's computer system, the system seeming pretty normal, you can bring up videos of plans they plan on. Or notice a desktop with a web browser, where you yourself can go ahead and login and check your E-Mail within the game itself, providing a more real experience. Or the real clincher. You know those games where plot-wise you break into the enemy factory and reprogram the robots or missiles or whatever? With Qt, the "code" you're reprogramming can be actual JavaScript code used in game. If made simple enough, it can seem realistic, and really give a lot of flexibility to those that want to reprogram the enemy's design. We have the potential to provide unprecedented levels of gameplay in games. If only we could get companies to put out games using Qt+OpenGL+Phonon, which they will probably not even consider looking at until Qt has Joystick support. Even then we still need to promote Qt more, which will make it easier to get companies to port games to Linux...

I think Ubuntu has some good ideas for adopting home users, but could be improved upon in many ways. Ultimately, we need a lot of change in the area of marketing to home users. There's so much that needs to be fixed and in some areas, we're not even close.

Feel free to post your comments, arguments, and disagreements.

Insane Coding

Monday, June 30, 2014

Memory management in C and auto allocating sprintf() - asprintf()

Memory Management

sprintf() and asprintf()

Wednesday, December 14, 2011

Progression and Regression of Desktop User Interfaces

Saturday, October 30, 2010

This just in, 20% of enterprises and most IT people are idiots

Thursday, June 3, 2010

Undocumented APIs

Monday, October 19, 2009

Why online services suck

Thursday, June 18, 2009

State of sound in Linux not so sorry after all

Tuesday, May 12, 2009

Will Linux ever be mainstream?

Insane Coding Sites

Blog Archive

Search This Blog

Please Add This

Monday, June 30, 2014

Memory Management

sprintf() and asprintf()

Wednesday, December 14, 2011

Saturday, October 30, 2010

Thursday, June 3, 2010

Monday, October 19, 2009

Thursday, June 18, 2009

Tuesday, May 12, 2009

Insane Coding Sites

Blog Archive

Search This Blog

Subscribe To

Please Add This