State of the GNU/Linux Desktop 2009 Part 1/4: Multimedia
Debian GNU/Linux 4.0 contained over 283 million lines of code, and the Linux kernel development mailing list carries so much traffic that even Linus Torvalds himself reads only a fraction of the messages that flood it daily.[0][1] Unless you want to spend your entire life trawling hundreds of mailing lists, you cannot keep track of everything that is being developed. My intention is to give the un-obsessed a summary of the interesting features I’m aware of that are being developed for the Free Desktop, especially those involving Linux. My intention is not to give comprehensive technical reviews, however; only an introduction to each development.
Important development in the overall Free Desktop ecosystem can be broken down into several main areas: multimedia, hardware support, infrastructural enhancements, and usability work. I intend to cover each of these broad categories over the next couple of days.
To open, multimedia has always been the main weakness of the Free Desktop; from patchy and inconsistent audio stacks to unreliable and sluggish graphics subsystems, this has been a major complaint for gamers and enthusiasts alike.
Gallium3D is a new driver architecture for 3D graphics hardware that is independent of both API and operating system and is built entirely around shaders. It has been undergoing much work and has been a major focus of attention for the Free Desktop, especially Linux. It is intended to eventually replace the existing driver architecture inside Mesa, the current OpenGL implementation providing 3D acceleration on X.org. While at first glance it appears to be mainly an under-the-hood change, it is quite relevant to the end-user. As well as allowing faster support for new graphics hardware and less buggy drivers thanks to a cleaner codebase, it will also provide driver-neutral video acceleration.[2] A usable but early version should be part of the Mesa 7.5 release.[3]
Video acceleration is also an active area of development. Xv, or the X-Video Extension, was the classical method for this and is still frequently used. However, despite (and also because of) its maturity, Xv is hardly ideal: it only accelerates resizing and color manipulation and does not concern itself with decoding video, which makes it of limited use for modern video. XvMC, or the X-Video Motion Compensation Extension, is a later extension that also leverages the graphics card to offload some parts of MPEG-2 decoding. It too is mostly worthless on modern systems, however, both because of its many bugs and because it lacks support for more modern video compression formats.[4] Fortunately, the Free Software community has not been idle; Intel’s VA-API, or the Video Acceleration API, is currently favored by some. It is an actively developed library and standard that supports a variety of video decoding tasks for modern codecs such as MPEG-4. Natively it is only supported by the Intel Poulsbo and several S3 Chrome graphics chipsets, but it can use NVIDIA’s reportedly excellent VDPAU (Video Decode and Presentation API for UNIX) and AMD/ATI’s XvBA (X-Video Bitstream Acceleration) libraries as back-ends, allowing any program supporting VA-API to accelerate decoding on most hardware.[5]
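To make the back-end idea concrete, here is a minimal Python sketch of how a single front-end API might hand decoding off to whichever vendor library fits the installed hardware. It is only an illustration of the arrangement described above, not how VA-API is actually implemented: the `BACKENDS` table, the `select_backend` function, the hardware keys, and the "software decoding" fallback are all hypothetical names of my own.

```python
# Hypothetical illustration only: none of these names come from VA-API itself.
# The mapping mirrors the article's description: native support on Intel
# Poulsbo and some S3 Chrome chipsets, VDPAU on NVIDIA, XvBA on AMD/ATI.
BACKENDS = {
    "intel-poulsbo": "native VA-API driver",
    "s3-chrome": "native VA-API driver",
    "nvidia": "VDPAU",
    "amd": "XvBA",
}


def select_backend(gpu: str) -> str:
    """Return the decoding back-end a front-end might pick for this GPU.

    Unknown hardware falls back to decoding on the CPU (an assumption made
    for this sketch).
    """
    return BACKENDS.get(gpu.lower(), "software decoding")


if __name__ == "__main__":
    for gpu in ("NVIDIA", "amd", "intel-poulsbo", "some-other-card"):
        print(f"{gpu}: {select_backend(gpu)}")
```

The point of the design is that a media player only ever talks to the front-end; which vendor library does the real work is decided in one place.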
Kernel mode-setting is an important feature that has been under development for several years now, and was finally released for Intel graphics chipsets in Linux 2.6.29.[6] “Mode-setting” refers to the act of configuring the graphics card to drive a display at a certain resolution and color mode. Prior to kernel mode-setting this was a complex dance between kernel and userspace that led to much instability when switching virtual terminals, starting the X.org server, and resuming the system from suspend. Eventually, kernel mode-setting will also lead to a flicker-free boot process, as seen in Fedora’s Plymouth.
As substandard as the graphics subsystem of the Free Desktop has been historically, the audio stack has been… worse. The Open Sound System (OSS) was the standard API for handling audio on all major UNIXes for years, but eventually its development went proprietary, and the open source version began to gather dust and lacked modern features such as hardware audio mixing. Linux 2.6 was therefore released with its own sound API: ALSA (Advanced Linux Sound Architecture). Unfortunately, ALSA is to this day notoriously messy in both its API and its documentation.[7] The situation is further complicated by the number of OSS applications forced to use ALSA’s emulation layer. The end result is… messy and buggy. PulseAudio is a sound server designed to repair the situation. The theory is that instead of having applications work directly with the sound card, their requests are routed through PulseAudio, allowing not only better application coexistence but also more advanced capabilities such as playing audio across a network. For a while, its adoption was hampered by high latency, bugs, and incompatibility with all ALSA applications, but it is now the de facto standard for audio on Linux.[8] It is worth noting, however, that OSS has since gone open source again, and although I have not used it myself, it is reportedly far superior to ALSA.[9]
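The sound-server idea is easier to picture with a small example. The sketch below shows, in Python, the kind of software mixing that lets several applications share one sound card: each application hands its samples to a central mixer, which sums them into a single output stream. This is only a toy under stated assumptions (all streams are 16-bit signed PCM at the same sample rate, and `mix` is a hypothetical function of my own); a real sound server such as PulseAudio also has to deal with resampling, per-stream volume, timing, and much more.

```python
def mix(streams, limit=32767):
    """Mix several PCM streams into one by summing samples and clipping.

    Assumes every stream is a list of 16-bit signed integer samples at the
    same sample rate. Shorter streams are treated as silent once they end.
    """
    length = max((len(s) for s in streams), default=0)
    mixed = []
    for i in range(length):
        # Add up whatever each application is playing at this instant.
        total = sum(s[i] for s in streams if i < len(s))
        # Clip to the 16-bit range so loud passages do not wrap around.
        mixed.append(max(-limit - 1, min(limit, total)))
    return mixed


if __name__ == "__main__":
    music = [1000, 2000, 3000]   # samples from one application
    chat_beep = [500, -500]      # samples from another
    print(mix([music, chat_beep]))
```

Without a mixer like this somewhere in the stack (in hardware or in software), the second application to open the sound card simply has to wait its turn.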
Media playback, especially MP3 and DVD playback, has been another area in need of attention for years. Recently, however, the situation has improved, in part due to the emergence of GStreamer, a fairly solid and extremely powerful multimedia framework. Prior to this, xine and MPlayer were the main multimedia frameworks; while good applications in their own right, they tend to be suboptimal for writing new players. Fluendo was then founded to sell fully legal and licensed multimedia codecs for GStreamer, filling an important hole in the Free Desktop ecosystem. In addition, they released an MP3 playback codec for free, and are currently attempting to get the rights to develop a legal DVD playback library.[10][11] Don’t get too excited yet, though; as of 2008, they were still attempting to get it approved by the DVD consortium.[12]
Wayland is, to me, one of the most interesting developments here. The X.org display server, which implements the X11 display protocol, is notoriously buggy and heavy compared to what is needed. Essentially, Wayland is a compositing display server that throws out everything that isn’t needed on a modern system. This is of course most immediately useful on mobile devices, where the overhead incurred by a full-blown X11 server would be impractical, but it also brings a refreshing simplicity to a standard desktop system. Perhaps most interestingly, Wayland also allows running multiple full-blown X11 sessions within a Wayland window.[13]
EDIT: There appears to be a nice write-up of X.org’s recent work here.