Many readers are probably already familiar with Moore's law. In the 1960s, Gordon Moore predicted that the level of integration of transistors on silicon wafers would double every year. In other words, each year the industry would be able to fit twice as many transistors on a chip of a given size as the year before: an exponential rate of growth. Remarkably, this law still holds today, although the doubling period is now considered to be 18 months. At the same time, the price of chips has fallen at an equally exponential rate. This unstoppable growth has allowed general-purpose computers to take the place of special, customized electronics, which is advantageous in several ways: it reduces the Time To Market (an expression near and dear to management, indicating the time between the conception of an idea and its launch on the market) and lowers the risk and cost of development. Recent cellular phones offer an example: a microprocessor-based system lets the manufacturer offer useful features such as phonebooks, agendas and, more and more often, applications such as web browsers. It also becomes possible to diversify the product by changing, apart from the price, just the software (and possibly the plastic case...). It is impossible at this point not to notice that attention is shifting from the hardware to the software, which is becoming the true protagonist of new embedded systems. This easily explains the growing interest in a market sector that was until recently considered a niche: embedded operating systems. Any program of a certain complexity needs a tried and true base platform, so that the developer can concentrate on a higher level of abstraction, removed from matters such as memory management or interaction with the hardware.
This statement shouldn't be misconstrued, however: embedded systems have existed for a long time, and what is changing is the level of abstraction that they offer. The computing power available today has noticeably blurred the difference between embedded operating systems and those traditionally found only on workstations.
It is at this point that the curtain opens on a new competitor in the arena of operating systems: Linux. The availability of the resources mentioned above, together with the philosophy of Open Source, makes for a timely mix that has rejuvenated the Unix revolution, begun some twenty-odd years ago. The idea of Unix on an embedded system ought to have purveyors of proprietary systems "quaking in their boots", particularly if the Unix in question is Linux. The reasons are as follows. In the first place, the availability of source code is of immeasurable value to those who develop embedded applications: it means not placing one's business in the hands of another company (think of the economic losses sustained by those who depended on proprietary software afflicted with the Y2K bug and produced by software companies that had failed years before). Secondly, the stability of Linux is another fundamental attraction: can you imagine the consequences of a "blue screen" on a medical device? Third, Linux is a complete operating system: it offers all of the typical Unix services and more: preemptive multitasking, multi-threading, memory management with separate and protected address spaces, shared memory, inter-process communication, networking, uniform management of peripherals, multiple file systems, dynamic loading of device drivers, standards compliance... We also mustn't underestimate the value of the enormous amount of software available in source form (programs, libraries, languages, compilers, debuggers, ...), which can increase a developer's productivity. Finally, one must consider that using a traditional platform avoids long periods of study. And if one also notes that in all this discussion the term 'royalty' never comes up...
The picture described above is the setting in which etlinux was created. etlinux is a project brought to fruition by Prosa, a firm that developed and supported only Open Source solutions, with the goal of demonstrating the utility of Linux on systems of modest power. At the moment, all that is needed to run etlinux is a 386SX processor, 2 MB of RAM and 2 MB of disk space (normal hard drives, the more compact flash disks and Disk On Chip devices will all work). This configuration includes a Tcl interpreter, a small mail server and an equally small web server. An embedded system has special needs, very different from those of a desktop system. The development of etlinux reflects these needs, and it is instructive to examine the solutions adopted. The availability of source code also proves quite useful from the student's point of view.
For our discussion, we will use version 1.1 of etlinux (version 1.2 is currently under development), which is based on libc5. This version most accurately reflects the initial design decisions, and it is still the best choice if one wants the maximum resource savings. In any case, the general architecture discussed here remains valid: in version 1.2 the principal innovations concern the C library, which is updated to glibc2.
The kernel used is 2.0.38, with a variety of modifications introduced to reduce the memory footprint. The changes may be enabled or disabled through the normal kernel configuration mechanism. They consist, on the one hand, of reducing the sizes of several kernel data structures and, on the other, of removing inessential subsystems. For example: the number of filesystems that may be mounted at the same time has been reduced by changing the constant NR_SUPER from 64 to 4 in include/linux/fs.h, the number of character and block devices was lowered, only the first four serial ports are available, the size of the kernel message buffer was cut, and the maximum number of concurrent tasks was reduced from 512 to 32 (NR_TASKS in include/linux/tasks.h). Many other modifications of this nature were made, but the largest single saving, around 90 KB, was obtained by removing the video console. Miquel van Smoorenburg's serial console patch has also been applied (an almost universally adopted solution in the embedded world).
The memory savings obtained by working on the kernel, although noteworthy, concern only one part of the system. It is in fact possible to adapt the base applications (init, mount, ifconfig, ...) with excellent results. All these programs are written for use on desktops and servers, and as a consequence they include far more functionality and options than are actually necessary in an embedded system. Furthermore, one needs to keep in mind that the binary file that constitutes an ELF executable contains a very complex header, which has the delicate responsibility of preparing the operating environment before passing control to the main function. Every executable file carries a copy of this information, so these headers are duplicated many times, occupying precious disk space (above all if the disk is a 2 MB flash disk).
Consider, for example, a program such as the classic "Hello World", which contains only a few hundred bytes of code: the smallest executable obtainable with gcc occupies about 2.5 KB, for an overhead of about 2.4 KB, and the overhead is larger still in real-world programs. Hence the fundamental idea behind etlinux: to use an interpreted language as the "engine". In this way it is possible to centralize basic functionality in the executable of the interpreter, making it available as primitives of the language, and to use scripts as applications. For etlinux, the choice of language fell on Tcl, a scripting language developed by John Ousterhout (1), which is exceptionally easy to learn and easily extensible through its C API. The idea of centralizing functionality within an interpreter is hardly new. In fact, it is at the heart of the history of Unix: think of the shell, which is nothing more than an interpreter. The originality of our approach lies in the use of a general-purpose language, much more powerful than a common shell, and in a higher level of integration of the commands. For example, we integrated commands such as mount, ifconfig, route, uudecode and uuencode into the Tcl interpreter, in addition to making available many system calls such as dup, fork, exec, wait, kill, pipe, nice, reboot, sync, chmod, umask and mknod. The integration of the system calls makes it possible to write in Tcl many programs that were traditionally written in C; perhaps the most representative example of this strategy is the implementation of init in Tcl. Init is the first user-space process launched at boot time on a Unix machine, and it is responsible for the creation of all subsequent processes (daemons, shells, ...). The Tcl version of init used in etlinux occupies only 3705 bytes, slightly more than an ELF header alone.
Other applications that demonstrate the efficiency of this approach are the web server (3236 bytes, with support for Tcl CGIs and IP-based access control) and the mail (SMTP) server (4841 bytes).
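The idea of folding small utilities into a single executable can be sketched in C as a table that maps command names to functions. This is only an illustration under stated assumptions: the names run_builtin, builtin_umask, builtin_chmod and parse_mode are hypothetical, the sketch does not use the Tcl C API, and it is not taken from the etlinux sources. The point is simply that each additional command costs one function and one table entry rather than a whole ELF executable.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical signature shared by every built-in command. */
typedef int (*builtin_fn)(int argc, char *const argv[]);

/* Parse an octal permission string such as "022"; returns 0 on success. */
static int parse_mode(const char *text, mode_t *out)
{
    char *end;
    unsigned long value = strtoul(text, &end, 8);

    if (end == text || *end != '\0' || value > 07777)
        return -1;
    *out = (mode_t)value;
    return 0;
}

/* "umask MODE": thin wrapper around the umask() system call. */
static int builtin_umask(int argc, char *const argv[])
{
    mode_t mode;

    if (argc != 2 || parse_mode(argv[1], &mode) != 0)
        return 1;
    umask(mode);
    return 0;
}

/* "chmod MODE PATH": thin wrapper around the chmod() system call. */
static int builtin_chmod(int argc, char *const argv[])
{
    mode_t mode;

    if (argc != 3 || parse_mode(argv[1], &mode) != 0)
        return 1;
    return chmod(argv[2], mode) == 0 ? 0 : 1;
}

/* One table entry per command, instead of one executable per command. */
static const struct {
    const char *name;
    builtin_fn fn;
} builtins[] = {
    { "umask", builtin_umask },
    { "chmod", builtin_chmod },
};

/* Look up argv[0] in the table and run it; returns -1 if it is unknown. */
int run_builtin(int argc, char *const argv[])
{
    size_t i;

    if (argc < 1)
        return -1;
    for (i = 0; i < sizeof builtins / sizeof builtins[0]; i++) {
        if (strcmp(argv[0], builtins[i].name) == 0)
            return builtins[i].fn(argc, argv);
    }
    return -1;
}
```

A caller would split a command line into words and pass them to run_builtin; in an interpreter such as Tcl, the language itself plays this role and the wrappers are registered as language primitives.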
It is interesting to note how the use of an interpreter presents new opportunities for optimization. For example, when a new process is needed to execute a Tcl script, it is possible to avoid the classic pair of system calls fork and exec (the first creates a copy of the current process, and the second replaces its image with that of a new executable), which the shell, for instance, uses to execute the user's commands. Instead, after a new process has been created with fork, the child takes advantage of the interpreter's ability to execute a script from a file (the source command). This method has the advantage of not loading the interpreter a second time, saving memory and processor time. The memory savings come from the initial sharing of all pages of memory, even the data pages, which are marked copy-on-write (several kernel tables that differ from process to process are not shared), whereas with exec only the code pages would be shared, and the data pages would be created in memory again. The higher execution speed comes from avoiding the overhead of exec and of the initialization performed by the executable. This technique is used throughout the system. The performance gain is well illustrated by its use in the web server for CGIs written in Tcl: the execution time of a simple CGI, which otherwise took roughly 2 seconds, was reduced to almost nothing (386SX with 2 MB of RAM).
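The fork-without-exec technique can be sketched in C. This is a minimal illustration, not etlinux code: run_script_forked and toy_source are hypothetical names, and toy_source merely stands in for the interpreter's source command. The child calls a function that is already loaded in the process instead of calling exec on a fresh copy of the interpreter, so all of its pages start out shared with the parent.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for "evaluate the script stored in this file". */
typedef int (*script_fn)(const char *path);

/*
 * Run interp_source(path) in a child created with fork(), without exec().
 * Returns the child's exit status (0..255), or -1 on failure.
 */
int run_script_forked(script_fn interp_source, const char *path)
{
    pid_t pid;
    int status;

    fflush(NULL);               /* avoid duplicating buffered output in the child */
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the interpreter is already in memory, shared copy-on-write. */
        int rc = interp_source(path);
        _exit(rc & 0xff);       /* leave without running the parent's exit handlers */
    }

    /* Parent: wait for the script to finish and report how it ended. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Toy "interpreter": succeeds for any non-empty path, fails otherwise. */
int toy_source(const char *path)
{
    return (path != NULL && path[0] != '\0') ? 0 : 1;
}
```

With a real interpreter, the function pointer would wrap whatever call evaluates a script file; the parent could also skip the waitpid call and collect the child later if it needs to keep serving requests.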
It should also be kept in mind that an interpreted language allows rapid prototyping and easy customization of applications. In fact, it is very simple to add new functionality to init or to the web server, which would otherwise be very laborious. We feel that the ability to customize the system in a short period of time is a critical feature of an embedded operating system.
It is precisely with extensibility and ease of creating customized versions of the system in mind that another central component of etlinux was developed: automatic package management. The package mechanism makes it possible to choose which components to include in the final operating system. For example, one can choose whether or not to include a shell, the web server, or any other element. The entire system is based on suitable Makefiles and standard Unix commands such as make, sed, grep, ... In any case, the user doesn't have to deal with any details: all that's necessary is to edit a text file, specifying which packages to include, and to type the command make. This automatically creates a directory tree that reflects the final filesystem of etlinux. All that remains is to transfer the created directory, with its subdirectories, to the disk (or to the Disk On Chip) of the embedded system, and it will be ready to boot! This system is very flexible, as the sources of the packages may be modified and new packages may be added. Furthermore, the largest packages, such as libc, can be kept precompiled in order to avoid long compile times.
etlinux is a constantly evolving platform. As previously mentioned, a version that uses GNU glibc 2.1.2 is under development, which will allow the productive use of multi-threaded programs. At the same time, to enable the development of distributed applications, CORBA support has also been included. A variety of other new features are in the works, all with the goal of making etlinux the platform of choice for developing embedded systems on Linux.
The principal objection that businesses have always raised against Open Source, the lack of rapid and qualified technical support, has been shown to be completely unfounded. In fact, the number of organizations that offer commercial support at all levels is growing. The advantages of adopting a free platform such as Linux are by now so obvious that it isn't difficult to foresee that this expanding phenomenon is destined to gain an ever more substantial share of the market. etlinux is an attempt to bring the benefits of Linux and Open Source software to the embedded systems space.
(1) John Ousterhout, Tcl and the Tk Toolkit, Addison-Wesley, 1994, ISBN 0-201-63337-X
Marco Pantaleoni collaborates with Prosa in the area of embedded systems. In his free time, he works on the development of VHLL interpreted languages such as elastiC (http://www.elasticworld.org). He may be reached at firstname.lastname@example.org.