First, I want to give a short explanation of the background of this document. Several things led me to advocate Linux in the corporate environment.
First of all, it is already four years since I discovered Linux. It is only recently, however, that I really started using Linux itself. I used some GNU tools on the DOS and OS/2 platforms, but only after recently expanding my storage could I install Linux. I printed some manuals, subscribed to the Linux Journal, and I try to read the Linux Gazette frequently. Well, I consider myself almost a fan of the first hour of Linux.
Secondly, since the beginning of 1997 I have worked in a traditional mini/COBOL/database environment, and I have noticed that the people who use these systems find a lot to like in such an environment : it is easy to control and operate, you need only one person to program, it offers background operation, etc. The other side of the coin is that these proprietary systems are expensive. Every year you pay for an expensive maintenance contract, or you pay a high price for repairs and upgrades.
My third reason, last but not least, is that I have never liked Windows in any of its incarnations since 1990. It generated GPFs for unknown reasons in 1990 and eight years later, it still does. It forces people into buying expensive hardware, which then cannot be utilised efficiently (if you don't want to crash).
These three reasons have led me to write three documents, which I want to publish via the Linux Gazette. The reason for this is that I found that the Linux Gazette is also read by people who have system backgrounds other than DOS or Linux, and this is crucial for the objective that I want to reach.
This objective is in essence the same as the one Linus Torvalds states, and it is : world domination. However, I have my own reasons to believe that world domination will not be attained only through the PC, workstation and Internet applications market. I believe that Linux has the potential to compete in the corporate marketplace. Alas, there are still a lot of holes to be filled in before this will come true. However, I also think there is enough potential among the Linux enthusiasts to make this dream come true.
The following text consists of three parts in which I have tried to order my ideas about what Linux still needs to attain the stated goal.
In the first part, I try to compare Linux systems with the mini- and mainframe computers that I know and their architectures, and I appeal to people who might be interested in developing Linux for large systems. I posted it on several c.o.l.* newsgroups, but I did not receive much response (only 1 person seemed interested).
This document should be thoroughly cleaned up and restructured. The main reason that I send it over the Internet as is, is that I want to know how much response it generates. If there is no interest whatsoever, then the project will be cancelled. If you made it through here, please read on. Any ideas for a good working title or something like that are always welcome.
This document doesn't have the status of a HOWTO. If I were to assign it a status, then it would be something like an RFC, although not that official.
I apologise if things are not always clear. I need to document some parts with graphics to provide a clearer understanding. It should probably also be created as an SGML file, to allow more processing.
Although this paper is sent to different Linux newsgroups, it would be best to pick just one newsgroup to communicate about this document.
This document is by no means complete. It attempts to define a framework to develop and deploy Linux as a mainframe operating system. If any ideas in this document have duplicates somewhere else in the Linux development community, I would be glad to know of them, so that
This document is for the moment completely my own responsibility and my own copyright. It may be distributed everywhere, but I am the only one who may change it. Please, send questions and suggested changes to my email-address jurgen.defurne at scc.be. All trademarks acknowledged.
I intend to put much time into this project. I have a fine, regular job working daytime as a COBOL programmer, so time should not really be a concern.
The ideas in this document are a reflection of my own experiences in working with computers and things that I have read about in a whole bestiary of publications (magazines, books, RFC's, HOWTO's, The Web, Symposium records, etc...). The basis is this : Linux is highly scalable. For me, it has proven to be far more scalable than any MS product. I run Linux on the following systems :
Some of these systems are interconnected, others not (yet). With the use of telnet, X and TCP/IP it is possible to use these systems together, to run tasks on different systems etc. But I want more. What I would really like is that these interconnected systems can be viewed as one single system, with a common address space, and where their individual resources are added together to form a more powerful computer. The main target would be to make it possible to introduce Linux in environments where traditional minicomputers are used for data-entry and data-processing. This may sound like a pretty ambitious goal. I don't know if it is. What I do know is that these are environments where high availability is a top priority (Note 1).
Another reason to do this project is the fact that at the beginning of the year Tandem built a mainframe computer using 64 4-way SMP systems, NT, their own interconnection software and Oracle Parallel Server. Why shouldn't we be able to do something similar ?
This document must describe not only software, but also hardware and system procedures. I hope to revise it very regularly. I would like it to contain links to used source code, schematics, construction plans, all used sources and a history and possible planning of the project. It should also give people who want to make money from Linux the possibility to do this on a professional level. That is, they should be able to help companies with processing requirements to assess their needs, give advice on required hardware, install and implement the system and provide service, maintenance and education.
I haven't had a regular programming education. I am an electronics engineer. After school I got into microcomputers and programming and I broadened my education with courses on business organisation and industrial informatics. My experience in the mini/mainframe world dates back only to January 1997. At first I worked in a WANG VS (Note 2) environment; now I am still working as a WANG programmer, but the WANGs serve as front-end input processors to the mainframe (Bull DPS8000/DPS9000) and as legal document processing systems. In my first job, the WANG VS minicomputer was used more as a production mainframe system.
Now, what do these systems have in common ?
The main difference between the mini and the mainframe is in the operation of the system. The four main tasks that have to be done on a computer system are administration, exploitation, production maintenance and development. On a mainframe these tasks are done by different people, on a mini these tasks can be done by one person, or shared, but you don't need full time personnel for the different tasks (except for programming, that is). The system running on a mainframe can be sufficiently complicated that some tasks or operations may only be done by some trusted personnel.
Operating the system comprises the following tasks :
Basically, the ability to handle tasks efficiently and quickly. If you want to know more about the chores of operating systems, there is enough literature available (see literature list). The basic problem in running a large computer system is the difference between batch operations and interactive or real-time operations. You want batch programs to be executed as fast as possible, and you want a fast response time for the other kind. The basic problem with PC's versus mini/mainframe computers is that the IO structure of the PC is very primitive. This is starting to change, first with VESA, now with PCI, but it still comes nowhere near a minicomputer. Basically, these systems always have a separate internal processor (or more than one) on the IO bus to handle data transport between devices and the memory. With I2O, this should become available to the PC world, but it is still proprietary and not available to Linux and/or Open Source developers.
Tasks compiled for x86 architectures also tend to use more memory. Let's take some examples from minicomputers and mainframes that I know about and whose documentation I have access to.
| System | Main memory | Clock | Bus size | Users supported |
|---|---|---|---|---|
| WANG VS 6 | 4 MB | 16 MHz | 16 bit | 32 |
| WANG VS 6120 | 16 MB | 20 MHz | 32 bit | 253 |
| WANG VS 6250 | 64 MB | 50 MHz | 32/64 bit | 253 |
| WANG VS 8000 | 32 MB | N.A. | 32/64 bit | 253 |
| BULL DPS9000 | 2 x 64 MB | N.A. | N.A. | N.A. |
| BULL DPS8000 | 2 x 32 MB | N.A. | N.A. | N.A. |
These systems are smaller than PC's in terms of memory, yet they support more users and tasks than a PC would do. I wouldn't use my Toshiba portable to support ten users on a database. Yet, that is what the WANG VS 6 is (was) capable of, with the same characteristics.
This is for the moment my main criticism of standard PC's and their software : they are extremely inefficient. The first inefficiency comes from the methods used to lower the price of a PC : the CPU is responsible for data transport between devices and memory. You have DMA available, but it isn't very efficient. The second inefficiency comes from the software mostly used on PC's : it takes up much space on disk and in memory.
A third inefficiency is in the software itself : it has so many features, but these aren't used much. The more features in the software, the less efficient it becomes (Note 3).
There is another thing to be learned from mini/mainframe environments : keep things simple. I don't think the current desktop/GUI environment is simple. It doesn't have a steep learning curve, but what you have are super souped-up versions of what are basically simple programs. When writing programs or designing systems, one should always keep in mind that after a certain point it costs more effort to add more functionality to a program, while this functionality decreases efficiency.
In the area of parallel computers there is the Beowulf system and associated libraries. Their basic target is parallel processing for scientific purposes, while my purpose is business data processing. As I see it, some of their goals walk parallel with mine, especially in the areas of existing bottlenecks : the network, distributed file access, load balancing etc. However, the way business programs are run differs from scientific computing. MPP is also more in the way of creating a computer to run really big tasks, while on a business machine you have logins from users for data querying, transactional processing, batch processing of incoming data, preparing outgoing data, establishing communication with other systems. In this sense, what we are looking for is not to distribute one task over several computers to speedup processing, but to serve up adequate processing power, data manipulation facilities and information bandwidth for a large number of users. These goals need different OS support than MPP.
I have studied the Beowulf structure (a Beowulf HOWTO is available on the Internet). The Beowulf structure is an MPP system in which only one computer effectively runs the application. All other nodes in the system are slaves to this one CPU. This is why the Beowulf system is only partially suited to attain my goal.
We need to start with a set of completely defined Linux operated computers, from now on called CPU's, which are somehow connected to each other by means of an abstract communications layer or CL. This CL can be implemented using serial connections, Ethernet, SCSI or anything else that we can devise to make CPU's talk to each other. A CPU may be a single-way computer or a multi-way SMP computer.
I think the end point should be to view the system as one single entity. To do this, the following requirements should be met :
One of the fundamental changes in the OS should be the way exec() operates. When exec() starts a new process, this could be on any CPU. The original links need to be preserved and processes should end in the same way as always.
Interprocess communication is straightforward, I think. What I would like to know is whether it is worthwhile to strive for a system view in which all memory is mapped into one address space ? (Idea behind it : provide every CPU with the same view of the system : its OS, followed by the memory pools of all other CPU's mapped into the same address space). This is what NUMA (non-uniform memory access) is about. Can the Linux community attain this subgoal, or does it need too many specialised resources ?
Some key parts of Linux should be redesigned or replaced by fault-tolerant parts. The largest part which comes to mind is the file-system. A few months ago I had a nasty experience. A connector on the cable of my SCSI subsystem had a defect, with the consequence that the system all of a sudden froze completely while I was busy using X-Windows. The trouble with e2fs is that on these occasions the whole filesystem gets corrupted. This should be made more sturdy.
The other part is that the system may not freeze on these occasions. It should be possible to provide a bare minimum of functionality, e.g. that the kernel takes over completely and switches to text mode to provide diagnostic information or tries to create a core dump.
Another problem that I have encountered is the lack of reliability when a harddisk drive gives trouble. What happened to me was that on using an old SCSI drive the kernel and/or e2fs started to write strange messages when I tried to use the disk. When the system encounters problems with devices, the problems should be logged, the operation should be stopped and informative messages should be displayed.
Another key feature in the area of HA should be the tolerance of the complete system when a CPU is missing. A CPU may only be added when it passes the self test completely and finds out that everything is working fine. When a CPU quits while being in the system, there should be possibilities to restart processes which have been interrupted. For this one should provide the programmer with features to help with these problems : a transactional file system, checkpoint functions (other ?).
The last idea I think of is maybe the possibility of swapping a complete task between two CPU's. A task consists of CODE and DATA. You don't need to save CODE. DATA can be completely swapped to harddisk. If you have a way to transfer the process information from one CPU to another, then it should be possible to reload CODE and DATA and restart the process on another system.
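The checkpoint-and-restart idea can be tried out in user space. The following is only a minimal sketch under strong assumptions : the DATA of a task is reduced to a plain dictionary, the "transfer" between CPU's is a file on storage that both can reach, and the names checkpoint, restore and run_task are hypothetical. A real implementation would have to capture registers, open files and memory maps inside the kernel.

```python
import os
import pickle
import tempfile


def checkpoint(state, path):
    """Save the task's DATA atomically: write a temp file, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)  # the old checkpoint stays valid until this point


def restore(path, default):
    """Load the last checkpoint, or start from default if none exists."""
    if not os.path.exists(path):
        return dict(default)
    with open(path, "rb") as handle:
        return pickle.load(handle)


def run_task(path, records, stop_after=None):
    """Sum the records, checkpointing after each one.

    stop_after simulates a CPU that quits after handling that many
    records in this run; the function then returns None.
    """
    state = restore(path, {"next": 0, "total": 0})
    handled = 0
    while state["next"] < len(records):
        if stop_after is not None and handled >= stop_after:
            return None  # simulated failure; the checkpoint file survives
        state["total"] += records[state["next"]]
        state["next"] += 1
        checkpoint(state, path)
        handled += 1
    return state["total"]
```

Calling run_task a second time with the same checkpoint path, possibly on another machine that sees the same file, continues from the last saved record instead of starting over.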
There are two targets. The first is the creation of an extension which combines several Linux PC's into one system. Users and processes should get the same view of the complete system as one system. This should also mean that certain administrative chores should depend only on centrally stored and shared information.
The second one is to add more and better managed fault tolerance, preferably more interactively managed.
Well, this is it. I hope that people ask sane questions, that I don't get flamed and that it raises enough interest to advance Linux to a higher level.
This reference list is clearly not finished. I need to obtain more details about some works.
The Linux High Availability White Paper.
The Beowulf HOWTO.
The Parallel Processing HOWTO.
Andrew S. Tanenbaum, Operating Systems: Design and Implementation.
Note 1.
Note 2.
Note 3.
In the second part I try to develop an architecture to extend Linux into a parallel processing system, not for numerical processing like Beowulf, but for administrative data processing.
The goal of this document is to establish the components which should comprise the project which was mentioned in the previous document (Linux mainframes). To do this, a description of the boot sequence will be given, together with the possible failures and the solutions.
Before attempting this, however, I want to give a short summary of the guidelines which should lead us toward the goal of Linux systems which can be deployed in corporate environments.
Minicomputers and mainframes provide reliability and high processing power. The reliability is largely obtained in two ways. The first one is in the design of the system, the second one is the existence of a thorough support department with online help and specialised technicians. The emphasis in this document is on the hardware side of the system.
High processing power is obtained in several ways. They involve the use of cache-memory, wider data-paths, increasing clock frequency, pipelining processing and efficient data-transfer between memory and IO.
On the reliability side the system is dependent on hard- and software. If we are to use currently available parts (motherboards and cards) then the only thing we can influence is the way systems are assembled. Care should be taken to avoid static discharges, by using anti-static mats and bracelets.
On the software side we have the Linux operating system which is very reliable, with reports of systems running for months without erroneous reboots.
However, hardware can fail and in this respect I think that work still needs to be done on Linux. If the error is not in the processor or the system memory, then a running system should be able to intercept hardware errors and handle them gracefully. If at all possible, system utilities should be available to test the CPU, the system memory, the cache and the address translation system.
The Linux High Availability White Paper documents clustering of small systems. Later on in this document, some other techniques will be proposed.
Processing power comes on several levels. On the first level, that of the CPU and the main memory, we can't do much. With current motherboards with bus speeds of 66, 75 and 100 MHz, we get data transfer speeds between memory and CPU of 264 MB/s, 300 MB/s and 400 MB/s. These should be sufficient for most applications. Memory is cheap, and sizes of 64 to 128 MB should also give headroom for large applications.
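The quoted rates are simply the bus clock multiplied by the number of bytes moved per clock. The three figures match a width of 4 bytes (32 bits) per transfer; that width is inferred from the numbers and is not stated above, and the function name is only illustrative. These are peak values, so sustained throughput will be lower.

```python
def peak_transfer_rate_mb_s(bus_clock_mhz, bytes_per_transfer=4):
    """Peak rate in MB/s, assuming one transfer per bus clock."""
    return bus_clock_mhz * bytes_per_transfer


for clock in (66, 75, 100):
    print(clock, "MHz ->", peak_transfer_rate_mb_s(clock), "MB/s")
```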
The largest problem with standard motherboards is that all IO needs to be handled by the CPU or else by a slow DMA system. This means that a large part of the operating system is being used by device driver code. In mini/mainframe systems this is not the case. All IO is handled by separate IO-processors. These IO-processors implement the device drivers and as such free a large part of the central processor.
To relieve the central processor of this burden, there are three solutions. The first one is being implemented by the I2O consortium. It defines standards for intelligent IO-boards on the PCI bus. These boards can transfer the requested data themselves to the main memory of the CPU. The only problem is that as far as Linux is concerned, I2O is proprietary.
I think that two other solutions should be possible. The first, and probably easiest, is to use an SMP motherboard and program the operating system so that one processor is completely responsible for all IO, and the rest of the CPU's do the real work. Another idea, in the absence of SMP, is to use two motherboards, run one with an adapted version of Linux to handle all IO and use the other one to run only applications. The only trouble here is which system will be used to interconnect the motherboards. Especially in the case of mass storage devices, you want to stream the data from the device as fast as possible into the memory of the application. Currently, this means using the PCI bus in one way or another.
Since we, as Linux users, have no insight into the design process of motherboards, reliability should be obtained through good standards of assembly and by implementing redundancy.
To obtain more processing power, the main CPU should be relieved as much as possible from IO. This could be implemented by using SMP or by interconnecting motherboards.
Based on the previous ideas, using several motherboards interconnected by a high-speed network could give us the following benefits :
To obtain these benefits when the system is assembled, some operating system changes need to be provided. It is possible to interconnect computers and make these work in parallel, but all administration must be manually accounted for. So, what we need when the system is booted, is not a vision of several separate systems, but only one system.
When booting the system, all nodes start in the usual way : installed hardware is identified, necessary drivers are run, a connection to the network should be made, NFS drives should be mounted, local file systems should be checked and mounted.
In the case of a normal system, all background processes would be started and users should be able to log in on the system.
When the system should be seen as one complete system, the boot sequence should be modified at this point. Resources which are normally only accessible on one node, should be shareable throughout the system. To build a common view, every node should have access to a common file system. In this file system the directories /dev, /etc and /proc should be accessible by every node.
The directory /dev contains all shared devices. The directory /proc provides access to system structures which should be shared by every node. The directory /etc contains the necessary files to control the system :
Every operating system on every node must be adapted to work via these shared directories.
To control the creation of this shared system, one node will have to be designated 'master'. After the initial boot sequence, every node will have to wait for the master to initialize the network. This initialization can proceed in the following way :
Started processes fall into two categories. Local processes run on the nodes which contain the resources that the process needs access to, e.g. getty, fax drivers, etc. Global processes are independent of hardware and should be able to run on any node in the system.
Any node should also be able to start a new process on the system. By using a load balancing system, all started processes must be evenly divided over all nodes.
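A minimal sketch of such a placement rule follows. It assumes that "load" is simply the number of processes already placed on a node; a real balancer would probably look at CPU, memory and IO usage as well. The names pick_node and place_processes are hypothetical.

```python
def pick_node(load_by_node):
    """Return the least loaded node; ties go to the first name in sorted order."""
    return min(sorted(load_by_node), key=lambda node: load_by_node[node])


def place_processes(nodes, process_names):
    """Assign each global process to the node that is least loaded at that moment."""
    load = {node: 0 for node in nodes}
    placement = {}
    for name in process_names:
        node = pick_node(load)
        placement[name] = node
        load[node] += 1
    return placement, load
```

With three nodes and seven processes this gives a 3/2/2 split. Local processes, which are tied to the hardware of one node, would bypass pick_node and be counted directly in the load of their own node.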
If a master fails while the system is up and running, then the basic coordination of the system is gone. To overcome this problem, a backup master must be defined. This backup master needs to keep an updated copy of all master system information. If the real master should fail then all nodes in the network should block themselves until the backup master has come up. The system should provide dynamic management of nodes. This means that nodes must be attachable by using system calls. This goes via the master, which then adds the system on the network. If a node must be detached, then none of its resources should be in use, otherwise the call fails.
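The bookkeeping described here can be sketched as a small registry. Everything in it is an assumption made for illustration : the class and method names are hypothetical, nodes are plain strings, and "resources in use" is reduced to a counter per node. It shows only the rules, not the network traffic that would carry them.

```python
class NodeBusyError(Exception):
    """Raised when a node is detached while its resources are still in use."""


class ClusterRegistry:
    """State the master keeps, and which the backup master must mirror."""

    def __init__(self, master, backup):
        self.master = master
        self.backup = backup
        self.blocked = False                  # True while no master is available
        self.in_use = {master: 0, backup: 0}  # node -> resources currently in use

    def attach(self, node):
        """Add a node to the system; a node that is already known is left alone."""
        self.in_use.setdefault(node, 0)

    def acquire(self, node):
        self.in_use[node] += 1

    def release(self, node):
        self.in_use[node] -= 1

    def detach(self, node):
        """Remove a node; the call fails if any of its resources is in use."""
        if self.in_use[node] > 0:
            raise NodeBusyError(node)
        del self.in_use[node]

    def master_failed(self):
        """All nodes block until the backup master has come up."""
        self.blocked = True
        self.in_use.pop(self.master, None)
        self.master = None

    def promote_backup(self):
        """The backup takes over as master and the nodes are released."""
        if self.backup is None:
            raise RuntimeError("no backup master defined")
        self.master, self.backup = self.backup, None
        self.blocked = False
```

After promote_backup the system runs without a backup, so a new backup master would have to be designated before the next failure.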
If a node fails when in use then this surely will pose problems. A failure can show itself on the network (network interface problem, processor error) or local. If a process uses a remote device, it will do this by means of messages which are sent over the interconnection network. In the case of malfunction, the addressed node won't (can't) answer anymore. The OS must block the process until the malfunction is removed.
If there are problems in critical parts of the system, device drivers or system processes should not blow up the system or interfere with user processes, but they should have the means to correctly report the problem and block the processes which are using the particular resource. If the malfunction is on a local level (device) then the device driver can return a message stating the error.
The most critical part in the system is the interconnection network. This should be tested and tuned according to system demands. If possible, a fast protocol should be used instead of TCP/IP.
The view every node has of the system should be the same. Devices must be shareable across the interconnection network. The OS should be extended so that the exec() function, which is basic for starting processes, executes on a global level.
Reliability should be built-in and configurable on several levels. A message-based protocol is needed to share devices across the interconnection network.
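To make the idea of a message-based protocol a little more concrete, here is a sketch of one possible message layout. The layout is entirely hypothetical : a fixed header with the requesting node, a device number, an operation code and the payload length, followed by the payload. No existing Linux protocol is being described.

```python
import struct

# Network byte order: source node (2 bytes), device number (2 bytes),
# operation (1 byte), payload length (4 bytes).
HEADER = struct.Struct("!HHBI")
OP_READ, OP_WRITE = 1, 2


def encode(source_node, device, operation, payload=b""):
    """Build one request message for a device on a remote node."""
    return HEADER.pack(source_node, device, operation, len(payload)) + payload


def decode(message):
    """Split a message into its header fields and payload."""
    source_node, device, operation, length = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return source_node, device, operation, payload
```

A fixed, small header like this keeps the per-message overhead low, which matters if the interconnection network is the bottleneck.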
Basically, there are for the moment two interconnection systems which can be used off the shelf.
The first is Ethernet. Based on the money to spend, you can assemble systems with 10 Mbit, 100 Mbit or 1 Gbit networks. Increasing bandwidth means increasing processing power. To obtain the maximum of your bandwidth, the ideal is using an SMP motherboard in which one CPU takes care of all network-to-memory data transport.
The second one which attracts interest in the Linux community, is the SCSI interface. Using modern SCSI cards, up to 16 motherboards could be connected together to provide for parallel processing.
This is the third part. I have compiled some cases where I have participated to highlight some points that need more support in Linux.
Through several enhancements (Beowulf, Coda FS, Andrew FS) Linux gets more and more powerful. But how powerful is powerful really ? Linux is announced and used in more and more places, but there is a serious lack of numbers on the capacity of Linux in different environments and configurations.
This is however a crucial point. In many environments, Linux gets introduced through the reuse of PC's (which is in itself a good point). There are however other environments where the introduction of new hard- and software depends on the provision of hard numbers for acquisition, deployment, education, maintenance, infrastructure and depreciation of systems. This can range from a small office which only needs to cough up the required cash up to a financial institute which has large data processing and communication needs.
In some of these areas Linux probably hasn't even touched anything because those people use computers as a means to an end. The computer itself does not stir their imagination. They have tasks to be done and the computer is their instrument to complete those tasks faster and more precisely. These are the environments which are lured into buying MS products. I know however several people who work in various Wintel environments and none of them are satisfied. Some complaints :
Lock up of course : power users lock up their PC more easily, because they use a lot of applications next to each other.
Unexplainable configuration changes : you enter your office and your application does not start. Reason : some ASCII text file has reverted to a previous state (I had this one several times with the TCP/IP 'services' file).
MS Office for Windows 95 : You can not seem to use Word for large documents (this is a complaint from a user in a large company).
Windows NT : can not be deployed in situations where older applications need access to older and/or proprietary hardware.
I am sure anyone who has ever used the system, knows other bugs.
I think that one of the reasons why Linux isn't more employed in these environments is that it is mostly deployed using a single type of configuration consisting of an IA32 CPU, a PC AT architecture, IDE/SCSI disk subsystem, an Ethernet NIC and standard serial devices. This makes it very easy to use Linux in the following places :
These are technical solutions for technical problems, implemented by technical people. However, for some places, some pieces are still missing and there are places where Linux could be used, but where it is not. The usability of Linux still depends too much on the technical skill level of the user. This should not be necessary. Companies should be able to deploy Linux quickly, efficiently and flawlessly. Introductory courses should be provided. This will mostly mean migrating from Windows knowledge to Linux knowledge. People should be made to understand that there are three pillars in the usage of a computer system and/or program :
On the system level these should be integrated transparently and tightly. A user shouldn't need to go through heaps of paper and manuals to find something quick, so menu driven is probably the best answer for this, with good context sensitive help. I even think that from the point of view of the user, things should be accessible under a heading 'Applications' where all his production programs should reside, and a heading 'Maintenance' where operational, administrative, system maintenance and diagnostic programs are located.
If we want Linux systems to be used more in environments where people are not concerned with their computer per se, but as a means to do their job, then support will have to grow on several levels. To project these levels, I will present some cases more or less detailed. These cases present environments where I have worked, customers which needed support, people I know.
With this I mean the family sized company which provides some basic services (grocer, plumber, carpenter, etc...). At most two persons are responsible for handling all administration. This consists mostly of two parts : accounting and handling of incoming/outgoing messages. The first part of the problem is providing this environment with a suitable accounting package which is applicable for the country where the company resides.
The second part of the problem is handling all incoming and outgoing messages. This requires access to three channels : phone, fax and e-mail (if there are any other options then these are probably too expensive for this environment). Depending on the situation, there could be constraints on the usage of the channels (eg. no channel should block another channel, when answering the phone, the fax and e-mail should not be prohibited and/or prohibit each other). The configuration could probably be extended using a PABX card in the system, to provide extended telephony services via Linux.
Like it or not, but these people have become accustomed to using WYSIWYG word processors and spread sheets, so the least that must be done is provide them with this functionality. There are at least two good packages available for Linux in this respect. Another thing that should be provided is a customer database which is closely linked to the former package. Creating new documents and using fill in documents from a user entry should be a must. Creation and insertion of simple graphics should be an available option too.
If we consider at most two people then the system could be configured using two workstations of the same capacity, where some tasks are shared between each other, or it could be done using one more powerful system, which provides all services, and one cheap PC workstation, configured as an X-server.
File- and print-services, bookkeeping, inventory control
The company where I first worked from 1990 to 1991 had a Novell Netware system installed. We used the system to provide print services for Mac- and PC-systems, as a repository for all kinds of drivers and diagnostic software and as a shared database via the bookkeeping and inventory control program. Everybody who needed access to the network had his or her own PC or Mac. We mostly used DOS back then, although with the introduction of Win 3.0 some people migrated to it. Everybody had access to a phone and there was one central fax in the administrative department. We installed and maintained PC's and Mac's for graphical applications. These applications provided output for typesetting printers (mostly via Postscript) or plotters. The supported applications were Adobe Photoshop, Aldus Pagemaker and AutoCad. We were also a reseller of the bookkeeping package that was used on the network.
The printing could be spooled to several large laser printers, a high-speed dot-matrix printer and a photographic typesetter.
File services under Linux are probably the easiest of problems. I networked, recompiled, linked and started a small TCP/IP network using two computers in less than an hour. NFS is very comprehensive, as are telnet and other TCP/IP services. If you need to provide only a central server, then the following things need to be done :
For the workstations the following needs to be done
The main difference between Novell and NFS is in the administration. On a Netware server, all administration is kept central to the server. The only thing which needs to be done on a workstation is load an IPX driver at boot time. On a TCP/IP workstation, some administration is kept centrally and some administration is kept locally. This makes the process of maintaining and updating the network more laborious.
Installing print services under Linux is generally much harder than under Netware. This is because all settings have to be added manually to the printcap file using a text editor. But, since this is a very structured file, with a rather small set of commands, why hasn't anybody ever written a dialog system to scan printcap and present the user with an overview of available printers and the possibilities of adding and modifying printers and their settings ? This would be a great step forward in installing printers. Filters for different types of printers could be presented, so that the configuration on the network could be simplified (as an aside, Red Hat provides such a system).
The other part of printing is the operation of the queues. The lpd system provides only command line control. But since this system is also very well understood, why haven't there been any attempts to rewrite the lpd system for menu-driven operation ? After all, entering a command or pressing a function key can invoke the same behaviour. All queues and printers can be presented to the user, with the possibility of providing more details.
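To illustrate how little is needed to scan printcap, here is a minimal Python sketch that reads printcap-style text and builds the kind of overview such a dialog system could present. The sample entries, the function names and the overview layout are assumptions made for this illustration; only the general entry layout (names separated by |, capabilities separated by colons, a trailing backslash to continue a line) is taken from the printcap format itself.

```python
# Hypothetical sample entries, invented for this sketch.
SAMPLE = r"""
# local laser printer
lp|laser|HP LaserJet:\
    :lp=/dev/lp1:\
    :sd=/var/spool/lpd/lp:\
    :mx#0:\
    :sh:
remote:\
    :rm=printserver:\
    :rp=typesetter:\
    :sd=/var/spool/lpd/remote:
"""


def parse_printcap(text):
    """Return {printer: {"aliases": [...], "caps": {...}}} from printcap text."""
    # Step 1: drop comments and blank lines, join continuation lines.
    entries, pending = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            pending += line[:-1]
            continue
        entries.append(pending + line)
        pending = ""
    if pending:
        entries.append(pending)

    # Step 2: split every entry into its names and its capabilities.
    printers = {}
    for entry in entries:
        fields = [field.strip() for field in entry.split(":")]
        names = fields[0].split("|")
        caps = {}
        for field in fields[1:]:
            if not field:
                continue
            if "=" in field:            # string capability, e.g. sd=/var/spool/lpd/lp
                key, value = field.split("=", 1)
                caps[key] = value
            elif "#" in field:          # numeric capability, kept as text, e.g. mx#0
                key, value = field.split("#", 1)
                caps[key] = value
            else:                       # boolean capability, e.g. sh
                caps[field] = True
        printers[names[0]] = {"aliases": names[1:], "caps": caps}
    return printers


def overview(printers):
    """Return one human-readable line per printer, sorted by name."""
    lines = []
    for name, info in sorted(printers.items()):
        caps = info["caps"]
        # A local printer has a device (lp), a remote one a host and queue (rm, rp).
        target = caps.get("lp") or "%s@%s" % (caps.get("rp", "?"), caps.get("rm", "?"))
        lines.append("%s: %s (spool %s)" % (name, target, caps.get("sd", "?")))
    return lines


if __name__ == "__main__":
    for line in overview(parse_printcap(SAMPLE)):
        print(line)
```

A real tool would also have to write modified entries back and cope with values that themselves contain colons, which this sketch does not attempt.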
The accounting program was written in Clipper and did not use Btrieve. This means that all access to the data in the files generated a lot of traffic over the network. This was alleviated by segmenting the network in three parts so that the accounting department didn't interfere with the other departments. The whole package ran under DOS. The company which programmed the package made the transition from Clipper to FoxPro in 1994, and only as recently as 1997 did they make the transition from DOS to Windows (with the DOS version still being sold and supported).
This presents us with a case of providing support for migration of xBase dialects to Linux, while adding value to these languages through transparent client/server computing. There should also be support for people migrating from these DOS-based systems to Linux. There are a whole lot of programmers who work alone and who make a living by writing and maintaining small database applications for SOHO users (using xBase and several 4GL tools which run under DOS). Providing incentives and support for these people to migrate and to help their customers migrate could give a double benefit to Linux. The key lies of course in whether support for these tools, or conversion tools for them, becomes available under Linux.
Printing support under Un*x and hence Linux has always been strongly oriented towards typesetting. Providing support for PostScript should not be a problem under Linux. Adding a typesetter should be as easy as installing a printer on a server or on the network via a print server. There are already some strong graphical packages available for Linux. In this case, migration is a question of importing and/or converting graphical files and showing the user how to do the tasks he normally does with the new application.
Plotting and/or cutting should be the same as printing. The application program is responsible for translating its own internal drawing database into a format that can be used by the addressed peripheral.
Drafting departments are a case where networking and central storage are really put to the test. The system consists of a drawing database, which is a front-end to the drafting programs. Users should be able to look at drawings, create, edit, delete and print drawings and collect usage statistics about drawings. In addition, only one user should be able to edit a drawing or part of a drawing at one time, and it should be possible to see who is editing what. If this all sounds like using a file system, then you are right. The difference is that you only use one type of file. I worked on one such system in the previous case. It was written using Clipper as a front-end. I know of other environments where AutoCAD is used, but under a WinNT network, and there are some companies who deliver complete turnkey solutions consisting of powerful minicomputers and proprietary workstations for real high-end drafting work.
Providing the incentive to migrate to Linux consists in providing a powerful server with large storage to accommodate all the drawings and a fast network to deliver them to the workstations. All workstations should be tuned to the max to deliver the utmost in graphic display and manipulation. Of course, utilities are necessary to convert the original drawing database and all the drawings. Networking should be flawless, and the program which uploads the drawing should provide an indication of the time necessary to get the file and where it is in the process.
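The requirement that only one user edits a drawing at a time, and that everybody can see who is editing what, can be sketched with plain lock files. The Python below is an illustration under assumptions, not a description of the Clipper system mentioned above : the function names, the .lock suffix and the user names are all invented here. Exclusive file creation may also not be dependable on every network file system, so a real drawing database on a server would need something sturdier.

```python
import os


def lock_path(drawing):
    """Name of the (hypothetical) lock file that guards a drawing."""
    return drawing + ".lock"


def acquire(drawing, user):
    """Try to reserve a drawing for editing; return True on success."""
    try:
        # O_EXCL makes the creation fail if the lock file already exists.
        fd = os.open(lock_path(drawing), os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(user)          # remember who holds the lock
    return True


def owner(drawing):
    """Return the user editing the drawing, or None if nobody is."""
    try:
        with open(lock_path(drawing)) as handle:
            return handle.read().strip() or None
    except FileNotFoundError:
        return None


def release(drawing, user):
    """Give the drawing back; only the current owner may do so."""
    if owner(drawing) != user:
        return False
    os.remove(lock_path(drawing))
    return True
```

A drawing front-end would call acquire() before opening a drawing for editing, show owner() in its listing, and call release() when the drawing is saved and closed.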
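The progress indication asked for above comes down to simple arithmetic on the amount of data already received. A small Python sketch, assuming the transfer program knows the total size of the drawing and measures the elapsed time itself (the function name and its parameters are invented for this illustration) :

```python
def transfer_progress(total_bytes, done_bytes, elapsed_seconds):
    """Return (percent_done, estimated_seconds_left).

    The estimate is None as long as no data has arrived, because no
    transfer rate can be computed yet.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent = 100.0 * done_bytes / total_bytes
    if done_bytes <= 0 or elapsed_seconds <= 0:
        return percent, None
    rate = done_bytes / elapsed_seconds           # average bytes per second so far
    return percent, (total_bytes - done_bytes) / rate
```

The estimate uses the average rate since the start of the transfer, which is the simplest choice; on a busy network a rate measured over the last few seconds may follow reality more closely.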
This pertains to my previous job : a small transport company, which had decided ten years ago to implement a computer system to automate several tasks and to keep a database of all completed transports. They had chosen WANG VS, which was back then a successful system with many advanced features. Custom software had been developed first by an outside company, later by an in-house programmer. The system contains a very comprehensible fax package, which can be used by anyone, yet has strong security features. All outgoing messages are put in one queue, where the operator can change their times and/or priorities. All communication with the minicomputer is via terminals or via emulation cards in PCs. Accounting is also done on the minicomputer, but the two systems are not linked. The system is also equipped with a background task which controls batch tasks in a queue.
There are many medium-sized companies which still use minicomputers and which have a problem shedding them, due to their highly specialized software. Migration from a Un*x system to a Linux system should not pose as many problems as migrating from a completely proprietary system to Linux.
The main problem with these mini-computer systems is their high maintenance cost. That should be the most pressing reason to migrate, although Y2K could also be an incentive (not so with WANG VS, which is fully Y2K compliant).
To provide the same functionality a DBMS package should be available which provides a data dictionary, a screen design package and a COBOL74 compiler with a preprocessor to translate simple SQL SELECT statements. There are several packages available. One package aids in the migration from WANG PACE (the WANG DBMS) to Oracle (at the moment Oracle has only announced a port of its database to Linux), while Software AG has tools to port WANG PACE applications and screens to ADABAS. As for the compiler, where I currently work the porting is done from WANG to HP-UX using Micro Focus COBOL. The security features of the database package should at least contain rollback recovery. The provided file-system should absolutely not be e2fs. Reliability should be favored over speed. When the power fails the file-system itself may be damaged, but this damage should be simple to clean up. Damage in transactional files is to be repaired with the rollback option.
On the hardware side, I noted that SCSI II provided enough speed to handle some 20 users, but ... this was a system with a specialized IO-processor to handle all data transfers between main memory and all peripherals. To know how Linux fares in this, benchmarks should be run and numbers should be provided. In our last configuration (a 50 MHz CPU with 64 MB), under a heavy load, our response time was under 10 s.
Fax support must be provided to interactive applications, but also to batch applications.
Batch processing of all tasks should be supported. Some programs can be started, used to enter selection data and then launched at will in the background or in the foreground at a time and day the user can enter. cron is fine for highly skilled people, but not for your data-entry clerk, so you need a front end which asks the date, time and repetition rate of your job. The application itself should be able to provide the required parameters.
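The core of such a front end is small : it asks for a date, a time and a repetition rate and turns them into the five time fields of a crontab line (minute, hour, day of month, month, day of week). The Python sketch below shows only that translation. The function name, the three repetition rates and the command path in the usage example are assumptions made for this illustration, and a job that has to run only once would more naturally be handed to at, which is not shown.

```python
from datetime import datetime


def crontab_line(first_run, repeat, command):
    """Build a crontab line from the first run time and a repetition rate.

    repeat is one of "daily", "weekly" or "monthly" (names chosen for
    this sketch, not taken from cron itself).
    """
    minute, hour = first_run.minute, first_run.hour
    if repeat == "daily":
        fields = (minute, hour, "*", "*", "*")
    elif repeat == "weekly":
        # cron counts Sunday as 0; isoweekday() counts Monday as 1 and Sunday as 7.
        fields = (minute, hour, "*", "*", first_run.isoweekday() % 7)
    elif repeat == "monthly":
        fields = (minute, hour, first_run.day, "*", "*")
    else:
        raise ValueError("unknown repetition rate: %r" % repeat)
    return " ".join(str(field) for field in fields) + " " + command


if __name__ == "__main__":
    # Hypothetical job: run the invoicing batch every week, starting on a Monday evening.
    print(crontab_line(datetime(1998, 6, 15, 22, 30), "weekly",
                       "/usr/local/bin/invoice-run"))
```

The data-entry clerk only ever sees the three questions; the generated line can then be appended to the user's crontab by the front end.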
This company builds cash register systems using mostly common PC hardware and one piece of proprietary hardware which interfaces to a magnetic card reader, a bar code reader, a money drawer and a keyboard/display/pricing printer. The cash register is connected via a network to a server which provides an inventory and a price list. Upon booting, the cash register connects to the network and loads its OS from the server. Every server has the possibility to connect at night to a central database to update its price lists and to order items which are running out of stock.
For the cash register, a multi-user, multi-tasking OS is clearly overkill, while in the case of the server, multiple cash-registers could connect via the network to the server. The cash register would benefit, though, from multi-threading.
Software development for servers and departmental systems is usually done with a 4GL tool, with a higher-level language only for those parts which 4GL does not support.
The production environment of this company consists of 5 WANG VS minicomputers, used for data-entry, data-preprocessing and to connect agencies remotely through a telephone line. It also consists of a Bull mainframe system with two CPUs, 128 MB memory, 240 GB of on-line storage capacity, and a transaction processing system consisting of a network database and a screen editing and runtime program. All this is controlled using JCL and COBOL-74. TCP/IP is implemented between all systems.
Replacing the minicomputers with Linux systems should be relatively straightforward. Since no WANG PACE is implemented on these, only migration of the COBOL-74 programs needs to be done. Data entry and remote connection could be done using telnet and/or serial connections. Transferring data between the mainframe and other systems is no problem : all this happens using FTP.
Now, let us think really BIG! Could a case be made to build a system using Linux, which can replace a mainframe computer, given the specs above ? As said above, more numbers and benchmarks are needed on Linux and its implementations to know how powerful Linux can be.
These cases resemble the SOHO, but additionally need very specialized software to support their job. This software is mostly written by very specialised companies (niche software). What would they need in terms of software and maintenance to be convinced to migrate to Linux ?
One of the answers is surely that they can migrate their existing applications easily and that conversion of their source code is supported by tools and APIs which provide the same functionality as their old tools, or better.
Configuration of these systems may be more specialized. Normally the user would only use his system (enter customers, query the system). All administrative and configuration chores could be left to the implementor. The applications themselves are already as user-friendly as they can be, due to their specialised nature.
I have presented several real-world cases, where Linux IMHO could be used. In most cases there are two recurring themes.
The first is the need for migration support from other platforms to Linux. This support spans a whole range, varying from multi-platform compilers over database migration, up to replacement user applications.
The second is the need to provide more user-friendly administration and operation. This may be through character-based dialog boxes as well as through GUI systems. In any case access to them should be more centralised.
Other themes which pop up are the following :