NOTE: This article is intended for smaller platforms and does not address larger classes of systems such as data warehouses (systems in the 100-500TB range). That is another topic altogether.

January 2000

Purchasing Memory for a UNIX System

Configuring the hardware for a UNIX system is nearly an act of dark sorcery. Putting together an entire system requires an incredible amount of research, and that research takes time. This column will not go into great detail about building entire systems; instead it looks at one aspect of them: memory. The difficult part of the initial memory purchase is not actually the purchasing, but figuring out precisely how much to buy. On certain platforms this is not an issue because memory is cheap (many Intel architectures, for example). Such is not the case for high-end RISC architectures, where memory can cost up to 3,000 dollars for 512MB (depending on the type). Additionally, when budgetary and political constraints begin to take shape during pricing (as they always do), memory is almost always scrutinized, and the question "do we really need this?" will inevitably arise.

The Not So Special Cases

Right off the top, there are a variety of services that do not require an intense amount of resources unless they are put into an unusual situation. An internal DHCPd or DNS server, say, is unlikely to need a great deal of memory. The same goes for file, print, and static or low-overhead HTTPd services. These types of services do not incur a great deal of overhead, and a standard memory-to-processor matching configuration will get you by without great difficulty.

Databases and Networks

In my own experience the number one system killer is networking: accidental denial of service from other systems or appliances that, for lack of a better way to put it, royally suck. Connection timeouts, ICMP flooding, print server outages, and bone-heads driving back-hoes through fiber optic cables 2,000 miles away from your site. Unfortunately, these types of problems will persist (at one level or another) for some time to come. The best way to beat them is to have a system that can not just "handle the load" but handle the load and then some. A classic case in point: older versions of NFS were not exactly the strongest implementations around (depending on the system, of course). I noticed on certain systems that when an imported filesystem dropped, the system would nearly freeze up, regardless of whether or not something like the automounter was running. Granted, more memory would not have helped that particular system, but having extra memory on a newer one gave the system more resources in general, so user memory space was less affected when something of that nature occurred.

Nearly parallel to network problems (and many times intrinsically related) are databases. As time has worn on, some of the newer databases (Open Source versions excluded) have become inefficient to the point of acting like some sort of insidious DoS attack. Given enough memory, many databases will reside almost completely within it, which appears somewhat inefficient from a systems perspective [1]. The real catch with databases is that they are normally accessed via some network interface, so the potential for memory problems in this kind of situation is enormous. Luckily, I have never really had it happen. Yet.

Multi-Purpose Systems

Despite the push towards "one system for this and one system for that" architectures, many systems still have to run a lot of services even when the system is dedicated to one specific thing. When building a system and deciding on memory, all of the possible factors, a few unknowns, and some extra padding have to be tossed in. For example, last year the standard database server spec was roughly 1GB of RAM per processor. Many database systems also had web interfaces to the database(s), so one has to account for that, and data may be coming from sources external to both the web interface and the database itself. In addition, you would need extra horsepower for monitoring, database changes, and security. Remember, the spec (in most cases) is designed for optimum running performance, not optimum running performance while something like a findsuid security sweep runs at the same time.
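
As a rough illustration of how those factors stack up, here is a minimal sketch in Python. Only the 1GB-of-RAM-per-processor database figure comes from the spec mentioned above; the allowances for the web interface, monitoring and security, and padding are assumed numbers, used purely to show the shape of the estimate.

    # Hypothetical memory estimate for a multi-purpose database server.
    # Only the 1GB-per-processor database figure comes from the spec above;
    # every other allowance is an assumed number for illustration.

    def estimate_memory_mb(num_cpus: int) -> int:
        database = num_cpus * 1024      # spec: roughly 1GB of RAM per processor
        web_interface = 256             # assumed allowance for the web front end
        monitoring_security = 128       # assumed allowance for monitoring and security
        padding = 256                   # assumed padding for unknowns
        return database + web_interface + monitoring_security + padding

    # A two-processor box under these assumptions:
    print(estimate_memory_mb(2), "MB")  # 2688 MB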

Of course, the above observations are based solely on some rather large systems I have maintained. The large commercial databases, for example, had nearly 16GB of live data on a single server.

Workstation Architectures

Workstations (or home UNIX systems) can be really easy or really hard, depending on the experience of the user. When my last computer finally keeled over and died, I knew precisely what I wanted, what I could take away, and where I could fudge. I also had the added benefit of being on my umpteenth purchase and configuration of a system, so it was not much of a problem. Believe it or not, the user of a specific workstation will most likely know exactly what they need and can tell you exactly what they want (and obviously there will be a big difference between the two). Most of the time, memory is not a big deal with UNIX workstations, either because it is cheap (as with UNIX systems on Intel) or because it simply may not be required. The exceptions to the rule, of course, are graphically intense applications such as engineering tools and high-end graphics workstations like SGI's Octane.

Piling It All Together

The hard part is piling it all together. The average base memory on a big UNIX box is normally a sensible starting point; take, as an example, an HP9000 with 512MB of base RAM. Then start adding for that massive database (anywhere up to an additional gigabyte of memory), then make room for user processes and a little padding, and the next thing you know you are sitting at 2GB on a single-processor machine. Sound unrealistic? Not for a big-iron UNIX system.
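
To make that tally concrete, here is a minimal sketch using the numbers above. The base and database figures are the ones from the HP9000 example; the user-process and padding allowance is an assumed value, filled in only to land at the 2GB total.

    # Rough memory tally for the single-processor HP9000 example above.
    # Base and database figures come from the text; the user-process and
    # padding allowance is assumed, chosen to reach the ~2GB total.

    components_mb = {
        "base RAM (as shipped)": 512,       # HP9000 base configuration
        "database allowance": 1024,         # "up to an additional gig"
        "user processes + padding": 512,    # assumed figure for illustration
    }

    total_mb = sum(components_mb.values())
    for name, size in components_mb.items():
        print(f"{name:26s} {size:5d} MB")
    print(f"{'total':26s} {total_mb:5d} MB")  # 2048 MB, i.e. 2GB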

The Art of Guesstimation

You can obviously see how difficult estimating memory can be. It is one great big "it depends," but when all else fails, go out to a newsgroup and ask around. You can get plenty of different perspectives and insight into what you may or may not need. In addition to those there are specification lists; some companies are generous with them, while others understate what is really required (many PC game companies have a bad habit of doing so). All of the information is there, free for the taking.

Footnotes and Comments:

  1. In fact, it is more efficient, since the overhead involved in retrieving data from disk has worse effects. Additionally, what resides in memory may not be the actual data but rather pointers to the data, control information, and so on.