NTQ
New Telecom Quarterly
The Evolution of the Interactive Broadband Server, Parts 1 and 2
 
Joan Van Tassel, Ph.D., Consultant
Steve Rose, Pangrac & Associates
 
 
 
Original Articles Appeared
1Q96 & 2Q96
 
Originally Published by Technology Futures, Inc.
 
 
 
 
The Evolution of the
Interactive Broadband
Server-Part 1

Joan Van Tassel, Ph.D. and Steve Rose

Dr. Joan Van Tassel is the author of Advanced Television Systems: Brave New TV, published by Focal Press (1996). She writes for The Hollywood Reporter, WiReD, and other publications on the emerging communications infrastructure. Prior to entering graduate school, she was a documentary filmmaker, known for her work on social issues. Dr. Van Tassel has written for a variety of nonprofit public information campaigns, including the "Friends Don't Let Friends Drive Drunk" campaign. She is the recipient of the prestigious Kenny Rogers Media Award, an Emmy nomination, and many public service advertising awards. Dr. Van Tassel received her Ph.D. and M.A. from the Annenberg School of Communications at the University of Southern California.

Steve Rose is an interactive broadband server consultant with Pangrac & Associates. He started his career in the 1970s in video, became the chief engineer of a small Maui cable company, and ended the 1970s designing cable television automation equipment for CRC Electronics. CRC provided the tape automation equipment for the first Warner Qube system. CRC was later purchased by Texscan, so during the 1980s, Steve was a computer guy, working with multiuser and networked computers. In 1991, ATC (later Time Warner Cable) asked him to figure out how to provide video on demand for a system with a million subscribers, which launched him on a new career. In 1994, he worked with CableLabs on their Media Server RFI, visiting various server manufacturers and giving a tutorial at the resulting CableLabs meeting.
 

This is Part I of a two-part article discussing video enabled servers for metropolitan areas. The division is based on two architectural approaches for building interactive broadband servers. In the first part, we discuss developments leading to the creation of these devices, and then turn to servers that are constructed by aggregating conventional single bus computers with other necessary components.

In Part II, we cover servers based on massively parallel architectures and describe what we feel is the most appropriate model for future architectures. While we have tried to be objective, we do have a point of view that we will develop in the second article.

This article is predominantly hardware oriented. Although common software platforms are important, the nature and scale of the task at hand make the selection of appropriate hardware more immediate to the successful deployment of an interactive broadband server.

 

 

 

What is an interactive broadband server (IBS)? It is a device that delivers many different kinds of data and provides many simultaneous services. These may include:
 

In addition, the IBS must accurately track, account for, store, and bill for all services while providing for network management. Other terms used to describe a fundamentally similar device have been Video Server, Media Server, and Metropolitan Media Server.

It is clear that the IBS is at the heart of the new interactive broadband businesses that companies in the cable television, telephone, computer, and wireless industries plan to launch. Attractive programs, services, and applications such as high-speed Internet access over cable, video on demand, interactive shopping and games, and many others depend on the availability of a reliable, cost-effective IBS.

The primary difference between an interactive broadband server and a large conventional computer server, used in many organizations for client/server purposes, is the ability of the IBS to provide thousands of simultaneous isochronous data streams. Isochronous ("same time") refers to data streams that are time-sensitive and must be delivered continuously without interruption or they become incoherent. An example of isochronous data streams would be real-time video and audio that are retransmitted as soon as they are received, such as a live television signal.

Stakes and Stakeholders

Many different interests are watching the evolution of the IBS. Cable, wireless cable, telephone, and computer companies all want to provide new programs, services, and applications to their customers. Equipment suppliers and content providers hope to provide products when dependable standards for them are available. Finally, regulators and consumer groups want to clarify such issues as universal access, rate structure, privacy, and security.

The high level of interest is a consequence of the anticipated size of the video on demand market. Existing markets are substantial. The advertising revenues for broadcast and cable television were about $30 billion in 1995, according to estimates by the Television Advertising Bureau. The National Cable Television Association reports that cable revenues were $26 billion in 1995. Conservative estimates from Satellite Business News magazine indicate there will be 12 million to 13 million direct broadcast satellite (DBS) subscribers by 2000. Assuming an average bill of $40 a month, satellite delivery revenues would be more than $6 billion by the end of the decade. Wireless cable revenue is expected to be $600 million in 1996. Finally, Paul Kagan & Associates report that the videocassette rental market earned $10 billion in 1995, and sell-through of videos to consumers was about $6 billion.

Packaged programming for other stand-alone devices also brings in significant revenue. Consumers spent $6 billion for interactive games in 1995, says research firm DataQuest, and about $8 billion to $9 billion for dedicated game players. The computer is an increasingly profitable venue for video material. There are now 25 million CD-ROM-equipped multimedia machines in consumer hands worldwide and more than 5,000 titles. DataQuest estimates that the market for CD-ROM games is now about $660 million and growing rapidly.

Based on these figures plus a large grain of salt, the video market was more than $60 billion in 1995. While that is less than the telephony market (about $100 billion annually) and even less than the income of power utility companies (about $200 billion), it is still a substantial enough figure for the various stakeholders to concern themselves with the design, deployment, and implementation of an interactive broadband platform, of which the server is a central element.

Requisite Early Developments

In order to understand the evolution of interactive broadband servers, it is helpful to understand the environment which made it possible, and the elements designers had to work with in building them.

Broadband Network Environments

Broadband networks have evolved over the last 10 years, making it possible to deliver a separate video data stream to each connected subscriber. All the new interactive broadband network technologies offer similar deliverable bandwidth expansion and the ability to carry data upstream and downstream. Most of the innovations have come from the cable and telephone industries, each of which has tried to preserve as much as possible of its existing infrastructure. For the cable industry, it is coaxial cable; for the telephone industry, it is twisted pairs of copper wires. In both cases, the final link into subscribers' homes represents the greatest investment.

The Cable Industry -- Hybrid Fiber/Coax: Time Warner Cable pioneered a technology in the late 1980s, which has become known as hybrid fiber/coax (HFC). It divides existing coax-based systems into neighborhoods of about 500 to 2,000 subscribers each and sends an individual optical fiber from the headend to each neighborhood. No one in the neighborhood is more than four amplifiers away from the optical fiber. By minimizing the number of cascaded amplifiers and upgrading them, it is possible to more than double the bandwidth delivered to each subscriber -- from about 500 megahertz currently to as much as 1.2 gigahertz, while reducing system noise.

Due to the reduced noise and direct connection to each neighborhood, getting information back from subscribers becomes practical and fundamentally one-way systems become two-way systems. The HFC architecture is being widely adopted by cable companies and some telephone companies, and won the group at Time Warner Cable an Emmy in 1994.

HFC is significant because it allowed companies to consider delivering custom material to each household. The greater bandwidth provided by the HFC architecture led the press to refer to "the five hundred channel cable universe." However, this characterization missed the real innovation, which was that operators could deliver 500 different programs simultaneously to each neighborhood node of 500 subscribers. Put another way, the new technologies make it possible to advance from delivering 50 of the same channels to 50,000 viewers to offering an individual channel to each of 50,000 interactive viewers.

Further, because of the way that HFC expands the available bandwidth, cable operators can provide hybrid service: New digital programs can be delivered to individual households over the new bandwidth from a server and long-term storage, while leaving intact the old analog services on the existing bandwidth. This means no changes for subscribers unless they choose to take advantage of new services.

Telcos -- ADSL and Fiber-to-the-Curb (FTTC): The telephone industry had a different asset to protect: a network of twisted pairs of copper wire that took more than 100 years and $1,500 per household to construct. This infrastructure required a different approach, as the bandwidth a twisted pair can support and the distance it can transport a high-bandwidth signal are greatly restricted, as compared with coaxial cable.

As a result, telco designs focused on sending one video signal at a time from the central office to the subscriber over the twisted pair. An example is Asymmetrical Digital Subscriber Line (ADSL) technology, which trades reduced upstream bandwidth for much greater downstream bandwidth. As the limitations of ADSL became apparent, phone companies focused on carrying fiber optic cable deep into the neighborhood, an architecture called Fiber-to-the-Curb. Each FTTC node serves about 20 subscribers over existing twisted pairs from the curb to the home.

Wireless Systems: There are two wireless infrastructures that deliver television. Wireless cable systems (sometimes called MMDS for multichannel, multipoint distribution service) cover about a 35-mile radius and are not likely to become interactive. One reason is that MMDS is typically promoted as a low-cost alternative to wired cable service. In addition, MMDS would need to invest in a cellular-based return path to make two-way communication feasible.

By contrast, LMDS (local multipoint distribution service) systems use a cellular approach, where each transmitter reaches a defined area as small as two or three miles. Nonadjacent cells can carry different content just as with cellular telephony, greatly increasing the effective deliverable bandwidth to the overall service area. Wireless data return from subscribers to the cell site is also possible, because the return path from the viewer back to the cell is so short that it needs only a four- to six-inch antenna and a little power, using the cell structure already in place.

ATM Switching

Asynchronous Transfer Mode (ATM) is the first protocol for data transport that allows the mixing of voice, video, audio, and data signals on the same circuit. Switches that incorporate ATM technology have the ability to switch any input circuit to any output circuit at lower costs, higher bandwidth, and greater speed than any preceding technology. The emergence of ATM provided the mindset for thinking about signal switching for video servers.

Digital Video Compression

Digital video compression allows the transmission of four to twelve programs in the same bandwidth required by one analog channel. Once an analog video signal is digitized, several techniques can be applied to reduce the amount of bandwidth necessary to transmit it. Compression techniques remove information that cannot be perceived by the human eye, encode redundant information so that it is transmitted only once, and then reconstitute the original image at the receiving end. However, the greater the amount of information that is discarded, the more likely it is that compression artifacts will become noticeable.

Several techniques of compression have been developed, including discrete cosine transform (DCT), wavelet, vector quantization, and fractal schemes. However, the dominant family of standards was developed by the Moving Picture Experts Group (MPEG) based on DCT.

Other Important Developments

Robotic Storage Libraries: To compete with the video rental business, an equivalent number of titles must be offered. Since many will be infrequently requested, they must be stored off-line in an automated library. These libraries have been developed for the computer industry, ranging in size from jukeboxes that hold a few hundred optical disks to room-size robots that hold thousands of tape cartridges and optical disks.

Powerful Single Bus Computers: In the computer industry, a battle has been waged for years between harnessing the power of tens to thousands of processors running independently in parallel linked by a communication mesh, versus connecting a limited number of processors running on a single bus. Parallel processors have gained a reputation for being difficult to program. And, the rapid development of ever more powerful single bus processors has allowed them to match the computing capacity of any existing parallel processor during its lifetime. Most successful computer servers are single bus, single processor designs. Some use a single bus, but multiple identical tightly-coupled processors, and are referred to as Symmetric Multiprocessors (SMP).

Real-Time Encryption: Protecting intellectual property, when it is represented as a digital data stream, requires that the stream be encrypted in real time so that each instance of the stream can be separately protected. Isochronous digital video streams pose a particular problem due to their high data rate.

Error Correction Codes: There are two points during isochronous stream delivery at which errors must be anticipated and corrected in advance. The first high-potential point occurs when the data is played from the hard disk arrays. Hard disk drives are electromechanical, so they can be expected to fail more often than electronic equipment. Their failures are critical due to loss of data and/or interruption of service. As a result, a group of standards for increasing storage reliability, collectively known as RAID (Redundant Arrays of Independent Disks), has evolved which allows disks to be grouped so that the failure of a single drive does not affect the output of the array. The penalty for this reliability is typically an increase of 25% in the number of drives required.
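A 25% drive penalty corresponds, for example, to four data drives protected by one extra drive. The following is a minimal sketch of that idea, assuming simple XOR parity, which is only one of several RAID schemes. The 4+1 layout, the function names, and the sample data are illustrative assumptions, not taken from any vendor's design.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the single missing block from the surviving data and parity."""
    return xor_blocks(surviving_blocks)


# Assumed layout: four data drives plus one parity drive (1 extra per 4 = 25%).
data = [b"MPEG", b"-2 v", b"ideo", b"strm"]
parity = xor_blocks(data)
overhead = 1 / len(data)

# Simulate the loss of one drive and rebuild its block from the rest.
lost_index = 2
survivors = [blk for i, blk in enumerate(data) if i != lost_index] + [parity]
recovered = rebuild_missing(survivors)
```

Because the parity block is the XOR of all data blocks, XOR-ing the survivors with the parity cancels everything except the missing block, so the array can keep delivering data while one drive is down.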

The second error-prone point occurs during network transport. Anticipating these errors involves sending enough redundant information that almost any errors caused by transient noise or interference can be fixed on the receiving end. This anticipatory error correction is critical for isochronous data, where there isn't enough time to detect an error and ask for a retransmission of the corrupted data. The procedure is called Forward Error Correction (FEC).
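The principle of forward error correction can be illustrated with a very small code. The sketch below uses a Hamming (7,4) code, chosen here only because it is compact; the article does not name a specific code, and deployed systems generally use much stronger ones. Three redundant bits are sent with every four data bits, and the receiver can repair any single flipped bit without asking for retransmission.

```python
def hamming74_encode(data_bits):
    """Encode four data bits as a seven-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the four data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1  # transient noise flips one bit in transit
repaired = hamming74_decode(received)
```

The cost is visible in the codeword length: seven bits travel for every four bits of payload, which is the kind of overhead the transport channel has to budget for.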

Digital Modulation: In order to transmit digital information over analog channels (important for HFC and wireless networks), the digital information must be changed in format. Digital information has only two levels representing 0 and 1. Modern techniques increase the number of levels through phase and amplitude modulation of each cycle to achieve bit efficiencies of up to six to eight bits per Hertz. Typical of these is 64 QAM (Quadrature Amplitude Modulation), which uses 64 unique combinations to transmit (up to) six bits per Hertz.
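The six-bit figure follows directly from the 64 combinations, since 2 to the sixth power is 64. A short sketch of that relationship is given below; the function name is an assumption made for illustration, and equating bits per symbol with bits per Hertz is the ideal upper bound implied by the "up to" in the text.

```python
def bits_per_symbol(constellation_points):
    """Bits carried by one symbol of a constellation with a power-of-two size."""
    if constellation_points < 2 or constellation_points & (constellation_points - 1):
        raise ValueError("constellation size must be a power of two")
    return constellation_points.bit_length() - 1


two_level = bits_per_symbol(2)    # plain two-level signaling
qam_64 = bits_per_symbol(64)      # 64 QAM
qam_256 = bits_per_symbol(256)    # a denser constellation, for comparison
```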

Network Management: As computer networks have grown more complex, the difficulties of monitoring and controlling the equipment that constitute the network have grown. Software and standards now allow central management of networks that span the globe. Chief among the current standards is the Simple Network Management Protocol (SNMP).

Business Support: The last few years have seen the creation of reliable computer software to accommodate various complex forms of billing. Cable television billing typically includes a flat monthly fee, fixed monthly additions for additional services, plus individual billing for special events. Telephone billing includes flat monthly charges, plus billing on the basis of utilization on a minute-by-minute or second-by-second basis, plus billing on behalf of secondary companies (e.g., long distance providers). Telemarketing services require immediate credit verification and real-time interaction with financial and fulfillment (inventory and shipping) systems. Finally, just as neighborhood shopping centers must monitor the sales of associated businesses to enable billing on a percentage of revenue, so must virtual shopping centers on the network track sales. All of these complex billing models will be required in an interactive broadband network.

Conventional Interactive Broadband Servers

Defining the Task

Regardless of its architecture, an IBS must carry out a precise set of tasks. In addition, it must perform them reliably and cost-effectively. These functions are:
 

A Brute Force Server Model

In order to conceptualize the cost and complexity of a metropolitan IBS, we take a brute force approach (see Figure 1). This model is a heuristic device, not an actual (or even possible) design for a real-world server.

 
Part 1, Figure 1: Brute Force Server Model (Source: Steve Rose)
 
We assume a population of 1,000,000 subscribers, 200,000 simultaneous streams, and a library of 300 movies. Each movie requires about three gigabytes of storage, resulting in about one terabyte of memory to store all the movies. Although it would be too expensive to actually store this content with RAM chips, we will store the movies in RAM, using 16 megabit DRAMs for comparison purposes. This storage will require about 500,000 RAM chips.

To sort each resulting four megabits per second (Mb/s) stream to the viewer requesting it, we hypothesize a switch made of 100 input by 100 output crosspoint switch chips. To actually deliver each stream to the requester, we will need a routing switch with 200,000 inputs (the number of streams) and 1,000,000 outputs (we have to be able to switch any stream to any subscriber). With our 100 x 100 switch chips, we will need 2,000 times 10,000 chips, or 20,000,000 switch chips -- 40 times more than the number of RAM chips!

It is clear that the big problem is not storage, but switching. The conclusion that emerges is: The cost of storage increases linearly with capacity, and the cost of switching increases geometrically with capacity.
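The chip counts in the brute force model can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the function and constant names are illustrative, and the library is rounded up to one terabyte as the text does.

```python
# Figures quoted in the text for the brute force model.
SUBSCRIBERS = 1_000_000
PEAK_STREAMS = 200_000
DRAM_MEGABITS = 16        # capacity of one DRAM chip
LIBRARY_BYTES = 10**12    # "about one terabyte" (300 movies x 3 GB = 0.9 TB)
CROSSPOINT_SIZE = 100     # 100 x 100 crosspoint switch chip


def ram_chips(library_bytes, dram_megabits):
    """Chips needed to hold the library; grows linearly with capacity."""
    bytes_per_chip = dram_megabits * 10**6 // 8
    return library_bytes // bytes_per_chip


def switch_chips(streams, subscribers, size=CROSSPOINT_SIZE):
    """Chips in a full crossbar; grows with inputs times outputs."""
    return (streams // size) * (subscribers // size)


storage = ram_chips(LIBRARY_BYTES, DRAM_MEGABITS)
switching = switch_chips(PEAK_STREAMS, SUBSCRIBERS)
ratio = switching // storage

# Doubling both streams and subscribers quadruples the switch.
doubled = switch_chips(2 * PEAK_STREAMS, 2 * SUBSCRIBERS)
```

The last line shows why switching dominates: doubling the size of the system multiplies the crossbar by four, whereas storage only tracks the size of the library.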

Assumptions

Compressed digital video streams have data rates varying from 1.2 Mb/s to 9 Mb/s, with 4 Mb/s being a typical choice for an excellent quality video plus audio signal. The definition of excellent quality is that the image is equivalent to Super-VHS or Hi8 video and CD-quality multichannel audio. In the remainder of this article, calculations will be based on 4 Mb/s constant bit rate MPEG-2 encoding.

Assuming that an average movie lasts 100 minutes, it will consume three gigabytes of storage. The estimates of the maximum number of subscribers using interactive digital services simultaneously have ranged from 7% to 40% of total subscribers. We will use a figure of 20% to represent the peak capacity design point for our calculations.

We will also assume that the subscriber population for a single server will be between 20,000 and 100,000 subscribers and that the typical size is 50,000 households. This results in a server design that must be capable of back-to-front throughput of 16 gigabits per second (Gb/s) to 80 Gb/s; throughput of 40 Gb/s is needed to serve a 50,000 household system.
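These design figures follow from the stated assumptions: a 4 Mb/s stream, a 100-minute movie, and 20% of subscribers active at peak. A short sketch of the arithmetic, with function names chosen for illustration:

```python
STREAM_MBPS = 4       # constant bit rate MPEG-2 stream
PEAK_PERCENT = 20     # share of subscribers active at peak
MOVIE_MINUTES = 100


def movie_gigabytes(minutes=MOVIE_MINUTES, mbps=STREAM_MBPS):
    """Storage for one title: megabits, then megabytes, then gigabytes."""
    return minutes * 60 * mbps / 8 / 1000


def peak_streams(subscribers):
    """Simultaneous streams at the peak capacity design point."""
    return subscribers * PEAK_PERCENT // 100


def peak_gbps(subscribers):
    """Server throughput needed at peak, in gigabits per second."""
    return peak_streams(subscribers) * STREAM_MBPS / 1000


movie_size = movie_gigabytes()
small, typical, large = (peak_gbps(n) for n in (20_000, 50_000, 100_000))
```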

Storage and Importation of Content into Server

How should library content be stored? Server manufacturers' designs for library storage range from a single tape drive to complete robotic tape libraries. Due to their high cost, robotic libraries are generally proposed as an option rather than a necessity. Typically, successive generations of storage equipment offer ever-increasing capacity per tape or disc. Ironically, when greater capacity allows more than one program to be stored on a single tape or disc, then the so-called improvement becomes a liability. Library storage media allow only one program to be accessed at a time; if a request arrives for another title on the same medium while the first one is still loading, the second title cannot be read until the first one is finished. Thus, the only reason to store a second title on a tape or disc should be as a physical backup for the primary copy of the title stored elsewhere.

Reassemble Data into a Stream: Generation

It takes Herculean engineering, especially in the software arena, to turn a conventional computer into a server that can produce isochronous video streams. In spite of the difficulties, many of the major computer vendors have done so. Hewlett-Packard, Sun, Silicon Graphics, and DEC all see the problem as one of I/O (input/output) bandwidth and have designed servers based on conventional computers. This type of server typically has a single CPU or several tightly-coupled identical processors, a high-speed bus, an array of associated hard disk drives, and substantial I/O (disk I/O on one side and network I/O on the other). The server will generate from 16 to 125 video streams from a single copy of a program.

When more streams are required, these stream generators are grouped. Since each has its own hard disk storage, a title which is popular enough to generate demand for more streams than one unit can provide must be replicated to as many units as necessary to meet the demand. In fact, since many titles can be stored on one stream generator, it must be managed so that a popular title doesn't block access to other titles stored uniquely on that unit, by consuming all of its output streams.

In a 50,000 subscriber area, the peak server load will be 10,000 streams. This means that, to satisfy peak load requirements, we will need 100 stream generators, each with a 100 stream capacity. First-run movie demand, as indicated by box office receipts, shows that one title can account for as much as 40% of demand. With a peak server load of 10,000 streams, a single title could account for 4,000 of the streams.

Assuming that we will allow up to 80% of the streams on one unit to come from a single title, that title will have to be replicated on at least 50 of the stream generators (4,000 streams divided by 80 streams per generator). At three gigabytes per copy, at least 150 gigabytes of storage will have to be devoted on the system to that title.

If the operator decides to store it on every server, the storage requirement rises to 300 GB. In a large metropolitan system with 100,000 subscribers, this redundant storage would expand to at least 1.5 terabytes (TB), regardless of whether it was on one large, centrally located server cluster or 10 smaller geographically dispersed clusters.
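The replication figures for the 50,000 subscriber case can be reproduced as follows. This is a sketch of the arithmetic only; the constant names are illustrative, and all values are those quoted in the text.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


PEAK_STREAMS = 10_000          # 50,000 subscribers at 20% peak use
STREAMS_PER_GENERATOR = 100
TITLE_SHARE_PCT = 40           # share of demand taken by one hit title
MAX_TITLE_PCT = 80             # cap on one title's share of a generator
GB_PER_COPY = 3

title_streams = PEAK_STREAMS * TITLE_SHARE_PCT // 100
streams_per_copy = STREAMS_PER_GENERATOR * MAX_TITLE_PCT // 100
copies = ceil_div(title_streams, streams_per_copy)
minimum_storage_gb = copies * GB_PER_COPY

generators = ceil_div(PEAK_STREAMS, STREAMS_PER_GENERATOR)
full_replication_gb = generators * GB_PER_COPY
```

Lowering the 80% cap protects access to the other titles on each unit but raises the number of copies, so the cap is a direct trade between availability and redundant storage.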

Encryption

Encryption protects the privacy and security of both downstream delivery to customers and upstream communication coming from them. Surveys of actual and potential users of interactive services consistently find that security and privacy are high on the list of consumers' concerns. In addition, when interactive broadband systems are used for telecommuting so that employees can work at home, these issues are extremely important to the employer companies. For the system operator, security is crucial to protect the system's assets -- programming -- from signal pirates and to make sure that only the subscriber who pays for the delivery of the signal is able to decode it. The logical place to apply encryption to a stream is at the point of its generation.

Stream Sorting, Routing, and Multiplexing

As demonstrated by the brute force model, generating streams is the easy part, compared with sorting them so that the correct stream is delivered to the subscriber requesting it. The switch which sorts the streams may be thought of as having as many inputs as there are streams to be generated under peak conditions, and as many outputs as there are subscribers. Any input can be connected to any output, so the size of the switch is proportional to the number of inputs times the number of outputs.

The volume of information in video streams, the high transport speed required, and the ability to switch all types of information have caused most server architects to incorporate ATM switching into their designs to sort and route all the streams and to multiplex streams going to the same neighborhood. In addition, ATM makes interconnection with other services using the ATM protocol straightforward (for example, long distance service providers).

Encoding for Transport: Monitor Quality of Service, FEC, and Modulation

The final point at which to monitor the quality of the digital signal is just before forward error correction (FEC). Following quality monitoring, a processor generates the redundant information that enables the FEC. The modulation procedure begins by encoding the digital signal as an analog signal, which improves the modulation efficiency (bits per Hertz), and then impresses that signal on an RF carrier. This results in an analog RF signal suitable for distribution over an HFC or wireless system.

Conventional designs usually place FEC and quality monitoring functions within the modulator. As a result, an "intelligent modulator" that accepts standard ATM input and provides sufficient processing power to accomplish the additional information-modifying tasks is needed for each output channel.

Typically, the modulator puts 27 Mb/s of data on a six MHz chunk of spectrum (the space allotted for a conventional analog channel). This allocation allows for six 4-Mb/s compressed digital streams plus overhead (six streams at four Mb/s equals 24 Mb/s, with three Mb/s of the 27 Mb/s capacity left over for overhead).
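The channel budget can be written out explicitly. The sketch below simply divides the 27 Mb/s channel payload among 4 Mb/s streams; the variable names are illustrative.

```python
CHANNEL_MBPS = 27    # data carried in one six MHz channel
CHANNEL_MHZ = 6
STREAM_MBPS = 4      # one compressed video plus audio stream

streams_per_channel = CHANNEL_MBPS // STREAM_MBPS
video_mbps = streams_per_channel * STREAM_MBPS
overhead_mbps = CHANNEL_MBPS - video_mbps
delivered_bits_per_hertz = CHANNEL_MBPS / CHANNEL_MHZ
```

The delivered efficiency works out to 4.5 bits per Hertz, below the "up to" six bits per Hertz ceiling quoted earlier for 64 QAM.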

Operations Support and Billing

Each element of the server must provide standard management information. This information, carried on a separate network, is processed by an independent computer running network monitoring and management software, and is referred to as the Operations Support System (OSS). It is crucial for operators to be able to flag error conditions immediately and to access remote operation of the management system. When an error occurs, it is inconvenient and time-consuming to have to go to the physical premises to begin to diagnose and solve the problem.

Another separate control computer, or sometimes multiple computers arranged in a hierarchy, directs the operation of the stream servers and switch. This system manages content and directs normal operation of the system. Billing information is sent from the control computer to a separate billing computer system. For the sake of reliability, an isolated machine is used to bill for services. This physical separation of the billing data from the content also ensures that access to the content server cannot be used to hack billing data. The billing system is referred to as the Business Support System (BSS).

Part 1, Figure 2: The Conventional Trial Size Server (Source: Steve Rose)
 

Additional Considerations for Interactive Broadband Servers of Conventional Architecture

Library Storage Versus Primary Storage

The speed of importing content from storage into the stream generator is a key variable in server design because it determines how quickly the viewer can see requested material. The delay between request and fulfillment is called latency.

The critical speed for importation is real time (e.g., four Mb/s). If the content can be downloaded at the same or a faster rate than the rate of delivery to the subscriber, then it is possible to begin to deliver the content shortly after beginning the download. However, if the import rate is much slower than real time, then the entire program must be loaded before delivery to the subscriber can begin.

For example, a dominant vendor's product takes three times longer than real-time to download from a single tape drive to the stream generator. This speed means that if a customer requests a 100-minute movie that is not already stored in the stream generator, it will be five hours after loading begins from tape before the movie can even begin to be delivered to the subscriber. A further consequence of the 3x real-time import speed and the single drive occurs when the system is first installed. At that time, if the server has a capacity of only 200 hours of content (120 movies), it will take 600 hours to load them into the stream generators. This is 15 work-weeks of eight-hour work days, or five weeks of 24-hour days!
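The latency figures in this example can be checked with a short sketch. It assumes, as the text describes, that an import slower than real time must finish before delivery starts, and it ignores the brief start-up delay when the import keeps pace with playback. The function names and the `drives` parameter are hypothetical, added for illustration.

```python
def wait_minutes(program_minutes, import_slowdown):
    """Minutes from the start of import until delivery can begin.

    import_slowdown is import time divided by playing time: 1.0 is real
    time, 3.0 is three times slower than real time.
    """
    if import_slowdown <= 1.0:
        return 0.0  # simplification: delivery starts almost immediately
    return program_minutes * import_slowdown


def initial_load_hours(content_hours, import_slowdown, drives=1):
    """Hours needed to fill the stream generators when first installed."""
    return content_hours * import_slowdown / drives


movie_wait_hours = wait_minutes(100, 3.0) / 60
load_hours = initial_load_hours(200, 3.0)
work_weeks = load_hours / 8 / 5          # eight-hour days, five-day weeks
round_the_clock_days = load_hours / 24
```

The 600 hours come to 25 round-the-clock days, which matches the five weeks in the text if a week is counted as five days.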

A rapid import rate confers important advantages that consumers like, such as "VCR functionality." If the material is imported from the library at a real-time rate, the subscriber can be allowed VCR functionality (except fast forward). If it is imported at or above the fast forward rate, the subscriber can receive the requested program within seconds of the time that the download begins, with full VCR functionality.

It is not possible to implement fast forward functionality directly from the library by jumping ahead in the material because the read element of storage devices (e.g., CD-ROM, DVD, or tape drive) addresses only one stream at a time. If the library read element jumped ahead to support a viewer's request for fast forward, there would be a gap in the material stored in the hard disk array. If a second customer were to request the same program, it would not be possible to deliver it until the first viewer was done, as the copy stored on hard disk (which can support multiple viewers) would be incomplete.

The nature of the transfer from the library to hard disk storage makes an enormous difference in the nature of the server. If the transfer occurs at a rate equal to or faster than real time and the server supports immediate delivery, then the library becomes the primary storage of the system, and consumers have immediate access to the full content of the library.

If the transfer is slower than real time or immediate delivery is not supported, then the only titles which may be offered to subscribers for immediate consumption are those already loaded in hard disk storage. This usually represents the difference between 100 or 200 and thousands of titles.

Stream Generator Size

Although 100 is a large number of isochronous streams from the perspective of conventional computer technology, it is a drop in the bucket relative to the needs of a metropolitan interactive broadband server. In a 50,000 subscriber area, 10,000 peak streams means 100 conventional stream generators will be needed to fulfill subscriber requests for video. These stream generators each occupy from one-fourth of a rack to three full racks (about the size of a phone booth), so it will require 25 to 300 racks of equipment to generate video streams. And 50,000 subscribers is only a modest cable system, or one hub of a metropolitan area.

ATM Switch Limitations

The ATM switches used to sort the video streams bring their own set of problems. The ATM protocol is quite rigid, which means that stream generators must produce fully standards-compliant output. This compliance adds considerably to the cost of the switch. Another difficulty is that even a large ATM switch is small when used for compressed digital video. For example, a large 16 Gb/s ATM switch, even if it could be fully utilized, would provide about one-third of the 40 gigabit bandwidth needed by a 10,000 stream server. The sheer volume of video data makes it necessary to partition each server complex into multiple independent servers and switches. This partitioning is expensive, inefficient, and difficult to manage.

Another limitation is the nature of the traffic, which is largely unidirectional because so much of the information is the downstream delivery of high bandwidth video on demand to subscribers. The design of ATM switches assumes that there will be approximately the same amount of traffic in both directions, and each downstream channel is paired with an upstream channel.

It would seem logical to reverse some of the upstream channels and use them to support the downstream traffic. Unfortunately, when an ATM switch is wired this way, it triggers SNMP (Simple Network Management Protocol) error messages. Disabling the error messages removes the ability to manage and monitor the switch, which is then operating in the "crossed-fingers" management mode.

As a result, upstream and downstream channels must remain paired, and almost half of the I/O capacity of the switch (and in some cases, some of its throughput bandwidth) goes to waste. This results in up to twice as many switches being required.
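
A rough sketch of the effect on switch count follows. It assumes the four Mb/s per-stream rate used elsewhere in this article, and it models the wasted upstream half simply as a usable fraction of 0.5. That simplification and the function name are illustrative, not a description of any particular switch.

```python
import math

def switches_needed(streams, mbps_per_stream, switch_gbps, usable_fraction=1.0):
    """Whole switches required to carry the downstream video load."""
    demand_mbps = streams * mbps_per_stream
    usable_mbps = switch_gbps * 1000 * usable_fraction
    return math.ceil(demand_mbps / usable_mbps)

# A 16 Gb/s switch, first fully usable, then with half its capacity idle.
fully_used = switches_needed(10_000, 4, 16)
paired = switches_needed(10_000, 4, 16, usable_fraction=0.5)
print(fully_used, paired)
```

Under these assumptions the count rises from three switches to five, which is consistent with "up to twice as many."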

Intelligent Modulator Size

If each modulator outputs six streams, it means that 17 modulators will be required for each neighborhood to meet the 100 stream anticipated peak demand (6 x 17 = 102). In a headend with 100 neighborhoods, 1,700 modulators of this type are required. One brand of modulator fits four to a standard equipment rack; the smallest modulators fit about 22. Thus, the system operator needs 77 to 425 equipment racks to provide for just the downstream traffic of a 10,000 stream server. In fact, control and signaling channels for each household call for additional modulators and demodulators, accounting for at least 10 more racks of equipment.
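
The modulator count can be checked with a short calculation. The function names are hypothetical, and the rack figures are rounded to the nearest whole rack to match the numbers quoted above.

```python
import math

def modulators_per_neighborhood(peak_streams, streams_per_modulator=6):
    """Modulators needed to cover one neighborhood's peak stream demand."""
    return math.ceil(peak_streams / streams_per_modulator)

def racks_for(modulators, modulators_per_rack):
    """Racks occupied by the modulators (may be fractional)."""
    return modulators / modulators_per_rack

per_neighborhood = modulators_per_neighborhood(100)
total = per_neighborhood * 100            # 100 neighborhoods per headend
print(per_neighborhood, total)
print(round(racks_for(total, 22)), round(racks_for(total, 4)))
```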

Scalability

The fundamental problem with conventional designs is that they don't scale. They can be made larger by full replication of small servers, but this creates substantial redundant (and expensive) storage. In addition, it creates new problems of headend design and cost of operation.

For example, our considerations so far have led us to conclude that the conventional interactive broadband server designed for a 50,000 subscriber area calls for between 25 and 300 racks of equipment for stream generation, and 77 to 425 racks for downstream modulation and associated processing. Switching equipment is relatively small, requiring about 10 to 20 racks of gear. The sum of these requirements amounts to between 112 and 745 racks of equipment.

If we allow 12 square feet per rack including service access aisles, and 15% for office space, air and power conditioning equipment, then a facility for 50,000 subscribers will occupy from 1,600 to 10,280 square feet. While these areas are less than huge, they are substantial when compared with the customary 600 square feet used by the headend of an average 50,000 subscriber cable system. For a 500,000 subscriber system, the total area would be about an acre, full of electronic equipment. Zoning approval delays and ongoing real estate costs must be factored into the planning for a conventional IBS.
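
As a minimal sketch, the floor-space estimate can be reproduced as follows. It assumes the 15% allowance is added on top of the rack area, which is one reading of the sentence above. The function name is illustrative.

```python
def facility_square_feet(racks, sq_ft_per_rack=12, overhead=0.15):
    """Floor area for the racks plus an allowance for offices and plant."""
    return racks * sq_ft_per_rack * (1 + overhead)

for racks in (112, 745):
    print(racks, round(facility_square_feet(racks)))
```

The results land close to the rounded 1,600 and 10,280 square feet given in the text.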

Part 1, Figure 3: The 256-Rack Headend (Source: Steve Rose)
 

 

Now, let us turn to the problem of power that the conventional design requires. In the design plans for an actual trial-size facility of an interactive broadband server, estimates of power requirements for the stream generators, switches, and modulators added up to 200 watts per video stream. This figure means that a 50,000 subscriber server putting out 10,000 digital streams would require two megawatts. Even assuming there might be some economies that could be employed to reduce the usage to 50 to 100 watts per stream, it would still result in a monthly bill of $70,000 to $140,000 including air conditioning. 1
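
The monthly bill can be sketched with the rate given in the footnote ($0.10 per kWh, equipment always on, and a doubling for lighting and air conditioning). The 30-day month and the function name are assumptions made for illustration.

```python
def monthly_power_cost(streams, watts_per_stream, dollars_per_kwh=0.10,
                       days=30, overhead_factor=2.0):
    """Monthly electricity cost in dollars, including cooling overhead."""
    kilowatts = streams * watts_per_stream / 1000
    hourly = kilowatts * dollars_per_kwh
    return hourly * 24 * days * overhead_factor

for watts in (50, 100, 200):
    print(watts, round(monthly_power_cost(10_000, watts)))
```

At 50 and 100 watts per stream the sketch gives roughly $72,000 and $144,000, in line with the rounded range quoted above.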

System Reliability

The need to compete with videocassette rental outlets results in a strange paradox: The digital service that brings in the least revenue (per megahertz of bandwidth) demands the greatest system reliability. Video on demand service requires that three gigabytes of data be delivered over a 100-minute period for a total gross revenue of as little as $0.99.

In order for the viewer to receive a coherent picture, the delivery must be nearly perfect. What is transported is not digital video, where a single error results in a bad pixel that disrupts a tiny portion of the picture for 1/60th of a second. Rather, the data is a compressed digital data stream, where a modest error rate can affect a large part of the screen for up to several seconds or even disrupt the session altogether. If the disruption results in a complaint from the subscriber, it costs more than $1.00 to handle the phone call.

Even without the loss of the revenue from the movie, there is an extremely narrow profit margin because the net revenue from the $0.99 to $4.00 charge is so low. For example, the net revenue from a licensed $0.99 movie is typically less than $0.02! One phone call wipes out the profit from over 50 plays. Any disruption of service will offset any possible profit. The server itself contributes little to overall system reliability; however, it affects the new digital services from which operators hope to derive additional revenue. The server must be designed to deliver signals without disrupting service, in spite of the failure of any one of its components.

Conventional servers are sometimes designed with RAID technology to accommodate the failure of single hard disks. However, the remainder of the server is susceptible to single point failures. The sheer amount of equipment and the complex interfacing of equipment from multiple suppliers called for by the conventional approach multiplies the probability of failures and the difficulty of correcting them when they occur.

The evolution of the conventional IBS has been essentially linear. Each function has been addressed by adding on a new layer of hardware and software.

The design begins logically with storage and stream generation. Then, the switch is added for sorting, routing, and multiplexing. Appended downstream modulators encode the output for transport. Then, the need for end-to-end management results in the overlay of an operational support system. Finally, business requirements demand an additional system to monitor usage, store the data, and charge customers for service.

 

 
Part 1, Table 1: Conventional Server Design Solutions
 
Function                          Conventional Solution
Importing content                 Slower than real-time from tape
Generating isochronous streams    Ganged single bus computers
Sorting and routing streams       ATM switch
Multiplexing output streams       ATM switch
Quality monitoring, FEC           Intelligent modulators
Facility design (50K subs)        112 to 745 racks of equipment
Network management (OSS)          Separate computer support
Billing, other business (BSS)     Separate computer support
Reliability                       Problematic: vulnerable to single points of failure

Table 1 recaps traditional approaches to the design of interactive broadband servers. (Source: Van Tassel & Rose)

 

1 Here is how the numbers play out. Electrical usage takes place whether or not the operation is actually generating streams; that is, the equipment is always on, drawing electricity. Each of the 10,000 streams consumes 100 watts of power. Thus, at any instant, the system is using a megawatt of electricity. Over the course of an hour, this consumption becomes a megawatt hour, or 1,000 kilowatt hours for all the streams. At $0.10 per kWh (1,000 x .10 = $100), the system consumes $100 an hour. Given 24 hours in a day, electrical power alone will cost $2,400 per day, or $72,000 per month. Lighting and air conditioning the equipment and staff will double the electricity needed by the headend.

 

 

The Evolution of the
Interactive Broadband
Server

Part 2-The Cool Hairy Onion

Why a Cool, Hairy Onion?

In the 1970s, discussions about the most efficient configuration for the structure of future computer networks centered around a metaphor close to our title: the hairy smoking golf ball. The image derived from the idea that, in order to minimize delay, all processors must be as close as possible to each other-hence the image of a sphere. The limits imposed by the speed of light meant the ball had to be small, like a golf ball. As the size was reduced, it was recognized that the power density would increase within this confined space, heating up enough to smoke. Finally, as all the communications lines were squeezed together by the shrinking surface, the ball would begin to appear 'hairy'.

We are at a similar point with respect to interactive broadband platforms as these early designers were when they considered future computers. However, the complexity of the technology has increased so that the simple golf ball processor of yesterday's computers has become today's onion, with richly connected layers discriminated by function and responsibility.

This paper covers the design of the media server because it is the brain of viable, efficient, and cost-effective, large-scale, two-way networks. As Al Kovalick, the media server architect at Hewlett-Packard, noted, "People ask 'why work on video servers?' The answer is that if I solve that problem, I've solved everything." "Everything" includes large-scale, two-way broadband networks of all kinds, whether they are constructed and operated by a cable, telephone, or computer company. The content provided may be interactive television programming, video on demand, interactive shopping, or -- now a subject of significant current discussion -- video-enabled Internet and World Wide Web traffic. Indeed, the quantity of information that must be stored, reconstituted, sorted, and delivered overwhelms any other current application, by about three orders of magnitude.

The relatively small populations served by each Internet Service Provider and the geographical dispersion of system hardware have obscured the fact that the design of broadband networks is application independent. The rapid delivery of large amounts of data, especially time-sensitive material, to an enormous number of participants demands the same equipment, including the IBS, regardless of the type or purpose of the communicated content.

As more people begin to use the Internet, the same issues that arose in discussions of interactive television systems will come to the fore. Such aspects include the design of storage, servers, switching, security and encryption, error detection and correction, bandwidth management, and business support. In particular, the issue of bandwidth distribution becomes critical.

Understanding the Problem

We have a remarkably short attention span. In 1993 and 1994, the rallying cry was "Interactive Video on Demand." In 1995, VOD became passé, and it became popular to dismiss the whole concept. This new attitude is coincident with the failure of many telco and cable tests to deliver promised services, with the notable exception of Time Warner Cable's Full Service Network in Orlando.

The new bandwagon has become "video over the Internet." What people have not seemed to recognize is that interactive television and video over the Net are the same thing. For example, consider quality of service. There is no difference in the compression schemes used to deliver digital video via the Internet, ADSL, or cable systems. A given level of quality requires the same data rate regardless of the means of delivery. It is almost silly to state this, since the means of Internet delivery is generally via telephone or cable. People will tolerate a low data rate, poor quality presentation via their computer for as long as it remains a novelty or is provided free.

Planning for the level of demand also remains the same. For interactive digital video, cable providers have assumed that peak demand would fall between 7% and 40% of subscribers, with 20% commonly used as the design point. If this is a reasonable estimate for demand for these services, then it makes little difference how the services are delivered: Total peak bandwidth demand equals the number of subscribers times the bandwidth per subscriber. As we pointed out in Part One, for a typical cable headend area serving 50,000 subscribers, at 20% peak demand and allowing four Mb/s (megabits per second) per stream, the total headend output bandwidth is 40 Gb/s.

MCI announced (with pride) in March 1996 that it is upgrading its Internet backbone from 45 Mb/s (DS-3 rate) to 155 Mb/s (OC-3 rate) to keep up with demand for bandwidth. This major step forward in relieving Internet congestion is their backbone bandwidth for the entire Internet. It does not compare well with the requirement of 40 Gb/s for a community of (only) 50,000 subscribers!
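
The comparison can be made concrete with a small calculation. This sketch uses the article's figures (50,000 subscribers, four Mb/s per stream, a 155 Mb/s OC-3 link) across the 7% to 40% range of peak demand. The function and constant names are illustrative.

```python
def peak_bandwidth_mbps(subscribers, peak_percent, mbps_per_stream=4):
    """Total peak downstream bandwidth in Mb/s."""
    return subscribers * peak_percent * mbps_per_stream / 100

OC3_MBPS = 155   # upgraded backbone rate cited above

for percent in (7, 20, 40):
    demand = peak_bandwidth_mbps(50_000, percent)
    # Columns: peak %, demand in Gb/s, equivalent number of OC-3 links
    print(percent, demand / 1000, round(demand / OC3_MBPS))
```

At the 20% design point, one modest headend would need the equivalent of roughly 258 such backbone links.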

Note that the incremental cost of bandwidth from a headend or central office to its area of service is basically zero, whereas backbone bandwidth is scarce and expensive. When the nature of the video traffic to subscribers is generally the same content repeated at slightly different points in time, it is inefficient and costly to use backbone bandwidth for this purpose.

What is required is a local server, which receives one copy of the content by the least expensive means available, stores it, and reconstitutes it on demand for any subscriber. It doesn't matter if you are talking about a "regional server" concept for cable, or "video enabled Internet services" -- material must be cached as close to the consumer as possible to make the system practical economically and technically. The obvious exception for which backbone bandwidth must be consumed is real-time data of any sort. However, real-time data requires only a single stream per event. If the event is to also be made available on a delayed basis, then it too needs to be cached locally. We will return to this issue of the location of stored material and the server that retrieves it at the end of this article.

An Overview of the Integrated IBS: The Cool Hairy Onion

Any large-scale media server is inherently complex because it must address the full range of tasks listed above (e.g., importing, generating, switching, etc.). However, note that each of these functions involves relatively simple information processing; in effect, highly-effective bit shuffling. The integrated design differs from the linear design by placing almost all of the various processing in a structurally central location. While both designs incorporate layers of processing, in the integrated approach, these layers are functional and conceptual rather than physical. In this section, we briefly describe each functional layer. In later sections, we will cover them in greater detail.
 

 
Part 2, Figure 1: The Integrated Approach to IBS Design (Source: Steve Rose)
 
The Core and Layer 1: Library (Primary) Storage

At the innermost core of the onion is the system's content, stored on several different types of media. There are two levels of storage:
 

At least two robotic mechanisms will have access to the full library, so that no one mechanism can malfunction to produce a single point failure.

The library can store any type of material, including text, data, graphics, compressed audio, animation, and video. It will include pre-produced, read-only material, such as movies. Each title is stored individually to enable random access (access to a second title on the same medium would be blocked during a read of the first title). It will also incorporate read/write media which might be on tape or recordable optical disc. Incoming data streams intended for later regeneration, such as live events, Internet downloads, news groups, or other content not available on high-capacity Digital Versatile Disc (DVD), are routed and written directly to this part of the library. These tapes or discs can be served by the same robotic storage system that accesses the read-only media.

Layers 2 and 3: Caching Storage

The hard disk storage array acts as a cache for the primary storage of the library. In a medium-to-large server, this will involve hundreds to thousands of disk drives, arranged in redundancy-protected subarrays (with RAID 5 and a five-drive array, a 25% redundancy factor ensures that no single drive failure will affect data reproduced from that array). With an appropriate architecture, the data stream coming from primary storage for a given title is striped immediately across all drive arrays, so that even in the worst case the material can be accessed within seconds.
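
The protection offered by a five-drive array can be illustrated with XOR parity, the mechanism on which RAID 5 is based. The sketch below is a simplification: it keeps parity in a fixed fifth block, whereas a real RAID 5 array rotates parity across the drives. The function names are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of several equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, failed_index):
    """Recover the block at failed_index from the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)

data = [b"REEL", b"ONE-", b"OF--", b"FOUR"]   # four data drives
stripe = make_stripe(data)                     # fifth block holds parity
redundancy = (len(stripe) - len(data)) / len(data)
print(redundancy)
print(all(rebuild(stripe, i) == stripe[i] for i in range(len(stripe))))
```

One parity block for every four data blocks gives the 25% redundancy factor, and any single missing block can be rebuilt from the other four.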

The RAM cache is not a separate storage area. Rather, it is implemented in each of the many stream generating processors -- about 32 megabytes of RAM per processor. One second of compressed video data can be stored in about half a megabyte, so each processor has about a 60-second RAM cache associated with it.
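
The cache duration follows directly from the stream rate. A minimal sketch, assuming the four Mb/s rate used throughout this article (the function names are illustrative):

```python
def megabytes_per_second(mbps):
    """Convert a stream rate in megabits per second to megabytes per second."""
    return mbps / 8

def cache_seconds(ram_megabytes, mbps=4):
    """Seconds of video that fit in a processor's RAM at the given rate."""
    return ram_megabytes / megabytes_per_second(mbps)

print(megabytes_per_second(4), cache_seconds(32))
```

This gives half a megabyte per second and 64 seconds of capacity, which the text rounds to about 60 seconds.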

From the imported content, the server generates a coherent stream. In the integrated server, this is done with an array of multiple interconnected processors. They have sufficient processing power to allow them to reassemble the individual chunks of content into a contiguous isochronous data stream, addressing and routing them to the correct output for the subscribers requesting them. Each stream can be reassembled at speeds at least as fast as standard rates for video data.

Layer 4: Switching

The switching layer comes next. However, there is no switch; the switching function is carried out by the same processors that generate the streams, addressing and routing them to appropriate destinations. For example, a subscriber's telephone call might be routed bidirectionally via SONET interconnect to a local exchange or interexchange carrier. At the same time, chunks of content representing an individual stream from one title might be routed unidirectionally to the I/O processor in the next layer connected to the requesting subscriber.

Layer 5: Multiplexing, Encryption, and I/O

The multiplex layer combines the individual data streams destined for one output. The streams are combined in a manner which maintains their isochronicity and labels them in a way that allows them to be identified individually by equipment at the subscriber's premises. Given a data rate of four Mb/s, an OC-3 rate port (155 Mb/s) can carry about 32 simultaneous compressed video data streams plus overhead. This layer also maintains information about individual subscriber requests, and provides additional multiplexed stream services such as forward error correction (FEC). It also demultiplexes incoming subscriber data.

The interconnected processors have sufficient capacity to perform additional tasks, including encrypting each stream, which must be done in real-time on a stream-by-stream basis.

 

Layer 6: Transduction

The final layer consists of transducers which translate the internal server communication to the optical, RF, and electrical standards of the outside world. For the first time since retrieving content from storage, the data has left the server and additional hardware is needed:
 

Management: OSS and BSS

Connecting all layers are the operations support system (OSS) and business support system (BSS) communications. The OSS, which enables system management, frequently uses a separate, relatively slow-speed network connecting all of the server elements. It reports on status conditions and allows changing the operating modes of individual components. System management is responsible for reporting and handling system element failures and preventing bottlenecks during periods of peak demand. The BSS tasks are typically undertaken by yet another separate slow-speed data network. It tracks system utilization on a subscriber-by-subscriber basis and enables billing by almost any parameter:
 

In the integrated model, these functions are also carried out by the array of interconnected processors that generates the streams and switches and multiplexes them. Since all these information processing functions have occurred within the confines of the integrated server and complete knowledge of that processing already resides there, it makes sense to keep network management, the OSS, and business support systems in the server as well, with data spooled out to the appropriate record-keeping MIS network.

The Brains Behind the Integrated IBS

As we have seen, nearly all of the functions carried out by an IBS are entirely controlled and ultimately accomplished by what we have identified as a massively interconnected processor array -- the MIPA.

(The only exceptions are the read-only, read/write, and hard disk storage at the center of the onion, and the modulation and transduction at the periphery.) The MIPA consists of many identical processors, each with its own RAM memory, embedded in a high-speed communication matrix or mesh. The MIPA approach results in a truly scalable IBS, with an attendant reduction in size and power consumption.

We believe the case for the MIPA is compelling. Consider the fact that a server with multiple processors, whether ganged processors connected by a bus or mounted on a matrix, is, essentially, a micro network. For any network, given ceteris paribus conditions, the longer the distance, the slower the link. Thus, proximity buys carrying capacity -- bandwidth. When nodes are very close, high speeds can be achieved at modest cost. For these reasons, and the others listed below, we think some form of MIPA architecture is the only way to produce the thousands of simultaneous streams required by even a medium-sized network when digital services are offered to more than a small number of test customers:

    1. It provides the interface bandwidth between the storage elements and the server.
    2. It is inherently segmented to provide the composite throughput unavailable in any other architecture.
    3. It is effective as a switch, to sort the streams to their destinations. It can assemble the streams into any desired format, including ATM. This capability offers enormous savings over using an external switch.
    4. It can provide the necessary output multiplexing.
    5. Since only a portion of the processing horsepower is consumed in generating and sorting streams, there is power left over to provide FEC, encryption, and quality of service monitoring. Thus, the modulators can be reduced in complexity to simple line cards and installed in the same box, rather than requiring space-grabbing, power-guzzling intelligent modulators.
    6. Combined with carefully thought out reliability strategies, a MIPA will also handle the requirements for operations system support.
    7. With appropriate firewalls and security to impede hacking, a MIPA can manage the business system support as well. Since all aspects of the session are accomplished within the MIPA, it already has the needed information to track the services that are requested and provided, so BSS becomes a simple function. Most architectures call for a super reliable system to be used for BSS -- but a MIPA, with its inherent redundancy, is able to be built to be the most reliable possible computer!

However, MIPA architecture is not the whole answer to building IBSs. As we shall see, the design of the layers of storage is crucial.

Storage Design: The Heart of the IBS

Primary Library Storage

Library storage is physically separate from the server. It holds the assets of the on-demand system in readiness for use. Preproduced material is likely to be distributed on read-only media, such as DVD or write-protected individual magnetic tapes. DVD involves the lowest cost of reproduction (as a stamped, rather than recorded medium) and the highest distribution bandwidth (air freight). The discs and tapes must be loaded manually into a physically secure robotic storage system. (A substantial level of protection is required by the studios who own the content to protect their assets against piracy.)

Read/write media (now tape or hard disk and most likely recordable optical disc in the future) are also part of the library for storage of content for later download or playback. Library storage will accommodate hundreds to tens of thousands of titles available on demand to a subscriber. Our design defines library storage as primary storage, with all other stages considered as caches. We place the primary storage library at each server site, and suggest a transfer rate from primary storage which is fast enough to accommodate fast forward functionality. This means that the number of titles which may be offered in real time to subscribers is determined by the capacity of the archival library, rather than the hard disk storage capacity of the server, as it is in most conventional architectures.

Caching and the Least-Recently-Used (LRU) Strategy

To make an integrated IBS with on-line reliability, the hard disk and RAM memory must be implemented in a very specific manner. Both function as cache memory that is managed automatically using a "Least-Recently-Used" (LRU) algorithm.

Let us first consider the advantages of caching and LRU. As we observed in Part 1, the fact that many customers want the same material poses great problems for conventional server design. For example, as many as 40% of the audience may want to see the same movie; similarly, a majority of Internauts may want to access the Netscape website at some time during their on-line session. Unavoidable redundant storage occurs because popular material must be stored on each group of connected single processors because they don't access a shared library. In addition, a single title cannot entirely dominate the limited number of streams output by each processor group, or customers won't be able to receive anything but the most popular content.

These difficulties mean that both storage and transport bandwidth must be closely managed relative to anticipated demand. Some harried executive whose job depends on the quality of his or her decisions must estimate how popular each type of material will be (which may vary by area of service, time of day, and day of the week, or what the reviewer said last night), and make an educated guess as to how much demand there will be for the content. In a conventional system that doesn't offer real-time replication, these guesses must be made significantly in advance. The selection is irrevocable, as both storage and throughput bandwidth will be allocated, based on these estimates (guesses). By the time the true demand is known, at the period of peak consumption, there will be no bandwidth available to reallocate resources. The more popular the executive believes some specific material will be, the fewer choices will be available to customers, since each redundant copy occupies the storage required for additional material. Incorrect decisions result in content not being available to all viewers, which can lead to dissatisfaction with the system and a return to rental store habits.

By contrast, LRU is an invisible hand that will manage the system perfectly, even though it has no awareness of content. The actual second-by-second popularity of requested content will determine where it is stored, ensuring that the most frequently asked-for material will be immediately ready for customers, while the least popular material will be available after a short latency.

The means of implementing LRU is caching. The hard disk arrays are a title by title cache for the primary library storage; the RAM is the immediate cache for the hard disks. Without a cache, a program must be reread from storage every time it is requested. It is a characteristic of primary storage devices such as tape machines and optical discs that they transfer only one stream at a time. Thus, if this stream is to feed the server directly, then only one customer can be served: One storage area, one stream, one customer.

With caching, if the requested portion has been so recently read that it is still in RAM, it is served from RAM with no need to access the hard disk. When the RAM cache area is full, and a new portion of the program must be read, the oldest segment of data in RAM is erased by the new content that reuses that area of memory. The same process holds for the hard disks. The most recently requested material is still on the disks; as newer material requires storage space, material unused for the longest time is erased and the new content is stored in that area of the disk.

Suppose there is both a hard disk cache and a RAM cache. The first customer is served by a stream that goes from the primary storage to the hard disk, to RAM, and out to the consumer. The second, third, and fourth customers (up to the maximum number of customers the system will support) will be served from RAM, which allows multiple access within at least a fraction of a second, or from hard disk, which allows multiple access within at least the entire period a movie is in demand: One storage area, two caches, many streams, many customers.
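
The path a request takes through the two caches can be sketched as follows. This is a minimal illustration, not a description of any actual server. The class names LRUTier and TieredServer are hypothetical, the library is modeled as a plain dictionary, and each cached item stands for one segment of a title.

```python
from collections import OrderedDict

class LRUTier:
    """Fixed-capacity store that evicts the least recently used segment."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.segments = OrderedDict()

    def lookup(self, key):
        if key in self.segments:
            self.segments.move_to_end(key)       # mark as most recently used
            return self.segments[key]
        return None

    def store(self, key, data):
        self.segments[key] = data
        self.segments.move_to_end(key)
        while len(self.segments) > self.capacity:
            self.segments.popitem(last=False)    # drop the oldest entry

class TieredServer:
    """Library (primary storage) behind a hard disk cache and a RAM cache."""
    def __init__(self, library, disk_capacity, ram_capacity):
        self.library = library
        self.disk = LRUTier(disk_capacity)
        self.ram = LRUTier(ram_capacity)

    def fetch(self, key):
        """Return (data, source), where source names the tier that served it."""
        data = self.ram.lookup(key)
        if data is not None:
            return data, "ram"
        data = self.disk.lookup(key)
        source = "disk"
        if data is None:
            data = self.library[key]             # slow path: primary storage
            source = "library"
            self.disk.store(key, data)
        self.ram.store(key, data)
        return data, source

library = {("movie1", 0): b"a", ("movie23", 0): b"b", ("movie874", 0): b"c"}
server = TieredServer(library, disk_capacity=2, ram_capacity=1)
for title in ("movie1", "movie1", "movie23", "movie1", "movie874", "movie23"):
    print(title, server.fetch((title, 0))[1])
```

With these deliberately tiny capacities, the first request for a segment comes from the library, an immediate repeat comes from RAM, and a segment pushed out of RAM but still on disk comes from the disk cache. No step requires anyone to predict which title will be popular.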

Note that at any moment, the 60 seconds of material cached in one processor's memory is the most recently requested individual seconds from all of the material stored on the array. They are not 60 contiguous seconds from a single title or file. This seems strange at first; however, contiguous seconds from the same material are stored on successive arrays. Thus, each second cached in a single array controller is from a different title or from widely separated seconds from the same material. As the relative popularity of particular content shifts moment by moment, the holdings of the cache will change automatically. The contents of the cache are always determined by the instantaneous demand for the seconds stored on the array, and never have to be "managed" externally.

This multi-access capability of RAM might lead a designer to consider making all storage RAM. However, this strategy has been tried by at least one vendor, and its cost is too great, approximately 100 to 400 times as much as hard disk storage. By utilizing primary storage and a double-cache hard disk/RAM system, the operator can maximize the cost/capacity ratio for the optimum level.

Visualizing the Operation of LRU Caching

In a hypothetical system, the server has enough RAM to hold about 300 minutes of content. Suppose a customer requests a hit movie ranking #1 in popularity or access to the Netscape site. Since both these choices are very popular, they are already stored in RAM and can be sent to the subscriber immediately. The storage of the material in RAM is determined automatically by the popularity of the title and the specific system. Now suppose another customer requests the movie ranking #23 or the Elfnet homepage. These are viewed often enough that they are already cached in hard disk storage, so they are sent directly from there, without generating a library request. As they are streamed out, they are moved second by second from disk to RAM. But since demand for them is relatively low, the RAM in which they are stored is likely to be reused for other material before another request for this content is received.

A third customer requests an obscure Iranian film, movie #874, or the homepage of their old college friend in Poughkeepsie. No one has requested these materials for 10 months (or perhaps never before). The film is stored only in primary storage, and this personal homepage will have to be downloaded from the friend's server in Poughkeepsie. If a connection cannot be made immediately while the user is on-line, the homepage will have to be stored on a read/write medium, awaiting downloading at the user's convenience. In the case of the movie, after a delay of a few seconds for robotic retrieval from permanent, primary library storage and delivery to the playback unit, the first segment of the actual film is transferred, and then the consumer starts receiving the requested material.

According to the LRU algorithm, all of hit movie #1 and the Netscape website will probably stay in RAM for some time -- as long as they retain their popularity. However, the less popular materials, like the friend's homepage and the Iranian film, will probably be replaced in RAM within minutes, or even seconds, by more popular content. This infrequently-requested content will also be replaced on the hard disk drives by more recently requested material, whenever it is not being used and the space is required. All this is accomplished without any person guessing at the likely demand for provided content. The system itself automatically retains the second-by-second demand for all material, allowing the marketing department to track requests for information, titles, and activities, by time and by consumer demographics.

In this manner, the most popular content is automatically buffered in RAM or hard disk without any active management. If there were a five-minute spurt of demand for a movie right after the news, only that five minutes would remain in the RAM cache, rather than the entire movie. This "pig in the python" continues to move as time passes -- the period of the movie in the RAM cache would continue to change in real time. Individual viewers might jump into or out of the cache as they used VCR-like controls. New popular periods of the movie might emerge with large viewer clusters. LRU ensures that it all happens automatically.
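
The behavior described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the implementation of any actual server: the class names, the slot counts, and the idea of treating each cached item as an opaque key (in practice it might be a title and a one-second segment number) are all hypothetical.

```python
from collections import OrderedDict


class LRUTier:
    """A fixed-capacity cache tier that evicts the least recently used item."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest entry last

    def touch(self, key):
        """Record a use of key and return the evicted key, if any."""
        self.items[key] = True
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            evicted, _ = self.items.popitem(last=False)
            return evicted
        return None

    def __contains__(self, key):
        return key in self.items


class TieredServer:
    """RAM over disk over a library that is assumed to hold everything."""

    def __init__(self, ram_slots, disk_slots):
        self.ram = LRUTier(ram_slots)
        self.disk = LRUTier(disk_slots)

    def request(self, segment):
        """Serve one segment and report the fastest tier that already held it."""
        if segment in self.ram:
            source = "ram"
        elif segment in self.disk:
            source = "disk"
        else:
            source = "library"
        # Streaming moves the segment through disk into RAM, refreshing both tiers.
        self.disk.touch(segment)
        self.ram.touch(segment)
        return source
```

With two RAM slots and four disk slots, a frequently requested item stays in RAM, while an item requested once is pushed out of RAM by later traffic but may still be found on disk. No one has to predict demand; the ordering of requests does all the work.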

Hard Disk Storage

Dr. Fouad Tobagi of Stanford invented a technique he called "wide striping," which we refer to as vertical striping. It is one of two closely-related striping techniques used for storing material across multiple hard disk arrays. The second technique we call horizontal striping.

Horizontal striping refers to the technique of spreading a file across a small number of hard drives (typically five) for the purpose of increasing storage bandwidth to more closely match processor bandwidth and to increase reliability. It determines the number of disks to be placed in any given RAID array when a fixed amount of content is spread across a variable number of disks in that single array. For example, at four Mb/s, a one-second (:01) allocation would correspond to .5 megabytes of storage. By definition, as a fixed amount of storage is spread over more and more disks, there is a diminishing return for each additional disk, as the seek time from one allocation unit to the next becomes the dominant factor. Two to eight disks seem to be optimal for RAID techniques of increasing speed and using redundancy to protect against a single device failure.
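
The arithmetic and the diminishing return can be made concrete with a small sketch. The seek time and per-disk transfer rate used in the note that follows are placeholder assumptions chosen for round numbers, not measurements of any particular drive, and the function names are our own.

```python
def chunk_bytes(bit_rate_bps, seconds):
    """Bytes needed to hold `seconds` of a stream at `bit_rate_bps` bits per second."""
    return bit_rate_bps * seconds // 8


def array_read_time_ms(chunk_size_bytes, disks, seek_ms, transfer_bytes_per_ms):
    """Time to read one chunk split evenly over `disks` drives working in parallel.

    Every drive pays the full seek, but transfers only its share of the chunk.
    """
    return seek_ms + (chunk_size_bytes / disks) / transfer_bytes_per_ms
```

At 4 Mb/s, chunk_bytes(4_000_000, 1) gives 500,000 bytes, the .5 megabytes mentioned above. If we assume a 10 ms seek and 5,000 bytes per millisecond per drive, the read time for that chunk falls from 110 ms on one disk to 60 ms on two, 35 ms on four, and 22.5 ms on eight. Each doubling buys half as much as the one before, because the fixed seek time makes up a growing share of the total.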

Vertical striping exists for the purpose of load balancing. It allows demand to be spread evenly across all of the horizontally striped arrays in the system. Vertical striping is accomplished by distributing small chunks of every title over all the arrays in the system, rather than storing the title on a single array. This technique requires a system which can reassemble output from all of the arrays into isochronous streams, which can be accomplished only by a massively interconnected processor array architecture. Since a fixed number of simultaneous streams can be supported by each array, the number of arrays and therefore the number of drives in a system is directly proportional to the number of subscribers: The greater the number of subscribers, the more hard disk arrays will be required.
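
A toy comparison shows why this matters for load balancing. The sketch below is an illustration under simplifying assumptions of our own: every chunk is one tick long, chunk k of every title sits on array k modulo the number of arrays, and new streams were admitted one per tick. The function names and the demand figures in the note that follows are hypothetical.

```python
from collections import Counter


def load_title_per_array(streams_per_title):
    """Each title lives on its own array, so an array's load is its title's demand."""
    return list(streams_per_title)


def load_vertically_striped(streams_per_title, arrays, tick):
    """Per-array load at time `tick` when every title is striped over all arrays.

    Assumes stream number i started at tick i and that all streams have
    already started (tick is at least the total number of streams).
    """
    load = Counter()
    total_streams = sum(streams_per_title)
    for start in range(total_streams):
        chunk = tick - start          # chunk this stream is reading right now
        load[chunk % arrays] += 1     # array that holds that chunk
    return [load[a] for a in range(arrays)]
```

Suppose four titles draw 20, 6, 3, and 1 streams. Stored one title per array, the first array is asked for 20 streams while the last serves one. Striped across five arrays, the same 30 streams come out as six per array. Only the total demand enters the striped calculation; which titles are popular makes no difference.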

Distributing content over all of the drives allows the system to support any number of subscribers from a single copy of a program. As we have seen with conventional server architectures, storing titles on individual drives or arrays results in uneven utilization of system capacity, the need for outrageously redundant storage, and interactive intelligent management of storage and replication. Vertical striping, LRU caching, and MIPA architecture avoid these difficulties. However, it is important that vertical striping be approached correctly so that it doesn't introduce unacceptable latencies between the time a title is requested and the time delivery begins.

Picturing Vertical Striping

In order to make it easy to visualize the way vertical striping works, we will look at the way a group of individual drives would be organized. Assume that we have about 150 subscribers, and expect a peak demand of 30 streams from our digital system. An inexpensive drive can support about six 4 Mb/s streams simultaneously, so we will use six drives (we'll explain the sixth drive later). We will use a vertical striping chunk size of one second, so that for a given title, the first second will be on drive one, the second on drive two, and so on through the sixth on drive six; then the seventh is on drive one again, the eighth on drive two, and so on.
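
The round-robin placement just described reduces to a single modulo operation. The following sketch uses hypothetical function names and numbers seconds and drives from one, as the text does.

```python
def drive_for_second(second, drives=6):
    """Return the 1-based drive that holds the given 1-based second of a title."""
    if second < 1:
        raise ValueError("seconds are numbered from 1")
    return (second - 1) % drives + 1


def layout(total_seconds, drives=6):
    """Map each drive number to the list of seconds it stores for one title."""
    placement = {drive: [] for drive in range(1, drives + 1)}
    for second in range(1, total_seconds + 1):
        placement[drive_for_second(second, drives)].append(second)
    return placement
```

For a twelve-second clip on six drives, layout(12) puts seconds 1 and 7 on drive one and seconds 6 and 12 on drive six. A viewer playing the clip straight through therefore visits every drive in turn, once per second.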

Picture the drives as a circle of wagons, each with six seats, moving past the loading point at a rate of one wagon per second (see Figure 2). Each wagon represents not a drive, but its capacity to accommodate simultaneous viewers during one chunk period. When someone wants to board (start receiving a data stream), they have to wait for an empty seat to come around. If there were only five wagons, and 30 seats to accommodate an anticipated peak load of 30 riders, the last rider might have to wait for the entire circle to pass to claim the last seat. By putting on an extra wagon, and managing the boarding, we can guarantee a seat will always be available almost immediately.

This ability to accommodate customers becomes much more important as the number of wagons increases. For example, with 9,000 riders and 1,500 wagons, the last rider would have to wait up to 25 minutes to board. By adding 20% more wagons, we can guarantee an empty seat on each wagon, and almost no boarding latency.
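
The figures in the wagon example follow from simple integer arithmetic, sketched below. The function names are our own, and the sketch assumes, as the analogy does, that one wagon passes the loading point per second.

```python
def wagons_for(riders, seats_per_wagon):
    """Fewest wagons whose seats cover every rider (ceiling division)."""
    return -(-riders // seats_per_wagon)


def worst_case_wait_seconds(wagons, seconds_per_wagon=1):
    """Time for the whole circle to pass the loading point once."""
    return wagons * seconds_per_wagon


def with_headroom(wagons, extra_percent):
    """Wagon count after adding extra_percent more wagons, rounded up."""
    return wagons + -(-wagons * extra_percent // 100)


def spare_seats(wagons, seats_per_wagon, riders):
    """Seats left empty when every rider is aboard."""
    return wagons * seats_per_wagon - riders
```

With 9,000 riders and six seats per wagon, wagons_for gives 1,500 wagons, and a full circle takes 1,500 seconds, or 25 minutes. Adding 20% brings the count to 1,800 wagons and leaves 1,800 spare seats, which is enough for one empty seat on every wagon if boarding is managed so that the spares stay evenly spread. The small example works the same way: five wagons plus 20% is the six drives used earlier.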

 
Part 2, Figure 2: Wagon Illustration (Source: Steve Rose)
 
So much for the wagon analogy. The important thing is this: In building a system, we can take a very good guess as to the anticipated total peak demand. Vertical striping allows us to build a system to meet this peak demand by distributing the demand evenly over all of our resources. Conventional designs incorporating localized storage of content force micromanagement of the system title by title, service by service, and moment by moment as the relative demand for different titles shifts. They require wasteful replication of material across as many storage elements as are required to meet the anticipated demand.

A Final Note on the Location of the Server

Schemes for the placement of the IBS range from a single, enormous, centralized facility to miniservers sited in neighborhood nodes. An example of the centralized approach was AT&T's original design that called for locating the server in Manhattan to serve headends around the nation.

Earlier in this article, we called for placing the server "as close to the consumer as possible." However, we did not mean to advocate putting some kind of remote server in a box on the telephone pole outside each house. The optimal location for the server is at the point where no further cost reductions can be achieved by further distribution. This point is at the cable headend or telephone central office, beyond which there is no further expansion of bandwidth to the subscriber, and at which point the operator owns the entire bandwidth to the consumer. From the cable headend downstream to the household, the operator has paid for the entire bandwidth already, and loses income only when it is not used to provide services.

The economies of scale available with an appropriately constructed interactive broadband server imply that it may be smart to consolidate as many headends or COs as possible to be served by one IBS. This move toward geographical consolidation is well underway through a variety of interconnection schemes and other means. For example, in March 1996, Time Warner agreed to swap its systems in Hampton and Williamsburg, Virginia for Cox Cable's system in Myrtle Beach, South Carolina.

A Final Note from the Convergence Front

One interesting conclusion that can be reached from the foregoing discussion: The IBS is the switch. It is even possible to leave out the storage, and use the IBS as an exceptional broadband ATM switch. By the same token, ATM switch manufacturers are moving away from single processor switches at the high end, and toward matrix-based switches. With an appropriate design, it would also be possible to add storage to a matrix-based ATM switch, turning it into a MIPA-based Interactive Broadband Server!

In summary, there are several critical elements to the architecture of an appropriate, scalable, interactive broadband server, exemplified by the cool hairy onion model. First, the number of titles and the variety of material which may be offered is determined by the size of the archive, not the size of the server. Second, the storage is organized so that the server may be sized to meet aggregate anticipated peak demand for all services, rather than having to be micro-managed on a title-by-title or service-by-service basis. Third, a Massively Interconnected Processor Array architecture results in a server that offers genuine economies of scale, while at the same time offering the greatest possible reliability by taking advantage of inherent redundancy and avoiding single points of failure.
 

 

IBS Design Challenges, Revisited: Conventional Versus Scalable Approaches

 
 
| Challenge | Conventional Approach | Integrated Approach |
| --- | --- | --- |
| Generating isochronous streams | Ganged single-processor or SMP stream generators | MIPA |
| Sorting, routing streams to subscribers | ATM switch | MIPA |
| FEC, encryption, and QOS | Intelligent modulator | MIPA |
| Multiplexing and modulating | Mux / ATM, separate intelligent 6 MHz modulator | Dumb 6 to 30 MHz modulator |
| Importing and managing content | Library storage to primary storage, managed by anticipated demand. | Library storage is primary storage; subsequent layers are LRU-managed cache, vertically striped. |
| OSS/BSS programming | Separate system management computer and back office computer | MIPA |
| Headend design | 112-745 equipment racks; large facility, substantial power and A/C consumption. | 25-45 equipment racks; moderate in size, power consumption, and A/C. |
| Design for reliability | Protection against single drive failures. Difficult to maintain, many other single points of failure. | Protection against many types of failure (should be no single point of failure). Easy to manage and maintain. |

 
Part 2, Table 1: Approach Summary (Source: Van Tassel & Rose)
 
NTQ
New Telecom Quarterly
Technology Futures, Inc.
13740 Research Blvd., Suite C-1
Austin, TX 78750-1859
(800) TEK-FUTR (835-3887)
(512) 258-8898
(512) 258-0087 (fax)
www.ntq.com
info@ntq.com