Memory was accessed solely under the control of the memory control unit, or MCU. The MCU was a two-way, 256-bit/channel parallel network that could support up to eight independent processors, with a ninth channel for accessing "main memory". The MCU also acted as a cache controller, offering high-speed access on the eight processor ports to a semiconductor-based memory, and handling all communications to the 24-bit address space in main memory, which could be built out of slower systems such as core memory. The MCU was designed to operate asynchronously, allowing it to work at a variety of speeds and scale across a number of performance points. At its fastest, it could sustain transfer rates of 80 million 32-bit words per second per port, for a total transfer capacity of 640 million words per second across the eight processor ports. This was well beyond the capabilities of even the fastest memories of the era.
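The aggregate figure follows directly from the per-port rate. As a quick check of the arithmetic, the sketch below recomputes it from the numbers quoted above. The byte-based figures and the words-per-channel value are derived here for illustration only and are not stated in the text; the function names are hypothetical.

```python
# Figures quoted in the paragraph above.
WORDS_PER_SEC_PER_PORT = 80_000_000  # 80 million 32-bit words per second
PROCESSOR_PORTS = 8                  # eight processor ports (ninth goes to main memory)
WORD_BITS = 32
CHANNEL_BITS = 256


def total_words_per_sec(ports=PROCESSOR_PORTS, rate=WORDS_PER_SEC_PER_PORT):
    """Aggregate word rate if every processor port runs at the peak rate."""
    return ports * rate


def bytes_per_sec(words_per_sec, word_bits=WORD_BITS):
    """Convert a word rate to bytes per second (derived, not from the text)."""
    return words_per_sec * word_bits // 8


# How many 32-bit words fit across one 256-bit channel (derived).
WORDS_PER_CHANNEL_TRANSFER = CHANNEL_BITS // WORD_BITS

if __name__ == "__main__":
    total = total_words_per_sec()
    print("total words/sec:", total)
    print("bytes/sec per port:", bytes_per_sec(WORDS_PER_SEC_PER_PORT))
    print("bytes/sec total:", bytes_per_sec(total))
    print("words per channel transfer:", WORDS_PER_CHANNEL_TRANSFER)
```

Under these figures the peak works out to 320 million bytes per second per port, assuming all eight ports could be driven at full rate simultaneously.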
The main ALU/CPU was extremely advanced for its era. The design included four basic cores that could be combined to handle vector instructions. Each core included a complete instruction pipeline system that could keep up to twelve scalar instructions in flight at the same time, allowing up to 36 instructions in total across the entire CPU. From one to four vector results could be produced every 60ns, the basic cycle time (about 16.7MHz), depending on the number of execution units provided. Implementations of this sort of parallel/pipelined instruction system did not appear on modern commodity processors until the late 1990s, and vector instructions (now known as SIMD) until a few years later.
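The clock rate and peak vector throughput implied by the 60ns cycle time can be worked out directly. The following is a minimal sketch of that arithmetic, assuming one result per execution unit per cycle as the text describes; the function names are hypothetical.

```python
CYCLE_TIME_NS = 60  # basic cycle time quoted above


def clock_mhz(cycle_ns=CYCLE_TIME_NS):
    """Clock frequency in MHz for a given cycle time in nanoseconds."""
    return 1000.0 / cycle_ns


def vector_results_per_sec(units, cycle_ns=CYCLE_TIME_NS):
    """Peak vector results per second with `units` execution units,
    assuming each unit delivers one result per cycle."""
    if not 1 <= units <= 4:
        raise ValueError("the text describes one to four execution units")
    return units * 1e9 / cycle_ns


if __name__ == "__main__":
    print(f"clock: {clock_mhz():.2f} MHz")
    for units in range(1, 5):
        rate = vector_results_per_sec(units)
        print(f"{units} unit(s): {rate / 1e6:.1f} million results/sec")
```

These are peak figures; sustained rates would depend on how quickly operands could be delivered from memory.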
The processor included 48 32-bit registers, a huge number for the time, although they were not general purpose as they are in modern designs. Sixteen were used for addresses, another sixteen for math, eight for index offsets and another eight for vector instructions. Registers were accessed externally using a RISC-like load/store system, with instructions to load anything from 4 bits to 64 bits (two registers) at a time.
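The register breakdown above can be tabulated as a quick consistency check. This is only an illustrative sketch: the group labels and helper names are chosen here and are not the ASC's own terminology.

```python
REGISTER_BITS = 32

# Register counts per role, as listed in the paragraph above.
# The dictionary keys are illustrative labels, not official names.
REGISTER_GROUPS = {
    "address": 16,
    "math": 16,
    "index": 8,
    "vector": 8,
}


def total_registers(groups=REGISTER_GROUPS):
    """Sum the registers across all roles."""
    return sum(groups.values())


def registers_needed(load_bits, register_bits=REGISTER_BITS):
    """Number of 32-bit registers a load of `load_bits` bits occupies
    (ceiling division)."""
    return -(-load_bits // register_bits)


if __name__ == "__main__":
    print("total registers:", total_registers())
    for bits in (4, 8, 16, 32, 64):
        print(f"{bits}-bit load -> {registers_needed(bits)} register(s)")
```

The four groups sum to the 48 registers quoted, and a 64-bit load spans two registers, matching the text.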
Most vector machines tended to be memory-limited; that is, they could process data faster than they could get it from memory. This remains a major problem on modern SIMD designs as well, which is why considerable effort has been put into dramatically increasing memory throughput. In the ASC this was improved somewhat with a lookahead unit that predicted upcoming memory accesses and loaded them into the ALU registers invisibly, using a memory interface in the CPU known as the memory buffer unit (MBU).
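The benefit of lookahead can be illustrated with a toy model. The sketch below is not the ASC's actual prediction scheme, which the text does not describe; it assumes a simple sequential predictor that fetches the next few addresses after every access, ignores timing and buffer capacity, and uses hypothetical names throughout.

```python
def count_stalls(addresses, lookahead):
    """Count accesses whose data was not already buffered.

    Assumption: after each access, the next `lookahead` consecutive
    addresses are fetched into an unbounded buffer. A lookahead of 0
    disables prefetching entirely.
    """
    buffered = set()
    stalls = 0
    for addr in addresses:
        if addr not in buffered:
            stalls += 1  # the processor would have to wait on memory
        for offset in range(1, lookahead + 1):
            buffered.add(addr + offset)  # predicted upcoming accesses
    return stalls


if __name__ == "__main__":
    sequential = list(range(100))
    strided = list(range(0, 200, 2))
    print("sequential, no lookahead:", count_stalls(sequential, 0))
    print("sequential, lookahead 1: ", count_stalls(sequential, 1))
    print("stride 2,   lookahead 1: ", count_stalls(strided, 1))
    print("stride 2,   lookahead 2: ", count_stalls(strided, 2))
```

In this model a sequential stream stalls only on its first access once lookahead is enabled, while a strided stream gains nothing unless the lookahead distance covers the stride, which shows why the quality of the prediction matters.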
The "Peripheral Processor" (PP) was a separate system dedicated entirely to quickly running the operating system and the programs running within it, as well as feeding data to the main CPU. The PP was built out of eight "virtual processors", or VPs, which were designed to handle instructions and basic integer math only. Each VP included its own program counter, and the system could thus run eight programs at the same time, limited by memory accesses. Keeping eight programs running allowed the system to shuffle execution of programs on the main CPU depending on what data was available on the memory bus at that time, attempting to avoid "dead time" when the CPU was waiting on memory.
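The idea of eight virtual processors, each with its own program counter, can be sketched as follows. This is a simplified illustration, not a description of the real hardware: the round-robin interleaving is an assumption made here, since the text does not say how the VPs were scheduled, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualProcessor:
    """One VP: a program plus its own private program counter."""
    program: list
    pc: int = 0
    trace: list = field(default_factory=list)

    def done(self):
        return self.pc >= len(self.program)

    def step(self):
        """Execute (here: just record) one instruction and advance the PC."""
        self.trace.append(self.program[self.pc])
        self.pc += 1


def run_round_robin(vps):
    """Interleave the VPs one instruction at a time (assumed policy).

    Returns the order in which VP indices were given a step.
    """
    order = []
    while any(not vp.done() for vp in vps):
        for index, vp in enumerate(vps):
            if not vp.done():
                vp.step()
                order.append(index)
    return order


if __name__ == "__main__":
    vps = [VirtualProcessor([f"vp{i}-op{j}" for j in range(2)]) for i in range(8)]
    print(run_round_robin(vps))
```

Because each VP keeps its own program counter, any one of them can be paused and resumed without disturbing the others, which is what lets the system keep eight programs in progress at once.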
The PP also included a set of sixty-four 32-bit registers known as the communications register (CR). The CR put the "Peripheral" in the PP, and was the main storage system for state information shared between the various parts of the ASC: the CPU, the VPs, and the channel controllers.
When ASC machines first became available in the early 1970s they outperformed almost all other machines, including the CDC STAR-100, and under certain conditions matched the infamous one-off ILLIAC IV. However, only seven had been installed when the famous Cray-1 was announced in 1975. The Cray dedicated almost all of its design to sustained high-speed access to memory, including a 1M-64-bit-word semiconductor memory and a 12.5ns cycle time, roughly five times faster than the ASC's 60ns. Although the ASC was in some ways a more interesting and expandable design, in the supercomputer world outright speed wins, and the Cray was simply much faster. ASC sales ended almost overnight, and Texas Instruments decided to exit the market entirely instead of attempting to build a new version to compete.