FASTRA II: the world’s most powerful desktop supercomputer
Specifications
Hardware overview
Hardware assembly
Software overview
Hardware overview
Case: Lian-Li PC-P80 Armorsuit
The PC-P80 case, which was also used in the FASTRA I build, provides a massive amount of working space and offers 9 expansion slots at the back of the case. Although the graphics cards in FASTRA II do not fit directly into these slots, the slot openings provide large ventilation gaps for releasing exhaust heat from the cards. The case had to be modified slightly for this project by drilling holes for the GPU rack screws.
Motherboard: ASUS P6T7 WS Supercomputer
The ASUS P6T7 motherboard is the only workstation motherboard available that has seven full-size PCI Express slots. The X58 chipset is connected to two additional nForce 200 chips that distribute PCI Express bandwidth among the seven slots.
CPU: Intel Core i7 920
Managing 13 GPUs simultaneously requires heavy multithreading on the host side, which calls for a multicore CPU. As nearly all of the computational load is shifted to the GPUs, we opted for the Core i7 920, which is highly affordable and still leaves room for future upgrades.
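To illustrate the kind of host-side multithreading described here, the following is a minimal C++ sketch, not the actual FASTRA II code. It assumes one host thread per GPU and a volume that is split into contiguous slice ranges. The names SliceRange, partitionSlices, processOnDevice and runAllDevices are hypothetical, and the GPU work is replaced by a placeholder so that the sketch compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical unit of work: a contiguous range of volume slices [begin, end).
struct SliceRange {
    std::size_t begin;
    std::size_t end;
};

// Split numSlices as evenly as possible over numDevices workers.
// The first (numSlices % numDevices) workers receive one extra slice.
std::vector<SliceRange> partitionSlices(std::size_t numSlices, std::size_t numDevices) {
    std::vector<SliceRange> ranges;
    ranges.reserve(numDevices);
    const std::size_t base = numSlices / numDevices;
    const std::size_t extra = numSlices % numDevices;
    std::size_t begin = 0;
    for (std::size_t d = 0; d < numDevices; ++d) {
        const std::size_t count = base + (d < extra ? 1 : 0);
        ranges.push_back({begin, begin + count});
        begin += count;
    }
    return ranges;
}

// Placeholder for the per-GPU work (assumption: in a CUDA build, the thread
// would bind to its device here and launch kernels on its slice range).
// This stand-in only reports how many slices it was given.
std::size_t processOnDevice(int device, SliceRange range) {
    (void)device;
    return range.end - range.begin;
}

// Start one host thread per device, wait for all of them, and return the
// total number of slices processed.
std::size_t runAllDevices(std::size_t numSlices, std::size_t numDevices) {
    const std::vector<SliceRange> ranges = partitionSlices(numSlices, numDevices);
    std::vector<std::size_t> done(numDevices, 0);
    std::vector<std::thread> workers;
    workers.reserve(numDevices);
    for (std::size_t d = 0; d < numDevices; ++d) {
        // Each thread writes only to its own element of `done`.
        workers.emplace_back([&, d] {
            done[d] = processOnDevice(static_cast<int>(d), ranges[d]);
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    std::size_t total = 0;
    for (std::size_t count : done) {
        total += count;
    }
    return total;
}
```

With 1024 slices and 13 devices, this split gives ten workers 79 slices each and three workers 78 slices each. The host threads mostly wait on their GPU, which is why a modest multicore CPU can be enough for this pattern.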
Memory: 6×2GB Corsair DDR3 1333
Having as much RAM as possible is crucial for our type of applications. 12GB is usually sufficient to load large 3D volumes (e.g. 1024×1024×1024) completely in memory. We would have loved more memory, though: compared to a “real” supercomputer, we have only a tiny amount of memory at our disposal. The Corsair memory has decent timings at an affordable price point. Remarkably, the total amount of GPU memory in the FASTRA II system is about the same as the total amount of system RAM. It is not strictly necessary for all GPU memory to be backed by an equal amount of system RAM, though.
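The memory figures above can be checked with a little arithmetic. The sketch below rests on two assumptions that are not stated in this post: each voxel is stored as a 4-byte single-precision float, and each GPU (both on the GTX295 and on the GTX275) has 896 MiB of memory, as in the standard configurations of those cards. The card counts come from the graphics card list further down. All names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kMiB = 1024ULL * 1024ULL;
constexpr std::uint64_t kGiB = 1024ULL * kMiB;

// Size in bytes of a cubic volume with n voxels along each axis.
constexpr std::uint64_t volumeBytes(std::uint64_t n, std::uint64_t bytesPerVoxel) {
    return n * n * n * bytesPerVoxel;
}

// Card counts taken from the hardware list: six dual-GPU GTX295 cards
// (four dual-PCB, two single-PCB) and one single-GPU GTX275.
constexpr unsigned kDualGpuCards = 6;
constexpr unsigned kSingleGpuCards = 1;
constexpr unsigned kTotalGpus = 2 * kDualGpuCards + kSingleGpuCards;

// Assumption: 896 MiB per GPU (standard GTX295 / GTX275 configuration).
constexpr std::uint64_t kMemPerGpuMiB = 896;
constexpr std::uint64_t kTotalGpuMemMiB = kTotalGpus * kMemPerGpuMiB;

// System RAM from the memory line: 6 x 2 GB.
constexpr std::uint64_t kSystemRamMiB = 6 * 2 * 1024;

static_assert(volumeBytes(1024, 4) == 4 * kGiB, "a 1024^3 float volume is 4 GiB");
static_assert(kTotalGpus == 13, "six dual-GPU cards plus one single-GPU card");
static_assert(kTotalGpuMemMiB == 11648, "13 GPUs x 896 MiB");
```

Under these assumptions, a 1024×1024×1024 volume takes 4 GiB, so it fits in 12GB of RAM, but not several times over. The combined GPU memory comes to 11648 MiB, slightly below the 12288 MiB of system RAM.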
Harddisk: Samsung Spinpoint F3 1TB
At first, this choice may seem somewhat strange. The Spinpoint hard drive is not very fast compared to more expensive models such as the WD Raptors or solid-state disks. However, we observed that in our case, disk access is not a performance bottleneck at all. In addition, having a single hard disk improves the airflow through the case and keeps the system very tidy.
Power Supply: Thermaltake Toughpower 1500W + 3x Thermaltake PowerExpress 450W
The Thermaltake Toughpower already proved itself in the FASTRA I design. It has four 6-pin and four 8-pin PCI Express power connectors and powers four of the GTX295 cards. However, this power supply cannot power all graphics cards simultaneously. As the bottom of the case is already occupied, we decided to use the special VGA power supply offered by Thermaltake, which fits into a drive bay. Each of these units has connectors to power two graphics cards, but we power only one card with each. The 450W unit takes one drive bay, as opposed to its bigger 650W brother, which takes two drive bays.
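The division of cards between the power supplies can be reproduced with a short calculation. This sketch assumes that each GTX295 draws from one 6-pin and one 8-pin PCI Express power connector, which matches the reference design of that card but is not stated in this post. It also treats all seven cards alike when counting how many are left for the drive-bay units. The names ConnectorBudget, cardsPowered and auxiliaryUnitsNeeded are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// PCI Express power connectors offered by a power supply.
struct ConnectorBudget {
    int sixPin;
    int eightPin;
};

// Assumption: one 6-pin plus one 8-pin connector per GTX295 card.
constexpr int kSixPinPerCard = 1;
constexpr int kEightPinPerCard = 1;

// Number of cards a power supply can feed, limited by whichever
// connector type runs out first.
int cardsPowered(ConnectorBudget psu) {
    return std::min(psu.sixPin / kSixPinPerCard, psu.eightPin / kEightPinPerCard);
}

// Number of auxiliary units needed for the cards the main supply cannot
// feed, given how many cards each auxiliary unit is used for.
int auxiliaryUnitsNeeded(int totalCards, ConnectorBudget mainPsu, int cardsPerAuxUnit) {
    const int remaining = totalCards - cardsPowered(mainPsu);
    if (remaining <= 0) {
        return 0;
    }
    // Round up so that a partially used unit still counts as one unit.
    return (remaining + cardsPerAuxUnit - 1) / cardsPerAuxUnit;
}
```

With four 6-pin and four 8-pin connectors, the main supply feeds four cards. The remaining three of the seven cards, at one card per unit, account for the three PowerExpress units in the parts list.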
Graphics Cards: ASUS ENGTX275 + 4x ASUS ENGTX295 (2PCB) + 2x ASUS ENGTX295 (1PCB)
As the system was assembled over a rather long period of time, two types of GTX295 cards have been used. We started with a series of dual-PCB cards (part no. 90-C3CGX0-K0UAY00T). The more recent single-PCB cards (part no. 90-C3CGX5-K0UAY00T) generate less heat, which can also be observed clearly in our heat camera images. For the single-GPU card, which is connected to the screen, we opted for the GTX275, which is newer than the GTX285 and almost as powerful. This card has to be a single-GPU card for technical reasons, restricting us to a 13-GPU system (and not a 14-GPU one).
Flexible PCI Express risers: Adex Electronics PE-FLEX16 gen. 2 risers
The flexible risers from Adex Electronics can be ordered in different lengths, and allow for sufficient flexibility to connect all seven dual-slot graphics cards to the tightly spaced PCI Express slots of the motherboard.
GPU suspension cage: Custom Design
A strong cage is required to keep all cards in place above the motherboard. In collaboration with Tones.be and the firm LASERTEK N.V., an aluminium cage that meets these requirements was designed and manufactured.
Hardware assembly
The Belgian computer shop Tones.be provided assistance and support during this project, and performed the assembly of FASTRA II. They managed to deliver a very clean build, despite the vast number of power and riser cables involved.
Software overview
Operating System: Linux, CentOS 5.3
We selected CentOS because it provides a stable environment that doesn’t need much maintenance. Instead of the standard CentOS Linux kernel, we used a custom 2.6.29.1 kernel.
Tomography Code: C++ and MATLAB 2009b
We use portable C++ for the core functionality of our software. On Windows, we use Microsoft Visual Studio 2005, and on Linux, the C++ code can be compiled using the GNU C++ compiler. We’ve also developed a front-end for MATLAB. MATLAB has an easy-to-use interface and thus allows rapid prototyping of new algorithms. All GPU code is developed using the NVIDIA CUDA framework, a C-like programming language that allows for efficient programming of the NVIDIA GPUs.