5/03/2010
4/20/2010
Results for SHA1 & MD5 on HD5870 and new version of ighashgpu
AMD HD 5870 - MD5 : 3185M/sec
translate.googleusercontent.com/translate_c?hl=ru&ie=UTF-8&sl=ru&tl=en&u=http://www.golubev.com/blog/%3Fp%3D20&rurl=translate.google.com&usg=ALkJrhhhUMCtO7Sn4NE0kFVTz-QZvCYxzA

1/03/2010
FASTRA2
FASTRA II: the world’s most powerful desktop supercomputer
Specifications
Hardware overview
Case: Lian-Li PC-P80 Armorsuit
The PC-P80 case, which was also used in the FASTRA I build, provides a massive amount of working space and offers 9 expansion slots at the back of the case. Although the graphics cards in FASTRA II do not directly fit into these slots, the case provides big ventilation gaps for releasing exhaust heat from the cards. The case had to be modded slightly for this project, by drilling holes for attachment of the GPU rack screws.
Motherboard: ASUS P6T7 WS Supercomputer
The ASUS P6T7 motherboard is the only workstation motherboard available that has seven full-size PCI Express slots. The X58 chipset is connected to two additional nForce 200 chips that distribute PCI Express bandwidth between the seven slots.
CPU: Intel Core i7 920
Managing 13 GPUs simultaneously requires heavy multithreading on the CPU side, and therefore a multicore CPU. As nearly all computational load is shifted to the GPUs, we opted for the Core i7 920, which is highly affordable while allowing for future upgrade possibilities.
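As a rough illustration of the one-host-thread-per-GPU pattern this implies, here is a minimal Python sketch. The `process_on_gpu` function is a hypothetical placeholder for a blocking GPU call, and the queue-based scheduling is an assumption; the actual FASTRA II code is C++/CUDA and is not shown in the source.

```python
import queue
import threading

NUM_GPUS = 13  # GPU count stated in the text


def process_on_gpu(device_id, work_item):
    # Hypothetical stand-in for a blocking GPU call (upload, kernel, download).
    return device_id, work_item * work_item


def gpu_worker(device_id, tasks, results):
    # Each host thread owns exactly one device and pulls work until none is left.
    while True:
        try:
            item = tasks.get_nowait()
        except queue.Empty:
            return
        results.put(process_on_gpu(device_id, item))


def run_all(work_items, num_gpus=NUM_GPUS):
    tasks, results = queue.Queue(), queue.Queue()
    for item in work_items:
        tasks.put(item)
    threads = [
        threading.Thread(target=gpu_worker, args=(dev, tasks, results))
        for dev in range(num_gpus)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Drain the result queue into a list of (device_id, value) pairs.
    out = []
    while not results.empty():
        out.append(results.get())
    return out
```

With one thread per device, a slow GPU only delays its own work items, which is one reason a multicore CPU helps here.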
Memory: 6×2GB Corsair DDR3 1333
Having as much RAM as possible is crucial for our type of applications. 12GB is usually sufficient to load large 3D volumes (e.g. 1024×1024×1024) completely in memory. We would have loved more memory though. Compared to a “real” supercomputer we have only a tiny amount of memory at our disposal. The Corsair memory has decent timings, at an affordable price point. Remarkably, the total amount of GPU memory in the FASTRA II system is about the same as the total amount of system RAM. It is not strictly necessary that all GPU memory is backed by an equal amount of system RAM, though.
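The memory figures can be checked with a little arithmetic. The sketch below assumes 32-bit float voxels and 896 MiB of memory per GPU (the standard configuration for the GTX295 and GTX275); neither number is stated in the text.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def volume_bytes(nx, ny, nz, bytes_per_voxel=4):
    # bytes_per_voxel=4 assumes 32-bit floats (an assumption, not from the text).
    return nx * ny * nz * bytes_per_voxel


volume_gib = volume_bytes(1024, 1024, 1024) / GIB
print(volume_gib)  # 4.0

# 13 GPUs at an assumed 896 MiB each, compared with 12 GB of system RAM.
gpu_total_gib = 13 * 896 * MIB / GIB
print(gpu_total_gib)  # 11.375
```

Under these assumptions a single 1024³ volume takes 4 GiB, so 12GB holds an input and an output volume with some room to spare, and the combined GPU memory does come out close to the system RAM.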
Harddisk: Samsung Spinpoint F3 1TB
At first this choice may seem somewhat strange. The Spinpoint harddrive is not very fast compared to more expensive models such as the WD Raptors or Solid State Disks. However, we observed that in our case, disk access is not a performance bottleneck at all. In addition, having a single harddisk improves the airflow through the case and keeps the system very tidy.
Power Supply: Thermaltake Toughpower 1500W + 3x Thermaltake PowerExpress 450W
The Thermaltake Toughpower already proved itself in the FASTRA I design. It has four 6-pin and four 8-pin PCI Express power connectors and powers four of the GTX295 cards. However, this power supply cannot power all graphics cards simultaneously. As the bottom of the case is already occupied, we decided to use the special VGA power supply offered by Thermaltake, which fits into a drive bay. Each of these PSUs has connectors to power two graphics cards, but we use it for only one. The 450W model takes one drive bay, as opposed to its bigger 650W brother, which takes two.
Graphics Cards: ASUS ENGTX275 + 4x ASUS ENGTX295 (2PCB) + 2x ASUS ENGTX295 (1PCB)
As the system has been assembled over a rather long period of time, two types of GTX295 cards have been used. We started with a series of dual-PCB cards (part no. 90-C3CGX0-K0UAY00T). The more recent single-PCB cards (part no. 90-C3CGX5-K0UAY00T) generate less heat, which can also be observed clearly in our heat camera images. For the single-GPU card, which is connected to the screen, we opted for the GTX275, which is newer than the GTX285 and almost as powerful. This card has to be a single-GPU card for technical reasons, restricting us to a 13-GPU system (and not a 14-GPU one).
Flexible PCI Express risers: Adex Electronics PE-FLEX16 gen. 2 risers
The flexible risers from Adex Electronics can be ordered in different lengths, and allow for sufficient flexibility to connect all seven dual-slot graphics cards to the tightly spaced PCI Express slots of the motherboard.
GPU suspension cage: Custom Design
A strong cage is required to keep all cards in place above the motherboard. In collaboration with Tones.be and the firm LASERTEK N.V., an aluminium cage that meets the requirements was designed and manufactured.
Hardware Assembly
The Belgian computer shop Tones.be provided assistance and support during this project, and performed the assembly of FASTRA II. They managed to deliver a very clean build, despite the vast number of power and riser cables involved.
Software overview
Operating System: Linux, CentOS 5.3
We selected CentOS because it provides a stable environment that doesn’t need much maintenance. Instead of the standard CentOS Linux kernel, we used a custom 2.6.29.1 kernel.
Tomography Code: C++ and MATLAB 2009b
We use portable C++ for the core functionality of our software. On Windows, we use Microsoft Visual Studio 2005, and on Linux, the C++ code can be compiled using the GNU C++ compiler. We’ve also developed a front-end for MATLAB. MATLAB has an easy-to-use interface and thus allows rapid prototyping of new algorithms. All GPU code is developed using the NVIDIA CUDA framework, a C-like programming language that allows for efficient programming of the NVIDIA GPUs.
11/18/2009
Multi-GPU Radeon HD 5000 series video card "Radeon HD 5970"



. | Radeon HD 5970 | Radeon HD 5870 | Radeon HD 5850 |
---|---|---|---|
Process | 40nm | 40nm | 40nm |
Core clock | 725MHz | 850MHz | 725MHz |
Stream processors | 1,600 units × 2 | 1,600 units | 1,440 units |
Texture units | 80 units × 2 | 80 units | 72 units |
Memory | 1GB × 2 GDDR5 | 1GB GDDR5 | 1GB GDDR5 |
Memory clock | 1,000MHz | 1,200MHz | 1,000MHz |
Memory interface | 256bit × 2 | 256bit | 256bit |
ROP units | 32 units × 2 | 32 units | 32 units |
Board power (idle) | 42W | 27W | 27W |
Board power (peak) | 294W | 188W | 151W |












9/14/2009
9/13/2009
HD 5870 with 1024 and 2048 MiB VRAM
. | HD 4870 | HD 4890 | HD 5850 | HD 5870 |
---|---|---|---|---|
Process | 55 nm | 55 nm | 40 nm | 40 nm |
Transistors | 0.96 billion | 0.96 billion | 2.15 billion | 2.15 billion |
Chip size | 260 mm² | 282 mm² | 338 mm² | 338 mm² |
Chip frequency | 750 MHz | 850 MHz | 725 MHz | 850 MHz |
Memory frequency | 900 MHz | 975 MHz | 1000 MHz | 1200 MHz |
Stream processors | 800 | 800 | 1440 | 1600 |
Shader performance | 1200 GFLOPS | 1360 GFLOPS | 2088 GFLOPS | 2720 GFLOPS |
TMUs | 40 | 40 | 72 | 80 |
Texel fill rate | 30,000 MT/s | 34,000 MT/s | 52,200 MT/s | 68,000 MT/s |
ROPs | 16 | 16 | 32 | 32 |
Pixel fill rate | 12,000 MP/s | 13,600 MP/s | 23,200 MP/s | 27,200 MP/s |
Memory interface | 256 Bit GDDR5 | 256 Bit GDDR5 | 256 Bit GDDR5 | 256 Bit GDDR5 |
Memory bandwidth | 115.2 GB/s | 124.8 GB/s | 128.0 GB/s | 153.6 GB/s |
Prices | - | - | $299 | $399 (1024M) $449 (2048M) |
As we have now learned, the HD 5870 will appear in several versions, one with 1024 and another with 2048 MiB of video memory, although for lack of official benchmarks it is still uncertain whether the bigger memory pays off at all. The doubled memory will add 50 USD to the price. AMD's newest documents only state "< 400 USD", so the HD 5870 with 1024 MiB should still come in at 399 USD, which puts the 2048 MiB version at 449 USD.
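The derived rows in the table above (shader performance, fill rates, memory bandwidth) follow from the raw unit counts and clocks. A minimal sketch, assuming 2 FLOPs per stream processor per clock (one multiply-add) and GDDR5 transferring 4 bits per pin per listed memory clock; the function name is made up for illustration.

```python
def derived_specs(stream_processors, tmus, rops, core_mhz, mem_mhz, bus_bits=256):
    return {
        # 2 FLOPs per stream processor per clock (multiply-add), in GFLOPS.
        "gflops": stream_processors * 2 * core_mhz / 1000,
        # One texel per TMU per clock, in MT/s.
        "texel_mt_s": tmus * core_mhz,
        # One pixel per ROP per clock, in MP/s.
        "pixel_mp_s": rops * core_mhz,
        # GDDR5 moves 4 bits per pin per listed clock; bus_bits / 8 gives bytes.
        "bandwidth_gb_s": mem_mhz * 4 * bus_bits / 8 / 1000,
    }


hd5870 = derived_specs(1600, 80, 32, core_mhz=850, mem_mhz=1200)
hd4870 = derived_specs(800, 40, 16, core_mhz=750, mem_mhz=900)
print(hd5870)
print(hd4870)
```

Plugging in the HD 5870 and HD 4870 columns reproduces the table's 2720 and 1200 GFLOPS and the 153.6 and 115.2 GB/s bandwidth figures.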
9/10/2009
Radeon HD 5870 Spec!!
http://www.fudzilla.com/content/view/15433/1/
9/04/2009
System with 4 dual nVidia cards
Specifications
Hardware overview
Case: Lian-Li PC-P80 Armorsuit

An essential requirement for the case is that it must have at least eight expansion slots, two for each dual-slot graphics card. Nearly all PC cases have at most seven expansion slots, leaving the Lian-Li PC-P80 and Thermaltake Armor as the only options we could find. Both are excellent choices: we opted for the Lian-Li case mainly for availability reasons.
Motherboard: MSI K9A2 Platinum

We searched for a motherboard that has four physical PCI-Express x16 slots with double spacing between each of them. There are several boards on the market with four graphics card slots, but we could find only two that have the required double slot spacing: the MSI Quad Royal, which is now outdated and hardly available anymore, and the MSI K9A2 Platinum. We originally intended to choose a platform suitable for an Intel processor, either Skulltrail or NVIDIA 790i, but none of the available motherboards satisfied the slot requirements. Ultimately, we decided to go for an AMD chipset, as CPU performance is not a major concern in our case anyway. Note that the K9A2 motherboard has a very modest price tag and lacks several features found on more expensive motherboards.
CPU: AMD Phenom 9850

A quad-core CPU is essential in our system, as there is continuous communication going on between the CPU and GPUs. Spreading these tasks among several processor cores reduces latencies and improves the overall computation speed. We opted for the new Phenom 9850 CPU, which does not have the TLB bug that originally plagued AMD's quad-core product line.
CPU cooler: Scythe infinity

This CPU cooler provides more than enough cooling for the CPU. Since we are not overclocking the CPU, any decent cooler would have done the job.
Memory: 4x 2GB Corsair TWINX DDR2 PC6400

Having as much RAM as possible is crucial for our type of applications. 8GB is usually sufficient to load large 3D volumes (e.g. 1024x1024x1024) completely in memory. We would have loved more memory though. Compared to a “real” supercomputer we have only a tiny amount of memory at our disposal. The Corsair memory has decent timings, at an affordable price point.
Harddisk: Samsung Spinpoint F1 750GB

At first this choice may seem somewhat strange. The Spinpoint harddrive is not very fast compared to more expensive models such as the WD Raptors. However, we observed that in our case, disk access is not a performance bottleneck at all. In addition, having a single harddisk improves the airflow through the case and keeps the system very tidy.
Power Supply: Thermaltake Toughpower 1500W Modular PSU

The Thermaltake Toughpower was the only PSU we could find that has the necessary four 6-pin and four 8-pin PCI-Express power connectors. An alternative would have been to go for several smaller PSUs, but this modular PSU keeps the system extremely tidy. It turns out to be more than capable of powering four dual graphics cards simultaneously, even after overclocking the shader cores on all eight GPUs by 20%.
Graphics Cards: 4x MSI 9800GX2

There is no dominating reason to choose MSI over competing brands for the graphics cards. We obtained a good offer from MSI for these graphics cards and decided to take it. In particular, this seemed to be a sensible choice in case we needed support, as we are also using an MSI motherboard.
Software overview
We selected Windows XP-64 as the operating system for FASTRA. There were three reasons for choosing this platform: first, we needed a 64-bit operating system, in order to utilize 8GB of RAM. Second, we expected fewer driver issues on Windows compared to Linux. Third, within the Windows product line, Windows Vista is not yet supported by the NVIDIA GPU Computing platform, leaving Windows XP as the only choice. For development, we use Microsoft Visual Studio 2008. The core functionality for our CPU code is written in C++ (Visual C++), while MATLAB is often used as a front-end for rapid prototyping. All GPU code is developed using the NVIDIA CUDA framework, a C-like programming language that allows for efficient programming of the NVIDIA GPUs.
http://fastra.ua.ac.be/en/specs.html
4 HD4870X2!! why impossible?
1. The chipset does not determine a motherboard's ability to support Four 4870X2 GPUs (or a total of 8 GPUs). The driver does. Of course the motherboard needs to have a minimum of Four PCIe Gen 2 x16 slots.
2. AMD did say that they tested a system with 8 GPUs (Four 4870X2s), however: "As an aside, AMD has already built a computer that has four 4870X2s in it. So it has eight GPUs; drivers will not be supporting eight GPUs at this point of time," Mr. Hook said.
http://www.xbitlabs.com/news/video/d...ics_Cards.html
3. There are MANY extremely good reasons for a system to be built with Four 4870X2s. Any type of application that follows a SIMD (Single Instruction Multiple Data) design can hugely benefit from the massive parallel processing capacity such a system could provide, even despite the bandwidth contention and latency issues that would unfold as the PCIe Gen 2.0 x16 slots auto negotiated down to x4/x4/x4/x4. For instance I play a role in the security community and my specialty is cryptanalysis. Such a machine could have many benefits in my field of research. The programming technique used to unlock the potential of the GPUs is called stream programming or GPGPU (General Purpose Computations on GPUs). Oh, and did I mention such a system would have a potential of 9.6 TFLOPs? (Supercomputer status, indeed).
4. As of this day, due to driver support among other PCIe bandwidth related issues, there is no way such a system would have any advantage as a "Gaming System" (aside of course from the Two out of the Four 4870X2s that you COULD use in a CrossFireX configuration). Also, the 4870X2 cards use double wide spacing so one needs to do one of two things. Either build/buy a case that has more than 7 expansion slots (the standard) OR remove the default fan/heatsink cooling system from the video cards and install a water cooling solution, which would take up far less room.
5. There are indeed MOTHERBOARDS that support such a system. One is the MSI K9A2 Platinum. There is another made by MSI that also works (AND READ THIS PART CLOSELY) ... the motherboards work as far as detecting the devices within the PCIe slots, assigning them their appropriate device addresses, and auto negotiating the 4 PCIe slots down (for this is a non operating system/driver level function). Now, as of Catalyst 8.9 (the most recent available Radeon driver) 8 GPUs (Four 4870X2s) do not seem to be supported. Now you'll notice I just said "seem". Perhaps that was an incorrect choice of words, for the most compelling evidence I have to present to this point is this:

The above is ONE of my systems. I have tried two different motherboards, and while I've found there is some tricky programming I can do to access each card at an extremely low level, the driver from ATI most certainly does not support this configuration. Also, I've tried every available operating system ATI supports, many it does not, and I've even spent a great deal of time disassembling the driver in order to find a "hacked" solution... I have had no luck yet. I did contact MSI and they verified that theoretically Four 4870X2s is possible. They agree also that it is a matter of ATI's driver support. In addition, the FASTRA project, which uses 8 GPUs of the NVIDIA variety, uses that exact same motherboard for their stream processing supercomputer, which they use for a field of study called Tomography. NVIDIA's drivers DO seem to support such a configuration with their CUDA development platform.
Last, I would like to say that I very sincerely hope that I'm incorrect, for regardless of all that I've said I'm still looking for a possible solution. In addition (I can't remember if I said this already): there is absolutely no guarantee that Intel's X58 chipset will support Four 4870X2s (8 GPUs), but there is certainly also no reason that it would not, just as there is no reason that they theoretically couldn't work on present day motherboards with the mentioned requirements.
Anyway, I didn't mean to be abrupt or "bash" anyone and I apologize if I came off that way. I'm simply regurgitating what I have both experienced and learned to this point.
Regards.
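For reference, the 9.6 TFLOPs figure in point 3 can be reproduced with a quick calculation. This sketch assumes each 4870X2 carries two GPUs with 800 stream processors at 750 MHz and 2 FLOPs per stream processor per clock; the post itself gives only the total.

```python
def peak_tflops(cards, gpus_per_card, sp_per_gpu, core_mhz, flops_per_sp_per_clock=2):
    # Theoretical single-precision peak; real workloads will land below this.
    total_sp = cards * gpus_per_card * sp_per_gpu
    return total_sp * flops_per_sp_per_clock * core_mhz / 1e6


print(peak_tflops(cards=4, gpus_per_card=2, sp_per_gpu=800, core_mhz=750))  # 9.6
```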
Creating A5/1 Rainbow Tables
The attack on A5/1 is a reimplementation of the attack by THC, which was done in early 2008. Our approach differs slightly, as we use more common hardware to generate the tables, namely graphics cards with GPGPU capability. We also attempt to build a distributed infrastructure of nodes, where each node donates both a small portion of disk space for a part of the table and some kind of fast hardware for the generation of and lookup in its own table. We also took this project as a motivation to design and code a general purpose TMTO library. The attack itself is still the same and we owe THC much for their pioneering work. Also take a look at http://airprobe.org for information and software on the sniffing of GSM data.
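To give a feel for the time-memory trade-off (TMTO) behind such tables, here is a toy rainbow table in Python. It is only a sketch under loud assumptions: SHA-256 over a 16-bit keyspace stands in for the A5/1 mapping, and the chain length, table size and all function names are invented for the demo. It is not the project's TMTO library.

```python
import hashlib

N = 1 << 16        # toy keyspace size (assumption for the demo)
CHAIN_LEN = 64     # links per chain
NUM_CHAINS = 512   # chains stored in the table


def H(x):
    # One-way function; SHA-256 is a stand-in, not A5/1.
    return hashlib.sha256(x.to_bytes(4, "big")).digest()


def R(digest, column):
    # Column-dependent reduction from a digest back into the keyspace.
    return (int.from_bytes(digest[:8], "big") + column) % N


def chain_end(x, start_column=0):
    # Walk the chain from start_column to the final column.
    for column in range(start_column, CHAIN_LEN):
        x = R(H(x), column)
    return x


def build_table():
    # Only (endpoint -> start points) is stored; that is the memory saving.
    table = {}
    for start in range(NUM_CHAINS):
        table.setdefault(chain_end(start), []).append(start)
    return table


def lookup(table, target):
    # Try every column the target digest could sit in, last column first.
    for column in range(CHAIN_LEN - 1, -1, -1):
        end = chain_end(R(target, column), column + 1)
        for start in table.get(end, []):
            # Replay the stored chain and verify, which rejects false alarms.
            x = start
            for c in range(CHAIN_LEN):
                if H(x) == target:
                    return x
                x = R(H(x), c)
    return None
```

Each node in a distributed setup could hold its own table of this kind; a lookup costs extra hashing in exchange for storing only the chain endpoints. Inputs that never appear in any chain are simply not found.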
8/27/2009
VRIDGE X100 - PCI Express Expander

Item | Specification |
---|---|
External connector | PCI-Express x16 connector |
Bus | PCI-Express x16 |
Size | 131mm×69mm×14mm |
http://www.elsa-jp.co.jp/english/products/pes/vridge_x100_dual16/index.html