Showing posts with label Processor. Show all posts

11/25/2010

Wire-speed processor

The PowerPC A2 is a massively multicore-capable, multithreaded 64-bit Power Architecture processor core designed by IBM to the Power ISA v.2.06 specification. IBM calls products based on it PowerEN (Power Edge of Network) or "wire-speed processors". They are designed as hybrids between regular networking processors, which do switching and routing, and typical server processors, which manipulate and package data. The design was revealed on February 8, 2010, at ISSCC 2010.
Versions of processors based on the A2 core range from a 2.3 GHz version with 16 cores drawing 65 W to a less powerful four-core version drawing 20 W at 1.4 GHz. Each A2 core is capable of four-way multithreading. Each chip has 8 MB of cache as well as a multitude of task-specific engines besides the general-purpose processors, such as XML, cryptography, compression and regular expression accelerators, plus four 10 Gigabit Ethernet ports and two PCIe lanes. Up to four chips can be linked in an SMP system without any additional support chips.
The chips are said to be extremely complex, using 1.43 billion transistors on a 428 mm² die fabricated on a 45 nm process. The processors are in a late development stage and finalized products will be available at a later, unknown date. IBM says it will market the processors to customers.
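As a rough sanity check on the figures above, the thread count and power per core of the two quoted versions can be worked out as follows. This is only a sketch: the helper name `a2_summary` is made up for illustration, and the wattages are taken at face value as whole-chip figures.

```python
# Back-of-the-envelope figures for the two A2-based versions quoted above.
# a2_summary is a hypothetical helper, not anything published by IBM.
THREADS_PER_CORE = 4  # "four-way multithreading"

def a2_summary(cores, watts):
    """Return (hardware threads, watts per core) for one chip."""
    threads = cores * THREADS_PER_CORE
    watts_per_core = watts / cores
    return threads, watts_per_core

big_threads, big_w_per_core = a2_summary(cores=16, watts=65)
small_threads, small_w_per_core = a2_summary(cores=4, watts=20)

print("16-core version:", big_threads, "threads,", big_w_per_core, "W per core")
print("4-core version: ", small_threads, "threads,", small_w_per_core, "W per core")
```

The 16-core result agrees with the "16 Cores and 64 Threads" in the title of the IBM presentation listed in the reference.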





reference : wikipedia -  http://en.wikipedia.org/wiki/PowerPC_A2
                 A Wire-Speed Power Processor: 2.3GHz 45nm SOI with 16 Cores and 64 Threads, presentation, IBM - http://www.power.org/events/2010_ISSCC/Wire_Speed_Presentation_5.5_-_Final4.pdf

9/09/2009

POWER7 vs Nehalem-EX

Preview: The 8-core battle at the high end
By Nebojsa Novakovic
Monday, 7 September 2009, 13:23

LAST WEEK we covered some details about the upcoming IBM POWER7 processor, which is expected to be the second shipping 8-core general purpose server CPU after Intel's Nehalem-EX.
And no, Sun's Niagara with its ultralight cores is not a general purpose CPU, so it doesn't count.
Just like Intel's ultra high-end server offering, POWER7, IBM's flagship CPU for 2010, is a huge die, large cache monster, immensely powerful on its own yet capable of being very well connected to many of its siblings to compose very large, well scaled multiprocessor systems.
How do these two processors compare? Well, both are 45nm process behemoths with 8 cores per die, each with out-of-order execution and some degree of internal multithreading. The Nehalem-EX is expected to have 8 cores with 2 threads each, running at anywhere between 2.66GHz and 3GHz at launch in the next 4 months, while the POWER7 will have 8 cores with 4 threads each, running at up to 4GHz at launch sometime in mid-2010. So, POWER7 should be faster and more powerful from the raw hardware resources point of view, but at the cost of being half a year later to market.
Looking at each core, the Nehalem-EX core can process up to 4 instructions - some simple, some complex - per cycle, and 4 floating-point (FP) operations per cycle. Not bad at all for what is the most powerful X86 core in business today. POWER7 can do up to 6 simple instructions per cycle, and up to 8 FP operations per cycle if running 4 fused multiply-adds. Again, the raw power of the POWER7 core is somewhat higher. But then, so was the POWER6, yet it fared badly in benchmarks.
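To put the per-core numbers in perspective, here is a minimal sketch of the theoretical peak floating-point rate per chip, using only the core counts, clock speeds and FP operations per cycle quoted in this article. The function name `peak_gflops` is an illustrative assumption, and these are paper peaks, not benchmark results.

```python
# Theoretical peak FP throughput per chip, from the figures quoted in the article.
# peak_gflops is a hypothetical helper name; real sustained rates will be lower.

def peak_gflops(cores, fp_ops_per_cycle, clock_ghz):
    """Peak GFLOPS = cores x FP operations per cycle x clock in GHz."""
    return cores * fp_ops_per_cycle * clock_ghz

nehalem_ex_low = peak_gflops(cores=8, fp_ops_per_cycle=4, clock_ghz=2.66)
nehalem_ex_high = peak_gflops(cores=8, fp_ops_per_cycle=4, clock_ghz=3.0)
power7 = peak_gflops(cores=8, fp_ops_per_cycle=8, clock_ghz=4.0)

print(f"Nehalem-EX: {nehalem_ex_low:.1f} to {nehalem_ex_high:.1f} GFLOPS peak")
print(f"POWER7:     {power7:.1f} GFLOPS peak")
print(f"POWER7 advantage over the 3GHz Nehalem-EX: {power7 / nehalem_ex_high:.2f}x")
```

On paper that is a bit more than a 2.5x gap per chip, which is the "raw power" advantage the comparison refers to.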
The caches? Both are really cache-rich, so to say. Nehalem-EX's 8 cores have a shared pool of 24MB L3 SRAM cache with a fast kilobit-wide ringbus between the different cache segments to speed up access. On the other hand, POWER7 has 32MB of L3 eDRAM cache for its 8 cores. In either case, each processor core has its private low-latency 256KB L2 cache too.
How about memory? Nehalem-EX has 4 buffered DDR3 channels per chip, where, using on-board buffers, every channel splits into two actual 64-bit DDR3-1333 DRAM paths. If the buffers had the abilities like FBD AMB (Advanced Memory Buffer) chips, you might be able to do simultaneous read and write transactions on each channel, effectively doubling the bandwidth. Either way, you're looking at some 50GBps of memory bandwidth per CPU chip, not bad at all.
In the case of POWER7, though, there are two 4-channel DDR3 memory controllers, for a total of 8 channels of memory and a claimed 100GBps total memory bandwidth. Now, this definitely cannot fit into the rumoured common G34 socket with AMD's Magny-Cours or Bulldozer CPUs, as those only have 4 memory channels.
Neither would POWER7's proprietary multipath 360GBps (yes, GigaBytes not gigabits) connections to neighbouring CPUs, up to 32 of them, fit into the nearly 4 times slower 4-channel HyperTransport 3 setup on the AMD G34 socket. The Nehalem-EX 4-channel QPI interconnect, if running at 6.4GTps, would give you above 100GBps bandwidth to the other 4 neighbouring CPUs - yes, also three times slower than the POWER7, but still far from slow in reality. Also, the Nehalem-EX's symmetrical north-south-east-west QPI arrangement can scale to hundreds of sockets without extra glue logic. Look at the SGI - sorry, Rackable - UltraViolet and such systems coming soon.
Now, last but not least, the instruction set architecture, probably the most important point. POWER7 continues on the old POWER ISA architecture path, including the PowerPC-specific Altivec extensions that were in the POWER6. While PowerMac is no more, IBM still has sizable markets in mainframes, minicomputers and of course servers and clusters for the new CPU.
On the other hand, Nehalem-EX is, simply, 64-bit X86. A straight win there, whether you like the X86 or not. Everything runs, all the vendors have to use it, and there'll be a myriad of support chipsets, peripherals, software, drivers, apps, and of course every operating system out there, minus AIX and VMS, I guess. You'll even have dual-processor extreme workstations, some overclockable, with the dual "Beckton" Nehalem-EX CPUs for 16-core Skulltrail-followon monsters to appease gamers' wet dreams and engineers' complex visualisations. Just like their server counterparts, many of these will be easily upgradeable to the expected "Eagleton" 12-core 32nm chips with 36MB cache a year or so later. Unfortunately, I don't think that we'll ever see a POWER7 workstation.
Why not? Well, I think workstations are important to enable access to a given architecture to as many developers as possible, resulting in more optimised and tuned code, and of course more apps at the end. Whatever raw performance gains POWER7 has, there will always be more effort put into X86 chip code tuning and optimisation.
Finally, the price. It's too early to talk about POWER7 prices, but, if the current trends are anything to watch, expect a Nehalem-EX to be at least 3 times cheaper than the POWER7 per total system CPU unit. I won't be surprised to see an even larger price differential.
That's all for now. As more details emerge, look for more coverage here. µ

POWER7 pros:
- absolute raw performance - CPU, memory, I/O
- immense scalability within the 32 socket limit
- committed large vendor behind despite a mostly single-platform environment (Power Linux didn't take off as expected).

Nehalem-EX pros:
- it is the fastest X86 chip at launch, and it is X86 so everything runs, workstation or server
- near-limitless scalability without custom wizardry, most of it easy to reach even with Windows
- much cheaper and comes out half a year earlier.

9/01/2009

AMD Magny-Cours: 12-core CPU built from two six-core dies



On the 24th, at the Hot Chips 21 symposium held at Stanford University, AMD revealed the micro-architecture design of its 12-core processor code-named Magny-Cours. It uses multi-chip package technology to put two six-core dies in the same processor package, and at the same time improves the handling of memory broadcasts to reduce memory latency.
AMD Senior Fellow Pat Conway pointed out that Magny-Cours, the 12-core Opteron processor due to launch in 2010, will use a 45nm process. It is composed of two six-core Istanbul dies placed in one processor package through multi-chip package technology, much like the Intel Core 2 Quad. The difference is that the two dies in Magny-Cours are connected directly over the HyperTransport 3.0 protocol, whereas Intel's FSB design has to pass traffic through the north bridge chip in the middle, which significantly raises latency.
With a single socket now holding 12 cores, a 4-way system will provide 48 cores, roughly doubling the computing power available in the same volume.
In terms of micro-architecture, Magny-Cours is still based on the existing K10 design. Each die has six cores, each core has 512KB of L2 cache, and the six cores share a 6MB L3 cache; the die then connects to the other die through its HyperTransport ports. In addition, Magny-Cours supports HT Assist technology, which stores the processors' memory addressing information in the L3 cache, taking up about 1MB of it. This is said to reduce memory system latency from 120ns to only 50ns. The smaller effective L3 cache could lower the hit rate, but Pat Conway said the effect of HT Assist on the hit rate is not significant.
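The numbers in this post can be tied together with a little arithmetic. The sketch below only restates the quoted figures (two six-core dies per package, a 4-way system, 6MB of L3 per die with about 1MB used by HT Assist, latency going from 120ns to 50ns); all variable names are made up for illustration.

```python
# Arithmetic on the Magny-Cours figures quoted in this post (names are illustrative).
CORES_PER_DIE = 6
DIES_PER_PACKAGE = 2
SOCKETS = 4                 # the "4 Way" system mentioned above

L3_PER_DIE_MB = 6
HT_ASSIST_MB = 1            # approximate share of L3 used by HT Assist
LATENCY_BEFORE_NS = 120
LATENCY_AFTER_NS = 50

cores_per_socket = CORES_PER_DIE * DIES_PER_PACKAGE
system_cores = cores_per_socket * SOCKETS
l3_left_per_die_mb = L3_PER_DIE_MB - HT_ASSIST_MB
latency_cut = (LATENCY_BEFORE_NS - LATENCY_AFTER_NS) / LATENCY_BEFORE_NS

print("cores per socket:", cores_per_socket)
print("cores in a 4-way system:", system_cores)
print("L3 left for data per die (MB):", l3_left_per_die_mb)
print(f"claimed latency reduction: {latency_cut:.0%}")
```

So HT Assist gives up roughly one sixth of each die's L3 in exchange for a claimed latency cut of a little under 60%.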
Finally, Pat Conway said AMD will add a technology similar to Intel's Hyper-Threading to Opteron processors, but with a stronger effect. I believe it will appear in the next-generation Bulldozer micro-architecture.

[link]

8/27/2009

VRIDGE X100 - PCI Express Expander


1 PCI Express Host Card - 4 PCI Express slot board
Cost - $5,000

1. CASE

2. Host Card, Expander Board

3. Detail






Expander Board
chip-set             : Lucid HYDRA 100
external connector   : PCI-Express x16 connector
power connector      : ATX 24pin power connector
                       ATX 8pin power connector
power                : 4W (only chip)
bus connector        : PCI-Express x16 slot
PCI-Express slot     : 4
OS                   : Windows XP (32bit / 64bit)
                       Windows Vista (32bit / 64bit)
                       Linux
size                 : 223.8mm×264.2mm

Host Card
external connector   : PCI-Express x16 connector
bus                  : PCI-Express x16
size                 : 131mm×69mm×14mm


http://www.elsa-jp.co.jp/english/products/pes/vridge_x100_dual16/index.html

IBM, Power 7 Processor

IBM announced the Power7 processor at Hot Chips 21.
It uses a 45-nm process and has up to eight cores supporting 32 threads; a 32-way server can combine 256 cores with 1024 threads. It has 32MB of eDRAM cache, dual 4-channel DDR3 memory controllers, and memory bandwidth of 300GB/s or higher.
Insight64 analyst Nathan Brookwood explained that Power 7 is one of the fastest CPUs and would outperform Intel's Nehalem in some respects.
Power 7 is scheduled to be released officially next year.
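The 256-core / 1024-thread figure follows directly from the per-chip numbers. A minimal sketch, assuming 4 threads per core (32 threads over 8 cores) and with a made-up helper name `system_totals`:

```python
# How a 32-way Power7 server reaches 256 cores and 1024 threads.
# system_totals is a hypothetical helper used only for this illustration.

def system_totals(sockets, cores_per_chip, threads_per_core):
    """Return (total cores, total hardware threads) for a multi-socket system."""
    cores = sockets * cores_per_chip
    threads = cores * threads_per_core
    return cores, threads

threads_per_core = 32 // 8   # 32 threads across 8 cores per chip
cores, threads = system_totals(sockets=32, cores_per_chip=8,
                               threads_per_core=threads_per_core)
print(cores, "cores,", threads, "threads")
```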

IBM, "POWER7: IBM's Next Generation POWER Microprocessor", Hot Chips 21, Stanford Univ.
Power7 Spec. : 45nm, 8 cores, 32 threads, 32MB eDRAM cache, dual 4 channel DDR3, 300 GB/s Memory bandwidth.


http://www.hotchips.org/hc21/program/conference_day_two.htm

8/16/2009

Welcome to Chip Architect


[Link]

8/15/2009

Cheap and powerful MD5 Cracker

Marc Bevand, "MD5 ChosenPrefix Collisions on GPUs", Black Hat USA 2009 July 30, 2009


  • 4 Radeon HD 4870 X2 in single machine
  • 8 GPU
  • About $1500
  • Total of 6500 Mhash/sec (using IGHASHGPU s/w, 2400*4=9600 Mhash/sec)
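A quick sketch of what the figures in the list above work out to per GPU and per dollar. The variable names are made up for illustration, and the 2400 Mhash/sec per card is simply the number quoted in the parentheses, taken at face value.

```python
# Arithmetic on the MD5 cracker figures listed above (names are illustrative).
CARDS = 4                      # Radeon HD 4870 X2 cards
GPUS = 8                       # two GPUs per X2 card
COST_USD = 1500                # "About $1500"
REPORTED_MHASH = 6500          # total rate quoted first
QUOTED_PER_CARD_MHASH = 2400   # per-card rate quoted in the parentheses

quoted_total = QUOTED_PER_CARD_MHASH * CARDS
per_gpu = REPORTED_MHASH / GPUS
fraction_of_quoted = REPORTED_MHASH / quoted_total
usd_per_mhash = COST_USD / REPORTED_MHASH

print("quoted total:", quoted_total, "Mhash/sec")
print("reported rate per GPU:", per_gpu, "Mhash/sec")
print(f"reported / quoted: {fraction_of_quoted:.1%}")
print(f"cost: ${usd_per_mhash:.2f} per Mhash/sec")
```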

[Reference]

The Performance of Processors

[Current fastest processor]

  • HD 4870 X2 - 2400 GFLOPS

[reference]