
Friday, January 16, 2015

Samsung Galaxy Note 4 & Note Edge Review



Samsung's Galaxy Note line went from being a doubtful niche attempt to essentially the most popular line in Samsung's inventory in just four years. Naturally, Samsung's 2014 Note phablets were highly anticipated, and Samsung delivered accordingly. This time around, the direct successor to last year's Note 3, the aptly named Note 4, came accompanied by a very interesting attempt from Samsung to make a popular phone using its curved display technology: the Galaxy Note Edge, which is very similar to the Note 4, except that a curved extension of the main display replaces the right side of the device.
Both phones excel in almost every aspect you can think of, and are definitely very worthy successors to last year's Galaxy Note 3. Those who are adventurous and looking for something new and different will look to the Note Edge, while those who just want a traditional phablet, albeit in this case the best one on the market, will go for the regular Note 4.

To begin this review, let's look at how this year's Galaxy Notes fare in terms of pure specs, listed in the table below:

Body: Note 4: 153.5 x 78.6 x 8.5mm, 176g | Note Edge: 151.3 x 82.4 x 8.3mm, 174g
Display: Note 4: 5.7" Super AMOLED QHD (2560 x 1440, 515ppi) | Note Edge: 5.6" Super AMOLED WQXGA (2560 x 1600, 524ppi) with curved edge
Storage & RAM: 32/64 GB, 3GB RAM (both)
Networks: GSM, HSDPA, LTE Cat. 6 (both)
WiFi: Dual-band 802.11 a/b/g/n/ac (both)
Bluetooth: Bluetooth 4.1 LE (both)
Camera (Rear): 16MP with OIS, LED flash, face detection and HDR; 4K@30fps (2160p) video, 1080p@30fps or 720p@120fps with video stabilization (both)
Camera (Front): 3.7MP with 2K@30fps (1440p) video (both)
OS: Android 4.4 KitKat w/ TouchWiz UI (both)
Processor: Note 4: Snapdragon 805 (SM-N910S) or Exynos 5433 (SM-N910C) | Note Edge: Snapdragon 805
CPU: Note 4: quad-core 32-bit Krait 450 @ 2.7GHz (SM-N910S) or octa-core 64-bit big.LITTLE, quad-core Cortex-A57 @ 1.9GHz + quad-core Cortex-A53 @ 1.3GHz (SM-N910C) | Note Edge: quad-core 32-bit Krait 450 @ 2.7GHz
GPU: Note 4: Adreno 420 @ 600MHz, 337.5 GFLOPS (SM-N910S) or Mali-T760MP6 @ 700MHz, 204 GFLOPS (SM-N910C) | Note Edge: Adreno 420 @ 600MHz, 337.5 GFLOPS
Battery: Note 4: removable Li-Ion 3,220mAh, up to 14 hours video playback | Note Edge: removable Li-Ion 3,000mAh, up to 12 hours video playback
Features: Note 4: heart rate and SpO2 sensor, fingerprint scanner, S Pen | Note Edge: Smart Edge screen, heart rate and SpO2 sensor, fingerprint scanner, S Pen


Design

This is probably the only aspect where the Galaxy Note 4 and the Galaxy Note Edge actually differ significantly. While the Galaxy Note 4 is the regular shape we've come to expect phones to have, with a flat screen on the front, the Galaxy Note Edge makes an interesting use of Samsung's curved screen technology. Instead of having a flat screen covering the front, the Note Edge's panel cascades down the right edge of the phone, replacing the entire right side of the device. While it makes for a very interesting phone, from an aesthetic point of view, what is even better is the added functionality that the curved display offers. The side screen can be used for many things, like quick notifications, controls, and app shortcuts. While it won't exactly revolutionize the smartphone experience, it is a convenient extra to have.

Lefties beware! Since the Note Edge curves down the right side of the phone, it is less convenient for lefties to use. In the future, we might be seeing phones that curve down both sides, but until then, this device is more appropriate for right-handed users. 

In any case, aside from the curved screen, the Galaxy Note Edge and the Note 4 are very similar. Taking a leaf from the Galaxy Alpha's book, the two phablets' sides are made of aluminium (or in the case of the Note Edge, three of the four sides), with the back panel made of plastic, textured to resemble leather. The back covers of the new Notes are still removable, and so are the batteries. While I would've liked to see an all-aluminium design, the new Notes feel very good in hand thanks to the metal sides. The front of the devices resembles any recent Samsung device, with the usual physical home button sitting between the Task Switcher and Back capacitive buttons. Underneath the home button is Samsung's swipe-based fingerprint scanner.

Overall, a very good design for the new Notes. The aluminium sides in particular are a very welcome addition to the phablets' designs, especially coming from Samsung. And at long last, Samsung has found a very non-disruptive and useful way to implement its curved AMOLED screens on a smartphone.

Display

When it comes to displays, Samsung's flagships are always highly anticipated, as their AMOLED screens are always among the best in the mobile space. Both the Note Edge and the Note 4 feature Super AMOLED screens with a Diamond PenTile pixel matrix. Of course, the main difference here is that one has a flat screen and the other has a curved one.

The Note 4 features a 5.7" display, just like the Note 3, except that this time the resolution is bumped to a stunning 2560 x 1440, which results in a fine 515ppi. This pixel density is getting close to the limit beyond which the human eye cannot discern any further detail, but for now there are still advantages to be had from the extra resolution.

The Note Edge features a curved 5.6" display with a 2560 x 1600 resolution, which translates into a pixel density of 524ppi. In comparison to the Note 4, the extra 160 pixels form the edge display portion. That aside, the Note Edge's panel should be identical to the Note 4's, offering the same benefits of Samsung's AMOLED technology, like extremely saturated colors and stunning contrast.
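For reference, the quoted pixel densities follow straight from the diagonal-resolution arithmetic. Here is a minimal sketch (plain math, not vendor data); note that the Note Edge's quoted 524ppi matches the main 2560 x 1440 area over its 5.6" diagonal, with the extra 160-pixel strip making up the edge panel.

```python
from math import hypot

def ppi(width_px, height_px, diagonal_in):
    """Pixels per inch: diagonal resolution divided by diagonal size."""
    return hypot(width_px, height_px) / diagonal_in

print(f"Note 4 (2560x1440 over 5.7 in):               {ppi(2560, 1440, 5.7):.0f} ppi")  # ~515
print(f"Note Edge, main area (2560x1440 over 5.6 in): {ppi(2560, 1440, 5.6):.0f} ppi")  # ~524
```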

Software & Features

Both the Galaxy Note 4 and the Note Edge run on an Android 4.4.4 KitKat build skinned with Samsung's TouchWiz UI. An update to Android 5.0 Lollipop should be coming out soon.

TouchWiz is known for its various software features, some useful, many useless. The same can obviously be said about the new Galaxy Notes' software. Features like Multi Window and Air Gestures continue to be implemented and refined in Samsung's new phablets. Despite how heavy the whole TouchWiz package is, the 3 GB of RAM and powerful processors should keep the new Notes running very smoothly, no matter what you throw at them.

Of course, these being devices from the Note range, a very important part of the package is the S Pen, Samsung's active stylus, which is better than ever this time around. More than ever, the S Pen lends itself to making the Note experience much more interactive and productive than on any other device.

Processor & Performance

Like every flagship device these days, the Galaxy Note 4 and Note Edge are powered by some of the most powerful processors currently available. There is a distinction to be made, however: while the Galaxy Note 4 is available with either a Snapdragon 805 or an Exynos 5433 processor, depending on the region, the Galaxy Note Edge uses only the Snapdragon 805.

The Snapdragon 805, built on a 28nm HPm process (which is starting to get old), consists of a quad-core Krait 450 CPU clocked at up to 2.7GHz. The CPU is 32-bit, so when Lollipop comes to the Snapdragon 805-powered Notes, the new OS's 64-bit support will be of no benefit. Alongside the CPU, there is an Adreno 420 GPU clocked at 600MHz and a dual-channel 64-bit LPDDR3-1600 memory interface, offering ample bandwidth at a peak 25.6GB/s.

Samsung's Exynos 5433 processor, used in some variants of the Note 4 but not in the Note Edge, is a totally different beast. Built on Samsung's cutting-edge 20nm HKMG process node, it consists of a big.LITTLE CPU configuration, with four high-performance Cortex-A57 cores clocked at 1.9GHz and four low-power Cortex-A53 cores clocked at 1.3GHz. When necessary, both CPU clusters can work together, essentially making it a true octa-core CPU. Backing up this beastly CPU is ARM's Mali-T760 GPU clocked at 700MHz, and to wrap it up (not so well), the system is fed by a dual-channel 32-bit LPDDR3-1650 memory interface capable of delivering up to 13.2GB/s of bandwidth. This is much less than what the Snapdragon 805's memory interface can deliver, and I'm not sure 13.2GB/s can cut it for such a high-resolution display; it could become a bottleneck when running bandwidth-heavy games at the Note 4's native resolution.
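To put those bandwidth figures in context, both numbers follow from bus width times transfer rate times channel count. A quick sketch of the arithmetic, assuming the interface specs listed above:

```python
def peak_bandwidth_gb_s(bus_width_bits, transfer_rate_mt_s, channels):
    """Theoretical peak bandwidth: bytes per transfer x transfers per second x channels."""
    return (bus_width_bits / 8) * transfer_rate_mt_s * channels / 1000

print(f"Snapdragon 805 (2 x 64-bit LPDDR3-1600): {peak_bandwidth_gb_s(64, 1600, 2):.1f} GB/s")  # 25.6
print(f"Exynos 5433    (2 x 32-bit LPDDR3-1650): {peak_bandwidth_gb_s(32, 1650, 2):.1f} GB/s")  # 13.2
```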

All variants of the Note 4 and Note Edge carry 3 GB of RAM, which should be more than enough, even with Samsung's heavy TouchWiz features eating up memory.

Battery

Considering the large, high-resolution display and the powerful processors, the Galaxy Note 4 and Note Edge require a large battery to keep them running for long enough. The Galaxy Note 4 has a large 3,220 mAh battery feeding it, which should be enough, despite the power hungry internals. However, probably because of the curved display portion, Samsung had to reduce the Galaxy Note Edge's battery size down to 3,000 mAh. While this is still a large battery, it means that the Galaxy Note Edge will hardly win any battery life tests.

Conclusion

The new Galaxy Note 4 and Note Edge are excellent devices, delivering the goods in pretty much every aspect you can possibly think of. Beautiful screen, attractive design, powerful hardware and, of course, the venerable S Pen all make the new Notes very worthy successors of the flagship Note line.

The Note 4 is not exactly revolutionary compared to last year's Galaxy Note 3; however, every aspect of it is improved, securing Samsung's advantage in the phablet market. The Galaxy Note Edge, on the other hand, feels more like Samsung still experimenting with ways to implement its curved displays, and it is so far Samsung's best attempt at it. Without impairing the usability of the device, Samsung managed to implement its curved screen technology in a way that not only makes for an aesthetically pleasing phone, but also adds functionality that people might actually use, unlike previous attempts (Galaxy Round, I'm talking to you).

Overall, Samsung's 2014 Note devices are their best phablets ever, and probably the best in the entire phablet market, scoring high marks in every aspect you can think of.

Friday, December 12, 2014

Apple A8 vs Snapdragon 805 vs Exynos 5433 - Smartphone SoC Comparison (2014 Edition)

Smartphones have become just about the most important gadget in a person's life, since they have positioned themselves as do-everything devices (recently, even measuring your heart rate). As such, they need powerful processors to keep things running smoothly, even with so many utilities and features baked in. Accordingly, smartphone processor performance has seen exponential growth over the last few years, blazing past even some older laptops at this point. In 2014 we have the latest and greatest ultra-mobile processors shipping in devices in time for the holiday season. Among the best competitors we have Apple, with its A8 processor, found in the iPhone 6 and 6 Plus; Qualcomm, with its latest and greatest Snapdragon 805; and Samsung's octa-core Exynos 5433 SoC. There's no doubt that all three processors are performance monsters, but which of them offers the best performance, and more importantly, which one is the most power efficient?

Firstly, let's see how these processors compare on paper:

Process Node: A8: 20nm | Snapdragon 805: 28nm HPM | Exynos 5433: 20nm HKMG
CPU: A8: dual-core 64-bit "Enhanced Cyclone" @ 1.4GHz | Snapdragon 805: quad-core 32-bit Krait 450 @ 2.7GHz | Exynos 5433: octa-core 64-bit big.LITTLE (quad-core ARM Cortex-A57 @ 1.9GHz + quad-core ARM Cortex-A53 @ 1.3GHz)
GPU: A8: PowerVR GX6450 @ 450MHz (115.2 GFLOPS) | Snapdragon 805: Adreno 420 @ 600MHz (337.5 GFLOPS) | Exynos 5433: Mali-T760MP6 @ 700MHz (204 GFLOPS)
Memory Interface: A8: single-channel 64-bit LPDDR3-1600 (12.8GB/s) | Snapdragon 805: dual-channel 64-bit LPDDR3-1600 (25.6GB/s) | Exynos 5433: dual-channel 32-bit LPDDR3-1650 (13.2GB/s)


At least on paper, all three processors are extremely powerful and very competitive when it comes to power and efficiency. However, the three SoCs use extremely different approaches to achieve their performance. While Apple prefers a smaller CPU core count, making each core very large to achieve high performance with just two cores, Samsung's quantity-over-quality philosophy means they chose to throw in a very large number of CPU cores (eight of them, in fact). Qualcomm sits between Apple and Samsung, offering four CPU cores with decent per-core performance. In practice, given that most applications do not scale performance very well beyond two cores, I personally prefer Apple's approach; however, the more limited selection of apps that can actually utilize a large core count, for instance games involving more complex physics calculations, might favor Samsung's approach. Either way, the most accurate way of comparing the performance of these processors is using synthetic benchmarks.
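To make the "most apps don't scale past two cores" point concrete, here is a minimal Amdahl's-law sketch; the 50% parallel fraction is an illustrative assumption, not a measurement of any real workload.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only part of the work can run in parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for cores in (2, 4, 8):
    print(f"{cores} cores, 50% parallel work: {amdahl_speedup(0.5, cores):.2f}x speedup")
# 2 cores -> 1.33x, 4 cores -> 1.60x, 8 cores -> 1.78x: diminishing returns beyond a few cores
```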

Let's start with the GeekBench 3 benchmark, which tests CPU performance:
As you can see, Apple's second-generation Cyclone core is just about the fastest core used in any current smartphone. Nvidia's Denver CPU core, used in the Tegra K1 SoC, outperforms Cyclone, but since the Tegra K1 is pretty much a tablet-only platform, I'm not considering it in this comparison. Meanwhile, the likewise 64-bit Exynos 5433, while behind the A8 by a large margin, is slightly ahead of the Snapdragon 805. I also included data from the Snapdragon 801 chipset to quantify the evolution of the Krait 450 core in the Snapdragon 805 compared to its predecessor, Krait 400. The difference isn't big, actually, which means the Snapdragon 805 has the weakest single-threaded performance of all current high-end SoCs.
With four high-performance CPU cores aided by another four low-power cores (yes, Samsung managed to make both core clusters work at the same time, unlike with their previous big.LITTLE CPUs), it was obvious from the start that Samsung's processor would come out on top in applications that scale to multiple cores. In fact, the Exynos 5433's multi-threaded performance has a significant advantage over the competition. In second place comes the Snapdragon 805, with a much lower yet still very high score. Again, the multi-threaded test shows only a marginal improvement over the Snapdragon 801. And in last place comes Apple's dual-core A8, which, despite employing a very powerful core solution, simply had too few cores to outperform the competition. Still, it's not far behind the Snapdragon 805, and its score is very respectable indeed. 

Now, moving on to what probably is considered the most important area in SoC performance: graphics. To measure these processors' capability for graphics rendering, we turn to the GFXBench 3.0 test.
It's reasonable to say that the three main competitors in the high-end SoC segment are pretty much on par in terms of their GPUs' OpenGL ES 3.0 performance. However, the PowerVR GX6450 in the iPhone 6 Plus takes the lead, followed closely by the Snapdragon 805's Adreno 420, and in last place is the Mali-T760 in the Exynos 5433, but again, losing by a small margin.
For OpenGL ES 2.0 performance we see the performance gap widen, however the same basic trend can be seen: The Apple A8 takes first place, followed closely by the Snapdragon 805 and a bit further behind we have the Exynos 5433. Also note how, unlike what we've seen in the CPU benchmarks, this time the Snapdragon 805 gets a huge boost compared to its predecessor, the Snapdragon 801.
The ALU test focuses on measuring the GPU's raw compute power, and on this front Qualcomm seems to be sitting very comfortably, since both the Snapdragon 805 and its predecessor, the 801, are far ahead of the Apple A8's and the Exynos 5433's GPUs.

The Fill test depends mostly on the GPU's Render Output Units (ROPs) and on the SoC's memory interface. Given that the Snapdragon 805 has a massive memory interface, comparable to the one on Apple's tablet-primed A8X chip, it naturally had a huge advantage in this test. Meanwhile, the Apple A8 is slightly below the last-gen Snapdragon 801, and the Exynos 5433 comes in last place, but by a small margin.

Power Consumption and Thermal Efficiency

Since these chips are supposed to run inside smartphones, a lot of attention has to be given to two requirements: consuming as little power as possible, especially at idle, and not heating up too much when under strain. I believe that Apple's A8 chip fares best in this department, because apart from being built on a 20nm process, its Cyclone CPU has proved quite efficient in previous appearances. As for Samsung's Exynos 5433, despite also being built on 20nm, I'm not sure that a processor which can have 8 CPU cores running simultaneously can keep itself cool under strain without thermal throttling; at least in terms of power consumption, idle power should be very low thanks to the low-power Cortex-A53 cores. Finally, it's a bit hard to determine how power efficient Qualcomm's processors are, because the company discloses close to nothing about its CPU and GPU architectures. However, it is a proven solution: Krait + Adreno SoCs from Qualcomm can be found in almost every flagship smartphone from 2014, so while Qualcomm has the disadvantage of not yet having moved to 20nm, past experience shows that its SoCs and architectures are sufficiently efficient.

Conclusion

It's a bit hard to determine exactly which processor is the best. Each one of these fares better than the others in at least one area, but each also has its clear weakness.

The Apple A8, using just two (however powerful) CPU cores at relatively low clock speeds, can deliver top-notch single-threaded performance; however, its low core count hurts it against the quad- and octa-core competition in multi-threaded applications. The PowerVR GX6450 GPU was also a good choice, as at least for general gaming it appears to be the fastest solution available in any smartphone. Power consumption should also be pretty low thanks to the 20nm process and to Apple's and ImgTech's efficient architectures.

The Snapdragon 805 is really more of an evolution of the 801, without any huge changes. For instance, it's the only 32-bit processor being compared here. However, it still manages to deliver excellent performance, building on the success of the outgoing 801. While its single-threaded performance is a bit disappointing for a 2.7GHz CPU, it does very well in multi-threaded applications, nearing the Exynos 5433's performance. The Adreno 420 GPU also performs extremely well, losing only to the Apple A8 in GFXBench's general gaming tests and absolutely destroying the competition in terms of memory bandwidth and raw compute power. While a move to 20nm would be appreciated, Qualcomm's processors are known for being power efficient, so no problem here.

Finally, Samsung's Exynos 5433 is really a mixed bag. Its 20nm HKMG process, together with the low-power Cortex-A53 cores, makes way for excellent power efficiency, at least in terms of idle power, and thanks to its huge core count, its multi-threaded performance is ahead of everyone else's. It should be noted that, despite the 20nm process, having 8 cores running at full load might force thermal throttling, especially in a smartphone chassis.
However, the Mali-T760 GPU employed is slightly behind the competition in terms of general gaming performance, and raw compute power is quite disappointing...thankfully, raw compute power matters little to the vast majority of users. Still, it's an excellent GPU, just not THE best. 

Overall, these are all excellent processors, each with its respective advantages and disadvantages. It all comes down to which aspects matter more to you: if you value performance in multi-threaded applications, an Exynos 5433-powered device is ideal. For an excellent all-around package that is also a proven solution for smartphones (plus admirable GPU compute power), pick a Snapdragon 805 device. And if you don't care as much about multi-threaded performance but want the best gaming performance of any smartphone, pick one of Apple's A8-powered iDevices.

Sunday, November 23, 2014

Apple A8X vs Tegra K1 vs Snapdragon 805 - Tablet SoC Comparison (2014 Edition)

In the last few years, ultra-mobile System-on-Chip processors have made unprecedented strides in terms of performance and efficiency, quickly raising the bar for mobile performance. One form factor that particularly benefits from the exponential growth of SoC performance is the tablet, since its large screen allows the processor's abilities to be fully utilized. For the 2014 holiday season, we have the latest and greatest of mobile performance shipping inside high-end tablets. Apple has made a whole new SoC just for its iPad Air 2 tablet, which it calls the A8X. Nvidia's Tegra K1 processor, which borrows Nvidia's venerable Kepler GPU architecture, has also appeared in a number of new high-end tablets. Finally, we have the Qualcomm Snapdragon 805 processor found in the Amazon Kindle Fire HDX 8.9" (2014). Unfortunately, most other tablets either use the aging Snapdragon 801 processor or, in the case of Samsung's latest high-end tablets, an even older Snapdragon 800 or the equally old Exynos 5420, which debuted with the Note 3 phablet in late 2013. In any case, at the pinnacle of tablet performance, we have the Apple A8X, the Tegra K1 and the Snapdragon 805 battling for the top spot.

Process Node: A8X: 20nm | Tegra K1: 28nm HPM | Snapdragon 805: 28nm HPM
CPU: A8X: tri-core 64-bit "Enhanced Cyclone" @ 1.5GHz | Tegra K1: quad-core ARM Cortex-A15 @ 2.3GHz (32-bit version) or dual-core Denver @ 2.5GHz (64-bit version) | Snapdragon 805: quad-core Krait 450 @ 2.5GHz
GPU: A8X: PowerVR GXA6850 @ 450MHz (230 GFLOPS) | Tegra K1: 192-core Kepler GPU @ 852MHz (327 GFLOPS) | Snapdragon 805: Adreno 420 @ 600MHz (172.8 GFLOPS)
Memory Interface: A8X: dual-channel 64-bit LPDDR3-1600 (25.6GB/s) | Tegra K1: dual-channel 64-bit LPDDR3-1066 (17GB/s) | Snapdragon 805: dual-channel 64-bit LPDDR3-1600 (25.6GB/s)


The CPU

It can certainly be said that all of this year's high-end mobile processors have excellent CPU performance. However, each manufacturer took a different path to reach those high performance demands, and that is what we'll be looking at in this section.

Starting with the A8X's CPU, what we have in hand is Apple's first CPU with more than two CPU cores. This time we have a Tri-core CPU, based on an updated revision of the Apple-designed Cyclone core, which utilizes the ARMv8 ISA and is therefore a 64-bit architecture. Clock speeds remain conservative with Apple's latest CPU, going no further than 1.5GHz. So with three cores at 1.5GHz, how does Apple get performance competitive with quad-core, 2GHz+ offerings from competitors? The answer lies within the Cyclone core.
The Cyclone CPU, now in its second generation, is a very wide core: it can issue up to 6 instructions per clock. Each Cyclone core contains 4 ALUs, as opposed to 2 ALUs per core in Apple's previous CPU architecture, Swift, and the reorder buffer has been enlarged to 192 instructions in order to avoid memory stalls and keep the 6 execution pipelines well fed. In comparison, a Cortex-A15 core can issue up to 3 instructions per clock, half as many as Cyclone, and can hold up to 128 instructions in its reorder buffer, only two thirds of what Cyclone can hold.
By building a very wide CPU architecture and keeping core counts and clock speeds low, Apple has, in one move, achieved excellent single-threaded performance, far beyond what a Cortex-A15 or a Krait core can produce, while at least matching the quad-core competition in multi-threaded processing. I've always said that, because CPU workloads tend to be largely serial in nature, CPUs are better served by designs that emphasize single-threaded performance, and Apple continues to do the right thing with Cyclone.
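A rough way to see why a wide, low-clocked core can keep up with a narrower, higher-clocked one is to compare peak issue throughput (issue width times clock). This is only a theoretical ceiling; real IPC depends on stalls, branch prediction and memory behavior.

```python
def peak_issue_rate_ginstr(issue_width, clock_ghz):
    """Upper bound on instructions issued per second, in billions (ignores stalls)."""
    return issue_width * clock_ghz

print(f"Cyclone    (6-wide @ 1.5GHz): {peak_issue_rate_ginstr(6, 1.5):.1f} Ginstr/s peak")  # 9.0
print(f"Cortex-A15 (3-wide @ 2.3GHz): {peak_issue_rate_ginstr(3, 2.3):.1f} Ginstr/s peak")  # 6.9
```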

The Snapdragon 805 is the last high-end SoC to utilize Qualcomm's own Krait CPU architecture, which was introduced WAY back with the Snapdragon S4. Needless to say, it's still a 32-bit core. The latest revision of the Krait architecture is dubbed Krait 450. While Krait 450 carries many improvements compared to the original Krait core, the basic architecture is still the same: like the Cortex-A15 it competes with, Krait is a relatively narrow 3-wide machine, so it won't be as fast as Cyclone in terms of single-threaded performance. Krait 450's tweaked architecture allows it to run at a whopping 2.7GHz, or to be more exact, 2.65GHz. In the case of the Snapdragon 805, we have four of these Krait 450 cores. Qualcomm's signature tweak of putting each core on an individual voltage/frequency plane allows each core to run at a different frequency, which reduces the SoC's power consumption and should translate into better battery life. With four cores at such a high frequency, the Snapdragon 805's CPU gets very good multi-threaded performance, although the relatively narrow Krait core hurts single-threaded performance considerably.

Finally, we have the Tegra K1 and its two different versions. The 32-bit version of the Tegra K1 employs a quad-core Cortex-A15 CPU clocked at up to 2.3GHz, and we've seen a CPU configuration like this in so many SoCs that by this point it's a very well known quantity. The interesting story here is the 64-bit Tegra K1, which uses a dual-core configuration of Nvidia's brand new custom CPU architecture, named Denver. If you don't care much to know about Denver's architecture, you'd better skip this section, because there is A LOT to say about Nvidia's custom CPU.

Denver: The Oddest CPU in SoC history

Denver is Nvidia's first attempt at making a proprietary CPU architecture, and for a first attempt it's actually very good. Some of Nvidia's expertise as a GPU maker has translated into its CPU architecture. For instance, much like Nvidia's GPU architectures, Denver works with VLIW (Very Long Instruction Word) instructions. Basically, this means that multiple operations are packed together into one long instruction word, which is only then sent down the execution pipelines.

Denver's most peculiar characteristic might be this one: it's an in-order machine, while basically every other high-end mobile CPU has Out-of-Order Execution (OoOE) capabilities. Denver's lack of a dedicated engine that reorders instructions in order to reduce memory stalls and therefore increase the IPC (Instructions Per Clock) should be a huge performance bottleneck. However, Nvidia employs a very interesting (and in my opinion unnecessarily complicated) way of dealing with its in-order architecture.

By not having a hardware OoOE engine built into the CPU, Nvidia has to rely on software tricks to reorder instructions and enhance ILP (Instruction Level Parallelism). Denver is actually not meant to decode ARM instructions most of the time. Rather, Nvidia chose to build a decoder that runs native instructions optimized for maximum ILP. For this optimization to happen, Nvidia has implemented a Dynamic Code Optimizer (DCO). Basically, the DCO's job is to recognize ARM instruction sequences that are sent to the CPU frequently, translate them into native instructions, and optimize them by reordering operations to reduce memory stalls and maximize ILP. For this to work, a small part of the device's storage must be reserved to hold the optimized instructions.

One implication of this system is that the CPU must be able to decode both native instructions and normal ARM instructions. For this purpose there are two decoders in the CPU block: one huge 7-wide decoder for native instructions generated by the DCO, and a secondary 2-wide decoder for ARM instructions. The difference in size between the two decoders shows how Nvidia expects the native instructions to be used most of the time. Of course, the first time a program is run, when there are no optimized native instructions ready for the native decoder, only the ARM decoder is used, until the DCO starts recognizing recurring ARM instruction sequences from the program and optimizes them; from that point onwards those specific instructions always go through the native decoder. If a program runs the same instructions many times (for example, a benchmark), eventually all of its instructions will have a corresponding optimized native version stored, and only the native decoder will be used. That corresponds to Denver's peak performance scenario.
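The gist of the hot-path scheme described above can be sketched in a few lines. This is purely illustrative pseudocode of the idea, not Nvidia's actual DCO; the threshold, names and callbacks are all hypothetical.

```python
HOT_THRESHOLD = 16      # hypothetical: how often a block must run before it gets optimized
run_counts = {}         # ARM code block address -> number of times executed
translation_cache = {}  # ARM code block address -> optimized native (wide-issue) code

def run_block(block_addr, arm_decode, native_decode, dco_optimize):
    """Dispatch one code block through either the ARM or the native decoder."""
    if block_addr in translation_cache:
        # Fast path: previously optimized code goes through the wide native decoder.
        return native_decode(translation_cache[block_addr])
    run_counts[block_addr] = run_counts.get(block_addr, 0) + 1
    if run_counts[block_addr] >= HOT_THRESHOLD:
        # Hot block: translate to native form, reorder for ILP, and cache it for next time.
        translation_cache[block_addr] = dco_optimize(block_addr)
    # Slow path: fall back to the narrow ARM decoder.
    return arm_decode(block_addr)
```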

While Nvidia's architecture might be a very interesting move, I ask myself if it wouldn't just be easier to build a regular Out-of-Order machine. But still, if it performs well in real life, it doesn't really matter how odd Nvidia's approach was. 

Now, moving on to the execution portion of the Denver machine, we see why Denver is the widest mobile CPU in existence. That title was previously held by Cyclone, with its 6 execution pipelines; however, Nvidia went a step further and produced a 7-wide machine, capable of issuing up to seven instructions at once. That alone should give the Denver core excellent single-threaded performance.

The 64-bit version of the Tegra K1 employs two Denver cores clocked at up to 2.5GHz. That makes it the SoC with the lowest core count among the ones being compared here. While single-threaded performance will most certainly be great, I'm not sure that the dual-core Denver CPU can outrun its triple-core and quad-core opponents.

In order to test that, let's start our synthetic benchmark evaluation of the CPUs with Geekbench 3, which evaluates both single-threaded and multi-threaded performance.

CPU Benchmarks

In single-threaded applications, Nvidia's custom Denver CPU core takes first place, followed closely by Apple's enhanced Cyclone core in the Apple A8X. Meanwhile, the older Cortex-A15 and Krait CPU cores are far behind, with the 2.3GHz A15 core in the 32-bit Tegra K1 pulling slightly ahead of the 2.7GHz Krait 450 core in the Snapdragon 805.


In multi-threaded applications, where all of the CPU's cores can be used, the A8X, with its triple-core configuration, blows past the competition. The dual-core Denver version of the Tegra K1 gets about the same performance as the quad-core Cortex-A15 Tegra K1 variant, with the quad-core Krait 450 coming in last place, but by a very, very small margin.

Apple's addition of one extra core to the A8X's CPU, together with the fact that Cyclone is a very powerful core, makes it easily the fastest CPU on the market for multi-threaded applications. While Nvidia's 64-bit Denver CPU core delivers impressive performance thanks to its wide core architecture, its core count works against it in the multi-threaded benchmark; it is, in fact, the only dual-core CPU being compared here. Even if it's not as fast as the A8X's CPU, Nvidia's Denver is a beast. Were it in a quad-core configuration, it would absolutely blow the competition out of the water.

The GPU

Moving away from CPU benchmarks, we shall now analyze graphics performance, which is probably even more important than CPU performance, given that it is practically a requirement for high-end tablets to act as a decent gaming machine. First we'll look at OpenGL ES 3.0 performance with GFXBench 3.0's Manhattan test, followed by the T-Rex test, which tests OpenGL ES 2.0 performance, followed by some of GFXBench 3.0's low level tests.

The Manhattan test puts the Apple A8X ahead of the competition, followed closely by both Tegra K1 variants, which have about the same performance, since they have the exact same GPU and clock speed. Unfortunately, the Adreno 420 in the Snapdragon 805 is no match for the A8X and the Tegra K1, something that points to the need for Qualcomm to up their GPU game.

The T-Rex test paints a similar picture, with the A8X slightly ahead of the Tegra K1, while both of the Tegra K1 variants get about the same score, and the Snapdragon 805 falls behind the other two processors by a pretty big margin.

The Fill rate test stresses mostly the processor's memory interface and the GPU's TMUs (Texture Mapping Units). Since both the Apple A8X and the Snapdragon 805 have the same dual-channel 64-bit LPDDR3 memory interface clocked at 800MHz, the performance advantage the Snapdragon 805 shows over the A8X can only be attributed to the Adreno 420 having better texturing performance than the PowerVR GXA6850 in the Apple A8X. Meanwhile, the two variants of the Tegra K1 feature the same memory interface, also a dual-channel 64-bit LPDDR3 interface, only with a lower 533MHz clock speed. The Tegra K1 therefore offers significantly less texturing performance than the A8X and the Snapdragon 805, but is a very worthy performer nevertheless.
The ALU test is more about testing the GPU's sheer compute power. Since Nvidia's Tegra K1 has 192 CUDA cores on its GPU, it naturally takes the top spot here, and by a pretty significant margin.
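The 327 GFLOPS figure in the spec table falls straight out of the shader count and clock: 192 cores, each doing one fused multiply-add (2 FLOPs) per clock, at 852MHz. A quick check of that arithmetic:

```python
def peak_fp32_gflops(shader_cores, clock_ghz, flops_per_core_per_clock=2):
    """Theoretical peak FP32 throughput: cores x clock x FLOPs per core per clock."""
    return shader_cores * clock_ghz * flops_per_core_per_clock

print(f"Tegra K1 (192 Kepler cores @ 852MHz): {peak_fp32_gflops(192, 0.852):.0f} GFLOPS")  # ~327
```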

For some reason, all tests show the 32-bit Tegra K1 in the Nvidia Shield Tablet scoring a few more points than the 64-bit Tegra K1 in the Google Nexus 9. But given that the two processors have the exact same GPU, this difference in performance is probably due to software tweaks in the Shield Tablet's operating system, which would make sense, given that it is more than anything a tablet for gaming.

Thermal Efficiency and Power Consumption

In the ultra-mobile space, power consumption and thermals are the biggest limiting factors for performance. As the three processors being compared here are all performance beasts, several measures had to be taken so that they wouldn't drain a battery too fast or heat up too much.

In order to keep power consumption and die size in check, Apple has decided to shrink the manufacturing process from 28nm to 20nm, a first in the ultra-mobile processor market. That alone gives it a huge advantage over the competition, since they can put more transistors in the same die area, and with the same power consumption. Since the A8X is, in general, the fastest SoC available, the smaller process node is important to keep the iPad Air 2's battery life good. 

Nvidia's Tegra K1 should also do well in terms of power consumption and thermal efficiency in situations where the GPU isn't pushed too hard. The 28nm HPM process it's built upon is nothing particularly good, but it's still not old for a 2014 processor. While the Kepler architecture is very power efficient, straining a 192-core GPU to its maximum is still going to produce a lot of heat in any case. The Nexus 9 tablet reportedly can get very warm on the back while the tablet is running an intensive game.

Finally, the Snapdragon 805 should be the least power-hungry processor here, because it is also a smartphone processor. Given that a 5" phone can carry this chip without heating up too much or draining the battery too fast, a tablet should certainly be able to do the same. To put things in perspective, if we put the Tegra K1 or the Apple A8X inside a smartphone, both would consume too much power and produce too much heat to make for a decent phone. In any case, the Snapdragon 805 is, like the Tegra K1, built on a 28nm HPM process, and since it's not as much of a performance monster as the other two processors, it should also be the easiest on a tablet's battery.

Conclusion

Objectively speaking, the comparisons made here make it pretty much clear that once again Apple takes the crown for the best SoC for this generation of high-end tablet processors. Not that the competition is bad. On the contrary, Nvidia went, in just one generation, from being almost irrelevant in the SoC market (let's face it, the Tegra 4 was not an impressive processor) to being at the heels of the current king of this market (aka Apple). The Tegra K1 is an excellent SoC, and even if it can't quite match the Apple A8X, it's still quite close to it in most aspects.

Meanwhile, Qualcomm is seeing its dominance in the tablet market start to slip. Its latest SoC, the Snapdragon 805, available even in some smartphones and phablets, appears in only one tablet, while most others carry the Snapdragon 801 or even the 800. This is disappointing, given that a tablet can put the processing power to better use than a smartphone or a phablet. Either way, the Snapdragon 805 is still a very good processor; it's just far from being the fastest. Perhaps Qualcomm should consider, like Nvidia and Apple, making a processor with extra oomph meant only for tablets, because while the Snapdragon 805 is an excellent smartphone processor, it's not as competitive in the tablet market.