Earlier this month, Qualcomm invited journalists to a virtual Snapdragon Tech Summit where they announced the Snapdragon 888 mobile platform. Qualcomm's latest 8-series SoC brings major improvements to image processing and machine learning but only incremental improvements to CPU and GPU performance. To find out just how much more powerful Qualcomm's latest chipset is, we typically get the opportunity to run benchmarks on its reference hardware. Due to COVID-19, however, Qualcomm could not arrange an in-person benchmarking session, so instead, they sent us a prerecorded video showing a Qualcomm Snapdragon 888 reference device running through a range of popular benchmarks.

On the Snapdragon 888 reference device, Qualcomm ran one holistic benchmark (AnTuTu), a CPU-centric benchmark (Geekbench), a GPU-centric benchmark (GFXBench), and several AI/ML benchmarks (AIMark, AITuTu, MLPerf, and Procyon). Each benchmark was run three times, and the company shared the average result across the three iterations. In addition, the company says they ran each benchmark using the default settings on the Snapdragon 888 reference device, meaning they did not enable any high-performance mode. However, because the benchmark scores were provided to us, we can't verify the results or testing conditions for ourselves. Once we get our hands on a commercial device with the Qualcomm Snapdragon 888, we'll rerun these benchmarks.

If you're interested in reading up on all the specifications and features of the Qualcomm Snapdragon 888 mobile platform, then I recommend reading Idrees Patel's excellent explainer on the Snapdragon 888 published earlier this month. His article goes into detail about all the improvements Qualcomm made to the CPU, GPU, modem, connectivity subsystem, ISP, AI engine, DSP, and everything else. For quick reference, I put together a chart comparing the key specifications of the Qualcomm Snapdragon 888 reference device with those of the other two devices used in this benchmark comparison: the Snapdragon 865-powered reference device and the Snapdragon 855-powered Pixel 4 that I used in last year's benchmarking session. You can find that chart below, ahead of the benchmark results.

Qualcomm Snapdragon 888 Benchmark Results

Test Devices' Specifications

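CPU

  • Qualcomm Snapdragon 855 (Google Pixel 4)
    • 1x Kryo 485 (ARM Cortex A76-based) Prime core @ 2.84GHz, 1x 512KB L2 cache
    • 3x Kryo 485 (ARM Cortex A76-based) Performance cores @ 2.42GHz, 3x 256KB L2 cache
    • 4x Kryo 385 (ARM Cortex A55-based) Efficiency cores @ 1.8GHz, 4x 128KB L2 cache
    • 2MB L3 cache
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device)
    • 1x Kryo 585 (ARM Cortex A77-based) Prime core @ 2.84GHz, 1x 512KB L2 cache
    • 3x Kryo 585 (ARM Cortex A77-based) Performance cores @ 2.4GHz, 3x 256KB L2 cache
    • 4x Kryo 385 (ARM Cortex A55-based) Efficiency cores @ 1.8GHz, 4x 128KB L2 cache
    • 4MB L3 cache
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device)
    • 1x Kryo 680 (ARM Cortex X1-based) Prime core @ 2.84GHz, 1x 1MB L2 cache
    • 3x Kryo 680 (ARM Cortex A78-based) Performance cores @ 2.4GHz, 3x 512KB L2 cache
    • 4x Kryo 680 (ARM Cortex A55-based) Efficiency cores @ 1.8GHz, 4x 128KB L2 cache
    • 4MB L3 cache

GPU

  • Qualcomm Snapdragon 855 (Google Pixel 4): Adreno 640
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device): Adreno 650
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device): Adreno 660

Display

  • Qualcomm Snapdragon 855 (Google Pixel 4)
    • 2280 x 1080 resolution
    • 60Hz refresh rate
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device)
    • 2880 x 1440 resolution
    • 60Hz refresh rate
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device)
    • 2340 x 1080 resolution
    • 120Hz refresh rate

AI

  • Qualcomm Snapdragon 855 (Google Pixel 4)
    • Hexagon 690 with Hexagon Vector eXtensions and Hexagon Tensor Accelerator
    • 4th generation AI Engine
    • 7 TOPS
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device)
    • Hexagon 698 with Hexagon Vector eXtensions and new Hexagon Tensor Accelerator
    • 5th generation AI Engine
    • Qualcomm Sensing Hub
    • 15 TOPS
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device)
    • Hexagon 780 with Fused AI Accelerator architecture
    • 6th generation AI Engine
    • Qualcomm Sensing Hub (2nd generation)
      • New dedicated AI processor
      • 80% task reduction offload from Hexagon DSP
      • 5X more processing power YoY
    • 16X larger shared memory
    • 50% faster scalar accelerator, 2x faster tensor accelerator YoY
    • 26 TOPS

Memory

  • Qualcomm Snapdragon 855 (Google Pixel 4)
    • 6GB LPDDR4
    • 3MB system level cache
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device)
    • 12GB LPDDR5
    • 3MB system level cache
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device)
    • 12GB LPDDR5
    • 3MB system level cache

Storage

  • Qualcomm Snapdragon 855 (Google Pixel 4): 64GB UFS 2.1
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device): 128GB UFS 3.0
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device): 512GB UFS 3.0

ISP

  • Qualcomm Snapdragon 855 (Google Pixel 4)
    • Dual 14-bit Spectra 380 ISP
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device)
    • Dual 14-bit Spectra 480 ISP
    • 2.0 Gigapixels per second throughput
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device)
    • Triple 14-bit Spectra 580 ISP
    • 2.7 Gigapixels per second throughput

Manufacturing Process

  • Qualcomm Snapdragon 855 (Google Pixel 4): 7nm (TSMC's N7)
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device): 7nm (TSMC's N7P)
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device): 5nm (Samsung's 5LPE)

Software version

  • Qualcomm Snapdragon 855 (Google Pixel 4): Android 10
  • Qualcomm Snapdragon 865 (Qualcomm Reference Device): Android 10
  • Qualcomm Snapdragon 888 (Qualcomm Reference Device): Android 11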

Benchmarks Overview

With inputs from Mario Serrafero

  • AnTuTu: This is a holistic benchmark. AnTuTu tests the CPU, GPU, and memory performance, while including both abstract tests and, as of late, relatable user experience simulations (for example, the subtest which involves scrolling through a ListView). The final score is weighted according to the designer’s considerations.
  • Geekbench: A CPU-centric test that uses several computational workloads including encryption, compression (text and images), rendering, physics simulations, computer vision, ray tracing, speech recognition, and convolutional neural network inference on images. The score breakdown gives specific metrics. The final score is weighted according to the designer’s considerations, placing a large emphasis on integer performance (65%), then floating-point performance (30%), and finally crypto (5%).
  • GFXBench: Aims to simulate video game graphics rendering using the latest APIs. Lots of onscreen effects and high-quality textures. Newer tests use Vulkan while legacy tests use OpenGL ES 3.1. Instead of a weighted score, the outputs are the number of frames rendered during the test and frames per second (essentially the frame count divided by the test length).
    • Aztec Ruins: These tests are the most computationally heavy ones offered by GFXBench. Currently, top mobile chipsets cannot sustain 30 frames per second. Specifically, the test offers really high polygon count geometry, hardware tessellation, high-resolution textures, global illumination and plenty of shadow mapping, copious particle effects, as well as bloom and depth of field effects. Most of these techniques will stress the shader compute capabilities of the processor.
    • Manhattan ES 3.0/3.1: This test remains relevant given that modern games have already arrived at its proposed graphical fidelity and implement the same kinds of techniques. It features complex geometry employing multiple render targets, reflections (cube maps), mesh rendering, many deferred lighting sources, as well as bloom and depth of field in a post-processing pass.
  • MLPerf Mobile: MLPerf Mobile is an open-source benchmark for testing mobile AI performance. It was created by MLCommons, a non-profit, open engineering consortium, to "deliver transparency and a level playing field for comparing ML systems, software, and solutions." MLPerf Mobile's first iteration provides an inference-performance benchmark for a handful of computer vision and natural language processing tasks. For more information, refer to the paper "MLPerf Mobile Inference Benchmark: Why Mobile AI Benchmarking Is Hard and What to Do About It."
    • Image classification: This test involves inferring a label to apply to an input image. Typical use cases include photo searches or text extraction. The reference model used is MobileNetEdgeTPU with 4M parameters, the dataset is ImageNet 2012 (224x224), and the quality target is 98% of FP32 (76.19% Top-1).
    • Image segmentation: This test involves partitioning an input image into labeled objects. Typical use cases include self-driving or remote sensing. The reference model used is DeepLab v3+ with 2M parameters, the dataset is ADE20K (512x512), and the quality target is 97% of FP32 (54.8% mIoU).
    • Object detection: This test involves drawing bounding boxes around objects as well as providing a label for those objects. Typical use cases involve camera input such as for hazard detection or traffic analysis while driving. The reference model is SSD-MobileNet v2 with 17M parameters, the dataset is COCO 2017 (300x300), and the quality target is 93% of FP32 (0.244 mAP).
    • Language processing: This test involves responding to questions colloquially. Typical use cases include online search engines. The reference model is MobileBERT with 25M parameters, the dataset is mini SQuAD (Stanford Question Answering Dataset) v1.1 dev, and the quality target is 93% of FP32 (93.98% F1).
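Two pieces of arithmetic from the overview above can be sketched in a few lines of Python. This is only an illustration, not either benchmark's actual implementation. The function names are made up for this example, the Geekbench section scores are hypothetical placeholders rather than measured results, and the combination is assumed to be a simple weighted arithmetic mean using the 65/30/5 split mentioned above. The MLPerf example uses only the image classification figures quoted above.

```python
from typing import Mapping

# Section weights as described in the overview (integer, floating point, crypto).
GEEKBENCH_WEIGHTS = {"integer": 0.65, "float": 0.30, "crypto": 0.05}


def weighted_score(section_scores: Mapping[str, float]) -> float:
    """Combine section scores with a weighted arithmetic mean (assumed form)."""
    return sum(
        weight * section_scores[name]
        for name, weight in GEEKBENCH_WEIGHTS.items()
    )


def quality_threshold(fp32_accuracy: float, target_fraction: float) -> float:
    """Minimum accuracy implied by a 'X% of FP32' quality target."""
    return fp32_accuracy * target_fraction


if __name__ == "__main__":
    # Hypothetical section scores, NOT results from any device in this article.
    example_sections = {"integer": 1000.0, "float": 1200.0, "crypto": 1500.0}
    print(f"Weighted score: {weighted_score(example_sections):.1f}")

    # Image classification: 98% of the 76.19% Top-1 FP32 reference accuracy.
    print(f"Minimum Top-1 accuracy: {quality_threshold(76.19, 0.98):.2f}%")
```

The same `quality_threshold` calculation applies to the other three MLPerf Mobile tasks once their FP32 reference accuracy and target fraction are plugged in.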

AnTuTu Results

Starting off with AnTuTu, we can see that the Qualcomm Snapdragon 888 reference device scored nearly 170,000 points higher than the Snapdragon 865 reference device and nearly 350,000 points higher than the Snapdragon 855-powered Pixel 4. Looking at the CPU, GPU, Memory, and UX subscores (not shown here), we can see that the biggest improvements in performance come from GPU and memory. The Snapdragon 888 QRD scored approximately 45.56% higher in AnTuTu's GPU subtest compared to the Snapdragon 865 QRD. Similarly, the Snapdragon 888 QRD scored about 52.08% higher in AnTuTu's memory subtest compared to the Snapdragon 865 QRD. Compared to the Snapdragon 855-powered Pixel 4, the 888 QRD scored 98.42% and 117.58% higher in the GPU and memory subtests, respectively.

Meanwhile, the Snapdragon 888 QRD scored approximately 30.05% and 90.28% higher in AnTuTu's CPU subtest compared to the Snapdragon 865 QRD and Snapdragon 855-powered Pixel 4, respectively. The UX subscore is difficult to compare because of the different Android OS versions each device was running (the Pixel 4 and Snapdragon 865 QRD were running Android 10 when I benchmarked them last year, while the 888 QRD is running Android 11).

Qualcomm Snapdragon 888 AnTuTu results

The big boost in memory performance is quite interesting. Both the 865 QRD and the 888 QRD feature 12GB of LPDDR5 RAM, though we don't know what the RAM is clocked at. Notably, the 865 supports up to 16GB of LPDDR5 RAM at 2750MHz, while the 888 supports up to 16GB of LPDDR5 RAM at 3200MHz. The bumps in CPU and GPU performance here are slightly above our expectations, as Qualcomm said the Snapdragon 888's CPU and GPU gains are 25% and 35%, respectively, year-on-year. The more CPU- and GPU-centric benchmarks that follow show gains that are more in line with our expectations, though.
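A quick back-of-the-envelope check shows why the memory result stands out. The sketch below assumes, purely for illustration, that both reference devices run their RAM at the maximum supported clock, which we cannot confirm. The helper name `percent_gain` is made up for this example.

```python
def percent_gain(new: float, old: float) -> float:
    """Return how much higher `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


# Maximum supported LPDDR5 clocks quoted above (MHz).
# ASSUMPTION: the reference devices actually run at these clocks.
SD865_MAX_CLOCK_MHZ = 2750
SD888_MAX_CLOCK_MHZ = 3200

# AnTuTu memory subscore gain of the 888 QRD over the 865 QRD, from the text.
ANTUTU_MEMORY_GAIN_PERCENT = 52.08

if __name__ == "__main__":
    clock_gain = percent_gain(SD888_MAX_CLOCK_MHZ, SD865_MAX_CLOCK_MHZ)
    print(f"Maximum RAM clock gain: {clock_gain:.2f}%")
    print(f"AnTuTu memory subscore gain: {ANTUTU_MEMORY_GAIN_PERCENT:.2f}%")
```

Even under that assumption, the clock increase works out to roughly 16%, well short of the roughly 52% jump in the memory subscore, so clock speed alone would not account for the difference.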

Geekbench Results

In Geekbench 5.0, the Qualcomm Snapdragon 888 scores 22.17% and 9.97% higher than the Snapdragon 865 in the single-core and multi-core tests, respectively. Compared to the Snapdragon 855, the 888 scores about 89.17% and 51.82% higher, respectively.

Qualcomm says the Snapdragon 888 provides a 25% boost in CPU performance over the Snapdragon 865. The CPU's lone ARM Cortex-X1 Prime core is clocked at a conservative 2.84GHz — the same clock speed as the last-gen ARM Cortex-A77 Prime core — so it's possible that we'll see a 3+GHz clock speed for the inevitable mid-year Snapdragon 888 "Plus" refresh. If that's the case, we can expect the CPU performance to improve even further, though right now, it's fair to say the gains are solid, yet merely incremental.

Thus, if you're upgrading from a two-year-old flagship, the 888 should bring major improvements in CPU performance. If you're upgrading from a year-old flagship, those gains are much smaller. I'm personally excited to see how a Snapdragon 888 device handles console emulation.

GFXBench Results

Qualcomm hasn't disclosed the core count or maximum frequency of the Adreno 660 GPU in the Snapdragon 888, so we have little to say about the GPU other than its gains in performance. In GFXBench's Manhattan test, which uses the OpenGL ES 3.0 API and renders a 1080p scene offscreen, the Snapdragon 888 had an average framerate of 169fps, about 34.13% and 83.7% higher than the framerates achieved by the Snapdragon 865 and 855 respectively. In GFXBench's Aztec Ruins test, which uses the Vulkan graphics API and renders a 1080p scene offscreen, the Snapdragon 888 had an average framerate of 86fps, about 38.71% and 95.45% higher than the framerates achieved by the Snapdragon 865 and 855 respectively.
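For readers who want approximate framerates for the older chipsets, they can be backed out of the figures above. The sketch below is only a derivation from the quoted percentages, which are rounded, so the outputs are estimates rather than scores reported here. The function name `implied_baseline` is made up for this example.

```python
def implied_baseline(score: float, percent_higher: float) -> float:
    """Baseline value such that `score` is `percent_higher` percent above it."""
    return score / (1.0 + percent_higher / 100.0)


# Snapdragon 888 framerates and relative gains quoted in the text.
RESULTS = {
    "Manhattan (OpenGL ES 3.0, 1080p offscreen)": {
        "sd888_fps": 169,
        "gain_over": {"Snapdragon 865": 34.13, "Snapdragon 855": 83.7},
    },
    "Aztec Ruins (Vulkan, 1080p offscreen)": {
        "sd888_fps": 86,
        "gain_over": {"Snapdragon 865": 38.71, "Snapdragon 855": 95.45},
    },
}

if __name__ == "__main__":
    for test_name, data in RESULTS.items():
        print(test_name)
        for chipset, gain in data["gain_over"].items():
            estimate = implied_baseline(data["sd888_fps"], gain)
            print(f"  {chipset}: about {estimate:.1f} fps (derived estimate)")
```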

There aren't very many games that demand a lot of GPU horsepower (the recent Genshin Impact is one exception), and improved GPU performance is useful for more than just gaming. Still, gaming is the biggest reason why people will care about these benchmark results, and the Snapdragon 888 delivers with its 35% faster graphics rendering and 20% better power efficiency year-on-year. These results only demonstrate the peak GPU performance, however, so we'll have to revisit GFXBench once we get our hands on commercial hardware in order to run the benchmark's long-term performance tests.

MLPerf Results

Perhaps the most interesting gains are in AI performance. Qualcomm generally makes huge leaps in AI performance each year, but this year's gains are the most impressive. The Snapdragon 888's AI engine boasts 26 TOPS performance, an increase from the 15 TOPS performance of the Snapdragon 865 and 7 TOPS performance of the Snapdragon 855. Qualcomm credits much of this gain to the new fused AI accelerator architecture of the Hexagon 780 DSP, which fuses the scalar, vector, and tensor accelerators to eliminate the physical distance between them and pool memory so that data can be shared and moved efficiently.
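To put the TOPS figures side by side, here is a small sketch that computes the generation-over-generation change in both absolute and relative terms. The TOPS values are the vendor-quoted figures from the text, not measurements, and the helper name `generational_gains` is made up for this example.

```python
# Vendor-quoted AI Engine throughput in TOPS, oldest to newest (from the text).
TOPS_BY_CHIPSET = [
    ("Snapdragon 855", 7),
    ("Snapdragon 865", 15),
    ("Snapdragon 888", 26),
]


def generational_gains(entries):
    """Return (older, newer, absolute gain, ratio) for each consecutive pair."""
    gains = []
    for (old_name, old_tops), (new_name, new_tops) in zip(entries, entries[1:]):
        gains.append((old_name, new_name, new_tops - old_tops, new_tops / old_tops))
    return gains


if __name__ == "__main__":
    for old_name, new_name, delta, ratio in generational_gains(TOPS_BY_CHIPSET):
        print(f"{old_name} -> {new_name}: +{delta} TOPS ({ratio:.2f}x)")
```

In absolute terms, the step from the 865 to the 888 (11 TOPS) is larger than the step from the 855 to the 865 (8 TOPS), even though the ratio is smaller (about 1.73x versus about 2.14x).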

It is difficult for us to demonstrate just how significant this leap in performance actually is, however. We've talked in-depth about the difficulties of AI benchmarking during our interviews with Qualcomm's Travis Lanier, Gary Brotman, and Ziad Asghar. The good news is that, since our discussions with Qualcomm execs, there have been significant advances in the field of AI benchmarks.

At the beginning of this article, we mentioned that Qualcomm ran 4 different AI benchmarks on the Snapdragon 888 reference device: AIMark, AITuTu, MLPerf, and UL's Procyon. Perhaps the most promising of these benchmarks is MLPerf Mobile, which is a soon-to-be-released, open-source mobile AI benchmark backed by multiple SoC vendors, ML framework providers, and model producers. Its initial batch of mobile inferencing results is public, so we used those results to compare with the Snapdragon 888. The results only cover 3 devices: the MediaTek Dimensity 820-powered Xiaomi Redmi 10X 5G, the Qualcomm Snapdragon 865+-powered ASUS ROG Phone 3, and the Exynos 990-powered Samsung Galaxy Note 20 Ultra 5G. Qualcomm did not provide latency results — only throughput figures — so we did not plot the full results as submitted by the vendors for verification by MLCommons.

In these select computer vision and natural language processing inferencing benchmarks, we can see that the Qualcomm Snapdragon 888 reference device achieved the highest scores in all four tests. Of the 3 previous-generation chipsets, MediaTek's Dimensity 820 outperformed the Snapdragon 865+ and Exynos 990 in object detection, while the Exynos 990 outperformed the Snapdragon 865+ and Dimensity 820 in NLP. Qualcomm's Snapdragon 865+ was generally competitive, scoring on par with the Dimensity 820 in image segmentation and outperforming it in NLP. In these specific inferencing tests with these specific models and datasets, the Snapdragon 888 outperformed the 3 last-gen chipsets.

It will be interesting to see what applications and features developers and OEMs can create using the AI prowess of the Snapdragon 888. Computer vision will play an especially important role in the many AI-enhanced videography features we'll likely see in 2021, while improved NLP performance can likewise affect video-adjacent aspects like audio recording.

We should note, however, that the Snapdragon 888's results are unverified by MLCommons since part of the organization's verification process requires that the device be commercially available (Qualcomm's reference devices are not sold through a carrier or as an unlocked phone). Furthermore, the performance depends on what ML models, numerical formats, and ML frameworks are chosen, as well as what ML accelerators are available.

Conclusion

Qualcomm's Snapdragon 888 once again brings incremental improvements to CPU and GPU performance, but massive improvements to image processing and AI. Not many people upgrading from a two-year-old device will notice the improvements in CPU and GPU (unless they plan on running emulators or playing games like Genshin Impact), but they will definitely notice the other advancements that have been made in mobile technology. Devices these days have higher refresh rate displays, more cameras with higher resolution image sensors, support for 5G connectivity, and much more. The massive gains in AI performance will go unnoticed by the average user, but the possibilities that have opened up with Qualcomm's new chipset are exciting to ponder. Real-time AI video enhancements, multi-camera streams, and much more are on the horizon next year, and companies like Google continue to surprise with the features they release that are backed by trained machine learning models.

Qualcomm isn't the only company making improvements to its SoC lineup, though. Samsung's upcoming Exynos 2100 for the Galaxy S21 is said to bring major performance improvements. There's also Huawei's new HiSilicon Kirin 9000 and MediaTek's growing Dimensity line of mobile SoCs. I hope to revisit these benchmarks once we have at least one top-of-the-line device with Samsung's, Huawei's, and MediaTek's next-gen silicon.

Qualcomm Snapdragon 888 Benchmarking Demo

I mentioned at the beginning of this article that Qualcomm shared a prerecorded video with us. If you're interested, I have uploaded that video to YouTube. It shows the Snapdragon 888 running all the benchmarks I shared above, as well as the remaining AI benchmarks that I didn't showcase.

Meanwhile, here's the table that Qualcomm provided us summarizing the Snapdragon 888's benchmark results:

Qualcomm Snapdragon 888 benchmark results