Why Google Pixel lags 10x more than Moto Z

In my previous post I argued that a modern phone is only as fast as its slowest component: the ability of NAND to handle 4k writes. I decided to compare two Android flagships at opposite ends of the random-write-4k benchmark spectrum: Moto Z vs Google Pixel.

I wrote a little fio benchmark driver to fill all available device storage with random 4k writes and print perf stats along the way. The idea is to run the benchmark on the /data partition, then fill all available space by writing to /storage/emulated/0, then do another round of testing on /data.
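For readers who want to try something similar, here is a minimal sketch of how a driver could build the fio command line for the random-write-4k pass. It is not the actual driver used for these results. The job name, the /data/local/tmp directory, the file size, the runtime and the choice to fsync after every write are all assumptions made for illustration.

```python
import shlex


def fio_randwrite_cmd(directory, size="1g", runtime_s=60):
    """Build a fio command line for 4k random writes with an fsync after each write.

    All parameter values are illustrative assumptions, not the settings
    behind the numbers in this post.
    """
    return [
        "fio",
        "--name=randwrite4k",          # arbitrary job name
        f"--directory={directory}",    # where fio creates its test file
        "--rw=randwrite",              # random writes
        "--bs=4k",                     # 4 KiB blocks
        f"--size={size}",              # size of the test file
        "--ioengine=psync",            # plain synchronous pwrite() calls
        "--fsync=1",                   # fsync after every write
        "--time_based",
        f"--runtime={runtime_s}",      # run for a fixed number of seconds
        "--output-format=json",        # machine-readable latency stats
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into an adb shell.
    print(shlex.join(fio_randwrite_cmd("/data/local/tmp")))
```

Running this only prints the command; it would still need a fio binary built for the phone.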

The chart above has p50 (50% of IOs complete in under X), p90 and p99 numbers for both devices. The Moto Z median is around 0.5ms; the Pixel is roughly 7x that at 3.3ms. The difference widens for p90.
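To make the p50/p90/p99 terminology concrete, here is a small sketch of a nearest-rank percentile over a list of latency samples. The sample values are made up for illustration and are not measurements from either phone. fio reports its own percentiles, so this only shows what the numbers mean.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it. `p` is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


# Made-up write latencies in milliseconds (illustrative only).
latencies_ms = [0.4, 0.5, 0.5, 0.6, 0.7, 0.9, 1.2, 2.0, 3.5, 9.0]

for p in (50, 90, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

Note how a single slow write dominates p99 while barely moving the median, which is why the tail numbers matter for perceived lag.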

On mobile phones 16.67ms is a magic number. That’s the amount of time one has to update the screen at buttery-smooth 60FPS. Optimistically, one can roughly translate each data-persistence operation on Android into at least 2 sequential random writes (best-case WAL SQLite mode). So if an app is saving a single piece of data, expect 6.6ms (2 x 3.3ms) to be eaten up by IO on the Pixel, and when your device is busy, expect that number to rise quickly.
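The arithmetic behind that estimate, as a quick sketch. The median latencies are the ones quoted in this post; the two-writes-per-save figure is the optimistic assumption stated above, and the function name is just for illustration.

```python
# One frame at 60 FPS, in milliseconds (about 16.67).
FRAME_BUDGET_MS = 1000 / 60


def io_cost_ms(median_write_ms, writes_per_save=2):
    """Estimated IO time for one save, assuming the writes happen back to back."""
    return median_write_ms * writes_per_save


# Median 4k random-write latencies quoted in this post.
for device, median_ms in (("Moto Z", 0.5), ("Pixel", 3.3)):
    cost = io_cost_ms(median_ms)
    share = cost / FRAME_BUDGET_MS
    print(f"{device}: {cost:.1f} ms per save, {share:.0%} of one frame")
```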

Note this is best-case performance for these devices; I expect performance to degrade as they age. The Pixel already performs relatively poorly in this test, so expect it to drop frames or stutter as it ages.

How Motorola Smoked Google by ~10x at Storage Perf

I spent a few days poking around the filesystems while developing my benchmark experiment. Motorola (a division of Lenovo) has bravely gone above and beyond stock Android to reduce storage lag. They got the Moto-Z performing close to high-end laptop SSDs.

How did Motorola do this? The answers were hiding in the /proc/mounts file.

/storage/emulated/0: Google added a weird permission model for the common storage pool on Android. In a fit of either laziness or rushing to meet some PM deadlines for features no users asked for, they wrote a passthrough fuse filesystem to enforce cross-app file-sharing permissions. This means that on the Pixel every user IO gets a round-trip back into user-space before hitting the NAND. Fuse burns more CPU and slows down IO by up to 30%. I love fuse for things like sshfs, but this is a terrible application of it. Motorola thought a little harder and replaced the nasty fuse hack with esdfs (a fork of wrapfs).
/data: Pixel uses the traditional ext4 Linux filesystem. Moto-Z opted for f2fs. f2fs is a new filesystem developed by Samsung. It’s amazing, read the paper & watch the preso. Its development was driven specifically by Twitter/FB/etc workloads captured from phones. It does many neat things, but the thing it does best is avoid fsync write amplification. f2fs flags fsyncs via block metadata instead of doing a full checkpoint. This means fsync requires 50% fewer write operations than on ext4 (interestingly, competing filesystems like BTRFS have even higher fsync write amplification than ext4). I think the tradeoff is slightly slower recovery times. f2fs nets Moto-Z a 2x speed-up and a 2x increase in NAND lifespan. Expect Moto-Z to age much better than Pixel.
nobarrier: Moto-Z has a very interesting mount option soup for mounting f2fs: rw,seclabel,nosuid,nodev,noatime,nodiratime,background_gc=on,discard,user_xattr,inline_xattr,acl,inline_data,nobarrier,extent_cache,active_logs=6. Just for kicks I took a USB hard drive, formatted it with f2fs and applied the same mount options. Suddenly the hard drive was 2x faster than the Pixel, WTF?
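To compare mount setups on your own devices, something like the sketch below can pull the filesystem type and option list for each mount point out of /proc/mounts-style text. The sample line reuses the option string quoted above; the device path /dev/block/dm-0 is a made-up placeholder, not what the Moto-Z actually reports.

```python
def parse_mounts(text):
    """Map mount point -> (fs type, set of options) from /proc/mounts-style text.

    Each line has the form: device mount_point fs_type options dump pass.
    """
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        mounts[mount_point] = (fs_type, set(options.split(",")))
    return mounts


# Sample line: the device path is a hypothetical placeholder, the options
# are the ones quoted in this post.
SAMPLE = (
    "/dev/block/dm-0 /data f2fs rw,seclabel,nosuid,nodev,noatime,nodiratime,"
    "background_gc=on,discard,user_xattr,inline_xattr,acl,inline_data,"
    "nobarrier,extent_cache,active_logs=6 0 0"
)

fs_type, options = parse_mounts(SAMPLE)["/data"]
print(fs_type, "nobarrier" in options)
```

On a real device you would feed it the contents of /proc/mounts (for example via adb shell cat /proc/mounts) instead of SAMPLE.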

The key option is nobarrier. It stops the filesystem from issuing cache-flush commands to the device, which effectively makes fsync() a no-op as far as durability goes, and it explains most of the difference in performance. See the XFS FAQ for the best description of the nobarrier feature. Either Motorola is awesome and implemented a RAM-cache solution for cellphones, or they are betting on the excellent crash-recovery abilities of f2fs, or they are being really brave on behalf of users. Even if they didn’t implement a battery-backed RAM cache for their NAND, as long as f2fs isn’t overly horrible at recovering from crashes this is probably still the right choice. As a user, I’m much happier to have a long-lasting phone that might forget a couple of seconds of data than a device that has to be trashed after a year of use.
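A rough way to see what fsync() costs on a given mount is to time small writes that are each followed by an fsync. The sketch below does that with appended 4k blocks in a temporary file. It is a simplification (sequential appends rather than random writes, and not the fio setup used for the results above), and the function name and defaults are assumptions for illustration.

```python
import os
import tempfile
import time


def time_fsync_writes(directory, count=100, block_size=4096):
    """Return per-write latencies in ms for `count` appended blocks,
    each followed by fsync(), using a temporary file in `directory`."""
    payload = os.urandom(block_size)
    latencies_ms = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the kernel to push the data to the device
            latencies_ms.append((time.perf_counter() - start) * 1000)
    finally:
        os.close(fd)
        os.unlink(path)  # clean up the temporary file
    return latencies_ms


if __name__ == "__main__":
    results = sorted(time_fsync_writes(tempfile.gettempdir()))
    print(f"median write+fsync: {results[len(results) // 2]:.3f} ms")
```

Running it against the same disk mounted with and without nobarrier should show how much of the fsync cost comes from the cache flush.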

If anyone has root on both a Pixel and a Moto-Z, it would be interesting to see whether the underlying block devices perform differently. I suspect they are very similar and that Motorola differentiates entirely in software.

Conclusion

Android OEMs like Motorola/Samsung (f2fs authors) are improving Android performance. Moto Z and a few other recent Androids have drastically reduced storage lag. Next time you are shopping, try to avoid buying devices that will slow down to the point of being unusable as NAND wears out (à la Nexus 7, Nexus 6). I doubt anyone would spot the difference between a brand new Pixel and Moto-Z. However, after a year of use, the difference should be stark.

Phone reviewers should be more vigilant and shame poorly-implemented devices. I won’t be recommending the Pixel to any family members.

That said, I’m not recommending people buy the Moto Z. Wi-Fi/cell reception seems worse on the Z than on the Pixel. The camera is worse too.

Comments/HackerNews

Comments/Reddit

Updates

I was confusing UI state transitions with UI animations. The Android animation framework does not run on the main thread. Disregard the 16.67ms section.
In a follow-up Twitter discussion, Android engineers made a solid case that this is likely a hack. If Motorola had made barriers a no-op in hw, the nobarrier option wouldn’t be needed in sw (eg this email). It’s unclear how nobarrier was deemed safe. One could theorize that Motorola spent time QAing failure scenarios.
I’m still hoping that an Android vendor will implement a battery-backed RAM cache to solve the write-4k bottleneck. Moto-Z can be considered a risky prototype of what storage performance should be like. It will be interesting to see if my prediction of the Pixel aging worse than the Z comes true. I doubt write-4k is a bottleneck in any Android workload on the Moto-Z.