
mmap Page Fault Tracing with bpftrace and ext4

My previous blog post on tracing Firefox IO with bpftrace via the official page_fault_user tracepoint left me a bit unsatisfied with how complicated it turned out. Complexity tends to be error-prone, and the syscall-tracing dependency makes it impossible to trace IO within the main executable. I decided to try reimplementing the trace using my old approach of tracing the ext4 functions that handle page faults. This turned out to be much more robust.

eBPF for Tracing How Firefox Uses Page Faults to Load Libraries

Modern browsers are some of the most complicated programs ever written. For example, the main Firefox library on my system is over 130MB. Doing 130MB of IO poorly can be quite a performance hit, even with SSDs! :) Few people seem to understand how memory-mapped IO works. There are no pre-canned tools to observe it on Linux, so even fewer know how to observe it. Years ago, when I was working on Firefox startup performance, I discovered that libraries were loaded backwards (blog1, blog2, paper, GCC bug) on Linux.

Reading NFS at >=25GB/s using FIO + libnfs

My current employer does a lot of really cool systems work that’s covered by NDAs. I recently did some work to integrate a cool open source tool into our workflow, and I felt it deserved a blog post. NFS Testing Requires Parallelism. I work for Pure Storage. One of the products we make is a scale-out NFS (and S3-compatible) server called FlashBlade. I was asked to test FlashBlade performance scaling. I needed to generate NFS read workloads of 15-300 gigabytes/second.

Firefox's Optimized Zip Format: Reading Zip Files Really Quickly

This post is about minimizing the amount of disk IO and CPU overhead when reading Zip files. I recently saw an article about a new format that was faster than zip. This was quite surprising because, to my mind, zip is one of the most flexible and low-overhead formats I’ve encountered. Some googling showed me that over the past 11 years people have noticed that Firefox uses optimized zip files. This inspired me to document the thinking behind the optimized zip format I implemented in Firefox back in pre-pandemic 2010.

Why Google Pixel lags 10x more than Moto Z

In my previous post I argued that a modern phone is only as fast as its slowest component: the ability of its NAND to handle 4k writes. I decided to compare two Android flagships at opposite ends of the random-write-4k benchmark spectrum: Moto Z vs Google Pixel. I wrote a little fio benchmark driver to fill all available device storage with random 4k writes and print perf stats along the way.

Laggy phones and misleading benchmarks

TLDR: You can predict the degree of unresponsiveness of a phone via random-write-4k benchmarks. I wish review websites would fill phones to 80-90% before running the benchmark, especially on smaller-capacity phones where users are more likely to run out of space. SQLite vs Phone NAND. I’ve long held a theory that Android lag is almost directly determined by the slowness of SQLite transactions. This weekend, while researching phones for a family member, I found some supporting evidence.