CPU and Kernel Page Faults
A page fault occurs when a process accesses memory that isn’t backed by a physical page: the CPU raises a fault and the kernel handles it, typically by mapping in a page. It happens on first access, stack expansion, copy-on-write (CoW), swap-in, and more. It comes at a cost, however. In this episode of the Backend Engineering Show I dissect why page faults are needed and what they cost in the kernel.
0:00 Intro
4:00 Virtual memory: abstraction of physical memory; memory sharing; allows more processes to run, with unused pages going to disk; NUMA, where the kernel can place memory near the CPU
12:00 VMA areas: text/code, data, BSS, heap, stack
19:50 Kernel mode
25:30 What is a page fault?
30:30 First access page fault
33:00 Stack expansion page fault
34:30 CoW page fault
38:00 Swap page fault
39:39 File-backed page fault
40:29 Permission page fault
45:30 Summary
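The first-access case can be observed from userspace. This is a minimal sketch, assuming a Unix-like system where Python's `resource` module reports minor faults in `ru_minflt`. The helper names and the 16 MiB size are illustrative, and the exact count varies with the kernel and with settings such as transparent huge pages.

```python
import mmap
import resource


def minor_faults() -> int:
    """Minor (no disk IO) page faults taken by this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def count_first_touch_faults(size: int = 16 * 1024 * 1024) -> int:
    """Map anonymous memory, touch every page once, return the fault delta.

    count_first_touch_faults is a hypothetical helper for illustration.
    """
    page = mmap.PAGESIZE
    buf = mmap.mmap(-1, size)  # anonymous mapping; typically not yet backed
    try:
        before = minor_faults()
        for offset in range(0, size, page):
            buf[offset] = 1  # first write to a page may trigger a fault
        return minor_faults() - before
    finally:
        buf.close()


if __name__ == "__main__":
    pages = 16 * 1024 * 1024 // mmap.PAGESIZE
    print(f"touched {pages} pages, took {count_first_touch_faults()} minor faults")
```

Touching the same pages a second time should add few or no faults, since the mappings already exist.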
Amazon US-EAST-1 Outage in Details
On October 19 2025 AWS experienced an outage that lasted over a day. Ten days later we finally got the root cause analysis, and we now know exactly what caused the DNS to fail.
0:00 Summary
5:30 How did DynamoDB lose its DNS?
13:41 EC2 Errors
16:16 Network Load Balancer Errors
RCA here: https://aws.amazon.com/message/101925/
Graceful shutdown in HTTP
There are cases where the backend needs to close a connection to prevent unexpected situations, to stop bad actors, or simply to free up resources. Closing a connection gracefully allows clients and backends to clean up and finish any pending requests. In this episode of the Backend Engineering Show I discuss graceful connection shutdown in both HTTP/1.1, via the Connection header, and HTTP/2, via the GOAWAY frame.
0:00 Intro
4:58 Why shut down a connection?
6:46 HTTP/1.1 graceful shutdown
12:26 Cost of HTTP/2
17:40 HTTP/2 GOAWAY frame
23:40 Summary
Links
https://www.youtube.com/watch?v=fVKPrDrEwTI&t=1s
https://chromium.googlesource.com/chromium/src/net/%2B/master/socket/client_socket_pool_manager.cc#76
https://issues.chromium.org/issues/40555364
https://issues.chromium.org/issues/40501721
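The two mechanisms look quite different on the wire. This is a minimal sketch, assuming the frame layout from RFC 9113 (a 9-byte frame header, GOAWAY as type 0x7 on stream 0, then last stream ID, error code, and optional debug data). The helper names `http1_close_response` and `build_goaway` are hypothetical, and no real connection handling is shown.

```python
import struct

GOAWAY = 0x7              # HTTP/2 frame type for GOAWAY
NO_ERROR = 0x0            # error code used for a graceful shutdown
MAX_STREAM_ID = 2**31 - 1


def http1_close_response(body: bytes) -> bytes:
    """HTTP/1.1: the Connection: close header marks this as the last response."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def build_goaway(last_stream_id: int, error_code: int = NO_ERROR,
                 debug: bytes = b"") -> bytes:
    """HTTP/2: serialize a GOAWAY frame (always sent on stream 0)."""
    if not 0 <= last_stream_id <= MAX_STREAM_ID:
        raise ValueError("last_stream_id must fit in 31 bits")
    payload = struct.pack("!II", last_stream_id, error_code) + debug
    length = len(payload)
    # Frame header: 24-bit length, 8-bit type, 8-bit flags, 32-bit stream id.
    header = struct.pack("!BHBBI", length >> 16, length & 0xFFFF, GOAWAY, 0, 0)
    return header + payload
```

A server that wants to drain can first send `build_goaway(MAX_STREAM_ID)` to signal intent, then follow up with a second GOAWAY carrying the real last stream ID once it stops accepting new streams. Streams at or below that ID are allowed to finish.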
Postgres 18 gets Async IO
Postgres 18 has been released with many exciting features such as UUIDv7, the pg_overexplain module, composite index skip scans, and the most anticipated one, asynchronous IO with worker and io_uring modes, which I uncover in this show. Hope you enjoy it.
0:00 Intro
1:30 Synchronous vs Asynchronous calls
3:00 Synchronous IO
6:30 Asynchronous IO
10:00 Postgres 17 synchronous IO
17:20 The challenge of Async IO in Postgres 18
20:00 io_method worker
23:00 io_method io_uring
29:30 io_method sync
31:08 Async IO isn’t done!
31:30 Support for backend writers
32:36 Improve worker io_method
33:00 Direct IO support
37:00 Summary
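The synchronous versus asynchronous distinction can be illustrated outside Postgres. The sketch below is an analogy only, not Postgres code. It assumes a Unix-like system with `os.pread` and uses a thread pool as a stand-in for the idea behind the worker mode, where reads are handed off and the caller collects the results later. The 8 KiB block size and all function names are illustrative.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 8192  # illustrative block size


def read_sync(fd: int, blocks: list[int]) -> list[bytes]:
    """Synchronous: each pread blocks the caller until the data arrives."""
    return [os.pread(fd, BLOCK, n * BLOCK) for n in blocks]


def read_with_workers(fd: int, blocks: list[int], workers: int = 3) -> list[bytes]:
    """Asynchronous, worker style: submit all reads first, collect results later."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(os.pread, fd, BLOCK, n * BLOCK) for n in blocks]
        # The caller is free to do other work here while workers wait on IO.
        return [f.result() for f in futures]


def demo() -> bool:
    """Write 16 blocks to a temp file and check both paths return the same data."""
    with tempfile.TemporaryFile() as f:
        for n in range(16):
            f.write(bytes([n]) * BLOCK)
        f.flush()
        wanted = [3, 7, 11, 15]
        return read_sync(f.fileno(), wanted) == read_with_workers(f.fileno(), wanted)
```

Both paths return the same bytes; the difference is that the second one issues every read before waiting on any of them.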
Kernel level TLS
Fundamentals of Operating Systems Course: https://oscourse.win
ktls is brilliant. TLS encryption/decryption often happens in userland, while TCP lives in the kernel. With ktls, userland can hand the keys to the kernel and the kernel does the crypto. When calling write, the kernel encrypts the packet and sends it to the NIC. When calling read, the kernel decrypts the packet and hands it to userspace. This mode still taxes the host’s CPU of course, so there is another mode where the kernel offloads the crypto to the NIC device, which frees up the host CPU. Incoming packets are decrypted in the device before they are DMAed to the kernel, and outgoing packets are encrypted before they leave the NIC for the network. ktls still needs the handshake to happen in userspace. It can also enable zero-copy in some cases, now that the kernel has the context. Deserves a video. So much good stuff.
0:00 Intro
2:00 Userspace SSL Libraries
3:00 ktls
6:00 Kernel Encrypts/Decrypts (TLS_SW)
8:20 NIC offload mode (TLS_HW)
10:15 NIC does it all (TLS_HW_RECORD)
12:00 Write TX Example
13:50 Read RX Example
17:00 Zero copy (sendfile)
https://docs.kernel.org/networking/tls-offload.html
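The handoff of keys from userland to the kernel can be sketched as follows. This assumes Linux with the `tls` module available, TLS 1.2 with AES-128-GCM, and a handshake already completed in userspace. The numeric constants mirror values from the Linux headers and are spelled out because Python's `socket` module may not export them. `pack_crypto_info` and `enable_ktls_tx` are hypothetical helper names, and the key material would come from your TLS library.

```python
import socket
import struct

# Values from Linux headers; treat them as assumptions and verify on your system.
TCP_ULP = 31
SOL_TLS = 282
TLS_TX = 1
TLS_1_2_VERSION = 0x0303
TLS_CIPHER_AES_GCM_128 = 51


def pack_crypto_info(iv: bytes, key: bytes, salt: bytes, rec_seq: bytes) -> bytes:
    """Lay out struct tls12_crypto_info_aes_gcm_128 as the kernel expects it."""
    if (len(iv), len(key), len(salt), len(rec_seq)) != (8, 16, 4, 8):
        raise ValueError("expected iv=8, key=16, salt=4, rec_seq=8 bytes")
    # Two native-endian u16 fields (version, cipher type), then raw byte arrays.
    return struct.pack("=HH8s16s4s8s", TLS_1_2_VERSION, TLS_CIPHER_AES_GCM_128,
                       iv, key, salt, rec_seq)


def enable_ktls_tx(sock: socket.socket, crypto_info: bytes) -> None:
    """Attach the TLS upper-layer protocol, then hand the TX keys to the kernel.

    After this, plain send() calls on the socket are encrypted by the kernel.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_ULP, b"tls")
    sock.setsockopt(SOL_TLS, TLS_TX, crypto_info)
```

The receive direction is set up the same way with the RX option (`TLS_RX`, value 2 in the Linux headers) and the peer's keys. Whether the crypto then runs in software or on the NIC depends on the device and its driver rather than on this call.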