Coding Blocks

Technology

Designing Data-Intensive Applications – Weak Isolation and Snapshotting

2023-02-20

Ever wonder how database backups work if new data is coming in while the backup is running? Hang with us while we talk about that, while Allen doesn’t stand a chance, Outlaw is in love, and Joe forgets his radio voice.

The full show notes for this episode are available at https://www.codingblocks.net/episode204.

News
  • Thanks for the great reviews!
    • Audible: Allison Williams
  • Orlando Code Camp 2023 is coming up on March 25th 2023 (orlandocodecamp.com)
The big, beautiful, boar book: Designing Data-Intensive Applications
Weak Isolation levels
  • If two transactions don’t touch the same data, they can be run in parallel.
  • Race conditions occur when two different processes try to access or modify the same data at the same time.
  • Concurrency bugs are hard to find and test for – it usually comes down to unlucky timing.
  • Concurrency bugs can also be very difficult to understand because multiple parts of an application can be interacting with the database simultaneously and in unexpected ways.
  • Single-user interactions with a database are hard enough, and when you have multiple interactions happening simultaneously, it makes it all much more difficult.
  • Databases try to make it look like interactions happen one at a time for that very reason – to simplify the work for a developer.
    • Serializable isolation is a database guarantee that makes transactions look as if they happened serially – one after another.
  • Isolation is not that simple in reality
    • Serializable isolation comes at a performance cost
      • For this reason, most databases choose not to use it
    • Most databases use weaker isolation levels to protect against some concurrency issues but not all of them
      • These aren’t just theoretical bugs
        • Have resulted in large financial losses
        • Investigations by financial auditors
        • Customer data corruption
  • A common refrain is “use a relational db if you’re doing financial transactions”; however, since most databases use weak isolation, that alone doesn’t guarantee correct results
    • For this reason, you should understand the various weak (non-serializable) isolation levels
Read Committed
  • Two guarantees
    • When reading from the database, you will only see data that has been committed (no dirty reads)
    • When writing to the database, you will only overwrite data that has been committed (no dirty writes)
      • A second write is delayed until the first write’s transaction has been committed or aborted
      • This does not protect against the incrementing race condition, i.e. two processes read a value at the same time, say 1, then process one increments it and saves 2. Process two, which still has 1 in memory from its read, does its own increment and also stores 2. The value should have been 3, but because process two worked from a stale value, one increment was lost
        • Avoiding this is discussed later in “Preventing Lost Updates”
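The incrementing race described above can be sketched in a few lines of Python. This is only an illustration: the dictionary standing in for the database and the read/write helpers are made up for the example, and the two "processes" are interleaved by hand rather than run concurrently.

```python
# Hypothetical in-memory "database" holding one committed value.
db = {"counter": 1}

def read(key):
    # Only committed data is ever stored here, so this is never a dirty read.
    return db[key]

def write(key, value):
    # Overwrites committed data only, so this is never a dirty write.
    db[key] = value

# Both processes read before either one writes.
seen_by_one = read("counter")  # 1
seen_by_two = read("counter")  # 1

write("counter", seen_by_one + 1)  # process one stores 2
write("counter", seen_by_two + 1)  # process two also stores 2

lost_update_result = db["counter"]
print(lost_update_result)  # 2, although two increments should give 3
```

Neither guarantee of read committed is violated here, which is why this level alone does not prevent the lost update.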
Snapshot Isolation and Repeatable Read
  • Addresses read skew, an example of a non-repeatable read
    • The example given was a customer with two bank accounts who gets the balance of account 1 and then, moments after a transfer of $100 from account 2 to account 1 commits, gets the balance of account 2. The customer has an old value for account 1 and a new value for account 2, so it looks like $100 is missing.
      • This is acceptable under read committed isolation, as both account values were committed at the time of the reads.
      • How could this happen? Here’s a quick example…
        • Multiple queries were issued to get the different account values – get balance for account 1, get balance for account 2…behind the scenes, someone did a transfer from one account to the other
    • This is a very temporary state
  • There are situations where this temporary inconsistency can’t be tolerated
    • Backups, analytics queries and integrity checks
  • Snapshot isolation is a typical solution to the problem
    • Transactions read from a consistent snapshot, meaning a transaction sees every value as it was committed in the database when the transaction started, even if it reads multiple values over time
      • Very popular feature – supported by PostgreSQL, Oracle, SQL Server, MySQL with InnoDB
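Here is a rough Python sketch of the read skew example and of how reading from a snapshot avoids it. The starting balances of $500 per account, the accounts dictionary, and the transfer helper are assumptions made for illustration; a real database does not copy all of its data to take a snapshot.

```python
accounts = {1: 500, 2: 500}  # committed balances (assumed starting values)

def transfer(src, dst, amount):
    # Stands in for another transaction that commits between the two reads.
    accounts[src] -= amount
    accounts[dst] += amount

# Read committed: each query sees whatever is committed at that moment.
balance_1 = accounts[1]                       # 500, read before the transfer
transfer(src=2, dst=1, amount=100)
balance_2 = accounts[2]                       # 400, read after the transfer
read_committed_total = balance_1 + balance_2  # 900, so $100 looks missing

# Snapshot isolation: every read in the transaction uses the same frozen view.
accounts = {1: 500, 2: 500}
snapshot = dict(accounts)                     # view taken when the reader starts
snap_1 = snapshot[1]                          # 500
transfer(src=2, dst=1, amount=100)
snap_2 = snapshot[2]                          # still 500, transfer is not visible
snapshot_total = snap_1 + snap_2              # 1000

print(read_committed_total, snapshot_total)   # 900 1000
```

The snapshot reader sees slightly old data, but the two balances it sees belong to the same point in time.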
How is snapshot isolation accomplished?
  • Implementations usually use write locks to stop dirty writes
    • Reads don’t need locks: reads never block writes, and writes never block reads
  • Because there may be multiple transactions taking place at once, there may need to be multiple copies of database objects in play at once – this is referred to as multi-version concurrency control
  • The difference between read committed and snapshot isolation is that read committed uses a separate snapshot for each query, whereas snapshot isolation uses the same snapshot for every read within the transaction
    • They show an example of PostgreSQL’s implementation
      • Found this README in Postgres – https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/README-SSI
      • The implementation basically uses some metadata fields on a row – created_by and deleted_by fields which contain transaction id’s
        • If you delete a row, its deleted_by field is set to the deleting transaction’s id. The row isn’t actually deleted at that point in time; garbage collection physically removes it from the table later, once it’s deemed that no transaction can access it any longer
        • Updates are converted to creates and deletes (similar to what you’d see if you’re familiar with triggers in something like SQL Server)
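The created_by / deleted_by bookkeeping can be sketched roughly like this in Python. It is a toy model, not PostgreSQL’s actual storage format: the RowVersion class, the table list, and the garbage_collect rule are simplified assumptions, and the collector assumes every deleting transaction has committed.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    key: str
    value: int
    created_by: int                   # id of the transaction that wrote this version
    deleted_by: Optional[int] = None  # id of the transaction that deleted it, if any

table: list[RowVersion] = []

def insert(txid, key, value):
    table.append(RowVersion(key, value, created_by=txid))

def delete(txid, key):
    # Only mark the live version as deleted; nothing is physically removed yet.
    for row in table:
        if row.key == key and row.deleted_by is None:
            row.deleted_by = txid

def update(txid, key, value):
    # An update is a delete of the old version plus a create of the new one.
    delete(txid, key)
    insert(txid, key, value)

def garbage_collect(oldest_active_txid):
    # Drop versions whose deletion is older than every still-active transaction.
    table[:] = [r for r in table
                if r.deleted_by is None or r.deleted_by >= oldest_active_txid]

insert(txid=3, key="balance", value=500)
update(txid=13, key="balance", value=400)
versions_before_gc = len(table)  # 2: the old and the new version coexist

garbage_collect(oldest_active_txid=20)
versions_after_gc = len(table)   # 1: only the current version remains
```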
Visibility for seeing a consistent snapshot
  • Consistent snapshots work by following these rules:
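    • At the start of a transaction, a list of all transactions in progress is made, and any writes by those transactions are ignored, even if they later commit
    • Any writes made by transactions that were aborted are ignored
    • Any writes made by transactions with a later transaction id (i.e. those that started after the current transaction) are ignored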
    • All other writes are available to read
  • Another way of thinking about it – an object is visible if
    • The transaction that created the object had already committed BEFORE the reader transaction started
    • The object is not marked for deletion OR, if it is marked for deletion, the transaction that requested the deletion had not yet committed when the reader transaction started
  • Because the database is never truly updating/deleting values in place, a number of running transactions can continue to function from snapshots of those objects with very small overhead
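The visibility rules above could be written as a small Python function. This is a sketch under assumptions: the Version and Snapshot classes and their field names are invented for the example, and transaction ids are assumed to increase over time.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Version:
    value: int
    created_by: int
    deleted_by: Optional[int] = None

@dataclass(frozen=True)
class Snapshot:
    txid: int               # id of the reading transaction
    in_progress: frozenset  # transactions still running when the reader started
    aborted: frozenset      # transactions known to have aborted

def write_counts(writer, snap):
    """True if writes by transaction `writer` are taken into account by the snapshot."""
    if writer in snap.in_progress:
        return False        # rule 1: in progress at the start, ignored
    if writer in snap.aborted:
        return False        # rule 2: aborted, ignored
    if writer > snap.txid:
        return False        # rule 3: later transaction id, ignored
    return True             # rule 4: everything else is readable

def is_visible(version, snap):
    created = write_counts(version.created_by, snap)
    deleted = version.deleted_by is not None and write_counts(version.deleted_by, snap)
    return created and not deleted

snap = Snapshot(txid=12, in_progress=frozenset({11}), aborted=frozenset({9}))
old = Version(500, created_by=3, deleted_by=13)  # deleted by a later transaction
new = Version(400, created_by=13)                # created by a later transaction
dirty = Version(7, created_by=11)                # created by an in-progress transaction
rolled_back = Version(8, created_by=9)           # created by an aborted transaction

print([is_visible(v, snap) for v in (old, new, dirty, rolled_back)])
# [True, False, False, False]
```

Transaction 12 keeps seeing the old balance of 500 even though transaction 13 has replaced it, which is exactly the repeatable behavior a snapshot is meant to give.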
Snapshot isolation and indexes
  • Considering what we mentioned about the database storing multiple snapshots of state, how does this work with indexes?
    • One way would be to have the index point to all versions of an object and have index queries filter out the versions that aren’t visible to the current transaction; when garbage collection removes old versions, the matching index entries are removed as well
  • Turns out, there are a lot of implementation details and performance thoughts to take into consideration depending on the database implementation
    • They gave an example of how things are done in PostgreSQL: if multiple versions of the same object fit on the same page, the index doesn’t need to be updated
    • Another approach, used by CouchDB and others, is an append-only / copy-on-write method that does not overwrite the existing page in the B-tree but rather creates a copy of the modified page. Then, a copy of each parent is made all the way up to the root page to point to the new pages. Any pages not impacted by the write operation don’t need to be touched
      • If that sounded like it was creating a new tree root for every write in the append-only B-tree, you’d be correct. By taking this approach, every root is a consistent snapshot of the database at that point in time
        • The benefit of this approach is you don’t have to filter anything because every root node has only the transactions that belong in that snapshot
        • You do need a background process to garbage collect and compact
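To make the copy-on-write idea concrete, here is a minimal Python sketch. It swaps in an immutable binary search tree for a real B-tree with pages, so it only illustrates the path-copying technique, not how CouchDB actually lays out data; the Node class and the put/get helpers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def put(node, key, value):
    """Return a new root, copying only the nodes on the path from the root to `key`."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return Node(node.key, node.value, put(node.left, key, value), node.right)
    if key > node.key:
        return Node(node.key, node.value, node.left, put(node.right, key, value))
    return Node(key, value, node.left, node.right)

def get(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None

root_v1 = None
for k, v in [(50, "a"), (30, "b"), (70, "c")]:
    root_v1 = put(root_v1, k, v)

root_v2 = put(root_v1, 70, "c2")       # a write produces a new root

print(get(root_v1, 70))                # c   (the old snapshot is unchanged)
print(get(root_v2, 70))                # c2  (the new snapshot sees the write)
print(root_v2.left is root_v1.left)    # True (the untouched subtree is shared)
```

A reader holding root_v1 needs no filtering at all; once no reader references an old root, its unshared nodes are what the background garbage collection would reclaim.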
Repeatable read and naming confusion
  • Unfortunately, snapshot isolation is known by many names
    • Oracle calls it serializable
    • PostgreSQL and MySQL call it repeatable read
  • Why is the naming inconsistent? Because snapshot isolation isn’t part of the SQL standard: the standard is based on System R’s 1975 definition of isolation levels, and snapshot isolation hadn’t been invented yet!
    • They HAD defined repeatable read which is very similar to snapshot isolation
  • Unfortunately, even though several relational databases implement “repeatable read”, the guarantees they actually provide differ significantly
  • It was called out that there is a formal definition of a repeatable read but most implementations don’t meet the definition
  • “Nobody really knows what repeatable read means”
Resources We Like
  • Orlando Code Camp 2023 (orlandocodecamp.com)
  • The 12 Factor App and Google Cloud (cloud.google.com)
  • Martin Kleppmann’s website (martin.kleppmann.com)
  • David Foster Wallace – This is Water (youtube)
Tip of the Week
  • “Infinite Jest” is an interesting book, but it’s not a good audio experience. Get the physical book this time. (amazon.com)
  • Tamara Makes Games is a game dev on YouTube that has a lot of videos oriented around isometric, city-builder, and strategy games similar to Factorio. It’s a cool niche, and it’s a nice balance of code and visuals that are a delight to watch. (youtube)
  • There’s a lot you can do with iTerm2’s status bar, it’s highly configurable making it easy to show system resource monitors, shell information, and other miscellaneous items. (iterm2.com)
  • Google has a tool named “container-diff” for analyzing and comparing container images. It can examine images along several different criteria, which is great for tracking down issues, like knowing why Docker isn’t caching a layer. (github.com)
  • Xeol is a great utility for checking for end-of-life packages that you should get rid of. Thanks, gaprogman! (github)
  • Using minikube? You can manage the space for the vm by using minikube ssh to shell into the machine and then pruning with docker builder prune. Alternatively, you can use eval $(minikube docker-env) to point your local docker CLI at the Docker daemon inside minikube, so you can just run docker builder prune (and any other docker commands) without the shell. (minikube.sigs.k8s.io)