PostgreSQL

/pg138

A place to discuss the PostgreSQL RDBMS

@ryansmith·17:48 12/02/2025

wow!

I didn't realize AWS has a mechanism for taking consistent snapshots across EBS volumes.

https://aws.amazon.com/blogs/storage/taking-crash-consistent-snapshots-across-multiple-amazon-ebs-volumes-on-an-amazon-ec2-instance/

Taking crash-consistent snapshots across multiple Amazon EBS volumes on an Amazon EC2 instance | Amazon Web Services

Amazon Elastic Block Store (Amazon EBS) enables you to back up volumes at any time using EBS snapshots. Snapshots retain the data from all completed I/O operations, allowing you to restore the volume to its exact state at the moment before backup (referred to as crash-consistency). Many of our customers use snapshots in their backup […]

@ryansmith·20:52 23/01/2025

Fascinating to think about optimizing loops for CPUs. I wonder if PG's numeric additions could benefit from a more branchless approach

https://github.com/postgres/postgres/blob/4f15759bdcddd23e874526a6b2c0ff86e0beb042/src/interfaces/ecpg/pgtypeslib/numeric.c#L637-L754

https://15721.courses.cs.cmu.edu/spring2023/slides/06-execution.pdf

https://imagedelivery.net/BXluQx4ige9GuW0Ia56BHw/2c7e2ecf-f60e-42cd-7e4c-c0a3abe93500/original

postgres/src/interfaces/ecpg/pgtypeslib/numeric.c at 4f15759bdcddd23e874526a6b2c0ff86e0beb042 · postgres/postgres

Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Subm...

@ryansmith·06:31 02/01/2025

UPDATE-heavy workloads on PG are known to scatter rows across pages, which can increase IO when a single query scans many rows, unless you are getting 100% HOT updates (most workloads aren't).

To solve this in the past, I've used CLUSTER -- which isn't great because it takes an ACCESS EXCLUSIVE lock on the table and can run for a long time. It's almost unusable.

pg_repack solves this by doing online re-clustering and storage reclamation.

https://reorg.github.io/pg_repack/

pg_repack 1.5.2 -- Reorganize tables in PostgreSQL databases with minimal locks

reorg.github.io

@ryansmith·06:29 02/01/2025

https://www.cs.cmu.edu/~pavlo/blog/2023/04/the-part-of-postgresql-we-hate-the-most.html

Spoiler: MVCC

This also helps bring awareness to the impact of UPDATE-heavy workloads on PG.

The Part of PostgreSQL We Hate the Most

As much as Andy loves PostgreSQL, there is one part that is terrible and causes many headaches for people. Learn what it is and why it sucks.

@ryansmith·20:29 29/12/2024

A rebuttal to Uber’s PG/MySQL paper.

Keep in mind, PG has changed since this discussion. A lot of the replication issues have long since been put to rest.

The storage subsystems are the interesting point of difference between these systems. MySQL’s indexes point to primary keys instead of a CTID. An updated MySQL row keeps the same storage location, but in PG the new row version lands at a new CTID/page, unless it was a Heap Only Tuple (HOT) update. The flip side is that a PG index entry points straight at the tuple, which can reduce IO for queries.

As with anything, it’s always a tradeoff. However, database systems in particular are especially sensitive to workload details. The tradeoffs really matter!

https://thebuild.com/presentations/uber-perconalive-2017.pdf
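
To make the index difference above concrete, here is a toy Python model, not real storage code. `HeapTable` and `ClusteredTable` are made-up names: the first mimics PG, where every index entry points at a heap position (ctid); the second mimics InnoDB, where secondary indexes store the primary key. It assumes the PG update is not HOT (say, no free space on the page) and ignores MVCC details, vacuum, and page layout entirely.

```python
class HeapTable:
    """PG-style: every index entry points at a heap position (ctid)."""

    def __init__(self, indexed_cols):
        self.heap = []                                # list position plays the role of a ctid
        self.dead = set()                             # ctids of superseded row versions
        self.indexes = {c: {} for c in indexed_cols}  # col -> {value: [ctid, ...]}
        self.index_writes = 0
        self.btree_probes = 0

    def insert(self, row):
        ctid = len(self.heap)
        self.heap.append(dict(row))
        for col, index in self.indexes.items():
            index.setdefault(row[col], []).append(ctid)
            self.index_writes += 1
        return ctid

    def update(self, ctid, changes):
        """Non-HOT update: new row version, new ctid, new entry in every index."""
        self.dead.add(ctid)
        return self.insert({**self.heap[ctid], **changes})

    def lookup(self, col, value):
        self.btree_probes += 1                        # one index probe, then direct heap fetches
        ctids = self.indexes[col].get(value, [])
        return [self.heap[c] for c in ctids if c not in self.dead]


class ClusteredTable:
    """InnoDB-style: rows live in the primary-key index; secondary indexes store PKs."""

    def __init__(self, pk, secondary_cols):
        self.pk = pk
        self.rows = {}                                # pk value -> row
        self.secondary = {c: {} for c in secondary_cols}
        self.index_writes = 0                         # secondary-index writes only
        self.btree_probes = 0

    def insert(self, row):
        self.rows[row[self.pk]] = dict(row)
        for col, index in self.secondary.items():
            index.setdefault(row[col], set()).add(row[self.pk])
            self.index_writes += 1

    def update(self, pk_value, changes):
        """Row stays under the same PK; only indexes on changed columns are touched."""
        row = self.rows[pk_value]
        for col, new_value in changes.items():
            if col in self.secondary and row[col] != new_value:
                self.secondary[col][row[col]].discard(pk_value)
                self.secondary[col].setdefault(new_value, set()).add(pk_value)
                self.index_writes += 1
            row[col] = new_value

    def lookup(self, col, value):
        pks = self.secondary[col].get(value, set())
        self.btree_probes += 1 + len(pks)             # secondary probe, then one PK probe per match
        return [self.rows[p] for p in sorted(pks)]


row = {"id": 1, "email": "a@example.com", "city": "SF", "visits": 0}

heap = HeapTable(["id", "email", "city"])
ctid = heap.insert(row)
heap.update(ctid, {"visits": 1})        # assumed non-HOT for this sketch

clustered = ClusteredTable("id", ["email", "city"])
clustered.insert(row)
clustered.update(1, {"visits": 1})

heap.lookup("email", "a@example.com")
clustered.lookup("email", "a@example.com")
print(heap.index_writes, clustered.index_writes)   # 6 2
print(heap.btree_probes, clustered.btree_probes)   # 1 2
```

In this toy run, updating the unindexed `visits` column costs the heap model three extra index writes and the clustered model none, while a lookup by `email` costs one probe in the heap model and two in the clustered one. Which side wins depends on the workload.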

@sds·07:56 29/12/2024

Using stored procedures for complex query patterns makes sense, but I don't have an intuition for the kind of query complexity necessary before seeing tangible benefits switching from prepared statements.

Glad someone is experimenting! https://github.com/pg-nano/pg-nano/

GitHub - pg-nano/pg-nano: Postgres native driver for TypeScript: automatic type definitions for Postgres functions, instant schema updates, and Vite-inspired plugins

Postgres native driver for TypeScript: automatic type definitions for Postgres functions, instant schema updates, and Vite-inspired plugins - pg-nano/pg-nano

@ryansmith·23:00 03/12/2024

initial thoughts on AWS DSQL

1. PG compatible (pg is dominant at this point)
2. Most of the article talked about HA
3. AWS has a proprietary, internal distributed transaction log service that is supposedly the most important piece of AWS and everything in AWS depends on it. Hope that team makes 10M annual TC each.
4. It depresses me to think that the future of compute and storage is: run it on aws

It's really neat to think about having a network storage layer that can support many database frontends. This seems like the future. Neon et al. are moving in this direction. It's also something that Stonebraker has been talking about for a while now.

https://aws.amazon.com/blogs/database/introducing-amazon-aurora-dsql/

Introducing Amazon Aurora DSQL | Amazon Web Services

Today, we introduce Amazon Aurora DSQL, the fastest serverless distributed SQL database for always available applications. It offers virtually unlimited scale, highest availability, and zero infrastructure management. It can scale to meet any workload demand without database sharding or instance upgrades. In this post, we discuss the benefits of Aurora DSQL and how to get started.

@ryansmith·01:00 25/11/2024

https://avi.im/blag/2024/zero-disk-architecture/

This is a really neat idea. Low latency, CAS, and append are killer S3 features.

But kind of dystopian to think that all our data is owned and operated by a single US company.

WDYT?

Zero Disk Architecture - blag

State is pain. The next generation of infrastructure tools will be built on diskless paradigm. In this short post I will explain what is Diskless / Zero Disk Architecture

@ryansmith·03:31 20/11/2024

It's quite rich that the "Tablespaces" section follows the "Destroying a Database" section, since misconfiguring tablespaces is an excellent way to absolutely hose your database.

https://imagedelivery.net/BXluQx4ige9GuW0Ia56BHw/256edf86-79ed-42ef-18dc-95a83e9ab000/original

@ryansmith·07:10 19/11/2024

pg_column_size is nice. I knew a bigint was 8 bytes, but I was curious to see the storage size for comparable numeric data.

https://imagedelivery.net/BXluQx4ige9GuW0Ia56BHw/a5f983a7-32b2-4a09-6b42-c5c2a83daf00/original

@ryansmith·15:12 14/11/2024

What every programmer should know about solid-state drives

https://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/

Coding for SSDs – Part 6: A Summary – What every programmer should know about solid-state drives

This is Part 6 over 6 of "Coding for SSDs". For other parts and sections, you can refer to the Table to Contents. This is a series of articles that I wrote to share what I learned while documenting myself on SSDs, and on how to make code perform well on SSDs. In this part, I am summarizing the co

codecapsule.com

@sds·22:35 02/11/2024

I'm behind the times—apparently folks have been running Postgres on unikernels for quite some time†.

https://www.prisma.io/blog/announcing-prisma-postgres-early-access
https://nanovms.com/dev/tutorials/running-postgres-as-a-unikernel †

Prisma Postgres®: Building a Modern PostgreSQL Service Using Unikernels & MicroVMs

At Prisma, we believe that deploying a database should be as simple as adding a new page in Notion. Today, we are excited to share the first milestone towards this vision: Prisma Postgres® gives developers an always-on database with pay-as-you-go pricing, thanks to our unique architecture design.

Running Postgres as a Unikernel

This isn't the first time we took a look at running postgres as a unikernel. A few years ago one of our engineers took a look at how difficult it might be to port it.

@ryansmith·18:54 01/11/2024

Damn.

Screenshot from this talk on PG/ZFS

https://people.freebsd.org/%7Eseanc/postgresql/scale15x-2017-postgresql_zfs_best_practices.pdf

https://imagedelivery.net/BXluQx4ige9GuW0Ia56BHw/8f666a23-a351-4887-ae0a-ccc8b96e1e00/original

@ryansmith·21:35 31/10/2024

Config for running PG on ZFS

https://vadosware.io/post/everything-ive-seen-on-optimizing-postgres-on-zfs-on-linux/

Everything I've seen on optimizing Postgres on ZFS

Looking to run Postgres on ZFS? I've gathered some of the information and sage advice out there to give you a head start on figuring out how to do it safely and efficiently.

@ryansmith·19:18 30/10/2024

Fun PG-related meetup for the SF folks

Harry

@htormey·17:05 30/10/2024

Hosting a panel of expert speakers from engineering at Roblox, YugabyteDB and Coinbase tonight in SF. The subject is SQL at scale, if you are interested in joining signup below. Should be an interesting discussion. https://lu.ma/rcr09cp4

@ryansmith·03:59 24/10/2024

I love these 5 minutes of Postgres videos. Here's a good one on upcoming IO changes:

https://www.youtube.com/watch?v=QAYzWAlxCYc

@ryansmith·01:07 20/10/2024

Did you know that Postgres was originally written in LISP? (and C)

"We expected that it would be especially easy to write the optimizer and inference engine in LISP, since both are mostly tree processing modules"

https://dsf.berkeley.edu/papers/ERL-M90-34.pdf

@ryansmith·19:31 18/10/2024

PostgreSQL's C driver is called: libpq

What's PQ?

Q was short for QUEL. QUEL is a dead query language once used by Ingres. Postgres was post-Ingres.

At one time PG supported a query language called PostQUEL.

@ryansmith·18:44 15/10/2024

https://www.depesz.com/2022/07/05/understanding-pg_stat_activity/

1) depesz is a PG legend
2) This is a great explainer on pg_stat_activity

Understanding pg_stat_activity – select * from depesz;

@ryansmith·05:20 12/10/2024

just added a new chain to @indexsupply and used PG 17!

eager to see how the new vectored IO impacts table scans

@sds·05:55 17/09/2024

Very slick: https://postgres.new

Now make a system that takes your schema and inspects pg_stat_statements on your production system and makes actionable recommendations of migrations to improve performance.

@ryansmith·05:25 13/09/2024

Fantastic talk on PG's shared buffers -- and memory in general

https://www.youtube.com/watch?v=u-r8VuzXeBE

@sds·23:18 08/09/2024

One of those ideas that sounds good at first but doesn't hold up on deeper reflection (project abandoned). Might still be useful for querying information about your infrastructure, but not for creating it.
https://iasql.com

Infra as SQL | IaSQL

Cloud infrastructure as data in PostgreSQL

@sds·14:19 07/09/2024

Making Postgres fun again:

https://github.com/nuno-faria/tetris-sql

GitHub - nuno-faria/tetris-sql: Using SQL's Turing Completeness to Build Tetris

Using SQL's Turing Completeness to Build Tetris. Contribute to nuno-faria/tetris-sql development by creating an account on GitHub.

@samuellhuber.eth·08:47 20/08/2024

Supabase CEO on user complaints that the self-hosted version is limited and sucks compared to the cloud version

https://www.reddit.com/r/selfhosted/s/v02qg93ao9

@manan·17:28 07/08/2024

How does Postgres manage to do this? 🧐

SELECT S
WHERE W
ORDER BY O
> takes 0.193s

-----
SELECT S
WHERE W
ORDER BY O
LIMIT 20
> takes 74.129s

@ryansmith·03:40 01/08/2024

I enjoy using PGTune when setting up a fresh pg server. Maybe you will too:

https://pgtune.leopard.in.ua

@ryansmith·21:41 31/07/2024

An enjoyable read on how PG encodes NUMERIC data:

https://github.com/postgres/postgres/blob/05a5a1775c89f6beb326725282e7eea1373cbec8/src/backend/utils/adt/numeric.c#L253-L303
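
A rough Python sketch of the layout that comment describes, as I understand it: the value is split into base-10000 digits, with a `weight` giving the power of 10000 for the first digit and a `dscale` giving the number of decimal digits after the point. `encode_numeric` and `decode_numeric` are hypothetical helper names, and this ignores NaN, infinity, and how the header is packed on disk.

```python
from decimal import Decimal

NBASE = 10000   # each stored "digit" is a base-10000 value
DEC_DIGITS = 4  # decimal digits packed into one base-10000 digit


def _padded_len(n):
    """Round n up to a multiple of DEC_DIGITS."""
    return -(-n // DEC_DIGITS) * DEC_DIGITS


def encode_numeric(text):
    """Split a finite decimal string into (sign, weight, digits, dscale)."""
    value = Decimal(text)
    sign = "-" if value < 0 else "+"
    int_part, _, frac_part = format(abs(value), "f").partition(".")
    dscale = len(frac_part)                      # decimal digits after the point
    int_part = int_part.lstrip("0")

    # Pad both sides so they split evenly into 4-decimal-digit groups.
    int_part = int_part.rjust(_padded_len(len(int_part)), "0")
    frac_part = frac_part.ljust(_padded_len(len(frac_part)), "0")
    packed = int_part + frac_part
    digits = [int(packed[i:i + DEC_DIGITS]) for i in range(0, len(packed), DEC_DIGITS)]

    # weight: the first digit is multiplied by NBASE ** weight.
    weight = len(int_part) // DEC_DIGITS - 1
    while digits and digits[0] == 0:             # strip leading zero digits
        digits.pop(0)
        weight -= 1
    while digits and digits[-1] == 0:            # strip trailing zero digits
        digits.pop()
    if not digits:                               # zero is stored with no digits
        weight = 0
    return sign, weight, digits, dscale


def decode_numeric(sign, weight, digits, dscale):
    """Rebuild the value; dscale only affects display, so it is unused here."""
    total = sum(Decimal(d) * Decimal(NBASE) ** (weight - i) for i, d in enumerate(digits))
    return -total if sign == "-" else total


print(encode_numeric("12345.678"))   # ('+', 1, [1, 2345, 6780], 3)
print(encode_numeric("0.005"))       # ('+', -1, [50], 3)
```

So 12345.678 becomes three base-10000 digits: 1 × 10000¹ + 2345 × 10000⁰ + 6780 × 10000⁻¹.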

@sds·23:42 25/07/2024

FSMs as a core data primitive are going to get more popular. Combine them with a runtime library for your preferred language and you have a way to formally verify statements about business logic in your system.

https://raphael.medaer.me/2019/06/12/pgfsm.html

Versioned FSM (Finite-State Machine) with Postgresql

Inspired by Felix Geisendorfer blog post I implemented a database FSM (Finite-State Machine) with Postgresql. I brought some improvements to Felix’s implementation but before reading the following I recommend you to read carefully the original post.

raphael.medaer.me

@ryansmith·00:14 25/07/2024

Anyone got a feature / fix that they are eager to see land in 17?

Here are a few I'm looking forward to:
- COPY option ON_ERROR ignore to discard error rows
- to_bin() and to_oct()

@sds·20:27 11/07/2024

Impressive tool from Xata providing a solution for zero-downtime, REVERSIBLE (!!!) migrations—the holy grail.

Excited for this to eventually reach v1. The current feature set is very impressive, but it's not yet ready to handle all possible migrations. Something to keep an eye on.
https://xata.io/blog/pgroll-schema-migrations-postgres

Introducing pgroll: zero-downtime, reversible, schema migrations for Postgres

We are excited to ship the first version of pgroll, a command line tool that offers safe and reversible schema migrations for PostgreSQL

@sds·21:37 03/07/2024

Great post on PG lock behavior during schema changes. It's surprising (but obvious in hindsight) how a statement whose lock doesn't conflict with anything currently held can still be blocked behind a queued schema change that does need a conflicting lock, due to the FIFO nature of the lock queue.

https://xata.io/blog/migrations-and-exclusive-locks

Schema changes and the Postgres lock queue

Learn how schema changes can cause downtime by locking out reads and writes and how migration tools can avoid it by using lock timeouts, along with backoff and retry strategies.
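
A small Python simulation of that queueing effect, as a sketch rather than PG's real lock manager. It models only two of the table lock modes, and `TableLock` plus the session names are made up for illustration. The key assumption is that a new request has to wait if it conflicts with a lock that is held or with a request already queued ahead of it.

```python
from collections import deque

# Simplified conflict table with only two of PG's table lock modes.
CONFLICTS = {
    "ACCESS SHARE": {"ACCESS EXCLUSIVE"},                      # e.g. a plain SELECT
    "ACCESS EXCLUSIVE": {"ACCESS SHARE", "ACCESS EXCLUSIVE"},  # e.g. many ALTER TABLE forms
}


class TableLock:
    def __init__(self):
        self.held = []          # granted (session, mode) pairs
        self.waiting = deque()  # FIFO queue of (session, mode) requests

    def request(self, session, mode):
        """Grant immediately, or queue. Returns True if granted."""
        # A request must respect locks already held AND requests queued ahead of it.
        ahead = [m for _, m in self.held] + [m for _, m in self.waiting]
        if any(m in CONFLICTS[mode] for m in ahead):
            self.waiting.append((session, mode))
            return False
        self.held.append((session, mode))
        return True

    def release(self, session):
        """Release a session's locks, then grant queued requests in arrival order."""
        self.held = [(s, m) for s, m in self.held if s != session]
        while self.waiting:
            _, mode = self.waiting[0]
            if any(m in CONFLICTS[mode] for _, m in self.held):
                break
            self.held.append(self.waiting.popleft())


lock = TableLock()
print(lock.request("long_select", "ACCESS SHARE"))      # True: nothing else holds the table
print(lock.request("alter_table", "ACCESS EXCLUSIVE"))  # False: waits for long_select
print(lock.request("quick_select", "ACCESS SHARE"))     # False: queued behind alter_table
lock.release("long_select")                             # alter_table is granted next
lock.release("alter_table")                             # now quick_select gets its lock
print([s for s, _ in lock.held])                        # ['quick_select']
```

`quick_select` would have been compatible with `long_select`, but it still waits because the schema change got in line first.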

@ryansmith·17:13 03/07/2024

Should be some good perf. improvements with v17

https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-OPTIMIZER

in particular, some CTEs should get faster

E.3. Release 17

E.3. Release 17 # E.3.1. Overview E.3.2. Migration to Version 17 E.3.3. Changes E.3.4. Acknowledgments Release date: 2024-09-26 E.3.1. Overview # PostgreSQL 17 …

www.postgresql.org

@ryansmith·01:23 03/07/2024

A CTE (WITH) can be MATERIALIZED or NOT MATERIALIZED.

When MATERIALIZED, the query is computed only once for the outer query. Good for reducing work. Not good when you reference the CTE multiple times with different predicates.

NOT MATERIALIZED forces PG to "inline" the CTE which allows predicate push down but possibly duplicates work.

The default is MATERIALIZED when a CTE is referenced more than once.

https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-CTE-MATERIALIZATION

7.8. WITH Queries (Common Table Expressions)

7.8. WITH Queries (Common Table Expressions) # 7.8.1. SELECT in WITH 7.8.2. Recursive Queries 7.8.3. Common Table Expression Materialization 7.8.4. Data-Modifying …

www.postgresql.org
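
A toy cost model of the tradeoff above, in Python rather than SQL and nothing like PG's actual planner. `Table`, `materialized`, and `not_materialized` are made-up names; the only thing measured is how many base-table rows each strategy reads when the CTE body is a plain scan and each outer reference filters on `user_id` (or has no predicate at all, written as `None`).

```python
class Table:
    """Toy table with a hash index on user_id that counts the rows it reads."""

    def __init__(self, rows):
        self.rows = rows
        self.rows_read = 0
        self.by_user = {}
        for row in rows:
            self.by_user.setdefault(row["user_id"], []).append(row)

    def seq_scan(self):
        self.rows_read += len(self.rows)
        return list(self.rows)

    def index_scan(self, user_id):
        matches = self.by_user.get(user_id, [])
        self.rows_read += len(matches)
        return list(matches)


def materialized(table, predicates):
    """CTE body runs once; every reference filters the stored result."""
    cte = table.seq_scan()
    return [[r for r in cte if p is None or r["user_id"] == p] for p in predicates]


def not_materialized(table, predicates):
    """CTE is inlined into each reference, so each one reads the table itself."""
    return [table.seq_scan() if p is None else table.index_scan(p) for p in predicates]


def rows_read(strategy, predicates):
    table = Table([{"user_id": i % 100, "n": i} for i in range(10_000)])
    strategy(table, predicates)
    return table.rows_read


# Two references with selective predicates: pushing the predicate down wins.
print(rows_read(materialized, [1, 2]), rows_read(not_materialized, [1, 2]))                # 10000 200
# Two references with no predicate: inlining repeats the full scan.
print(rows_read(materialized, [None, None]), rows_read(not_materialized, [None, None]))    # 10000 20000
```

Same results either way; only the amount of work changes, which is the whole point of choosing between the two.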

@ryansmith·19:24 01/07/2024

TIL pg source has a helpful README for the Executor

https://github.com/postgres/postgres/blob/master/src/backend/executor/README

@ryansmith·18:42 27/06/2024

Good article on correlated / uncorrelated subqueries:

https://www.cybertec-postgresql.com/en/subqueries-and-performance-in-postgresql/

Subqueries and performance in PostgreSQL

This article discusses how subqueries perform in PostgreSQL and how to rewrite queries to improve their performance.

www.cybertec-postgresql.com

@ryansmith·23:58 26/06/2024

Have you talked with your loved ones
about independent ordering options lately?

https://imagedelivery.net/BXluQx4ige9GuW0Ia56BHw/11f55890-e26d-44bb-fa15-018bd367a300/original

@ryansmith·18:43 26/06/2024

still one of the best pg blogs on the net

https://www.depesz.com

@ryansmith·21:49 25/06/2024

without looking it up, please guess the following limits:

1. relations per database
2. relation size

@ryansmith·19:31 25/06/2024

I really enjoy using _actual_ postgres servers in my tests. Packages like this one: https://github.com/theseus-rs/postgresql-embedded make it fast and easy!

GitHub - theseus-rs/postgresql-embedded: Embed PostgreSQL database

Embed PostgreSQL database. Contribute to theseus-rs/postgresql-embedded development by creating an account on GitHub.

@sds·20:57 19/06/2024

With an unlogged table, Postgres is fast enough to act as a cache. The downside is that unlogged tables aren't replicated, so this pattern doesn't scale.

https://www.cybertec-postgresql.com/en/postgresql-vs-redis-vs-memcached-performance/

PostgreSQL vs Redis vs Memcached performance

PostgreSQL vs Redis vs Memcached: How would it look on the performance side, if one just skips the cache & hit the database directly?

www.cybertec-postgresql.com

@ryansmith·16:22 12/06/2024

What's a good rule of thumb for shared_buffers size? 25% of RAM?

@sds·05:55 10/06/2024

Excellent overview of all the different kinds of locks in Postgres and the various SQL statements that acquire them:
https://medium.com/@hnasr/postgres-locks-a-deep-dive-9fc158a5641c

Postgres Locks — A Deep Dive

I used to think database locks are two types, shared and exclusive. Readers acquire many shared locks on a resource (row, object or table)…

@ryansmith·19:17 20/05/2024

PG incremental base backups seem really neat:

https://pganalyze.com/blog/5mins-postgres-17-incremental-backups

@ryansmith·17:07 15/05/2024

Draft PG 17 release notes

https://www.postgresql.org/docs/devel/release-17.html

E.3. Release 17

E.3. Release 17 # E.3.1. Overview E.3.2. Migration to Version 17 E.3.3. Changes E.3.4. Acknowledgments Release date: 2024-09-26 E.3.1. Overview # PostgreSQL 17 …

www.postgresql.org

@ryansmith·03:49 11/05/2024

auto_explain is invaluable

https://www.postgresql.org/docs/current/auto-explain.html

F.3. auto_explain — log execution plans of slow queries

F.3. auto_explain — log execution plans of slow queries # F.3.1. Configuration Parameters F.3.2. Example F.3.3. Author The auto_explain module provides …

www.postgresql.org

@ryansmith·18:43 10/05/2024

I sometimes find myself thinking: this would be a great case for PG ARRAYs

But it always comes back to bite me

@sds·07:01 26/04/2024

Postgres in the browser. This space is starting to get really interesting.

https://github.com/electric-sql/pglite

GitHub - electric-sql/pglite: Lightweight WASM Postgres with real-time, reactive bindings.

Lightweight WASM Postgres with real-time, reactive bindings. - electric-sql/pglite

@ryansmith·21:45 24/04/2024

most projects that "embed" pg binaries into their test suites end up using this one maven repo: https://mvnrepository.com/artifact/io.zonky.test.postgres

ZONKY

@ryansmith·03:01 21/04/2024

Postgres C hits different:

fctx->tupdesc = BlessTupleDesc(tupdesc);

https://github.com/postgres/postgres/blob/f4fdc24aa35c2268f519905a3a66658ebd55a466/src/backend/executor/execTuples.c#L2149-L2165