Good CTE, Bad CTE

(boringsql.com)

91 points | by radimm 1 day ago

9 comments

  • vlaaad 5 hours ago
    Use the term, never define the term, classic.

    CTE stands for Common Table Expressions in SQL. They are temporary result sets defined within a single query using the WITH clause, acting like named subqueries to improve readability and structure.

    • radimm 5 hours ago
      OP here, damn - that's a very good point. Can't believe I missed it.
      • xxs 8 minutes ago
        I read the article before the abbreviate definition inclusion as its very opening. I had never met the abbreviation before.

        It'd be quite surprising the WITH statement in top a query to be the first feature to learn/use past basic SQL. Is it personal experience in some industry?

      • swasheck 19 minutes ago
        i appreciate the way you took the feedback. i saw the domain name and immediately knew the content and context. the article did not disappoint. i come from a heavy mssql background with some postgres sprinkled in, but my current company is migrating our mssql fleet to pg and it’s nice to have a technical foundation and article to be able to truly understand and pass the differences and similarities then how the two platforms handle workload. Traditionally it’s just been multiple sides, mercilessly, criticizing, the other for their deficiencies, but each platform has its own strengths and its own gaps. I’m excited to be a part of this migration, and I appreciate technical articles such as this that help me articulate the broader challenges to both our executive level levels, and our developers.
      • iainmerrick 4 hours ago
        From the headline, I thought it might be about sports-related concussions!

        I was morbidly curious what a "good CTE" could possibly be...

        • QuantumNomad_ 3 hours ago
          As someone who is not much of a sports person, now I was wondering what CTE means in sports.

          Seems to be this:

          > Chronic traumatic encephalopathy (CTE) is a progressive neurodegenerative disease […]

          > Evidence indicates that repetitive concussive and subconcussive blows to the head cause CTE. In particular, it is associated with contact sports such as boxing, American football, Australian rules football, wrestling, mixed martial arts, ice hockey, rugby, and association football.

          https://en.wikipedia.org/wiki/Chronic_traumatic_encephalopat...

          • iainmerrick 2 hours ago
            Yeah - Muhammad Ali is the most famous victim (or at least likely victim, I don’t think he was officially diagnosed with CTE as it wasn’t well understood back then). In the UK, it’s gradually becoming recognised as a serious problem in rugby.

            I assumed the C stood for Concussion. Wrong but also partly right!

        • tialaramex 2 hours ago
          I was thinking "Compile Time Execution" like Rust's const, C++ consteval functions, Zig's comptime, that sort of thing. So the good/ bad made more sense but I was still on the wrong track, yeah a definition was appropriate.
    • tclancy 3 hours ago
      Agreed. I was relieved to see this wasn’t written by Cam Skatteboro.
      • lizknope 1 hour ago
        Yeah, I thought I was on a sports site. Cam thinks CTE isn't real and is "all in your head." Technically correct about the all in your head part.
  • oveja 52 minutes ago
    To the author of the article. This was a really nice and educating read. You made me finally understand recursive CTEs, the org chart was a really good example. Thank you.
  • mcdonje 1 hour ago
    There's some good stuff in here. I didn't know about the issues an aggregation in a CTE can cause and haven't used EXISTS much.

    Regarding recursive CTEs, you might be interested in how DuckDb evolved them with USING KEY: https://duckdb.org/2025/05/23/using-key

  • ctippett 5 minutes ago
    Not to sound like a broken record, but I need to echo what many other replies have already said: great article.

    Embarassingly, despite thinking of myself as pretty knowledgeable with SQL, I had no idea you could nest DML statements inside a CTE. I always assumed/used DML statements as the final statement after a CTE was defined. I'm not sure if or when I might use this in the future, but it's neat to learn something new (and to be humbled at the same time).

  • yen223 5 hours ago
    I've always thought of CTEs as a code organisation tool, not an optimisation tool. The fact the some rdbms treats them as an optimisation fence was a bug, not a feature.
    • solumunus 4 hours ago
      Improved readability is definitely the primary benefit.
  • bob1029 5 hours ago
    > Recursive CTEs use an iterative working-table mechanism. Despite the name, they aren't truly recursive. PostgreSQL doesn't "call itself" by creating a nested stack of unfinished queries.

    If you want something that is more like actual recursion (I.e., depth-first), Oracle has CONNECT BY which does not require the same kind of tracking. It also comes with extra features to help with cycle detection, stack depth reflection, etc.

    If your problem is aligned with the DFS model, the oracle technique can run circles around recursive CTEs. Anything with a deep hierarchy and early termination conditions is a compelling candidate.

    • hans_castorp 1 hour ago
      > If you want something that is more like actual recursion (I.e., depth-first), Oracle has CONNECT BY which does not require the same kind of tracking. It also comes with extra features to help with cycle detection, stack depth reflection, etc.

      All that is supported with CTEs as well. And both Postgres and Oracle support the SQL standard for these things.

      You can't choose between breadth first/depth first using CONNECT BY in Oracle. Oracle's manual even states that CTE are more powerful than CONNECT BY

  • dspillett 5 hours ago
    I wrangle databases by day, and do martial arts of an evening. Two arenas where CTEs can cause significant headaches!
  • uwemaurer 4 hours ago
    Great article, I always like to structure my queries with CTEs and I was (wrongly) assuming it all gets inlined at the end. Sometimes it also gets complicated since these intermediate results can't be easily seen in a SQL editor. I was working on a UI to parse CTE queries and then execute them step by step to show the results of all the CTEs for easier understanding of the query (as part of this project https://github.com/sqg-dev/sqg/)
    • siddboots 3 hours ago
      I think your assumption about inlining is essentially correct. As far as I know postgres was the last major rdbms to have an optimiser fence around CTEs.
      • nraynaud 2 hours ago
        I concur, “the Germans” have created an algorithm that completely “see through” subqueries/CTEs when planning a query. The way the query is written has no bearing on the execution.
  • qwertydog 5 hours ago
    Great post - thanks. I think the columns in the index you suggested in the pre-pg12 section are in the wrong order (that index would get used)
    • radimm 5 hours ago
      Thanks - I will recheck later today.