No Semicolons Needed

(terts.dev)

49 points | by karakoram 12 hours ago

13 comments

  • Animats 9 hours ago
    Classic mistakes in language design that have to be fixed later.

    - "We don't need any attributes", like "const" or "mut". This eventually gets retrofitted, as it was to C, but by then there is too much code without attributes in use. Defaulting to the less restrictive option gives trouble for decades.

    - "We don't need a Boolean type". Just use integers. This tends to give trouble if the language has either implicit conversion or type inference. Also, people write "|" instead of "||", and it almost works. C and Python both retrofitted "bool". When the retrofit comes, you find that programs have "True", "true", and "TRUE", all user-defined.

    Then there's the whole area around Null, Nil, nil, and Option. Does NULL == NULL? It doesn't in SQL.
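
    A minimal sketch of the SQL behaviour mentioned above, using Python's built-in sqlite3 module with an in-memory database. The helper name `sql_scalar` is made up for illustration. In SQL, `NULL = NULL` evaluates to NULL rather than true, while `IS` is the null-safe comparison:

```python
import sqlite3

def sql_scalar(expression):
    """Evaluate one SQL expression; a SQL NULL comes back as Python None."""
    with sqlite3.connect(":memory:") as conn:
        return conn.execute(f"SELECT {expression}").fetchone()[0]

null_equals_null = sql_scalar("NULL = NULL")   # NULL, surfaced as None
null_is_null = sql_scalar("NULL IS NULL")      # 1, i.e. true
```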

    • tadfisher 8 hours ago
      That's what's nice about coarse-grained feature options like Rust's editions or Haskell's "languages", you can opt in to better default behavior and retain compatibility with libraries coded to older standards.

      The "null vs null" problem is commonly described as a problem with the concept of "null" or optional values; I think of it as a problem with how the language represents "references", whether via pointers or some opaque higher-level concept. Hoare's billion-dollar mistake was disallowing references which are guaranteed to be non-null; i.e. ones that refer to a value which exists.

  • bmandale 9 hours ago
    > I would love to see a language try to implement a rule where only an indented line is considered part of the previous expression.

    After python, it seems like every language decided that making parsing depend on indents was a bad idea. A shame, because humans pretty much only go by indents. An example I've frequently run into is where I forget a closing curly brace. The error is reported at the end of the file, and gives me no advice on where to go looking for the typo. The location should be obvious, as it's at exactly the point where the indentation stops matching the braces. But the parser doesn't look at indents at all, so it can't tell me that.

    • notepad0x90 31 minutes ago
      we don't even use indents that way in natural language. We use things like bullet points, we need specific markers.

      space is for spacing, tabs are for tabulation, they are not in any human language I know of used as terminators of statements. You know what is the equivalent of an indent or a semicolon in english? a period. <-

      We have paragraph breaks just like this to delimit blocks of english statements.

      A semi-colon is used to indicate part of a sentence is done, yet there is still related continuation of the statement that is yet to be finished. Periods are confusing because they're used in decimal points and other language syntax, so a semi-colon seems like a good fit.

      If I had my pick, I would use a colon to indicate the end of a statement, and a double colon '::' to indicate the end of a block.

      func main(): int i = 0: do: i=i+1: print(i): while(i<10):: ::

      another downside of indentation is it's awkward to do one-liners with such languages, as with python.

      There is a lot of subjectivity with syntax, but python code for example with indents is not easy for humans to read, or for syntax validators to validate. it is very easy for example to indent a statement inside a for loop or an if/else block in python (or not indent it), and when pasting around code you accidentally indent one level off without meaning to. if you know to look for it, you'll catch it, but it's very easy to miss, since the mis-indented statement is valid and sensible on its own, nothing will flag it as unusual.

      In my opinion, while the spirit behind indentation is excellent, the right execution is in the style of 'go fmt' with Go, where indenting your code for you after it has been properly parsed by the compiler/interpreter is the norm.

      I would even say the first thing a compiler or interpreter does should be to auto-indent and line-wrap your code. that should be part of the language standard. If the compiler can't indent your code without messing up logic or without making it unreadable, then either your code or the language design has a flaw.

    • kayson 9 hours ago
      I was much more opposed to this early on than I am now. With modern IDEs and extensions handling tabs vs spaces, tab width, and formatting, python ends up being very easy to read and write. I use it daily, and while I hate it for other reasons, I can't remember the last time I had any issues with indentation.
      • notepad0x90 20 minutes ago
        I must disagree but only a tiny bit. modern IDEs try to indent and attempt to add indentation as you code which can cause problems sometimes.

        tabs vs spaces is very painful still when copying code that is in a different format. it's not just tabs and spaces, but the width of the tabs and the spaces. Even with VSCode extensions and sublime text extensions I've struggled a lot recently with this.

        I commented on a sibling thread just now, but it is still very easy in python to mess up one level of indentation. When I caught bugs of that sort, they were introduced either when copy pasting, when trying to make a linter happy and doing cosmetic cleanup, or when moving code around levels of indentation, like introducing a try/except. I had one recently where if I recall correctly I moved a continue statement under or out of a try/except when messing around with the try/except logic. it was perfectly valid, and didn't stand out much visually, pydantic and other checkers didn't catch it either. It could have happened with a '}' but it's easier to mess up a level of indentation than it is to put the '}' at the wrong level. a cosmetic fix results in a logic bug because of indentations in python. with curly braces, a misplacement of that sort can't happen because of indentation or cosmetic/readability fixes.

        what the curly approach asserts is a separation of readability and logic syntax.

        The interpreter/compiler should understand your code first and foremost, indenting code should be done, and should be enforced, but automatically by the compiler/interpreter. Python could have used curly braces and semi-colons, and force-indented your code every time it ran it to make it more readable for humans.
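
        To illustrate the kind of bug described above, here is a made-up pair of functions (not the code from the incident) that differ only in the indentation of one line. Both are valid Python and neither raises an error, but the second one adds only the last parsed value:

```python
def total_inside_loop(items):
    total = 0
    for item in items:
        try:
            value = int(item)
        except ValueError:
            continue
        total += value      # runs once per parsed item
    return total

def total_after_loop(items):
    total = 0
    for item in items:
        try:
            value = int(item)
        except ValueError:
            continue
    total += value          # one level less: runs once, after the loop
    return total
```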

    • stinkbeetle 9 hours ago
      > An example I've frequently run into is where I forget a closing curly brace. The error is reported at the end of the file, and gives me no advice on where to go looking for the typo. The location should be obvious, as it's at exactly the point where the indentation stops matching the braces. But the parser doesn't look at indents at all, so it can't tell me that.

      That's somewhat a quality of service issue though. Compilers should look at where the braces go out of kilter vs indentation and suggest the possible unmatched opening brace.
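
      A rough sketch of that heuristic, not how any particular compiler does it. The function name `suspect_unclosed_brace` is hypothetical, and the code assumes space indentation, opening braces at the end of the header line, and no braces inside strings or comments. It reports the block header where the indentation first stops agreeing with the braces:

```python
from typing import Optional

def suspect_unclosed_brace(source: str) -> Optional[int]:
    """Return the 1-based line of the block that looks unclosed, or None."""
    open_blocks = []  # (line number, indent) of each line that opened a block
    for line_no, raw in enumerate(source.splitlines(), start=1):
        stripped = raw.strip()
        if not stripped:
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        # A line indented no deeper than the innermost open header, and not
        # starting with its closing brace, suggests that block was meant to end.
        if open_blocks and not stripped.startswith("}"):
            header_line, header_indent = open_blocks[-1]
            if indent <= header_indent:
                return header_line
        for ch in stripped:
            if ch == "{":
                open_blocks.append((line_no, indent))
            elif ch == "}" and open_blocks:
                open_blocks.pop()
    # Anything still open at end of file is also suspicious.
    return open_blocks[-1][0] if open_blocks else None

broken = "fn main() {\n    if ready {\n        start();\n    finish();\n}\n"
balanced = "fn main() {\n    if ready {\n        start();\n    }\n    finish();\n}\n"
```

      For `broken`, the sketch points at line 2 (the `if` header) instead of the end of the file.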

    • vips7L 9 hours ago
      Scala 3 decided to go with indents.
      • elfly 8 hours ago
        You are not telling the whole story.

        You can mix indentation and braces to delimit blocks.

        It's insane.

        • macintux 8 hours ago
          As a casual observer who has written perhaps a dozen lines of Scala in his life, I feel like Scala approaches any “pick one” decision with “why not both?”.

          Functional or OO? Yes.

        • Blikkentrekker 8 hours ago
          I like how Haskell does it. One can do both but not mix, as in either indent or use `{ ... }`.
    • Blikkentrekker 8 hours ago
      The issue is that you find you very often want to break those rules. Python basically has `elif` because `else if` would make each branch nest one level deeper which isn't what one wants, except Python uses exceptions for flow control so you find yourself having to use `except ... try` as an analogue to `else if` but no `excetry` exists to do the same and stop the indentation.

      There are many other examples. It exists to give people freedom. Also, while humans only go by indentation, explicit delimiters are very handy for text editing and manipulation: without requiring special per-language support, an editor can move the cursor, say, to the nearest closing brace and so forth.
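
      A small illustration of the nesting problem, with invented function names. The first version chains fallbacks the only way `try`/`except` allows, one level deeper each time; the second shows one common workaround, which is a loop over the candidates and not a language feature:

```python
def parse_nested(text):
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            try:
                return complex(text)
            except ValueError:
                return text   # give up and keep the string

def parse_flat(text):
    # Same fallbacks, but the indentation no longer grows with each one.
    for convert in (int, float, complex):
        try:
            return convert(text)
        except ValueError:
            continue
    return text
```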

      • Joker_vD 4 hours ago
        > Python basically has `elif` because `else if` would make each branch nest one level deeper which isn't what one wants

        There are, of course, other ways to handle this. For instance, "else if <cond>:" could've been made legal à la Golang:

            IfStmt = "if" [ SimpleStmt ";" ] Expression Block [ "else" ( IfStmt | Block ) ] .
      • XorNot 7 hours ago
        I'm very okay with elif though because it makes it clear that the conditional is part of the chained block and not a brand new one.
  • Joker_vD 5 hours ago
    Perhaps having unary minus (and especially unary plus) is just a bad idea in general; just mandate "0 - expr". To make negative constants work properly, you still have to either special case literals "256", "65536", etc. and ideally check whether they got negated or not, or introduce a special syntax just for them, like "~1" of ML for negative one, or "-1" (which you are not allowed to break with whitespace) of some other language I've forgotten the name of.

    While we're at it, probably the unary bitwise complement could go as well? Obviously, "^(0-1)" would suck to write but since 99% of the time bitwise "not" used in expressions/statements like "expr &~ mask..." or "var &= ~mask", I feel like simply having binary "and-not" operator that looks like "&~" or "&^" (Golang) is just better.

    Also, a small prize (a "thank you!" from a stranger on the Internet i.e. me) to someone who can propose a good syntax for compound assignment with reversed subtraction:

        x ^= 0-1    # in-place bitwise complement
        x ^= true   # in-place logical negation
        x ?= 0      # in-place integer negation???
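
    The identities this relies on can be checked in Python, whose integers behave like unbounded two's complement values. The function names below are made up for the sketch; the point is only that each unary operator has a binary spelling:

```python
def negate(x):
    """Unary minus written as a binary subtraction."""
    return 0 - x

def complement(x):
    """Bitwise NOT written as XOR with all ones, i.e. with 0 - 1."""
    return x ^ (0 - 1)

def and_not(x, mask):
    """Binary 'and-not': clear every bit of mask in x."""
    return x & complement(mask)

flag = True
flag ^= True   # in-place logical negation of a bool
```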
  • librasteve 9 hours ago
    This article makes a strong case for every language to use ‘;’ as a statement separator.
    • rao-v 7 hours ago
      Exactly. I genuinely do not understand how any significant user of python can handle white space delimitation. You cannot copy or paste anything without busywork, your IDE or formatter dare not help you till you resolve the ambiguity.

      One day https://github.com/mathialo/bython one day!

      • silon42 1 minute ago
        looks cool..

        Alternatively, I've several times used 'pass' as block terminator for my personal code.
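
        A sketch of what that convention might look like (the function is invented for illustration). Since `pass` is a no-op, the interpreter does not check that each one actually sits at the end of a block; it is purely a visual marker:

```python
def count_positive(values):
    count = 0
    for value in values:
        if value > 0:
            count += 1
        pass  # end if
    pass  # end for
    return count
```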

    • jasperry 8 hours ago
      Indeed it does, by showing how many different and confusing types of parsing rules are used in languages that don't have statement terminators. Needing a parser clever enough to interpret essentially a 2-d code format seems like unnecessary complexity to me, because at its core a programming language is supposed to be a formal, unambiguous notation. Not that I'm against readability; I think having an unambiguous terminating mark makes it easier for humans to read as well. If you want to make a compiler smart enough to help by reading the indentation, that's fine, but don't require it as part of the notation.

      Non-statement-based (functional) languages can be excepted, but I still think those are harder to read than statement-based languages.

      • hajile 7 hours ago
        Lisps aren’t necessarily functional, but don’t need semicolons either.
        • II2II 7 hours ago
          The syntax of languages like Lisp and Forth is so fundamentally different that they don't need an explicit statement separator. You don't have to think about many other things either, or I should say you don't have to think about them in the same way. Consider how much simpler the order of operations is in those languages.
        • wvenable 1 hour ago
          Lisp has explicit "statement" terminators (they just aren't semicolons)
  • sheept 9 hours ago
    Because formatters are increasingly popular, I think it'd be interesting to see a language that refuses to compile if the code is improperly formatted, and ships with a more tolerant formatter whose behavior can change from version to version. This way, the language can worry less about backwards compatibility or syntax edge cases, at the cost of taking away flexibility from its users.
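
    A toy version of the idea, assuming Python 3.9+ and treating `ast.unparse` as the "canonical formatter" purely for the sake of the sketch. That is a big simplification: `ast.unparse` drops comments, so any commented file would be rejected. `strict_compile` is a hypothetical name:

```python
import ast

def strict_compile(source, filename="<strict>"):
    """Compile source only if it already matches its canonical formatting."""
    tree = ast.parse(source, filename)
    canonical = ast.unparse(tree)          # stand-in for a real formatter
    if source.strip("\n") != canonical:
        raise SyntaxError(f"{filename}: source is not in canonical format")
    return compile(tree, filename, "exec")

namespace = {}
exec(strict_compile("x = 1 + 2\n"), namespace)   # accepted as written

try:
    strict_compile("x=1+2\n")                    # same program, wrong spacing
    rejected = False
except SyntaxError:
    rejected = True
```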
  • sheept 9 hours ago
    > I would love to see a language try to implement a rule where only an indented line is considered part of the previous expression.

    Elm does this (so maybe Haskell too). For example

        x = "hello "
         ++ "world"
    
        y = "hello "
        ++ "world" -- problem
  • kayson 9 hours ago
    Are we really saving that much by not having semicolons? IDEs could probably autocomplete this with high success, and it removes ambiguity from weird edge cases. On the other hand, I've not once had to think about where go is putting semicolons...
  • marcosdumay 9 hours ago
    Looks at 11 languages, ignores Haskell or anything really different...
    • jfengel 7 hours ago
      Once I learned Haskell, everything else looks pretty much identical. Java, C, C++, Smalltalk... At least Lisp looks a little bit different.
    • librasteve 9 hours ago
      or Raku
      • jasperry 8 hours ago
        Those are functional languages that generally don't use statements, so it makes sense to leave them out of a discussion about statement separators. If you think more people should use functional languages and so avoid the semicolon problem altogether, you could argue that.
        • marcosdumay 8 hours ago
          Yet, the author ends with a half-baked clone of the Haskell syntax.
        • Blikkentrekker 7 hours ago
          Functional hardly matters. Haskell has plenty of indentation, which is by the way interchangeable with `{ ... }`; one can use both at one's own pleasure and it's needed for many things.

          Also, famously `do { x ; y ; z }` is just syntactic sugar for `x >> y >> z` in Haskell where `>>` is a normal pure operator.

  • lightingthedark 8 hours ago
    It's interesting seeing all of the different ways language designers have approached this problem. I have to say that my takeaway is that this seems like a pretty strong argument for explicit end of statements. There is enough complexity inherent in the code, adding more in order to avoid typing a semicolon doesn't seem like a worthwhile tradeoff.

    I'm definitely biased by my preferences though, which are that I can always autoformat the code. This leads to a preference for explicit symbols elsewhere, for example I prefer curly brace languages to indentation based languages, for the same reason of being able to fully delegate formatting to the computer. I want to focus on the meaning of the code, not on line wrapping or indentation (but poorly formatted code does hinder understanding the meaning). Because code is still read more than it is written it just doesn't seem correct to introduce ambiguity like this.

    Would love to hear from someone who does think this is worthwhile, why do you hate semicolons?

    • duped 7 hours ago
      Start from the perspective of the user seeing effectively:

      > error: expected the character ';' at this exact location

      The user wonders, "if the parser is smart enough to tell me this, why do I need to add it at all?"

      The answer to that question "it's annoying to write the code to handle this correctly" is thoroughly lazy and boring. "My parser generator requires the grammar to be LR(1)" is even lazier. Human language doesn't fit into restrictive definitions of syntax, why should language for machines?

      > Because code is still read more than it is written it just doesn't seem correct to introduce ambiguity like this.

      That's why meaningful whitespace is better than semicolons. It forces you to write the ambiguous cases as readable code.

  • gcanyon 10 hours ago
    Can anyone give a good reason why supporting syntax like:

        y = 2 * x
          - 3
    
    is worth it?
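
    For comparison, here is how Python, which does not support this, treats the same two lines. The helper `run` is made up for the sketch and fixes x = 5. An unindented continuation silently becomes a separate statement, an indented one is rejected, and only brackets join the lines:

```python
def run(source):
    """Execute source with x = 5 and return the resulting y."""
    namespace = {"x": 5}
    exec(source, namespace)
    return namespace["y"]

# "- 3" on its own line is a valid expression statement, so y stays 2 * x.
unindented = run("y = 2 * x\n- 3\n")

# Inside parentheses the newline is ignored and the expression continues.
parenthesized = run("y = (2 * x\n     - 3)\n")

# The layout from the question is rejected outright.
try:
    run("y = 2 * x\n  - 3\n")
    indented_error = None
except IndentationError as error:
    indented_error = type(error).__name__
```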
    • whateveracct 10 hours ago
      in Haskell, supporting that is how you get neat composition

      such as applicative style formatted like this:

              f
          <$> x
          <*> y
          <*> z
    • marcosdumay 9 hours ago
      By "is worth it" you mean it's worth the work?

      Because it's very little extra work.

      If you want to know if it's a good syntax, AFAIK it's the only way to do a semicolon-less language that doesn't break all the time.

    • duped 8 hours ago

          if object.method()
          || other_object.field.condition()
          || (foo > bar && baz < qux)
    • zephen 9 hours ago
      Obviously, that's a really short expression.

      So, the question is, if you have a long expression, should you have to worry too much about either adding parentheses, or making sure that your line break occurs inside a pair of parentheses.

      It boils down to preference, but a language feature that supports whatever preference you have might be nice.

        priority = "URGENT"  if hours < 2  else
                   "HIGH"    if hours < 24 else
                   "MEDIUM"  if hours < 72 else
                   "LOW"
    • justsomehnguy 9 hours ago
      Mostly eye-candy, especially for some long one-liners.

      In PowerShell you can do that by explicitly indicating that the next line is actually a continuation of the previous one:

          $y = 2 * $x `
             - 3
    • szmarczak 10 hours ago
      It's not. Your eyes can deceive you by guessing the correct indentation. Indentation should never be used for grammar separation. Explicit characters such as } ] ) are clearer and unambiguous.
      • bmandale 9 hours ago
        Clearer for the computer, but not for the human. Many errors, some severe, have been caused by a human only looking at the indentation and not realizing the braces don't match.
        • Blikkentrekker 7 hours ago
          That's just because most languages go by braces and have optional indentation that is just ignored by the compiler.

          I'd reckon that in a language where stuff is done by indentation but optional braces exist that are just ignored, so many errors would also have been caused by braces being misplaced by the programmer to cue other programmers, who thought some scope happened as a consequence while the compiler disagreed due to the indentation, which by the way was caused by tabs and spaces being mixed in the code and it not properly showing up for another programmer with tab width set differently.

          • bmandale 5 hours ago
            > tabs and spaces being mixed in the code

            Python banned this in python3. Problem solved.
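
            A quick check of that behaviour, with a made-up helper name. More precisely, Python 3 rejects indentation whose meaning would depend on the tab width; here one line is indented with a tab and the next with eight spaces:

```python
def indentation_error(source):
    """Return the name of the indentation error raised, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as error:   # TabError is a subclass
        return type(error).__name__
    return None

mixed = "if True:\n\tx = 1\n        y = 2\n"
spaces_only = "if True:\n    x = 1\n    y = 2\n"
```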

        • szmarczak 9 hours ago
          > human only looking at the indentation and not realizing the braces don't match.

          If it ever gets to that point, a refactor is obligatory.

          Don't give the human tools to make easy mistakes. Any grammar can be abused, so blame the human for not writing clean code.

          • TheOtherHobbes 8 hours ago
            Javascript's delimiter soup ((){([]{})}); can become near impossible to parse barebrained, especially when mixed with indents and semicolons.

            Semicolons are just noise. They're absolutely redundant.

            Some brackets are necessary, but whitespace/indent languages make it clear there's a lot of redundancy there too.

            The goal is to minimise errors and cognitive load. The fewer characters the better.

            • szmarczak 7 hours ago
              > whitespace/indent languages make it clear there's a lot of redundancy there too.

              The only purpose for whitespace indentation is to make the code easier on the eyes. A space shouldn't have an impact in terms of execution, that would be too hazardous. It's too easy to randomly insert a space rather than a character.

        • szmarczak 7 hours ago
          > and not realizing the braces don't match.

          Make your IDE highlight the current section or display a hint showing starting bracket. For example, C++ devs do #endif // #if ...

          Too many brackets? Refactor - problem solved.

  • stack_framer 7 hours ago
    I never actually type semicolons in my JavaScript / TypeScript. In work projects, my IDE adds them for me thanks to the linter. In personal projects, I just leave them out (I don't use a linter, so my IDE does not add them), and I've never had a problem. Not even once.

    Semicolon FUD is for the birds.

    • jfengel 7 hours ago
      I occasionally run into problems with JS weird parsing rules. My favorite is:

          return
              x 
      
      Which does not return x. It returns undefined.

      Typescript helps a lot with that. A linter will probably flag it as well. Still, JS went way out of its way to accept just about anything, whether it makes sense or not.

      • galaxyLogic 1 hour ago
        That is what I lament too. So I've started to use the comma-operator to make sure my return statements don't care about line-breaks. I often write:

        return 0,

        x;

        I find this a mildly amusing discovery for myself because it took me a long time to figure out a useful use for the comma-operator.

  • IshKebab 9 hours ago
    > how does Gleam determine that the expression continues on the second line?

    The fact that it isn't obvious means the syntax is bad. Stuff this basic shouldn't be ambiguous.

    > Go's lexer inserts a semicolon after the following tokens if they appear just before a newline ... [non-trivial list] ... Simple enough!

    Again I beg to differ. Fundamentally it's just really difficult to make a rule that is actually simple, and lets you write code that you'd expect to work.

    I think the author's indentation idea is fairly reasonable, though I think indentation sensitivity is pretty error-prone.
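
    For reference, the core of Go's rule fits in a few lines. This is a simplified sketch in Python, not Go's actual lexer: the tokenizer is a crude regex, comments and raw/rune literals are ignored, and all names are invented. The rule itself follows the Go specification: a semicolon is inserted after a line's final token if that token is an identifier, a literal, one of `break`, `continue`, `fallthrough`, `return`, one of `++` or `--`, or a closing `)`, `]`, `}`:

```python
import re

# Crude tokens: identifiers/keywords, numbers, double-quoted strings, operators.
TOKEN = re.compile(r'[A-Za-z_]\w*|\d[\w.]*|"(?:\\.|[^"\\])*"|\+\+|--|[^\s\w]')

KEYWORDS = {
    "break", "case", "chan", "const", "continue", "default", "defer", "else",
    "fallthrough", "for", "func", "go", "goto", "if", "import", "interface",
    "map", "package", "range", "return", "select", "struct", "switch", "type",
    "var",
}
ENDING_KEYWORDS = {"break", "continue", "fallthrough", "return"}
ENDING_OPERATORS = {"++", "--", ")", "]", "}"}

def ends_statement(token):
    """True if a newline right after this token triggers semicolon insertion."""
    if token in ENDING_OPERATORS:
        return True
    if token[0].isalpha() or token[0] == "_":
        return token not in KEYWORDS or token in ENDING_KEYWORDS
    return token[0].isdigit() or token[0] == '"'   # number or string literal

def insert_semicolons(source):
    lines = []
    for line in source.splitlines():
        tokens = TOKEN.findall(line)
        if tokens and ends_statement(tokens[-1]):
            line = line.rstrip() + ";"
        lines.append(line)
    return "\n".join(lines)

result = insert_semicolons("x := a\n+ b\nreturn\nfoo(\n1,\n)")
```

    The first two lines of the sample show the consequence: `x := a` is terminated before `+ b` is ever seen, which is why Go code puts the operator at the end of the line when breaking an expression.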
