clojure-mode meets tree-sitter

It’s no secret that one of the biggest weaknesses of Emacs is that features like font-locking (syntax highlighting) and indentation are (usually) implemented in terms of regular expressions. Even though this primitive approach mostly gets the job done, it’s both slow and quite limiting in some ways.

Emacs 29 aims to change this with the introduction of built-in support for Tree-sitter. Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited. Sounds great, right?

I’m guessing that over time most Emacs major modes will adopt Tree-sitter, but that journey will be hard and painful, mostly because of backward compatibility with Emacs releases that don’t have support for Tree-sitter. This also probably means that the built-in modes that ship with Emacs will be at the forefront of Tree-sitter adoption, as they don’t have to care about backward compatibility.

So, what does this mean for clojure-mode? It means that for us the best way to support Tree-sitter will be via a new standalone mode. Enter clojure-ts-mode. The new mode lives in a separate GitHub repository and shares no code with the existing clojure-mode. This means it’s pretty light on features right now, but it also means it doesn’t carry all the legacy clojure-mode has from lisp-mode, on which it was originally based. Slates don’t come much cleaner than this.

Danny Freeman has been driving the work on clojure-ts-mode and he has already got some of the basics working (e.g. font-locking and fixed indentation), so feel free to play with the mode if you want. That being said, a ton of work remains to be done and help from everyone is most welcome. The setup is a bit complex right now, but it’s very well documented.

As CIDER (and inf-clojure) relies on the existing clojure-mode for things like font-locking, indentation and finding expression boundaries, it will be a while until clojure-ts-mode can fully replace clojure-mode, but the first steps on the long journey have been taken. Still, I expect that most people who switch to Emacs 29 will be able to do their basic programming in the new mode a lot sooner than that, as it won’t be an issue to have both modes installed at the same time. Once clojure-ts-mode matures enough we’ll likely teach CIDER (and inf-clojure) about it as well, so that they can leverage either it or clojure-mode.
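To illustrate what having both modes installed side by side might look like, here is a minimal sketch for an init file. It assumes that both packages are already installed and that a Clojure grammar for Tree-sitter has been set up separately. The helper name my/clojure-mode-dispatch is hypothetical and not part of either package, and the packages may register their own file associations, so treat this as an illustration rather than the recommended setup.

```emacs-lisp
;; Sketch only: prefer clojure-ts-mode when this Emacs was built with
;; Tree-sitter and a Clojure grammar is available; otherwise fall back
;; to the classic clojure-mode.
(defun my/clojure-mode-dispatch ()
  "Hypothetical helper: pick a Clojure major mode for the current buffer."
  (interactive)
  (if (and (fboundp 'treesit-available-p)           ; defined in Emacs 29+
           (treesit-available-p)                    ; built with Tree-sitter?
           (treesit-language-available-p 'clojure)  ; grammar installed?
           (fboundp 'clojure-ts-mode))              ; new mode installed?
      (clojure-ts-mode)
    (clojure-mode)))

;; Assumption: only .clj files are handled here; extend the pattern as needed.
(add-to-list 'auto-mode-alist '("\\.clj\\'" . my/clojure-mode-dispatch))
```

On Emacs releases without Tree-sitter support the first check fails right away, so the same configuration keeps using clojure-mode there.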

Finally, on some happy day in the (not so) distant future the old clojure-mode will be retired and clojure-ts-mode will become clojure-mode. Right now it’s common for the modes using Tree-sitter to be named something-ts-mode to distinguish them from the existing modes that don’t, but I guess eventually we’ll go back to having just one Tree-sitter-powered major mode for each programming language. As they said in “Highlander”: there can be only one!

I really believe that Tree-sitter can be a game-changer for Emacs, as it will potentially make Emacs a lot more competitive with “modern” editors and IDEs that have relied on custom parsers for a while to provide fast and reliable indentation and syntax highlighting, as well as sophisticated code analysis and refactoring functionality. In general, the combination of Tree-sitter and LSP support should, over time, considerably narrow the gap in programming language support between various editors, and I think this will play to Emacs’s advantage.¹ Time will tell.

That’s all I have for you today. Keep hacking!

¹ Emacs is great in many departments, but sadly the support for some programming languages has been less than great for various reasons.