37.7 Developing major modes with tree-sitter

This section covers some general guidelines on developing tree-sitter integration for a major mode.

A major mode supporting tree-sitter features should roughly follow this pattern:

(define-derived-mode woomy-mode prog-mode "Woomy"
  "A mode for the Woomy programming language."
  (when (treesit-ready-p 'woomy)
    (setq-local treesit-variables ...)
    ...
    (treesit-major-mode-setup)))

treesit-ready-p automatically emits a warning if conditions for enabling tree-sitter aren’t met.

If a tree-sitter major mode shares setup with its “native” counterpart, one can create a “base mode” that contains the common setup, like this:

(define-derived-mode woomy--base-mode prog-mode "Woomy"
  "An internal mode for the Woomy programming language."
  (common-setup)
  ...)

(define-derived-mode woomy-mode woomy--base-mode "Woomy"
  "A mode for the Woomy programming language."
  (native-setup)
  ...)

(define-derived-mode woomy-ts-mode woomy--base-mode "Woomy"
  "A tree-sitter mode for the Woomy programming language."
  (when (treesit-ready-p 'woomy)
    (setq-local treesit-variables ...)
    ...
    (treesit-major-mode-setup)))

Function: treesit-ready-p language &optional quiet

This function checks whether the conditions for activating tree-sitter are met: whether Emacs was built with tree-sitter, whether the buffer is small enough for tree-sitter to handle, and whether the grammar for language is available on the system (see Tree-sitter Language Grammar).

This function emits a warning if tree-sitter cannot be activated. If quiet is message, the warning is turned into a message; if quiet is t, no warning or message is displayed.

If all the necessary conditions are met, this function returns non-nil; otherwise it returns nil.
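
As a rough illustration of the quiet argument, the sketch below probes for tree-sitter support without bothering the user. The helper name my-woomy-preferred-mode is hypothetical, and woomy stands in for a real grammar, as in the examples above; the fboundp guard is only there so the code also loads in an Emacs that lacks treesit.el.

```elisp
;; treesit.el ships with Emacs 29 and later; do not fail on older versions.
(require 'treesit nil t)

;; Hypothetical helper, not part of Emacs.
(defun my-woomy-preferred-mode ()
  "Return the symbol of the Woomy major mode that can be used here.
Return `woomy-ts-mode' if tree-sitter is usable for the `woomy'
grammar in the current buffer, and `woomy-mode' otherwise."
  (if (and (fboundp 'treesit-ready-p)
           ;; Passing t as QUIET suppresses both the warning and the
           ;; message, so a failed probe is silent.
           (treesit-ready-p 'woomy t))
      'woomy-ts-mode
    'woomy-mode))
```

A caller could then run (funcall (my-woomy-preferred-mode)) to enable whichever mode is appropriate, assuming both modes are defined.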

Function: treesit-major-mode-setup

This function activates some tree-sitter features for a major mode.

Currently, it sets up the following features:

  • If treesit-font-lock-settings (see Parser-based Font Lock) is non-nil, it sets up fontification.
  • If treesit-simple-indent-rules (see Parser-based Indentation) is non-nil, it sets up indentation.
  • If treesit-defun-type-regexp is non-nil, it sets up navigation functions for beginning-of-defun and end-of-defun.
  • If treesit-defun-name-function is non-nil, it sets up add-log functions used by add-log-current-defun.
  • If treesit-simple-imenu-settings (see Imenu) is non-nil, it sets up Imenu.

For more information on these built-in tree-sitter features, see Parser-based Font Lock, Parser-based Indentation, and Moving over Balanced Expressions.
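
The variables listed above might be filled in along the following lines. This is a minimal sketch, not a complete mode: the woomy grammar and the node type names (comment, block, function_definition) are assumptions made for illustration, and a real mode would use the node types its grammar actually defines.

```elisp
(require 'treesit)

(define-derived-mode woomy-ts-mode prog-mode "Woomy"
  "A tree-sitter mode for the hypothetical Woomy language."
  (when (treesit-ready-p 'woomy)
    ;; Create the parser for this buffer.
    (treesit-parser-create 'woomy)
    ;; Fontification: a single rule, filed under the `comment' feature.
    (setq-local treesit-font-lock-settings
                (treesit-font-lock-rules
                 :language 'woomy
                 :feature 'comment
                 '((comment) @font-lock-comment-face)))
    (setq-local treesit-font-lock-feature-list '((comment)))
    ;; Indentation: indent children of a "block" node by 2 columns
    ;; relative to the beginning of the parent's line.
    (setq-local treesit-simple-indent-rules
                '((woomy
                   ((parent-is "block") parent-bol 2))))
    ;; Navigation: node types that count as defuns.
    (setq-local treesit-defun-type-regexp
                (rx bos "function_definition" eos))
    ;; Imenu: list function definitions under a "Function" category.
    (setq-local treesit-simple-imenu-settings
                '(("Function" "\\`function_definition\\'" nil nil)))
    ;; Activate whichever of the features above have been configured.
    (treesit-major-mode-setup)))
```

Everything is set inside the treesit-ready-p check, so the mode still loads, with a warning and without these features, when the grammar is missing.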

To support multiple languages in a single major mode, see Parsing Text in Multiple Languages.

Besides beginning-of-defun and end-of-defun, Emacs provides some additional functions for working with defuns: treesit-defun-at-point returns the defun node at point, and treesit-defun-name returns the name of a defun node.

Function: treesit-defun-at-point

This function returns the defun node at point, or nil if none is found. It respects treesit-defun-tactic: if its value is top-level, this function returns the top-level defun, and if its value is nested, it returns the immediately enclosing defun.

This function relies on treesit-defun-type-regexp. If that variable is nil, this function simply returns nil.
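
The effect of treesit-defun-tactic can be sketched by binding it around the call. The helper my-defun-bounds below is hypothetical; it assumes the current major mode has created a parser and set treesit-defun-type-regexp.

```elisp
(require 'treesit)

;; Hypothetical helper, not part of Emacs.
(defun my-defun-bounds (&optional top-level)
  "Return (START . END) of the defun at point, or nil if there is none.
With non-nil TOP-LEVEL, use the top-level defun instead of the
immediately enclosing one."
  ;; Bind the tactic only for the duration of this call.
  (let* ((treesit-defun-tactic (if top-level 'top-level 'nested))
         (node (treesit-defun-at-point)))
    (when node
      (cons (treesit-node-start node)
            (treesit-node-end node)))))
```

With point inside a function nested in another defun, (my-defun-bounds) would cover the inner function and (my-defun-bounds t) the outermost one.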

Function: treesit-defun-name node

This function returns the defun name of node. It returns nil if there is no defun name for node, or if node is not a defun node, or if node is nil.

Depending on the language and major mode, a defun name can be a function name, class name, struct name, and so on.

If treesit-defun-name-function is nil, this function always returns nil.

Variable: treesit-defun-name-function

If non-nil, this variable’s value should be a function that is called with a node as its argument, and returns the defun name of the node. The function should have the same semantics as treesit-defun-name: if the node is not a defun node, or the node is a defun node but doesn’t have a name, or the node is nil, it should return nil.
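
A function suitable for this variable might look like the sketch below. The node types function_definition and class_definition, and the field called name, are assumptions about a hypothetical Woomy grammar; the function name is likewise made up for illustration.

```elisp
(require 'treesit)

;; Hypothetical function; node types and the "name" field are assumed.
(defun woomy-ts-mode--defun-name (node)
  "Return the defun name of NODE.
Return nil if NODE is nil, is not a defun node, or has no name."
  (when (and node
             (member (treesit-node-type node)
                     '("function_definition" "class_definition")))
    (let ((name (treesit-node-child-by-field-name node "name")))
      (when name
        ;; Pass t to get the text without text properties.
        (treesit-node-text name t)))))

;; In the mode body, before calling `treesit-major-mode-setup':
;; (setq-local treesit-defun-name-function #'woomy-ts-mode--defun-name)
```

The explicit nil and node-type checks mirror the semantics described above, so treesit-defun-name can pass any node to the function safely.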