Major and Minor Modes
A mode is a set of definitions that customize Emacs behavior in useful ways. There are two varieties of modes: minor modes, which provide features that users can turn on and off while editing; and major modes, which are used for editing or interacting with a particular kind of text. Each buffer has exactly one major mode at a time. This chapter describes how to write both major and minor modes, how to indicate them in the mode line, and how they run hooks supplied by the user. For related topics such as keymaps and syntax tables, see Keymaps, and Syntax Tables.
Hooks
A hook is a variable where you can store a function or functions (see What Is a Function) to be called on a particular occasion by an existing program. Emacs provides hooks for the sake of customization. Most often, hooks are set up in the init file (see Init File), but Lisp programs can set them also. See Standard Hooks for a list of some standard hook variables.
Most of the hooks in Emacs are normal hooks. These variables contain lists of functions to be called with no arguments. By convention, whenever the hook name ends in -hook, that tells you it is normal. We try to make all hooks normal, as much as possible, so that you can use them in a uniform way.
Every major mode command is supposed to run a normal hook called the mode hook as one of the last steps of initialization. This makes it easy for a user to customize the behavior of the mode, by overriding the buffer-local variable assignments already made by the mode. Most minor mode functions also run a mode hook at the end. But hooks are used in other contexts too. For example, the hook suspend-hook runs just before Emacs suspends itself (see Suspending Emacs).
If the hook variable's name does not end with -hook, that indicates it is probably an abnormal hook. These differ from normal hooks in two ways: they can be called with one or more arguments, and their return values can be used in some way. The hook's documentation says how the functions are called and how their return values are used. Any functions added to an abnormal hook must follow the hook's calling convention. By convention, abnormal hook names end in -functions.
If the name of the variable ends in -predicate or -function (singular) then its value must be a function, not a list of functions. As with abnormal hooks, the expected arguments and meaning of the return value vary across such single function hooks. The details are explained in each variable's docstring.
Since hooks (both multi and single function) are variables, their values can be modified with setq or temporarily with let. However, it is often useful to add or remove a particular function from a hook while preserving any other functions it might have. For multi function hooks, the recommended way of doing this is with add-hook and remove-hook (Setting Hooks). Most normal hook variables are initially void; add-hook knows how to deal with this. You can add hooks either globally or buffer-locally with add-hook. For hooks which hold only a single function, add-hook is not appropriate, but you can use add-function (Advising Functions) to combine new functions with the hook. Note that some single function hooks may be nil which add-function cannot deal with, so you must check for that before calling add-function.
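The nil check mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the variable my-demo-transform-function and the helper my-demo-install-transform are hypothetical names invented here, and only add-function itself is an existing facility.

```elisp
;; Hypothetical single-function hook, defined only for this illustration.
(defvar my-demo-transform-function nil
  "Function called with one string and returning a string, or nil if unset.")

(defun my-demo-install-transform (fn)
  "Combine FN with `my-demo-transform-function', coping with a nil value."
  (if (null my-demo-transform-function)
      ;; `add-function' cannot wrap nil, so install FN directly.
      (setq my-demo-transform-function fn)
    ;; Otherwise pass the old function's return value through FN.
    (add-function :filter-return my-demo-transform-function fn)))

(my-demo-install-transform #'downcase)         ; hook was nil: plain assignment
(my-demo-install-transform #'upcase-initials)  ; hook was set: combined

;; The combined function downcases first, then capitalizes initials.
(funcall my-demo-transform-function "hELLO wORLD") ; => "Hello World"
```

Passing a bare symbol as the place makes add-function modify the variable's default value; a buffer-local variant would use the (local 'SYMBOL) form instead.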
Running Hooks
In this section, we document the run-hooks function, which is used to run a normal hook. We also document the functions for running various kinds of abnormal hooks.
-
run-hooks - This function takes one or more normal hook variable names as arguments, and runs each hook in turn. Each argument should be a symbol that is a normal hook variable. These arguments are processed in the order specified. If a hook variable has a non-nil value, that value should be a list of functions. run-hooks calls all the functions, one by one, with no arguments. The hook variable's value can also be a single function (either a lambda expression or a symbol with a function definition), which run-hooks calls. But this usage is obsolete. If the hook variable is buffer-local, the buffer-local variable will be used instead of the global variable. However, if the buffer-local variable contains the element t, the global hook variable will be run as well.
-
run-hook-with-args - This function runs an abnormal hook by calling all the hook functions in hook, passing each one the arguments args.
-
run-hook-with-args-until-failure - This function runs an abnormal hook by calling each hook function in turn, stopping if one of them fails by returning nil. Each hook function is passed the arguments args. If this function stops because one of the hook functions fails, it returns nil; otherwise it returns a non-nil value.
-
run-hook-with-args-until-success - This function runs an abnormal hook by calling each hook function, stopping if one of them succeeds by returning a non-nil value. Each hook function is passed the arguments args. If this function stops because one of the hook functions returns a non-nil value, it returns that value; otherwise it returns nil.
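The difference between these functions may be easier to see in a small sketch. All hook variables and functions below (the my-demo- names) are hypothetical and exist only for this illustration; the run- functions are the ones documented above.

```elisp
;; Hypothetical hooks, defined only for this illustration.
(defvar my-demo-ready-hook nil
  "Normal hook: functions are called with no arguments.")
(defvar my-demo-classify-functions nil
  "Abnormal hook: each function takes a string and returns a symbol or nil.")
(defvar my-demo-log nil
  "Record of which normal hook functions have run, most recent first.")

(defun my-demo-note-first () (push 'first my-demo-log))
(defun my-demo-note-second () (push 'second my-demo-log))

(defun my-demo-classify-number (string)
  (and (string-match-p "\\`[0-9]+\\'" string) 'number))
(defun my-demo-classify-word (string)
  (and (string-match-p "\\`[a-z]+\\'" string) 'word))

;; By default `add-hook' puts each new function at the front of the list.
(add-hook 'my-demo-ready-hook #'my-demo-note-second)
(add-hook 'my-demo-ready-hook #'my-demo-note-first)
(add-hook 'my-demo-classify-functions #'my-demo-classify-word)
(add-hook 'my-demo-classify-functions #'my-demo-classify-number)

;; Normal hook: every function runs, with no arguments.
(run-hooks 'my-demo-ready-hook)

(defun my-demo-classify (string)
  "Return the first non-nil value produced for STRING, or nil."
  (run-hook-with-args-until-success 'my-demo-classify-functions string))

(defun my-demo-accepted-by-all-p (string)
  "Return non-nil only if no hook function returns nil for STRING."
  (run-hook-with-args-until-failure 'my-demo-classify-functions string))
```

With these definitions, (my-demo-classify "42") stops at the first function that returns a non-nil value, while (my-demo-accepted-by-all-p "42") stops at the first function that returns nil.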
Setting Hooks
Here's an example that adds a function to a mode hook to turn on Auto Fill mode when in Lisp Interaction mode:
(add-hook 'lisp-interaction-mode-hook 'auto-fill-mode)
The value of a hook variable should be a list of functions. You can manipulate that list using the normal Lisp facilities, but the modular way is to use the functions add-hook and remove-hook, defined below. They take care to handle some unusual situations and avoid problems. It works to put a lambda-expression function on a hook, but we recommend avoiding this because it can lead to confusion. If you add the same lambda-expression a second time but write it slightly differently, you will get two equivalent but distinct functions on the hook. If you then remove one of them, the other will still be on it.
-
add-hook - This function is the handy way to add function function to hook variable hook. You can use it for abnormal hooks as well as for normal hooks. function can be any Lisp function that can accept the proper number of arguments for hook. For example,
(add-hook 'text-mode-hook 'my-text-hook-function)
adds my-text-hook-function to the hook called text-mode-hook. If function is already present in hook (comparing using equal), then add-hook does not add it a second time. If function has a non-nil property permanent-local-hook, then kill-all-local-variables (or changing major modes) won't delete it from the hook variable's local value. For a normal hook, hook functions should be designed so that the order in which they are executed does not matter. Any dependence on the order is asking for trouble. However, the order is predictable: normally, function goes at the front of the hook list, so it is executed first (barring another add-hook call). In some cases, it is important to control the relative ordering of functions on the hook. The optional argument depth lets you indicate where the function should be inserted in the list: it should then be a number between -100 and 100 where the higher the value, the closer to the end of the list the function should go. The depth defaults to 0 and for backward compatibility when depth is a non-nil symbol it is interpreted as a depth of 90. Furthermore, when depth is strictly greater than 0 the function is added after rather than before functions of the same depth. One should never use a depth of 100 (or -100), because one can never be sure that no other function will ever need to come before (or after) us. add-hook can handle the cases where hook is void or its value is a single function; it sets or changes the value to a list of functions. If local is non-nil, that says to add function to the buffer-local hook list instead of to the global hook list. This makes the hook buffer-local and adds t to the buffer-local value. The latter acts as a flag to run the hook functions in the default value as well as in the local value.
-
remove-hook - This function removes function from the hook variable hook. It compares function with elements of hook using equal, so it works for both symbols and lambda expressions. If local is non-nil, that says to remove function from the buffer-local hook list instead of from the global hook list.
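The ordering and buffer-local behavior described above can be sketched as follows. The hook my-demo-setup-hook and the functions my-demo-a, my-demo-b, and my-demo-c are hypothetical names for this illustration, and the sketch assumes an Emacs version in which the optional argument of add-hook is a numeric depth, as described above.

```elisp
;; Hypothetical hook and functions, defined only for this illustration.
(defvar my-demo-setup-hook nil
  "Normal hook used to demonstrate `add-hook' ordering.")
(defun my-demo-a () nil)
(defun my-demo-b () nil)
(defun my-demo-c () nil)

(add-hook 'my-demo-setup-hook #'my-demo-a)      ; depth 0: goes to the front
(add-hook 'my-demo-setup-hook #'my-demo-b 90)   ; high depth: toward the end
(add-hook 'my-demo-setup-hook #'my-demo-c -90)  ; low depth: toward the front
(add-hook 'my-demo-setup-hook #'my-demo-a)      ; already present: not added again
;; The global value is now (my-demo-c my-demo-a my-demo-b).

(defun my-demo-local-hook-value ()
  "Return the buffer-local hook value after a local `add-hook'."
  (with-temp-buffer
    ;; A non-nil LOCAL argument makes the hook buffer-local and adds t,
    ;; the flag that says to run the global value as well.
    (add-hook 'my-demo-setup-hook #'my-demo-b nil t)
    (buffer-local-value 'my-demo-setup-hook (current-buffer))))
```

Named functions are used rather than lambda expressions so that a later remove-hook call can refer to exactly the same function.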
Major Modes
Major modes specialize Emacs for editing or interacting with particular kinds of text. Each buffer has exactly one major mode at a time. Every major mode is associated with a major mode command, whose name should end in -mode. This command takes care of switching to that mode in the current buffer, by setting various buffer-local variables such as a local keymap. See Major Mode Conventions. Note that, unlike minor modes, there is no way to "turn off" a major mode; instead the buffer must be switched to a different one. However, you can temporarily suspend a major mode and later restore the suspended mode, see below. The least specialized major mode is called Fundamental mode, which has no mode-specific definitions or variable settings.
-
Command fundamental-mode - This is the major mode command for Fundamental mode. Unlike other mode commands, it does not run any mode hooks (Major Mode Conventions), since you are not supposed to customize this mode.
-
major-mode-suspend - This function works like fundamental-mode, in that it kills all buffer-local variables, but it also records the major mode in effect, so that it could subsequently be restored. This function and major-mode-restore (described next) are useful when you need to put a buffer under some specialized mode other than the one Emacs chooses for it automatically (Auto Major Mode), but would also like to be able to switch back to the original mode later.
-
major-mode-restore - This function restores the major mode recorded by major-mode-suspend. If no major mode was recorded, this function calls normal-mode (normal-mode), but tries to force it not to choose any modes in avoided-modes, if that argument is non-nil.
-
clean-mode - Changing the major mode clears out most local variables, but it doesn't remove all artifacts in the buffer (like text properties and overlays). It's rare to change a buffer from one major mode to another (except from fundamental-mode to everything else), so this is usually not a concern. It can sometimes be convenient (mostly when debugging a problem in a buffer) to do a "full reset" of the buffer, and that's what the clean-mode major mode offers. It will kill all local variables (even the permanently local ones), and also removes all overlays and text properties.
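A minimal sketch of suspending and restoring a major mode follows. The function name my-demo-suspend-and-restore is hypothetical, and the sketch assumes an Emacs version that provides major-mode-suspend and major-mode-restore.

```elisp
(defun my-demo-suspend-and-restore ()
  "Return the major modes seen before, during, and after a suspension."
  (with-temp-buffer
    (text-mode)
    (let ((before major-mode)
          during)
      ;; Kills the local variables but remembers that Text mode was active.
      (major-mode-suspend)
      (setq during major-mode)
      ;; Re-enters the recorded mode.
      (major-mode-restore)
      (list before during major-mode))))
```

The first and last elements of the returned list should both be text-mode; the middle element is whatever the default value of major-mode is in the running session.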
The easiest way to write a major mode is to use the macro define-derived-mode, which sets up the new mode as a variant of an existing major mode. See Derived Modes. We recommend using define-derived-mode even if the new mode is not an obvious derivative of another mode, as it automatically enforces many coding conventions for you. See Basic Major Modes for common modes to derive from. Writing major modes based on the tree-sitter library has some special aspects and conventions; see Tree-sitter Major Modes. The standard GNU Emacs Lisp directory tree contains the code for several major modes, in files such as text-mode.el, texinfo.el, lisp-mode.el, and rmail.el. You can study these libraries to see how modes are written.
-
major-mode - The buffer-local value of this variable holds the symbol for the current major mode. Its default value holds the default major mode for new buffers. The standard default value is fundamental-mode. If the default value is nil, then whenever Emacs creates a new buffer via a command such as C-x b (switch-to-buffer), the new buffer is put in the major mode of the previously current buffer. As an exception, if the major mode of the previous buffer has a mode-class symbol property with value special, the new buffer is put in Fundamental mode (Major Mode Conventions).
Major Mode Conventions
The code for every major mode should follow various coding conventions, including conventions for local keymap and syntax table initialization, function and variable names, and hooks. If you use the define-derived-mode macro, it will take care of many of these conventions automatically. See Derived Modes. Note also that Fundamental mode is an exception to many of these conventions, because it represents the default state of Emacs. The following list of conventions is only partial. Each major mode should aim for consistency in general with other Emacs major modes, as this makes Emacs as a whole more coherent. It is impossible to list here all the possible points where this issue might come up; if the Emacs developers point out an area where your major mode deviates from the usual conventions, please make it compatible.
- Define a major mode command whose name ends in -mode. When called with no arguments, this command should switch to the new mode in the current buffer by setting up the keymap, syntax table, and buffer-local variables in an existing buffer. It should not change the buffer's contents.
- Write a documentation string for this command that describes the special commands available in this mode. See Mode Help. The documentation string may include the special documentation substrings, \[COMMAND], \{KEYMAP}, and \<KEYMAP>, which allow the help display to adapt automatically to the user's own key bindings. See Keys in Documentation.
- The major mode command should start by calling kill-all-local-variables. This runs the normal hook change-major-mode-hook, then gets rid of the buffer-local variables of the major mode previously in effect. See Creating Buffer-Local.
- The major mode command should set the variable major-mode to the major mode command symbol. This is how describe-mode discovers which documentation to print.
- The major mode command should set the variable mode-name to the "pretty" name of the mode, usually a string (but see Mode Line Data, for other possible forms). The name of the mode appears in the mode line.
- Calling the major mode command twice in direct succession should not fail and should do the same thing as calling the command only once. In other words, the major mode command should be idempotent.
- Since all global names are in the same name space, all the global variables, constants, and functions that are part of the mode should have names that start with the major mode name (or with an abbreviation of it if the name is long). See Coding Conventions.
- In a major mode for editing some kind of structured text, such as a programming language, indentation of text according to structure is probably useful. So the mode should set indent-line-function to a suitable function, and probably customize other variables for indentation. See Auto-Indentation.
- The major mode should usually have its own keymap, which is used as the local keymap in all buffers in that mode. The major mode command should call use-local-map to install this local map. See Active Keymaps for more information. This keymap should be stored permanently in a global variable named MODENAME-mode-map. Normally the library that defines the mode sets this variable. See Tips for Defining for advice about how to write the code to set up the mode's keymap variable.
- The key sequences bound in a major mode keymap should usually start with C-c, followed by a control character, a digit, or {, }, <, >, : or ;. The other punctuation characters are reserved for minor modes, and ordinary letters are reserved for users. A major mode can also rebind the keys M-n, M-p and M-s. The bindings for M-n and M-p should normally be some kind of moving forward and backward, but this does not necessarily mean cursor motion. It is legitimate for a major mode to rebind a standard key sequence if it provides a command that does the same job in a way better suited to the text this mode is used for. For example, a major mode for editing a programming language might redefine C-M-a to move to the beginning of a function in a way that works better for that language. The recommended way of tailoring C-M-a to the needs of a major mode is to set beginning-of-defun-function (see List Motion) to invoke the function specific to the mode. It is also legitimate for a major mode to rebind a standard key sequence whose standard meaning is rarely useful in that mode. For instance, minibuffer modes rebind M-r, whose standard meaning is rarely of any use in the minibuffer. Major modes such as Dired or Rmail that do not allow self-insertion of text can reasonably redefine letters and other printing characters as special commands.
- Major modes for editing text should not define RET to do anything other than insert a newline. However, it is ok for specialized modes for text that users don't directly edit, such as Dired and Info modes, to redefine RET to do something entirely different.
- Major modes should not alter options that are primarily a matter of user preference, such as whether Auto-Fill mode is enabled. Leave this to each user to decide. However, a major mode should customize other variables so that Auto-Fill mode will work usefully if the user decides to use it.
- The mode may have its own syntax table or may share one with other related modes. If it has its own syntax table, it should store this in a variable named MODENAME-mode-syntax-table. See Syntax Tables. (Major modes based on the tree-sitter library use the parsers provided by tree-sitter for this; see Parser-based Font Lock.)
- If the mode handles a language that has a syntax for comments, it should set the variables that define the comment syntax. See Options Controlling Comments.
- The mode may have its own abbrev table or may share one with other related modes. If it has its own abbrev table, it should store this in a variable named MODENAME-mode-abbrev-table. If the major mode command defines any abbrevs itself, it should pass t for the system-flag argument to define-abbrev. See Defining Abbrevs.
- The mode should specify how to do highlighting for Font Lock mode, by setting up a buffer-local value for the variable font-lock-defaults (see Font Lock Mode). For a major mode based on tree-sitter, see Parser-based Font Lock.
- Each face that the mode defines should, if possible, inherit from an existing Emacs face. See Basic Faces and Faces for Font Lock.
- Consider adding a mode-specific menu to the menu bar. This should preferably include the most important menu-specific settings and commands that will allow users to discover the main features quickly and efficiently.
- Consider adding mode-specific context menus for the mode, to be used if and when users activate the context-menu-mode (see Menu Mouse Clicks). To this end, define a mode-specific function which builds one or more menus depending on the location of the mouse-3 click in the buffer, and then add that function to the buffer-local value of context-menu-functions.
- The mode should specify how Imenu should find the definitions or sections of a buffer, by setting up a buffer-local value for the variable imenu-generic-expression, for the two variables imenu-prev-index-position-function and imenu-extract-index-name-function, or for the variable imenu-create-index-function (see Imenu).
- The mode should specify how Outline minor mode should find the heading lines, by setting up a buffer-local value for the variables outline-regexp or outline-search-function, and also for the variable outline-level (see Outline Minor Mode).
- The mode can tell ElDoc mode how to retrieve different types of documentation for whatever is at point, by adding one or more buffer-local entries to the special hook eldoc-documentation-functions.
- The mode can specify how to complete various keywords by adding one or more buffer-local entries to the special hook completion-at-point-functions. See Completion in Buffers.
- To make a buffer-local binding for an Emacs customization variable, use make-local-variable in the major mode command, not make-variable-buffer-local. The latter function would make the variable local to every buffer in which it is subsequently set, which would affect buffers that do not use this mode. It is undesirable for a mode to have such global effects. See Buffer-Local Variables. With rare exceptions, the only reasonable way to use make-variable-buffer-local in a Lisp package is for a variable which is used only within that package. Using it on a variable used by other packages would interfere with them.
- Each major mode should have a normal mode hook named MODENAME-mode-hook. The very last thing the major mode command should do is to call run-mode-hooks. This runs the normal hook change-major-mode-after-body-hook, the mode hook, the function hack-local-variables (when the buffer is visiting a file), and then the normal hook after-change-major-mode-hook. See Mode Hooks.
- The major mode command may start by calling some other major mode command (called the parent mode) and then alter some of its settings. A mode that does this is called a derived mode. The recommended way to define one is to use the define-derived-mode macro, but this is not required. Such a mode should call the parent mode command inside a delay-mode-hooks form. (Using define-derived-mode does this automatically.) See Derived Modes and Mode Hooks.
- If something special should be done if the user switches a buffer from this mode to any other major mode, this mode can set up a buffer-local value for change-major-mode-hook (see Creating Buffer-Local).
- If this mode is appropriate only for specially-prepared text produced by the mode itself (rather than by the user typing at the keyboard or by an external file), then the major mode command symbol should have a property named mode-class with value special, put on as follows:
(put 'funny-mode 'mode-class 'special)
This tells Emacs that new buffers created while the current buffer is in Funny mode should not be put in Funny mode, even though the default value of major-mode is nil. By default, the value of nil for major-mode means to use the current buffer's major mode when creating new buffers (see Auto Major Mode), but with such special modes, Fundamental mode is used instead. Modes such as Dired, Rmail, and Buffer List use this feature. The function view-buffer does not enable View mode in buffers whose mode-class is special, because such modes usually provide their own View-like bindings. The define-derived-mode macro automatically marks the derived mode as special if the parent mode is special. Special mode is a convenient parent for such modes to inherit from; see Basic Major Modes.
- If you want to make the new mode the default for files with certain recognizable names, add an element to auto-mode-alist to select the mode for those file names (see Auto Major Mode). If you define the mode command to autoload, you should add this element in the same file that calls autoload. If you use an autoload cookie for the mode command, you can also use an autoload cookie for the form that adds the element (see autoload cookie). If you do not autoload the mode command, it is sufficient to add the element in the file that contains the mode definition.
- The top-level forms in the file defining the mode should be written so that they may be evaluated more than once without adverse consequences. For instance, use defvar or defcustom to set mode-related variables, so that they are not reinitialized if they already have a value (see Defining Variables).
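To make a few of these conventions concrete, here is a minimal sketch of a major mode command written by hand, without define-derived-mode. Everything named my-demo- is hypothetical and invented for this illustration, and the sketch covers only some of the conventions listed above (naming, kill-all-local-variables, major-mode, mode-name, the local keymap, a buffer-local setting, and the mode hook run last).

```elisp
(defvar my-demo-mode-hook nil
  "Normal hook run as the last step of `my-demo-mode'.")

(defun my-demo-describe ()
  "Show a short message (placeholder command for the keymap below)."
  (interactive)
  (message "This buffer is in My-Demo mode"))

(defvar my-demo-mode-map
  (let ((map (make-sparse-keymap)))
    ;; C-c followed by a control character is available to major modes.
    (define-key map (kbd "C-c C-d") #'my-demo-describe)
    map)
  "Local keymap for `my-demo-mode'.")

(defun my-demo-mode ()
  "Hypothetical major mode written without `define-derived-mode'.

\\{my-demo-mode-map}"
  (interactive)
  ;; Start from a clean slate; this also runs `change-major-mode-hook'.
  (kill-all-local-variables)
  (setq major-mode 'my-demo-mode)
  (setq mode-name "My-Demo")
  (use-local-map my-demo-mode-map)
  ;; Buffer-local settings use `setq-local', never a global `setq'.
  (setq-local comment-start "# ")
  ;; The very last step: run the mode hook via `run-mode-hooks'.
  (run-mode-hooks 'my-demo-mode-hook))
```

Because the command begins with kill-all-local-variables and only sets buffer-local state, calling it twice in a row leaves the buffer in the same state as calling it once. In practice define-derived-mode generates most of this boilerplate.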
How Emacs Chooses a Major Mode
When Emacs visits a file, it automatically selects a major mode for the buffer based on information in the file name or in the file itself. It also processes local variables specified in the file text.
-
Command normal-mode - This function establishes the proper major mode and buffer-local variable bindings for the current buffer. It calls set-auto-mode (see below). As of Emacs 26.1, it no longer runs hack-local-variables, this now being done in run-mode-hooks at the initialization of major modes (see Mode Hooks). If the find-file argument to normal-mode is non-nil, normal-mode assumes that the find-file function is calling it. In this case, it may process local variables in the -*- line or at the end of the file. The variable enable-local-variables controls whether to do so. See Local Variables in Files for the syntax of the local variables section of a file. If you run normal-mode interactively, the argument find-file is normally nil. In this case, normal-mode unconditionally processes any file local variables. The function calls set-auto-mode to choose and set a major mode. If this does not specify a mode, the buffer stays in the major mode determined by the default value of major-mode (see below). normal-mode uses condition-case around the call to the major mode command, so errors are caught and reported as a File mode specification error, followed by the original error message.
-
set-auto-mode - This function selects and sets the major mode that is appropriate for the current buffer. It bases its decision (in order of precedence) on the -*- line, on any mode: local variable near the end of a file, on the #! line (using interpreter-mode-alist), on the text at the beginning of the buffer (using magic-mode-alist), and finally on the visited file name (using auto-mode-alist). See How Major Modes are Chosen. If enable-local-variables is nil, set-auto-mode does not check the -*- line, or near the end of the file, for any mode tag. There are some file types where it is not appropriate to scan the file contents for a mode specifier. For example, a tar archive may happen to contain, near the end of the file, a member file that has a local variables section specifying a mode for that particular file. This should not be applied to the containing tar file. Similarly, a tiff image file might just happen to contain a first line that seems to match the -*- pattern. For these reasons, both these file extensions are members of the list inhibit-local-variables-regexps. Add patterns to this list to prevent Emacs searching them for local variables of any kind (not just mode specifiers). If keep-mode-if-same is non-nil, this function does not call the mode command if the buffer is already in the proper major mode. For instance, set-visited-file-name sets this to t to avoid killing buffer local variables that the user may have set.
-
set-buffer-major-mode - This function sets the major mode of buffer to the default value of major-mode; if that is nil, it uses the current buffer's major mode (if that is suitable). As an exception, if buffer's name is *scratch*, it sets the mode to initial-major-mode. The low-level primitives for creating buffers do not use this function, but medium-level commands such as switch-to-buffer and find-file-noselect use it whenever they create buffers.
-
initial-major-mode - The value of this variable determines the major mode of the initial *scratch* buffer. The value should be a symbol that is a major mode command. The default value is lisp-interaction-mode.
-
interpreter-mode-alist - This variable specifies major modes to use for scripts that specify a command interpreter in a #! line. Its value is an alist with elements of the form (REGEXP . MODE); this says to use mode mode if the file specifies an interpreter which matches \\`REGEXP\\'. For example, one of the default elements is ("python[0-9.]*" . python-mode).
-
magic-mode-alist - This variable's value is an alist with elements of the form (REGEXP . FUNCTION), where regexp is a regular expression and function is a function or nil. After visiting a file, set-auto-mode calls function if the text at the beginning of the buffer matches regexp and function is non-nil; if function is nil, auto-mode-alist gets to decide the mode.
-
magic-fallback-mode-alist - This works like magic-mode-alist, except that it is handled only if auto-mode-alist does not specify a mode for this file.
-
auto-mode-alist - This variable contains an association list of file name patterns (regular expressions) and corresponding major mode commands. Usually, the file name patterns test for suffixes, such as .el and .c, but this need not be the case. An ordinary element of the alist looks like (REGEXP . MODE-FUNCTION). For example,
(("\\`/tmp/fol/" . text-mode)
("\\.texinfo\\'" . texinfo-mode)
("\\.texi\\'" . texinfo-mode)
("\\.el\\'" . emacs-lisp-mode)
("\\.c\\'" . c-mode)
("\\.h\\'" . c-mode)
...)
When you visit a file whose expanded file name (File Name Expansion), with version numbers and backup suffixes removed using file-name-sans-versions (File Name Components), matches a regexp, set-auto-mode calls the corresponding mode-function. This feature enables Emacs to select the proper major mode for most files. If an element of auto-mode-alist has the form (REGEXP FUNCTION t), then after calling function, Emacs searches auto-mode-alist again for a match against the portion of the file name that did not match before. This feature is useful for uncompression packages: an entry of the form ("\\.gz\\'" FUNCTION t) can uncompress the file and then put the uncompressed file in the proper mode according to the name sans .gz. If auto-mode-alist has more than one element whose regexp matches the file name, Emacs will use the first match. Here is an example of how to prepend several pattern pairs to auto-mode-alist. (You might use this sort of expression in your init file.)
(setq auto-mode-alist
      (append
       ;; File name (within directory) starts with a dot.
       '(("/\\.[^/]*\\'" . fundamental-mode)
         ;; File name has no dot.
         ("/[^\\./]*\\'" . fundamental-mode)
         ;; File name ends in '.C'.
         ("\\.C\\'" . c++-mode))
       auto-mode-alist))
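The first-match rule can be tried out with a small sketch. The variable my-demo-mode-alist and the function my-demo-mode-for are hypothetical stand-ins defined only for this illustration; the lookup is a simplification and does not reproduce everything set-auto-mode does (for example, it ignores the (REGEXP FUNCTION t) form and any case-folding settings).

```elisp
(defvar my-demo-mode-alist
  '(("/\\.[^/]*\\'" . fundamental-mode)
    ("/[^\\./]*\\'" . fundamental-mode)
    ("\\.C\\'" . c++-mode))
  "Hypothetical stand-in for `auto-mode-alist', used only in this sketch.")

(defun my-demo-mode-for (file-name)
  "Return the mode selected by the first entry matching FILE-NAME, or nil."
  ;; Assumption of this sketch: match case-sensitively, so that
  ;; \"\\.C\\'\" does not also match a lowercase \".c\" suffix.
  (let ((case-fold-search nil))
    (assoc-default file-name my-demo-mode-alist #'string-match)))

(my-demo-mode-for "/home/u/notes.C")    ; => c++-mode
(my-demo-mode-for "/home/u/.profile")   ; => fundamental-mode
(my-demo-mode-for "/home/u/notes.txt")  ; => nil
```

Only the symbols are looked up here; no mode command is called, so the sketch works even if the named modes are not loaded.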
-
major-mode-remap-defaults - This variable contains an association list indicating which function to call to activate a given major mode. This is used for file formats that can be supported by various major modes, where this variable can be used to indicate which alternative should be used by default. For example, a third-party package providing a much improved Pascal major mode can use the following to tell normal-mode to use spiffy-pascal-mode for all the files that would normally use pascal-mode:
(add-to-list 'major-mode-remap-defaults '(pascal-mode . spiffy-pascal-mode))
This variable has the same format as major-mode-remap-alist. If both lists match a major mode, the entry in major-mode-remap-alist takes precedence.
-
major-mode-remap - This function returns the major mode to use instead of mode according to major-mode-remap-alist and major-mode-remap-defaults. It returns mode if the mode is not remapped by those variables. When a package wants to activate a major mode for a particular file format, it should use this function, passing as mode argument the canonical major mode for that file format, to find which specific major mode to activate, so as to take into account the user's preferences.
Getting Help about a Major Mode
The describe-mode function provides information about major modes. It is normally bound to C-h m. It uses the value of the variable major-mode (Major Modes), which is why every major mode command needs to set that variable.
-
Command describe-mode - This command displays the documentation of the current buffer's major mode and minor modes. It uses the documentation function to retrieve the documentation strings of the major and minor mode commands (Accessing Documentation). If called from Lisp with a non-nil buffer argument, this function displays the documentation for that buffer's major and minor modes, rather than those of the current buffer.
Defining Derived Modes
The recommended way to define a new major mode is to derive it from an existing one using define-derived-mode. If there is no closely related mode, you should inherit from either text-mode, special-mode, or prog-mode. See Basic Major Modes. If none of these is suitable, you can inherit from fundamental-mode (see Major Modes).
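As a minimal sketch of deriving from one of these basic modes, the following defines a mode on top of text-mode. The mode name my-demo-notes-mode, its mode-line name "Notes", and the fill-column setting are hypothetical choices made for this illustration.

```elisp
;; Hypothetical mode, defined only for this illustration.
(define-derived-mode my-demo-notes-mode text-mode "Notes"
  "Major mode for plain-text notes, derived from Text mode."
  ;; Body forms run after the parent mode's setup and before the mode hooks.
  (setq-local fill-column 72))

(defun my-demo-notes-summary ()
  "Return facts about a temporary buffer put into `my-demo-notes-mode'."
  (with-temp-buffer
    (my-demo-notes-mode)
    (list major-mode
          (and (derived-mode-p 'text-mode) t)
          fill-column)))
```

The macro also creates the variables my-demo-notes-mode-map, my-demo-notes-mode-syntax-table, my-demo-notes-mode-abbrev-table, and my-demo-notes-mode-hook, so none of them needs to be defined by hand.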
-
define-derived-mode - This macro defines variant as a major mode command, using name as the string form of the mode name. variant and parent should be unquoted symbols. The new command variant is defined to call the function parent, then override certain aspects of that parent mode:
- The new mode has its own sparse keymap, named VARIANT-map. define-derived-mode makes the parent mode's keymap the parent of the new map, unless VARIANT-map is already set and already has a parent.
- The new mode has its own syntax table, kept in the variable VARIANT-syntax-table, unless you override this using the :syntax-table keyword (see below). define-derived-mode makes the parent mode's syntax-table the parent of VARIANT-syntax-table, unless the latter is already set and already has a parent different from the standard syntax table.
- The new mode has its own abbrev table, kept in the variable VARIANT-abbrev-table, unless you override this using the :abbrev-table keyword (see below).
- The new mode has its own mode hook, VARIANT-hook. It runs this hook, after running the hooks of its ancestor modes, with run-mode-hooks, as the last thing it does, apart from running any :after-hook form it may have. Mode Hooks.
In addition, you can specify how to override other aspects of parent with body. The command variant evaluates the forms in body after setting up all its usual overrides, just before running the mode hooks. If parent has a non-nil mode-class symbol property, then define-derived-mode sets the mode-class property of variant to the same value. This ensures, for example, that if parent is a special mode, then variant is also a special mode (Major Mode Conventions). You can also specify nil for parent. This gives the new mode no parent. Then define-derived-mode behaves as described above, but, of course, omits all actions connected with parent. Conversely, you can use derived-mode-set-parent and derived-mode-add-parents, described below, to explicitly set the ancestry of the new mode. The argument docstring specifies the documentation string for the new mode. define-derived-mode adds some general information about the mode's hook, followed by the mode's keymap, at the end of this documentation string. If you omit docstring, define-derived-mode generates a documentation string. The keyword-args are pairs of keywords and values. The values, except for :after-hook's, are evaluated. The following keywords are currently supported:
-
:syntax-table - You can use this to explicitly specify a syntax table for the new mode. If you specify a nil value, the new mode uses the same syntax table as parent, or the standard syntax table if parent is nil. (Note that this does not follow the convention used for non-keyword arguments that a nil value is equivalent to not specifying the argument.)
-
:abbrev-table - You can use this to explicitly specify an abbrev table for the new mode. If you specify a nil value, the new mode uses the same abbrev table as parent, or fundamental-mode-abbrev-table if parent is nil. (Again, a nil value is not equivalent to not specifying this keyword.)
-
:interactive - Modes are interactive commands by default. If you specify a nil value, the mode defined here won't be interactive. This is useful for modes that are never meant to be activated by users manually, but are only supposed to be used in some specially-formatted buffer.
-
:group - If this is specified, the value should be the customization group for this mode. (Not all major modes have one.) The command customize-mode uses this. define-derived-mode does not automatically define the specified customization group.
-
:after-hook - This optional keyword specifies a single Lisp form to evaluate as the final act of the mode function, after the mode hooks have been run. It should not be quoted. Since the form might be evaluated after the mode function has terminated, it should not access any element of the mode function's local state. An :after-hook form is useful for setting up aspects of the mode which depend on the user's settings, which in turn may have been changed in a mode hook.
Here is a hypothetical example:
(defvar-keymap hypertext-mode-map
  "<down-mouse-3>" #'do-hyper-link)

(define-derived-mode hypertext-mode
  text-mode "Hypertext"
  "Major mode for hypertext."
  (setq-local case-fold-search nil))
Do not write an interactive spec in the definition; define-derived-mode does that automatically.
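The keyword arguments described above can be combined as in the following sketch. Everything named my-notes here is hypothetical and exists only for illustration; the group is defined by hand because define-derived-mode does not define it.

```elisp
(defgroup my-notes nil
  "Hypothetical customization group for My-Notes mode."
  :group 'convenience)

(defvar-local my-notes--final-fill-column nil
  "Value of `fill-column' seen by the `:after-hook' form (illustrative).")

(define-derived-mode my-notes-mode text-mode "My-Notes"
  "Hypothetical major mode for plain-text notes."
  :group 'my-notes          ; used by `customize-mode'
  :syntax-table nil         ; share the parent's syntax table
  ;; Evaluated after the mode hooks, so it sees user changes.
  :after-hook (setq my-notes--final-fill-column fill-column)
  (setq-local fill-column 72))
```

If a function on my-notes-mode-hook changes fill-column, the :after-hook form records the changed value rather than 72, because it is evaluated after the mode hooks have run.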
-
derived-mode-p - This function returns non-nil if the current major mode is derived from any of the major modes given by the list of symbols in modes. Instead of a list, modes can also be a single mode symbol. Furthermore, we still support a deprecated calling convention where the modes were passed as separate arguments. When examining the parent modes of the current major mode, this function takes into consideration the current mode's parents set by define-derived-mode, and also its additional parents set by derived-mode-add-parents, described below.
-
provided-mode-derived-p - This function returns non-nil if mode is derived from any of the major modes given by the list of symbols in modes. Like with derived-mode-p, modes can also be a single symbol, and this function also supports a deprecated calling convention where the modes were passed as separate symbol arguments. When examining the parent modes of mode, this function takes into consideration the parents of mode set by define-derived-mode, and also its additional parents set by derived-mode-add-parents, described below.
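A short sketch of both predicates follows. The mode my-outline-notes-mode and the helper my-in-text-family-p are hypothetical. The calls pass a single mode symbol, which is one of the calling conventions described above.

```elisp
(define-derived-mode my-outline-notes-mode text-mode "Outline-Notes"
  "Hypothetical major mode derived from Text mode.")

(defun my-in-text-family-p ()
  "Return non-nil if the current major mode derives from Text mode."
  (derived-mode-p 'text-mode))

(defun my-mode-in-text-family-p (mode)
  "Return non-nil if MODE derives from Text mode.
Unlike `my-in-text-family-p', this does not depend on the
current buffer."
  (provided-mode-derived-p mode 'text-mode))
```

The first helper inspects the current buffer's major mode, while the second answers the same question for any mode symbol.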
The graph of a major mode's ancestry can be accessed and modified with the following lower-level functions:
-
derived-mode-set-parent - This function declares that mode inherits from parent. This is the function that define-derived-mode calls after defining mode to register the fact that mode was defined by reusing parent.
-
derived-mode-add-parents - This function makes it possible to register additional parents besides the one that was used when defining mode. This can be used when the similarity between mode and the modes in extra-parents is such that it makes sense to treat mode as a child of those modes for purposes like applying directory-local variables and other mode-specific settings. The additional parent modes are specified as a list of symbols in extra-parents. Those additional parent modes will be considered as one of mode's parents by derived-mode-p and provided-mode-derived-p.
-
derived-mode-all-parents - This function returns the list of all the modes in the ancestry of mode, ordered from the most specific to the least specific, and starting with mode itself. This includes the additional parent modes, if any, added by calling derived-mode-add-parents.
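The following sketch registers an additional parent and then inspects the resulting ancestry. The mode my-config-mode and the helper are hypothetical, and the fboundp guards are an assumption added for Emacs versions that do not provide these lower-level functions.

```elisp
(define-derived-mode my-config-mode prog-mode "My-Config"
  "Hypothetical major mode for a configuration language.")

(defun my-config-mode-ancestry ()
  "Return the ancestry of `my-config-mode', or nil if unsupported.
The `fboundp' guards are an assumption for older Emacs versions."
  (when (and (fboundp 'derived-mode-add-parents)
             (fboundp 'derived-mode-all-parents))
    ;; Treat `my-config-mode' as a child of `conf-mode' too, in
    ;; addition to the `prog-mode' parent it was defined with.
    (derived-mode-add-parents 'my-config-mode '(conf-mode))
    (derived-mode-all-parents 'my-config-mode)))
```

Where the functions are available, the returned list starts with my-config-mode itself and contains both prog-mode and conf-mode.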
Basic Major Modes
Apart from Fundamental mode, there are three major modes that other major modes commonly derive from: Text mode, Prog mode, and Special mode. While Text mode is useful in its own right (e.g., for editing files ending in .txt), Prog mode and Special mode exist mainly to let other modes derive from them. As far as possible, new major modes should be derived, either directly or indirectly, from one of these three modes. One reason is that this allows users to customize a single mode hook (e.g., prog-mode-hook) for an entire family of relevant modes (e.g., all programming language modes).
-
Command text-mode - Text mode is a major mode for editing human languages. It defines the " and \ characters as having punctuation syntax (Syntax Class Table), and arranges for completion-at-point to complete words based on the spelling dictionary (Completion in Buffers). An example of a major mode derived from Text mode is HTML mode (see SGML and HTML Modes).
-
Command prog-mode - Prog mode is a basic major mode for buffers containing programming language source code. Most of the programming language major modes built into Emacs are derived from it. Prog mode binds parse-sexp-ignore-comments to t (Motion via Parsing) and bidi-paragraph-direction to left-to-right (Bidirectional Display).
-
Command special-mode - Special mode is a basic major mode for buffers containing text that is produced specially by Emacs, rather than directly from a file. Major modes derived from Special mode are given a mode-class property of special (Major Mode Conventions). Special mode sets the buffer to read-only. Its keymap defines several common bindings, including q for quit-window and g for revert-buffer (Reverting). An example of a major mode derived from Special mode is Buffer Menu mode, which is used by the *Buffer List* buffer (see Listing Existing Buffers).
In addition, modes for buffers of tabulated data can inherit from Tabulated List mode, which is in turn derived from Special mode (see Tabulated List Mode).
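The benefit of deriving from the basic modes can be sketched as follows. All names starting with my- are hypothetical. A single function on prog-mode-hook affects every mode in the Prog family, and a mode derived from Special mode inherits its read-only behavior and key bindings.

```elisp
;; A user-level setting applied to every programming-language mode.
(defun my-prog-defaults ()
  "Hypothetical hook function: indent with spaces in code buffers."
  (setq-local indent-tabs-mode nil))

(add-hook 'prog-mode-hook #'my-prog-defaults)

(define-derived-mode my-toy-lang-mode prog-mode "Toy"
  "Hypothetical major mode for a toy programming language.")

(define-derived-mode my-report-mode special-mode "Report"
  "Hypothetical major mode for a read-only report buffer.")
```

Enabling my-toy-lang-mode runs prog-mode-hook before my-toy-lang-mode-hook, so the user's setting applies without the new mode doing anything special.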
Mode Hooks
Every major mode command should finish by running the mode-independent normal hook change-major-mode-after-body-hook, its mode hook, and the normal hook after-change-major-mode-hook. It does this by calling run-mode-hooks. If the major mode is a derived mode, that is if it calls another major mode (the parent mode) in its body, it should do this inside delay-mode-hooks so that the parent won't run these hooks itself. Instead, the derived mode's call to run-mode-hooks runs the parent's mode hook too. See Major Mode Conventions.
Emacs versions before Emacs 22 did not have delay-mode-hooks. Versions before 24 did not have change-major-mode-after-body-hook. When user-implemented major modes do not use run-mode-hooks and have not been updated to use these newer features, they won't entirely follow these conventions: they may run the parent's mode hook too early, or fail to run after-change-major-mode-hook. This will have undesirable effects such as preventing minor modes defined with define-globalized-minor-mode from being enabled in buffers using these major modes. If you encounter such a major mode, please correct it to follow these conventions.
When you define a major mode using define-derived-mode, it automatically makes sure these conventions are followed. If you define a major mode "by hand", not using define-derived-mode, use the following functions to handle these conventions automatically.
-
run-mode-hooks - Major modes should run their mode hook using this function. It is similar to run-hooks (Hooks), but it also runs change-major-mode-after-body-hook, hack-local-variables (when the buffer is visiting a file) (File Local Variables), and after-change-major-mode-hook. The last thing it does is to evaluate any :after-hook forms declared by parent modes (Derived Modes). When this function is called during the execution of a delay-mode-hooks form, it does not run the hooks or hack-local-variables or evaluate the forms immediately. Instead, it arranges for the next call to run-mode-hooks to run them.
-
delay-mode-hooks - When one major mode command calls another, it should do so inside of delay-mode-hooks. This macro executes body, but tells all run-mode-hooks calls during the execution of body to delay running their hooks. The hooks will actually run during the next call to run-mode-hooks after the end of the delay-mode-hooks construct.
-
change-major-mode-after-body-hook - This is a normal hook run by run-mode-hooks. It is run before the mode hooks.
-
after-change-major-mode-hook - This is a normal hook run by run-mode-hooks. It is run at the very end of every properly-written major mode command.
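For a mode written "by hand", the conventions above can be sketched as follows. The mode my-log-mode is hypothetical. Setting the derived-mode-parent property directly is an assumption made here so that derived-mode-p recognizes the parent; define-derived-mode normally records this for you.

```elisp
(defvar my-log-mode-hook nil
  "Hook run when entering the hypothetical My-Log mode.")

;; Record the parent so that `derived-mode-p' knows about it.
(put 'my-log-mode 'derived-mode-parent 'text-mode)

(defun my-log-mode ()
  "Hypothetical hand-written major mode built on top of Text mode."
  (interactive)
  ;; Call the parent inside `delay-mode-hooks' so that it does not
  ;; run `text-mode-hook' and the other hooks by itself.
  (delay-mode-hooks (text-mode))
  (setq major-mode 'my-log-mode
        mode-name "My-Log")
  (setq-local truncate-lines t)
  ;; Runs `change-major-mode-after-body-hook', the delayed
  ;; `text-mode-hook', then `my-log-mode-hook', and finally
  ;; `after-change-major-mode-hook'.
  (run-mode-hooks 'my-log-mode-hook))
```

Because the parent's hooks are delayed, functions on text-mode-hook run only after the body of my-log-mode has finished its own setup.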
Tabulated List mode
Tabulated List mode is a major mode for displaying tabulated data, i.e., data consisting of entries, each entry occupying one row of text with its contents divided into columns. Tabulated List mode provides facilities for pretty-printing rows and columns, and sorting the rows according to the values in each column. It is derived from Special mode (Basic Major Modes).
Tabulated List mode is geared towards displaying text using monospaced fonts, using a single font and text size. If you want to display a table using variable pitch fonts or images, make-vtable can be used instead. vtable also supports having more than a single table in a buffer, or having a buffer that contains both a table and additional text in it. See the Introduction of the vtable manual for more information.
Tabulated List mode is intended to be used as a parent mode by a more specialized major mode. Examples include Process Menu mode (Process Information) and Package Menu mode (Package Menu). Such a derived mode should use define-derived-mode in the usual way, specifying tabulated-list-mode as the second argument (Derived Modes). The body of the define-derived-mode form should specify the format of the tabulated data, by assigning values to the variables documented below; optionally, it can then call the function tabulated-list-init-header, which will populate a header with the names of the columns.
The derived mode should also define a listing command. This, not the mode command, is what the user calls (e.g., M-x list-processes). The listing command should create or switch to a buffer, turn on the derived mode, specify the tabulated data, and finally call tabulated-list-print to populate the buffer.
-
tabulated-list-gui-sort-indicator-asc - This variable specifies the character to be used on GUI frames as an indication that the column is sorted in the ascending order. Whenever you change the sort direction in Tabulated List buffers, this indicator toggles between ascending ("asc") and descending ("desc").
-
tabulated-list-gui-sort-indicator-desc - Like tabulated-list-gui-sort-indicator-asc, but used when the column is sorted in the descending order.
-
tabulated-list-tty-sort-indicator-asc - Like tabulated-list-gui-sort-indicator-asc, but used for text-mode frames.
-
tabulated-list-tty-sort-indicator-desc - Like tabulated-list-tty-sort-indicator-asc, but used when the column is sorted in the descending order.
-
tabulated-list-format - This buffer-local variable specifies the format of the Tabulated List data. Its value should be a vector. Each element of the vector represents a data column, and should be a list (NAME WIDTH SORT . PROPS), where
- name is the column's name (a string).
- width is the width to reserve for the column (an integer). This is meaningless for the last column, which runs to the end of each line.
- sort specifies how to sort entries by the column. If nil, the column cannot be used for sorting. If t, the column is sorted by comparing string values. Otherwise, this should be a predicate function for sort (Rearrangement), which accepts two arguments with the same form as the elements of tabulated-list-entries (see below).
- props is a plist (Property Lists) of additional column properties. If the value of the property :right-align is non-nil then the column should be right-aligned. And the property :pad-right specifies the number of additional padding spaces to the right of the column (by default 1 if omitted).
-
tabulated-list-entries - This buffer-local variable specifies the entries displayed in the Tabulated List buffer. Its value should be either a list, or a function. If the value is a list, each list element corresponds to one entry, and should have the form (ID CONTENTS), where
- id is either nil, or a Lisp object that identifies the entry. If the latter, the cursor stays on the same entry when re-sorting entries. Comparison is done with equal.
- contents is a vector with the same number of elements as tabulated-list-format. Each vector element is either a string, which is inserted into the buffer as-is; an image descriptor, which is used to insert an image (Image Descriptors); or a list (LABEL . PROPERTIES), which means to insert a text button by calling insert-text-button with label and properties as arguments (Making Buttons). There should be no newlines in any of these strings.
Otherwise, the value should be a function which returns a list of the above form when called with no arguments.
-
tabulated-list-groups - This buffer-local variable specifies the groups of entries displayed in the Tabulated List buffer. Its value should be either a list or a function. If the value is a list, each list element corresponds to one group, and should have the form (GROUP-NAME ENTRY1 ENTRY2 ...), where group-name is a string inserted before all group entries, and entry1, entry2 and so on each have the same format as an element of tabulated-list-entries (see above). Otherwise, the value should be a function which returns a list of the above form when called with no arguments. You can use seq-group-by to create tabulated-list-groups from tabulated-list-entries. For example:
(setq tabulated-list-groups
      (seq-group-by 'Buffer-menu-group-by-mode
                    tabulated-list-entries))
where you can define Buffer-menu-group-by-mode like this:
(defun Buffer-menu-group-by-mode (entry)
  (concat "* " (aref (cadr entry) 5)))
-
tabulated-list-revert-hook - This normal hook is run prior to reverting a Tabulated List buffer. A derived mode can add a function to this hook to recompute tabulated-list-entries.
-
tabulated-list-printer - The value of this variable is the function called to insert an entry at point, including its terminating newline. The function should accept two arguments, id and contents, having the same meanings as in tabulated-list-entries. The default value is a function which inserts an entry in a straightforward way; a mode which uses Tabulated List mode in a more complex way can specify another function.
-
tabulated-list-sort-key - The value of this variable specifies the current sort key for the Tabulated List buffer. If it is nil, no sorting is done. Otherwise, it should have the form (NAME . FLIP), where name is a string matching one of the column names in tabulated-list-format, and flip, if non-nil, means to invert the sort order.
-
tabulated-list-init-header - This function computes and sets header-line-format for the Tabulated List buffer (Header Lines), and assigns a keymap to the header line to allow sorting entries by clicking on column headers. Modes derived from Tabulated List mode should call this after setting the above variables (in particular, only after setting tabulated-list-format).
-
tabulated-list-print - This function populates the current buffer with entries. It should be called by the listing command. It erases the buffer, sorts the entries specified by tabulated-list-entries according to tabulated-list-sort-key, then calls the function specified by tabulated-list-printer to insert each entry. If the optional argument remember-pos is non-nil, this function looks for the id element on the current line, if any, and tries to move to that entry after all the entries are (re)inserted. If the optional argument update is non-nil, this function will only erase or add entries that have changed since the last print. This is several times faster if most entries haven't changed since the last time this function was called. The only difference in outcome is that tags placed via tabulated-list-put-tag will not be removed from entries that haven't changed (normally all tags are removed).
-
tabulated-list-delete-entry - This function deletes the entry at point. It returns a list (ID COLS), where id is the ID of the deleted entry and cols is a vector of its column descriptors. It moves point to the beginning of the current line. It returns nil if there is no entry at point. Note that this function only changes the buffer contents; it does not alter tabulated-list-entries.
-
tabulated-list-get-id - This defsubst returns the ID object from tabulated-list-entries (if that is a list) or from the list returned by tabulated-list-entries (if it is a function). If omitted or nil, pos defaults to point.
-
tabulated-list-get-entry - This defsubst returns the entry object from tabulated-list-entries (if that is a list) or from the list returned by tabulated-list-entries (if it is a function). This will be a vector for the ID at pos. If there is no entry at pos, then the function returns nil.
-
tabulated-list-header-overlay-p - This defsubst returns non-nil if there is a fake header at pos. A fake header is used if tabulated-list-use-header-line is nil to put the column names at the beginning of the buffer. If omitted or nil, pos defaults to point-min.
-
tabulated-list-put-tag - This function puts tag in the padding area of the current line. The padding area can be empty space at the beginning of the line, the width of which is governed by tabulated-list-padding. tag should be a string, with a length less than or equal to tabulated-list-padding. If advance is non-nil, this function advances point by one line.
-
tabulated-list-clear-all-tags - This function clears all tags from the padding area in the current buffer.
-
tabulated-list-set-col - This function changes the tabulated list entry at point, setting col to desc. col is the column number to change, or the name of the column to change. desc is the new column descriptor, which is inserted via tabulated-list-print-col. If change-entry-data is non-nil, this function modifies the underlying data (usually the column descriptor in the list tabulated-list-entries) by setting the column descriptor of the vector to desc.
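Putting the pieces of this section together, a derived mode and its listing command might look like the sketch below. The mode, the buffer name, and the fruit data are all hypothetical and made up for illustration.

```elisp
(require 'tabulated-list)

(define-derived-mode my-fruit-menu-mode tabulated-list-mode "Fruit Menu"
  "Hypothetical major mode listing fruit and their counts."
  ;; Two columns: the first is sortable by string comparison.
  (setq tabulated-list-format
        [("Name" 12 t)
         ("Count" 6 nil :right-align t)])
  (setq tabulated-list-sort-key '("Name" . nil))
  ;; Call this only after setting `tabulated-list-format'.
  (tabulated-list-init-header))

(defun my-fruit-menu-buffer ()
  "Return a buffer listing some made-up fruit data."
  (let ((buffer (get-buffer-create "*Fruit*")))
    (with-current-buffer buffer
      (my-fruit-menu-mode)
      ;; Each entry has the form (ID CONTENTS).
      (setq tabulated-list-entries
            '((pear ["pear" "3"])
              (apple ["apple" "12"])))
      (tabulated-list-print))
    buffer))

(defun my-list-fruit ()
  "Listing command: display the hypothetical fruit list."
  (interactive)
  (pop-to-buffer (my-fruit-menu-buffer)))
```

The user calls M-x my-list-fruit rather than the mode command. Since the sort key names the "Name" column, the apple entry is printed before the pear entry even though it comes second in tabulated-list-entries.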
Generic Modes
Generic modes are simple major modes with basic support for comment syntax and Font Lock mode. To define a generic mode, use the macro define-generic-mode. See the file generic-x.el for some examples of the use of define-generic-mode.
-
define-generic-mode - This macro defines a generic mode command named mode (a symbol, not quoted). The optional argument docstring is the documentation for the mode command. If you do not supply it, define-generic-mode generates one by default. The argument comment-list is a list in which each element is either a character, a string of one or two characters, or a cons cell. A character or a string is set up in the mode's syntax table as a comment starter. If the entry is a cons cell, the CAR is set up as a comment starter and the CDR as a comment ender. (Use nil for the latter if you want comments to end at the end of the line.) Note that the syntax table mechanism has limitations about what comment starters and enders are actually possible. See Syntax Tables. The argument keyword-list is a list of keywords to highlight with font-lock-keyword-face. Each keyword should be a string. Meanwhile, font-lock-list is a list of additional expressions to highlight. Each element of this list should have the same form as an element of font-lock-keywords. See Search-based Fontification. The argument auto-mode-list is a list of regular expressions to add to the variable auto-mode-alist. They are added by the execution of the define-generic-mode form, not by expanding the macro call. Finally, function-list is a list of functions for the mode command to call for additional setup. It calls these functions just before it runs the mode hook variable MODE-hook.
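As an illustration of the arguments described above, here is a sketch of a generic mode for a made-up INI-like file format. The mode name, the keywords, and the .myini extension are all hypothetical.

```elisp
(require 'generic)

(define-generic-mode my-ini-generic-mode
  ;; comment-list: `;' and `#' start comments that end at end of line.
  '(?\; ?#)
  ;; keyword-list: highlighted with `font-lock-keyword-face'.
  '("true" "false")
  ;; font-lock-list: highlight [section] headers.
  '(("^\\[.*\\]" . font-lock-type-face))
  ;; auto-mode-list: regexps added to `auto-mode-alist'.
  '("\\.myini\\'")
  ;; function-list: no additional setup functions.
  nil
  "Hypothetical generic mode for INI-like files.")
```

Evaluating this form defines the command my-ini-generic-mode and associates files ending in .myini with it.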
Major Mode Examples
Text mode is perhaps the simplest mode besides Fundamental mode. Here are excerpts from text-mode.el that illustrate many of the conventions listed above:
;; Create the syntax table for this mode.
(defvar text-mode-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?\" ". " st)
    (modify-syntax-entry ?\\ ". " st)
    ;; Add 'p' so M-c on 'hello' leads to 'Hello', not 'hello'.
    (modify-syntax-entry ?' "w p" st)
    ...
    st)
  "Syntax table used while in `text-mode'.")
Here is how the actual mode command is defined now:
(define-derived-mode text-mode nil "Text"
  "Major mode for editing text written for humans to read.
In this mode, paragraphs are delimited only by blank or white lines.
You can thus get the full benefit of adaptive filling
(see the variable `adaptive-fill-mode').
\\{text-mode-map}
Turning on Text mode runs the normal hook `text-mode-hook'."
  (setq-local require-final-newline mode-require-final-newline))
The three Lisp modes (Lisp mode, Emacs Lisp mode, and Lisp Interaction mode) have more features than Text mode and the code is correspondingly more complicated. Here are excerpts from lisp-mode.el that illustrate how these modes are written. Here is how the Lisp mode syntax and abbrev tables are defined:
;; Create mode-specific table variables.
(define-abbrev-table 'lisp-mode-abbrev-table ()
  "Abbrev table for Lisp mode.")
(defvar lisp-mode-syntax-table
  (let ((table (make-syntax-table lisp--mode-syntax-table)))
    (modify-syntax-entry ?\[ "_ " table)
    (modify-syntax-entry ?\] "_ " table)
    (modify-syntax-entry ?# "' 14" table)
    (modify-syntax-entry ?| "\" 23bn" table)
    table)
  "Syntax table used in `lisp-mode'.")
The three modes for Lisp share much of their code. For instance, Lisp mode and Emacs Lisp mode inherit from Lisp Data mode and Lisp Interaction Mode inherits from Emacs Lisp mode. Amongst other things, Lisp Data mode sets up the comment-start variable to handle Lisp comments:
(setq-local comment-start ";") ...
Each of the different Lisp modes has a slightly different keymap. For example, Lisp mode binds C-c C-z to run-lisp, but the other Lisp modes do not. However, all Lisp modes have some commands in common. The following code sets up the common commands:
(defvar-keymap lisp-mode-shared-map
  :parent prog-mode-map
  :doc "Keymap for commands shared by all sorts of Lisp modes."
  "C-M-q" #'indent-sexp
  "DEL" #'backward-delete-char-untabify)
And here is the code to set up the keymap for Lisp mode:
(defvar-keymap lisp-mode-map
  :doc "Keymap for ordinary Lisp mode.
All commands in `lisp-mode-shared-map' are inherited by this map."
  :parent lisp-mode-shared-map
  "C-M-x" #'lisp-eval-defun
  "C-c C-z" #'run-lisp)
Finally, here is the major mode command for Lisp mode:
(define-derived-mode lisp-mode lisp-data-mode "Lisp"
  "Major mode for editing Lisp code for Lisps other than GNU Emacs Lisp.
Commands:
Delete converts tabs to spaces as it moves back.
Blank lines separate paragraphs. Semicolons start comments.
\\{lisp-mode-map}
Note that `run-lisp' may be used either to start an inferior Lisp job
or to switch back to an existing one."
  (setq-local find-tag-default-function 'lisp-find-tag-default)
  (setq-local comment-start-skip
              "\\(\\(^\\|[^\\\n]\\)\\(\\\\\\\\\\)*\\)\\(;+\\|#|\\) *")
  (setq imenu-case-fold-search t))
Minor Modes
A minor mode provides optional features that users may enable or disable independently of the choice of major mode. Minor modes can be enabled individually or in combination. Most minor modes implement features that are independent of the major mode, and can thus be used with most major modes. For example, Auto Fill mode works with any major mode that permits text insertion. A few minor modes, however, are specific to a particular major mode. For example, Diff Auto Refine mode is a minor mode that is intended to be used only with Diff mode. Ideally, a minor mode should have its desired effect regardless of the other minor modes in effect. It should be possible to activate and deactivate minor modes in any order.
-
local-minor-modes - This buffer-local variable lists the currently enabled minor modes in the current buffer, and is a list of symbols.
-
global-minor-modes - This variable lists the currently enabled global minor modes, and is a list of symbols.
-
minor-mode-list - The value of this variable is a list of all minor mode commands.
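These variables can be inspected from Lisp, for example to test whether a given minor mode is enabled in a buffer. In the sketch below, my-demo-mode and the helper function are hypothetical. The mode is defined with define-minor-mode, which is assumed here to keep local-minor-modes and minor-mode-list up to date.

```elisp
(define-minor-mode my-demo-mode
  "Hypothetical buffer-local minor mode used only for illustration."
  :lighter " Demo")

(defun my-demo-mode-enabled-p (&optional buffer)
  "Return non-nil if `my-demo-mode' is enabled in BUFFER.
BUFFER defaults to the current buffer."
  (memq 'my-demo-mode
        (buffer-local-value 'local-minor-modes
                            (or buffer (current-buffer)))))
```

Checking the mode variable my-demo-mode directly gives the same answer for a single known mode; the list is more useful when you need to enumerate all enabled modes.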
Conventions for Writing Minor Modes
There are conventions for writing minor modes just as there are for major modes (Major Modes). These conventions are described below. The easiest way to follow them is to use the macro define-minor-mode (see Defining Minor Modes).
- Define a variable whose name ends in -mode. We call this the mode variable. The minor mode command should set this variable. The value will be nil if the mode is disabled, and non-nil if the mode is enabled. The variable should be buffer-local if the minor mode is buffer-local. This variable is used in conjunction with the minor-mode-alist to display the minor mode name in the mode line. It also determines whether the minor mode keymap is active, via minor-mode-map-alist (Controlling Active Maps). Individual commands or hooks can also check its value.
- Define a command, called the mode command, whose name is the same as the mode variable. Its job is to set the value of the mode variable, plus anything else that needs to be done to actually enable or disable the mode's features. The mode command should accept one optional argument. If called interactively with no prefix argument, it should toggle the mode (i.e., enable if it is disabled, and disable if it is enabled). If called interactively with a prefix argument, it should enable the mode if the argument is positive and disable it otherwise. If the mode command is called from Lisp (i.e., non-interactively), it should enable the mode if the argument is omitted or nil; it should toggle the mode if the argument is the symbol toggle; otherwise it should treat the argument in the same way as for an interactive call with a numeric prefix argument, as described above. The following example shows how to implement this behavior (it is similar to the code generated by the define-minor-mode macro):
(interactive (list (or current-prefix-arg 'toggle)))
(let ((enable
       (if (eq arg 'toggle)
           (not foo-mode) ; this is the mode's mode variable
         (> (prefix-numeric-value arg) 0))))
  (if enable
      do-enable
    do-disable))
The reason for this somewhat complex behavior is that it lets users easily toggle the minor mode interactively, and also lets the minor mode be easily enabled in a mode hook, like this:
(add-hook 'text-mode-hook 'foo-mode)
This behaves correctly whether or not foo-mode was already enabled, since the foo-mode mode command unconditionally enables the minor mode when it is called from Lisp with no argument. Disabling a minor mode in a mode hook is a little uglier:
(add-hook 'text-mode-hook (lambda () (foo-mode -1)))
However, this is not very commonly done. Enabling or disabling a minor mode twice in direct succession should not fail and should do the same thing as enabling or disabling it only once. In other words, the minor mode command should be idempotent.
- Add an element to minor-mode-alist for each minor mode (Definition of minor-mode-alist), if you want to indicate the minor mode in the mode line. This element should be a list of the following form:
(mode-variable string)
Here mode-variable is the variable that controls enabling of the minor mode, and string is a short string, starting with a space, to represent the mode in the mode line. These strings must be short so that there is room for several of them at once. When you add an element to minor-mode-alist, use assq to check for an existing element, to avoid duplication. For example:
(unless (assq 'leif-mode minor-mode-alist)
  (push '(leif-mode " Leif") minor-mode-alist))
or like this, using add-to-list (List Variables):
(add-to-list 'minor-mode-alist '(leif-mode " Leif"))
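Taken together, the three conventions above can be sketched as one small hand-written minor mode. The name my-foo-mode is hypothetical, and the mode has no real features; a comment marks where a real mode would enable or disable them.

```elisp
(defvar-local my-foo-mode nil
  "Non-nil if the hypothetical My-Foo mode is enabled in this buffer.")

(defvar my-foo-mode-hook nil
  "Hook run after My-Foo mode is enabled or disabled.")

(defun my-foo-mode (&optional arg)
  "Toggle the hypothetical My-Foo mode, following the conventions above.
Return the new value of the mode variable."
  (interactive (list (or current-prefix-arg 'toggle)))
  (setq my-foo-mode
        (if (eq arg 'toggle)
            (not my-foo-mode)
          ;; nil or omitted counts as 1, so it enables the mode.
          (> (prefix-numeric-value arg) 0)))
  ;; A real mode would enable or disable its features here.
  (run-hooks 'my-foo-mode-hook)
  my-foo-mode)

;; Show " Foo" in the mode line while the mode is enabled.
(add-to-list 'minor-mode-alist '(my-foo-mode " Foo"))
```

Calling (my-foo-mode) twice in a row leaves the mode enabled both times, which is the idempotent behavior the conventions ask for.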
In addition, several major mode conventions (Major Mode Conventions) apply to minor modes as well: those regarding the names of global symbols, the use of a hook at the end of the initialization function, and the use of keymaps and other tables. The minor mode should, if possible, support enabling and disabling via Custom (Customization). To do this, the mode variable should be defined with defcustom, usually with :type 'boolean. If just setting the variable is not sufficient to enable the mode, you should also specify a :set method which enables the mode by invoking the mode command. Note in the variable's documentation string that setting the variable other than via Custom may not take effect. Also, mark the definition with an autoload cookie (autoload cookie), and specify a :require so that customizing the variable will load the library that defines the mode. For example:
;;;###autoload
(defcustom msb-mode nil
  "Toggle msb-mode.
Setting this variable directly does not take effect;
use either \\[customize] or the function `msb-mode'."
  :set 'custom-set-minor-mode
  :initialize 'custom-initialize-default
  :version "20.4"
  :type 'boolean
  :group 'msb
  :require 'msb)
Keymaps and Minor Modes
Each minor mode can have its own keymap, which is active when the mode is enabled. To set up a keymap for a minor mode, add an element to the alist minor-mode-map-alist (see Definition of minor-mode-map-alist). One use of minor mode keymaps is to modify the behavior of certain self-inserting characters so that they do something else as well as self-insert. (Another way to customize self-insert-command is through post-self-insert-hook; see Commands for Insertion. Apart from this, the facilities for customizing self-insert-command are limited to special cases, designed for abbrevs and Auto Fill mode. Do not try substituting your own definition of self-insert-command for the standard one. The editor command loop handles this function specially.) Minor modes may bind commands to key sequences consisting of C-c followed by a punctuation character. However, sequences consisting of C-c followed by one of {}<>:;, or a control character or digit, are reserved for major modes. Also, C-c LETTER is reserved for users. See Key Binding Conventions.
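A minimal sketch of setting up such a keymap by hand follows. The names my-bar-mode, my-bar-mode-map, and my-bar-hello are hypothetical. The binding uses C-c followed by a punctuation character, which the conventions above allow for minor modes.

```elisp
(defvar-local my-bar-mode nil
  "Mode variable of the hypothetical My-Bar minor mode.")

(defun my-bar-hello ()
  "Hypothetical command bound in `my-bar-mode-map'."
  (interactive)
  (message "Hello from My-Bar mode"))

(defvar my-bar-mode-map
  (let ((map (make-sparse-keymap)))
    (define-key map (kbd "C-c !") #'my-bar-hello)
    map)
  "Keymap active while `my-bar-mode' is non-nil.")

;; The keymap is consulted whenever the mode variable is non-nil.
(add-to-list 'minor-mode-map-alist
             (cons 'my-bar-mode my-bar-mode-map))
```

With this in place, setting my-bar-mode to a non-nil value in a buffer makes C-c ! run my-bar-hello there, and setting it back to nil removes the binding.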
Defining Minor Modes
The macro define-minor-mode offers a convenient way of implementing a mode in one self-contained definition.
-
define-minor-mode - This macro defines a new minor mode whose name is mode (a symbol). It defines a command named mode to toggle the minor mode, with doc as its documentation string. The toggle command takes one optional (prefix) argument. If called interactively with no argument it toggles the mode on or off. A positive prefix argument enables the mode, any other prefix argument disables it. From Lisp, an argument of toggle toggles the mode, whereas an omitted or nil argument enables the mode. This makes it easy to enable the minor mode in a major mode hook, for example. If doc is nil, the macro supplies a default documentation string explaining the above. By default, it also defines a variable named mode, which is set to t or nil by enabling or disabling the mode. The keyword-args consist of keywords followed by corresponding values. A few keywords have special meanings:
-
:global GLOBAL - If non-nil, this specifies that the minor mode should be global rather than buffer-local. It defaults to nil. One of the effects of making a minor mode global is that the mode variable becomes a customization variable. Toggling it through the Customize interface turns the mode on and off, and its value can be saved for future Emacs sessions (Saving Customizations). For the saved variable to work, you should ensure that the minor mode function is available each time Emacs starts; usually this is done by marking the define-minor-mode form as autoloaded.
-
:init-value INIT-VALUE - This is the value to which the mode variable is initialized. Except in unusual circumstances (see below), this value must be nil. Note that define-minor-mode does not automatically run the body of the minor mode to ensure the mode is really enabled according to this value, so if the mode is global (see above) and the initial value is non-nil, you should consider forcing Emacs to run the mode function when loading the mode, like this:
:initialize #'custom-initialize-after-file-load
otherwise, the minor mode might say it's enabled even though it has not been properly set up.
-
:lighter LIGHTER - The string lighter says what to display in the mode line when the mode is enabled; if it is nil, the mode is not displayed in the mode line.
-
:keymap KEYMAP - The optional argument keymap specifies the keymap for the minor mode. If non-nil, it should be a variable name (whose value is a keymap), a keymap, or an alist of the form (key-sequence . definition) where each key-sequence and definition are arguments suitable for passing to define-key (Changing Key Bindings). If keymap is a keymap or an alist, this also defines the variable MODE-map.
-
:variable PLACE - This replaces the default variable mode, used to store the state of the mode. If you specify this, the mode variable is not defined, and any init-value argument is unused. place can be a different named variable (which you must define yourself), or anything that can be used with the setf function (Generalized Variables). place can also be a cons (GET . SET), where get is an expression that returns the current state, and set is a function of one argument (a state) which should be assigned to place.
-
:after-hook AFTER-HOOK - This defines a single Lisp form which is evaluated after the mode hooks have run. It should not be quoted.
-
:interactive VALUE - Minor modes are interactive commands by default. If value is nil, this is inhibited. If value is a list of symbols, it's used to say which major modes this minor mode is useful in.
Any other keyword arguments are passed directly to the defcustom generated for the variable mode. Variable Definitions, for the description of those keywords and their values. The command named mode first performs the standard actions such as setting the variable named mode and then executes the body forms, if any. It then runs the mode hook variable MODE-hook and finishes by evaluating any form in :after-hook. (Note that all of this, including running the hook, is done both when the mode is enabled and disabled.) The initial value must be nil except in cases where (1) the mode is preloaded in Emacs, or (2) it is painless for loading to enable the mode even though the user did not request it. For instance, if the mode has no effect unless something else is enabled, and will always be loaded by that time, enabling it by default is harmless. But these are unusual circumstances. Normally, the initial value must be nil. Here is an example of using define-minor-mode:
(define-minor-mode hungry-mode
  "Toggle Hungry mode.
Interactively with no argument, this command toggles the mode.
A positive prefix argument enables the mode, any other prefix
argument disables it.  From Lisp, argument omitted or nil enables
the mode, `toggle' toggles the state.

When Hungry mode is enabled, the control delete key
gobbles all preceding whitespace except the last.
See the command \\[hungry-electric-delete]."
  ;; The initial value.
  nil
  ;; The indicator for the mode line.
  " Hungry"
  ;; The minor mode bindings.
  '(([C-backspace] . hungry-electric-delete)))
This defines a minor mode named "Hungry mode", a command named hungry-mode to toggle it, a variable named hungry-mode which indicates whether the mode is enabled, and a variable named hungry-mode-map which holds the keymap that is active when the mode is enabled. It initializes the keymap with a key binding for C-DEL. There are no body forms; many minor modes don't need any. Here's an equivalent way to write it:
(define-minor-mode hungry-mode
  "Toggle Hungry mode.
...rest of documentation as before..."
  ;; The initial value.
  :init-value nil
  ;; The indicator for the mode line.
  :lighter " Hungry"
  ;; The minor mode bindings.
  :keymap
  '(([C-backspace] . hungry-electric-delete)
    ([C-M-backspace]
     . (lambda ()
         (interactive)
         (hungry-electric-delete t)))))
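The Hungry mode examples have no body forms. As a complementary sketch, the following shows a mode that does have one. The name my-trailing-mode is hypothetical; the body simply mirrors the mode variable into show-trailing-whitespace, which is one possible design and not the only way to write such a mode.

```elisp
;; `my-trailing-mode' is a hypothetical name used for illustration.
(define-minor-mode my-trailing-mode
  "Toggle highlighting of trailing whitespace in the current buffer."
  :lighter " TrWS"
  ;; Body form: evaluated both when enabling and when disabling,
  ;; after the mode variable has been set and before
  ;; `my-trailing-mode-hook' runs.
  (setq-local show-trailing-whitespace my-trailing-mode))

;; From Lisp, an omitted or nil argument enables the mode, so the
;; command itself can be added to a major mode hook.
(add-hook 'text-mode-hook #'my-trailing-mode)
```

Because the body runs in both directions, it tests the mode variable rather than assuming the mode is being turned on.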
-
define-globalized-minor-mode - This defines a global toggle named global-mode whose meaning is to enable or disable the buffer-local minor mode mode in all (or some; see below) buffers. It also executes the body forms. To turn on the minor mode in a buffer, it uses the function turn-on; to turn off the minor mode, it calls mode with -1 as argument. (The function turn-on is a separate function so it could determine whether to enable the minor mode or not when it is not a priori clear that it should always be enabled.) Globally enabling the mode affects only those buffers subsequently created that use a major mode which follows the convention of calling run-mode-hooks. The minor mode will not be enabled in those major modes which fail to follow this convention. This macro defines the customization option global-mode (Customization), which can be toggled via the Customize interface to turn the minor mode on and off. As with define-minor-mode, you should ensure that the define-globalized-minor-mode form is evaluated each time Emacs starts, for example by providing a :require keyword. Use :group GROUP in keyword-args to specify the custom group for the mode variable of the global minor mode. By default, the buffer-local minor mode variable that says whether the mode is switched on or off is the same as the name of the mode itself. Use :variable VARIABLE if that's not the case; some minor modes use a different variable to store this state information. Generally speaking, when you define a globalized minor mode, you should also define a non-globalized version, so that people could use it (or disable it) in individual buffers. This also allows them to disable a globally enabled minor mode in a specific major mode, by using that mode's hook. If the macro is given a :predicate keyword, it will create a user option called the same as the global mode variable, but with -modes instead of -mode at the end, i.e. GLOBAL-MODEs. This variable will be used in a predicate function that determines whether the minor mode should be activated in a particular major mode, and users can customize the value of the variable to control the modes in which the minor mode will be switched on. Valid values of :predicate (and thus valid values of the user option it creates) include t (use in all major modes), nil (don't use in any major modes), or a list of mode names, optionally preceded with not (as in (not MODE-NAME ...)). These elements can be mixed, as shown in the following examples.
(c-mode (not mail-mode message-mode) text-mode)
This means "use in modes derived from c-mode, and not in modes derived from message-mode or mail-mode, but do use in modes derived from text-mode, and otherwise no other modes".
((not c-mode) t)
This means "don't use in modes derived from c-mode, but do use everywhere else".
(text-mode)
This means "use in modes derived from text-mode, but nowhere else". (There's an implicit nil element at the end.)
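To make the roles of mode, turn-on, and global-mode concrete, here is a small sketch of a buffer-local mode and its globalized counterpart. All names beginning with my-note are hypothetical, and the choice to skip minibuffers in the turn-on function is only an example of the kind of decision that function can make.

```elisp
;; All names starting with `my-note' are hypothetical.
(define-minor-mode my-note-mode
  "Buffer-local minor mode used only to illustrate globalization."
  :lighter " Note")

(defun my-note-mode-turn-on ()
  "Enable `my-note-mode' unless the current buffer is a minibuffer."
  (unless (minibufferp)
    (my-note-mode 1)))

;; Defines the command and user option `global-my-note-mode'.
(define-globalized-minor-mode global-my-note-mode
  my-note-mode my-note-mode-turn-on
  :group 'convenience)
```

With this in place, (global-my-note-mode 1) turns the mode on through my-note-mode-turn-on, and (my-note-mode -1) in a mode hook can still opt a particular major mode out.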
-
buffer-local-set-state - Minor modes often set buffer-local variables that affect some features in Emacs. When a minor mode is switched off, the mode is expected to restore the previous state of these variables. This convenience macro helps with doing that: it works much like setq-local, but returns an object that can be used to restore these values back to their previous values/states (using the companion function buffer-local-restore-state).
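A possible way to use this pair in a minor mode is sketched below. The names my-wide-mode and my-wide--saved-state are hypothetical, and the sketch assumes an Emacs recent enough to provide buffer-local-set-state and buffer-local-restore-state.

```elisp
;; Hypothetical names; assumes `buffer-local-set-state' is available.
(defvar-local my-wide--saved-state nil
  "State object returned by `buffer-local-set-state'.")

(define-minor-mode my-wide-mode
  "Toggle a wider fill column in the current buffer."
  :lighter " Wide"
  (if my-wide-mode
      ;; Set the variables buffer-locally and remember how to undo it.
      (setq my-wide--saved-state
            (buffer-local-set-state fill-column 100
                                    truncate-lines t))
    ;; Put the variables back the way they were before enabling.
    (buffer-local-restore-state my-wide--saved-state)))
```

Keeping the returned object in a buffer-local variable lets each buffer restore its own earlier values.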
Mode Line Format
Each Emacs window (aside from minibuffer windows) typically has a mode line at the bottom, which displays status information about the buffer displayed in the window. The mode line contains information about the buffer, such as its name, associated file, depth of recursive editing, and major and minor modes. A window can also have a header line and a tab line, which are much like the mode line but they appear at the top of the window. This section describes how to control the contents of the mode line, header line, and tab line. We include it in this chapter because much of the information displayed in the mode line relates to the enabled major and minor modes.
Mode Line Basics
The contents of each mode line are specified by the buffer-local variable mode-line-format (Mode Line Top). This variable holds a mode line construct: a template that controls what is displayed on the buffer's mode line. The value of header-line-format and tab-line-format specifies the buffer's header line and tab line in the same way. All windows for the same buffer use the same mode-line-format, header-line-format, and tab-line-format unless a mode-line-format, header-line-format, or tab-line-format parameter has been specified for that window (Window Parameters). For efficiency, Emacs does not continuously recompute each window's mode line and header line. It does so when circumstances appear to call for it: for instance, if you change the window configuration, switch buffers, narrow or widen the buffer, scroll, or modify the buffer. If you alter any of the variables referenced by mode-line-format or header-line-format (Mode Line Variables), or any other data structures that affect how text is displayed (Display), you should use the function force-mode-line-update to update the display.
-
force-mode-line-update - This function forces Emacs to update the current buffer's mode line and header line, based on the latest values of all relevant variables, during its next redisplay cycle. If the optional argument all is non-nil, it forces an update for all mode lines and header lines. This function also forces an update of the menu bar and frame title.
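A typical calling pattern might look like the sketch below: a variable that some mode line construct refers to is changed, and the update is then requested explicitly. The names my-job-status and my-job-set-status are hypothetical, and the sketch assumes the variable has been made part of the mode line elsewhere.

```elisp
;; `my-job-status' and `my-job-set-status' are hypothetical names.
(defvar-local my-job-status ""
  "Mode line text describing a (hypothetical) background job.")

(defun my-job-set-status (text)
  "Show TEXT as the job status in the current buffer's mode line."
  (setq my-job-status (format " [%s]" text))
  ;; Changing the variable does not by itself cause the mode line
  ;; to be recomputed, so request an update explicitly.
  (force-mode-line-update))
```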
The selected window's mode line is usually displayed in a different color using the face mode-line-active. Other windows' mode lines appear in the face mode-line-inactive instead. Faces.
-
mode-line-window-selected-p - If you want to have more extensive differences between the mode lines in selected and non-selected windows, you can use this predicate in an :eval construct. For instance, if you want to display the buffer name in bold in selected windows, but in italics in the other windows, you can say something like:
(setq-default
 mode-line-buffer-identification
 '(:eval (propertize "%12b"
                     'face (if (mode-line-window-selected-p)
                               'bold
                             'italic))))
The Data Structure of the Mode Line
The mode line contents are controlled by a data structure called a mode line construct, made up of lists, strings, symbols, and numbers kept in buffer-local variables. Each data type has a specific meaning for the mode line appearance, as described below. The same data structure is used for constructing frame titles (Frame Titles), header lines (Header Lines), and tab lines (Tab Lines). A mode line construct may be as simple as a fixed string of text, but it usually specifies how to combine fixed strings with variables' values to construct the text. Many of these variables are themselves defined to have mode line constructs as their values. Here are the meanings of various data types as mode line constructs:
-
STRING - A string as a mode line construct appears verbatim except for %-constructs in it. These stand for substitution of other data; see %-Constructs. If parts of the string have face properties, they control display of the text just as they would text in the buffer. Any characters which have no face properties are displayed, by default, in the face mode-line or mode-line-inactive (Standard Faces). The help-echo and keymap properties in string have special meanings. Properties in Mode.
-
SYMBOL - A symbol as a mode line construct stands for its value. The value of symbol is used as a mode line construct, in place of symbol. However, the symbols t and nil are ignored, as is any symbol whose value is void. There is one exception: if the value of symbol is a string, it is displayed verbatim: the %-constructs are not recognized. Unless symbol is marked as risky (i.e., it has a non-nil risky-local-variable property), all text properties specified in symbol's value are ignored. This includes the text properties of strings in symbol's value, as well as all :eval and :propertize forms in it. (The reason for this is security: non-risky variables could be set automatically from file variables without prompting the user.)
-
(STRING REST...), (LIST REST...) - A list whose first element is a string or list means to process all the elements recursively and concatenate the results. This is the most common form of mode line construct. (Note that text properties are handled specially (for reasons of efficiency) when displaying strings in the mode line: only the text properties on the first character of the string are considered, and they are then used over the entire string. If you need a string with different text properties, you have to use the special :propertize mode line construct.)
-
(:eval FORM) - A list whose first element is the symbol :eval says to evaluate form, and use the result as a string to display. Make sure this evaluation neither loads any files nor calls functions like posn-at-point or window-in-direction, which themselves evaluate the mode line, as doing so could cause infinite recursion.
-
(:propertize ELT PROPS...) - A list whose first element is the symbol :propertize says to process the mode line construct elt recursively, then add the text properties specified by props to the result. The argument props should consist of zero or more pairs text-property value. If elt is or produces a string with text properties, all the characters of that string should have the same properties, or else some of them might be removed by :propertize.
-
(SYMBOL THEN ELSE) - A list whose first element is a symbol that is not a keyword specifies a conditional. Its meaning depends on the value of symbol. If symbol has a non-nil value, the second element, then, is processed recursively as a mode line construct. Otherwise, the third element, else, is processed recursively. You may omit else; then the mode line construct displays nothing if the value of symbol is nil or void.
-
(WIDTH REST...) - A list whose first element is an integer specifies truncation or padding of the results of rest. The remaining elements rest are processed recursively as mode line constructs and concatenated together. When width is positive, the result is space filled on the right if its width is less than width. When width is negative, the result is truncated on the right to -width columns if its width exceeds -width. For example, the usual way to show what percentage of a buffer is above the top of the window is to use a list like this:
(-3 "%p").
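The element types listed above can be combined freely. The following sketch puts several of them into one construct and uses format-mode-line to turn it into text. The names my-ml-flag, my-ml-construct, and my-ml-preview are hypothetical.

```elisp
;; All names here are hypothetical.
(defvar my-ml-flag nil
  "When non-nil, the conditional element below uses its THEN branch.")

(defvar my-ml-construct
  '("Buf: "                        ; STRING
    (10 "%b")                      ; (WIDTH REST...): pad to 10 columns
    (my-ml-flag " [on]" " [off]")  ; (SYMBOL THEN ELSE)
    (:eval (format " %d chars" (buffer-size)))  ; (:eval FORM)
    (:propertize " demo" face bold))            ; (:propertize ELT PROPS...)
  "A mode line construct that combines several element types.")

(defun my-ml-preview ()
  "Return the text `my-ml-construct' produces for the current buffer."
  (format-mode-line my-ml-construct))
```

Here the list is passed to format-mode-line directly. If my-ml-construct were instead referenced by name from mode-line-format, its :eval and :propertize elements would be ignored unless the symbol is marked as risky, as described under SYMBOL above.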
The Top Level of Mode Line Control
The variable in overall control of the mode line is mode-line-format.
-
mode-line-format - The value of this variable is a mode line construct that controls the contents of the mode-line. It is always buffer-local in all buffers. If you set this variable to nil in a buffer, that buffer does not have a mode line. (A window that is just one line tall also does not display a mode line.)
The default value of mode-line-format is designed to use the values of other variables such as mode-line-position and mode-line-modes (which in turn incorporates the values of the variables mode-name and minor-mode-alist). Very few modes need to alter mode-line-format itself. For most purposes, it is sufficient to alter some of the variables that mode-line-format either directly or indirectly refers to. If you do alter mode-line-format itself, the new value should use the same variables that appear in the default value (Mode Line Variables), rather than duplicating their contents or displaying the information in another fashion. This way, customizations made by the user or by Lisp programs (such as display-time and major modes) via changes to those variables remain effective. Here is a hypothetical example of a mode-line-format that might be useful for Shell mode (in reality, Shell mode does not set mode-line-format):
(setq mode-line-format
      (list "-"
            'mode-line-mule-info
            'mode-line-modified
            'mode-line-frame-identification
            "%b--"
            ;; Note that this is evaluated while making the list.
            ;; It makes a mode line construct which is just a string.
            (getenv "HOST")
            ":"
            'default-directory
            "   "
            'global-mode-string
            "   %[("
            '(:eval (format-time-string "%F"))
            'mode-line-process
            'minor-mode-alist
            "%n"
            ")%]--"
            '(which-function-mode ("" which-func-format "--"))
            '(line-number-mode "L%l--")
            '(column-number-mode "C%c--")
            '(-3 "%p")))
(The variables line-number-mode, column-number-mode and which-function-mode enable particular minor modes; as usual, these variable names are also the minor mode command names.)
Variables Used in the Mode Line
This section describes variables incorporated by the standard value of mode-line-format into the text of the mode line. There is nothing inherently special about these variables; any other variables could have the same effects on the mode line if the value of mode-line-format is changed to use them. However, various parts of Emacs set these variables on the understanding that they will control parts of the mode line; therefore, practically speaking, it is essential for the mode line to use them. Also see Optional Mode Line.
-
mode-line-mule-info - This variable holds the value of the mode line construct that displays information about the language environment, buffer coding system, and current input method. Non-ASCII Characters.
-
mode-line-modified - This variable holds the value of the mode line construct that displays whether the current buffer is modified. Its default value displays ** if the buffer is modified, -- if the buffer is not modified, %% if the buffer is read only, and %* if the buffer is read only and modified. Changing this variable does not force an update of the mode line.
-
mode-line-frame-identification - This variable identifies the current frame. Its default value displays " " if you are using a window system which can show multiple frames, or "-%F " on an ordinary terminal which shows only one frame at a time.
-
mode-line-buffer-identification - This variable identifies the buffer being displayed in the window. Its default value displays the buffer name, padded with spaces to at least 12 columns.
-
mode-line-position - This variable indicates the position in the buffer. Its default value displays the buffer percentage and, optionally, the buffer size, the line number and the column number.
-
mode-line-percent-position - This option is used in mode-line-position. Its value specifies both the buffer percentage to display (one of nil, "%o", "%p", "%P" or "%q", %-Constructs) and a width to space-fill or truncate to. You are recommended to set this option with the customize-variable facility.
-
vc-mode - The variable vc-mode, buffer-local in each buffer, records whether the buffer's visited file is maintained with version control, and, if so, which kind. Its value is a string that appears in the mode line, or nil for no version control.
-
mode-line-modes - This variable displays the buffer's major and minor modes. Its default value also displays the recursive editing level, information on the process status, and whether narrowing is in effect.
-
mode-line-remote - This variable is used to show whether default-directory for the current buffer is remote.
-
mode-line-client - This variable is used to identify emacsclient frames.
-
mode-line-format-right-align - Anything following this symbol in mode-line-format will be right-aligned.
-
mode-line-right-align-edge - This variable controls exactly where mode-line-format-right-align aligns content to.
The following three variables are used in mode-line-modes:
-
mode-name - This buffer-local variable holds the "pretty" name of the current buffer's major mode. Each major mode should set this variable so that the mode name will appear in the mode line. The value does not have to be a string, but can use any of the data types valid in a mode-line construct (Mode Line Data). To compute the string that will identify the mode name in the mode line, use format-mode-line (Emulating Mode Line).
-
mode-line-process - This buffer-local variable contains the mode line information on process status in modes used for communicating with subprocesses. It is displayed immediately following the major mode name, with no intervening space. For example, its value in the *shell* buffer is (":%s"), which allows the shell to display its status along with the major mode as: (Shell:run). Normally this variable is nil.
-
mode-line-front-space - This variable is displayed at the front of the mode line. By default, this construct is displayed right at the beginning of the mode line, except that if there is a memory-full message, it is displayed first.
-
mode-line-end-spaces - This variable is displayed at the end of the mode line.
-
mode-line-misc-info - Mode line construct for miscellaneous information. By default, this shows the information specified by global-mode-string.
-
mode-line-position-line-format - The format used to display line numbers when line-number-mode (Optional Mode Line) is switched on. %l in the format will be replaced with the line number.
-
mode-line-position-column-format - The format used to display column numbers when column-number-mode (Optional Mode Line) is switched on. %c in the format will be replaced with a zero-based column number, and %C will be replaced with a one-based column number.
-
mode-line-position-column-line-format - The format used to display line and column numbers when both line-number-mode and column-number-mode are switched on. See the previous two variables for the meaning of the %l, %c and %C format specs.
-
minor-mode-alist - This variable holds an association list whose elements specify how the mode line should indicate that a minor mode is active. Each element of the minor-mode-alist should be a two-element list:
(MINOR-MODE-VARIABLE MODE-LINE-STRING)
More generally, mode-line-string can be any mode line construct. It appears in the mode line when the value of minor-mode-variable is non-nil, and not otherwise. These strings should begin with spaces so that they don't run together. Conventionally, the minor-mode-variable for a specific mode is set to a non-nil value when that minor mode is activated. minor-mode-alist itself is not buffer-local. Each variable mentioned in the alist should be buffer-local if its minor mode can be enabled separately in each buffer.
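For a minor mode written without define-minor-mode, an entry could be added by hand along the following lines. The name my-mark-mode is hypothetical, and the indicator string begins with a space as recommended above.

```elisp
;; `my-mark-mode' is a hypothetical, hand-written minor mode variable.
(defvar-local my-mark-mode nil
  "Non-nil if the (hypothetical) My-Mark minor mode is enabled.")

;; Add (MINOR-MODE-VARIABLE MODE-LINE-STRING) once, even if this file
;; is loaded several times.
(unless (assq 'my-mark-mode minor-mode-alist)
  (push '(my-mark-mode " Mark") minor-mode-alist))
```

Modes defined with define-minor-mode and a :lighter argument should not need this step.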
-
global-mode-string - This variable holds a mode line construct that, by default, appears in the mode line as part of mode-line-misc-info, just after the which-function-mode information if that minor mode is enabled, else after mode-line-modes. Elements that are added to this construct should normally end in a space (to ensure that consecutive global-mode-string elements display properly). For instance, the command display-time sets global-mode-string to refer to the variable display-time-string, which holds a string containing the time and load information. The %M construct substitutes the value of global-mode-string. This construct is not used by the default mode line, as the variable itself is used in mode-line-misc-info.
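A Lisp program could add its own element to global-mode-string roughly as sketched here. The names beginning with my-build-status are hypothetical, and the sketch assumes global-mode-string holds a list (or nil) when the install function runs.

```elisp
;; Names beginning with `my-build-status' are hypothetical.
(defvar my-build-status ""
  "Status text for the mode line; ends in a space when non-empty.")

(defun my-build-status-install ()
  "Make `my-build-status' part of `global-mode-string'."
  ;; Keep the value a list whose first element is a string, so that
  ;; its elements are processed and concatenated in order.
  (or global-mode-string (setq global-mode-string '("")))
  (unless (memq 'my-build-status global-mode-string)
    (setq global-mode-string
          (append global-mode-string '(my-build-status)))))

(defun my-build-status-set (text)
  "Display TEXT, followed by a space, in all mode lines."
  (setq my-build-status (concat text " "))
  (force-mode-line-update t))
```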
Here is a simplified version of the default value of mode-line-format. The real default value also specifies addition of text properties.
("-"
 mode-line-mule-info
 mode-line-modified
 mode-line-frame-identification
 mode-line-buffer-identification
 "   "
 mode-line-position
 (vc-mode vc-mode)
 "   "
 mode-line-modes
 (which-function-mode ("" which-func-format "--"))
 (global-mode-string ("--" global-mode-string))
 "-%-")
%-Constructs in the Mode Line
Strings used as mode line constructs can use certain %-constructs to substitute various kinds of data. The following is a list of the defined %-constructs, and what they mean. In any construct except %%, you can add a decimal integer after the % to specify a minimum field width. If the width is less, the field is padded to that width. Purely numeric constructs (c, i, I, and l) are padded by inserting spaces to the left, and others are padded by inserting spaces to the right.
-
%b - The current buffer name, obtained with the buffer-name function. Buffer Names.
-
%c - The current column number of point, counting from zero starting at the left margin of the window.
-
%C - The current column number of point, counting from one starting at the left margin of the window.
-
%e - When Emacs is nearly out of memory for Lisp objects, a brief message saying so. Otherwise, this is empty.
-
%f - The visited file name, obtained with the buffer-file-name function. Buffer File Name.
-
%F - The title (only on a window system) or the name of the selected frame. Basic Parameters.
-
%i - The size of the accessible part of the current buffer; basically (- (point-max) (point-min)).
-
%I - Like %i, but the size is printed in a more readable way by using k for 10^3, M for 10^6, G for 10^9, etc., to abbreviate.
-
%l - The current line number of point, counting within the accessible portion of the buffer.
-
%M - The value of global-mode-string (which is part of mode-line-misc-info by default).
-
%n - Narrow when narrowing is in effect; nothing otherwise (see narrow-to-region in Narrowing).
-
%o - The degree of travel of the window through (the visible portion of) the buffer, i.e. the size of the text above the top of the window expressed as a percentage of all the text outside the window, or Top, Bottom or All.
-
%p - The percentage of the buffer text above the top of the window, or Top, Bottom or All. Note that the default mode line construct truncates this to three characters.
-
%P - The percentage of the buffer text that is above the bottom of the window (which includes the text visible in the window, as well as the text above the top), plus Top if the top of the buffer is visible on screen; or Bottom or All.
-
%q - The percentages of text above both the top and the bottom of the window, separated by -, or All.
-
%s - The status of the subprocess belonging to the current buffer, obtained with process-status. Process Information.
-
%z - The mnemonics of keyboard, terminal, and buffer coding systems.
-
%Z - Like %z, but including the end-of-line format.
-
%& - * if the buffer is modified, and - otherwise.
-
%* - % if the buffer is read only (see buffer-read-only); * if the buffer is modified (see buffer-modified-p); - otherwise. Buffer Modification.
-
%+ - * if the buffer is modified (see buffer-modified-p); % if the buffer is read only (see buffer-read-only); - otherwise. This differs from %* only for a modified read-only buffer. Buffer Modification.
-
%@ - @ if the buffer's default-directory (File Name Expansion) is on a remote machine, and - otherwise.
-
%[ - An indication of the depth of recursive editing levels (not counting minibuffer levels): one [ for each editing level. Recursive Editing.
-
%] - One ] for each recursive editing level (not counting minibuffer levels).
-
%- - Dashes sufficient to fill the remainder of the mode line.
-
%% - The character %; this is how to include a literal % in a string in which %-constructs are allowed.
Obsolete %-Constructs
The following constructs should no longer be used.
-
%m - Obsolete; use the mode-name variable instead. The %m construct is inadequate, as it produces an empty string if the value of mode-name is a non-string mode-line construct (as in emacs-lisp-mode, for example).
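As a brief illustration of the non-obsolete %-constructs and their field widths, the sketch below builds a header line from a single string. The function name my-show-position-header is hypothetical; a header line is used here so that the buffer's mode line is left alone.

```elisp
(defun my-show-position-header ()
  "Show buffer name, size and position in the header line.
This is a hypothetical helper used only for illustration."
  (setq header-line-format
        ;; %12b  buffer name, padded on the right to 12 columns
        ;; %I    size of the accessible portion, abbreviated
        ;; %4l   line number, padded on the left to 4 columns
        ;; %c    zero-based column number
        ;; %p    percentage of the buffer above the top of the window
        "%12b  size %I  L%4l C%c  %p"))
```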
Properties in the Mode Line
Certain text properties are meaningful in the mode line. The face property affects the appearance of text; the help-echo property associates help strings with the text, and keymap can make the text mouse-sensitive. There are four ways to specify text properties for text in the mode line:
- Put a string with a text property directly into the mode line data structure, but see Mode Line Data for caveats for that.
- Put a text property on a mode line %-construct such as %12b; then the expansion of the %-construct will have that same text property.
- Use a (:propertize ELT PROPS...) construct to give elt a text property specified by props.
- Use a list containing :eval FORM in the mode line data structure, and make form evaluate to a string that has a text property.
You can use the keymap property to specify a keymap. This keymap only takes real effect for mouse clicks; binding character keys and function keys to it has no effect, since it is impossible to move point into the mode line. When the mode line refers to a variable which does not have a non-nil risky-local-variable property, any text properties given or specified within that variable's values are ignored. This is because such properties could otherwise specify functions to be called, and those functions could come from file local variables.
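The sketch below combines these properties in one mode line element that reacts to a mouse click. The names my-segment, my-segment-map, and my-segment-clicked are hypothetical. Marking the variable as risky is what allows its properties to take effect when the mode line refers to it by name.

```elisp
;; All `my-segment' names are hypothetical.
(defun my-segment-clicked ()
  "Report that the mode line segment was clicked."
  (interactive)
  (message "Mode line segment clicked"))

(defvar my-segment-map
  (let ((map (make-sparse-keymap)))
    ;; Mouse clicks on the mode line are looked up under the
    ;; `mode-line' prefix.
    (define-key map [mode-line mouse-1] #'my-segment-clicked)
    map)
  "Keymap for mouse clicks on `my-segment'.")

(defvar my-segment
  `(:propertize " Click"
                face bold
                help-echo "mouse-1: run my-segment-clicked"
                keymap ,my-segment-map)
  "Mode line construct with face, help-echo and keymap properties.")

;; Without this property, the text properties above are ignored when
;; the mode line refers to `my-segment' by name.
(put 'my-segment 'risky-local-variable t)
```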
Window Header Lines
A window can have a header line at the top, just as it can have a mode line at the bottom. The header line feature works just like the mode line feature, except that it's controlled by header-line-format:
-
header-line-format - This variable, local in every buffer, specifies how to display the header line, for windows displaying the buffer. The format of the value is the same as for mode-line-format (Mode Line Data). It is normally nil, so that ordinary buffers have no header line.
If display-line-numbers-mode is turned on in a buffer (display-line-numbers-mode), the buffer text is indented on display by the amount of screen space needed to show the line numbers. By contrast, text of the header line is not automatically indented, because a header line never displays a line number, and because the text of the header line is not necessarily directly related to buffer text below it. If a Lisp program needs the header-line text to be aligned with buffer text (for example, if the buffer displays columnar data, like tabulated-list-mode does, Tabulated List Mode), it should turn on the minor mode header-line-indent-mode.
-
Command header-line-indent-mode - This buffer-local minor mode tracks the changes of the width of the line-number display on screen (which may vary depending on the range of line numbers shown in the window), and allows Lisp programs to arrange that header-line text is always aligned with buffer text when the line-number width changes. Such Lisp programs should turn on this mode in the buffer, and use the variables header-line-indent and header-line-indent-width in the header-line-format to ensure it is adjusted to the text indentation at all times.
-
header-line-indent - This variable's value is a whitespace string whose width is kept equal to the current width of line-numbers on display, provided that header-line-indent-mode is turned on in the buffer shown in the window. The number of spaces is calculated under the assumption that the face of the header-line text uses the same font, including size, as the frame's default font; if that assumption is false, use header-line-indent-width, described below, instead. This variable is intended to be used in simple situations where the header-line text needs to be indented as a whole to be realigned with buffer text, by prepending this variable's value to the actual header-line text. For example, the following definition of header-line-format:
(setq header-line-format
      `("" header-line-indent ,my-header-line))
where my-header-line is the format string that produces the actual text of the header line, will make sure the header-line text is always indented like the buffer text below it.
-
header-line-indent-width - This variable's value is kept updated to provide the current width, in units of the frame's canonical character width, used for displaying the line numbers, provided that header-line-indent-mode is turned on in the buffer shown in the window. It can be used for aligning the header-line text with the buffer text when header-line-indent is not flexible enough. For example, if the header line uses a font whose metrics are different from the default face's font, your Lisp program can calculate the width of line-number display in pixels, by multiplying the value of this variable by the value returned by frame-char-width (Frame Font), and then use the result to align header-line text using the :align-to display property spec (Specified Space) in pixels on the relevant parts of header-line-format.
-
window-header-line-height &optional window - This function returns the height in pixels of window's header line. window must be a live window, and defaults to the selected window.
A window that is just one line tall never displays a header line. A window that is two lines tall cannot display both a mode line and a header line at once; if it has a mode line, then it does not display a header line.
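The pixel-based alignment described for header-line-indent-width can be sketched roughly as follows. This is only a minimal sketch, assuming an Emacs recent enough to provide header-line-indent-mode; the names my-header-line-pad and my-header-line-setup and the heading text are hypothetical, chosen for illustration.

```elisp
(require 'display-line-numbers)   ; provides header-line-indent-mode

(defun my-header-line-pad ()
  "Return a space stretched to the pixel width of the line-number display.
Hypothetical helper: multiplies `header-line-indent-width' (canonical
character units) by `frame-char-width' to obtain a width in pixels."
  (propertize " " 'display
              `(space :align-to (,(* header-line-indent-width
                                     (frame-char-width))))))

(defun my-header-line-setup ()
  "Hypothetical setup: keep a header line aligned with the buffer text."
  (header-line-indent-mode 1)
  (setq header-line-format
        ;; The pad is recomputed whenever the header line is redrawn.
        '((:eval (my-header-line-pad)) "Name  Size")))
```

When the header line uses the default font, prepending header-line-indent as shown earlier is simpler; the :eval form is only worth its cost when the fonts differ.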
Window Tab Lines
A window can have a tab line at the top. If both the tab line and header line are visible, the tab line appears above the header line. The tab line feature works just like the mode line feature, except that it's controlled by tab-line-format. Unlike the mode line, the tab line is only expected to be used to display a list of tabs (Tab Line) or the window tool bar (Window Tool Bar):
-
tab-line-format - This variable, local in every buffer, specifies how to display the tab line, for windows displaying the buffer. The format of the value is the same as for mode-line-format (Mode Line Data). It is normally nil, so that ordinary buffers have no tab line.
-
window-tab-line-height &optional window - This function returns the height in pixels of window's tab line. window must be a live window, and defaults to the selected window.
Emulating Mode Line Formatting
You can use the function format-mode-line to compute the text that would appear in a mode line or header line based on a certain mode line construct.
-
format-mode-line format &optional face window buffer - This function formats a line of text according to format as if it were generating the mode line for window, but it also returns the text as a string. The argument window defaults to the selected window. If buffer is non-nil, all the information used is taken from buffer; by default, it comes from window's buffer. The value string normally has text properties that correspond to the faces, keymaps, etc., that the mode line would have. Any character for which no face property is specified by format gets a default value determined by face. If face is t, that stands for mode-line if window is selected, and mode-line-inactive otherwise. If face is nil or omitted, that stands for the default face. If face is an integer, the value returned by this function will have no text properties. You can also specify other valid faces as the value of face. If specified, that face provides the face property for characters whose face is not specified by format. Note that using mode-line, mode-line-inactive, or header-line as face will actually redisplay the mode line or the header line, respectively, using the current definitions of the corresponding face, in addition to returning the formatted string. (Other faces do not cause redisplay.) For example, (format-mode-line header-line-format) returns the text that would appear in the selected window's header line ("" if it has no header line). (format-mode-line header-line-format 'header-line) returns the same text, with each character carrying the face that it will have in the header line itself, and also redraws the header line.
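As an illustration of the face argument, the following sketch wraps format-mode-line so that it returns plain text. The wrapper name my-mode-line-text is hypothetical; it relies only on the behavior described above, where an integer face yields a string without text properties.

```elisp
(defun my-mode-line-text (construct)
  "Return the text CONSTRUCT would produce for the selected window.
Hypothetical helper: passing an integer as FACE (here 0) makes
`format-mode-line' return a string without text properties."
  (format-mode-line construct 0))

;; Usage: the buffer name followed by the major mode name in brackets.
;; (my-mode-line-text '("%b" " [" mode-name "]"))
```

Because the integer face does not name mode-line or header-line, this call does not trigger the redisplay mentioned above.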
Outline Minor Mode
Outline minor mode is a buffer-local minor mode that hides parts of the buffer and leaves only heading lines visible. This minor mode can be used in conjunction with other major modes (Outline Minor Mode). There are two ways to define which lines are headings: with the variable outline-regexp or outline-search-function.
-
outline-regexp - This variable is a regular expression. Any line whose beginning has a match for this regexp is considered a heading line. Matches that start within a line (not at the left margin) do not count.
-
outline-search-function - Alternatively, when it's impossible to create a regexp that matches heading lines, you can define a function that helps Outline minor mode to find heading lines. The variable outline-search-function specifies the function with four arguments: bound, move, backward, and looking-at. The function completes two tasks: to match the current heading line, and to find the next or the previous heading line. If the argument looking-at is non-nil, it should return non-nil when point is at the beginning of the outline header line. If the argument looking-at is nil, the first three arguments are used. The argument bound is a buffer position that bounds the search. The match found must not end after that position. A value of nil means search to the end of the accessible portion of the buffer. If the argument move is non-nil, the failed search should move to the limit of search and return nil. If the argument backward is non-nil, this function should search for the previous heading backward.
-
outline-level - This variable is a function that takes no arguments and should return the level of the current heading. It's required in both cases: whether you define outline-regexp or outline-search-function.
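The regexp-based approach could be set up along these lines. This is a minimal sketch for an imagined notes format in which heading depth is the number of leading asterisks; the function names are hypothetical, and the level function assumes (as outline.el documents for outline-level) that the match data reflects a match of outline-regexp at the heading.

```elisp
(require 'outline)

(defun my-notes-outline-level ()
  "Return the depth of the current heading: its number of leading asterisks.
Assumes the match data describes a match of `outline-regexp'."
  (- (match-end 0) (match-beginning 0)))

(defun my-notes-outline-setup ()
  "Hypothetical setup: treat lines that start with asterisks as headings."
  (setq-local outline-regexp "\\*+")
  (setq-local outline-level #'my-notes-outline-level)
  (outline-minor-mode 1))
```

A mode would typically call such a setup function from its mode command or let users add it to the mode hook.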
If built with tree-sitter, Emacs can automatically use Outline minor mode if the major mode sets one of the following variables.
-
treesit-outline-predicate - This variable instructs Emacs how to find lines with outline headings. It should be a predicate that matches the node on the heading line.
-
treesit-aggregated-outline-predicate - This variable allows major modes to configure outlines for multiple languages. Its value is an alist mapping language symbols to outline headings as described above for the value of treesit-outline-predicate. If this variable is non-nil, it overrides treesit-outline-predicate for setting up outline headings.
Font Lock Mode
Font Lock mode is a buffer-local minor mode that automatically attaches face properties to certain parts of the buffer based on their syntactic role. How it parses the buffer depends on the major mode; most major modes define syntactic criteria for which faces to use in which contexts. This section explains how to customize Font Lock for a particular major mode. Font Lock mode finds text to highlight in three ways: through parsing based on a full-blown parser (usually, via an external library or program), through syntactic parsing based on Emacs's built-in syntax table, or through searching (usually for regular expressions). If enabled, parser-based fontification happens first (Parser-based Font Lock). Syntactic fontification happens next; it finds comments and string constants and highlights them. Search-based fontification happens last.
Font Lock Basics
The Font Lock functionality is based on several basic functions. Each of these calls the function specified by the corresponding variable. This indirection allows major and minor modes to modify the way fontification works in the buffers of that mode, and even use the Font Lock mechanisms for features that have nothing to do with fontification. (This is why the description below says "should" when it describes what the functions do: the mode can customize the values of the corresponding variables to do something entirely different.) The variables mentioned below are described in Other Font Lock Variables.
-
font-lock-fontify-buffer - This function should fontify the current buffer's accessible portion, by calling the function specified by font-lock-fontify-buffer-function.
-
font-lock-unfontify-buffer - Used when turning Font Lock off to remove the fontification. Calls the function specified by font-lock-unfontify-buffer-function.
-
font-lock-fontify-region beg end &optional loudly - Should fontify the region between beg and end. If loudly is non-nil, should display status messages while fontifying. Calls the function specified by font-lock-fontify-region-function.
-
font-lock-unfontify-region beg end - Should remove fontification from the region between beg and end. Calls the function specified by font-lock-unfontify-region-function.
-
font-lock-flush &optional beg end - This function should mark the fontification of the region between beg and end as outdated. If not specified or nil, beg and end default to the beginning and end of the buffer's accessible portion. Calls the function specified by font-lock-flush-function.
-
font-lock-ensure &optional beg end - This function should make sure the region between beg and end has been fontified. The optional arguments beg and end default to the beginning and the end of the buffer's accessible portion. Calls the function specified by font-lock-ensure-function.
-
font-lock-debug-fontify - This is a convenience command meant to be used when developing font locking for a mode, and should not be called from Lisp code. It recomputes all the relevant variables and then calls font-lock-fontify-region on the entire buffer.
There are several variables that control how Font Lock mode highlights text. But major modes should not set any of these variables directly. Instead, they should set font-lock-defaults as a buffer-local variable. The value assigned to this variable is used, if and when Font Lock mode is enabled, to set all the other variables.
-
font-lock-defaults - This variable is set by modes to specify how to fontify text in that mode. It automatically becomes buffer-local when set. If its value is
nil, Font Lock mode does no highlighting. If non-nil, the value should look like this:
(KEYWORDS [KEYWORDS-ONLY [CASE-FOLD [SYNTAX-ALIST OTHER-VARS...]]])
The first element, keywords, indirectly specifies the value of font-lock-keywords, which directs search-based fontification. It can be a symbol, a variable or a function whose value is the list to use for font-lock-keywords. It can also be a list of several such symbols, one for each possible level of fontification. The first symbol specifies the mode default level of fontification, the next symbol level 1 fontification, the next level 2, and so on. The mode default level is normally the same as level 1. It is used when font-lock-maximum-decoration has a nil value. See Levels of Font Lock.
The second element, keywords-only, specifies the value of the variable font-lock-keywords-only. If this is omitted or nil, syntactic fontification (of strings and comments) is also performed. If this is non-nil, syntactic fontification is not performed. See Syntactic Font Lock.
The third element, case-fold, specifies the value of font-lock-keywords-case-fold-search. If it is non-nil, Font Lock mode ignores case during search-based fontification.
If the fourth element, syntax-alist, is non-nil, it should be a list of cons cells of the form (CHAR-OR-STRING . STRING). These are used to set up a syntax table during fontification; the resulting syntax table is stored in font-lock-syntax-table. If syntax-alist is omitted or nil, fontification uses the syntax table returned by the syntax-table function. See Syntax Table Functions.
The most common uses of syntax-alist simply change the syntax of a few chars from symbol constituent to word constituent so that the fontification rules can use regexp operators based on word boundaries. When syntax-alist describes more intrusive changes that can change what is recognized as a string or a comment, this prevents an important optimization in syntactic fontification, so it's better to either refrain from using such settings or to additionally set syntax-ppss-table to a non-nil value, which takes precedence during syntactic fontification.
All the remaining elements (if any) are collectively called other-vars. Each of these elements should have the form (VARIABLE . VALUE), which means: make variable buffer-local and then set it to value. You can use these other-vars to set other variables that affect fontification, aside from those you can control with the first four elements. See Other Font Lock Variables. If your mode fontifies text explicitly by adding font-lock-face properties, it can specify (nil t) for font-lock-defaults to turn off all automatic fontification. However, this is not required; it is possible to fontify some things using font-lock-face properties and set up automatic fontification for other parts of the text.
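To make the layout of font-lock-defaults concrete, here is a minimal sketch of a major mode for an imagined key = value file format. The mode name my-config-mode, the keyword variable, and the two rules are hypothetical; only the shape of the font-lock-defaults value follows the description above.

```elisp
(defvar my-config-font-lock-keywords
  '(;; Boolean literals; case is ignored because CASE-FOLD is t below.
    ("\\<\\(?:true\\|false\\)\\>" . font-lock-constant-face)
    ;; The key on the left-hand side of "key = value".
    ("^\\s-*\\([A-Za-z_]+\\)\\s-*=" 1 font-lock-variable-name-face))
  "Search-based rules for the hypothetical `my-config-mode'.")

(define-derived-mode my-config-mode prog-mode "MyConfig"
  "Hypothetical major mode for key = value files."
  (setq-local font-lock-defaults
              '(my-config-font-lock-keywords ; KEYWORDS
                nil                          ; KEYWORDS-ONLY: keep syntactic pass
                t                            ; CASE-FOLD
                ((?_ . "w")))))              ; SYNTAX-ALIST: `_' is a word char
```

The mode sets only font-lock-defaults; Font Lock mode derives font-lock-keywords and the related variables from it when it is enabled.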
Search-based Fontification
The variable which directly controls search-based fontification is font-lock-keywords, which is typically specified via the keywords element in font-lock-defaults.
-
font-lock-keywords - The value of this variable is a list of the keywords to highlight. Lisp programs should not set this variable directly. Normally, the value is automatically set by Font Lock mode, using the keywords element in font-lock-defaults. The value can also be altered using the functions font-lock-add-keywords and font-lock-remove-keywords (Customizing Keywords).
Each element of font-lock-keywords specifies how to find certain cases of text, and how to highlight those cases. Font Lock mode processes the elements of font-lock-keywords one by one, and for each element, it finds and handles all matches. Ordinarily, once part of the text has been fontified already, this cannot be overridden by a subsequent match in the same text; but you can specify different behavior using the override element of a subexp-highlighter. Each element of font-lock-keywords should have one of these forms:
-
REGEXP - Highlight all matches for regexp using font-lock-keyword-face. For example,
;; Highlight occurrences of the word foo
;; using font-lock-keyword-face.
"\\<foo\\>"
Be careful when composing these regular expressions; a poorly written pattern can dramatically slow things down! The function regexp-opt (Regexp Functions) is useful for calculating optimal regular expressions to match several keywords.
-
FUNCTION - Find text by calling function, and highlight the matches it finds using font-lock-keyword-face. When function is called, it receives one argument, the limit of the search; it should begin searching at point, and not search beyond the limit. It should return non-nil if it succeeds, and set the match data to describe the match that was found. Returning nil indicates failure of the search. Fontification will call function repeatedly with the same limit, and with point where the previous invocation left it, until function fails. On failure, function need not reset point in any particular way. function can also take on the responsibility of performing the highlighting of the region between point and the limit argument it receives. In that case, it should return nil, otherwise font-lock will highlight the match described by the match data and may call the function again with the same limit.
-
(MATCHER . SUBEXP) - In this kind of element, matcher is either a regular expression or a function, as described above. The CDR, subexp, specifies which subexpression of matcher should be highlighted (instead of the entire text that matcher matched).
;; Highlight the bar in each occurrence of fubar
;; using font-lock-keyword-face.
("fu\\(bar\\)" . 1)
-
(MATCHER . FACESPEC) - In this kind of element, facespec is an expression whose value specifies the face to use for highlighting. In the simplest case, facespec is a Lisp variable (a symbol) whose value is a face name.
;; Highlight occurrences of fubar
;; using the face which is the value of fubar-face.
("fubar" . fubar-face)
However, facespec can also evaluate to a list of this form:
(subexp (face face prop1 val1 prop2 val2...))
to specify the face face and various additional text properties to put on the text that matches. If you do this, be sure to add the other text property names that you set in this way to the value of font-lock-extra-managed-props so that the properties will also be cleared out when they are no longer appropriate. Alternatively, you can set the variable font-lock-unfontify-region-function to a function that clears these properties. See Other Font Lock Variables.
-
(MATCHER . SUBEXP-HIGHLIGHTER) - In this kind of element, subexp-highlighter is a list which specifies how to highlight matches found by matcher. It has the form:
(subexp facespec [override [laxmatch]])
The CAR, subexp, is an integer specifying which subexpression of the match to fontify (0 means the entire matching text). The second subelement, facespec, is an expression whose value specifies the face, as described above. The last two values in subexp-highlighter, override and laxmatch, are optional flags. If override is t, this element can override existing fontification made by previous elements of font-lock-keywords. If it is keep, then each character is fontified if it has not been fontified already by some other element. If it is prepend, the face specified by facespec is added to the beginning of the font-lock-face property. If it is append, the face is added to the end of the font-lock-face property. If laxmatch is non-nil, it means there should be no error if there is no subexpression numbered subexp in matcher. Obviously, fontification of the subexpression numbered subexp will not occur. However, fontification of other subexpressions (and other regexps) will continue. If laxmatch is nil, and the specified subexpression is missing, then an error is signaled which terminates search-based fontification. Here are some examples of elements of this kind, and what they do:
;; Highlight occurrences of either foo or bar, using
;; foo-bar-face, even if they have already been highlighted.
;; foo-bar-face should be a variable whose value is a face.
("foo\\|bar" 0 foo-bar-face t)
;; Highlight the first subexpression within each occurrence
;; that the function fubar-match finds,
;; using the face which is the value of fubar-face.
(fubar-match 1 fubar-face)
-
(MATCHER . ANCHORED-HIGHLIGHTER) - In this kind of element, anchored-highlighter specifies how to highlight text that follows a match found by matcher. So a match found by matcher acts as the anchor for further searches specified by anchored-highlighter. anchored-highlighter is a list of the following form:
(anchored-matcher pre-form post-form subexp-highlighters...)
Here, anchored-matcher, like matcher, is either a regular expression or a function. After a match of matcher is found, point is at the end of the match. Now, Font Lock evaluates the form pre-form. Then it searches for matches of anchored-matcher and uses subexp-highlighters to highlight these. A subexp-highlighter is as described above. Finally, Font Lock evaluates post-form. The forms pre-form and post-form can be used to initialize before, and cleanup after, anchored-matcher is used. Typically, pre-form is used to move point to some position relative to the match of matcher, before starting with anchored-matcher. post-form might be used to move back, before resuming with matcher. After Font Lock evaluates pre-form, it does not search for anchored-matcher beyond the end of the line. However, if pre-form returns a buffer position that is greater than the position of point after pre-form is evaluated, then the position returned by pre-form is used as the limit of the search instead. It is generally a bad idea to return a position greater than the end of the line; in other words, the anchored-matcher search should not span lines. For example,
;; Highlight occurrences of the word item following
;; an occurrence of the word anchor (on the same line)
;; in the value of item-face.
("\\<anchor\\>" "\\<item\\>" nil nil (0 item-face))
Here, pre-form and post-form are nil. Therefore searching for item starts at the end of the match of anchor, and searching for subsequent instances of anchor resumes from where searching for item concluded.
-
(MATCHER HIGHLIGHTERS...) - This sort of element specifies several highlighter lists for a single matcher. A highlighter list can be of the type subexp-highlighter or anchored-highlighter as described above. For example,
;; Highlight occurrences of the word anchor in the value
;; of anchor-face, and subsequent occurrences of the word
;; item (on the same line) in the value of item-face.
("\\<anchor\\>" (0 anchor-face)
 ("\\<item\\>" nil nil (0 item-face)))
-
(eval . FORM) - Here form is an expression to be evaluated the first time this value of font-lock-keywords is used in a buffer. Its value should have one of the forms described in this table.
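The FUNCTION form of matcher is useful when a test cannot be expressed as a regexp. The sketch below highlights integers larger than 255; the function, the threshold, and the keyword variable are hypothetical, and the code follows the calling convention described above (search from point up to the limit, set the match data, return non-nil on success).

```elisp
(defun my-match-big-number (limit)
  "Find the next integer above 255 before LIMIT and set the match data.
Return non-nil on success, nil when no such number remains."
  (let (found)
    (while (and (not found)
                (re-search-forward "\\<[0-9]+\\>" limit t))
      ;; The match data still describes the number just found.
      (when (> (string-to-number (match-string 0)) 255)
        (setq found t)))
    found))

(defvar my-big-number-keywords
  '((my-match-big-number 0 font-lock-warning-face))
  "Hypothetical keyword list that uses a function as its matcher.")
```

Such a list can be named as the keywords element of font-lock-defaults like any other keyword list.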
Warning: Do not design an element of font-lock-keywords to match text which spans lines; this does not work reliably. For details, see Multiline Font Lock. You can use case-fold in font-lock-defaults to specify the value of font-lock-keywords-case-fold-search which says whether search-based fontification should be case-insensitive.
-
font-lock-keywords-case-fold-search - Non-nil means that regular expression matching for the sake of font-lock-keywords should be case-insensitive.
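When a facespec evaluates to a list that sets extra text properties, those properties need to be registered in font-lock-extra-managed-props, as noted above. A minimal sketch, assuming an imagined convention where bug#123 style references get a tooltip; all names here are hypothetical:

```elisp
(defvar my-bugref-font-lock-keywords
  ;; FACESPEC evaluates to (face FACE PROP VAL ...): the `link' face
  ;; plus a `help-echo' property on the whole match.
  '(("\\<bug#[0-9]+" 0 '(face link help-echo "Bug reference") t))
  "Hypothetical rule that adds a tooltip to bug references.")

(defun my-bugref-font-lock-setup ()
  "Hypothetical setup: install the rule and let Font Lock manage `help-echo'."
  (setq-local font-lock-defaults
              '(my-bugref-font-lock-keywords
                nil nil nil
                ;; OTHER-VARS entry: (VARIABLE . VALUE)
                (font-lock-extra-managed-props . (help-echo)))))
```

Registering help-echo this way means the default unfontification function clears it together with the face when the text is refontified.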
Customizing Search-Based Fontification
You can use font-lock-add-keywords to add additional search-based fontification rules to a major mode, and font-lock-remove-keywords to remove rules. You can also customize the font-lock-ignore option to selectively disable fontification rules for keywords that match certain criteria.
-
font-lock-add-keywords mode keywords &optional how - This function adds highlighting keywords, for the current buffer or for major mode mode. The argument keywords should be a list with the same format as the variable font-lock-keywords. If mode is a symbol which is a major mode command name, such as c-mode, the effect is that enabling Font Lock mode in mode will add keywords to font-lock-keywords. Calling with a non-nil value of mode is correct only in your ~/.emacs file. If mode is nil, this function adds keywords to font-lock-keywords in the current buffer. This way of calling font-lock-add-keywords is usually used in mode hook functions. By default, keywords are added at the beginning of font-lock-keywords. If the optional argument how is set, they are used to replace the value of font-lock-keywords. If how is any other non-nil value, they are added at the end of font-lock-keywords. Some modes provide specialized support you can use in additional highlighting patterns. See the variables c-font-lock-extra-types, c++-font-lock-extra-types, and java-font-lock-extra-types, for example. Warning: Major mode commands must not call font-lock-add-keywords under any circumstances, either directly or indirectly, except through their mode hooks. (Doing so would lead to incorrect behavior for some minor modes.) They should set up their rules for search-based fontification by setting font-lock-keywords through font-lock-defaults.
-
font-lock-remove-keywords mode keywords - This function removes keywords from font-lock-keywords for the current buffer or for major mode mode. As in font-lock-add-keywords, mode should be a major mode command name or nil. All the caveats and requirements for font-lock-add-keywords apply here too. The argument keywords must exactly match the one used by the corresponding font-lock-add-keywords.
For example, the following code adds two fontification patterns for C mode: one to fontify the word FIXME, even in comments, and another to fontify the words and, or and not as keywords.
(font-lock-add-keywords 'c-mode
 '(("\\<\\(FIXME\\):" 1 font-lock-warning-face prepend)
   ("\\<\\(and\\|or\\|not\\)\\>" . font-lock-keyword-face)))
This example affects only C mode proper. To add the same patterns to C mode and all modes derived from it, do this instead:
(add-hook 'c-mode-hook
          (lambda ()
            (font-lock-add-keywords nil
             '(("\\<\\(FIXME\\):" 1 font-lock-warning-face prepend)
               ("\\<\\(and\\|or\\|not\\)\\>" .
                font-lock-keyword-face)))))
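Because font-lock-remove-keywords needs exactly the keywords that were added, a common arrangement is to keep the list in a variable and use it for both calls. The following is a minimal sketch of a buffer-local minor mode built that way; my-fixme-mode and my-fixme-keywords are hypothetical names.

```elisp
(defvar my-fixme-keywords
  '(("\\<\\(FIXME\\):" 1 font-lock-warning-face prepend))
  "Rules added and removed by the hypothetical `my-fixme-mode'.")

(define-minor-mode my-fixme-mode
  "Hypothetical minor mode that highlights FIXME: markers."
  :lighter " Fixme"
  (if my-fixme-mode
      (font-lock-add-keywords nil my-fixme-keywords)
    ;; Pass the very same list so the rules are found and removed.
    (font-lock-remove-keywords nil my-fixme-keywords))
  ;; Mark existing fontification as outdated so the change shows up.
  (font-lock-flush))
```

Calling the functions with a nil mode argument keeps the change local to the buffer where the minor mode is toggled.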
-
font-lock-ignore - This option defines conditions for selectively disabling fontifications due to certain Font Lock keywords. If non-nil, its value is a list of elements of the following form:
(SYMBOL CONDITION ...)
Here, symbol is a symbol, usually a major or minor mode. The subsequent conditions of a symbol's list element will be in effect if symbol is bound and its value is non-nil. For a mode's symbol, it means that the current major mode is derived from that mode, or that minor mode is enabled in the buffer. When a condition is in effect, any fontifications caused by font-lock-keywords elements that match the condition will be disabled. Each condition can be one of the following:
-
a symbol - This condition matches any element of Font Lock keywords that references the symbol. This is usually a face, but can be any symbol referenced by an element of the font-lock-keywords list. The symbol can contain wildcards: * matches any string in the symbol's name, ? matches a single character, and [CHAR-SET], where char-set is a string of one or more characters, matches a single character from the set.
-
a string - This condition matches any element of Font Lock keywords whose matcher is a regexp which matches the string. In other words, this condition matches a Font Lock rule which highlights the string. Thus, the string could be a specific program keyword whose highlighting you want to disable.
-
(pred FUNCTION) - This condition matches any element of Font Lock keywords for which function, when called with the element as the argument, returns non-nil.
-
(not CONDITION) - This matches if condition doesn't.
-
(and CONDITION ...) - This matches if each of the conditions matches.
-
(or CONDITION ...) - This matches if at least one of the conditions matches.
-
(except CONDITION) - This condition can only be used at top level or inside an or clause. It undoes the effect of a previously matching condition on the same level.
As an example, consider the following setting:
(setq font-lock-ignore
      '((prog-mode font-lock-*-face
                   (except help-echo))
        (emacs-lisp-mode (except ";;;###autoload"))
        (whitespace-mode whitespace-empty-at-bob-regexp)
        (makefile-mode (except *))))
Line by line, this does the following:
- In all programming modes, disable fontifications due to all font-lock keywords that apply one of the standard font-lock faces (excluding strings and comments, which are covered by syntactic Font Lock).
- However, keep any keywords that add a help-echo text property.
- In Emacs Lisp mode, also keep the highlighting of autoload cookies, which would have been excluded by the first condition.
- When whitespace-mode (a minor mode) is enabled, also don't highlight an empty line at beginning of buffer.
- Finally, in Makefile mode, don't apply any conditions.
Other Font Lock Variables
This section describes additional variables that a major mode can set by means of other-vars in font-lock-defaults (Font Lock Basics).
-
font-lock-mark-block-function - If this variable is non-nil, it should be a function that is called with no arguments, to choose an enclosing range of text for refontification for the command M-x font-lock-fontify-block. The function should report its choice by placing the region around it. A good choice is a range of text large enough to give proper results, but not too large so that refontification becomes slow. Typical values are mark-defun for programming modes or mark-paragraph for textual modes.
-
font-lock-extra-managed-props - This variable specifies additional properties (other than font-lock-face) that are being managed by Font Lock mode. It is used by font-lock-default-unfontify-region, which normally only manages the font-lock-face property. If you want Font Lock to manage other properties as well, you must specify them in a facespec in font-lock-keywords as well as add them to this list. See Search-based Fontification.
-
font-lock-fontify-buffer-function - Function to use for fontifying the buffer. The default value is font-lock-default-fontify-buffer.
-
font-lock-unfontify-buffer-function - Function to use for unfontifying the buffer. This is used when turning off Font Lock mode. The default value is font-lock-default-unfontify-buffer.
-
font-lock-fontify-region-function - Function to use for fontifying a region. It should take two arguments, the beginning and end of the region, and an optional third argument verbose. If verbose is non-nil, the function should print status messages. The default value is font-lock-default-fontify-region.
-
font-lock-unfontify-region-function - Function to use for unfontifying a region. It should take two arguments, the beginning and end of the region. The default value is font-lock-default-unfontify-region.
-
font-lock-flush-function - Function to use for declaring that a region's fontification is out of date. It takes two arguments, the beginning and end of the region. The default value of this variable is font-lock-after-change-function.
-
font-lock-ensure-function - Function to use for making sure a region of the current buffer has been fontified. It is called with two arguments, the beginning and end of the region. The default value of this variable is a function that calls font-lock-default-fontify-buffer if the buffer is not fontified; the effect is to make sure the entire accessible portion of the buffer is fontified.
-
jit-lock-register function &optional contextual - This function tells Font Lock mode to run the Lisp function function any time it has to fontify or refontify part of the current buffer. It calls function before calling the default fontification functions, and gives it two arguments, start and end, which specify the region to be fontified or refontified. If function performs fontifications, it can return a list of the form (jit-lock-bounds BEG . END), to indicate the bounds of the region it actually fontified; Just-In-Time (a.k.a. "JIT") font-lock will use this information to optimize subsequent redisplay cycles and regions of buffer text it will pass to future calls to function. The optional argument contextual, if non-nil, forces Font Lock mode to always refontify a syntactically relevant part of the buffer, and not just the modified lines. This argument can usually be omitted. When Font Lock is activated in a buffer, it calls this function with a non-nil value of contextual if the value of font-lock-keywords-only (Syntactic Font Lock) is nil.
-
jit-lock-unregister function - If function was previously registered as a fontification function using jit-lock-register, this function unregisters it.
-
Command jit-lock-debug-mode - This is a minor mode whose purpose is to help in debugging code that is run by JIT font-lock. When this mode is enabled, most of the code that JIT font-lock normally runs during redisplay cycles, where Lisp errors are suppressed, is instead run by a timer. Thus, this mode allows using debugging aids such as debug-on-error (Error Debugging) and Edebug (Edebug) for finding and fixing problems in font-lock code and any other code run by JIT font-lock. Another command that could be useful when developing and debugging font-lock is font-lock-debug-fontify, see Font Lock Basics.
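A function registered with jit-lock-register might look like the sketch below, which marks tab characters in the region it is handed and reports the bounds it processed. The function names are hypothetical; the face is attached as font-lock-face, the property this chapter recommends for explicitly placed faces.

```elisp
(require 'jit-lock)

(defun my-jit-mark-tabs (start end)
  "Give every tab between START and END the `highlight' face.
Return the bounds actually handled, in the form JIT font-lock expects."
  (with-silent-modifications
    (save-excursion
      (goto-char start)
      (while (search-forward "\t" end t)
        (put-text-property (match-beginning 0) (match-end 0)
                           'font-lock-face 'highlight))))
  `(jit-lock-bounds ,start . ,end))

(defun my-tab-marker-setup ()
  "Hypothetical setup: run the tab marker whenever text is (re)fontified."
  (jit-lock-register #'my-jit-mark-tabs))

(defun my-tab-marker-teardown ()
  "Hypothetical teardown: stop running the tab marker in this buffer."
  (jit-lock-unregister #'my-jit-mark-tabs))
```

The contextual argument is omitted here because a tab's highlighting does not depend on the surrounding syntax.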
Levels of Font Lock
Some major modes offer three different levels of fontification. You can define multiple levels by using a list of symbols for keywords in font-lock-defaults. Each symbol specifies one level of fontification; it is up to the user to choose one of these levels, normally by setting font-lock-maximum-decoration (Font Lock). The chosen level's symbol value is used to initialize font-lock-keywords. Here are the conventions for how to define the levels of fontification:
- Level 1: highlight function declarations, file directives (such as include or import directives), strings and comments. The idea is speed, so only the most important and top-level components are fontified.
- Level 2: in addition to level 1, highlight all language keywords, including type names that act like keywords, as well as named constant values. The idea is that all keywords (either syntactic or semantic) should be fontified appropriately.
- Level 3: in addition to level 2, highlight the symbols being defined in function and variable declarations, and all builtin function names, wherever they appear.
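The conventions above might be followed along these lines. This is only a sketch for an imaginary language: the mode name my-demo-levels-mode, the three keyword variables, and the keyword and builtin lists are all assumptions made for illustration.

```elisp
(defconst my-demo-font-lock-keywords-1
  '(("\\_<def[ \t]+\\(\\(?:\\sw\\|\\s_\\)+\\)"
     1 font-lock-function-name-face))
  "Level 1: function declarations only.")

(defconst my-demo-font-lock-keywords-2
  (append my-demo-font-lock-keywords-1
          `((,(regexp-opt '("def" "if" "else" "return") 'symbols)
             . font-lock-keyword-face)))
  "Level 2: level 1 plus language keywords.")

(defconst my-demo-font-lock-keywords-3
  (append my-demo-font-lock-keywords-2
          `((,(regexp-opt '("print" "len") 'symbols)
             . font-lock-builtin-face)))
  "Level 3: level 2 plus builtin function names.")

(define-derived-mode my-demo-levels-mode prog-mode "MyLevels"
  "Hypothetical major mode offering three levels of fontification."
  ;; The keywords element is a list of symbols, one per level.
  (setq-local font-lock-defaults
              '((my-demo-font-lock-keywords-1
                 my-demo-font-lock-keywords-2
                 my-demo-font-lock-keywords-3))))
```

Each higher level is built by appending to the previous one, so the levels stay consistent when a rule is changed.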
Precalculated Fontification
Some major modes such as list-buffers and occur construct the buffer text programmatically. The easiest way for them to support Font Lock mode is to specify the faces of text when they insert the text in the buffer. The way to do this is to specify the faces in the text with the special text property font-lock-face (Special Properties). When Font Lock mode is enabled, this property controls the display, just like the face property. When Font Lock mode is disabled, font-lock-face has no effect on the display. It is ok for a mode to use font-lock-face for some text and also use the normal Font Lock machinery. But if the mode does not use the normal Font Lock machinery, it should not set the variable font-lock-defaults. In this case the face property will not be overridden, so using the face property could work too. However, using font-lock-face is generally preferable as it allows the user to control the fontification by toggling font-lock-mode, and lets the code work regardless of whether the mode uses Font Lock machinery or not.
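A minimal sketch of this approach, assuming a hypothetical helper my-demo-insert-report that generates the buffer text, could look like this:

```elisp
(defun my-demo-insert-report (items)
  "Insert ITEMS, a list of (NAME . STATUS) string pairs, one per line.
Hypothetical example; faces are attached with `font-lock-face' as the
text is inserted."
  (dolist (item items)
    (insert (propertize (car item)
                        'font-lock-face 'font-lock-function-name-face)
            ": "
            (propertize (cdr item)
                        'font-lock-face 'font-lock-constant-face)
            "\n")))
```

For instance, (my-demo-insert-report '(("build" . "ok") ("tests" . "failed"))) inserts two lines whose faces are shown only while Font Lock mode is enabled.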
Faces for Font Lock
Font Lock mode can highlight using any face, but Emacs defines several faces specifically for Font Lock to use to highlight text. These Font Lock faces are listed below. They can also be used by major modes for syntactic highlighting outside of Font Lock mode (Major Mode Conventions). The faces are listed with descriptions of their typical usage, and in order of greater to lesser prominence. If a mode's syntactic categories do not fit well with the usage descriptions, the faces can be assigned using the ordering as a guide.
-
font-lock-warning-face - for a construct that is peculiar (e.g., an unescaped confusable quote in an Emacs Lisp symbol like
‘foo), or that greatly changes the meaning of other text, like ;;;###autoload in Emacs Lisp and #error in C. -
font-lock-function-name-face - for the name of a function being defined or declared.
-
font-lock-function-call-face - for the name of a function being called. This face inherits, by default, from
font-lock-function-name-face. -
font-lock-variable-name-face - for the name of a variable being defined or declared.
-
font-lock-variable-use-face - for the name of a variable being referenced. This face inherits, by default, from
font-lock-variable-name-face. -
font-lock-keyword-face - for a keyword with special syntactic significance, like
for and if in C. -
font-lock-comment-face - for comments.
-
font-lock-comment-delimiter-face - for comment delimiters, like
/* and */ in C. On most terminals, this inherits from font-lock-comment-face. -
font-lock-type-face - for the names of user-defined data types.
-
font-lock-constant-face - for the names of constants, like
NULL in C. -
font-lock-builtin-face - for the names of built-in functions.
-
font-lock-preprocessor-face - for preprocessor commands. This inherits, by default, from
font-lock-builtin-face. -
font-lock-string-face - for string constants.
-
font-lock-doc-face - for documentation embedded in program code inside specially-formed comments or strings. This face inherits, by default, from
font-lock-string-face. -
font-lock-doc-markup-face - for mark-up elements in text using
font-lock-doc-face. It is typically used for the mark-up constructs in documentation embedded in program code, following conventions such as Haddock, Javadoc or Doxygen. This face inherits, by default, from font-lock-constant-face. -
font-lock-negation-char-face - for easily-overlooked negation characters.
-
font-lock-escape-face - for escape sequences in strings. This face inherits, by default, from
font-lock-regexp-grouping-backslash. Here is an example in Python, where the escape sequence \n is used: print('Hello world!\n') -
font-lock-number-face - for numbers.
-
font-lock-operator-face - for operators.
-
font-lock-property-name-face - for properties of an object, such as the declaration of fields in a struct. This face inherits, by default, from
font-lock-variable-name-face. -
font-lock-property-use-face - for properties of an object, such as use of fields in a struct. This face inherits, by default, from
font-lock-property-name-face. For example,
typedef struct
{
  int prop;
//    ^ property
} obj;

int main()
{
  obj o;
  o.prop = 3;
//  ^ property
}
-
font-lock-punctuation-face - for punctuation such as brackets and delimiters.
-
font-lock-bracket-face - for brackets (e.g.,
(), [], {}). This face inherits, by default, from font-lock-punctuation-face. -
font-lock-delimiter-face - for delimiters (e.g.,
;, :, ,). This face inherits, by default, from font-lock-punctuation-face. -
font-lock-misc-punctuation-face - for punctuation that is not a bracket or delimiter. This face inherits, by default, from
font-lock-punctuation-face.
Syntactic Font Lock
Syntactic fontification uses a syntax table (Syntax Tables) to find and highlight syntactically relevant text. If enabled, it runs prior to search-based fontification. The variable font-lock-syntactic-face-function, documented below, determines which syntactic constructs to highlight. There are several variables that affect syntactic fontification; you should set them by means of font-lock-defaults (Font Lock Basics). Whenever Font Lock mode performs syntactic fontification on a stretch of text, it first calls the function specified by syntax-propertize-function. Major modes can use this to apply syntax-table text properties to override the buffer's syntax table in special cases. Syntax Properties.
-
font-lock-keywords-only - If the value of this variable is non-
nil, Font Lock does not do syntactic fontification, only search-based fontification based on font-lock-keywords; this will usually have the effect of not fontifying comments and strings. This variable is normally set by Font Lock mode based on the keywords-only element in font-lock-defaults. If the value is nil, Font Lock will call jit-lock-register (Other Font Lock Variables) to set up for automatic refontification of buffer text following a modified line to reflect the new syntactic context due to the change. To use only syntactic fontification, both this variable and font-lock-keywords should be set to nil (Font Lock Basics). -
font-lock-syntax-table - This variable holds the syntax table to use for fontification of comments and strings. It is normally set by Font Lock mode based on the syntax-alist element in
font-lock-defaults. If this value is nil, syntactic fontification uses the buffer's syntax table (the value returned by the function syntax-table; Syntax Table Functions). -
font-lock-syntactic-face-function - If this variable is non-
nil, it should be a function to determine which face to use for a given syntactic element (a string or a comment). The function is called with one argument, the parse state at point returned by parse-partial-sexp, and should return a face. The default value returns font-lock-comment-face for comments and font-lock-string-face for strings (Faces for Font Lock). This variable is normally set through the "other" elements in font-lock-defaults:
(setq-local font-lock-defaults
            `(,python-font-lock-keywords
              nil nil nil
              (font-lock-syntactic-face-function
               . python-font-lock-syntactic-face-function)))
Multiline Font Lock Constructs
Normally, elements of font-lock-keywords should not match across multiple lines; that doesn't work reliably, because Font Lock usually scans just part of the buffer, and it can miss a multi-line construct that crosses the line boundary where the scan starts. (The scan normally starts at the beginning of a line.) Making elements that match multiline constructs work properly has two aspects: correct identification and correct rehighlighting. The first means that Font Lock finds all multiline constructs. The second means that Font Lock will correctly rehighlight all the relevant text when a multiline construct is changed; for example, if some of the text that was previously part of a multiline construct ceases to be part of it. The two aspects are closely related, and often getting one of them to work will appear to make the other also work. However, for reliable results you must attend explicitly to both aspects. There are three ways to ensure correct identification of multiline constructs:
- Add a function to font-lock-extend-region-functions that does the identification and extends the scan so that the scanned text never starts or ends in the middle of a multiline construct.
- Use the font-lock-fontify-region-function hook similarly to extend the scan so that the scanned text never starts or ends in the middle of a multiline construct.
- Somehow identify the multiline construct right when it gets inserted into the buffer (or at any point after that but before font-lock tries to highlight it), and mark it with a font-lock-multiline property, which will instruct font-lock not to start or end the scan in the middle of the construct.
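The first of these approaches could be sketched as follows, for an imaginary language whose multiline constructs never cross a blank line. The function name my-demo-extend-region-to-block is hypothetical. The sketch relies on font-lock-beg and font-lock-end being dynamically bound by Font Lock while it runs the functions in font-lock-extend-region-functions, and on the convention that such a function returns non-nil when it has changed the region.

```elisp
(require 'font-lock)

;; Bound dynamically by Font Lock around the extend-region functions.
(defvar font-lock-beg)
(defvar font-lock-end)

(defun my-demo-extend-region-to-block ()
  "Extend the region to fontify to whole blank-line-delimited blocks.
Hypothetical example.  Return non-nil if the region was changed."
  (save-excursion
    (let ((new-beg (progn
                     (goto-char font-lock-beg)
                     (forward-line 0)
                     ;; Move up to the nearest blank line or buffer start.
                     (while (and (not (bobp))
                                 (not (looking-at-p "[ \t]*$")))
                       (forward-line -1))
                     (point)))
          (new-end (progn
                     (goto-char font-lock-end)
                     (unless (bolp) (forward-line 1))
                     ;; Move down to the nearest blank line or buffer end.
                     (while (and (not (eobp))
                                 (not (looking-at-p "[ \t]*$")))
                       (forward-line 1))
                     (point))))
      (prog1 (or (< new-beg font-lock-beg) (> new-end font-lock-end))
        (setq font-lock-beg new-beg
              font-lock-end new-end)))))
```

Calling the function a second time on an already extended region returns nil, which lets Font Lock stop iterating. On buffers with very long blocks this kind of extension can make fontification slow, so a real mode would likely need a tighter notion of a construct.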
There are several ways to do rehighlighting of multiline constructs:
- Place a font-lock-multiline property on the construct. This will rehighlight the whole construct if any part of it is changed. In some cases you can do this automatically by setting the font-lock-multiline variable, which see.
- Make sure jit-lock-contextually is set and rely on it doing its job. This will only rehighlight the part of the construct that follows the actual change, and will do it after a short delay. This only works if the highlighting of the various parts of your multiline construct never depends on text in subsequent lines. Since jit-lock-contextually is activated by default, this can be an attractive solution.
- Place a jit-lock-defer-multiline property on the construct. This works only if jit-lock-contextually is used, and with the same delay before rehighlighting, but like font-lock-multiline, it also handles the case where highlighting depends on subsequent lines.
- If parsing the syntax of a construct depends on it being parsed in one single chunk, you can add the syntax-multiline text property over the construct in question. The most common use for this is when the syntax property to apply to FOO depends on some later text BAR: By placing this text property over the whole of FOO...BAR, you make sure that any change of BAR will also cause the syntax property of FOO to be recomputed. Note: For this to work, the mode needs to add syntax-propertize-multiline to syntax-propertize-extend-region-functions.
Font Lock Multiline
One way to ensure reliable rehighlighting of multiline Font Lock constructs is to put on them the text property font-lock-multiline. It should be present and non-nil for text that is part of a multiline construct. When Font Lock is about to highlight a range of text, it first extends the boundaries of the range as necessary so that they do not fall within text marked with the font-lock-multiline property. Then it removes any font-lock-multiline properties from the range, and highlights it. The highlighting specification (mostly font-lock-keywords) must reinstall this property each time, whenever it is appropriate. Warning: don't use the font-lock-multiline property on large ranges of text, because that will make rehighlighting slow.
-
font-lock-multiline - If the
font-lock-multiline variable is set to t, Font Lock will try to add the font-lock-multiline property automatically on multiline constructs. This is not a universal solution, however, since it slows down Font Lock somewhat. It can miss some multiline constructs, or make the property larger or smaller than necessary. For elements whose matcher is a function, the function should ensure that submatch 0 covers the whole relevant multiline construct, even if only a small subpart will be highlighted. It is often just as easy to add the font-lock-multiline property by hand.
The font-lock-multiline property is meant to ensure proper refontification; it does not automatically identify new multiline constructs. Identifying them requires that Font Lock mode operate on large enough chunks at a time. This will happen by accident in many cases, which may give the impression that multiline constructs magically work. If you set the font-lock-multiline variable non-nil, this impression will be even stronger, since the highlighting of those constructs which are found will be properly updated from then on. But that does not work reliably. To find multiline constructs reliably, you must either manually place the font-lock-multiline property on the text before Font Lock mode looks at it, or use font-lock-fontify-region-function.
Region to Fontify after a Buffer Change
When a buffer is changed, the region that Font Lock refontifies is by default the smallest sequence of whole lines that spans the change. While this works well most of the time, sometimes it doesn't, for example, when a change alters the syntactic meaning of text on an earlier line. You can enlarge (or even reduce) the region to refontify by setting the following variable:
-
font-lock-extend-after-change-region-function - This buffer-local variable is either
nil or a function for Font Lock mode to call to determine the region to scan and fontify. The function is given three parameters, the standard beg, end, and old-len from after-change-functions (Change Hooks). It should return either a cons of the beginning and end buffer positions (in that order) of the region to fontify, or nil (which means choose the region in the standard way). This function needs to preserve point, the match-data, and the current restriction. The region it returns may start or end in the middle of a line. Since this function is called after every buffer change, it should be reasonably fast.
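As an illustration only, a function for font-lock-extend-after-change-region-function might enlarge the region by one line at the start, for a language where a change can affect the meaning of the preceding line. The name my-demo-extend-after-change is hypothetical.

```elisp
(defun my-demo-extend-after-change (beg end _old-len)
  "Return a region from the line before BEG to the line after END.
Hypothetical example for `font-lock-extend-after-change-region-function'.
Point and the match data are preserved; the restriction is not touched."
  (save-excursion
    (save-match-data
      (cons (progn (goto-char beg)
                   ;; Beginning of the previous line (or of the buffer).
                   (line-beginning-position 0))
            (progn (goto-char end)
                   ;; Beginning of the next line (or end of the buffer).
                   (line-beginning-position 2))))))
```

A major mode could install it with (setq-local font-lock-extend-after-change-region-function #'my-demo-extend-after-change). The function does only two cheap motions, in keeping with the advice that it should be fast.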
Parser-based Font Lock
Besides simple syntactic font lock and regexp-based font lock, Emacs also provides complete syntactic font lock with the help of a parser. Currently, Emacs uses the tree-sitter library (Parsing Program Source) for this purpose. Parser-based font lock and other font lock mechanisms are not mutually exclusive. By default, if enabled, parser-based font lock runs first, replacing syntactic font lock, followed by regexp-based font lock. Although parser-based font lock doesn't share the same customization variables with regexp-based font lock, it uses similar customization schemes. The tree-sitter counterpart of font-lock-keywords is treesit-font-lock-settings. In general, tree-sitter fontification works as follows:
- A Lisp program (usually, part of a major mode) provides a query consisting of patterns, each pattern associated with a capture name.
- The tree-sitter library finds the nodes in the parse tree that match these patterns, tags the nodes with the corresponding capture names, and returns them to the Lisp program.
- The Lisp program uses the returned nodes to highlight the portions of buffer text corresponding to each node as appropriate, using the tagged capture names of the nodes to determine the correct fontification. For example, a node tagged
font-lock-keyword would be highlighted in font-lock-keyword face.
For more information about queries, patterns, and capture names, see Pattern Matching. To set up tree-sitter fontification, a major mode should first set treesit-font-lock-settings with the output of treesit-font-lock-rules, then call treesit-major-mode-setup.
-
treesit-font-lock-rules - This function is used to set
treesit-font-lock-settings. It takes care of compiling queries and other post-processing, and outputs a value that treesit-font-lock-settings accepts. Here's an example:
(treesit-font-lock-rules
 :language 'javascript
 :feature 'constant
 :override t
 '((true) @font-lock-constant-face
   (false) @font-lock-constant-face)
 :language 'html
 :feature 'script
 "(script_element) @font-lock-builtin-face")
This function takes a series of query-specs, where each query-spec is a query preceded by one or more keyword/value pairs. Each query is a tree-sitter query in either the string, s-expression, or compiled form. For each query, the keyword/value pairs that precede it add meta information to it. The :language keyword declares query's language. The :feature keyword sets the feature name of query. Users can control which features are enabled with treesit-font-lock-level and treesit-font-lock-feature-list (described below). These two keywords are mandatory (with exceptions). Other keywords are optional:
Keyword            Value     Description
:override          nil       If the region already has a face, discard the new face
                   t         Always apply the new face
                   append    Append the new face to existing ones
                   prepend   Prepend the new face to existing ones
                   keep      Fill-in regions without an existing face
:reversed          t         Enable query when feature is not in the feature list.
:default-language  language  Every query after this keyword will use language by default.
Lisp programs mark patterns in query with capture names (names that start with @), and tree-sitter will return matched nodes tagged with those same capture names. For the purpose of fontification, capture names in query should be face names like font-lock-keyword-face. The captured node will be fontified with that face. A capture name can also be a function name, in which case the function is called with 4 arguments: node, override, start, and end, where node is the node itself, override is the :override property of the rule which captured this node, and start and end limit the region which this function should fontify. (If this function wants to respect the override argument, it can use treesit-fontify-with-override.)
Beyond the 4 arguments presented, this function should accept more arguments as optional arguments for future extensibility. If a capture name is both a face and a function, the face takes priority. If a capture name is neither a face nor a function, it is ignored. Sometimes, to support different versions of the same grammar, it's useful to conditionally include some optional query, or choose the first valid query from a list of queries. Functions like treesit-query-with-optional and treesit-query-with-fallback can come in handy.
-
treesit-font-lock-feature-list - This is a list of lists of feature symbols. Each element of the list is a list that represents a decoration level. The
treesit-font-lock-level user option controls which levels are activated. Each element of the list is a list of the form (FEATURE ...), where each feature corresponds to the :feature value of a query defined in treesit-font-lock-rules. Removing a feature symbol from this list disables the corresponding query during font-lock. Common feature names, for many programming languages, include definition, type, assignment, builtin, constant, keyword, string-interpolation, comment, doc, string, operator, preprocessor, escape-sequence, and key. Major modes are free to subdivide or extend these common features. Some of these features warrant some explanation: definition highlights whatever is being defined, e.g., the function name in a function definition, the struct name in a struct definition, the variable name in a variable definition; assignment highlights whatever is being assigned to, e.g., the variable or field in an assignment statement; key highlights keys in key-value pairs, e.g., keys in a JSON object or Python dictionary; doc highlights docstrings or doc-comments. For example, the value of this variable could be:
((comment string doc)                            ; level 1
 (function-name keyword type builtin constant)   ; level 2
 (variable-name string-interpolation key))       ; level 3
Major modes should set this variable before calling treesit-major-mode-setup. For this variable to take effect, a Lisp program should call treesit-font-lock-recompute-features (which resets treesit-font-lock-settings accordingly), or treesit-major-mode-setup (which calls treesit-font-lock-recompute-features).
-
treesit-font-lock-settings - A list of settings for tree-sitter based font lock. The exact format of each individual setting is considered internal. One should always use
treesit-font-lock-rules to set this variable. Even though the setting object is opaque, Emacs provides accessors for the setting's query, feature, enable flag, override flag and reversed flag: treesit-font-lock-setting-query, treesit-font-lock-setting-feature, treesit-font-lock-setting-enable, treesit-font-lock-setting-override, treesit-font-lock-setting-reversed.
Multi-language major modes should provide range functions in treesit-range-functions, and Emacs will set the ranges accordingly before fontifying a region (Multiple Languages).
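Putting the pieces of this section together, a major mode's tree-sitter setup could look roughly like the sketch below. The mode name my-demo-json-mode is hypothetical, and the node names in the queries (comment, string, pair, null, true, false) are assumptions about the JSON grammar; the sketch does nothing useful unless that grammar is installed.

```elisp
(require 'treesit)

(define-derived-mode my-demo-json-mode prog-mode "MyJSON"
  "Hypothetical major mode showing tree-sitter font-lock setup."
  ;; Only proceed when tree-sitter and the JSON grammar are available.
  (when (treesit-ready-p 'json t)
    (treesit-parser-create 'json)
    ;; Each query is tagged with a language and a feature name.
    (setq-local treesit-font-lock-settings
                (treesit-font-lock-rules
                 :language 'json
                 :feature 'comment
                 '((comment) @font-lock-comment-face)
                 :language 'json
                 :feature 'string
                 '((string) @font-lock-string-face)
                 :language 'json
                 :feature 'key
                 :override t
                 '((pair key: (_) @font-lock-property-use-face))
                 :language 'json
                 :feature 'constant
                 '([(null) (true) (false)] @font-lock-constant-face)))
    ;; One sublist per decoration level, set before the setup call.
    (setq-local treesit-font-lock-feature-list
                '((comment)
                  (string key)
                  (constant)))
    (treesit-major-mode-setup)))
```

The key feature uses :override t on the assumption that keys are also matched by the string query, so the later rule needs to replace the face applied by the earlier one.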
Automatic Indentation of code
For programming languages, an important feature of a major mode is to provide automatic indentation. There are two parts: one is to decide what is the right indentation of a line, and the other is to decide when to reindent a line. By default, Emacs reindents a line whenever you type a character in electric-indent-chars, which by default only includes Newline. Major modes can add chars to electric-indent-chars according to the syntax of the language. Deciding what is the right indentation is controlled in Emacs by indent-line-function (Mode-Specific Indent). For some modes, the right indentation cannot be known reliably, typically because indentation is significant so several indentations are valid but with different meanings. In that case, the mode should set electric-indent-inhibit to make sure the line is not constantly re-indented against the user's wishes. Writing a good indentation function can be difficult and to a large extent it is still a black art. Many major mode authors will start by writing a simple indentation function that works for simple cases, for example by comparing with the indentation of the previous text line. For most programming languages that are not really line-based, this tends to scale very poorly: improving such a function to let it handle more diverse situations tends to become more and more difficult, resulting in the end with a large, complex, unmaintainable indentation function which nobody dares to touch. A good indentation function will usually need to actually parse the text, according to the syntax of the language. Luckily, it is not necessary to parse the text in as much detail as would be needed for a compiler, but on the other hand, the parser embedded in the indentation code will want to be somewhat friendly to syntactically incorrect code. 
Good maintainable indentation functions usually fall into two categories: either parsing forward from some safe starting point until the position of interest, or parsing backward from the position of interest. Neither of the two is a clearly better choice than the other: parsing backward is often more difficult than parsing forward because programming languages are designed to be parsed forward, but for the purpose of indentation it has the advantage of not needing to guess a safe starting point, and it generally enjoys the property that only a minimum of text will be analyzed to decide the indentation of a line, so indentation will tend to be less affected by syntax errors in some earlier unrelated piece of code. Parsing forward on the other hand is usually easier and has the advantage of making it possible to reindent efficiently a whole region at a time, with a single parse. Rather than write your own indentation function from scratch, it is often preferable to try and reuse some existing ones or to rely on a generic indentation engine. There are sadly few such engines. The CC-mode indentation code (used with C, C++, Java, Awk and a few other such modes) has been made more generic over the years, so if your language seems somewhat similar to one of those languages, you might try to use that engine. Another one is SMIE which takes an approach in the spirit of Lisp sexps and adapts it to non-Lisp languages. Yet another one is to rely on a full-blown parser, for example, the tree-sitter library.
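The "when to reindent" side of this section can be illustrated with a small sketch. Both modes below are hypothetical: the first adds a character to electric-indent-chars, the second sets electric-indent-inhibit for a language where indentation is significant.

```elisp
(require 'electric)

(define-derived-mode my-demo-brace-mode prog-mode "MyBrace"
  "Hypothetical mode for a brace-delimited language."
  ;; Reindent the line when a closing brace is typed, not only on RET.
  (setq-local electric-indent-chars
              (cons ?\} electric-indent-chars)))

(define-derived-mode my-demo-offside-mode prog-mode "MyOffside"
  "Hypothetical mode for a language where indentation is significant."
  ;; Several indentations may be valid, so do not keep reindenting
  ;; lines against the user's wishes.
  (setq-local electric-indent-inhibit t))
```

Neither mode defines indent-line-function here; deciding what the right indentation is remains a separate task.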
Simple Minded Indentation Engine
SMIE is a package that provides a generic navigation and indentation engine. Based on a very simple parser using an operator precedence grammar, it lets major modes extend the sexp-based navigation of Lisp to non-Lisp languages as well as provide a simple to use but reliable auto-indentation. Operator precedence grammar is a very primitive technology for parsing compared to some of the more common techniques used in compilers. It has the following characteristics: its parsing power is very limited, and it is largely unable to detect syntax errors, but it has the advantage of being algorithmically efficient and able to parse forward just as well as backward. In practice that means that SMIE can use it for indentation based on backward parsing, that it can provide both forward-sexp and backward-sexp functionality, and that it will naturally work on syntactically incorrect code without any extra effort. The downside is that it also means that most programming languages cannot be parsed correctly using SMIE, at least not without resorting to some special tricks (SMIE Tricks).
SMIE Setup and Features
SMIE is meant to be a one-stop shop for structural navigation and various other features which rely on the syntactic structure of code, in particular automatic indentation. The main entry point is smie-setup which is a function typically called while setting up a major mode.
-
smie-setup - Setup SMIE navigation and indentation. grammar is a grammar table generated by
smie-prec2->grammar. rules-function is a set of indentation rules for use on smie-rules-function. keywords are additional arguments, which can include the following keywords:
- :forward-token fun: Specify the forward lexer to use.
- :backward-token fun: Specify the backward lexer to use.
Calling this function is sufficient to make commands such as forward-sexp, backward-sexp, and transpose-sexps be able to properly handle structural elements other than just the paired parentheses already handled by syntax tables. For example, if the provided grammar is precise enough, transpose-sexps can correctly transpose the two arguments of a + operator, taking into account the precedence rules of the language. Calling smie-setup is also sufficient to make TAB indentation work in the expected way, extends blink-matching-paren to apply to elements like begin...end, and provides some commands that you can bind in the major mode keymap.
-
Command smie-close-block - This command closes the most recently opened (and not yet closed) block.
-
Command smie-down-list - This command is like
down-list but it also pays attention to nesting of tokens other than parentheses, such as begin...end.
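A call to smie-setup from a major mode might look like the following sketch. Everything named my-demo-... is hypothetical; the grammar is a deliberately tiny one for begin ... end blocks, the default lexer is used, and the rules function handles only the basic indentation step.

```elisp
(require 'smie)

(defvar my-demo-block-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((exp)
      (inst ("begin" insts "end")
            ("if" exp "then" inst "else" inst)
            (exp))
      (insts (insts ";" insts) (inst)))
    '((assoc ";"))))
  "Hypothetical SMIE grammar for a tiny block-structured language.")

(defun my-demo-block-rules (kind token)
  "Hypothetical minimal indentation rules for `my-demo-block-mode'."
  (pcase (cons kind token)
    ;; The basic indentation step.
    (`(:elem . basic) 2)))

(define-derived-mode my-demo-block-mode prog-mode "MyBlock"
  "Hypothetical major mode that relies on SMIE."
  ;; No :forward-token or :backward-token is given, so SMIE's
  ;; default lexer is used.
  (smie-setup my-demo-block-grammar #'my-demo-block-rules))
```

After this setup, commands such as forward-sexp and TAB indentation go through SMIE in buffers using the mode.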
Operator Precedence Grammars
SMIE's precedence grammars simply give to each token a pair of precedences: the left-precedence and the right-precedence. We say T1 < T2 if the right-precedence of token T1 is less than the left-precedence of token T2. A good way to read this < is as a kind of parenthesis: if we find ... T1 something T2 ... then that should be parsed as ... T1 (something T2 ... rather than as ... T1 something) T2 .... The latter interpretation would be the case if we had T1 > T2. If we have T1 = T2, it means that token T2 follows token T1 in the same syntactic construction, so typically we have "begin" = "end". Such pairs of precedences are sufficient to express left-associativity or right-associativity of infix operators, nesting of tokens like parentheses and many other cases.
-
smie-prec2->grammar - This function takes a prec2 grammar table and returns an alist suitable for use in
smie-setup. The prec2 table is itself meant to be built by one of the functions below. -
smie-merge-prec2s - This function takes several prec2 tables and merges them into a new prec2 table.
-
smie-precs->prec2 - This function builds a prec2 table from a table of precedences precs. precs should be a list, sorted by precedence (for example
"+" will come before "*"), of elements of the form (ASSOC OP ...), where each op is a token that acts as an operator; assoc is their associativity, which can be either left, right, assoc, or nonassoc. All operators in a given element share the same precedence level and associativity. -
smie-bnf->prec2 - This function lets you specify the grammar using a BNF notation. It accepts a bnf description of the grammar along with a set of conflict resolution rules resolvers, and returns a prec2 table. bnf is a list of nonterminal definitions of the form
(NONTERM RHS1 RHS2 ...) where each rhs is a (non-empty) list of terminals (aka tokens) or non-terminals. Not all grammars are accepted:
- An rhs cannot be an empty list (an empty list is never needed, since SMIE allows all non-terminals to match the empty string anyway).
- An rhs cannot have 2 consecutive non-terminals: each pair of non-terminals needs to be separated by a terminal (aka token). This is a fundamental limitation of operator precedence grammars.
Additionally, conflicts can occur:
- The returned prec2 table holds constraints between pairs of tokens, and for any given pair only one constraint can be present: T1 < T2, T1 = T2, or T1 > T2.
- A token can be an opener (something similar to an open-paren), a closer (like a close-paren), or neither of the two (e.g., an infix operator, or an inner token like "else").
Precedence conflicts can be resolved via resolvers, which is a list of precs tables (see smie-precs->prec2): for each precedence conflict, if those precs tables specify a particular constraint, then the conflict is resolved by using this constraint instead, else a conflict is reported and one of the conflicting constraints is picked arbitrarily and the others are simply ignored.
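For a language that only needs infix operators, a grammar can be built directly from a precedence table. The sketch below is an assumption-laden illustration: the variable name my-demo-arith-grammar and the choice of operators are invented for the example.

```elisp
(require 'smie)

(defvar my-demo-arith-grammar
  (smie-prec2->grammar
   (smie-precs->prec2
    ;; Sorted by precedence, lowest first; operators listed in the
    ;; same element share a level and an associativity.
    '((assoc ";")
      (left "+" "-")
      (left "*" "/")
      (right "^"))))
  "Hypothetical SMIE grammar for simple arithmetic expressions.")
```

The resulting value can be passed as the grammar argument of smie-setup. Several such prec2 tables could also be combined with smie-merge-prec2s before the final call to smie-prec2->grammar.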
Defining the Grammar of a Language
The usual way to define the SMIE grammar of a language is by defining a new global variable that holds the precedence table by giving a set of BNF rules. For example, the grammar definition for a small Pascal-like language could look like:
(require 'smie)
(defvar sample-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((id)
      (inst ("begin" insts "end")
            ("if" exp "then" inst "else" inst)
            (id ":=" exp)
            (exp))
      (insts (insts ";" insts) (inst))
      (exp (exp "+" exp)
           (exp "*" exp)
           ("(" exps ")"))
      (exps (exps "," exps) (exp)))
    '((assoc ";"))
    '((assoc ","))
    '((assoc "+") (assoc "*")))))
A few things to note:
- The above grammar does not explicitly mention the syntax of function calls: SMIE will automatically allow any sequence of sexps, such as identifiers, balanced parentheses, or begin ... end blocks to appear anywhere anyway.
- The grammar category id has no right-hand side: this does not mean that it can match only the empty string, since as mentioned any sequence of sexps can appear anywhere anyway.
- Because nonterminals cannot appear consecutively in the BNF grammar, it is difficult to correctly handle tokens that act as terminators, so the above grammar treats ";" as a statement separator instead, which SMIE can handle very well.
- Separators used in sequences (such as "," and ";" above) are best defined with BNF rules such as (foo (foo "separator" foo) ...) which generate precedence conflicts which are then resolved by giving them an explicit (assoc "separator").
- The ("(" exps ")") rule was not needed to pair up parens, since SMIE will pair up any characters that are marked as having paren syntax in the syntax table. What this rule does instead (together with the definition of exps) is to make it clear that "," should not appear outside of parentheses.
- Rather than have a single precs table to resolve conflicts, it is preferable to have several tables, so as to let the BNF part of the grammar specify relative precedences where possible.
- Unless there is a very good reason to prefer left or right, it is usually preferable to mark operators as associative, using assoc. For that reason "+" and "*" are defined above as assoc, although the language defines them formally as left associative.
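As a complement to the notes above, here is a minimal sketch of another way to supply operator precedences: the operators are listed in a separate precs table, converted with smie-precs->prec2, and merged into the BNF-derived table with smie-merge-prec2s. The names sample-smie-operator-table and sample-smie-merged-grammar, and the choice of operators, are assumptions made for illustration only.

```elisp
(require 'smie)

;; Hypothetical operator table, sorted from lowest to highest precedence.
(defvar sample-smie-operator-table
  '((assoc ",")
    (left "+" "-")
    (left "*" "/")))

;; The BNF part only describes the statement structure; the infix
;; operators come from the precs table above.
(defvar sample-smie-merged-grammar
  (smie-prec2->grammar
   (smie-merge-prec2s
    (smie-bnf->prec2
     '((id)
       (inst ("begin" insts "end")
             (id ":=" exp)
             (exp))
       (insts (insts ";" insts) (inst))
       (exp ("(" exp ")")))
     '((assoc ";")))
    (smie-precs->prec2 sample-smie-operator-table))))
```

Whether this split is preferable to a pure BNF description depends on the language; it tends to be convenient when the language has many infix operators.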
Defining Tokens
SMIE comes with a predefined lexical analyzer which uses syntax tables in the following way: any sequence of characters that have word or symbol syntax is considered a token, and so is any sequence of characters that have punctuation syntax. This default lexer is often a good starting point but is rarely actually correct for any given language. For example, it will consider "2,+3" to be composed of 3 tokens: "2", ",+", and "3".
To describe the lexing rules of your language to SMIE, you need 2 functions, one to fetch the next token, and another to fetch the previous token. Those functions will usually first skip whitespace and comments and then look at the next chunk of text to see if it is a special token. If so, the function should skip the token and return a description of it. Usually this is simply the string extracted from the buffer, but it can be anything you want. For example:
(defvar sample-keywords-regexp
  (regexp-opt '("+" "*" "," ";" ">" ">=" "<" "<=" ":=" "=")))
(defun sample-smie-forward-token ()
  (forward-comment (point-max))
  (cond
   ((looking-at sample-keywords-regexp)
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-forward "w_")
              (point))))))
(defun sample-smie-backward-token ()
  (forward-comment (- (point)))
  (cond
   ((looking-back sample-keywords-regexp (- (point) 2) t)
    (goto-char (match-beginning 0))
    (match-string-no-properties 0))
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-backward "w_")
              (point))))))
Notice how those lexers return the empty string when in front of parentheses. This is because SMIE automatically takes care of the parentheses defined in the syntax table. More specifically if the lexer returns nil or an empty string, SMIE tries to handle the corresponding text as a sexp according to syntax tables.
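The following sketch shows one way a major mode might hand its grammar and lexers to SMIE, using smie-setup with the :forward-token and :backward-token keywords. Everything named sample-demo-... is hypothetical. To keep the block self-contained it uses SMIE's default lexers and a rules function that always returns nil; a real mode would pass its own token functions and rules instead.

```elisp
(require 'smie)

;; A deliberately tiny grammar, for illustration only.
(defvar sample-demo-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((id)
      (inst ("begin" insts "end")
            (id ":=" exp)
            (exp))
      (insts (insts ";" insts) (inst))
      (exp (exp "+" exp)))
    '((assoc ";"))
    '((assoc "+")))))

(defun sample-demo-rules (_method _arg)
  "Placeholder rules function: always fall back on SMIE's defaults."
  nil)

(define-derived-mode sample-demo-mode prog-mode "Sample"
  "Hypothetical major mode whose indentation is provided by SMIE."
  (smie-setup sample-demo-grammar #'sample-demo-rules
              ;; Replace these with the mode's own lexers.
              :forward-token #'smie-default-forward-token
              :backward-token #'smie-default-backward-token))
```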
Living With a Weak Parser
The parsing technique used by SMIE does not allow tokens to behave differently in different contexts. For most programming languages, this manifests itself by precedence conflicts when converting the BNF grammar. Sometimes, those conflicts can be worked around by expressing the grammar slightly differently. For example, for Modula-2 it might seem natural to have a BNF grammar that looks like this:
...
(inst ("IF" exp "THEN" insts "ELSE" insts "END")
      ("CASE" exp "OF" cases "END")
      ...)
(cases (cases "|" cases)
       (caselabel ":" insts)
       ("ELSE" insts))
...
But this will create conflicts for "ELSE": on the one hand, the IF rule implies (among many other things) that "ELSE" = "END"; but on the other hand, since "ELSE" appears within cases, which appears left of "END", we also have "ELSE" > "END". We can solve the conflict either by using:
...
(inst ("IF" exp "THEN" insts "ELSE" insts "END")
      ("CASE" exp "OF" cases "END")
      ("CASE" exp "OF" cases "ELSE" insts "END")
      ...)
(cases (cases "|" cases) (caselabel ":" insts))
...
or
...
(inst ("IF" exp "THEN" else "END")
      ("CASE" exp "OF" cases "END")
      ...)
(else (insts "ELSE" insts))
(cases (cases "|" cases) (caselabel ":" insts) (else))
...
Reworking the grammar to try and solve conflicts has its downsides, though, because SMIE assumes that the grammar reflects the logical structure of the code, so it is preferable to keep the BNF closer to the intended abstract syntax tree.
Other times, after careful consideration you may conclude that those conflicts are not serious and simply resolve them via the resolvers argument of smie-bnf->prec2. Usually this is because the grammar is simply ambiguous: the conflict does not affect the set of programs described by the grammar, but only the way those programs are parsed. This is typically the case for separators and associative infix operators, where you want to add a resolver like '((assoc "|")). Another case where this can happen is for the classic dangling else problem, where you will use '((assoc "else" "then")). It can also happen for cases where the conflict is real and cannot really be resolved, but it is unlikely to pose a problem in practice.
Finally, in many cases some conflicts will remain despite all efforts to restructure the grammar. Do not despair: while the parser cannot be made more clever, you can make the lexer as smart as you want. The solution is then to look at the tokens involved in the conflict and to split one of those tokens into 2 (or more) different tokens. E.g., if the grammar needs to distinguish between two incompatible uses of the token "begin", make the lexer return different tokens (say "begin-fun" and "begin-plain") depending on which kind of "begin" it finds. This pushes the work of distinguishing the different cases to the lexer, which will thus have to look at the surrounding text to find ad-hoc clues.
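As a rough sketch of the token-splitting idea, the forward lexer below returns "begin-fun" or "begin-plain" depending on what precedes the "begin". The clue it uses (a "begin" that directly follows a closing parenthesis starts a function body) is purely an assumption about a hypothetical language, and the function names are invented for this example. A real mode would also need a backward lexer that makes the same distinction.

```elisp
(defun sample-split--begin-kind (pos)
  "Classify the \"begin\" token that starts at POS.
Assumption for this sketch: the \"begin\" of a function body directly
follows the closing parenthesis of a parameter list."
  (save-excursion
    (goto-char pos)
    ;; Skip whitespace and comments backward.
    (forward-comment (- (point)))
    (if (eq (char-before) ?\))
        "begin-fun"
      "begin-plain")))

(defun sample-split-forward-token ()
  "Return the next token, splitting \"begin\" into two kinds."
  (forward-comment (point-max))
  (let* ((start (point))
         (tok (buffer-substring-no-properties
               start
               (progn (skip-syntax-forward "w_")
                      (point)))))
    (if (equal tok "begin")
        (sample-split--begin-kind start)
      tok)))
```

The grammar would then mention "begin-fun" and "begin-plain" as separate tokens, so each can take part in its own rules without conflicting with the other.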
Specifying Indentation Rules
Based on the provided grammar, SMIE will be able to provide automatic indentation without any extra effort. But in practice, this default indentation style will probably not be good enough. You will want to tweak it in many different cases.
SMIE indentation is based on the idea that indentation rules should be as local as possible. To this end, it relies on the idea of virtual indentation, which is the indentation that a particular program point would have if it were at the beginning of a line. Of course, if that program point is indeed at the beginning of a line, its virtual indentation is its current indentation. But if not, then SMIE uses the indentation algorithm to compute the virtual indentation of that point. Now in practice, the virtual indentation of a program point does not have to be identical to the indentation it would have if we inserted a newline before it. For example, the SMIE rule for indentation after a { in C does not care whether the { is standing on a line of its own or is at the end of the preceding line. Instead, these different cases are handled in the indentation rule that decides how to indent before a {.
Another important concept is the notion of parent: the parent of a token is the head token of the nearest enclosing syntactic construct. For example, the parent of an else is the if to which it belongs, and the parent of an if, in turn, is the lead token of the surrounding construct. The command backward-sexp jumps from a token to its parent, but there are some caveats: for openers (tokens which start a construct, like if), you need to start with point before the token, while for others you need to start with point after the token. backward-sexp stops with point before the parent token if that is the opener of the token of interest, and otherwise it stops with point after the parent token.
SMIE indentation rules are specified using a function that takes two arguments method and arg where the meaning of arg and the expected return value depend on method. method can be:
- :after, in which case arg is a token and the function should return the offset to use for indentation after arg.
- :before, in which case arg is a token and the function should return the offset to use to indent arg itself.
- :elem, in which case the function should return either the offset to use to indent function arguments (if arg is the symbol args) or the basic indentation step (if arg is the symbol basic).
- :list-intro, in which case arg is a token and the function should return non-nil if the token is followed by a list of expressions (not separated by any token) rather than an expression.
When arg is a token, the function is called with point just before that token. A return value of nil always means to fall back on the default behavior, so the function should return nil for arguments it does not expect.
offset can be:
- nil: use the default indentation rule.
- (column . COLUMN): indent to column column.
- number: offset by number, relative to a base token which is the current token for :after and its parent for :before.
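To make the calling convention concrete, here is a minimal sketch of a rules function that returns each kind of offset described above. The variable sample-offsets-step, the function name, and the tokens "then" and "program" are assumptions chosen for illustration; they do not belong to any particular language.

```elisp
(defvar sample-offsets-step 4
  "Hypothetical basic indentation step for this sketch.")

(defun sample-offsets-rules (method arg)
  "Illustrate the possible return values of an SMIE rules function."
  (pcase (cons method arg)
    ;; A number for (:elem . basic): the basic indentation step.
    (`(:elem . basic) sample-offsets-step)
    ;; A number for :after: offset relative to the token itself.
    (`(:after . "then") sample-offsets-step)
    ;; (column . COLUMN): indent the token to an absolute column.
    (`(:before . "program") '(column . 0))
    ;; nil: fall back on the default behavior for anything else.
    (_ nil)))
```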
Helper Functions for Indentation Rules
SMIE provides various functions designed specifically for use in the indentation rules function (several of those functions break if used in another context). These functions all start with the prefix smie-rule-.
- smie-rule-bolp: Return non-nil if the current token is the first on the line.
- smie-rule-hanging-p: Return non-nil if the current token is hanging. A token is hanging if it is the last token on the line and if it is preceded by other tokens: a lone token on a line is not hanging.
- smie-rule-next-p &rest tokens: Return non-nil if the next token is among tokens.
- smie-rule-prev-p &rest tokens: Return non-nil if the previous token is among tokens.
- smie-rule-parent-p &rest parents: Return non-nil if the current token's parent is among parents.
- smie-rule-sibling-p: Return non-nil if the current token's parent is actually a sibling. This is the case for example when the parent of a "," is just the previous ",".
- smie-rule-parent &optional offset: Return the proper offset to align the current token with the parent. If non-nil, offset should be an integer giving an additional offset to apply.
- smie-rule-separator method: Indent current token as a separator. By separator, we mean here a token whose sole purpose is to separate various elements within some enclosing syntactic construct, and which does not have any semantic significance in itself (i.e., it would typically not exist as a node in an abstract syntax tree). Such a token is expected to have an associative syntax and be closely tied to its syntactic parent. Typical examples are "," in lists of arguments (enclosed inside parentheses), or ";" in sequences of instructions (enclosed in a {...} or begin...end block). method should be the method name that was passed to smie-rules-function.
Sample Indentation Rules
Here is an example of an indentation function:
(defun sample-smie-rules (kind token)
  (pcase (cons kind token)
    (`(:elem . basic) sample-indent-basic)
    (`(,_ . ",") (smie-rule-separator kind))
    (`(:after . ":=") sample-indent-basic)
    (`(:before . ,(or `"begin" `"(" `"{"))
     (if (smie-rule-hanging-p) (smie-rule-parent)))
    (`(:before . "if")
     (and (not (smie-rule-bolp)) (smie-rule-prev-p "else")
          (smie-rule-parent)))))
A few things to note:
- The first case indicates the basic indentation increment to use. If sample-indent-basic is nil, then SMIE uses the global setting smie-indent-basic. The major mode could have set smie-indent-basic buffer-locally instead, but that is discouraged.
- The rule for the token "," makes SMIE try to be more clever when the comma separator is placed at the beginning of lines. It tries to outdent the separator so as to align the code after the comma; for example:
      x = longfunctionname (
              arg1
            , arg2
          );
- The rule for indentation after ":=" exists because otherwise SMIE would treat ":=" as an infix operator and would align the right argument with the left one.
- The rule for indentation before "begin" is an example of the use of virtual indentation: this rule is used only when "begin" is hanging, which can happen only when "begin" is not at the beginning of a line. So this is not used when indenting "begin" itself but only when indenting something relative to this "begin". Concretely, this rule changes the indentation from:
      if x > 0 then begin
              dosomething(x);
          end
  to
      if x > 0 then begin
          dosomething(x);
      end
- The rule for indentation before "if" is similar to the one for "begin", but where the purpose is to treat "else if" as a single unit, so as to align a sequence of tests rather than indent each test further to the right. This function does this only in the case where the "if" is not placed on a separate line, hence the smie-rule-bolp test. If we know that the "else" is always aligned with its "if" and is always at the beginning of a line, we can use a more efficient rule:
      ((equal token "if")
       (and (not (smie-rule-bolp))
            (smie-rule-prev-p "else")
            (save-excursion
              (sample-smie-backward-token)
              (cons 'column (current-column)))))
  The advantage of this formulation is that it reuses the indentation of the previous "else", rather than going all the way back to the first "if" of the sequence.
Customizing Indentation
If you are using a mode whose indentation is provided by SMIE, you can customize the indentation to suit your preferences. You can do this on a per-mode basis (using the option smie-config), or a per-file basis (using the function smie-config-local in a file-local variable specification).
- smie-config: This option lets you customize indentation on a per-mode basis. It is an alist with elements of the form (MODE . RULES). For the precise form of rules, see the variable's documentation; but you may find it easier to use the command smie-config-guess.
- Command smie-config-guess: This command tries to work out appropriate settings to produce your preferred style of indentation. Simply call the command while visiting a file that is indented with your style.
- Command smie-config-save: Call this command after using smie-config-guess, to save your settings for future sessions.
- Command smie-config-show-indent: This command displays the rules that are used to indent the current line.
- Command smie-config-set-indent: This command adds a local rule to adjust the indentation of the current line.
- smie-config-local rules: This function adds rules as indentation rules for the current buffer. These add to any mode-specific rules defined by the smie-config option. To specify custom indentation rules for a specific file, add an entry to the file's local variables of the form: eval: (smie-config-local '(RULES)).
Parser-based Indentation
When built with the tree-sitter library (Parsing Program Source), Emacs is capable of parsing the program source and producing a syntax tree. This syntax tree can be used for guiding the program source indentation commands. For maximum flexibility, it is possible to write a custom indentation function that queries the syntax tree and indents accordingly for each language, but that is a lot of work. It is more convenient to use the simple indentation engine described below: then the major mode needs only write some indentation rules, and the engine takes care of the rest. To enable the parser-based indentation engine, set either treesit-simple-indent-rules or treesit-indent-function, then call treesit-major-mode-setup. (All that treesit-major-mode-setup does is set the value of indent-line-function to treesit-indent, and indent-region-function to treesit-indent-region.)
- treesit-indent-function: This variable stores the actual function called by treesit-indent. By default, its value is treesit-simple-indent. In the future we might add other, more complex indentation engines.
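A minimal sketch of the setup sequence described above, assuming a hypothetical language symbol sample and hypothetical node type names. The mode sets treesit-simple-indent-rules and then calls treesit-major-mode-setup, and does so only when a grammar for the language is actually available.

```elisp
(require 'treesit)

;; Hypothetical rules; "source_file" is an assumed node type name.
(defvar sample-ts-mode-indent-rules
  '((sample
     ((parent-is "source_file") column-0 0)
     (catch-all parent-bol 4)))
  "Indentation rules for the hypothetical `sample' language.")

(define-derived-mode sample-ts-mode prog-mode "Sample[ts]"
  "Hypothetical major mode using parser-based indentation."
  ;; The second argument keeps `treesit-ready-p' quiet when the
  ;; grammar is missing.
  (when (treesit-ready-p 'sample t)
    (treesit-parser-create 'sample)
    (setq-local treesit-simple-indent-rules sample-ts-mode-indent-rules)
    (treesit-major-mode-setup)))
```

When no grammar for the language is installed, the mode still turns on but leaves the indentation settings untouched.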
Writing indentation rules
- treesit-simple-indent-rules: This local variable stores indentation rules for every language. It is a list of elements of the form (LANGUAGE RULE...), where language is a language symbol, and each rule is either a list of the form (MATCHER ANCHOR OFFSET), or a function. The list variant is described first, followed by the function variant.
  First, Emacs passes the tree-sitter node at the beginning of the current line to matcher; if it returns non-nil, this rule is applicable. Then Emacs passes the node to anchor, which returns a buffer position. Emacs takes the column number of that position, adds offset to it, and the result is the indentation column for the current line.
  The matcher and anchor are functions, and Emacs provides convenient defaults for them. Each matcher or anchor is a function that takes three arguments: node, parent, and bol. The argument bol is the buffer position whose indentation is required: the position of the first non-whitespace character after the beginning of the line. The argument node is the largest node that starts at that position (and is not a root node); and parent is the parent of node. However, when that position is in whitespace or inside a multi-line string, no node can start at that position, so node is nil. In that case, parent is the smallest node that spans that position.
  matcher should return non-nil if the rule is applicable, and anchor should return a buffer position. offset can be an integer, a variable whose value is an integer, or a function that returns an integer. If it is a function, it is passed node, parent, and bol, like matchers and anchors.
  If rule is a function, it is useful for the complex cases where a rule needs to consider the matching rule and the anchor together. The rule function is passed the same arguments as matcher: node, parent, and bol. If it matches, rule should return a cons (ANCHOR-POS . OFFSET), where anchor-pos is a buffer position, and offset is the indent offset. If rule doesn't match, it should return nil.
- treesit-simple-indent-presets: This is a list of defaults for matchers and anchors in treesit-simple-indent-rules. Each of them represents a function that takes 3 arguments: node, parent, and bol. The available default functions are:
  - no-node: This matcher is a function that is called with 3 arguments: node, parent, and bol. It returns non-nil, indicating a match, if node is nil, i.e., there is no node that starts at bol. This is the case when bol is on an empty line or inside a multi-line string, etc.
  - parent-is: This matcher is a function of one argument, type; it returns a function that is called with 3 arguments: node, parent, and bol, and returns non-nil (i.e., a match) if parent's type matches regexp type.
  - node-is: This matcher is a function of one argument, type; it returns a function that is called with 3 arguments: node, parent, and bol, and returns non-nil if node's type matches regexp type.
  - field-is: This matcher is a function of one argument, name; it returns a function that is called with 3 arguments: node, parent, and bol, and returns non-nil if node's field name in parent matches regexp name.
  - query: This matcher is a function of one argument, query; it returns a function that is called with 3 arguments: node, parent, and bol, and returns non-nil if querying parent with query captures node (Pattern Matching).
  - match: This matcher is a function of 5 arguments: node-type, parent-type, node-field, node-index-min, and node-index-max. It returns a function that is called with 3 arguments: node, parent, and bol, and returns non-nil if node's type matches regexp node-type, parent's type matches regexp parent-type, node's field name in parent matches regexp node-field, and node's index among its siblings is between node-index-min and node-index-max. If the value of an argument is nil, this matcher doesn't check that argument. For example, to match the first child where parent is argument_list, use
        (match nil "argument_list" nil 0 0)
    In addition, node-type can be a special value null, which matches when the value of node is nil.
  - n-p-gp: Short for "node-parent-grandparent", this matcher is a function of 3 arguments: node-type, parent-type, and grandparent-type. It returns a function that is called with 3 arguments: node, parent, and bol, and returns non-nil if: (1) node-type matches node's type, and (2) parent-type matches parent's type, and (3) grandparent-type matches parent's parent's type. If any of node-type, parent-type, and grandparent-type is nil, this function doesn't check for it.
  - comment-end: This matcher is a function that is called with 3 arguments: node, parent, and bol, and returns non-nil if point is before a comment-ending token. Comment-ending tokens are defined by regexp comment-end-skip.
  - catch-all: This matcher is a function that is called with 3 arguments: node, parent, and bol. It always returns non-nil, indicating a match.
  - first-sibling: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the start of the first child of parent.
  - nth-sibling: This anchor is a function of two arguments: n, and an optional argument named. It returns a function that is called with 3 arguments: node, parent, and bol, and returns the start of the nth child of parent. If named is non-nil, only named children are counted (named node).
  - parent: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the start of parent.
  - grand-parent: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the start of parent's parent.
  - great-grand-parent: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the start of parent's parent's parent.
  - parent-bol: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the first non-space character on the line which parent's start is on.
  - standalone-parent: This anchor is a function that is called with 3 arguments: node, parent, and bol. It finds the first ancestor node (parent, grandparent, etc.) of node that starts on its own line, and returns the start of that node. "Starting on its own line" means there are only whitespace characters before the node on the line which the node's start is on. The exact definition of "starting on its own line" can be relaxed by setting treesit-simple-indent-standalone-predicate; some major modes might want to do that to make indenting method chains easier.
  - prev-sibling: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the start of the previous sibling of node.
  - no-indent: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the start of node.
  - prev-line: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the first non-whitespace character on the previous line.
  - column-0: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the beginning of the current line, which is at column 0.
  - comment-start: This anchor is a function that is called with 3 arguments: node, parent, and bol, and returns the position after the comment-start token. Comment-start tokens are defined by regular expression comment-start-skip. This function assumes parent is the comment node.
  - prev-adaptive-prefix: This anchor is a function that is called with 3 arguments: node, parent, and bol. It tries to match adaptive-fill-regexp to the text at the beginning of the previous non-empty line. If there is a match, this function returns the end of the match, otherwise it returns nil. However, if the current line begins with a prefix (e.g., -), it returns the beginning of the prefix of the previous line instead, so that the two prefixes align. This anchor is useful for an indent-relative-like indent behavior for block comments.
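The sketch below combines several of the presets above into a rule set for a hypothetical language sample. The node type names ("source_file", "block", "argument_list"), the variable sample-ts-indent-offset, and the function sample-ts--blank-line-rule are all assumptions made for illustration. The function-style rule is written as described above, returning (ANCHOR-POS . OFFSET) or nil, and is only meaningful in an Emacs version that supports function rules.

```elisp
(defvar sample-ts-indent-offset 2
  "Hypothetical indentation step, used as a variable OFFSET below.")

(defun sample-ts--blank-line-rule (node _parent bol)
  "Function-style rule: when no node starts at BOL, reuse the
indentation of the previous line.  Return (ANCHOR-POS . OFFSET) on a
match, or nil when NODE is non-nil."
  (when (null node)
    (save-excursion
      (goto-char bol)
      (forward-line -1)
      (back-to-indentation)
      (cons (point) 0))))

(defvar sample-ts-indent-rules
  '((sample
     ;; Top-level constructs start at column 0.
     ((parent-is "source_file") column-0 0)
     ;; A closing brace lines up with the line that opened the block.
     ((node-is "}") parent-bol 0)
     ;; The first child of an argument list, as in the `match' example.
     ((match nil "argument_list" nil 0 0) parent-bol sample-ts-indent-offset)
     ;; Statements inside a block are indented by one step.
     ((parent-is "block") parent-bol sample-ts-indent-offset)
     ;; Function variant of a rule.
     sample-ts--blank-line-rule
     ;; Anything else keeps the indentation of the previous line.
     (catch-all prev-line 0)))
  "Indentation rules for the hypothetical `sample' language.")
```

Rules are listed from most specific to most general in this sketch, with catch-all last, so that the narrower matchers get a chance to apply first.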
Indentation utilities
Here are some utility functions that can help writing parser-based indentation rules.
- Command treesit-check-indent mode: This command checks the current buffer's indentation against major mode mode. It indents the current buffer according to mode and compares the results with the current indentation. Then it pops up a buffer showing the differences. Correct indentation (target) is shown in green, and current indentation is shown in red.
It is also helpful to use treesit-inspect-mode (Language Grammar) when writing indentation rules.
Desktop Save Mode
Desktop Save Mode is a feature to save the state of Emacs from one session to another. The user-level commands for using Desktop Save Mode are described in the GNU Emacs Manual (Saving Emacs Sessions). Modes whose buffers visit a file don't have to do anything to use this feature. For buffers not visiting a file to have their state saved, the major mode must bind the buffer-local variable desktop-save-buffer to a non-nil value.
- desktop-save-buffer: If this buffer-local variable is non-nil, the buffer will have its state saved in the desktop file at desktop save. If the value is a function, it is called at desktop save with argument desktop-dirname, and its value is saved in the desktop file along with the state of the buffer for which it was called. When file names are returned as part of the auxiliary information, they should be formatted using the call
(desktop-file-name FILE-NAME DESKTOP-DIRNAME)
For buffers not visiting a file to be restored, the major mode must define a function to do the job, and that function must be listed in the alist desktop-buffer-mode-handlers.
- desktop-buffer-mode-handlers: Alist with elements
(MAJOR-MODE . RESTORE-BUFFER-FUNCTION)
The function restore-buffer-function will be called with argument list
(BUFFER-FILE-NAME BUFFER-NAME DESKTOP-BUFFER-MISC)
and it should return the restored buffer. Here desktop-buffer-misc is the value returned by the function optionally bound to desktop-save-buffer.
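The following sketch shows how a major mode for buffers that do not visit files might take part in Desktop Save Mode. The mode sample-notes-mode, the variable sample-notes-topic, and both helper functions are hypothetical; only desktop-save-buffer and desktop-buffer-mode-handlers come from the description above.

```elisp
(require 'desktop)

(defvar-local sample-notes-topic nil
  "Hypothetical per-buffer state worth saving across sessions.")

(defun sample-notes-desktop-save (_desktop-dirname)
  "Return the auxiliary data to store in the desktop file."
  (list :topic sample-notes-topic))

(define-derived-mode sample-notes-mode special-mode "Notes"
  "Hypothetical major mode for buffers that do not visit a file."
  ;; A function value is called at desktop save time with the
  ;; desktop directory as its argument.
  (setq-local desktop-save-buffer #'sample-notes-desktop-save))

(defun sample-notes-desktop-restore (_file-name buffer-name misc)
  "Recreate the buffer BUFFER-NAME from MISC and return it."
  (with-current-buffer (get-buffer-create buffer-name)
    (sample-notes-mode)
    ;; Set the state after enabling the mode, since turning on a
    ;; major mode kills buffer-local variables.
    (setq sample-notes-topic (plist-get misc :topic))
    (current-buffer)))

(add-to-list 'desktop-buffer-mode-handlers
             '(sample-notes-mode . sample-notes-desktop-restore))
```

Here the auxiliary data is a plain property list; a mode that stores file names in it would format them with desktop-file-name as described above.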