An input method is a kind of character conversion designed specifically for interactive input. This section describes input methods that come with Emacs; for native input methods provided by the underlying OS, see Unibyte Editing Mode.
In Emacs, typically each language has its own input method; sometimes several languages that use the same characters can share one input method. A few languages support several input methods.
The simplest kind of input method works by mapping ASCII letters into another alphabet; this allows you to use one other alphabet instead of ASCII. The Greek and Russian input methods work this way.
A more powerful technique is composition: converting sequences of characters into one letter. Many European input methods use composition to produce a single non-ASCII letter from a sequence that consists of a letter followed by accent characters (or vice versa). For example, some methods convert the sequence o ^ into a single accented letter. These input methods have no special commands of their own; all they do is compose sequences of printing characters.
The input methods for syllabic scripts typically use mapping followed by composition. The input methods for Thai and Korean work this way. First, letters are mapped into symbols for particular sounds or tone marks; then, sequences of these that make up a whole syllable are mapped into one syllable sign.
Chinese and Japanese require more complex methods. In Chinese input
methods, first you enter the phonetic spelling of a Chinese word (in
input method chinese-py
, among others), or a sequence of
portions of the character (input methods chinese-4corner
and
chinese-sw
, and others). One input sequence typically
corresponds to many possible Chinese characters. You select the one
you mean using keys such as C-f, C-b, C-n,
C-p (or the arrow keys), and digits, which have special meanings
in this situation.
The possible characters are conceptually arranged in several rows,
with each row holding up to 10 alternatives. Normally, Emacs displays
just one row at a time, in the echo area; (i/j)
appears at the beginning, to indicate that this is the ith row
out of a total of j rows. Type C-n or C-p to
display the next row or the previous row.
Type C-f and C-b to move forward and backward among the alternatives in the current row. As you do this, Emacs highlights the current alternative with a special color; type C-SPC to select the current alternative and use it as input. The alternatives in the row are also numbered; the number appears before the alternative. Typing a number selects the associated alternative of the current row and uses it as input.
TAB in these Chinese input methods displays a buffer showing all the possible characters at once; then clicking mouse-2 on one of them selects that alternative. The keys C-f, C-b, C-n, C-p, and digits continue to work as usual, but they do the highlighting in the buffer showing the possible characters, rather than in the echo area.
To enter characters according to the pīnyīn transliteration
method instead, use the chinese-sisheng
input method. This is
a composition based method, where e.g. pi1 results in ‘pī’.
In Japanese input methods, first you input a whole word using phonetic spelling; then, after the word is in the buffer, Emacs converts it into one or more characters using a large dictionary. One phonetic spelling corresponds to a number of different Japanese words; to select one of them, use C-n and C-p to cycle through the alternatives.
Sometimes it is useful to cut off input method processing so that the
characters you have just entered will not combine with subsequent
characters. For example, in input method latin-1-postfix
, the
sequence o ^ combines to form an ‘o’ with an accent. What if
you want to enter them as separate characters?
One way is to type the accent twice; this is a special feature for
entering the separate letter and accent. For example, o ^ ^ gives
you the two characters ‘o^’. Another way is to type another letter
after the o—something that won’t combine with that—and
immediately delete it. For example, you could type o o DEL
^ to get separate ‘o’ and ‘^’. Another method, more
general but not quite as easy to type, is to use C-\ C-\ between
two characters to stop them from combining. This is the command
C-\ (toggle-input-method
) used twice.
See Selecting an Input Method.
C-\ C-\ is especially useful inside an incremental search, because it stops waiting for more characters to combine, and starts searching for what you have already entered.
To find out how to input the character after point using the current input method, type C-u C-x =. See Cursor Position Information.
The variables input-method-highlight-flag
and
input-method-verbose-flag
control how input methods explain
what is happening. If input-method-highlight-flag
is
non-nil
, the partial sequence is highlighted in the buffer (for
most input methods—some disable this feature). If
input-method-verbose-flag
is non-nil
, the list of
possible characters to type next is displayed in the echo area (but
not when you are in the minibuffer).
You can modify how an input method works by making your changes in a
function that you add to the hook variable quail-activate-hook
.
See Hooks. For example, you can redefine some of the input
method’s keys by defining key bindings in the keymap returned by the
function quail-translation-keymap
, using define-key
.
See Rebinding Keys in Your Init File.
Input methods are inhibited when the text in the buffer is read-only
for some reason. This is so single-character key bindings work in
modes that make buffer text or parts of it read-only, such as
read-only-mode
and image-mode
, even when an input method
is active.
Another facility for typing characters not on your keyboard is by
using C-x 8 RET (insert-char
) to insert a single
character based on its Unicode name or code-point; see Inserting Text.
There are specialized commands for inserting Emoji, and these can be
found on the C-x 8 e keymap. C-x 8 e e
(emoji-insert
) will let you navigate through different Emoji
categories and then choose one. C-x 8 e l (emoji-list
)
will pop up a new buffer and list all the Emoji; clicking (or using
RET) on an emoji character will insert it in the current buffer.
Finally, C-x 8 e s (emoji-search
) will allow you to
search for Emoji based on their names.
describe-char
displays a lot of information about the
character/glyphs under point (including emojis). It’s sometimes
useful to get a quick description of the name, and you can use the
C-x 8 e d (emoji-describe
) command to do that. It’s
meant primarily to help distinguish between different Emoji
variants (which can look very similar), but it will also tell you
the names of non-Emoji characters.