[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5. Character Set for Text

Text in Emacs buffers is a sequence of characters. In the simplest case, these are ASCII characters, each stored in one 8-bit byte. Both ASCII control characters (octal codes 000 through 037, and 0177) and ASCII printing characters (codes 040 through 0176) are allowed. The other modifier flags used in keyboard input, such as Meta, are not allowed in buffers.

Non-ASCII printing characters can also appear in buffers, when multibyte characters are enabled. They have character codes starting at 256, octal 0400, and each one is represented as a sequence of two or more bytes. See section International Character Set Support. Single-byte characters with codes 128 through 255 can also appear in multibyte buffers. However, non-ASCII control characters cannot appear in a buffer.

Some ASCII control characters serve special purposes in text, and have special names. For example, the newline character (octal code 012) is used in the buffer to end a line, and the tab character (octal code 011) is used for indenting to the next tab stop column (normally every 8 columns). See section How Text Is Displayed.

If you disable multibyte characters, then you can use only one alphabet of non-ASCII characters, which all fit in one byte. They use octal codes 0200 through 0377. See section Unibyte Editing Mode.


[ << ] [ >> ]           [Top] [Contents] [Index] [ ? ]

This document was generated by Mark Kaminski on July, 3 2008 using texi2html 1.70.