groff_char (7)
NAME
groff_char - groff glyph names
DESCRIPTION
This manual page lists the standard groff glyph names and the default input mapping, latin-1.
The glyphs in this document will look different depending on which output device was chosen (with option -T for the man(1) program or the roff formatter).
Glyphs not available for the device that is being used to print or view this manual page will be marked with `(N/A)'.
In the current version, groff provides only 8-bit characters for direct input and named entities for further glyphs.
On ASCII platforms, input character codes in the range 0 to 127 (decimal) represent the usual 7-bit ASCII characters, while codes from 128 to 255 are interpreted as the corresponding characters in the Latin-1 (ISO-8859-1) code set by default.
This mapping is contained in the file latin1.tmac and can be changed by loading a different input encoding.
Note that some of the input characters are reserved by groff, either for internal use or for special input purposes.
On EBCDIC platforms, only code page cp1047 is supported (which contains the same characters as Latin-1; the input encoding file is called cp1047.tmac).
Again, some input characters are reserved for internal and special purposes.
Setting up other 8-bit encodings such as Latin-2 is rather straightforward for the experienced user; since groff will use Unicode in the next major version, no additional encodings are provided.
All roff systems provide the concept of named glyphs.
In traditional roff systems, only names of length 2 were used, while
groff also provides support for longer names.
It is strongly suggested that only named glyphs are used for all
character representations outside of the printable 7-bit ASCII range.
Some of the predefined groff escape sequences (with names of length 1)
also produce single characters; these exist for historical reasons or
are printable versions of syntactical characters.
They include `\\', `\'', `\`', `\-', `\.', and `\e'; see groff(7).
In groff, all of these different types of characters and glyphs can be
tested positively with the `.if c' conditional.
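The availability test can be used to build a small probe document. The following is only a sketch, written in Python, that generates groff input; the helper name availability_probe is hypothetical, and the generated lines assume the usual `.ie c'/`.el' pairing for the `c' conditional described above. Each glyph is printed if the device has it and marked `(N/A)' otherwise.

```python
def availability_probe(glyph_names):
    """Return groff source lines that print each named glyph or `(N/A)'.

    Hypothetical helper: it only assembles text, it does not run groff.
    """
    lines = []
    for name in glyph_names:
        escape = "\\[" + name + "]"  # bracket form of the glyph escape
        lines.append(".ie c " + escape + " " + name + ": " + escape)
        lines.append(".el " + name + ": (N/A)")
        lines.append(".br")
    return lines


if __name__ == "__main__":
    # `ru', `bv', and `*f' are glyph names mentioned in this manual page.
    print("\n".join(availability_probe(["ru", "bv", "*f"])))
```

The output is meant to be piped into the formatter for the device of interest; the result naturally differs from device to device.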
REFERENCE
In this section, the glyphs in groff are specified in tabular
form.
The meaning of the columns is as follows.
Output
shows how the glyph is printed for the current device; although
this can have quite a different shape on other devices, it always
represents the same glyph.
Input name
specifies how the glyph is input either directly by a key on the
keyboard, or by a groff escape sequence.
Input code
applies to glyphs which can be input with a single character, and
gives the ISO Latin-1 decimal code of that input character.
Note that these codes coincide with the first 256 Unicode code points, which include 7-bit ASCII in the range 0 to 127.
PostScript name
gives the usual PostScript name of the glyph.
Unicode decomposed
is the glyph name used in composite glyph names.
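The relation between the `Input code' column and Unicode can be checked outside of groff. The sketch below uses Python (not part of groff) and a hypothetical helper latin1_char; it relies only on the fact that the Latin-1 code of a character equals its Unicode code point.

```python
def latin1_char(code):
    """Return the character for an ISO Latin-1 decimal input code (0-255)."""
    if not 0 <= code <= 255:
        raise ValueError("Latin-1 input codes lie in the range 0 to 255")
    return bytes([code]).decode("latin-1")


if __name__ == "__main__":
    # Every Latin-1 code maps to the Unicode code point of the same value.
    assert all(ord(latin1_char(c)) == c for c in range(256))
    print(latin1_char(65), latin1_char(233))
```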
7-bit Character Codes 32-126
These are the basic glyphs having 7-bit ASCII code values assigned.
They are identical to the printable characters of the
character standards ISO-8859-1 (Latin-1) and Unicode (range
C0 Controls and Basic Latin).
The glyph names used in composite glyph names are `u0020' up to `u007E'.
Note that input characters in the range 0-31 and character 127 are not printable characters.
Most of them are invalid input characters for groff anyway, and the valid ones have special meaning.
For EBCDIC, the printable characters are in the range 66-255.
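The `u0020' to `u007E' names mentioned above follow a simple pattern: the letter `u' followed by the code point as four uppercase hexadecimal digits. A minimal Python sketch (the function name unicode_glyph_name is an assumption made for illustration, and it is deliberately restricted to the printable ASCII range discussed here):

```python
def unicode_glyph_name(ch):
    """Return the `uXXXX' name for a printable 7-bit ASCII character."""
    code = ord(ch)
    if not 32 <= code <= 126:
        raise ValueError("only codes 32-126 are covered by this sketch")
    return "u%04X" % code


if __name__ == "__main__":
    for ch in (" ", "A", "~"):
        print(repr(ch), unicode_glyph_name(ch))
```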
48-57
Decimal digits 0 to 9 (print as themselves).
65-90
Upper case letters A-Z (print as themselves).
97-122
Lower case letters a-z (print as themselves).
Most of the remaining characters outside the ranges just described print as
themselves; the only exceptions are the following characters:
`
the ISO Latin-1 `Grave Accent' (code 96) prints as `, a left single
quotation mark; the original character can be obtained with `\`'.
'
the ISO Latin-1 `Apostrophe' (code 39) prints as ', a right single
quotation mark; the original character can be obtained with `\(aq'.
-
the ISO Latin-1 `Hyphen, Minus Sign' (code 45) prints as a hyphen; a
minus sign can be obtained with `\-'.
~
the ISO Latin-1 `Tilde' (code 126) is reduced in size to be usable as
a diacritic; a larger glyph can be obtained with `\(ti'.
^
the ISO Latin-1 `Circumflex Accent' (code 94) is reduced in size to be
usable as a diacritic; a larger glyph can be obtained with `\(ha'.
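When text must reach the output with the original ASCII shapes (for example, program source shown in a manual), the replacements listed above can be applied mechanically. The following Python sketch is one way to do that; the table and the function name literal_ascii are assumptions for illustration, and input containing backslashes is not handled. Whether a `-' should become `\-' depends on whether a minus sign or a hyphen is wanted.

```python
# Escapes taken from the list above; everything else is passed through.
ASCII_ESCAPES = {
    "`": "\\`",    # grave accent instead of left single quotation mark
    "'": "\\(aq",  # apostrophe instead of right single quotation mark
    "-": "\\-",    # minus sign instead of hyphen
    "~": "\\(ti",  # full-size tilde
    "^": "\\(ha",  # full-size circumflex
}


def literal_ascii(text):
    """Rewrite text so the five special ASCII characters keep their shape."""
    return "".join(ASCII_ESCAPES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(literal_ascii("ls -l ~/bin"))
```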
Output   Input name   Input code   PostScript name   Unicode decomposed   Notes
8-bit Character Codes 160 to 255
They are interpreted as printable characters according to the Latin-1 (ISO-8859-1) code set, being identical to the Unicode range C1 Controls and Latin-1 Supplement.
Input characters in the range 128-159 (on non-EBCDIC hosts) are not printable characters.
160
the ISO Latin-1 no-break space is mapped to `\~', the stretchable space character.
173
the soft hyphen control character.
groff never uses this character for output (thus it is omitted in the table below); the input character 173 is mapped onto `\%'.
The remaining ranges (161-172, 174-255)
are printable characters that print as themselves.
Although they can be specified directly with the keyboard on systems
with a Latin-1 code page, it is better to use their glyph names;
see next section.
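The treatment of the upper half of the 8-bit range can be summarized as a small lookup. This Python sketch simply restates the rules above for non-EBCDIC hosts; the function name describe_input_code and the returned wording are illustrative assumptions.

```python
def describe_input_code(code):
    """Describe how an input code in the range 128-255 is handled."""
    if not 128 <= code <= 255:
        raise ValueError("this sketch covers codes 128 to 255 only")
    if code <= 159:
        return "not printable"
    if code == 160:
        return "mapped to \\~"   # no-break space
    if code == 173:
        return "mapped to \\%"   # soft hyphen
    return "prints as itself"    # 161-172 and 174-255


if __name__ == "__main__":
    for code in (130, 160, 173, 233):
        print(code, describe_input_code(code))
```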
Output   Input name   Input code   PostScript name   Unicode decomposed   Notes
Named Glyphs
Glyph names can be embedded into the document text by using escape sequences.
groff(7) describes how these escape sequences look.
Glyph names can consist of quite arbitrary characters from the ASCII or Latin-1 code set, not only alphanumeric characters.
Here are some examples:
\c
A glyph having the name c, which consists of a single character (length 1).
\(ch
A glyph having the 2-character name ch.
\[char_name]
A glyph having the name char_name (having length 1, 2, 3, ...).
\[base_glyph composite_1 composite_2 ...]
A composite glyph; see below for a more detailed description.
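The forms listed above can be produced by a small helper when groff input is generated by a program. The Python sketch below is an illustration only: the function names glyph_escape and composite_escape are hypothetical, the bracket form is used for every length except 2, and names containing `]' or white space are rejected as a simplifying assumption of this sketch rather than a documented groff rule.

```python
def _check(name):
    # Assumption of this sketch: refuse names that would break the bracket form.
    if not name or "]" in name or any(ch.isspace() for ch in name):
        raise ValueError("unsupported glyph name: %r" % name)


def glyph_escape(name):
    """Return `\\(xx' for 2-character names and `\\[name]' otherwise."""
    _check(name)
    if len(name) == 2:
        return "\\(" + name
    return "\\[" + name + "]"


def composite_escape(base, *components):
    """Return the composite form `\\[base comp1 comp2 ...]'."""
    for part in (base,) + components:
        _check(part)
    return "\\[" + " ".join((base,) + components) + "]"


if __name__ == "__main__":
    print(glyph_escape("ch"), glyph_escape("char_name"))
    print(composite_escape("base_glyph", "composite_1", "composite_2"))
```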
In groff, each 8-bit input character can also be referred to by the construct `\[charn]', where n is the decimal code of the character, a number between 0 and 255 without leading zeros (those entities are not glyph names).
They are normally mapped onto glyphs using the .trin request.
Another special convention is the handling of glyphs with names directly derived from a Unicode code point; this is discussed below.
Moreover, new glyph names can be created by the .char request; see groff(7).
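As an illustration of the `\[charn]' construct, the following Python sketch builds the entity for a given input code. The function name char_entity is hypothetical; the only rules encoded are the ones stated above (decimal, 0 to 255, no leading zeros).

```python
def char_entity(code):
    """Return the `\\[charN]' entity for an 8-bit input character code."""
    if not 0 <= code <= 255:
        raise ValueError("input character codes lie in the range 0 to 255")
    return "\\[char%d]" % code  # %d never emits leading zeros


if __name__ == "__main__":
    print(char_entity(233))
```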
In the following, a plus sign in the `Notes' column indicates that this
particular glyph name appears in the PS version of the original troff
documentation, CSTR 54.
Output   Input name   PostScript name   Unicode decomposed   Notes
Ligatures and Other Latin Glyphs
Accented Characters
Accents
The composite request is used to map most of the accents to non-spacing glyph names;
the values given in parentheses are the original (spacing) ones.
Output   Input name   PostScript name   Unicode decomposed   Notes
Quotes
Punctuation
Brackets
The extensible bracket pieces are font-invariant glyphs.
In classical troff only one glyph was available to vertically extend
brackets, braces, and parentheses: `bv'.
We map it rather arbitrarily to u23AA.
Note that not all devices contain extensible bracket pieces which can
be piled up with `\b' due to the restrictions of the escape's
piling algorithm.
A general solution to build brackets out of pieces is the following
macro:
.\" Make a pile centered vertically 0.5em
.\" above the baseline.
.\" The first argument is placed at the top.
.\" The pile is returned in string `pile'
.eo
.de pile-make
.  nr pile-wd 0
.  nr pile-ht 0
.  ds pile-args
.
.  nr pile-# \n[.$]
.  while \n[pile-#] \{\
.    nr pile-wd (\n[pile-wd] >? \w'\$[\n[pile-#]]')
.    nr pile-ht +(\n[rst] - \n[rsb])
.    as pile-args \v'\n[rsb]u'\"
.    as pile-args \Z'\$[\n[pile-#]]'\"
.    as pile-args \v'-\n[rst]u'\"
.    nr pile-# -1
.  \}
.
.  ds pile \v'(-0.5m + (\n[pile-ht]u / 2u))'\"
.  as pile \*[pile-args]\"
.  as pile \v'((\n[pile-ht]u / 2u) + 0.5m)'\"
.  as pile \h'\n[pile-wd]u'\"
..
.ec
Another complication is the fact that some glyphs which represent bracket
pieces in original troff can be used for other mathematical symbols also,
for example `lf' and `rf' which provide the `floor' operator.
Other devices (most notably for DVI output) don't unify such glyphs.
For this reason, the four glyphs `lf', `rf', `lc', and `rc' are not
unified with similar-looking bracket pieces.
In groff, only glyphs with long names are guaranteed to pile up correctly for all
devices (provided those glyphs exist).
Output   Input name   PostScript name   Unicode decomposed   Notes
Arrows
Lines
The font-invariant glyphs `br', `ul', and `rn' form corners;
they can be used to build boxes.
Note that both the PostScript and the Unicode-derived names of
these three glyphs are just rough approximations.
`rn' also serves in classical troff as the horizontal extension of the
square root sign.
`ru' is a font-invariant glyph, namely a rule of length 0.5m.
Output   Input name   PostScript name   Unicode decomposed   Notes
Text markers
Legal Symbols
Currency symbols
Units
Logical Symbols
Mathematical Symbols
Greek characters
These glyphs are intended for technical use, not for real Greek; normally,
the uppercase letters have upright shape, and the lowercase ones are
slanted.
There is a problem with the mapping of letter phi to Unicode.
Prior to Unicode version 3.0, the difference between U+03C6, GREEK
SMALL LETTER PHI, and U+03D5, GREEK PHI SYMBOL, was not clearly described;
only the glyph shapes in the Unicode book could be used as a reference.
Starting with Unicode 3.0, the reference glyphs have been exchanged and are
also described verbally: in mathematical contexts, U+03D5 is the stroked
variant and U+03C6 the curly glyph.
Unfortunately, most font vendors didn't update their fonts to
this (incompatible) change in Unicode.
At the time of this writing (February 2003), it is not yet clear whether
the Adobe Glyph Names `phi' and `phi1' also change their meaning if used for
mathematics, so compatibility problems are likely.
To be conservative, groff currently assumes that `phi' in a PostScript symbol
font is the stroked version.
In groff, symbol `\[*f]' always denotes the stroked version of
phi, and `\[+f]' the curly variant.
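The two code points involved can be inspected with any tool that knows the Unicode character names. The Python sketch below pairs the groff names with the code points according to the convention just stated (`*f' stroked, `+f' curly); the dictionary PHI_GLYPHS is an illustrative assumption, not a table taken from groff itself.

```python
import unicodedata

# groff glyph name -> Unicode character, per the convention stated above.
PHI_GLYPHS = {
    "*f": "\u03d5",  # stroked phi in mathematical contexts
    "+f": "\u03c6",  # curly phi in mathematical contexts
}

if __name__ == "__main__":
    for name, ch in PHI_GLYPHS.items():
        print("\\[%s]  U+%04X  %s" % (name, ord(ch), unicodedata.name(ch)))
```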
Card symbols
SEE ALSO
groff(7)
a short reference of the groff formatting language.
An extension to the troff character set for Europe, E.G. Keizer, K.J. Simonsen, J. Akkerhuis; EUUG Newsletter, Volume 9, No. 2, Summer 1989.
The Unicode Standard <http://www.unicode.org>