NAME
iconv_ja - code set conversions in ja locale
DESCRIPTION
The following code set conversions are supported:
____________________________________________________________
| Code Set Conversions Supported |
| Source Code | Target Code |
| eucJP | PCK |
| eucJP | ISO-2022-JP |
| eucJP | ISO-2022-JP.RFC1468 |
| eucJP | JIS7 |
| eucJP | SJIS |
| eucJP | UTF-8 |
| eucJP | UTF-8-Java |
| eucJP | jis |
| eucJP | ibmj |
| eucJP | ibmj-EBCDIK |
| SJIS | eucJP |
| SJIS | ISO-2022-JP |
| SJIS | UTF-8 |
| SJIS | jis |
| SJIS | ibmj |
| PCK | eucJP |
| PCK | UTF-8 |
| PCK | UTF-8-Java |
| PCK | ISO-2022-JP |
| PCK | ISO-2022-JP.RFC1468 |
| PCK | jis |
| PCK | ibmj |
| PCK | ibmj-EBCDIK |
| ISO-2022-JP | eucJP |
| ISO-2022-JP | PCK |
| ISO-2022-JP | SJIS |
| ISO-2022-JP | UTF-8 |
| UTF-8 | eucJP |
| UTF-8 | SJIS |
| UTF-8 | PCK |
| UTF-8 | ISO-2022-JP |
| UTF-8 | ISO-2022-JP.RFC1468 |
| UTF-8-Java | eucJP |
| UTF-8-Java | PCK |
| JIS7 | eucJP |
| jis | eucJP |
| jis | PCK |
| jis | SJIS |
| ibmj | eucJP |
| ibmj | PCK |
| ibmj | SJIS |
| ibmj-EBCDIK | eucJP |
| ibmj-EBCDIK | PCK |
|______________________|____________________________________|
____________________________________________________________
| Code Set Conversions Supported |
| Source Code | Target Code |
| eucJP | ibm930 |
| eucJP | ibm931 |
| eucJP | ibm939 |
| eucJP | ibm5026 |
| eucJP | ibm5035 |
| PCK | ibm930 |
| PCK | ibm931 |
| PCK | ibm939 |
| PCK | ibm5026 |
| PCK | ibm5035 |
| UTF-8 | ibm930 |
| UTF-8 | ibm931 |
| UTF-8 | ibm939 |
| UTF-8 | ibm5026 |
| UTF-8 | ibm5035 |
| UTF-8 | ms932 |
| UTF-8 | UTF-8-ms932 |
| UTF-8-ms932 | UTF-8 |
| ibm930 | eucJP |
| ibm930 | PCK |
| ibm930 | UTF-8 |
| ibm931 | eucJP |
| ibm931 | PCK |
| ibm931 | UTF-8 |
| ibm939 | eucJP |
| ibm939 | PCK |
| ibm939 | UTF-8 |
| ibm5026 | eucJP |
| ibm5026 | PCK |
| ibm5026 | UTF-8 |
| ibm5035 | eucJP |
| ibm5035 | PCK |
| ibm5035 | UTF-8 |
| ms932 | UTF-8 |
|_____________________|_____________________________________|
The code sets in the tables above are described below:
____________________________________________________________
| Description of Supported Code Sets |
| Codeset | Description |
| eucJP | Japanese EUC |
| PCK | PC kanji |
| SJIS | Same as PCK (to reach end of life in |
| | a future release) |
| ISO-2022-JP | Coded representation of the character |
| | sets ISO 646 IRV or JIS X 0201, JIS X |
| | 0208, and JIS X 0212 according to |
| | UI/OSF Application Platform Profile |
| | for Japanese Environment Version 1.1, |
| | item 7.1, using the designation |
| | sequence to G0 specified by ISO 2022 |
| JIS7 | Same as ISO-2022-JP |
| ISO-2022-JP.RFC1468 | Coded representation of the character |
| | sets ISO 646 IRV or JIS X 0201-1976 |
| | (except for figure character set for |
| | katakana) and JIS X 0208-1983 |
| | according to RFC 1468 (Request for |
| | Comments: 1468, Japanese Character |
| | Encoding for Internet Messages), |
| | using the designation sequence to G0 |
| | specified by ISO 2022 |
| jis | JIS 7-bit code used in JLE, JFP 2.4, |
| | and the preceding releases |
| ibmj | IBM Kanji code |
| ibmj-EBCDIK | Maps the single-byte code set (SBCS) |
| | of IBM host code to the character set |
| | generally called the EBCDIK code set. |
| | This code set includes IBM code page |
| | 290 and three more characters: '`' |
| | (0x79), '{' (0xc0), and '}' (0xd0). |
| | Japanese katakana characters are |
| | included, but lowercase letters are |
| | not. For the double-byte code set |
| | (DBCS), the description is the same |
| | as for the code set "ibmj". |
| UTF-8 | Unicode |
| UTF-8-Java | Unicode as implemented in Java |
|_____________________|______________________________________|
____________________________________________________________
| Description of Supported Code Sets |
| Codeset | Description |
| ibm930 | IBM CCSID 930: SBCS code page 290 |
| | (extended), character set 1172, DBCS |
| | code page 300, character set 1001, |
| | 4370 user-defined characters |
| ibm931 | IBM CCSID 931: SBCS code page 37, |
| | character set 101, DBCS code page |
| | 300, character set 1001, |
| | 4370 user-defined characters |
| ibm939 | IBM CCSID 939: SBCS code page 1027, |
| | character set 1172, DBCS code page |
| | 300, character set 1001, |
| | 4370 user-defined characters |
| ibm5026 | IBM CCSID 5026: same as ibm930, |
| | except this code set supports 1880 |
| | user-defined characters |
| ibm5035 | IBM CCSID 5035: same as ibm939, |
| | except this code set supports 1880 |
| | user-defined characters |
| ms932 | Shift JIS codeset supported by |
| | Windows NT 3.51. Conversion between |
| | this codeset and UTF-8 is done in |
| | the same way as in Windows NT 3.51. |
| UTF-8-ms932 | UTF-8 encoded Unicode converted |
| | from ms932 |
|____________________|______________________________________|
Conversions are performed as described below. For all
conversions, if the source code set includes characters not
included in the target code set, each such character is
converted to, and output as, a substitute character.
eucJP to PCK (SJIS) and PCK (SJIS) to eucJP
Conversion between eucJP and PCK (SJIS) can be used to convert
JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
vendor-defined characters, based on the TOG Japanese Vendors
Council (TOG/JVC) Recommended Code Set Conversion Specification
between Japanese EUC and Shift-JIS. If input data that does not
belong to the source code set is encountered, iconv(3C) fails
and sets errno to EILSEQ. iconv(1) stops at the last point of
successful conversion.
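The stop-at-the-last-good-byte behavior can be sketched outside of iconv. The following is a minimal Python illustration, not the Solaris implementation. It assumes that Python's built-in euc_jp and shift_jis codecs are acceptable stand-ins for eucJP and PCK, and the helper name convert_prefix is hypothetical. It converts the longest valid prefix of the input and reports the offset at which conversion stopped, which is roughly what a caller sees when iconv(3C) fails with EILSEQ.

```python
def convert_prefix(data: bytes, src: str = "euc_jp", dst: str = "shift_jis"):
    """Convert the longest valid prefix of data from src to dst.

    Returns (converted_bytes, offset), where offset is the number of
    input bytes consumed before the first invalid sequence.
    """
    try:
        text = data.decode(src)
        consumed = len(data)
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be
        # decoded; everything before it converted cleanly.
        consumed = err.start
        text = data[:consumed].decode(src)
    return text.encode(dst), consumed


# 0xA4 0xA2 is HIRAGANA LETTER A in EUC-JP; 0xFF is not valid EUC-JP.
converted, offset = convert_prefix(b"\xa4\xa2\xff")
print(converted.hex(), offset)
```

The sketch handles only invalid input. A character that is valid in the source but missing from the target would raise UnicodeEncodeError here, whereas the conversions on this page substitute a replacement character instead.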
eucJP to ISO-2022-JP(JIS7) and ISO-2022-JP(JIS7) to eucJP
Conversion between eucJP and ISO-2022-JP(JIS7) can be used
to convert JIS X 0201, JIS X 0208, and JIS X 0212. If input
data that does not belong to the source code set is
encountered, iconv(3C) fails and sets errno to EILSEQ.
iconv(1) stops at the last point of successful conversion.
eucJP to ISO-2022-JP.RFC1468
Conversion from eucJP to ISO-2022-JP.RFC1468 can be used to
convert JIS X 0201 (except for figure character set for
katakana) and JIS X 0208. If a JIS X 0201 (figure character
set for katakana), JIS X 0212, user-defined, or vendor-defined
character is encountered in the input data, it is replaced
with the substitute character '?' (0x3f). If input data that
does not belong to these code sets is encountered, iconv(3C)
fails and sets errno to EILSEQ. iconv(1) stops at the last
point of successful conversion.
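The katakana substitution can be illustrated with a short Python sketch. It assumes that Python's built-in iso2022_jp codec, which follows RFC 1468, behaves closely enough to ISO-2022-JP.RFC1468 for illustration, and that the codec's "replace" error handler (which emits '?') approximates the substitution described here. It is not the Solaris conversion table.

```python
# EUC-JP input: HIRAGANA LETTER A (0xA4 0xA2) followed by the
# half-width katakana A (SS2 byte 0x8E, then 0xB1).
euc_input = b"\xa4\xa2\x8e\xb1"

text = euc_input.decode("euc_jp")

# errors="replace" makes the encoder emit '?' for characters that the
# target code set cannot represent, instead of raising an error.
jis_output = text.encode("iso2022_jp", errors="replace")

# Decoding the result shows which characters survived the conversion.
round_trip = jis_output.decode("iso2022_jp")
print(round_trip)
```

The hiragana character is carried over through a JIS X 0208 designation sequence, while the half-width katakana, which RFC 1468 does not allow, comes out as '?'.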
eucJP to jis and jis to eucJP
Conversion between eucJP and jis is provided for compatibility
with the ujtojis7() and jis7touj() library functions and the
euctojis and jistoeuc utilities. It is extended to handle
JIS X 0212. See jisconv(3X) and jistoeuc(1).
eucJP to UTF-8 and UTF-8 to eucJP
Conversion between eucJP and UTF-8 can be used to convert
JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
vendor-defined characters. If input data that has no
corresponding character in the target code set is
encountered, it is replaced with the substitute character
(eucJP: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data
that does not belong to these code sets is encountered,
iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at
the last point of successful conversion.
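The two substitute characters and their byte values can be checked with a small Python sketch. It assumes Python's built-in euc_jp codec as a stand-in for eucJP and uses the codec's "replace" error handler to approximate the substitution. The actual Solaris mapping tables may treat individual characters differently.

```python
# U+FFFD REPLACEMENT CHARACTER is three bytes in UTF-8.
fffd_bytes = "\ufffd".encode("utf-8")

# UTF-8 to EUC-JP: U+20AC EURO SIGN has no counterpart in Python's
# euc_jp codec, so the "replace" handler substitutes '?' (0x3f).
utf8_input = "1\u20ac".encode("utf-8")
euc_output = utf8_input.decode("utf-8").encode("euc_jp", errors="replace")

print(fffd_bytes.hex(), euc_output)
```

The reverse direction cannot be shown the same way, because every character that Python's euc_jp codec can decode has a Unicode mapping. The sketch therefore shows only the byte sequence that U+FFFD would occupy in UTF-8 output.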
eucJP to UTF-8-Java and UTF-8-Java to eucJP
Conversion between eucJP and UTF-8-Java can be used to convert
JIS X 0201, JIS X 0208, and JIS X 0212. If a user-defined or
vendor-defined character is encountered in the input data, it
is replaced with the substitute character (eucJP: '?' (0x3f),
UTF-8: U+FFFD (0xefbfbd)). If input data that does not belong
to these code sets is encountered, iconv(3C) fails and sets
errno to EILSEQ. iconv(1) stops at the last point of
successful conversion.
eucJP to ibmj and ibmj to eucJP
Conversion between eucJP and ibmj is based on the IBM Kanji
codebook (4th edition - September 1987), JIS X 0201, and
JIS X 0208-1983. If you convert eucJP to ibmj, JISX 0201
and JIS X 0201 are all converted to the substitute character.
See ibmjcode(3X).
eucJP to ibmj-EBCDIK and ibmj-EBCDIK to eucJP
Conversion between eucJP and ibmj-EBCDIK is based on the IBM
Kanji codebook (4th edition - September 1987), JIS X 0201,
and JIS X 0208-1983. If you convert eucJP to ibmj-EBCDIK,
JISX 0201 and JIS X 0201 characters that have no corresponding
characters in ibmj-EBCDIK are all converted to the substitute
character.
PCK (SJIS) to ISO-2022-JP and ISO-2022-JP to PCK (SJIS)
Conversion between PCK (SJIS) and ISO-2022-JP can be used to
convert JIS X 0201, JIS X 0208, JIS X 0212, and user-defined
and vendor-defined characters, based on the TOG Japanese
Vendors Council (TOG/JVC) Recommended Code Set Conversion
Specification between Japanese EUC and Shift-JIS. If input
data that does not belong to the source code set is
encountered, iconv(3C) fails and sets errno to EILSEQ.
iconv(1) stops at the last point of successful conversion.
PCK (SJIS) to ISO-2022-JP.RFC1468
Conversion from PCK (SJIS) to ISO-2022-JP.RFC1468 can be
used to convert JIS X 0201 (except for figure character set
for katakana) and JIS X 0208. If a JIS X 0201 (figure
character set for katakana), user-defined, or vendor-defined
character is encountered in the input data, it is replaced
with the substitute character '?' (0x3f). If input data that
does not belong to these code sets is encountered, iconv(3C)
fails and sets errno to EILSEQ. iconv(1) stops at the last
point of successful conversion.
PCK (SJIS) to UTF-8 and UTF-8 to PCK (SJIS)
Conversion between PCK (SJIS) and UTF-8 can be used to convert
JIS X 0201, JIS X 0208, and user-defined and vendor-defined
characters. If input data that has no corresponding character
in the target code set is encountered, it is replaced with
the substitute character (PCK: '?' (0x3f), UTF-8: U+FFFD
(0xefbfbd)). If input data that does not belong to these code
sets is encountered, iconv(3C) fails and sets errno to EILSEQ.
iconv(1) stops at the last point of successful conversion.
PCK (SJIS) to UTF-8-Java and UTF-8-Java to PCK (SJIS)
Conversion between PCK (SJIS) and UTF-8-Java can be used to
convert JIS X 0201 and JIS X 0208. If a user-defined or
vendor-defined character is encountered in the input data, it
is replaced with the substitute character (PCK: '?' (0x3f),
UTF-8: U+FFFD (0xefbfbd)). If input data that does not belong
to these code sets is encountered, iconv(3C) fails and sets
errno to EILSEQ. iconv(1) stops at the last point of
successful conversion.
PCK (SJIS) to jis and jis to PCK (SJIS)
Conversion between PCK (SJIS) and jis is provided for
compatibility with the sjtojis7() and jis7tosj() library
functions and the sjtojis and jistosj utilities. It is
extended based on the TOG Japanese Vendors Council (TOG/JVC)
Recommended Code Set Conversion Specification between
Japanese EUC and Shift-JIS. See jisconv(3X) and jistosj(1).
PCK (SJIS) to ibmj and ibmj to PCK (SJIS)
Conversion between PCK (SJIS) and ibmj is based on the IBM
Kanji codebook (4th edition - September 1987), JIS X 0201,
and JIS X 0208-1983. If you convert PCK (SJIS) to ibmj, kana
characters (0xa1 to 0xdf) and all characters that the TOG
Japanese Vendors Council (TOG/JVC) Recommended Code Set
Conversion Specification between Japanese EUC and Shift-JIS
converts to JIS X 0212 are converted to the substitute
character. See ibmjcode(3X).
PCK to ibmj-EBCDIK and ibmj-EBCDIK to PCK
Conversion between PCK and ibmj-EBCDIK is based on the IBM
Kanji codebook (4th edition - September 1987), JIS X 0201,
and JIS X 0208-1983. If you convert PCK to ibmj-EBCDIK, all
characters converted to JIS X 0212 by JIS X 0212 and the TOG
Japanese Vendors Council (TOG/JVC) Recommended Code Set
Conversion Specification between Japanese EUC and Shift-JIS
are converted to the substitute character.
ISO-2022-JP to UTF-8 and UTF-8 to ISO-2022-JP
Conversion between ISO-2022-JP and UTF-8 can be used to
convert JIS X 0201, JIS X 0208, JIS X 0212, and user-defined
and vendor-defined characters. If input data that has no
corresponding character in the target code set is
encountered, it is replaced with the substitute character
(ISO-2022-JP: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input
data that does not belong to these code sets is encountered,
iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at
the last point of successful conversion.
UTF-8 to ISO-2022-JP.RFC1468
Conversion from UTF-8 to ISO-2022-JP.RFC1468 can be used to
convert JIS X 0201 (except for figure character set for
katakana) and JIS X 0208. If a JIS X 0201 (figure character
set for katakana), JIS X 0212, user-defined, or vendor-defined
character is encountered in the input data, it is replaced
with the substitute character '?' (0x3f). If input data that
does not belong to these code sets is encountered, iconv(3C)
fails and sets errno to EILSEQ. iconv(1) stops at the last
point of successful conversion.
eucJP, PCK, UTF-8 to ibm930, ibm931, ibm939, ibm5026, ibm5035
Conversion from eucJP, PCK, or UTF-8 to ibm930, ibm931,
ibm939, ibm5026, or ibm5035 can be used to convert JIS X 0201,
JIS X 0208, JIS X 0212, IBM extension characters, and
user-defined characters. Input data that has no corresponding
character in the target code set is replaced with the
substitute character. Because ibm931 does not support Kana
characters in its single-byte code set (SBCS), JIS X 0201
Kana characters are replaced with substitute characters in
conversion to ibm931.
ibm930, ibm931, ibm939, ibm5026, ibm5035 to eucJP, PCK, UTF-8
Conversion from ibm930, ibm931, ibm939, ibm5026, or ibm5035 to
eucJP, PCK, or UTF-8 can be used to convert the SBCS/DBCS
characters defined in the input code set. Input data that has
no corresponding character in the target code set is replaced
with the substitute character.
ms932 to UTF-8 and UTF-8 to ms932
Conversion between ms932 and UTF-8 uses the same character
mapping between the two codesets as Windows NT 3.51.
UTF-8 to UTF-8-ms932 and UTF-8-ms932 to UTF-8
This converts between "UTF-8", which here is UTF-8 encoded
Unicode converted from PCK, and "UTF-8-ms932", which is UTF-8
encoded Unicode converted from ms932.
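Why two flavors of UTF-8 can arise from the same Shift-JIS bytes is easiest to see with a concrete character. The Python sketch below is an illustration under stated assumptions: it uses Python's shift_jis codec as a stand-in for PCK and its cp932 codec as a stand-in for ms932, and the helper name utf8_to_utf8_ms932 is hypothetical. The Solaris tables may differ from Python's in which characters they map differently.

```python
# Shift-JIS bytes 0x81 0x60, the "wave dash" position of JIS X 0208.
sjis_bytes = b"\x81\x60"

via_pck = sjis_bytes.decode("shift_jis")  # JIS-style mapping
via_ms932 = sjis_bytes.decode("cp932")    # Microsoft-style mapping

# The same two bytes yield different Unicode code points.
print(f"U+{ord(via_pck):04X}", f"U+{ord(via_ms932):04X}")


def utf8_to_utf8_ms932(data: bytes) -> bytes:
    """Hypothetical helper: re-map UTF-8 text that was derived from
    Shift-JIS to the form that cp932 decoding would have produced."""
    text = data.decode("utf-8")
    return text.encode("shift_jis").decode("cp932").encode("utf-8")


remapped = utf8_to_utf8_ms932("\u301c".encode("utf-8"))
print(remapped.hex())
```

Going through the Shift-JIS byte form keeps the sketch short. A table-driven converter would map the handful of differing code points directly and would not fail on text that has no Shift-JIS representation.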
SEE ALSO
iconv(1), jistoeuc(1), jistosj(1), iconv(3C), jisconv(3X),
ibmjcode(3X), iconv(5), iconv_en_US.UTF-8(5),
iconv_unicode(5)