iconv_ja (5) (Solaris man: File Formats)

NAME
     iconv_ja - code set conversions in ja locale

DESCRIPTION
     The following code set conversions are supported:



     ____________________________________________________________
    |               Code Set Conversions Supported              |
    |      Source Code     |             Target Code            |
    | eucJP                |  PCK                               |
    | eucJP                |  ISO-2022-JP                       |
    | eucJP                |  ISO-2022-JP.RFC1468               |
    | eucJP                |  JIS7                              |
    | eucJP                |  SJIS                              |
    | eucJP                |  UTF-8                             |
    | eucJP                |  UTF-8-Java                        |
    | eucJP                |  jis                               |
    | eucJP                |  ibmj                              |
    | eucJP                |  ibmj-EBCDIK                       |
    | SJIS                 |  eucJP                             |
    | SJIS                 |  ISO-2022-JP                       |
    | SJIS                 |  UTF-8                             |
    | SJIS                 |  jis                               |
    | SJIS                 |  ibmj                              |
    | PCK                  |  eucJP                             |
    | PCK                  |  UTF-8                             |
    | PCK                  |  UTF-8-Java                        |
    | PCK                  |  ISO-2022-JP                       |
    | PCK                  |  ISO-2022-JP.RFC1468               |
    | PCK                  |  jis                               |
    | PCK                  |  ibmj                              |
    | PCK                  |  ibmj-EBCDIK                       |
    | ISO-2022-JP          |  eucJP                             |
    | ISO-2022-JP          |  PCK                               |
    | ISO-2022-JP          |  SJIS                              |
    | ISO-2022-JP          |  UTF-8                             |
    | UTF-8                |  eucJP                             |
    | UTF-8                |  SJIS                              |
    | UTF-8                |  PCK                               |
    | UTF-8                |  ISO-2022-JP                       |
    | UTF-8                |  ISO-2022-JP.RFC1468               |
    | UTF-8-Java           |  eucJP                             |
    | UTF-8-Java           |  PCK                               |
    | JIS7                 |  eucJP                             |
    | jis                  |  eucJP                             |
    | jis                  |  PCK                               |
    | jis                  |  SJIS                              |
    | ibmj                 |  eucJP                             |
    | ibmj                 |  PCK                               |
    | ibmj                 |  SJIS                              |
    | ibmj-EBCDIK          |  eucJP                             |
    | ibmj-EBCDIK          |  PCK                               |
    |______________________|____________________________________|



     ____________________________________________________________
    |               Code Set Conversions Supported              |
    |     Source Code     |             Target Code             |
    | eucJP               | ibm930                              |
    | eucJP               | ibm931                              |
    | eucJP               | ibm939                              |
    | eucJP               | ibm5026                             |
    | eucJP               | ibm5035                             |
    | PCK                 | ibm930                              |
    | PCK                 | ibm931                              |
    | PCK                 | ibm939                              |
    | PCK                 | ibm5026                             |
    | PCK                 | ibm5035                             |
    | UTF-8               | ibm930                              |
    | UTF-8               | ibm931                              |
    | UTF-8               | ibm939                              |
    | UTF-8               | ibm5026                             |
    | UTF-8               | ibm5035                             |
    | UTF-8               | ms932                               |
    | UTF-8               | UTF-8-ms932                         |
    | UTF-8-ms932         | UTF-8                               |
    | ibm930              | eucJP                               |
    | ibm930              | PCK                                 |
    | ibm930              | UTF-8                               |
    | ibm931              | eucJP                               |
    | ibm931              | PCK                                 |
    | ibm931              | UTF-8                               |
    | ibm939              | eucJP                               |
    | ibm939              | PCK                                 |
    | ibm939              | UTF-8                               |
    | ibm5026             | eucJP                               |
    | ibm5026             | PCK                                 |
    | ibm5026             | UTF-8                               |
    | ibm5035             | eucJP                               |
    | ibm5035             | PCK                                 |
    | ibm5035             | UTF-8                               |
    | ms932               | UTF-8                               |
    |_____________________|_____________________________________|


     The code sets in the tables above are described below:



     ____________________________________________________________
    |             Description of Supported Code Sets            |
    |       Codeset        |            Description             |
    | eucJP                | Japanese EUC                       |
    | PCK                  | PC kanji                           |
    | SJIS                 | Same as PC kanji (end of life      |
    |                      | planned for a future release)      |
    | ISO-2022-JP          | Coded representation of the        |
    |                      | character sets ISO 646 IRV or      |
    |                      | JIS X 0201, JIS X 0208, and        |
    |                      | JIS X 0212 according to UI/OSF     |
    |                      | Application Platform Profile for   |
    |                      | Japanese Environment Version 1.1   |
    |                      | item 7.1 using the designation     |
    |                      | sequence to G0 specified by        |
    |                      | ISO 2022                           |
    | JIS7                 | Same as ISO-2022-JP                |
    | ISO-2022-JP.RFC1468  | Coded representation of the        |
    |                      | character sets ISO 646 IRV or      |
    |                      | JIS X 0201-1976 (except for        |
    |                      | figure character set for           |
    |                      | katakana), and JIS X 0208-1983     |
    |                      | according to RFC1468 (Request for  |
    |                      | Comments: 1468 Japanese Character  |
    |                      | Encoding for Internet Messages)    |
    |                      | using the designation sequence to  |
    |                      | G0 specified by ISO 2022           |
    | jis                  | JIS 7-bit code used in JLE, JFP    |
    |                      | 2.4 and the preceding releases     |
    | ibmj                 | IBM Kanji code                     |
    | ibmj-EBCDIK          | Maps the single-byte code set      |
    |                      | (SBCS) of IBM host code to the     |
    |                      | character set generally called     |
    |                      | the EBCDIK code set. The character |
    |                      | code set includes IBM code page    |
    |                      | 290 and three more characters:     |
    |                      | '`' (0x79), '{' (0xc0), and '}'    |
    |                      | (0xd0). Japanese katakana          |
    |                      | characters are included, but       |
    |                      | lowercase alphabet letters are     |
    |                      | not. For the double-byte code set  |
    |                      | (DBCS), the description is the     |
    |                      | same as for the code set "ibmj".   |
    | UTF-8                | Unicode                            |
    | UTF-8-Java           | Unicode as implemented in Java     |
    |______________________|____________________________________|

     ____________________________________________________________
    |             Description of Supported Code Sets            |
    |      Codeset       |              Description             |
    | ibm930             | IBM CCSID 930: SBCS code page 290    |
    |                    | (extended), character set 1172, DBCS |
    |                    | code page 300, character set 1001;   |
    |                    | 4370 user-defined characters         |
    | ibm931             | IBM CCSID 931: SBCS code page 37,    |
    |                    | character set 101, DBCS code page    |
    |                    | 300, character set 1001;             |
    |                    | 4370 user-defined characters         |
    | ibm939             | IBM CCSID 939: SBCS code page 1027,  |
    |                    | character set 1172, DBCS code page   |
    |                    | 300, character set 1001;             |
    |                    | 4370 user-defined characters         |
    | ibm5026            | IBM CCSID 5026: same as ibm930,      |
    |                    | except that this code set supports   |
    |                    | 1880 user-defined characters         |
    | ibm5035            | IBM CCSID 5035: same as ibm939,      |
    |                    | except that this code set supports   |
    |                    | 1880 user-defined characters         |
    | ms932              | Shift JIS code set supported by      |
    |                    | Windows NT 3.51. Conversion between  |
    |                    | this code set and UTF-8 is done the  |
    |                    | same way Windows NT 3.51 does it.    |
    | UTF-8-ms932        | UTF-8 encoded Unicode that was       |
    |                    | converted from ms932                 |
    |____________________|______________________________________|


     Conversions are performed as described below. For all
     conversions, if the source code set includes characters that
     are not included in the target code set, each such character
     is converted to and output as a substitute character.

eucJP to PCK (SJIS) and PCK (SJIS) to eucJP
     Conversion between eucJP and PCK (SJIS) can be used to
     convert JIS X 0201, JIS X 0208, JIS X 0212, and user-defined
     and vendor-defined characters based on the TOG Japanese
     Vendors Council (TOG/JVC) Recommended Code Set Conversion
     Specification between Japanese EUC and Shift-JIS. If input
     data that does not belong to the source code set is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.
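
     The failure mode described above can be sketched as follows.
     This is a minimal illustration, not the Solaris converter: it
     assumes Python's built-in euc_jp and shift_jis codecs as
     stand-ins for the eucJP and PCK (SJIS) code sets, and the
     helper name euc_to_sjis is hypothetical. Python raises
     UnicodeDecodeError where iconv(3C) would set errno to EILSEQ;
     the helper returns only the output produced up to the last
     successfully converted byte, roughly what iconv(1) writes
     before it stops.

```python
def euc_to_sjis(data: bytes):
    """Convert EUC-JP bytes to Shift-JIS (illustrative stand-in).

    Returns (converted_bytes, consumed). consumed < len(data) means
    conversion stopped at the first invalid input byte.
    """
    try:
        text = data.decode("euc_jp")
        consumed = len(data)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that is not
        # valid EUC-JP; everything before it converted cleanly.
        consumed = exc.start
        text = data[:consumed].decode("euc_jp")
    return text.encode("shift_jis"), consumed


good = "日本語".encode("euc_jp")
# 0xff is not a valid EUC-JP byte, so conversion stops in front of it.
out, used = euc_to_sjis(good + b"\xff" + good)
print(out.decode("shift_jis"), used)
```

     A caller can compare the returned offset with the input
     length to detect a partial conversion, much as a C program
     would inspect the updated input pointer after iconv(3C)
     fails.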

eucJP to ISO-2022-JP (JIS7) and ISO-2022-JP (JIS7) to eucJP
     Conversion between eucJP and ISO-2022-JP (JIS7) can be used
     to convert JIS X 0201, JIS X 0208, and JIS X 0212. If input
     data that does not belong to the source code set is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.
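
     ISO-2022-JP is a 7-bit encoding that switches character sets
     with designation (escape) sequences. The sketch below assumes
     Python's iso2022_jp codec, which follows RFC 1468 and only
     approximates the Solaris ISO-2022-JP converter (it does not
     handle JIS X 0212, for example). It shows the sequence
     ESC $ B, which designates JIS X 0208-1983 to G0, and ESC ( B,
     which switches back to ASCII.

```python
ESC_JISX0208 = b"\x1b$B"  # designate JIS X 0208-1983 to G0
ESC_ASCII = b"\x1b(B"     # designate ASCII to G0

encoded = "A日本".encode("iso2022_jp")
print(encoded)

# Every output byte fits in 7 bits.
assert all(byte < 0x80 for byte in encoded)
# The ASCII letter is emitted first, then the switch to kanji.
assert encoded.startswith(b"A" + ESC_JISX0208)
# The stream returns to ASCII at the end.
assert encoded.endswith(ESC_ASCII)
# The conversion round-trips.
assert encoded.decode("iso2022_jp") == "A日本"
```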

eucJP to ISO-2022-JP.RFC1468
     Conversion from eucJP to ISO-2022-JP.RFC1468 can be used to
     convert JIS X 0201 (except for the figure character set for
     katakana) and JIS X 0208. If a JIS X 0201 (figure character
     set for katakana), JIS X 0212, user-defined, or
     vendor-defined character is encountered in the input data,
     it is replaced with the substitute character '?' (0x3f). If
     input data that does not belong to these code sets is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.
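
     The substitution of '?' for JIS X 0201 katakana can be
     sketched as follows. This assumes Python's iso2022_jp codec
     as a stand-in for ISO-2022-JP.RFC1468; the helper name
     to_rfc1468 is hypothetical. Note one difference from the
     behavior described above: with errors="replace" Python
     substitutes '?' for every character it cannot encode, whereas
     the Solaris converter reports EILSEQ for input outside the
     listed code sets.

```python
def to_rfc1468(text: str) -> bytes:
    """Encode text as ISO-2022-JP, substituting '?' where needed."""
    # errors="replace" makes the codec emit '?' for any character
    # it cannot represent instead of raising UnicodeEncodeError.
    return text.encode("iso2022_jp", errors="replace")


sample = "ｶﾅ and 仮名"  # half-width katakana, ASCII, and kanji
print(to_rfc1468(sample).decode("iso2022_jp"))
```

     The half-width katakana are lost, while the ASCII text and
     the JIS X 0208 kanji survive the round trip.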

eucJP to jis and jis to eucJP
     Conversion between eucJP and jis is provided for the  compa-
     tibility  with   ujtojis7()  and   jis7touj() libraries ,and
     euctojis and jistoeuc utilities. It is  extended  to  handle
     JIS X 0212. See jisconv(3X) and jistoeuc(1).

eucJP to UTF-8 and UTF-8 to eucJP
     Conversion between eucJP and UTF-8 can be used to convert
     JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
     vendor-defined characters. If an input character that has no
     corresponding character in the target code set is
     encountered, it is replaced with the substitute character
     (eucJP: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data
     that does not belong to these code sets is encountered,
     iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at
     the last point of successful conversion.
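
     The byte sequence 0xefbfbd quoted above is simply the
     three-byte UTF-8 form of U+FFFD. The short sketch below works
     through the bit arithmetic; the function name utf8_3byte is
     hypothetical and exists only for this illustration.

```python
def utf8_3byte(code_point: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF as three UTF-8 bytes.

    Surrogate code points are not rejected here; this is only a
    worked example of the bit layout.
    """
    assert 0x0800 <= code_point <= 0xFFFF
    return bytes([
        0xE0 | (code_point >> 12),          # 1110xxxx: top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),  # 10xxxxxx: middle 6 bits
        0x80 | (code_point & 0x3F),         # 10xxxxxx: low 6 bits
    ])


print(utf8_3byte(0xFFFD).hex())
```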

eucJP to UTF-8-Java and UTF-8-Java to eucJP
     Conversion between eucJP and UTF-8-Java can be used to
     convert JIS X 0201, JIS X 0208, and JIS X 0212. If a
     user-defined or vendor-defined character is encountered in
     the input data, it is replaced with the substitute character
     (eucJP: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data
     that does not belong to these code sets is encountered,
     iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at
     the last point of successful conversion.

eucJP to ibmj and ibmj to eucJP
     Conversion between eucJP and ibmj is based on the IBM Kanji
     codebook (4th edition - September 1987), JIS X 0201, and
     JIS X 0208-1983. If you convert eucJP to ibmj, JISX 0201
     and JIS X 0201 are all converted to the substitute character.
     See ibmjcode(3X).


eucJP to ibmj-EBCDIK and ibmj-EBCDIK to eucJP
     Conversion between eucJP and ibmj-EBCDIK is based on the IBM
     Kanji codebook (4th edition - September 1987), JIS X 0201,
     and JIS X 0208-1983. If you convert eucJP to ibmj-EBCDIK,
     JISX 0201 and JIS X 0201 characters that have no
     corresponding characters in ibmj-EBCDIK are all converted to
     the substitute character.

PCK (SJIS) to ISO-2022-JP and ISO-2022-JP to PCK (SJIS)
     Conversion between PCK (SJIS) and ISO-2022-JP can be used to
     convert JIS X 0201, JIS X 0208, JIS X 0212, and user-defined
     and vendor-defined characters based on the TOG Japanese
     Vendors Council (TOG/JVC) Recommended Code Set Conversion
     Specification between Japanese EUC and Shift-JIS. If input
     data that does not belong to the source code set is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

PCK (SJIS) to ISO-2022-JP.RFC1468
     Conversion from PCK (SJIS) to ISO-2022-JP.RFC1468 can be
     used to convert JIS X 0201 (except for the figure character
     set for katakana) and JIS X 0208. If a JIS X 0201 (figure
     character set for katakana), user-defined, or vendor-defined
     character is encountered in the input data, it is replaced
     with the substitute character '?' (0x3f). If input data that
     does not belong to these code sets is encountered, iconv(3C)
     fails and sets errno to EILSEQ. iconv(1) stops at the last
     point of successful conversion.

PCK (SJIS) to UTF-8 and UTF-8 to PCK (SJIS)
     Conversion between PCK (SJIS) and UTF-8 can be used to
     convert JIS X 0201, JIS X 0208, and user-defined and
     vendor-defined characters. If an input character that has no
     corresponding character in the target code set is
     encountered, it is replaced with the substitute character
     (PCK: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data
     that does not belong to these code sets is encountered,
     iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at
     the last point of successful conversion.

PCK (SJIS) to UTF-8-Java and UTF-8-Java to PCK (SJIS)
     Conversion between PCK (SJIS) and UTF-8-Java can be used to
     convert JIS X 0201 and JIS X 0208. If a user-defined or
     vendor-defined character is encountered in the input data,
     it is replaced with the substitute character (PCK: '?'
     (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data that does
     not belong to these code sets is encountered, iconv(3C)
     fails and sets errno to EILSEQ. iconv(1) stops at the last
     point of successful conversion.

PCK (SJIS) to jis and jis to PCK (SJIS)
     Conversion between PCK (SJIS) and jis is provided for
     compatibility with the sjtojis7() and jis7tosj() library
     functions and the sjtojis and jistosj utilities. It is
     extended based on the TOG Japanese Vendors Council (TOG/JVC)
     Recommended Code Set Conversion Specification between
     Japanese EUC and Shift-JIS. See jisconv(3X) and jistosj(1).

PCK (SJIS) to ibmj and ibmj to PCK (SJIS)
     Conversion between PCK (SJIS) and ibmj is based on the IBM
     Kanji codebook (4th edition - September 1987), JIS X 0201,
     and JIS X 0208-1983. If you convert PCK (SJIS) to ibmj, kana
     characters (0xa1 to 0xdf) and all characters that the TOG
     Japanese Vendors Council (TOG/JVC) Recommended Code Set
     Conversion Specification between Japanese EUC and Shift-JIS
     converts to JIS X 0212 are converted to the substitute
     character. See ibmjcode(3X).

PCK to ibmj-EBCDIK and ibmj-EBCDIK to PCK
     Conversion between PCK and ibmj-EBCDIK is based on the IBM
     Kanji codebook (4th edition - September 1987), JIS X 0201,
     and JIS X 0208-1983. If you convert PCK to ibmj-EBCDIK, all
     characters that the TOG Japanese Vendors Council (TOG/JVC)
     Recommended Code Set Conversion Specification between
     Japanese EUC and Shift-JIS converts to JIS X 0212 are
     converted to the substitute character.

ISO-2022-JP to UTF-8 and UTF-8 to ISO-2022-JP
     Conversion between ISO-2022-JP and UTF-8 can be used to
     convert JIS X 0201, JIS X 0208, JIS X 0212, and user-defined
     and vendor-defined characters. If an input character that
     has no corresponding character in the target code set is
     encountered, it is replaced with the substitute character
     (ISO-2022-JP: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If
     input data that does not belong to these code sets is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

UTF-8 to ISO-2022-JP.RFC1468
     Conversion from UTF-8 to ISO-2022-JP.RFC1468 can be used to
     convert JIS X 0201 (except for the figure character set for
     katakana) and JIS X 0208. If a JIS X 0201 (figure character
     set for katakana), JIS X 0212, user-defined, or
     vendor-defined character is encountered in the input data,
     it is replaced with the substitute character '?' (0x3f). If
     input data that does not belong to these code sets is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

eucJP, PCK, UTF-8 to ibm930, ibm931, ibm939, ibm5026, ibm5035
     Conversion from eucJP, PCK, or UTF-8 to ibm930, ibm931,
     ibm939, ibm5026, or ibm5035 can be used to convert
     JIS X 0201, JIS X 0208, JIS X 0212, IBM extension
     characters, and user-defined characters. Input data that has
     no corresponding character in the target code set is
     replaced with the substitute character. Because ibm931 does
     not support kana characters in its single-byte code set
     (SBCS), JIS X 0201 kana characters are replaced with
     substitute characters in conversion to ibm931.

ibm930, ibm931, ibm939, ibm5026, ibm5035 to eucJP, PCK, UTF-8
     Conversion from ibm930, ibm931, ibm939, ibm5026, or ibm5035
     to eucJP, PCK, or UTF-8 can be used to convert the SBCS/DBCS
     characters defined in the input code set. Input data that
     has no corresponding character in the target code set is
     replaced with the substitute character.

ms932 to UTF-8 and UTF-8 to ms932
     Conversion between ms932 and UTF-8 maps characters between
     the two code sets in the same way as Windows NT 3.51 does.

UTF-8 to UTF-8-ms932 and UTF-8-ms932 to UTF-8
     This converts between "UTF-8", which here is UTF-8 encoded
     Unicode converted from PCK, and "UTF-8-ms932", which is
     UTF-8 encoded Unicode converted from ms932.
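
     Why two UTF-8 variants are needed can be sketched with a
     character whose Unicode mapping differs between the two
     Shift-JIS flavors. The example assumes Python's shift_jis
     codec as a stand-in for PCK and its cp932 codec as a
     stand-in for ms932; whether the Solaris tables match these
     codecs byte for byte is not confirmed here. The helper name
     utf8_to_utf8_ms932 is hypothetical, and the real converter
     need not go through an intermediate Shift-JIS step.

```python
def utf8_to_utf8_ms932(data: bytes) -> bytes:
    """Re-map UTF-8 text from the Shift-JIS view to the cp932 view."""
    # Go back to the shared double-byte code, then decode it with
    # the Microsoft mapping instead of the JIS mapping.
    sjis = data.decode("utf-8").encode("shift_jis")
    return sjis.decode("cp932").encode("utf-8")


wave = b"\x81\x60"  # the same Shift-JIS code in both flavors
print(hex(ord(wave.decode("shift_jis"))))  # JIS-style mapping
print(hex(ord(wave.decode("cp932"))))      # Microsoft-style mapping
```

     In these Python codecs the code 0x8160 decodes to U+301C
     (WAVE DASH) under shift_jis and to U+FF5E (FULLWIDTH TILDE)
     under cp932, so text that passed through one mapping must be
     re-mapped before it is handed to software that expects the
     other.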

SEE ALSO
     iconv(1), jistoeuc(1), jistosj(1),  iconv(3C),  jisconv(3X),
     ibmjcode(3X),         iconv(5),        iconv_en_US.UTF-8(5),
     iconv_unicode(5)