NAME
mdicm - main dictionary maintenance tool for Kana-Kanji
conversion
SYNOPSIS
mdicm [ command ] [ arguments ]
AVAILABILITY
SUNWjc0u
DESCRIPTION
mdicm maintains a main dictionary for cs00 Kana-Kanji
conversion. It also provides some functions for a user
dictionary, which is used in maintaining the main dictionary.
When started with a command, mdicm runs in non-interactive
mode. When started without a command, mdicm shows the prompt
mdicm> and runs in interactive mode. Use the `quit' command
to terminate mdicm. See the "COMMANDS and ARGUMENTS" section
below for more detail.
COMMANDS and ARGUMENTS
Each function of mdicm is performed by specifying a
command and its arguments. mdicm supports the following
commands:
mshow, show, ladd, ldel, cat, create, extract,
quit, hinshi, ?, help
The usage of each command is described below. In the
following, main_dict means a main dictionary and user_dict
means a user dictionary. See the "File Formats" section
below for the format, word, reading and part-of-speech
symbols of a words-list file.
mshow main_dict user_dict [ -s start ] [ -e end ]
[ -f filename ] [ -k ]
Display the contents of the main dictionary. Read the
words registered in the specified range, format them as a
words-list, and write it to the specified file. The
arguments are as follows.
-s start
Specify the reading of the word at which the listing
starts. Without this argument, the listing starts at
the first word.
-e end
Specify the reading of the word at which the listing
ends. Without this argument, the listing ends at the
last word.
-f filename
Specify the words-list file to which the results are
written. Without this argument, they are written to
standard output.
-k Print the part-of-speech information as part-of-speech
names.
See mdicm(1) in ja locale for an example.
show main_dict user_dict [ -s start ] [ -e end ]
[ -f filename ] [ -k ]
Display the contents of the user dictionary. Read the
words registered in the specified range, format them as a
words-list, and write it to the specified file. The
arguments are as follows.
-s start
Specify the reading of the word at which the listing
starts. Without this argument, the listing starts at
the first word.
-e end
Specify the reading of the word at which the listing
ends. Without this argument, the listing ends at the
last word.
-f filename
Specify the words-list file to which the results are
written. Without this argument, they are written to
standard output.
-k Print the part-of-speech information as part-of-speech
names. Without this argument, it is printed as
part-of-speech symbols.
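The synopsis of show and mshow can be sketched as a small helper that builds the argument list for a non-interactive run. This is only an illustration: the function name show_args and the file names in the usage note are hypothetical, and the helper does not run mdicm itself.

```python
def show_args(main_dict, user_dict, start=None, end=None,
              filename=None, names=False, command="show"):
    """Build an argument list following the show/mshow synopsis."""
    if command not in ("show", "mshow"):
        raise ValueError("command must be 'show' or 'mshow'")
    args = ["mdicm", command, main_dict, user_dict]
    if start is not None:
        args += ["-s", start]      # reading at which the listing starts
    if end is not None:
        args += ["-e", end]        # reading at which the listing ends
    if filename is not None:
        args += ["-f", filename]   # words-list file for the results
    if names:
        args.append("-k")          # part-of-speech names, not symbols
    return args
```

For example, show_args("main.dic", "user.dic", filename="out.txt", names=True) yields the list for `mdicm show main.dic user.dic -f out.txt -k`.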
ladd main_dict user_dict filename [ -l logfile ]
[ -m new_main_dict ] [ -u new_user_dict ]
Add multiple words to the main dictionary at one time.
The arguments are as follows.
-l logfile
Write the results of the registration to logfile.
Without this argument, no logfile is created.
-m new_main_dict
Specify the main dictionary to create. Without this
argument, it is created as newmdic in the current
directory.
-u new_user_dict
Specify the user dictionary to create. Without this
argument, it is created as newudic in the current
directory.
filename
Specify the words-list file that lists the words to
add.
ldel main_dict user_dict filename [ -l logfile ]
[ -m new_main_dict ] [ -u new_user_dict ]
Delete multiple words from the main dictionary at one
time. The options are as follows.
-l logfile
Write the results of the deletion to logfile.
Without this argument, no logfile is created.
-m new_main_dict
Specify the main dictionary to create. Without this
argument, it is created as newmdic in the current
directory.
-u new_user_dict
Specify the user dictionary to create. Without this
argument, it is created as newudic in the current
working directory.
filename
Specify the words-list file that lists the words to
delete.
cat main_dict user_dict filename [ -l logfile ]
[ -m new_main_dict ] [ -u new_user_dict ]
Merge another user dictionary into the main dictionary.
The arguments are as follows.
-l logfile
Write the result to logfile. Without this argument,
no logfile is created.
-m new_main_dict
Specify the main dictionary to create. Without this
argument, it is created as newmdic in the current
directory.
-u new_user_dict
Specify the user dictionary to create. Without this
argument, it is created as newudic in the current
directory.
filename
Specify the filename of the dictionary to be merged
into the main dictionary.
create new_main_dict new_user_dict file1 file2 file3 file4 file5
file6 [file7]
create new_main_dict new_user_dict filename
The first synopsis creates a new main dictionary from
files 1-6 and a new user dictionary from the words-list
file (file7). If file7 is not given, an empty user
dictionary is created. Files 1-7 are as follows.
File   Contents                       Suffix
file1  Jiritsugo file                 .jir
file2  Conjunction matrix file        .set
file3  Clause Terminator file         .bun
file4  Conjugation file               .kat
file5  Suffix file                    .sbi
file6  Fuzokugo file                  .fuz
file7  Words list of user dictionary  .usr
The second synopsis creates a new main dictionary
and a new user dictionary from the files whose
names consist of filename followed by the above
seven suffixes. All of these files, such as
filename.jir, should be in the same directory.
NOTE: The formats of the files and the words-list are
described in "File Formats".
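The naming rule of the second synopsis can be sketched as follows. The function name source_files and the base name in the usage note are hypothetical; only the seven suffixes come from the list above.

```python
# Suffixes for file1..file7, in the order listed above.
SUFFIXES = (".jir", ".set", ".bun", ".kat", ".sbi", ".fuz", ".usr")

def source_files(filename):
    """Return the seven file names derived from a common base name."""
    return [filename + suffix for suffix in SUFFIXES]
```

For example, source_files("mydic") returns names from "mydic.jir" through "mydic.usr", which is the set of files the second synopsis reads (create) or writes (extract).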
extract main_dict user_dict file1 file2 file3 file4 file5
file6 file7
extract main_dict user_dict filename
The first synopsis extracts the contents of a
main dictionary to files 1-6 and those of a
user dictionary to the words-list file (file7).
The second synopsis extracts, from a main dictionary
and a user dictionary, the files whose names
consist of filename followed by the above seven
suffixes.
NOTE: The formats of the files and the words-list are
described in "File Formats".
quit Quit mdicm (interactive mode).
hinshi
Print the list of part-of-speech symbols to
standard output.
?, help
Help. Print the command reference to standard
output.
File Formats
Words list file
The format of the words-list used for the input and output
of each command.
Comments
Lines starting with "#" are comments.
Data Consists of three fields. The first and second
fields are the reading and the word, respectively.
The last field is the part-of-speech information,
given as an enumeration of part-of-speech symbols.
The fields are separated by half-width (hankaku)
spaces or tabs.
An example is shown in mdicm(1M) for the ja
(Japanese) locale.
Reading
Up to 12 Hiragana characters defined in Japanese EUC
Codeset 1 can be used. However, "you-on" (such as
'xya'), 'wi', 'we', 'wo' and 'nn' are not permitted
as the first character. For the second and subsequent
characters, "cho-on" ('-' in Japanese EUC Codeset 1)
can be used in addition to all Hiragana characters.
"daku-on" and "handaku-on" (such as 'da' and 'pa')
are treated as two characters.
Word Up to eight characters defined in Japanese EUC
Codeset 1 can be used.
Part-of-Speech-Symbols
The part-of-speech information consists of the
following part-of-speech symbols.
Symbols  Part of speech                  Remarks
:N1      noun1                           general noun
:N2      noun2                           pronoun
:M1      person's name1                  family-name
:M2      person's name2                  first-name
:T1      place name1
:T2      place name2                     Names of prefectures
:NM      numeral
:NN      supplemental numeral            Mai(pieces), Kai(times),
                                         Nen(years), etc.
:PR      prefix
:SF      suffix
:AD      adverb
:CN      conjunction
:RT      participial adjective
:AJ      adjective
:AV      adjective verb
:SH      S-series irregular conjugation
         verb (Sahen-Doushi)
:ZH      Z-series irregular conjugation
         verb (Zahen-Doushi)
:1V      Single conjugation verb
:KV      K-series five conjugation verb
         (Kagyou-Godan-katsuyou-Doushi)
:GV      G-series five conjugation verb
         (Gagyou-Godan-katsuyou-Doushi)
:SV      S-series five conjugation verb
         (Sagyou-Godan-katsuyou-Doushi)
:TV      T-series five conjugation verb
         (Tagyou-Godan-katsuyou-Doushi)
:NV      N-series five conjugation verb
         (Nagyou-Godan-katsuyou-Doushi)
:BV      B-series five conjugation verb
         (Bagyou-Godan-katsuyou-Doushi)
:MV      M-series five conjugation verb
         (Magyou-Godan-katsuyou-Doushi)
:RV      R-series five conjugation verb
         (Ragyou-Godan-katsuyou-Doushi)
:WV      W-series five conjugation verb
         (Wagyou-Godan-katsuyou-Doushi)
:UN      No Classification
:TK      single kanji
:BS      clause
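A words-list file as described above could be read along the following lines. This is a sketch under stated assumptions, not the parser mdicm uses: it assumes each symbol is a colon followed by two characters (as in the table), that symbols are written back to back or separated by white space, and that blank lines can be skipped. The function name parse_words_list and the sample entries are hypothetical.

```python
import re

# Assumed symbol shape: ':' plus two characters, e.g. ':N1' or ':SH'.
SYMBOL = re.compile(r":[0-9A-Z]{2}")

def parse_words_list(text):
    """Return a list of (reading, word, [symbols]) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line (assumed ignorable) or comment
        fields = line.split(None, 2)  # split on spaces or tabs
        if len(fields) != 3:
            raise ValueError("expected three fields: %r" % line)
        reading, word, pos = fields
        symbols = SYMBOL.findall(pos)
        # Reject anything in the third field that is not a symbol.
        if not symbols or "".join(symbols) != "".join(pos.split()):
            raise ValueError("bad part-of-speech field: %r" % pos)
        entries.append((reading, word, symbols))
    return entries
```

The sketch does not check the reading and word length limits or the first-character restrictions; those would need to be added before using the result as mdicm input.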
file1: Jiritsugo file
Each line describes either a word part or a Kanji part.
A word-part line specifies the reading of a Jiritsugo, the
part of speech for the reading, the Kanji, and the part of
speech for the Kanji. A Kanji-part line specifies a reading
and the candidate Kanji for that reading.
For a word part,
RRR...R XXXXXXXX JJJJJJJ....JJJJ YYYYYYYY
For a Kanji part,
RRR....R:XXXXXXXX KKKKKKKKKKKKKKKKKKKKKK
RRR...R
Reading (Kana Code)
XXXXXXXX
A part of speech for the reading (expressed as
an 8-digit hex number)
JJJJJJJ...JJJJ
Kanji (Japanese EUC Code)
YYYYYYYY
A part of speech for the Kanji (expressed as
an 8-digit hex number)
KKKKKKKKKKKKKKKKKKKKKK
A string of Kanji characters (max 255)
: (colon)
Delimiter to distinguish the homonym
and a Kanji character.
(space)
A half-width space is used to separate the
items.
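A word-part line of the Jiritsugo file could be split as sketched below, assuming exactly four items separated by single half-width spaces as in the template above. The function name parse_word_part and the sample values in the usage note are hypothetical; Kanji-part lines are not handled.

```python
import string

def parse_word_part(line):
    """Split 'RRR...R XXXXXXXX JJJ...J YYYYYYYY' into its four items."""
    fields = line.split(" ")
    if len(fields) != 4:
        raise ValueError("expected four space-separated items")
    reading, reading_pos, kanji, kanji_pos = fields
    for value in (reading_pos, kanji_pos):
        # Each part of speech is written as exactly 8 hex digits.
        if len(value) != 8 or any(c not in string.hexdigits for c in value):
            raise ValueError("not an 8-digit hex number: %r" % value)
    return reading, int(reading_pos, 16), kanji, int(kanji_pos, 16)
```

For example, parse_word_part("かんじ 00000001 漢字 0000000A") returns the reading, the Kanji, and the two part-of-speech values as integers.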
file2: Conjunction matrix file
A two-dimensional array (32 columns by 182 rows) of hex
digits.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
:
:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X One hex digit
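The shape of the Conjunction matrix file can be checked with a short sketch like the one below. It assumes one row per line with no separators between digits, as in the layout above; the function name parse_matrix is hypothetical.

```python
import string

ROWS, COLS = 182, 32  # dimensions given above

def parse_matrix(text):
    """Return the matrix as 182 lists of 32 integers (0-15)."""
    rows = text.splitlines()
    if len(rows) != ROWS:
        raise ValueError("expected %d rows, got %d" % (ROWS, len(rows)))
    matrix = []
    for number, row in enumerate(rows, 1):
        if len(row) != COLS or any(c not in string.hexdigits for c in row):
            raise ValueError("row %d is not %d hex digits" % (number, COLS))
        matrix.append([int(c, 16) for c in row])
    return matrix
```

The sketch only validates the layout; the meaning of each digit is not described in this page.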
file3: Clause Terminator file
A string of 46 hex digits.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXY
X One hex digit
Y One hex digit; the lower 2 bits of its 4 bits are
not used
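The Clause Terminator format can be read as sketched below. The sketch assumes the 46 digits sit on a single line and, because the lower 2 bits of the last digit are described as unused, it masks them out rather than rejecting them. The function name parse_clause_terminator is hypothetical.

```python
import string

def parse_clause_terminator(line):
    """Return the 46 digits as integers, with unused bits cleared."""
    digits = line.strip()
    if len(digits) != 46 or any(c not in string.hexdigits for c in digits):
        raise ValueError("expected 46 hex digits")
    values = [int(c, 16) for c in digits]
    values[-1] &= 0b1100  # lower 2 bits of the last digit are not used
    return values
```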
file4: Conjugation file
Each line specifies a reading of a Conjugation followed by
one or more line numbers.
RRR....R LLL LLL LLL LLL LLL
*LLL LLL LLL LLL LLL
/
RRR...R
Reading (Kana Code; if there is no reading, use *)
LLL Line number (max 3 digits decimal number)
/ Delimiter used to separate each of 15
types of Conjugation specified in the fol-
lowing order:
1. Adjective
2. Adjective verb
3. No classification
4. S-series irregular conjugation verb
(Sahen-Doushi)
5. Z-series irregular conjugation verb
(Zahen-Doushi)
6. Single conjugation verb
7. K-series five conjugation verb
(Kagyou-Godan-Doushi)
8. G-series five conjugation verb
(Gagyou-Godan-Doushi)
9. S-series five conjugation verb
(Sagyou-Godan-Doushi)
10. T-series five conjugation verb
(Tagyou-Godan-Doushi)
11. N-series five conjugation verb
(Nagyou-Godan-Doushi)
12. B-series five conjugation verb
(Bagyou-Godan-Doushi)
13. M-series five conjugation verb
(Magyou-Godan-Doushi)
14. R-series five conjugation verb
(Ragyou-Godan-Doushi)
15. W-series five conjugation verb
(Wagyou-Godan-Doushi)
file5: Suffix file
The reading, type and Kanji of a suffix are described
on one line.
RRR...R N JJ
*
RRR...R
Reading (Kana Code)
N Type (one of 7 types, with values 1, 2, 3, ..., 7)
JJ Kanji (max 2 Kanji characters; when the Kanji
is 1 character, the latter half is
NULL+NULL)
* Used when the reading, type and Kanji are not
specified.
file6: Fuzokugo file
Each line specifies a reading of a Fuzokugo followed by
one or more pairs of a column number and a row number of
the Conjunction matrix.
RRR...RR ccc,rrr ccc,rrr ccc,rrr ccc,rrr...ccc,rrr
*ccc,rrr ccc,rrr ccc,rrr ccc,rrr...ccc,rrr
/
RRR...R
Reading (Kana Code; when there is no reading, use
*)
ccc Column number (max 3 digits decimal
number)
rrr Row number (max 3 digits decimal number)
, (comma)
The column and row numbers of a pair are
separated by a comma.
/ Terminator, used when neither a reading nor row
and column numbers are specified.
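A single line of the Fuzokugo file could be parsed as in the sketch below. It assumes the column number comes first in each pair, as in the ccc,rrr template, and it returns None for a line holding only the / terminator. The function name parse_fuzokugo_line and the sample values in the usage note are hypothetical.

```python
def parse_fuzokugo_line(line):
    """Return (reading, [(column, row), ...]) or None for '/'."""
    line = line.strip()
    if line == "/":
        return None  # terminator line: no reading and no pairs
    if line.startswith("*"):
        reading, rest = None, line[1:]  # '*' stands for no reading
    else:
        reading, _, rest = line.partition(" ")
    pairs = []
    for item in rest.split():
        column, row = item.split(",")
        if not (column.isdigit() and row.isdigit()):
            raise ValueError("bad column,row pair: %r" % item)
        if len(column) > 3 or len(row) > 3:
            raise ValueError("number longer than 3 digits: %r" % item)
        pairs.append((int(column), int(row)))
    return reading, pairs
```

For example, parse_fuzokugo_line("の 001,002 010,120") returns the reading with the pairs (1, 2) and (10, 120), while parse_fuzokugo_line("*003,004") returns None as the reading.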
FILES
/usr/bin/mdicm
SEE ALSO
udicm(1), cs00(1M)