nlp - Programmatically access an IME


Is there a way to access a Japanese or Chinese IME from either the command line or Python? I have Linux/OS X/Win8 boxes, so whichever system exposes the most easily accessible API is fine.

I'm experimenting with building a Japanese kana-kanji conversion algorithm and would like to establish a baseline using existing tools. I also have collections of kana to process.

Preferably something along the lines of

$ ime jp "きしゃのきしゃがきしゃできしゃした"
貴社の記者が汽車で帰社した

I've looked at Anthy, Mozc and D-Bus on Linux, but I can't find any way to interact with them via the terminal or scripting (such as Python).

Anthy provides a CLI tool

Personally, I prefer Google's IME / Mozc for its better results, but perhaps this helps.

The source of Anthy (SourceForge, file anthy-9100h.tar.gz) includes a simple CLI program for testing. Download the source file, extract it, and run

./configure && make 

Then enter the directory test, which contains the binary anthy. By default, it reads test.txt and uses the EUC-JP encoding.

A simple test:

Input file test.txt

*にほんごにゅうりょく
*もももすももももものうち。
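For a larger collection of kana, the input file could be generated rather than typed. The following is a minimal sketch, assuming (from the sample above only) that each line to convert starts with `*` and that the file must be EUC-JP encoded; `write_anthy_input` is a hypothetical helper name, not part of Anthy.

```python
def write_anthy_input(kana_lines, path="test.txt"):
    """Write one '*'-prefixed kana line per entry, encoded as EUC-JP.

    Assumption: the '*' prefix marks a line to convert, as in the
    sample test.txt shown above.
    """
    with open(path, "w", encoding="euc_jp", newline="\n") as f:
        for kana in kana_lines:
            f.write("*" + kana + "\n")


if __name__ == "__main__":
    write_anthy_input(["にほんごにゅうりょく", "もももすももももものうち。"])
```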

Run (using iconv to convert to UTF-8):

./anthy --all  | iconv -f euc-jp -t utf-8 

Output:

1:(にほんごにゅうりょく)
|にほんご|にゅうりょく
にほんご(日本語:(1,1000,n,72089)2500,001 ,にほんご:(n,0,-)2 ,ニホンゴ:(n,0,-)1 ,):
にゅうりょく(入力:(1,1000,n,62394)2500,001 ,にゅうりょく:(n,0,-)2 ,ニュウリョク:(n,0,-)1 ,):

2:(もももすももももものうち。)
|ももも|すももも|もものうち|。
ももも(桃も:(,1000,ny,72089)225,279 ,ももも:(n,1000,ny,72089)220,773 ,モモも:(,1000,ny,72089)205,004 ,腿も:(,1000,ny,72089)204,722 ,股も:(,1000,ny,72089)146,431 ,モモモ:(n,0,-)1 ,):
すももも(すももも:(n,1000,ny,72089)202,751 ,スモモも:(,1000,ny,72089)168,959 ,李も:(,1000,ny,72089)168,677 ,スモモモ:(n,0,-)1 ,):
もものうち(桃のうち:(,1000,n,655)2,047 ,もものうち:(n,1000,n,655)2,006 ,モモのうち:(,1000,n,655)1,863 ,腿のうち:(,1000,n,655)1,861 ,股のうち:(,1000,n,655)1,331 ,モモノウチ:(n,0,-)1 ,):
。(。:(1n,100,n,70203)57,040 ,.:(1,100,n,70203)52,653 ,.:(1,100,n,70203)3,840 ,):
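If recompiling is not an option, the dump above can also be post-processed. The sketch below takes the first candidate listed for each segment and joins them into one converted sentence. The layout it relies on (a `N:(kana)` header per sentence, then `reading(candidate:(...` per segment) is inferred from this sample output only, not from Anthy documentation, and treating the first candidate as the preferred one is likewise an assumption. `best_conversions` is a hypothetical name.

```python
import re

# "1:(にほんごにゅうりょく)" : sentence number and the original kana.
HEADER = re.compile(r"(\d+):\(([^()]*)\)")
# "にほんご(日本語:(" : segment reading, then the first listed candidate.
SEGMENT = re.compile(r"(?<!\S)([^\s(|]+)\(([^:()]+):\(")


def best_conversions(dump):
    """Return [(kana, converted)] using the first candidate of each segment."""
    headers = list(HEADER.finditer(dump))
    results = []
    for i, head in enumerate(headers):
        # A sentence's segments run until the next header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(dump)
        body = dump[head.end():end]
        top = [m.group(2) for m in SEGMENT.finditer(body)]
        results.append((head.group(2), "".join(top)))
    return results


if __name__ == "__main__":
    sample = (
        "1:(にほんごにゅうりょく)\n"
        "|にほんご|にゅうりょく\n"
        "にほんご(日本語:(1,1000,n,72089)2500,001 ,にほんご:(n,0,-)2 ,):\n"
        "にゅうりょく(入力:(1,1000,n,62394)2500,001 ,にゅうりょく:(n,0,-)2 ,):\n"
    )
    print(best_conversions(sample))  # [('にほんごにゅうりょく', '日本語入力')]
```

The patterns only look for whitespace between segments, so they do not depend on where the line breaks fall in the dump.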

You can uncomment the printf statements in the source files test/main.c and src-main/context.c to make the output more readable/parsable, e.g.:

1   にほんごにゅうりょく
にほんご    日本語
にゅうりょく  入力

2   もももすももももものうち。
ももも 桃も
すももも    すももも
もものうち   桃のうち
。   。
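Since the question asks about scripting, the same command can be driven from Python instead of piping through iconv. This is a minimal sketch that uses only what is described above: the binary `./anthy` in the test directory, the `--all` flag, and EUC-JP output. `run_anthy` and its parameters are hypothetical names, and the path to the test directory has to be supplied by the caller.

```python
import subprocess


def run_anthy(test_dir, cmd=("./anthy", "--all")):
    """Run the Anthy test binary inside test_dir and return its output as str.

    The binary reads test.txt from its working directory by default and
    writes EUC-JP, so the bytes are decoded here instead of using iconv.
    """
    proc = subprocess.run(
        list(cmd),
        cwd=test_dir,            # the binary looks for test.txt here
        stdout=subprocess.PIPE,
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return proc.stdout.decode("euc_jp", errors="replace")
```

Calling `run_anthy` with the path of the extracted test directory returns the same text as the iconv pipeline, ready for further parsing.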
