Package detail

@sglkc/kuroshiro

sglkc · 2.3k · MIT · 1.0.1

Forked version of kuroshiro with TypeScript support

Japanese, language, convert, converter, kanji, hiragana, katakana, kana, romaji, furigana, okurigana, library, utility, tool, hepburn

readme

kuroshiro

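kuroshiro is an easy-to-use Japanese language library that converts Japanese text to hiragana, katakana, or romaji, with furigana and okurigana annotation modes supported.

Read this in other languages: English, 日本語, 简体中文, 繁體中文, Esperanto.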

Demo

You can try the online demo here.

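Features

Japanese text => hiragana, katakana, or romaji
Furigana and okurigana supported
🆕 Multiple morphological analyzers supported
🆕 Multiple romanization systems supported
Useful Japanese utilities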

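Breaking changes in 1.x

Separate the morphological analyzer from the phonetic notation logic, so that different morphological analyzers (ready-made or customized) can be used
Embrace ES8/ES2017 to use async/await functions
Use ES6 Module instead of CommonJS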

Analyzer plugins

Before you start, check the environment compatibility of each plugin:

Analyzer	Node.js Support	Browser Support	Repository	Developer
Kuromoji	✓	✓	kuroshiro-analyzer-kuromoji	Hexen Qi
Mecab	✓	✗	kuroshiro-analyzer-mecab	Hexen Qi
Yahoo Web API	✓	✗	kuroshiro-analyzer-yahoo-webapi	Hexen Qi
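The compatibility table above can also be expressed as data when a project needs to pick an analyzer per environment. The following is a minimal sketch; `ANALYZERS` and `analyzersFor` are hypothetical names used only for illustration and are not part of kuroshiro.

```javascript
// Compatibility data transcribed from the table above.
const ANALYZERS = [
  { name: "Kuromoji", pkg: "kuroshiro-analyzer-kuromoji", node: true, browser: true },
  { name: "Mecab", pkg: "kuroshiro-analyzer-mecab", node: true, browser: false },
  { name: "Yahoo Web API", pkg: "kuroshiro-analyzer-yahoo-webapi", node: true, browser: false },
];

// Hypothetical helper: list the analyzer packages usable in a given environment.
function analyzersFor(env) {
  if (env !== "node" && env !== "browser") {
    throw new Error(`Unknown environment: ${env}`);
  }
  return ANALYZERS.filter((analyzer) => analyzer[env]).map((analyzer) => analyzer.pkg);
}

console.log(analyzersFor("browser")); // [ 'kuroshiro-analyzer-kuromoji' ]
console.log(analyzersFor("node").length); // 3
```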

Usage

Node.js (or when using a bundler such as Webpack)

Install with the npm package manager:

$ npm install kuroshiro

Load the library:

Both ES6 Module import and CommonJS require are supported.

import Kuroshiro from "kuroshiro";

Instantiate:

const kuroshiro = new Kuroshiro();

Initialize kuroshiro with an analyzer instance (see the API section):

// In this example, first npm install and import the kuromoji analyzer
import KuromojiAnalyzer from "kuroshiro-analyzer-kuromoji";

// ...

// Initialize
// async/await is used here; you can also use Promises
await kuroshiro.init(new KuromojiAnalyzer());

Convert:

const result = await kuroshiro.convert("感じ取れたら手を繋ごう、重なるのは人生のライン and レミリア最高！", { to: "hiragana" });

Browser

Add dist/kuroshiro.min.js to your project (run npm install and then npm run build to build it first), and include it in your HTML:

<script src="url/to/kuroshiro.min.js"></script>

In this example, you also need to include kuroshiro-analyzer-kuromoji.min.js. See kuroshiro-analyzer-kuromoji for how to obtain it.

<script src="url/to/kuroshiro-analyzer-kuromoji.min.js"></script>

Instantiate:

var kuroshiro = new Kuroshiro();

Initialize kuroshiro with an analyzer instance, then convert:

kuroshiro.init(new KuromojiAnalyzer({ dictPath: "url/to/dictFiles" }))
    .then(function () {
        return kuroshiro.convert("感じ取れたら手を繋ごう、重なるのは人生のライン and レミリア最高！", { to: "hiragana" });
    })
    .then(function(result){
        console.log(result);
    })

API

Constructor

Example

const kuroshiro = new Kuroshiro();

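Instance methods

init(analyzer)

Initialize kuroshiro with an analyzer instance. You must first import an analyzer and create an instance of it. You can use one of the ready-made analyzer plugins listed above. For how to initialize an analyzer, refer to that analyzer's documentation.

Arguments

analyzer - An analyzer instance.

Example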

await kuroshiro.init(new KuromojiAnalyzer());
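Because the analyzer is separate from the conversion logic, a customized analyzer can be passed to init as well. The following is a minimal sketch of what such an object might look like, assuming an analyzer exposes an init() method and a parse(str) method that both return Promises, and that tokens carry kuromoji-style surface_form and reading fields. `ToyAnalyzer` and its two-entry dictionary are hypothetical; check the analyzer plugin repositories for the exact contract.

```javascript
// Hypothetical toy analyzer: a sketch of the assumed init()/parse() shape only.
// The token fields (surface_form, reading) are assumptions modeled on kuromoji output.
class ToyAnalyzer {
  constructor() {
    this.dictionary = null;
  }

  // Load resources once; the returned Promise resolves when the analyzer is ready.
  init() {
    this.dictionary = new Map([
      ["手", "テ"],
      ["を", "ヲ"],
    ]);
    return Promise.resolve();
  }

  // Split the input into single-character tokens and look up each reading.
  parse(str = "") {
    if (!this.dictionary) {
      return Promise.reject(new Error("ToyAnalyzer.init() must be called first"));
    }
    const tokens = Array.from(str).map((ch) => ({
      surface_form: ch,
      reading: this.dictionary.get(ch) || ch,
    }));
    return Promise.resolve(tokens);
  }
}

const analyzer = new ToyAnalyzer();
analyzer
  .init()
  .then(() => analyzer.parse("手を"))
  .then((tokens) => console.log(tokens));
```

A real analyzer would segment text into words rather than single characters; the sketch only shows the asynchronous init-then-parse flow.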

convert(str, [options])

Convert the given string to the target syllabary (the conversion mode and other settings are configurable through options).

Arguments

str - The string to convert.
options - Optional conversion options; see the table below.

Option	Type	Default	Description
to	String	'hiragana'	Target syllabary: `hiragana`, `katakana`, or `romaji`
mode	String	'normal'	Conversion mode: `normal`, `spaced`, `okurigana`, or `furigana`
romajiSystem*	String	"hepburn"	Romanization system: `nippon`, `passport`, or `hepburn`
delimiter_start	String	'('	Delimiter (start)
delimiter_end	String	')'	Delimiter (end)

*: The romajiSystem option only takes effect when to is set to romaji. For more about this option, see Romanization systems.
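The defaults in the options table above can be pictured as an object that user-supplied options are merged over. The following is a minimal sketch of that idea; `resolveOptions`, `DEFAULT_OPTIONS`, and `ALLOWED` are hypothetical names, and the validation step is an assumption for illustration rather than a description of kuroshiro's internal behavior.

```javascript
// Defaults copied from the options table above.
const DEFAULT_OPTIONS = Object.freeze({
  to: "hiragana",
  mode: "normal",
  romajiSystem: "hepburn",
  delimiter_start: "(",
  delimiter_end: ")",
});

// Allowed values copied from the same table.
const ALLOWED = {
  to: ["hiragana", "katakana", "romaji"],
  mode: ["normal", "spaced", "okurigana", "furigana"],
  romajiSystem: ["nippon", "passport", "hepburn"],
};

// Hypothetical helper (not part of kuroshiro): merge user options over the
// defaults and reject values that the table does not list.
function resolveOptions(options = {}) {
  const resolved = { ...DEFAULT_OPTIONS, ...options };
  for (const [key, allowed] of Object.entries(ALLOWED)) {
    if (!allowed.includes(resolved[key])) {
      throw new Error(`Invalid value for ${key}: ${resolved[key]}`);
    }
  }
  return resolved;
}

console.log(resolveOptions({ to: "romaji", romajiSystem: "nippon" }));
```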

Example

// normal
kuroshiro.convert("感じ取れたら手を繋ごう、重なるのは人生のライン and レミリア最高！", {mode:"normal", to:"hiragana"});
// Result: かんじとれたらてをつなごう、かさなるのはじんせいのライン and レミリアさいこう！

// spaced
kuroshiro.convert("感じ取れたら手を繋ごう、重なるのは人生のライン and レミリア最高！", {mode:"spaced", to:"hiragana"});
// Result: かんじとれ たら て を つなご う 、 かさなる の は じんせい の ライン   and   レミ リア さいこう ！

// okurigana
kuroshiro.convert("感じ取れたら手を繋ごう、重なるのは人生のライン and レミリア最高！", {mode:"okurigana", to:"hiragana"});
// Result: 感(かん)じ取(と)れたら手(て)を繋(つな)ごう、重(かさ)なるのは人生(じんせい)のライン and レミリア最高(さいこう)！

// furigana
kuroshiro.convert("感じ取れたら手を繋ごう、重なるのは人生のライン and レミリア最高！", {mode:"furigana", to:"hiragana"});
// Result: <ruby>感<rp>(</rp><rt>かん</rt><rp>)</rp></ruby>じ<ruby>取<rp>(</rp><rt>と</rt><rp>)</rp></ruby>れたら<ruby>手<rp>(</rp><rt>て</rt><rp>)</rp></ruby>を<ruby>繋<rp>(</rp><rt>つな</rt><rp>)</rp></ruby>ごう、<ruby>重<rp>(</rp><rt>かさ</rt><rp>)</rp></ruby>なるのは<ruby>人生<rp>(</rp><rt>じんせい</rt><rp>)</rp></ruby>のライン and レミリア<ruby>最高<rp>(</rp><rt>さいこう</rt><rp>)</rp></ruby>！
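To make the difference between the okurigana and furigana output formats concrete, the following sketch renders a few already-paired segments in both styles. It is an illustration of the output shapes shown above, not kuroshiro's implementation; `annotate` and the segment format are hypothetical, and the pairing of kanji with readings (the hard part, done by the analyzer) is assumed to have happened already.

```javascript
// Hypothetical helper (not kuroshiro's implementation): render pre-paired
// segments in okurigana or furigana style.
// Each segment is { text, reading }; reading is omitted for text that needs no annotation.
function annotate(segments, mode, start = "(", end = ")") {
  return segments
    .map(({ text, reading }) => {
      if (!reading) return text;
      if (mode === "okurigana") return `${text}${start}${reading}${end}`;
      if (mode === "furigana") {
        return `<ruby>${text}<rp>${start}</rp><rt>${reading}</rt><rp>${end}</rp></ruby>`;
      }
      throw new Error(`Unsupported mode: ${mode}`);
    })
    .join("");
}

const segments = [
  { text: "手", reading: "て" },
  { text: "を" },
  { text: "繋", reading: "つな" },
  { text: "ごう" },
];

console.log(annotate(segments, "okurigana"));
// 手(て)を繋(つな)ごう
console.log(annotate(segments, "furigana"));
// <ruby>手<rp>(</rp><rt>て</rt><rp>)</rp></ruby>を<ruby>繋<rp>(</rp><rt>つな</rt><rp>)</rp></ruby>ごう
```

The start and end parameters play the role of delimiter_start and delimiter_end: in furigana style they end up inside the rp elements, which browsers with ruby support do not display.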

Utilities

Example

const result = Kuroshiro.Util.isHiragana("あ");

isHiragana(char)

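Check whether the input character is hiragana.

isKatakana(char)

Check whether the input character is katakana.

isKana(char)

Check whether the input character is kana.

isKanji(char)

Check whether the input character is kanji.

isJapanese(char)

Check whether the input character is Japanese.

hasHiragana(str)

Check whether the input string contains hiragana.

hasKatakana(str)

Check whether the input string contains katakana.

hasKana(str)

Check whether the input string contains kana.

hasKanji(str)

Check whether the input string contains kanji.

hasJapanese(str)

Check whether the input string contains Japanese.

kanaToHiragna(str)

Convert the input kana string to hiragana.

kanaToKatakana(str)

Convert the input kana string to katakana.

kanaToRomaji(str, system)

Convert the input kana string to romaji. The system parameter accepts "nippon", "passport", or "hepburn" (default: "hepburn").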
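Checks and conversions of this kind are commonly built on Unicode code point ranges. The following is a minimal standalone sketch of that approach. The functions are local to the example and do not call kuroshiro; the ranges are assumptions based on the Unicode Hiragana, Katakana, and CJK Unified Ideographs blocks, so the library's own boundaries may differ.

```javascript
// Sketch only: the code point ranges below are assumptions, and
// kuroshiro's own checks may use different boundaries.
const inRange = (ch, lo, hi) => {
  const cp = ch.codePointAt(0);
  return cp >= lo && cp <= hi;
};

const isHiragana = (ch) => inRange(ch, 0x3040, 0x309f); // Hiragana block
const isKatakana = (ch) => inRange(ch, 0x30a0, 0x30ff); // Katakana block
const isKana = (ch) => isHiragana(ch) || isKatakana(ch);
const isKanji = (ch) => inRange(ch, 0x4e00, 0x9fff); // CJK Unified Ideographs

// A "has" check is the per-character check applied to any character of the string.
const hasKanji = (str) => Array.from(str).some((ch) => isKanji(ch));

// Katakana ァ..ヶ (U+30A1..U+30F6) sit exactly 0x60 above hiragana ぁ..ゖ,
// so shifting the code point converts one script to the other.
const toHiragana = (str) =>
  Array.from(str)
    .map((ch) =>
      inRange(ch, 0x30a1, 0x30f6)
        ? String.fromCodePoint(ch.codePointAt(0) - 0x60)
        : ch
    )
    .join("");

console.log(isHiragana("あ")); // true
console.log(isKana("ア")); // true
console.log(hasKanji("手を繋ごう")); // true
console.log(toHiragana("ライン")); // らいん
```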

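Romanization systems

kuroshiro supports three romanization systems.

nippon: Nippon-shiki romanization. Refer to ISO 3602 Strict.

passport: Passport-shiki romanization. Refer to the Japanese romanization table published by the Ministry of Foreign Affairs of Japan.

hepburn: Hepburn romanization. Refer to BS 4812 : 1972.

For a quick overview of the differences between these romanization systems, see this useful webpage.

Notice for romaji conversion

Fully automatic direct conversion from furigana to romaji is not possible, because furigana generally lacks accurate pronunciation information. See なぜフリガナではダメなのか？.

For this reason, kuroshiro does not handle chōon (long vowels) when converting furigana (kana) directly to romaji, whichever romanization system is used. (The chōonpu, the long-vowel mark, is still handled.)

For example, converting the kana "こうし" to romaji yields "kousi", "koushi", and "koushi" for the nippon, passport, and hepburn systems respectively.

Kanji -> romaji conversion is not affected by this logic, whether or not furigana mode is used.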
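The "こうし" example above can be reproduced with a character-by-character lookup that never merges "こう" into a long vowel. The following is a minimal sketch under that assumption; `TABLES` covers only the three kana in the example and `naiveKanaToRomaji` is a hypothetical helper, not the library's kanaToRomaji.

```javascript
// Tiny illustrative tables covering only the kana in this example.
// し is "si" in Nippon-shiki and "shi" in the passport and Hepburn systems.
const TABLES = {
  nippon: { "こ": "ko", "う": "u", "し": "si" },
  passport: { "こ": "ko", "う": "u", "し": "shi" },
  hepburn: { "こ": "ko", "う": "u", "し": "shi" },
};

// Hypothetical helper: transliterate kana one character at a time,
// deliberately without merging こう into a long vowel.
function naiveKanaToRomaji(kana, system = "hepburn") {
  const table = TABLES[system];
  if (!table) throw new Error(`Unknown system: ${system}`);
  return Array.from(kana)
    .map((ch) => table[ch] || ch)
    .join("");
}

for (const system of Object.keys(TABLES)) {
  console.log(system, naiveKanaToRomaji("こうし", system));
}
// nippon kousi
// passport koushi
// hepburn koushi
```

Kana alone does not say whether "こう" is one long vowel or two separate vowels, which is why a lookup like this leaves it as "kou" instead of guessing.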

Contributing

Please see CONTRIBUTING.

Inspired by

kuromoji
wanakana

Copyright and license

MIT

changelog

1.2.0 (2021-6-7)

Bug Fixes

fix errors occurring when converting っ to romaji (#61)
fix wrong regex pattern (#60)

Dependencies

Upgrade to babel 7
Update other dependencies

Documents

Add ready to use code in readme
Add Esperanto docs

Miscellaneous

Add husky and lint-staged for pre-commit check

1.1.2 (2018-10-19)

Bug Fixes

fix conversion bug when handling chōon with passport-shiki romanization (#47)
fix kanji->romaji conversion bug when using nippon-shiki/hepburn-shiki romanization (#46)

Test

Update test specification

Miscellaneous

Update docs, add notice for romaji conversion

1.1.1 (2018-08-28)

Bug Fixes

Handle invalid parameter when initializing kuroshiro

Test

Update test specification

1.1.0 (2018-08-13)

Feature

Add support for multiple romanization systems

Bug Fixes

Add babel-runtime dependency, which is used by the CommonJS distribution

Test

Update test specification

Miscellaneous

Update docs

1.0.0 (2018-08-07)

Bump deps

Update kuroshiro-analyzer-kuromoji to version ^1.1.0

Miscellaneous

Update docs

1.0.0-rc.2 (2018-08-05)

Miscellaneous

Update docs

1.0.0-rc.1 (2018-07-26)

BREAKING CHANGE

Separate morphological analyzer from phonetic notation logic to enable the new feature listed below
Embrace ES8/ES2017 to use async/await functions
Use ES6 Module instead of CommonJS
Refactor project structure

Feature

Ability to use different morphological analyzers (ready-made or customized)

Repo Name Change

kuroshiro.js is renamed to kuroshiro to avoid confusion between the names of kuroshiro and its plugins.

Miscellaneous

Add CONTRIBUTING.md
Add README.jp.md
Update documents

0.2.4 (2018-05-23)

Bug Fixes

Fix mispairing when kana is between kanji (#31)

0.2.3 (2018-05-17)

Miscellaneous

Update .npmignore file

0.2.2 (2018-05-17)

Bug Fixes

Fix simple character conversion problem (#28)

0.2.1 (2018-01-24)

Miscellaneous

Fix coverage report problem

0.2.0 (2018-01-24)

Bug Fixes

Fix the replacement starting from the first 'src' when resolving the full path of 'kuromoji/dict' (#19)

Usability

Add typescript typings (#21)

Bump deps

Update dependencies

Miscellaneous

Add README.zh-tw.md
Modify distribution logic
Other trivial modifications

0.1.5 (2017-06-05)

Bug Fixes

Fix wrong pairing of kanji and phonetic notation (reported in #10)

0.1.4 (2017-05-25)

Bug Fixes

Fix wrong recognition when encountering katakana-kanji-mixed tokens (#9)

0.1.3 (2017-01-10)

Usability

Make the callback parameter of the init function optional

0.1.2 (2016-08-22)

Bump deps

Update dependencies in package.json

Miscellaneous

Update README.md and README.zh-cn.md