feat: de-weight 630 full phonetic code
Add a new `--deweight` option to the `make_dicts.sh` script and enable it
by default in the workflow.
amorphobia committed Jul 3, 2023
1 parent dce1166 commit 2a1344e
Showing 4 changed files with 50 additions and 10 deletions.
13 changes: 12 additions & 1 deletion .github/workflows/ci.yaml
@@ -1,6 +1,11 @@
name: Continuous Integration
on:
  workflow_dispatch:
    inputs:
      disable_de_weight_630:
        description: 'Disable the de-weight option: true / false (not disabled by default)'
        required: true
        default: 'false'
  push:
    branches:
      - master
@@ -15,10 +20,16 @@ jobs:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Make Dicts
      - name: Make Dicts (disable de-weight 630)
        if: github.event.inputs.disable_de_weight_630 == 'true'
        run: |
          bash scripts/make_dicts.sh --append dicts/cizu_append.txt --delete dicts/cizu_delete.txt --modify dicts/cizu_modify.txt --version ${{ github.ref_name }}
      - name: Make Dicts (de-weight 630)
        if: github.event.inputs.disable_de_weight_630 != 'true'
        run: |
          bash scripts/make_dicts.sh --append dicts/cizu_append.txt --delete dicts/cizu_delete.txt --modify dicts/cizu_modify.txt --version ${{ github.ref_name }} --deweight
      - name: Install Dependencies
        run: sudo apt-get install -y opencc
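
The two `Make Dicts` steps differ only in the trailing `--deweight` flag, selected by the `disable_de_weight_630` input. As a rough sketch (not part of this commit), the same selection could be expressed in a single shell function. `make_dicts_command` is a hypothetical name, and the function only prints the command line instead of running it.

```shell
# Hypothetical helper, not part of the repository.
# $1: value of the disable_de_weight_630 input ("true" disables de-weighting)
# $2: version string passed to --version
make_dicts_command() {
    disable="$1"
    version="$2"
    # Build the argument list shared by both workflow steps.
    set -- bash scripts/make_dicts.sh \
        --append dicts/cizu_append.txt \
        --delete dicts/cizu_delete.txt \
        --modify dicts/cizu_modify.txt \
        --version "$version"
    # Append --deweight unless de-weighting was explicitly disabled.
    if [ "$disable" != "true" ]; then
        set -- "$@" --deweight
    fi
    echo "$*"
}

make_dicts_command false master   # command line ends with --deweight
make_dicts_command true master    # command line has no --deweight
```

An empty first argument also yields `--deweight`, which mirrors a `push`-triggered run, where no inputs are set and the `!= 'true'` condition should hold.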

28 changes: 22 additions & 6 deletions README.md
@@ -12,19 +12,19 @@

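## Installation

There are four ways to install. The first two do not allow non-overwriting modifications to the cizu dictionary; the last two let you modify cizu by means of patches. Choose whichever suits you
There are four ways to install. The first two do not allow non-overwriting modifications to the cizu dictionary; the last two let you modify cizu by means of patches. Choose whichever suits you, and note that every method requires adding this schema (jiandao) to `default.custom.yaml`.

### Download the Zip package
### 1. Download the Zip package

Download the packaged schema from the [releases page](https://github.com/amorphobia/rime-jiandao/releases), extract the files to the corresponding directory, and add this schema (jiandao) to `default.custom.yaml`
Download the packaged schema from the [releases page](https://github.com/amorphobia/rime-jiandao/releases) and extract the files to the corresponding directory

### 东风破 (plum)
### 2. 东风破 (plum)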

```bash
bash rime-install amorphobia/rime-jiandao@release
```

### Clone and build the dictionaries locally
### 3. Clone and build the dictionaries locally

> Windows users: please run this under WSL
@@ -36,10 +36,26 @@ scripts/make_dicts.sh --append <cizu_append.txt> --delete <cizu_delete.txt> --mod

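Replace the file names with your own; the options may also be omitted. The generated schema is placed in the `schema` directory.

### Generate the schema files automatically with GitHub Actions
### 4. Generate the schema files automatically with GitHub Actions

After forking this repository, put the phrases you want to add, delete, or re-weight, in the required format, into `cizu_append.txt`, `cizu_delete.txt`, and `cizu_modify.txt` under the `dicts` directory. When you push to GitHub, the schema files are generated automatically; the generated files can be found under Actions.

## Differences from the official schema

### Configuration differences

- Fine-tuned the switch menu; the switch for turning off 630 full-code phrases is no longer provided (instead, the weight of 630 full-code phrases is reduced when the dictionary is built)
- Disabled automatic commit; 顶功 (dinggong) commit is used by default
- The semicolon key selects the second candidate, and the single quote selects the third
- Changed the shortcut keys of some switches

### Dictionary differences

- Removed the lianjie dictionary; a selection of its entries was moved into the fuhao dictionary
- Removed the yingwen dictionary because its rules are unclear (you can add it yourself if needed)
- Changed the decomposition of the characters 「嫠」 and 「釐」: each is split into 「𠩺」 plus the remaining part (for 「釐」, the reading xī is recorded rather than lí)
- By default, the weight of the full codes for 630 phrases is reduced (this can be controlled when building the dictionary: without the `--deweight` option, the original weights are kept)

## Open-source license

The original content carries no open-source license statement and is governed by the [Copyright Law of the People's Republic of China](http://www.npc.gov.cn/npc/c30834/202011/848e73f58d4e4c5b82f69d25d46048c6.shtml)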
3 changes: 2 additions & 1 deletion dicts/03.fuhao.txt
@@ -50,9 +50,10 @@ ____ ;z
!” ;tg
……” ;sg
。” ;jg
# Connections
# Links
https:// http
https://xkinput.gitee.io ogw
https://github.com/amorphobia/rime-jiandao orj
https://github.com/xkinput/Rime_JD orj
# Email addresses
@gmail.com ;ag
16 changes: 14 additions & 2 deletions scripts/make_dicts.sh
@@ -31,15 +31,18 @@ usage() {
    echo " -m, --modify <file> the file of terms to be modified and their weight delta"
    echo " -r, --rawdict <file> the raw dict file, default is \"${BASEDIR}/../dicts/cizu_raw.txt\""
    echo " -v, --version <string> the target version"
    echo " --deweight reduce the weight of 630 phrases"
    echo " --clean clean generated files and restore the raw dict"
    echo ""
    echo " -h, --help display this help"
}

RAWDICT="${BASEDIR}/../dicts/cizu_raw.txt"
OUTPUT="${BASEDIR}/../schema/jiandao.base.dict.yaml"
VERSION="master"
DEWEIGHT=0

ARGS=$(getopt -o a:d:m:r:v:h --long append:,delete:,modify:,rawdict:,version:,clean,help -n "$(basename $0)" -- "$@")
ARGS=$(getopt -o a:d:m:r:v:h --long append:,delete:,modify:,rawdict:,version:,deweight,clean,help -n "$(basename $0)" -- "$@")
if [[ $? -ne 0 ]]; then
    usage
    # Exit with a non-zero status so callers can detect the option-parsing failure.
    exit 1
Expand Down Expand Up @@ -69,11 +72,15 @@ while true; do
            VERSION=$2
            shift 2
            ;;
        --deweight )
            DEWEIGHT=1
            shift
            ;;
        --clean )
            if [[ -f "${RAWDICT}.bak" ]]; then
                mv "${RAWDICT}.bak" ${RAWDICT}
            fi
            rm -f ${BASEDIR}/../dicts/02.cizu.txt $(dirname "${OUTPUT}")/*.dict.yaml
            rm -f ${BASEDIR}/../dicts/02.cizu.txt $(dirname "${OUTPUT}")/*.dict.yaml temp.txt
            exit
            ;;
        -- )
@@ -101,6 +108,11 @@ fi
cat ${RAWDICT} ${APPEND} | awk '!seen[$1,$2]++' > temp.txt
mv temp.txt ${RAWDICT}

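if [[ "${DEWEIGHT}" -eq 1 ]]; then
    # For every RAWDICT entry whose first column also appears in 06.630.txt,
    # overwrite the fourth column (the weight) with 500; other lines pass through unchanged.
    awk -v OFS='\t' 'NR==FNR {map[$1]++; next} {if (!map[$1]) print $0; else print $1,$2,$3,500,$5,$6}' ${BASEDIR}/../dicts/06.630.txt ${RAWDICT} > temp.txt
    mv temp.txt ${RAWDICT}
fi

if [[ -f ${DELETE} ]]; then
    # Drop every RAWDICT entry whose first two columns match a line in the delete file.
    awk 'NR==FNR {map[$1,$2]++; next} {if (!map[$1,$2]) print $0}' ${DELETE} ${RAWDICT} > temp.txt
    mv temp.txt ${RAWDICT}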
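
To see what the `--deweight` pass does to the data, here is a minimal, self-contained sketch of the same two-file `awk` idiom run on made-up input. The `deweight` function name, the sample phrases, and the six-column layout are assumptions for illustration only; the real `06.630.txt` and `cizu_raw.txt` may be laid out differently.

```shell
# Hypothetical wrapper around the awk program used in make_dicts.sh.
# $1: file whose first column lists the 630 phrases
# $2: dict file; the fourth column is assumed to hold the weight
deweight() {
    awk -v OFS='\t' '
        NR == FNR { map[$1]++; next }              # first file: remember each phrase
        !map[$1]  { print $0; next }               # not listed: keep the line untouched
                  { print $1, $2, $3, 500, $5, $6 } # listed: overwrite column 4 with 500
    ' "$1" "$2"
}

tmp=$(mktemp -d)
# Made-up sample data, tab-separated.
printf 'alpha\taa\n' > "$tmp/630.txt"
printf 'alpha\taa\tx\t900\ty\tz\nbeta\tbb\tx\t800\ty\tz\n' > "$tmp/raw.txt"

# The alpha line comes out with 500 in column 4; the beta line is unchanged.
deweight "$tmp/630.txt" "$tmp/raw.txt"
rm -rf "$tmp"
```

One caveat of the `NR==FNR` idiom: if the first file is empty, `awk` treats the second file as the first, so nothing is printed. The sketch, like the script, does not guard against that.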