Add proactive and inactive detective mode #54

Open
yujinlin0224 opened this issue Aug 20, 2022 · 12 comments

@yujinlin0224

yujinlin0224 commented Aug 20, 2022

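The detection-based auto-convert mode still converts pages in non-Chinese languages, such as Japanese and English sites: any Chinese characters on them get converted anyway, which does not seem entirely reasonable.
Looking at the source, I found that src/background/runtime/handle-get-auto-convert.ts converts non-Chinese pages using the target Chinese variant directly instead of ignoring them, so the "detection" part may not mean much.

To fix this, would changing

      case ZhType.und:
        return target;

to

      case ZhType.und:
        return undefined;

be enough?
Alternatively, a separate new mode could be added that only auto-detects and converts pages whose language is Chinese.

I have raised a related issue before: tongwentang/New-Tongwentang-for-Firefox#38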
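A minimal sketch of how the two behaviours for `ZhType.und` could sit side by side as a user option. Only `ZhType.und` and `target` come from the snippet above; the other enum members, the `Target` and `DetectMode` types, and the function name `resolveAutoConvert` are assumptions for illustration, not the extension's actual code.

```typescript
// Hypothetical sketch: only `ZhType.und` and `target` appear in the issue;
// every other name here is an assumption made for illustration.
enum ZhType {
  hans = "hans",
  hant = "hant",
  und = "und", // page language was not identified as Chinese
}

type Target = "s2t" | "t2s";
type DetectMode = "proactive" | "passive";

function resolveAutoConvert(
  detected: ZhType,
  target: Target,
  mode: DetectMode,
): Target | undefined {
  if (detected === ZhType.und) {
    // proactive: fall back to the configured target (the current behaviour)
    // passive: skip the page entirely (the change proposed above)
    return mode === "proactive" ? target : undefined;
  }
  // Simplification: pages detected as Chinese always use the configured target.
  return target;
}
```

With this shape, `resolveAutoConvert(ZhType.und, "s2t", "passive")` yields `undefined`, so a caller would leave the page alone, while `"proactive"` keeps today's fallback.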

@t7yang
Contributor

t7yang commented Aug 21, 2022

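When the browser cannot detect the current language, all the extension can do is adopt either an aggressive or a passive strategy, and neither can meet everyone's needs.
At most, the aggressive and passive strategies could be made into an option in the future so that users can choose for themselves.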

@t7yang t7yang added the enhancement New feature or request label Aug 21, 2022
@t7yang t7yang self-assigned this Aug 21, 2022
@yujinlin0224
Author

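I know. The most the browser can do is judge by the lang attribute of <html>, and even that is not 100% accurate, since it depends on how careful the page's developer was. Still, the best outcome would be an option that auto-converts only when the <html> lang attribute starts with zh, with no fallback, so that users have more room to choose.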
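As a rough sketch of the "lang starts with zh, no fallback" check described above: the helper name `isChineseLang` is hypothetical, and the check assumes the attribute holds a BCP 47 style tag such as `zh`, `zh-TW`, or `zh-Hant-HK`.

```typescript
// Hypothetical helper, not part of the extension.
// Returns true only when the tag's primary language subtag is "zh".
function isChineseLang(lang: string | null | undefined): boolean {
  if (!lang) {
    return false; // missing or empty attribute: do not convert, no fallback
  }
  // Match "zh" alone or "zh-" followed by further subtags, case-insensitively.
  return /^zh(-|$)/i.test(lang.trim());
}
```

In a content script this could be called as `isChineseLang(document.documentElement.lang)`; pages that omit or mislabel the attribute would simply be skipped.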

@t7yang
Contributor

t7yang commented Aug 21, 2022

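The extension itself does not take part in language detection; it calls the API provided by the browser directly.
I will consider adding "aggressive" and "conservative" options later (as mentioned in my previous comment).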

@uttchen

uttchen commented Aug 22, 2022

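A question: since the update, the English apostrophe ' seems to be converted automatically into the Chinese closing quotation mark 』.
I'm not sure whether this is related. Is there a way to turn it off?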

@t7yang
Contributor

t7yang commented Aug 22, 2022

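@uttchen That is not the English single quotation mark but the Chinese one (strictly speaking, the single quotation mark used in China is converted to the one used in Taiwan).
The built-in dictionaries cannot currently be selected by users; that will probably be added in the future.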

@yujinlin0224
Author

yujinlin0224 commented Aug 22, 2022

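@uttchen That is not the English single quotation mark but the Chinese one (strictly speaking, the single quotation mark used in China is converted to the one used in Taiwan)

In fact, English publications also use ‘ (U+2018) and ’ (U+2019) as quotation marks, sharing code points with the quotation marks used in China. Some English websites use these characters as well, for better typography.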

@uttchen

uttchen commented Aug 22, 2022

@t7yang @yujinlin0224

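I'm not sure what you mean. What I'm seeing now looks like this:
image
On the left is Firefox with the updated extension installed; on the right is Chrome without it.

All the English quotation marks have been replaced with Chinese ones:
image

In addition, the English apostrophe is also wrongly converted:
image

I never ran into this before the update.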

@t7yang
Contributor

t7yang commented Aug 23, 2022

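I indeed had not considered this. I don't use auto-convert myself, so I never noticed.

This belongs to the dictionary files, though, so please open an issue at https://github.com/tongwentang/tongwen-dict/issues
and list the punctuation marks you found to be problematic; those will be commented out first.

After that I will consider implementing a way to apply the punctuation dictionary per site or per locale.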

@t7yang t7yang changed the title from "Detection-based auto-convert still converts pages in non-Chinese languages" (自動轉換的偵測式在非中文語言頁面仍然做了轉換) to "Add proactive and inactive detective mode" Aug 23, 2022
@alabamagan

Hi, I think a good way to handle that is to add a simple neighbor check when converting these symbols. Obviously, you want to convert only when a Chinese character is the immediate neighbor.
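A minimal sketch of such a neighbor check, under stated assumptions: the quote mapping below (“ ” to 「 」 and ‘ ’ to 『 』) and the function names are illustrative guesses, not the contents of the real dictionary, and "Chinese character" is approximated by two common CJK ideograph ranges.

```typescript
// Hypothetical mapping; the real dictionary entries may differ.
const QUOTE_MAP: Record<string, string> = {
  "\u201C": "「", // left double quotation mark
  "\u201D": "」", // right double quotation mark
  "\u2018": "『", // left single quotation mark
  "\u2019": "』", // right single quotation mark, also used as an apostrophe
};

// Approximation: CJK Unified Ideographs plus Extension A.
function isHan(ch: string): boolean {
  return /[\u3400-\u4DBF\u4E00-\u9FFF]/.test(ch);
}

// Convert a curly quote only when the character right before or after it is Han.
function convertQuotesNearHan(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const mapped = QUOTE_MAP[ch];
    // charAt returns "" for out-of-range indexes, which isHan rejects.
    const nearHan = isHan(text.charAt(i - 1)) || isHan(text.charAt(i + 1));
    out += mapped !== undefined && nearHan ? mapped : ch;
  }
  return out;
}
```

Under this rule the apostrophe in `It’s` is left untouched because both neighbors are Latin letters, while quotes that directly touch Chinese text are converted.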

@t7yang
Contributor

t7yang commented Aug 24, 2022

What if an English article quotes a Chinese sentence?

English... “這是中文” English...

The quotation marks sit right beside Chinese characters.

Instead, maybe let the user choose which dictionary to apply on a site:

After that I will consider implementing a way to apply the punctuation dictionary per site or per locale.
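One way a per-site setting could look, purely as a sketch: the `SiteRule` shape, the function name, and the host-suffix matching are all assumptions, since the thread does not specify how such rules would be stored or matched.

```typescript
// Hypothetical rule shape; nothing like this exists in the extension yet.
interface SiteRule {
  host: string; // e.g. "example.com", also matches its subdomains
  usePunctuationDict: boolean;
}

// Returns the setting of the first matching rule, or the user's default.
function shouldApplyPunctuationDict(
  hostname: string,
  rules: SiteRule[],
  defaultValue: boolean,
): boolean {
  for (const rule of rules) {
    const suffix = "." + rule.host;
    const isSubdomain =
      hostname.length > suffix.length &&
      hostname.slice(-suffix.length) === suffix;
    if (hostname === rule.host || isSubdomain) {
      return rule.usePunctuationDict;
    }
  }
  return defaultValue;
}
```

A user who reads mostly English sites could keep the default at `false` and add rules only for the Chinese sites where punctuation conversion is wanted, or the reverse.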

@alabamagan

Honestly, that doesn't look too bad: both [English... 「這是中文」 English...] and [English... "這是中文" English...] look natural, but I do agree there should be room for users to choose. And I think that applies to the default behavior too, i.e. special cases per website plus user-defined rules to alter the default behavior, cos the default behavior will never satisfy everybody.

@t7yang
Contributor

t7yang commented Aug 24, 2022

cos the default behavior will never satisfy everybody.

🤝
