Moving to UTF-8? #19

New Issue

Tr0n · 2024-04-06T01:58:18+03:00

Tr0n commented

2024-04-06 01:58:18 +03:00

Hey! Really appreciate the effort you put into this. Very fun to work on this code-base.

I was wondering if you would be interested in having the source and header files use UTF-8 encoding instead of EUC-KR?
As most tools now chose UTF-8 by default, comments and string literals using Korean characters are often a mess.
Some files already seemed to have their encoding broken in the past (char.cpp for example).

For my personal repo I converted comments to UTF-8 and used escaped hex for the string literals as they are used for the localization lookups and I didn't want to change any functionality (for now).

If that's something you'd be interested in I'd be glad to issue a PR.

Hey! Really appreciate the effort you put into this. Very fun to work on this code-base. I was wondering if you would be interested in having the source and header files use UTF-8 encoding instead of EUC-KR? As most tools now chose UTF-8 by default, comments and string literals using Korean characters are often a mess. Some files already seemed to have their encoding broken in the past ([`char.cpp`](https://git.old-metin2.com/metin2/server/src/branch/master/src/game/src/char.cpp) for example). For my personal repo I converted comments to UTF-8 and used escaped hex for the string literals as they are used for the localization lookups and I didn't want to change any functionality (for now). If that's something you'd be interested in I'd be glad to issue a PR.

Exynox commented

2024-04-06 07:40:07 +03:00

Hi there and thank you so much for spending time on the project!

As I have probably hinted at numerous times, solving the issue of the multiple encodings in the source code is pretty high up my list. Not only is it hard to tip-toe around accidentally breaking files by using editors that don't detect the encoding correctly, but there are cases (in the client source code, I believe) where EUC-KR, JIS and Big5 are being used all in the same file.

It's a mess and of course I want to get rid of it.

I checked out your repositories and it looks really promising; I like the encoding-fixer project, a quick and dirty tool that gets the job done!

Still, there's one thing I've had on my mind for a while now: instead of keeping the localization strings in Korean, I think we should better replace all of those with their English counterparts. The implementation of this feature in encoding-fixer should be pretty straightforward, only a lookup table is needed.

This will also have the side effect of getting rid of those long escaped hex strings you currently have in your fork, therefore leading to a more readable source code. What do you think?

Hi there and thank you so much for spending time on the project! As I have probably hinted at numerous times, solving the issue of the multiple encodings in the source code is pretty high up my list. Not only is it hard to tip-toe around accidentally breaking files by using editors that don't detect the encoding correctly, but there are cases (in the client source code, I believe) where EUC-KR, JIS and Big5 are being used all in the same file. It's a mess and of course I want to get rid of it. I checked out your repositories and it looks really promising; I like the encoding-fixer project, a quick and dirty tool that gets the job done! Still, there's one thing I've had on my mind for a while now: instead of keeping the localization strings in Korean, I think we should better replace all of those with [their English counterparts](https://git.old-metin2.com/metin2/server/src/branch/master/gamefiles/locale/english/locale_string.txt). The implementation of this feature in encoding-fixer should be pretty straightforward, only a lookup table is needed. This will also have the side effect of getting rid of those long escaped hex strings you currently have in your fork, therefore leading to a more readable source code. What do you think?

Exynox added the

enhancement

label 2024-04-06 07:40:14 +03:00

Tr0n commented

2024-04-06 10:39:42 +03:00

Great, Thanks for the feedback.
I absolutely agree. The hex strings were only a stopgap solution in the first place.
Had the same goal as you in replacing the Korean strings with the English version from the locale.txt.

I will try to write another script to do that substitution automatically, as I don't fancy doing that by hand for over a thousand calls 😆.

For PRs, would you rather have them target the nightly branch or the main branch? I assume nightly?

Great, Thanks for the feedback. I absolutely agree. The hex strings were only a stopgap solution in the first place. Had the same goal as you in replacing the Korean strings with the English version from the locale.txt. I will try to write another script to do that substitution automatically, as I don't fancy doing that by hand for over a thousand calls 😆. For PRs, would you rather have them target the nightly branch or the main branch? I assume nightly?

Exynox commented

2024-04-06 10:41:33 +03:00

Glad we're on the same page!

es, let's target nightly for now, as that's where I do all the testing.

Glad we're on the same page! es, let's target nightly for now, as that's where I do all the testing.

Tr0n commented

2024-04-07 00:26:04 +03:00

I gave it a go and converted the strings to the English counterparts in #20 .

Unfortunately it wasn't as straight forward as implementing a lookup table in the encoding fixer, as there were plenty of strings that were used that weren't localized in the locale_string (yet :), or where the original file's encodings were messed up and matching them automatically was impossible.

However I think I caught them all now and it compiles and runs.

I'll close the Issue, let's continue the discussion on the PR. Makes it easier to reference specific changes.

I gave it a go and converted the strings to the English counterparts in #20 . Unfortunately it wasn't as straight forward as implementing a lookup table in the encoding fixer, as there were plenty of strings that were used that weren't localized in the `locale_string` (yet :), or where the original file's encodings were messed up and matching them automatically was impossible. However I think I caught them all now and it compiles and runs. I'll close the Issue, let's continue the discussion on the PR. Makes it easier to reference specific changes.

Tr0n closed this issue

2024-04-07 00:26:04 +03:00

Sign in to join this conversation.

2 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: metin2/server#19