Moving to UTF-8? #19

Closed
opened 2024-04-06 01:58:18 +03:00 by Tr0n · 4 comments
Contributor

Hey! Really appreciate the effort you put into this. Very fun to work on this code-base.

I was wondering if you would be interested in having the source and header files use UTF-8 encoding instead of EUC-KR?
As most tools now chose UTF-8 by default, comments and string literals using Korean characters are often a mess.
Some files already seemed to have their encoding broken in the past (char.cpp for example).

For my personal repo I converted comments to UTF-8 and used escaped hex for the string literals as they are used for the localization lookups and I didn't want to change any functionality (for now).

If that's something you'd be interested in I'd be glad to issue a PR.

Hey! Really appreciate the effort you put into this. Very fun to work on this code-base. I was wondering if you would be interested in having the source and header files use UTF-8 encoding instead of EUC-KR? As most tools now chose UTF-8 by default, comments and string literals using Korean characters are often a mess. Some files already seemed to have their encoding broken in the past ([`char.cpp`](https://git.old-metin2.com/metin2/server/src/branch/master/src/game/src/char.cpp) for example). For my personal repo I converted comments to UTF-8 and used escaped hex for the string literals as they are used for the localization lookups and I didn't want to change any functionality (for now). If that's something you'd be interested in I'd be glad to issue a PR.
Owner

Hi there and thank you so much for spending time on the project!

As I have probably hinted at numerous times, solving the issue of the multiple encodings in the source code is pretty high up my list. Not only is it hard to tip-toe around accidentally breaking files by using editors that don't detect the encoding correctly, but there are cases (in the client source code, I believe) where EUC-KR, JIS and Big5 are being used all in the same file.

It's a mess and of course I want to get rid of it.

I checked out your repositories and it looks really promising; I like the encoding-fixer project, a quick and dirty tool that gets the job done!

Still, there's one thing I've had on my mind for a while now: instead of keeping the localization strings in Korean, I think we should better replace all of those with their English counterparts. The implementation of this feature in encoding-fixer should be pretty straightforward, only a lookup table is needed.

This will also have the side effect of getting rid of those long escaped hex strings you currently have in your fork, therefore leading to a more readable source code. What do you think?

Hi there and thank you so much for spending time on the project! As I have probably hinted at numerous times, solving the issue of the multiple encodings in the source code is pretty high up my list. Not only is it hard to tip-toe around accidentally breaking files by using editors that don't detect the encoding correctly, but there are cases (in the client source code, I believe) where EUC-KR, JIS and Big5 are being used all in the same file. It's a mess and of course I want to get rid of it. I checked out your repositories and it looks really promising; I like the encoding-fixer project, a quick and dirty tool that gets the job done! Still, there's one thing I've had on my mind for a while now: instead of keeping the localization strings in Korean, I think we should better replace all of those with [their English counterparts](https://git.old-metin2.com/metin2/server/src/branch/master/gamefiles/locale/english/locale_string.txt). The implementation of this feature in encoding-fixer should be pretty straightforward, only a lookup table is needed. This will also have the side effect of getting rid of those long escaped hex strings you currently have in your fork, therefore leading to a more readable source code. What do you think?
Exynox added the
enhancement
label 2024-04-06 07:40:14 +03:00
Author
Contributor

Great, Thanks for the feedback.
I absolutely agree. The hex strings were only a stopgap solution in the first place.
Had the same goal as you in replacing the Korean strings with the English version from the locale.txt.

I will try to write another script to do that substitution automatically, as I don't fancy doing that by hand for over a thousand calls 😆.

For PRs, would you rather have them target the nightly branch or the main branch? I assume nightly?

Great, Thanks for the feedback. I absolutely agree. The hex strings were only a stopgap solution in the first place. Had the same goal as you in replacing the Korean strings with the English version from the locale.txt. I will try to write another script to do that substitution automatically, as I don't fancy doing that by hand for over a thousand calls 😆. For PRs, would you rather have them target the nightly branch or the main branch? I assume nightly?
Owner

Glad we're on the same page!

es, let's target nightly for now, as that's where I do all the testing.

Glad we're on the same page! es, let's target nightly for now, as that's where I do all the testing.
Author
Contributor

I gave it a go and converted the strings to the English counterparts in #20 .

Unfortunately it wasn't as straight forward as implementing a lookup table in the encoding fixer, as there were plenty of strings that were used that weren't localized in the locale_string (yet :), or where the original file's encodings were messed up and matching them automatically was impossible.

However I think I caught them all now and it compiles and runs.

I'll close the Issue, let's continue the discussion on the PR. Makes it easier to reference specific changes.

I gave it a go and converted the strings to the English counterparts in #20 . Unfortunately it wasn't as straight forward as implementing a lookup table in the encoding fixer, as there were plenty of strings that were used that weren't localized in the `locale_string` (yet :), or where the original file's encodings were messed up and matching them automatically was impossible. However I think I caught them all now and it compiles and runs. I'll close the Issue, let's continue the discussion on the PR. Makes it easier to reference specific changes.
Tr0n closed this issue 2024-04-07 00:26:04 +03:00
Sign in to join this conversation.
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: metin2/server#19
No description provided.