Option to select title language

Posted in

#1 by arsym
2019-06-14 at 23:26
This is a request to add multiple languages to the title field. Within each person's profile would an option for "Preferred Title Language". When you browse the website, VN titles will be displayed in your selected language. If the title is not available in your selected language, it would give the romaji, and if it cannot find that then it would go back to the original.

You can start by supporting 3 options: Original, Romanized (eg. romaji) and English. If programmed properly, it should be able to accommodate more languages later.

Example (#Gyakuten Saiban):
Original:  逆転裁判
Romanized: Gyakuten Saiban
----
Japanese: 逆転裁判
English: Phoenix Wright: Ace Attorney
Chinese: 逆转裁判

Thoughts?
#2 by woodcarbuncle
2019-06-14 at 23:41
Anilist does this and it's a great feature
#3 by forestgrant
2019-06-15 at 00:45
This is an excellent idea.
#4 by yorhel
2019-06-15 at 06:31
AniDB does that too. It's been on my to-do list for ages... but never got around to implementing it. It's definitely not ideas we're lacking, it's people to implement them.
#5 by lywzc
2019-06-15 at 15:07
And it can be applied to other fields with multiple languages such as character and staff.
#6 by arsym
2019-06-17 at 17:18
Alright so I took a crack at this and managed to get it to display on most pages. Language is currently implemented as a JSON object in the form {"country_code":"title"}. A new column was created in the database to hold this object.

For example:
{"en":"Phoenix Wright: Ace Attorney", "ja":"逆転裁判"}

The problem is that I cannot sort titles in my preferred language, it always sorts by the romaji title. I'm having trouble figuring out a way of doing this that won't result in me chocking the server with DB requests. Should I just make a DB column for every country code? That just seems wasteful.Last modified on 2019-06-17 at 17:20
#7 by yorhel
2019-06-17 at 17:49
@arsym: Are you playing around with VNDB's source code? This feature isn't going to be very trivial to implement no matter how you do it, but in terms of storage the most sensible approach is normalized form:

CREATE TABLE vn_titles (
id integer NOT NULL REFERENCES vn (id),
lang language NOT NULL,
title text NOT NULL
);

Then there'd need to be some way to join that table in all queries in order to ultimately get a single 'display_title' column which can be used in select and order by clauses. Expressing that in a query in a way that is correct and fast is going to be a challenge, but should be doable.

(General design philosophy with these things: Do as much as possible in SQL)Last modified on 2019-06-17 at 17:51
#8 by arsym
2019-06-17 at 18:34
Hmmm, I'll give that a shot over the weekend. Not entirely sure how to measure the speed or scalability of the commands, is there a number I should be targeting? Thanks.
#9 by yorhel
2019-06-17 at 18:40
Focus on correctness and simplicity first, the test database isn't large enough to do any useful measurements anyway and I'm sure that optimizing the queries shouldn't be ~too~ problematic with that database schema.
#10 by arsym
2019-06-17 at 18:54
Will do.

A few more design consideration. If this works out, I'm thinking of removing the "original" and "title" fields from the vn table and incorporate those in the vn_titles table. Currently using the placeholder country codes "_o" and "_r" for those. Any thoughts on this?
#11 by yorhel
2019-06-17 at 19:02
I think it's better to attempt a conversion into the correct language, based on the VNs first release language. Any errors can then be corrected manually.

Hmm, now that I think of it, the vn_titles table should probably have an additional 'latin_script' column for languages where we romanize (Japanese/Chinese/Russian/etc). Either that or those need to be added as separate language tags. Either way will complicate the selection of the appropriate display title. :(
#12 by arsym
2019-06-17 at 19:41
I'm not sure what you mean by the first line.

You may be thinking of releases whereas I'm only talking about vns. We'd only romanize the original title. If I choose my language to be Korean, I don't care for the vn titles to be romanized Korean, I want to see hangeul. If that's not available then show the romanized version of the original title.

There may be a little redundancy in the "_o" field but this is what I'm envisioning:

TABLE vn_titles

id | lang | title

[Japanese original]
711 | _o | 逆転裁判
711 | _r | Gyakuten Saiban
711 | ja | 逆転裁判
711 | en | Phoenix Wright: Ace Attorney
711 | zh | 逆转裁判

[Russion original]
22867 | _o | 7 дней лета
22867 | _r | 7 dney leta
22867 | ru | 7 дней лета

[English original]
16849 | _o | 1N Only
16849 | en | 1N Only

[Japanese original with english title]
17 | _o | Ever17 -The Out of Infinity-
17 | ja | Ever17 -The Out of Infinity-

With this sutup it looks for your preferred language, then _r, then _o in that order. No idea if that's easily doable in SQL though.Last modified on 2019-06-17 at 19:42
#13 by yorhel
2019-06-17 at 20:33
Hmm, that approach might work, but it duplicates the original title. That's not a problem from a performance point of view, but may get messy from a database consistency point of view. What I had in mind:

TABLE vn_titles

id | lang | title | latin
711 | ja | 逆転裁判 | Gyakuten Saiban
711 | en | Phoenix Wright: Ace Attorney | NULL
711 | zh | 逆转裁判 | Nizhuan Caipan

This is, I think, more semantically useful, avoids having to add pseudo-languages to the 'language' enum, and is probably not more complex to implement. It would, however, require an additional 'main_title_language' field to the vn table in order to select the right title if the user doesn't have any preference. In terms of the querying SQL, you'd then have something like:

SELECT COALESCE(vt_pref.title, vt_orig.latin, vr_orig.title) AS display_title
FROM vn
LEFT JOIN vn_titles vt_pref ON vt_pref.id = vn.id AND vt_pref.lang = 'user_preferenced_lang'
JOIN vn_titles vt_orig ON vt_orig.id = vn.id AND vt_orig.lang = vn.main_title_language

EDIT: Additional advantage: that vn.main_title_language column could then also replace the ugly vn.c_olang hack currently used to determine the "original language" of a VN, which is used in the VN filters.Last modified on 2019-06-17 at 20:46
#14 by arsym
2019-06-17 at 20:49
Hmmm, that might do it. Might get a little funky if users fail to select the correct main_title_language and make it look for a non-existent title. I guess you can throw in a default error title in that SQL query.

Still not sure what utility the latin field would have for Chinese in your example. I'm assuming the releases are staying with the current title/original system so the field is just taking up space? No idea.
#15 by yorhel
2019-06-17 at 20:58
Might get a little funky if users fail to select the correct main_title_language and make it look for a non-existent title.
That obviously should not be possible in the first place. It might be possible to enforce that with a foreign key constraint, but checking that during editing would also work.

Still not sure what utility the latin field would have for Chinese in your example.
Well, the titles would be listed on the VN page, having the latin for all titles is nice for consistency. It also makes it possible to have more advanced user preferences in the form of "Chinese (latin) > Japanese (latin) > English" in the future without having to re-tag everything in the DB. But that's perhaps needlessly fancy...

The title/original thing will remain for releases, yeah.
#16 by kiru
2019-06-17 at 21:14
I'm just seeing this, but there's one huge oversight so far.

If I choose my language to be Korean, I don't care for the vn titles to be romanized Korean, I want to see hangeul.

I think this is wrong. I'm pretty sure that a large amount of users DON'T want to see all games' English titles, just because they use English for the GUI. (provided I correctly understand what English titles would even be. Basically whatever the heck the localization came up with. I.e. Funbag Fantasy instead of Kyonyuu Fantasy ) Same for any language really. I still remember when MAL did a major title change for whatever reason, and it was so damn terrible, even though the goal was simply to get a system into it. Something like this isn't always wanted, or even useful.

If you want to do this, you should be very aware of this kind of stuff and allow users to freely choose which style they want, completely separate from the GUI. (i.e. the current style shouldn't just get lost)Last modified on 2019-06-17 at 21:15
#17 by arsym
2019-06-17 at 21:17
I'll stick to your proposed structure since it seems cleaner with room to grow. Last thing I'd do is add a COLUMN called "official" that holds a boolean which marks whether this is an official title for that language. AniDB shows how that might appear. This could allow for the option to do "English (official only)".

Anyway, I'm happy with this structure.
#18 by arsym
2019-06-17 at 21:28
@kiru
I think there's a small misunderstanding. We're not discussing adding multiple languages to the website itself, it will still remain in English. We are adding an option to display VN titles in the language of your choosing.

By default, you will see the romanized version of the original title (ie. what we currently have). But if you prefer, you can make it display English titles, in which case you'll see Funbag Fantasy.Last modified on 2019-06-17 at 23:24
#19 by arsym
2019-06-17 at 23:54
@yorhel
vn.c_olang is an array of languages whereas we're trying to replace it with a single language. Would this raise any issues? I'm not sure if there are titles with multiple first languages.
#20 by yorhel
2019-06-18 at 05:43
Last thing I'd do is add a COLUMN called "official" that holds a boolean which marks whether this is an official title for that language.
Hmm, not entirely sure this is a good idea. AniDB has a complex moderation system involving staff with a "language verifier" role. I suppose those people are responsible for setting that "official" flag, but I've no clue what it means for a title to lack that flag. Surely they don't accept random titles found around the net? I'm inclined to only permit official titles as VN titles, but I'm open to suggestions from mods and active contributors.

vn.c_olang is an array of languages whereas we're trying to replace it with a single language
That's a workaround because it's impossible to determine a single main language for a VN based on its releases alone, in practice a single main language ought to suffice. The database conversion might be a little tricky, but some heuristics based on "does the original title contain a Japanese character" would prolly work well enough. For reference, here's a list of VNs with more than 1 original language.
#21 by rampaa
2019-06-18 at 11:47
... but some heuristics based on "does the original title contain a Japanese character" would prolly work well enough.
Wouldn't developer's primary language be a better and easier solution for determining the original language of a VN?
#22 by arsym
2019-06-18 at 19:51
I'm inclined to only permit official titles as VN titles, but I'm open to suggestions from mods and active contributors.
I was thinking of fan patches that translate a VN and its title. Not official, but that's how they released it because they thought it might be easier for people in that language.

Thanks for the list, I'll use a single language and we can have someone go over those special cases.

Reply

You must be logged in to reply to this thread.