API: expose more information?

Posted in

#1 by db-ident-1
2022-10-01 at 15:53
< report >I am trying to write a mobile client for VNDB with Flutter (and hopefully open source it when possible), since the current web UI isn't very mobile-friendly. The TCP-based API is not very easy to use in Dart (their socket implementation is not that good to begin with...), but it is still fine.

However, I have found that some information present on the web page is missing from the API, in the order of importance:
* Titles in other languages (VNs and releases). Some people (incl. myself) are not that comfortable with romanized text, it can be convenient to show the title in their native languages in search results (or at least detail pages).

* Links in releases. This include Steam, SteamDB, etc.

* Play Time

If I am missing something and they are actually exposed, please feel free to correct me.
#2 by Ezezin
2022-10-01 at 16:18
< report >
Titles in other languages. [...] Links in releases. [...] Play Time
Yorhel, when was the last time you updated d11? :D

Jokes asides, I'm glad someone started on a mobile client. Do you have a website, blog, reddit or social media where I can follow your progress?Last modified on 2022-10-01 at 16:23
#3 by Yorhel
2022-10-01 at 16:28
< report >d11 is up to date, I just don't expose new site features in the API until I know someone has a use for it. Will do so later.
#4 by db-ident-1
2022-10-01 at 16:31
< report >These information are really not there when I looked at API responses. I also read the API source code (I can barely read Perl, but it is a programming language anyway), and they are not there.

Some of these (e.g. languages) might require additional JOINs, so we should probably put them behind yet another flag.
#5 by db-ident-1
2022-10-01 at 16:35
< report >
Jokes asides, I'm glad someone started on a mobile client. Do you have a website, blog, reddit or social media where I can follow your progress?
Not for now, as I have a day job and a number of papers to finish before the end of this year, so the progress will be slow. However, I will post updates when the client is at least somehow useable.
#6 by Yorhel
2022-10-02 at 13:51
< report >Alright, titles, release links and VN play times are now available through the API.
#7 by db-ident-1
2022-10-02 at 17:49
< report >Sending these two commands
get vn basic,details,titles,stats (platforms = "swi") {"page":1,"results":25,"sort":"rating","reverse":true}
get vn basic,details,titles,stats (platforms = "swi") {"page":2,"results":25,"sort":"rating","reverse":true}
duplicates Code: Realize ~Sousei no Himegimi~ at the tail and head of the pages (consistently at the time of writing).

After running my infinite scroll page for some time, it seems that the sorting is not consistent (planetarian ~Chiisana Hoshi no Yume~ is not always the last item on page 2), and I would assume that the list of top VNs do not change that often?

Is there something wrong with the pagination or should I just use traditional paging instead of infinite scrolling (which is fine but requires more user interation when turning pages)?

This can not be reproduced using the same filter and sorting on the web interface, so I think that something is wrong.Last modified on 2022-10-02 at 18:03
#8 by Yorhel
2022-10-02 at 18:11
< report >Should be fixed now, problem was that there were two VNs with the same rating at the page boundary and result order is then not very deterministic.
#9 by db-ident-1
2022-10-03 at 17:38
< report >link (screenshot on Windows, should look similar on tablets)

Well, now the only thing before this becoming a useful read-only client is a query builder.

The current API works well, but I kinda have ideas on why more developers are not trying to have a go on a mobile client and why we might be better off with HTTP APIs:

* They don't know TCP. Mobile and client developers these days almost always have HTTP APIs to call, the idea of raw TCP sockets can be dreadful. I am lucky because I do a lot of low-level networking stuff in my day job, but I can see how this can be a barrier for many. At the same time, modern UI programming is inherently asynchronous and sometimes multi-threaded, which means that locks will be a problem for many. It can be really confusing when reads and writes become out of order.

* Mobile networking is challenging. I have learned (in hard ways) that TCP (and sometimes even UDP) sockets can only be seen as ephermal on mobile devices. The network itself and the OS can interrupt TCP connections in unpredictable ways that differs from model to model. HTTP libraries are usually more robust and can handle these interruptions better. They also do things like multiplexing, compression, and caching for you (TCP connection pooling can be quite hard in these unstable conditions).

* Raw TCP does not integrate well with the ecosystem. For example, unless I roll my own SOCKS5 and HTTP CONNECT proxy protocol, I can not reliably proxy API calls (while HTTP libraries automatically handle them). I rely on proxies to access VNDB because I am in China, and the network quality to VNDB is less than ideal. Modern HTTP libraries offer many functionalities, and I am sure that developers will miss them when working with our TCP API.

I really hope that the idea of HTTP APIs can be reconsidered, but the current API is fine if that is a lot of work.
#10 by Yorhel
2022-10-03 at 17:48
< report >An HTTP API is still planned, I just need to get off my lazy ass and work out the details. I've already designed the advanced search feature with API support in mind, but then got stuck for a while on how to actually expose that through HTTP. JSON inside a query string is prolly not a great idea, so perhaps a POST/QUERY with body. Anyway, details, I just need to spend some time on it, and APIs in general haven't been an interesting area to work on given that they're not much of a benefit to VNDB itself.

EDIT: Aren't you afraid China will completely cut off VNDB at some point and all your work will have been a waste? Doesn't seem that unlikely given that we have a lot of... questionable content over here.Last modified on 2022-10-03 at 17:53
#11 by db-ident-1
2022-10-03 at 18:00
< report >People there already uses all kinds of methods to obtain those questionable contents there, so I don't think cutting off VNDB will be a problem for the determined. The only thing that might become less useful in my project is Chinese localization, which is actually not a lot of effort.

BTW: is the "official" field missing from releases? I don't know what the "doujin" field means, as there is no matching field on the web UI. And thumbnail URL and size in screenshots, as I don't want to load the full screenshot unconditionally.
#12 by Yorhel
2022-10-03 at 18:09
< report >
is the "official" field missing from releases?
Yup, will add later.

I don't know what the "doujin" field means
Old deprecated field that only exists to not break existing clients, has been removed from the site.
(Wow, I'm surprised I hadn't fully deleted that column from the DB yet)

thumbnail URL and size in screenshots
Thumbnail URL is replacing /sf/ with /st/. Not quite sure what kind of compatibility guarantees I'd assign to the image URL format, but I don't see that breaking anytime soon.
Dimensions are included, size... should be possible I suppose, if you have a use case for it.
#13 by db-ident-1
2022-10-03 at 18:20
< report >Uh.. not size, but the dimensions of thumbnails, and of visual novel (cover) images as well if possible. Without dimension information the layout may jump around after the image is downloaded.
#14 by db-ident-1
2022-10-04 at 15:36
< report >I managed to solve connection issues by having the proxy tool do port forwarding for me (the Chinese people have created probably the most advanced proxy tools) and allow configuration of alternative API backend, which F-Droid requires.

I just remebered that Flutter works in browsers as well (although it might have some problems with CJK font rendering because it is essentially a large canvas rendered with Skia), so we might be able to provide a mobile version for those who do not want to install apps, or those that are on iOS (I don't think we will ever be on the App Store, it is expensive and Apple will not like the content of this site). However, TCP sockets can not be used in browsers even if we are running Web Assembly, only HTTPS and WebSocket are available.

Anyway, I should be able to provide automated GitHub builds before the end of our national holiday (10.7).

EDIT: Source code is now available at linkLast modified on 2022-10-04 at 15:46
#15 by db-ident-1
2022-10-05 at 02:23
< report >I am getting null in the "lang"."title" field (en and zh-Hant, only zh-Hans is fine) for r89135, while the titles show fine on the web interface.
#16 by Yorhel
2022-10-05 at 05:24
< report >Correct, that means it's the same as the "main" title.
#17 by db-ident-1
2022-10-05 at 07:47
< report >So I should grab that title from the VN that it is associated? My interface allows directly search for releases, so that would be a problem...
#18 by Yorhel
2022-10-05 at 07:52
< report >The title is included in the response, it's the one with the "main":true.
#19 by db-ident-1
2022-10-05 at 07:58
< report >r89135 has three titles, zh-Hant is the same as zh-Hans (the field is left empty in the editing UI), but I don't think en should be null, it has a separate translation.
#20 by Yorhel
2022-10-05 at 08:00
< report >It will if you refresh your cache :) r89135.4 is quite fresh.
#21 by Yorhel
2022-10-05 at 12:27
< report >I've started brainstorming (and slowly implementing) a new HTTP-based API. Currently running at link

The biggest open questions are how to do field selection (the Bitbucket API might be a good learning example) and authentication (I have some ideas for this, but it's not the highest priority anyway, let's first get a sensible read-only API). Feedback on other aspects are welcome too, of course.
#22 by db-ident-1
2022-10-05 at 15:22
< report >I am leaning toward the idea that is similar to GraphQL, which is to always list all fields that you want, and only return the requested fields. In this way, we can never think about removing fields from responses.

For example, currently the only reason I am requesting the "details" flag is to get the URL of the image. I might be able to specify
"fields": ["image", "image_width", "image_height"]

However, you might consider accepting compressed JSON (e.g. with Gzip or Zstd) if we are using such a way, since requests can become large when requesting a lot of fields.

A read-only HTTP API is fine, as I don't want to bother with authentication on my side (for now) either :)Last modified on 2022-10-05 at 15:24
#23 by Yorhel
2022-10-07 at 14:27
< report >Yup, explicit individual field selection seems like a good idea. I've gone for a string-based field selection format that can neatly describe nested fields as well.

Currently implemented a good chunk of the VN fetching code now at the beta and it seems to be working pretty well. Give if a try if you got some time.

However, you might consider accepting compressed JSON
Heh, request body compression sounds like a fun way to trigger weird behavior in cheap HTTP implementations. I wonder if gzip outweights the length of a 'Content-Encoding: gzip' header for such small messages, but certainly worth experimenting with... later.
#24 by db-ident-1
2022-10-07 at 15:25
< report >Played around a bit with it in Postman. The example given in the page gives a "image_nullif", which I think should not be there?

I am still working on the (maybe overly complex for mobile) query builder, the new filter format looks much cleaner to implement though... The only thing I have in my client is textual search of titles, so I can not readily test this in the application.
#25 by Yorhel
2022-10-07 at 16:36
< report >
The example given in the page gives a "image_nullif", which I think should not be there?
Whoops, leaked some internals.

The only thing I have in my client is textual search of titles, so I can not readily test this in the application.
And that's arguably the most important query, too. I've just added a "search" filter.