|#1 by yorhel|
2019-08-31 at 14:03
|Since a few days now we've had a shiny new feature on VNDB: You can run custom SQL queries directly against the database. Or, to be precise, against the public database dump. After a short community consultation and a bit of experimentation I've decided to go with SQLPad, a simple and friendly tool that allows you to run, save and share queries. It also has (limited) graphing capabilities.|
My hope is that this facility will be useful for moderation purposes, to easily find data inconsistencies and missing or incorrect entries. I'd imagine it can also be useful for other purposes, such as more advanced querying than what the website has to offer or for interesting data analysis and stats.
You can find all the information about this new querying interface on this page.
(Alternative title: "SQL Injection as a Service!")
|#2 by shining17|
2019-08-31 at 15:36
|Thanks for your continuous hardwork for this DB.Last modified on 2019-08-31 at 15:36|
|#3 by rampaa|
2019-08-31 at 16:30
|Is it possible to save our queries privately?|
What is the sane way of getting VNs with a specific original language? I think the following works as intended but it's way too slow so I can't use it without getting a timeout:
WITH first_release (vid, released)
Is there a stored procedure to find all child tags of a tag? I do the following but doing it for all the tags I want to include/exclude seems way too repetitive.
WITH RECURSIVE descendant_tags (gid)Last modified on 2019-08-31 at 16:49
|#4 by yorhel|
2019-08-31 at 17:13
Is it possible to save our queries privately?No. I suppose you could keep a notepad somewhere :)
As for original language, this seems to be pretty fast (and uses the current VNDB definition of "original" language, if I didn't screw this up somewhere):
EDIT: Forgot to add, but we'll have a proper "original language" field when this makes it in,
Is there a stored procedure to find all child tags of a tag?There might be if you write one. :)
Alternatively I can also import the 'tags_vn_inherit' and 'traits_chars' cache tables from the main DB which have the tag<->VN and trait<->char links with the parents included, though the inheritance algorithm used may not necessarily be the one you want (case in point).Last modified on 2019-08-31 at 17:15
|#5 by rampaa|
2019-08-31 at 18:15
... this seems to be pretty fastYup it is, thanks.
There might be if you write one.
CREATE OR REPLACE FUNCTION get_descendant_tags (tag_id INTEGER)
Is this acceptable?
("SQL Injection as a Service" does not let me inject things. I shall have my refund!)
Also, I am quite surprised to hear that things mentioned in link are intentional.Last modified on 2019-08-31 at 18:19
|#6 by yorhel|
2019-08-31 at 18:33
t12750 is intentional.It kind of is and kind of isn't. It's yet another case of the system being used in a way I hadn't envisioned, and I don't have a proper solution yet.
|#7 by hinoe|
2019-08-31 at 22:17
|"SQL Injection as a Service" is a far better title and you should be ashamed of yourself for not having gone with it. :P|
Also -- SELECT do_you_remember_my_script FROM things_not_in_dump WHERE fun_was_had = TRUE (I know no SQL whatsoever so this is likely malformed even if you disregard the rest) :D
|#8 by yorhel|
2019-09-01 at 08:20
Is this acceptable?Yup, modified it a little bit to have more consistent naming with the rest of the DB.
Extras added to the DB are documented at /extras.sql.txt.
|#9 by hybtranslation|
2019-09-02 at 08:10
|Wow, this looks really cool. Just a question, the closest to a database schema I could find is |
Could you maybe post the create table statements, or something similar? One could probably download a database dump and extract it from there, but I think the whole point of this service is that you don't have to do get an SQL Server running in the first place.
P.S. "SQL Injection as a Service!" really made my day. Was actually my first thought when I read as far as SQLPad. ^^Last modified on 2019-09-02 at 08:11
|#10 by yorhel|
2019-09-02 at 08:18
|SQLPad has a built-in schema browser (the "Toggle schema" button to the left of the "VNDB" connection), but it's a little buggy at the moment. The full schema can be found in the VNDB repo - only columns marked as [pub] are available for querying.|
|#11 by hybtranslation|
2019-09-02 at 09:05
|#12 by hybtranslation|
2019-09-02 at 10:24
|Alright, I played around with it a bit. Pretty awesome.|
I personally like to split up my queries into little parcels and combine them at the end. That way, you can easily reuse the bits elsewhere. Kinda like this:
SELECT first_release.year, AVG(vote_avg) AS avg_rating, AVG(vote_cnt) AS avg_votes, COUNT(v.id) AS novels FROM
This calculates the average rating of VNs grouped by year of first release, only considering VNs if they have at least 150 votes. Anyway, the queries first_release and votes can easily be reused in any other kind of query, you only have to join on the vid. Hmm, maybe it's just another way to do WITH...
Including the table vn is actually redundant in the above query, but it probably makes it easier to mess around with it and also to check the underlying data.
Anyway, it seems to be implemented well. My queries don't even take a second.
P.S. I noticed that vn.c_popularity, vn.c_rating etc are not public. Maybe consider making those available, calculating the rating like I do is probably making Mr. Bayes angry. ^^Last modified on 2019-09-02 at 10:31
|#13 by rampaa|
2019-09-03 at 22:56
CREATE OR REPLACE FUNCTION get_child_traits(trait_id INTEGER) RETURNS TABLE(trait INTEGER) AS $$Can this be added as well?Last modified on 2019-09-03 at 22:58
|#14 by yorhel|
2019-09-04 at 06:45
Can this be added as well?Done.
I noticed that vn.c_popularity, vn.c_rating etc are not public.I've added c_rating (the bayesian thing) and c_votecount. The popularity rankings are much more expensive to recalculate. :(
|#15 by gimnacio|
2019-09-10 at 07:36
|So I tried to get the mode out of all the novels in my vnlist since i had too many and wanted to reduce to those which did not have high score but popularity or rating sucked, average could have worked somehow but I found mode to be more accurate for i want at least|
anyway so I made this query
SELECT v.title, v.id, votos.score, votos.cont, votos.ROW_NUMBER
FROM vn v
from vnlists vnl
where uid = 123003 -- this is my user id so if you want to try it in your case change that i guess?
ON mylista.vid = v.id
select round(vo.vote/10,0) as score, vo.vid, count(vo.vote) as cont,
ROW_NUMBER () OVER(
PARTITION BY vo.vid ORDER BY COUNT(round(vo.vote/10,0)) DESC
FROM votes vo
GROUP BY vo.vid, score
ON votos.vid = v.id
where votos.ROW_NUMBER = 1
order by votos.score asc
But the thing is that a lot of votes I do not get to see because of private so I am not getting the real numbers
can you not create a hidden user id 0 or something and dump all hidden votes on him so we can use that data for better queries and nones privacy is violated in the end (right?)
|#16 by yorhel|
2019-09-10 at 07:41
|No, private votes aren't and will likely never be included in the dumps or that query DB. Not even in anonymized form.|
|#17 by gimnacio|
2019-09-10 at 16:14
|ok, thanks for answering so fast and not being vague on the subject.|
|#18 by uvix|
2019-09-11 at 03:40
|Is there a lookup table for the enums you're saving? (Specifically vn.length.)|
Also, is there any query to return the current user's userid (e.g. so I could join to my data without hardcoding that value somewhere)?Last modified on 2019-09-11 at 03:42
|#19 by yorhel|
2019-09-11 at 05:30
Is there a lookup table for the enums you're saving? (Specifically vn.length.)Kind of, but those ugly smallint-enums are subject to change at some point - either to Postgres ENUM types or separate tables.
Also, is there any query to return the current user's useridNo. I suppose it could be hacked into SQLPad as some sort of placeholder value, but I'm not very interested in doing that work. :)
|#20 by gimnacio|
2019-09-11 at 06:20
|another thing I am not sure if it s just me not looking right, but I cannot find how to connect the company which produces a vn to it,|
what I want is to make a list of the companies to know how many vns I have of each of them but there seems to be no column connecting that in the sql pad, only id title original alias length length img_nsfw image desc l_wp l_encubed l_renai l_wikidata c_rating c_votecount
|#21 by yorhel|
2019-09-11 at 06:32
|producers <-> releases_producers <-> releases_vn <-> vn|
|#22 by gimnacio|
2019-09-11 at 16:15
|ok thanks again for the quick reply.|
|#23 by rampaa|
2019-09-19 at 17:00
|As per d18#4:|
My queries related to DB maintenance:
VNs and releases where を/ヲ has been romanized as "wo": link
Possible misromanization of おう as "oh" for character names: link
Possible misromanization of おう as "oh" for releases: link
VNs without developers: link
Some (hopefully) interesting statistics:
Number of new Japanese VNs per year: link
Number of Japanese VNs translated into English per year: link
Number of VNs per language: link
Number of translated OJLVNs per language: link
Also a few other things that VNDB doesn't let us do at the moment: t12507.23 (Those search queries do not suffer from the unwanted side effects mentioned in t12750)
Also, link shows the result of my suggestion regarding finding a single original language (t12465.21). I think it's sufficiently correct.Last modified on 2019-09-19 at 17:17
|#24 by roadi|
2019-09-19 at 17:24
|Some characters have their names missing a quote ('), e.g., Shinichi (correct: Shin'ichi): link|
|#25 by beliar|
2019-09-19 at 19:05
VNs and releases where を/ヲ has been romanized as "wo": linkFixed all, except for the RU releases of Ushinawareta Mirai o Motomete. The non-standard spelling is allowed in releases if the dev/publisher specifically uses that spelling for the release, and it seems to me that the Russian team plastered the title with WO in it all over their website. Of course I didn't download the game to check what the title reads when it's played with their translation
Possible misromanization of おう as "oh" for character names: linkFixed those that were incorrect.
Possible misromanization of おう as "oh" for releases: linkFixedLast modified on 2019-09-19 at 20:11
You must be logged in to reply to this thread.