#1 by Yorhel 2019-08-31 at 14:03 | < report >Since a few days now we've had a shiny new feature on VNDB: You can run custom SQL queries directly against the database. Or, to be precise, against the public database dump. After a short community consultation and a bit of experimentation I've decided to go with SQLPad, a simple and friendly tool that allows you to run, save and share queries. It also has (limited) graphing capabilities. My hope is that this facility will be useful for moderation purposes, to easily find data inconsistencies and missing or incorrect entries. I'd imagine it can also be useful for other purposes, such as more advanced querying than what the website has to offer or for interesting data analysis and stats. You can find all the information about this new querying interface on this page. Happy querying! (Alternative title: "SQL Injection as a Service!")Last modified on 1970-01-01 at 00:00 |
#2 by Ninigi 2019-08-31 at 15:36 | < report >Thanks for your continuous hardwork for this DB.Last modified on 2019-08-31 at 15:36 |
#3 by rampaa 2019-08-31 at 16:30 | < report >Is it possible to save our queries privately? What is the sane way of getting VNs with a specific original language? I think the following works as intended but it's way too slow so I can't use it without getting a timeout: WITH first_release (vid, released) Is there a stored procedure to find all child tags of a tag? I do the following but doing it for all the tags I want to include/exclude seems way too repetitive. WITH RECURSIVE descendant_tags (gid)Last modified on 2019-08-31 at 16:49 |
#4 by Yorhel 2019-08-31 at 17:13 | < report > Is it possible to save our queries privately? No. I suppose you could keep a notepad somewhere :)As for original language, this seems to be pretty fast (and uses the current VNDB definition of "original" language, if I didn't screw this up somewhere): SELECT title EDIT: Forgot to add, but we'll have a proper "original language" field when this makes it in, Is there a stored procedure to find all child tags of a tag? There might be if you write one. :)Alternatively I can also import the 'tags_vn_inherit' and 'traits_chars' cache tables from the main DB which have the tag<->VN and trait<->char links with the parents included, though the inheritance algorithm used may not necessarily be the one you want (case in point).Last modified on 2019-08-31 at 17:15 |
#5 by rampaa 2019-08-31 at 18:15 | < report > ... this seems to be pretty fast Yup it is, thanks.There might be if you write one. CREATE OR REPLACE FUNCTION get_descendant_tags (tag_id INTEGER) Is this acceptable? ("SQL Injection as a Service" does not let me inject things. I shall have my refund!) Also, I am quite surprised to hear that things mentioned in link are intentional.Last modified on 2019-08-31 at 18:19 |
#6 by Yorhel 2019-08-31 at 18:33 | < report > t12750 is intentional. It kind of is and kind of isn't. It's yet another case of the system being used in a way I hadn't envisioned, and I don't have a proper solution yet.Last modified on 1970-01-01 at 00:00 |
#7 by hinoe 2019-08-31 at 22:17 | < report >"SQL Injection as a Service" is a far better title and you should be ashamed of yourself for not having gone with it. :P Also -- SELECT do_you_remember_my_script FROM things_not_in_dump WHERE fun_was_had = TRUE (I know no SQL whatsoever so this is likely malformed even if you disregard the rest) :D |
#8 by Yorhel 2019-09-01 at 08:20 | < report > Is this acceptable? Yup, modified it a little bit to have more consistent naming with the rest of the DB.Extras added to the DB are documented at /extras.sql.txt.Last modified on 1970-01-01 at 00:00 |
#9 by hybtranslation 2019-09-02 at 08:10 | < report >Wow, this looks really cool. Just a question, the closest to a database schema I could find is link Could you maybe post the create table statements, or something similar? One could probably download a database dump and extract it from there, but I think the whole point of this service is that you don't have to do get an SQL Server running in the first place. P.S. "SQL Injection as a Service!" really made my day. Was actually my first thought when I read as far as SQLPad. ^^Last modified on 2019-09-02 at 08:11 |
#10 by Yorhel 2019-09-02 at 08:18 | < report >SQLPad has a built-in schema browser (the "Toggle schema" button to the left of the "VNDB" connection), but it's a little buggy at the moment. The full schema can be found in the VNDB repo - only columns marked as [pub] are available for querying. |
#11 by hybtranslation 2019-09-02 at 09:05 | < report >Thanks, awesome! |
#12 by hybtranslation 2019-09-02 at 10:24 | < report >Alright, I played around with it a bit. Pretty awesome. I personally like to split up my queries into little parcels and combine them at the end. That way, you can easily reuse the bits elsewhere. Kinda like this: SELECT first_release.year, AVG(vote_avg) AS avg_rating, AVG(vote_cnt) AS avg_votes, COUNT(v.id) AS novels FROM This calculates the average rating of VNs grouped by year of first release, only considering VNs if they have at least 150 votes. Anyway, the queries first_release and votes can easily be reused in any other kind of query, you only have to join on the vid. Hmm, maybe it's just another way to do WITH... Including the table vn is actually redundant in the above query, but it probably makes it easier to mess around with it and also to check the underlying data. Anyway, it seems to be implemented well. My queries don't even take a second. P.S. I noticed that vn.c_popularity, vn.c_rating etc are not public. Maybe consider making those available, calculating the rating like I do is probably making Mr. Bayes angry. ^^Last modified on 2019-09-02 at 10:31 |
#13 by rampaa 2019-09-03 at 22:56 | < report >CREATE OR REPLACE FUNCTION get_child_traits(trait_id INTEGER) RETURNS TABLE(trait INTEGER) AS $$Can this be added as well?Last modified on 2019-09-03 at 22:58 |
#14 by Yorhel 2019-09-04 at 06:45 | < report > Can this be added as well? Done.I noticed that vn.c_popularity, vn.c_rating etc are not public. I've added c_rating (the bayesian thing) and c_votecount. The popularity rankings are much more expensive to recalculate. :( |
#15 by gimnacio 2019-09-10 at 07:36 | < report >So I tried to get the mode out of all the novels in my vnlist since i had too many and wanted to reduce to those which did not have high score but popularity or rating sucked, average could have worked somehow but I found mode to be more accurate for i want at least anyway so I made this query SELECT v.title, v.id, votos.score, votos.cont, votos.ROW_NUMBER FROM vn v inner join ( SELECT vnl.vid from vnlists vnl where uid = 123003 -- this is my user id so if you want to try it in your case change that i guess? ) mylista ON mylista.vid = v.id left JOIN ( select round(vo.vote/10,0) as score, vo.vid, count(vo.vote) as cont, ROW_NUMBER () OVER( PARTITION BY vo.vid ORDER BY COUNT(round(vo.vote/10,0)) DESC ) FROM votes vo GROUP BY vo.vid, score ) votos ON votos.vid = v.id where votos.ROW_NUMBER = 1 order by votos.score asc But the thing is that a lot of votes I do not get to see because of private so I am not getting the real numbers can you not create a hidden user id 0 or something and dump all hidden votes on him so we can use that data for better queries and nones privacy is violated in the end (right?) |
#16 by Yorhel 2019-09-10 at 07:41 | < report >No, private votes aren't and will likely never be included in the dumps or that query DB. Not even in anonymized form.Last modified on 1970-01-01 at 00:00 |
#17 by gimnacio 2019-09-10 at 16:14 | < report >ok, thanks for answering so fast and not being vague on the subject. |
#18 by uvix 2019-09-11 at 03:40 | < report >Is there a lookup table for the enums you're saving? (Specifically vn.length.) Also, is there any query to return the current user's userid (e.g. so I could join to my data without hardcoding that value somewhere)?Last modified on 2019-09-11 at 03:42 |
#19 by Yorhel 2019-09-11 at 05:30 | < report > Is there a lookup table for the enums you're saving? (Specifically vn.length.) Kind of, but those ugly smallint-enums are subject to change at some point - either to Postgres ENUM types or separate tables.Also, is there any query to return the current user's userid No. I suppose it could be hacked into SQLPad as some sort of placeholder value, but I'm not very interested in doing that work. :) |
#20 by gimnacio 2019-09-11 at 06:20 | < report >another thing I am not sure if it s just me not looking right, but I cannot find how to connect the company which produces a vn to it, what I want is to make a list of the companies to know how many vns I have of each of them but there seems to be no column connecting that in the sql pad, only id title original alias length length img_nsfw image desc l_wp l_encubed l_renai l_wikidata c_rating c_votecount |
#21 by Yorhel 2019-09-11 at 06:32 | < report >producers <-> releases_producers <-> releases_vn <-> vn |
#22 by gimnacio 2019-09-11 at 16:15 | < report >ok thanks again for the quick reply. |
#23 by rampaa 2019-09-19 at 17:00 | < report >As per d18#4: My queries related to DB maintenance: VNs and releases where を/ヲ has been romanized as "wo": link Possible misromanization of おう as "oh" for character names: link Possible misromanization of おう as "oh" for releases: link VNs without developers: link Some (hopefully) interesting statistics: Number of new Japanese VNs per year: link Number of Japanese VNs translated into English per year: link Number of VNs per language: link Number of translated OJLVNs per language: link Also a few other things that VNDB doesn't let us do at the moment: t12507.23 (Those search queries do not suffer from the unwanted side effects mentioned in t12750) Also, link shows the result of my suggestion regarding finding a single original language (t12465.21). I think it's sufficiently correct.Last modified on 2019-09-19 at 17:17 |
#24 by roadi 2019-09-19 at 17:24 | < report >Some characters have their names missing a quote ('), e.g., Shinichi (correct: Shin'ichi): link |
#25 by beliar 2019-09-19 at 19:05 | < report > VNs and releases where を/ヲ has been romanized as "wo": link Fixed all, except for the RU releases of Ushinawareta Mirai o Motomete. The non-standard spelling is allowed in releases if the dev/publisher specifically uses that spelling for the release, and it seems to me that the Russian team plastered the title with WO in it all over their website. Of course I didn't download the game to check what the title reads when it's played with their translationEdit: Possible misromanization of おう as "oh" for character names: link Fixed those that were incorrect.Possible misromanization of おう as "oh" for releases: link FixedLast modified on 2019-09-19 at 20:11 |