Support for custom fields configured using SQL queries #16

Open
simonw opened this issue Aug 3, 2020 · 19 comments
Labels
enhancement New feature or request

Comments


simonw commented Aug 3, 2020

Datasette canned queries could work really well with this plugin, especially in conjunction with the ability to restrict direct access to tables in #4.

Andy Ingram on Twitter says: https://twitter.com/andrewingram/status/1289624490037530625

The GraphQL creators are (I think) unanimous in their skepticism of tools that bring GraphQL directly to your database or ORM, because they just provide carte blanche access to your entire data model, without actually giving API design proper consideration.

By turning off direct access to tables and instead using canned queries, developers can use this plugin to design the exact GraphQL mechanism that they want. Canned query arguments will turn into GraphQL arguments.

Writable canned queries could even be used to define GraphQL mutations.
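For illustration, here's a rough sketch (in Python dict form) of the kind of canned query this would apply to; the fixtures database, dogs table and :species parameter are made up, and the GraphQL shape shown in the comment is only a guess at what the mapping could look like.

# Hypothetical canned query definition, as it might appear in Datasette metadata.
metadata = {
    "databases": {
        "fixtures": {
            "queries": {
                "dogs_by_species": {
                    "sql": "select * from dogs where species = :species"
                }
            }
        }
    }
}
# Under this proposal, GraphQL might expose something like:
#   { dogs_by_species(species: "corgi") { ... } }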

simonw added the enhancement label Aug 3, 2020

simonw commented Aug 4, 2020

Need to think about how magic parameters will work: https://datasette.readthedocs.io/en/stable/sql_queries.html#magic-parameters

simonw mentioned this issue Aug 6, 2020
simonw added this to the 1.0 milestone Aug 6, 2020

simonw commented Aug 6, 2020

Initial implementation just needs to support read-only canned queries with their parameters turned into GraphQL arguments.


simonw commented Aug 6, 2020

I'll implement first: and after: pagination too. I won't bother handling the case where a canned query itself has parameters called first: or after: - those queries can be rewritten to use different parameter names.


simonw commented Aug 6, 2020

Here's a tricky aspect of canned queries: how do we know what columns will be returned so we can define them in the schema? And how do we know the types of those columns? We may have to return them all as strings.

Running the query with "limit 0" is enough to get back the column names:

https://latest.datasette.io/fixtures.json?sql=select+neighborhood%2C+facet_cities.name%2C+state%0D%0Afrom+facetable%0D%0A++++join+facet_cities%0D%0A++++++++on+facetable.city_id+%3D+facet_cities.id%0D%0Awhere+neighborhood+like+%27%25%27+||+%3Atext+||+%27%25%27%0D%0Aorder+by+neighborhood+limit+0

Column types may not be possible to obtain in advance. Might have to default to string and allow people to override those in plugin configuration somehow.
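A rough sketch of that trick using Python's sqlite3 module directly (the table and query here are made up): cursor.description gives back the column names, but as noted it doesn't give their types.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table facetable (neighborhood text, city_id integer, state text)")

# The user's canned query (hypothetical example).
sql = "select neighborhood, state from facetable where neighborhood like '%' || :text || '%'"

# Wrapping it in a subselect with "limit 0" fetches no rows, but
# cursor.description still reports one entry per returned column.
cursor = conn.execute("select * from ({}) limit 0".format(sql), {"text": ""})
print([col[0] for col in cursor.description])
# ['neighborhood', 'state']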

simonw added a commit that referenced this issue Aug 6, 2020

simonw commented Aug 7, 2020

It would be neat if you could optionally specify that certain columns returned by a canned query are actually IDs that correspond to primary keys in other tables, to enable foreign key nested expansions.


simonw commented Aug 7, 2020

Maybe this isn't about existing canned queries at all: maybe it's about being able to define custom named fields in terms of SQL queries, with type information and explicit arguments and nested SQL definitions for nested fields.

All this could go in the plugin configuration.


simonw commented Aug 7, 2020

It would also be good if these could support cursor-based pagination. This is a feature missing from Datasette canned queries at the moment. It requires a defined sort order.

simonw changed the title from Support for canned queries to Support for configured (mayb canned?) queries Aug 8, 2020

simonw commented Aug 8, 2020

For the problem of detecting what types a SQL query will return, I may have a partial solution:

sqlite> create table tmp as select * from foo;
sqlite> .schema
CREATE TABLE foo (id int, name text, score float);
CREATE TABLE tmp(id INT,name TEXT,score REAL);
sqlite> create table tmp2 as select 1, 'dog', true, 34.5, * from foo;
sqlite> .schema
CREATE TABLE foo (id int, name text, score float);
CREATE TABLE tmp(id INT,name TEXT,score REAL);
CREATE TABLE tmp2(
  "1",
  "'dog'",
  true,
  "34.5",
  id INT,
  name TEXT,
  score REAL
);
sqlite> create table tmp3 as select 1 as c1, 'dog' as c2, true as c3, 34.5 as c4, * from foo;
sqlite> .schema
CREATE TABLE foo (id int, name text, score float);
CREATE TABLE tmp(id INT,name TEXT,score REAL);
CREATE TABLE tmp2(
  "1",
  "'dog'",
  true,
  "34.5",
  id INT,
  name TEXT,
  score REAL
);
CREATE TABLE tmp3(
  c1,
  c2,
  c3,
  c4,
  id INT,
  name TEXT,
  score REAL
);
sqlite> pragma table_info(tmp4);
sqlite> pragma table_info(tmp3);
cid|name|type|notnull|dflt_value|pk
0|c1||0||0
1|c2||0||0
2|c3||0||0
3|c4||0||0
4|id|INT|0||0
5|name|TEXT|0||0
6|score|REAL|0||0
sqlite> 

So basically the trick is to run create table tmp as select ... limit 0 (within a transaction) - then inspect that new tmp table to see what types its columns have before rolling back the transaction to delete the temporary table.

This appears to work for queries that select from a real table, but weirdly it can't identify the type of literal values that were included in the select.
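Here's a rough Python sketch of that trick using the sqlite3 module directly; infer_column_types and its fallback to TEXT are assumptions for illustration, not existing plugin code.

import sqlite3

def infer_column_types(conn, sql):
    # Materialize the query as a throwaway table with "limit 0", read the
    # declared column types, then roll the whole thing back.
    conn.execute("begin")
    try:
        conn.execute("create table _type_probe as select * from ({}) limit 0".format(sql))
        rows = conn.execute("pragma table_info(_type_probe)").fetchall()
        # Rows are (cid, name, type, notnull, dflt_value, pk); the type comes
        # back as an empty string for literals and expressions, so fall back
        # to TEXT for those.
        return {name: (coltype or "TEXT") for _, name, coltype, *_ in rows}
    finally:
        conn.execute("rollback")

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# begin/rollback above controls the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table foo (id int, name text, score float)")
print(infer_column_types(conn, "select 1 as c1, name || 'dog' as c2, * from foo"))
# {'c1': 'TEXT', 'c2': 'TEXT', 'id': 'INT', 'name': 'TEXT', 'score': 'REAL'}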


simonw commented Aug 8, 2020

Using limit 1 may well fix this, but only if the query can return at least one row.


simonw commented Aug 8, 2020

No, even the limit 1 trick didn't correctly detect the types of those literal values:

sqlite> create table foo5 as select 1 as c1, 'dog' as c2, true as c3, 34.5 as c4, * from foo limit 1;
sqlite> pragma table_info(foo5);
cid|name|type|notnull|dflt_value|pk
0|c1||0||0
1|c2||0||0
2|c3||0||0
3|c4||0||0
4|id|INT|0||0
5|name|TEXT|0||0
6|score|REAL|0||0
sqlite> select * from foo5;
c1|c2|c3|c4|id|name|score
1|dog|1|34.5|1|simon|3.5


simonw commented Aug 8, 2020

I can assume a type of TEXT for any columns whose type was not detected.


simonw commented Aug 8, 2020

An operation that concatenated text onto a value from a real column also resulted in a type that was not detected correctly:

sqlite> create table foo6 as select 1 as c1, name || 'dog' as c2, true as c3, 34.5 as c4, * from foo limit 0;
sqlite> pragma table_info(foo6);
cid|name|type|notnull|dflt_value|pk
0|c1||0||0
1|c2||0||0
2|c3||0||0
3|c4||0||0
4|id|INT|0||0
5|name|TEXT|0||0
6|score|REAL|0||0

simonw changed the title from Support for configured (mayb canned?) queries to Support for configured (maybe canned?) queries Aug 8, 2020

simonw commented Aug 9, 2020

Columns without types can be tricky: they won't benefit from type coercion, so queries like where colname = '23' will fail if the column contains an integer - see https://www.sqlite.org/forum/forumpost/e3ed929e7c
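A quick illustration of the problem in plain sqlite3 (nothing plugin-specific here):

import sqlite3

conn = sqlite3.connect(":memory:")
# Column "c" has no declared type, so it gets no type affinity.
conn.execute("create table t (c)")
conn.execute("insert into t values (23)")

# Without affinity there's no coercion, so the string '23' never matches
# the stored integer 23.
print(conn.execute("select count(*) from t where c = '23'").fetchone()[0])  # 0
print(conn.execute("select count(*) from t where c = 23").fetchone()[0])    # 1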


simonw commented Aug 9, 2020

Configuration concept:

plugins:
  datasette-graphql:
    databases:
      github:
        fields:
          recent_releases:
            sql: |-
              select * from releases order by created_at desc
            fields:
              repo:
                pk_for_table: repos
              contributors:
                row_type: users
                sql: |-
                  select * from users
                  where id in (
                    select user_id from commits
                    where release_id=:id limit 5
                  )

This would set up a recent_releases root query which executes that SQL - and returns an object with a custom nested field called repo that expands to a repo (based on the repo column being a primary key for the repos table).

That nested recent_releases item also has a contributors field, which executes a custom query and returns rows that are then presented as if they had come from the automatically created users type.
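To make that concrete, here's the kind of query this configuration might enable, expressed as a Python call against the plugin's /graphql endpoint; the column names (full_name, login), the flat result shape and the port are purely illustrative.

import httpx

query = """
{
  recent_releases {
    id
    repo { full_name }
    contributors { login }
  }
}
"""
# Assumes a local Datasette with datasette-graphql running on port 8001.
response = httpx.post("http://localhost:8001/graphql", json={"query": query})
print(response.json())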


simonw commented Aug 9, 2020

This is pretty complex stuff! I'm not convinced by the names of things (especially row_type and pk_for_table). I also need to figure out how nodes/edges/pageInfo pagination would work.

Maybe there's a paginated: true option which turns on nodes/pageInfo for that field.


simonw commented Aug 9, 2020

I'm not going to try and get existing canned queries working with this, because of the need for custom syntax to define field types and nested fields.

simonw changed the title from Support for configured (maybe canned?) queries to Support for custom fields configured using SQL queries Aug 9, 2020

simonw commented Aug 22, 2020

The easiest way to implement this would be to use Generic JSON, as seen in #53. I could get this feature running really quickly with that.

The downside is that it wouldn't have the ability for the GraphQL query author to select exactly which columns came back. That seems like a shame - but the amount of work required to introspect the SQL query and figure out the columns isn't trivial.

Alternatively... let the configuration author choose. By default use Generic JSON, but if they take the time to define the columns that are returned by the query they get a real GraphQL schema out of it. Could look something like this:

plugins:
  datasette-graphql:
    databases:
      github:
        queries:
          releases_since:
            sql: |-
              select id, title, created_at from releases where created_at > :created_since
            fields:
              id: integer
              title: text
              created_at: text

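Internally that per-field type information could map straight onto GraphQL types - a rough sketch, assuming the graphene types the plugin already builds on, with anything unrecognized defaulting to String (the TEXT fallback discussed earlier). This is not existing plugin code.

import graphene

# Map the type names used in the configuration above onto graphene types.
TYPE_MAP = {
    "integer": graphene.Int,
    "text": graphene.String,
    "float": graphene.Float,
}

def fields_to_graphene(fields):
    return {
        name: TYPE_MAP.get(type_name, graphene.String)()
        for name, type_name in fields.items()
    }

print(fields_to_graphene({"id": "integer", "title": "text", "created_at": "text"}))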

simonw commented Aug 22, 2020

Given this configuration format it would still be nice to be able to define nested fields, potentially by optionally referencing existing tables.

simonw removed this from the 1.0 milestone Aug 23, 2020

simonw commented Aug 23, 2020

This should be designed at the same time as mutations support in #47 since they will likely work in a very similar way.
