Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API tokens with view-table but not view-database/view-instance cannot access the table #2102

Closed
simonw opened this issue Jul 14, 2023 · 20 comments · Fixed by #2154
Closed

API tokens with view-table but not view-database/view-instance cannot access the table #2102

simonw opened this issue Jul 14, 2023 · 20 comments · Fixed by #2154

Comments

@simonw
Copy link
Owner

simonw commented Jul 14, 2023

Spotted a problem while working on this: if you grant a token access to view table for a specific table but don't also grant view database and view instance permissions, that token is useless.

This was a deliberate design decision in Datasette - it's documented on https://docs.datasette.io/en/1.0a2/authentication.html#access-permissions-in-metadata

If a user cannot access a specific database, they will not be able to access tables, views or queries within that database. If a user cannot access the instance they will not be able to access any of the databases, tables, views or queries.

I'm now second-guessing if this was a good decision.

Originally posted by @simonw in simonw/datasette-auth-tokens#7 (comment)

@simonw
Copy link
Owner Author

simonw commented Jul 14, 2023

I think I made this decision because I was thinking about default deny: obviously if a user has been denied access to a database. It doesn't make sense that they could access tables within it.

But now that I am spending more time with authentication tokens, which default to denying everything, except for the things that you have explicitly listed, this policy, no longer makes as much sense.

@simonw
Copy link
Owner Author

simonw commented Jul 14, 2023

Relevant code:

datasette/datasette/app.py

Lines 822 to 855 in 0f7192b

async def ensure_permissions(
self,
actor: dict,
permissions: Sequence[Union[Tuple[str, Union[str, Tuple[str, str]]], str]],
):
"""
permissions is a list of (action, resource) tuples or 'action' strings
Raises datasette.Forbidden() if any of the checks fail
"""
assert actor is None or isinstance(actor, dict), "actor must be None or a dict"
for permission in permissions:
if isinstance(permission, str):
action = permission
resource = None
elif isinstance(permission, (tuple, list)) and len(permission) == 2:
action, resource = permission
else:
assert (
False
), "permission should be string or tuple of two items: {}".format(
repr(permission)
)
ok = await self.permission_allowed(
actor,
action,
resource=resource,
default=None,
)
if ok is not None:
if ok:
return
else:
raise Forbidden(action)

@simonw
Copy link
Owner Author

simonw commented Jul 14, 2023

I tried some code spelunking and came across d6e03b0 which says:

  • If you have table permission but not database permission you can now view the table page

Refs:

Which suggests that my initial design decision wasn't what appears to be implemented today.

Needs more investigation.

@simonw simonw added the bug label Jul 14, 2023
@simonw simonw changed the title Reconsider nested permissions policy API tokens with view-table but not view-database/view-instance cannot access the table Jul 14, 2023
@simonw
Copy link
Owner Author

simonw commented Jul 14, 2023

This might only be an issue with the code that checks _r on actors.

if "_r" not in actor:
# No restrictions, so we have no opinion
return None
_r = actor.get("_r")
# Does this action have an abbreviation?
to_check = {action}
permission = datasette.permissions.get(action)
if permission and permission.abbr:
to_check.add(permission.abbr)
# If _r is defined then we use those to further restrict the actor
# Crucially, we only use this to say NO (return False) - we never
# use it to return YES (True) because that might over-ride other
# restrictions placed on this actor
all_allowed = _r.get("a")
if all_allowed is not None:
assert isinstance(all_allowed, list)
if to_check.intersection(all_allowed):
return None
# How about for the current database?
if isinstance(resource, str):
database_allowed = _r.get("d", {}).get(resource)
if database_allowed is not None:
assert isinstance(database_allowed, list)
if to_check.intersection(database_allowed):
return None
# Or the current table? That's any time the resource is (database, table)
if resource is not None and not isinstance(resource, str) and len(resource) == 2:
database, table = resource
table_allowed = _r.get("r", {}).get(database, {}).get(table)
# TODO: What should this do for canned queries?
if table_allowed is not None:
assert isinstance(table_allowed, list)
if to_check.intersection(table_allowed):
return None
# This action is not specifically allowed, so reject it
return False

Added in bcc781f - refs:

@simonw
Copy link
Owner Author

simonw commented Jul 14, 2023

Here's that crucial comment:

If _r is defined then we use those to further restrict the actor.

Crucially, we only use this to say NO (return False) - we never use it to return YES (True) because that might over-ride other restrictions placed on this actor

So that's why I implemented it like this.

The goal here is to be able to issue a token which can't do anything more than the actor it is associated with, but CAN be configured to do less.

So I think the solution here is for the _r checking code to perhaps implement its own view cascade logic - it notices if you have view-table and consequently fails to block view-table and view-instance.

I'm not sure that's going to work though - would that mean that granting view-table grants view-database in a surprising and harmful way?

Maybe that's OK: if you have view-database but permission checks fail for individual tables and queries you shouldn't be able to see a thing that you shouldn't. Need to verify that though.

Also, do Permission instances have enough information to implement this kind of cascade without hard-coding anything?

@simonw
Copy link
Owner Author

simonw commented Jul 17, 2023

Confirmed that this is an issue with regular Datasette signed tokens as well. I created one on https://latest.datasette.io/-/create-token with these details:

{
    "_r": {
        "r": {
            "fixtures": {
                "sortable": [
                    "vt"
                ]
            }
        }
    },
    "a": "root",
    "d": 3600,
    "t": 1689614483
}

Run like this:

curl -H 'Authorization: Bearer dstok_eyJhIjoicm9vdCIsInQiOjE2ODk2MTQ0ODMsImQiOjM2MDAsIl9yIjp7InIiOnsiZml4dHVyZXMiOnsic29ydGFibGUiOlsidnQiXX19fX0.n-VGxxawz1Q0WK7sqLfhXUgcvY0' \
  https://latest.datasette.io/fixtures/sortable.json

Returned an HTML Forbidden page:

<!DOCTYPE html>
<html>
<head>
    <title>Forbidden</title>
    <link rel="stylesheet" href="/-/static/app.css?d59929">
...

Same token againts /-/actor.json returns:

{
  "actor": {
    "id": "root",
    "token": "dstok",
    "_r": {
      "r": {
        "fixtures": {
          "sortable": [
            "vt"
          ]
        }
      }
    },
    "token_expires": 1689618083
  }
}

Reminder - "_r" means restrict, "r" means resource.

@simonw
Copy link
Owner Author

simonw commented Jul 18, 2023

I think I've figured out the problem here.

The question being asked is "can this actor access this resource, which is within this database within this instance".

The answer to this question needs to consider the full set of questions at once - yes they can access within this instance IF they have access to the specified table and that's the table being asked about.

But the questions are currently being asked independently, which means the plugin hook acting on view-instance can't see that the answer here should be yes because it's actually about a table that the actor has explicit permission to view.

So I think I may need to redesign the plugin hook to always see the full hierarchy of checks, not just a single check at a time.

@simonw simonw self-assigned this Jul 25, 2023
@simonw
Copy link
Owner Author

simonw commented Aug 23, 2023

This is the hook in question:

@hookspec
def permission_allowed(datasette, actor, action, resource):
"""Check if actor is allowed to perform this action - return True, False or None"""

  • True means they are allowed to access it. You only need a singleTrue from a plugin to allow it.
  • False means they are not, and just one False from a plugin will deny it (even if another one returned True I think)
  • None means that the plugin has no opinion on this question.

@simonw
Copy link
Owner Author

simonw commented Aug 23, 2023

Built this new test:

@pytest.mark.asyncio
async def test_view_table_token_can_access_table(perms_ds):
    actor = {
        "id": "restricted-token",
        "token": "dstok",
        # Restricted to just view-table on perms_ds_two/t1
        "_r": {"r": {"perms_ds_two": {"t1": ["vt"]}}},
    }
    cookies = {"ds_actor": perms_ds.client.actor_cookie(actor)}
    response = await perms_ds.client.get("/perms_ds_two/t1.json", cookies=cookies)
    assert response.status_code == 200

The test fails. Running it with pytest --pdb let me do this:

(Pdb) from pprint import pprint
(Pdb) pprint(perms_ds._permission_checks)
deque([{'action': 'view-table',
        'actor': {'_r': {'r': {'perms_ds_two': {'t1': ['vt']}}},
                  'id': 'restricted-token',
                  'token': 'dstok'},
        'resource': ('perms_ds_two', 't1'),
        'result': None,
        'used_default': True,
        'when': '2023-08-23T21:59:45.117155'},
       {'action': 'view-database',
        'actor': {'_r': {'r': {'perms_ds_two': {'t1': ['vt']}}},
                  'id': 'restricted-token',
                  'token': 'dstok'},
        'resource': 'perms_ds_two',
        'result': False,
        'used_default': False,
        'when': '2023-08-23T21:59:45.117189'},
       {'action': 'view-instance',
        'actor': {'_r': {'r': {'perms_ds_two': {'t1': ['vt']}}},
                  'id': 'restricted-token',
                  'token': 'dstok'},
        'resource': None,
        'result': False,
        'used_default': False,
        'when': '2023-08-23T21:59:45.126751'},
       {'action': 'debug-menu',
        'actor': {'_r': {'r': {'perms_ds_two': {'t1': ['vt']}}},
                  'id': 'restricted-token',
                  'token': 'dstok'},
        'resource': None,
        'result': False,
        'used_default': False,
        'when': '2023-08-23T21:59:45.126777'}],
      maxlen=200)

@simonw
Copy link
Owner Author

simonw commented Aug 23, 2023

Idea: datasette-permissions-debug plugin which simply prints out a stacktrace for every permission check so you can see where in the code they are.

@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

There might be an easier way to solve this. Here's some permission checks that run when hitting /fixtures/facetable.json:

permission_allowed: action=view-table, resource=('fixtures', 'facetable'), actor={'_r': {'a': ['vi'], 'd': {'fixtures': ['vd']}, 'r': {'fixtures': {'facetable': ['vt']}}}, 'a': 'user'}

  File "/datasette/views/table.py", line 727, in table_view_traced
    view_data = await table_view_data(

  File "/datasette/views/table.py", line 875, in table_view_data
    visible, private = await datasette.check_visibility(

  File "/datasette/app.py", line 890, in check_visibility
    await self.ensure_permissions(actor, permissions)

permission_allowed: action=view-database, resource=fixtures, actor={'_r': {'a': ['vi'], 'd': {'fixtures': ['vd']}, 'r': {'fixtures': {'facetable': ['vt']}}}, 'a': 'user'}

  File "/datasette/views/table.py", line 727, in table_view_traced
    view_data = await table_view_data(

  File "/datasette/views/table.py", line 875, in table_view_data
    visible, private = await datasette.check_visibility(

  File "/datasette/app.py", line 890, in check_visibility
    await self.ensure_permissions(actor, permissions)

permission_allowed: action=view-instance, resource=<None>, actor={'_r': {'a': ['vi'], 'd': {'fixtures': ['vd']}, 'r': {'fixtures': {'facetable': ['vt']}}}, 'a': 'user'}

  File "/datasette/views/table.py", line 727, in table_view_traced
    view_data = await table_view_data(

  File "/datasette/views/table.py", line 875, in table_view_data
    visible, private = await datasette.check_visibility(

  File "/datasette/app.py", line 890, in check_visibility
    await self.ensure_permissions(actor, permissions)

That's with a token that has the view instance, view database and view table permissions required.

But... what if the restrictions logic said that if you have view-table you automatically also get view-database and view-instance?

Would that actually let people do anything they shouldn't be able to do? I don't think it would even let them see a list of tables that they weren't allowed to visit, so it might be OK.

I'll try that and see how it works.

@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

I applied a fun trick to help test this out:

diff --git a/datasette/cli.py b/datasette/cli.py
index 58f89c1c..830f47ef 100644
--- a/datasette/cli.py
+++ b/datasette/cli.py
@@ -445,6 +445,10 @@ def uninstall(packages, yes):
     "--token",
     help="API token to send with --get requests",
 )
+@click.option(
+    "--actor",
+    help="Actor to use for --get requests",
+)
 @click.option("--version-note", help="Additional note to show on /-/versions")
 @click.option("--help-settings", is_flag=True, help="Show available settings")
 @click.option("--pdb", is_flag=True, help="Launch debugger on any errors")
@@ -499,6 +503,7 @@ def serve(
     root,
     get,
     token,
+    actor,
     version_note,
     help_settings,
     pdb,
@@ -611,7 +616,10 @@ def serve(
         headers = {}
         if token:
             headers["Authorization"] = "Bearer {}".format(token)
-        response = client.get(get, headers=headers)
+        cookies = {}
+        if actor:
+            cookies["ds_actor"] = client.actor_cookie(json.loads(actor))
+        response = client.get(get, headers=headers, cookies=cookies)
         click.echo(response.text)
         exit_code = 0 if response.status == 200 else 1
         sys.exit(exit_code)

This adds a --actor option to datasette ... --get /path which makes it easy to test an API endpoint using a fake actor with a set of _r restrictions.

With that in place I can try this, with a token that has view-instance and view-database and view-table:

datasette fixtures.db --get '/fixtures/facetable.json' --actor '{
    "_r": {
        "a": [
            "vi"
        ],
        "d": {
            "fixtures": [
                "vd"
            ]
        },
        "r": {
            "fixtures": {
                "facetable": [
                    "vt"
                ]
            }
        }
    },
    "a": "user"
}'

Or this, with a token that just has view-table but is missing the view-database and view-instance:

datasette fixtures.db --get '/fixtures/facetable.json' --actor '{
    "_r": {
        "r": {
            "fixtures": {
                "facetable": [
                    "vt"
                ]
            }
        }
    },
    "a": "user"
}'

@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

On first test this seems to work!

diff --git a/datasette/default_permissions.py b/datasette/default_permissions.py
index 63a66c3c..9303dac8 100644
--- a/datasette/default_permissions.py
+++ b/datasette/default_permissions.py
@@ -187,6 +187,30 @@ def permission_allowed_actor_restrictions(datasette, actor, action, resource):
         return None
     _r = actor.get("_r")
 
+    # Special case for view-instance: it's allowed if there are any view-database
+    # or view-table permissions defined
+    if action == "view-instance":
+        database_rules = _r.get("d") or {}
+        for rules in database_rules.values():
+            if "vd" in rules or "view-database" in rules:
+                return None
+        # Now check resources
+        resource_rules = _r.get("r") or {}
+        for _database, resources in resource_rules.items():
+            for rules in resources.values():
+                if "vt" in rules or "view-table" in rules:
+                    return None
+
+    # Special case for view-database: it's allowed if there are any view-table permissions
+    # defined within that database
+    if action == "view-database":
+        database_name = resource
+        resource_rules = _r.get("r") or {}
+        resources_in_database = resource_rules.get(database_name) or {}
+        for rules in resources_in_database.values():
+            if "vt" in rules or "view-table" in rules:
+                return None
+
     # Does this action have an abbreviation?
     to_check = {action}
     permission = datasette.permissions.get(action)

Needs a LOT of testing to make sure what it's doing is sensible though.

@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

Also need to confirm that permissions like insert-row, delete-row, create-table etc don't also need special cases to ensure they get through the view-instance etc checks, if those exist for those actions.

@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

With that fix in place, this works:

datasette fixtures.db --get '/fixtures/facetable.json' --actor '{
    "_r": {
        "r": {
            "fixtures": {
                "facetable": [
                    "vt"
                ]
            }
        }
    },
    "a": "user"
}'

But this fails, because it's for a table not explicitly listed:

datasette fixtures.db --get '/fixtures/searchable.json' --actor '{
    "_r": {
        "r": {
            "fixtures": {
                "facetable": [
                    "vt"
                ]
            }
        }
    },
    "a": "user"
}'

@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

I'm going to implement this in a branch to make it easier to test out.

simonw added a commit that referenced this issue Aug 24, 2023
Also includes a prototype implementation of --actor option from #2153 which I'm using for testing this.
@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

I tested this out against a Datasette Cloud instance. I created a restricted token and tested it like this:

curl -H "Authorization: Bearer $TOKEN" \
  'https://$INSTANCE/-/actor.json' | jq
{
  "actor": {
    "id": "245",
    "token": "dsatok",
    "token_id": 2,
    "_r": {
      "r": {
        "data": {
          "all_stocks": [
            "vt"
          ]
        }
      }
    }
  }
}

It can access the all_stocks demo table:

curl -H "Authorization: Bearer $TOKEN" \
  'https://$INSTANCE/data/all_stocks.json?_size=1' | jq
{
  "ok": true,
  "next": "1",
  "rows": [
    {
      "rowid": 1,
      "Date": "2013-01-02",
      "Open": 79.12,
      "High": 79.29,
      "Low": 77.38,
      "Close": 78.43,
      "Volume": 140124866,
      "Name": "AAPL"
    }
  ],
  "truncated": false
}

Accessing the database returns just information about that table, even though other tables exist:

curl -H "Authorization: Bearer $TOKEN" \
  'https://$INSTANCE/data.json?_size=1'
{
  "database": "data",
  "private": true,
  "path": "/data",
  "size": 3796992,
  "tables": [
    {
      "name": "all_stocks",
      "columns": [
        "Date",
        "Open",
        "High",
        "Low",
        "Close",
        "Volume",
        "Name"
      ],
      "primary_keys": [],
      "count": 8813,
      "hidden": false,
      "fts_table": null,
      "foreign_keys": {
        "incoming": [],
        "outgoing": []
      },
      "private": true
    }
  ],
  "hidden_count": 0,
  "views": [],
  "queries": [],
  "allow_execute_sql": false,
  "table_columns": {}
}

And hitting the top-level /.json thing does the same - it reveals that table but not any of the other tables or databases:

curl -H "Authorization: Bearer $TOKEN" \
  'https://$INSTANCE/.json?_size=1'
{
  "data": {
    "name": "data",
    "hash": null,
    "color": "8d777f",
    "path": "/data",
    "tables_and_views_truncated": [
      {
        "name": "all_stocks",
        "columns": [
          "Date",
          "Open",
          "High",
          "Low",
          "Close",
          "Volume",
          "Name"
        ],
        "primary_keys": [],
        "count": null,
        "hidden": false,
        "fts_table": null,
        "num_relationships_for_sorting": 0,
        "private": false
      }
    ],
    "tables_and_views_more": false,
    "tables_count": 1,
    "table_rows_sum": 0,
    "show_table_row_counts": false,
    "hidden_table_rows_sum": 0,
    "hidden_tables_count": 0,
    "views_count": 0,
    "private": false
  }
}

@simonw
Copy link
Owner Author

simonw commented Aug 24, 2023

So what's needed to finish this is:

  • Tests that demonstrate that nothing is revealed that shouldn't be by tokens restricted in this way
  • Similar tests for other permissions like create-table that check that they work (and don't also need view-instance etc).
  • Documentation

@simonw
Copy link
Owner Author

simonw commented Aug 28, 2023

Here's an existing relevant test:

# non-abbreviations should work too
({"id": "t", "_r": {"a": ["view-instance"]}}, "view-instance", None, None, DEF),
(
{"id": "t", "_r": {"d": {"one": ["view-database"]}}},
"view-database",
"one",
None,
DEF,
),
(
{"id": "t", "_r": {"r": {"one": {"t1": ["view-table"]}}}},
"view-table",
"one",
"t1",
DEF,
),
),
)
async def test_actor_restricted_permissions(
perms_ds, actor, permission, resource_1, resource_2, expected_result
):
cookies = {"ds_actor": perms_ds.sign({"a": {"id": "root"}}, "actor")}
csrftoken = (await perms_ds.client.get("/-/permissions", cookies=cookies)).cookies[
"ds_csrftoken"
]
cookies["ds_csrftoken"] = csrftoken
response = await perms_ds.client.post(
"/-/permissions",
data={
"actor": json.dumps(actor),
"permission": permission,
"resource_1": resource_1,
"resource_2": resource_2,
"csrftoken": csrftoken,
},
cookies=cookies,
)
expected_resource = []
if resource_1:
expected_resource.append(resource_1)
if resource_2:
expected_resource.append(resource_2)
if len(expected_resource) == 1:
expected_resource = expected_resource[0]
expected = {
"actor": actor,
"permission": permission,
"resource": expected_resource,
"result": expected_result,
}
assert response.json() == expected

It's not quite right for this new set of tests though, since they need to be exercising actual endpoints (/.json etc) in order to check that this works correctly.

@simonw
Copy link
Owner Author

simonw commented Aug 28, 2023

I want to test "for this set of restrictions, does a GET/POST to this path return 200 or 403"?

simonw added a commit that referenced this issue Aug 28, 2023
simonw added a commit that referenced this issue Aug 28, 2023
simonw added a commit that referenced this issue Aug 29, 2023
…perations (#2154)

Closes #2102

* Permission is now a dataclass, not a namedtuple - refs #2154
* datasette.get_permission() method
simonw added a commit that referenced this issue Aug 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant