Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial structure for APIv3 #5356

Merged
merged 96 commits into from
May 9, 2019
Merged

Initial structure for APIv3 #5356

merged 96 commits into from
May 9, 2019

Conversation

humitos
Copy link
Member

@humitos humitos commented Feb 26, 2019

Structure for APIv3 (django app readthedocs.api.v3),

At the moment, nested expansions are like: /api/v3/projects/pip/?expand=users,active_versions,active_versions.last_build. Also, ?fields=slug,active_versions.slug can be added to select only some fields or ?omit=created,links to omit fields.

Missing parts:

  • define proper values for throttling (DEFAULT_THROTTLE_RATES setting)
  • decide if we are going to remove privacy_level and description from Project response
  • move the module under readthedocs.api.v3
  • make nested expansions work properly
  • sanitize all fields '' to be null on the response
  • return proper version.urls.vcs
  • generate version.downloads as full URLs including schema
  • declare version.last_build as expansible field
  • create serializers for links URLs
  • docstring on new methods
  • implement Users endpoint (/api/v3/users/ and /api/v3/projects/pip/users/)
  • implement BuildCommand endpoint (/api/v3/projects/pip/builds/1025/commands/)
  • support . on URLs (the auto generated URL does not support it 'api/v3/projects/(?P<parent_lookup_project__slug>[^/.]+)/versions/(?P<version_slug>[^/.]+)/$')
  • tests
  • sort keys on response
  • allow public detail and deny public listing requests

Example of current project details response: https://gist.github.com/humitos/4c0eab5c4afbca4fa10894c59d2cde09

@humitos humitos added the PR: work in progress Pull request is not ready for full review label Feb 26, 2019
from readthedocs.projects.models import Project
from readthedocs.restapi.permissions import IsOwner

from .filters import ProjectFilter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This references something that doesn't exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups... I forget to push this file.

@humitos humitos force-pushed the humitos/api/v3-implementation branch from b28c822 to cd6839d Compare March 5, 2019 21:00
@humitos humitos requested a review from a team March 6, 2019 08:05
@humitos humitos added the Needed: tests Tests are required label Mar 6, 2019
else:
slug_url = self.slug

if self.project.repo_type in ('git', 'gitlab'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repo_type means what I expects. Here we need to know what kind of repository it is to generate a valid http linkfor the online VCS for that version.

This method should probably need to be refactored, anyway. For now, it does what I need, but it's not robust and does not work for all the cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what @stsewd is saying is that gitlab is not even one of the REPO_CHOICES.

In [2]: Project.objects.filter(repo_type='gitlab').count()
Out[2]: 0

This should probably be:

if self.project.repo_type == REPO_TYPE_GIT:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you both are right!

I got confused here. I want to check if github or gitlab or bitbucket is in the repository's URL since I'm trying to build the URL for the version on the external VCS service.

data = {}

for k, v in downloads.items():
if k in ('htmlzip', 'pdf', 'epub'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should accept every key from get_downloas instead of filtering them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to enforce to have a consistent response on the API, but now I realized that the breaking change will be produced if the field is missing and not if there is one new. So, this check should probably be removed.

privacy_level = PrivacyLevelSerializer(source='*')
urls = ProjectURLsSerializer(source='*')
subproject_of = serializers.SerializerMethodField()
translation_of = serializers.SerializerMethodField()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we call these main_project and main_language internally

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think that subproject_of and translation_of are better names for the users of the API, though.

* Retrieve only needed data: ``/api/v3/projects/?fields=slug,created``
* Retrieve specific project: ``/api/v3/projects/{project_slug}/``
* Expand required fields: ``/api/v3/projects/{project_slug}/?expand=active_versions``
* Translations of a projects: ``/api/v3/projects/{project_slug}/translations/``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Translations of a projects: ``/api/v3/projects/{project_slug}/translations/``
* Translations of a project: ``/api/v3/projects/{project_slug}/translations/``

Same for the ones below

* Expand required fields: ``/api/v3/projects/{project_slug}/?expand=active_versions``
* Translations of a projects: ``/api/v3/projects/{project_slug}/translations/``
* Subprojects of a projects: ``/api/v3/projects/{project_slug}/subprojects/``
* Superprojects of a projects: ``/api/v3/projects/{project_slug}/superprojects/``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't allow nested subprojects, so one project can only have one superproject

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. This endpoint could be complete removed or named in singular and return a detailed project object instead. Although, I don't think it will be useful.

Probably removing it is better for now.

@humitos humitos force-pushed the humitos/api/v3-implementation branch from 514333a to 00eceef Compare March 7, 2019 19:27
@humitos humitos mentioned this pull request Mar 11, 2019
@humitos humitos requested a review from a team March 11, 2019 14:39
@humitos humitos force-pushed the humitos/api/v3-implementation branch from b7788b8 to fda8d8f Compare March 12, 2019 10:59
Copy link
Contributor

@davidfischer davidfischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only tested the project endpoint so far. Here's my feedback:

  • I am on a bit of a performance kick but I found a number of performance issues related to object relationships.

  • I had to install coreapi. Otherwise I got the error:

    AssertionError: `coreapi` must be installed for schema support.
    

else:
slug_url = self.slug

if self.project.repo_type in ('git', 'gitlab'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what @stsewd is saying is that gitlab is not even one of the REPO_CHOICES.

In [2]: Project.objects.filter(repo_type='gitlab').count()
Out[2]: 0

This should probably be:

if self.project.repo_type == REPO_TYPE_GIT:

if self.project.repo_type in ('git', 'gitlab'):
url = f'/tree/{slug_url}/'

if self.project.repo_type == 'bitbucket':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bitbucket is also not a valid repo_type. Do you mean 'bitbucket' in repo?

@@ -5,6 +5,7 @@
absolute_import, division, print_function, unicode_literals)

import os
import datetime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this needed for?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are no't using this

repository = RepositorySerializer(source='*')
privacy_level = PrivacyLevelSerializer(source='*')
urls = ProjectURLsSerializer(source='*')
subproject_of = serializers.SerializerMethodField()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and translation_of probably need to be prefetched in the queryset. I'm seeing a lot of duplicate queries to projects_projectrelationship.

subproject_of = serializers.SerializerMethodField()
translation_of = serializers.SerializerMethodField()
default_branch = serializers.CharField(source='get_default_branch')
tags = serializers.StringRelatedField(many=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be prefetched in the queryset.

'privacy_level',
'subproject_of',
'translation_of',
'urls',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field (via our resolver - see #3712) results in two extra queries per project returned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this urls field is the usage of Project.get_docs_url method. I need to check how to improve this.

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running locally I get:

  File "/Users/eric/projects/readthedocs.org/readthedocs/v3/urls.py", line 66, in <module>
    public=True),
  File "/Users/eric/.virtualenvs/readthedocs3/lib/python3.6/site-packages/rest_framework/documentation.py", line 68, in include_docs_urls
    permission_classes=permission_classes,
  File "/Users/eric/.virtualenvs/readthedocs3/lib/python3.6/site-packages/rest_framework/documentation.py", line 29, in get_docs_view
    permission_classes=permission_classes,
  File "/Users/eric/.virtualenvs/readthedocs3/lib/python3.6/site-packages/rest_framework/schemas/__init__.py", line 41, in get_schema_view
    urlconf=urlconf, patterns=patterns,
  File "/Users/eric/.virtualenvs/readthedocs3/lib/python3.6/site-packages/rest_framework/schemas/generators.py", line 257, in __init__
    assert coreapi, '`coreapi` must be installed for schema support.'
AssertionError: `coreapi` must be installed for schema support.

A pip install coreapi fixed it.

This is 💯 better than our current API. Super great work!! I read through most of the code and played around with it locally across all the documented endpoints. Feel free to ignore my comments on the REST Browsable API because I think we need to turn that off in production; so we need another solution for "browsing" it for our users.

I think there's some polish we need to do around field ordering and QA, but I think I'd be 👍 to get this merged and deployed behind a feature flag for internal testing.

if self.slug == STABLE:
stable = determine_stable_version(self.project.versions.all())
if stable:
return stable.slug
Copy link
Member

@ericholscher ericholscher Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only return anything on the stable version, that seems confusing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would it return in case it's not stable?

This method return "where the current version points to" and it only makes sense for stable which will point to a specific tag/version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mean we should call the function stable_ref or something -- it doesn't get the ref for any Version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we could wait till having aliases, also identifier has the commit of the tag, maybe we can filter for that instead of calling determinate_stable_version

return stable.slug

@property
def vcs_url(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised we don't already have this logic somewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do have it, I didn't find it :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -258,6 +290,8 @@ def get_subdomain_url(self):
def get_downloads(self, pretty=False):
project = self.project
data = {}
# TODO: refactor the ``pretty`` argument. It seems to just return nicer
# keys, but the logic looks the same
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Believe @stsewd already did this in another PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I need to rebase my branch.

readthedocs/templates/rest_framework/api.html Show resolved Hide resolved
@@ -77,9 +77,10 @@
# Keep the `doc_search` at root level, so the test does not fail for other API
url(r'^api/v2/docsearch/$', PageSearchAPIView.as_view(), name='doc_search'),
url(
r'^api-auth/',
r'^api/auth/',
Copy link
Member

@ericholscher ericholscher Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this backwards compat? Do we care?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not.

I think I left this change by mistake since it's not really necessary but I was testing another thing and I wanted to keep the pattern.

Although, I think we don't care since this is not really used as far as I know.


def get_queryset(self):
queryset = super().get_queryset()
return queryset.filter(users=self.request.user)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an abstraction here that will work better between the .org & .com?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I need to use .public/.api method form the manager here probably so the migration to .com is easier and just in one place.


class ProjectURLsSerializer(serializers.Serializer):
documentation = serializers.CharField(source='get_docs_url')
project = serializers.SerializerMethodField()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks at this in the response was confusing:

urls: {
documentation: "http://time-test.dev.readthedocs.io:8000/en/latest/",
project: null
}

Perhaps it should be project_homepage or something? Otherwise I might confuse it with the URL of the RTD page for the project (which isn't in the response, afaict)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm... good point. I didn't add "human" URLs for Project, Version and Build under Read the Docs. I may worth to add them all in each response.


class VersionFilter(filters.FilterSet):
verbose_name__contains = filters.CharFilter(
field_name='versbose_name',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

Suggested change
field_name='versbose_name',
field_name='verbose_name',



class BuildFilter(filters.FilterSet):
running = filters.BooleanFilter(method='get_running')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is broken in testing "invalid_name":

Suggested change
running = filters.BooleanFilter(method='get_running')
running = filters.BooleanFilter(method='get_running', label="Running")

if superproject:
data = self.get_serializer(superproject).data
return Response(data)
return Response(status=404)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I don't have a superproject I get a 404, when I don't have a subproject I get a 200 with an empty list. Should fix one approach and stick to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The response is different because the relations are different: a project can have 0 to N subprojects (which returns a list view) but None or 1 superproject (which returns a detailed view).

return queryset

# give access to the user if it's maintainer of the project
if self.request.user in self.model.objects.filter(**self.get_parents_query_dict()):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be either a queryset method or a mixin. I think as a rule we should have no custom querysets unless we have a really good reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored all the auth/permission checks into a Mixin and a Permission class. Please, let me know what you think about the new updates.

@humitos humitos force-pushed the humitos/api/v3-implementation branch from 215c1c0 to edf64a8 Compare April 29, 2019 12:18
@humitos humitos requested a review from a team April 29, 2019 12:22
@humitos humitos removed the PR: work in progress Pull request is not ready for full review label Apr 29, 2019
@humitos humitos force-pushed the humitos/api/v3-implementation branch from edf64a8 to 7eeeb7d Compare April 29, 2019 13:49
@humitos humitos removed the Needed: tests Tests are required label Apr 29, 2019
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great on my side. I don't fully understand it, but the approach looks good.

Mixin to define queryset permissions for ViewSet only in one place.

All APIv3 ViewSet should inherit this mixin, unless specific permissions
required. In that case, an specific mixin for that case should be defined.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
required. In that case, an specific mixin for that case should be defined.
required. In that case, a specific mixin for that case should be defined.

"""

# Allow hitting ``/api/v3/projects/`` to list their own projects
if self.basename == 'projects' and self.action == 'list':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels weird to have this here as a special case, as noted in the comment. Don't understand the code enough to fully evaluate it tho.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This sort of a special case should be on the ProjectViewSet class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it the ProjectViewSet. Makes more sense there.

"""

def has_permission(self, request, view):
is_authenticated = super().has_permission(request, view)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? auth and permission are different to me, so we should be sure we about what it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as this class extends from IsAuthenticated calling the superclass will just check if the user is authenticated.

Looking more generally at this permission class as a whole I think it is too complicated and will yield future bugs. Instead of having a complex permission class that has some tight coupling with the classes it is used on (this class checks view.basename), I would rather see ProjectViewSet.get_queryset limit the queryset that a user can query.

I see that @ericholscher suggested previously that having a custom get_queryset is something we should avoid. I can't comment there directly since that commit was removed in a force push. I don't agree with that statement and fundamentally this looks like a case where the queryset does need to change. The queryset should return the user's projects on a listview and all projects on a details view.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the chunk of specific code for ProjectViewSet and this class now basically does what David mentioned here:

  1. returns user's projects if admin (detail_objects method)
  2. returns all projects the user can see (listing_objects method)

We may want to keep talking about this in another channel. I don't want to block the PR on this piece of auth since we may need a bigger refactor for that anyways.

if self.slug == STABLE:
stable = determine_stable_version(self.project.versions.all())
if stable:
return stable.slug
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mean we should call the function stable_ref or something -- it doesn't get the ref for any Version.

@stsewd
Copy link
Member

stsewd commented Apr 30, 2019

The common submodule shouldn't be updated here (#5517)

I have some questions:

json string ref: tag or branch pointed by this version (available only when version is stable or latest)

Guessing we are going to use this for version's alias

Some bugs I found

  • http://localhost:8000/api/v3/projects//master/ give 500 (it returns 2 objects in a .get query) (latest and master have the same verbose_name, we should look up for slug).

  • Docs should be updated with the current implementation

https://github.com/rtfd/readthedocs.org/blob/humitos/api/v3-implementation/docs/api/v3.rst#L426-L426

  • Looks like builds/latest isn't implemented
  • builds/id/commands isn't implemented
  • users/ isn't implemented

docs/api/v3.rst Outdated
@@ -344,7 +347,7 @@ Build details

.. sourcecode:: bash

$ curl https://readthedocs.org/api/v3/projects/pip/builds/8592686/?include_config=true
$ curl https://readthedocs.org/api/v3/projects/pip/builds/8592686/?expand=config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query description needs to be updated too

Copy link
Member

@stsewd stsewd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the structure of api v3 is fine, I left some questions and maybe bugs p:



class ProjectFilter(filters.FilterSet):
name__contains = filters.CharFilter(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this some kind of standard way in rest framework? Feels weird having __ them in the query params (it also can be confused with _, and be ignored without any error)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the standard from django filters and it follow the idea of Django ORM.

slug = value
break

return get_object_or_404(model, slug=slug)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bug here, slug can be ended up undefined. It's named as project_slug above

readthedocs/api/v3/mixins.py Show resolved Hide resolved
@@ -0,0 +1,58 @@
import json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file name shouldn't be renderers? p:

readthedocs/api/v3/routers.py Outdated Show resolved Hide resolved
fields = [
'username',
'date_joined',
'last_login',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't we leaking some info with last_login?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm... Not sure if this is really important.

In theory, you can see these info from people you share a project with. We already remove first and last name just in case, but those are bigger ones in case of leak.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we care about last_login, we should remove date_joined as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub does show the date_joinded field, last login feels like a more private thing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +1 on removing last login.

if self.slug == STABLE:
stable = determine_stable_version(self.project.versions.all())
if stable:
return stable.slug
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we could wait till having aliases, also identifier has the commit of the tag, maybe we can filter for that instead of calling determinate_stable_version

return stable.slug

@property
def vcs_url(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -5,6 +5,7 @@
absolute_import, division, print_function, unicode_literals)

import os
import datetime
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are no't using this

@humitos humitos force-pushed the humitos/api/v3-implementation branch from d4e5f43 to 5ebf37d Compare May 1, 2019 12:05
@humitos
Copy link
Member Author

humitos commented May 1, 2019

json string ref: tag or branch pointed by this version (available only when version is stable or latest)

Guessing we are going to use this for version's alias

My idea here is that if you get stable, you are able to know what version it is (2.0, or 1.4 for example). Otherwise, there is no way to know it.

Alias are different than this and we should add that value in the response when we implement them.

Hrm, good point. I think that it's better to remove them now and then add them if we need them --otherwise it will be a breaking change.

I suppose that we will need privacy_level for the corporate site, though. But we can add it when we get there.

http://localhost:8000/api/v3/projects/master/ give 500 (it returns 2 objects in a .get query) (latest and master have the same verbose_name, we should look up for slug).

I'm not sure to understand this. That URL will lookup for a project with slug master. If we want to check the version master of a project, this is the URL: /api/v3/projects/test-builds/versions/master/ and it works to me.

Copy link
Contributor

@davidfischer davidfischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of shipping this or something close to it as a first version so we can start to actually use it internally in production and iterate on it. I do have some comments but none of them should block shipping.

I would like to better understand the thought process behind the current expandable fields. I can see a potential use case for expandable fields if the data was very verbose or expensive to calculate but in some of the cases in this PR that is not the case. Furthermore, the data seems pretty useful (the last build on a version, the active versions of a project). Is there instead a way to display say a short listing by default (eg. just the slugs of the active versions) and then optionally expand to something more verbose?

There are still some performance issues with this but I think we can track them down after the internal rollout. For example, querying http://localhost:8000/api/v3/projects/?expand=active_versions.last_build resulted in 150 DB queries.

On a side note, moving restapi -> api/v2 in this PR made it quite a bit harder to review.

@@ -475,8 +475,13 @@ def USE_PROMOS(self): # noqa
REST_FRAMEWORK = {
'DEFAULT_FILTER_BACKENDS': ('django_filters.rest_framework.DjangoFilterBackend',),
'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.LimitOffsetPagination', # NOQA
'DEFAULT_THROTTLE_RATES': {
'anon': '5/minute',
'user': '10/minute',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may need to monitor this pretty closely as I suspect we might hit this limit. For the initial rollout, this is fine (removing throttling altogether might be better though) but I think 10/m is very low for a logged in user especially if we need to use the API to do anything.

Copy link
Member Author

@humitos humitos May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I just wanted to have the settings there to decide the best values all together.

I'm going to use the ones listed in the DRF docs as example which seems reasonable: 60/minute for now. We can tune them later if we want to be less permissive.


API is browseable by sending the header ``Authorization: Token <token>`` on each request.

Full documentation at [https://docs.readthedocs.io/en/latest/api/v3.html](https://docs.readthedocs.io/en/latest/api/v3.html).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Full documentation at [https://docs.readthedocs.io/en/latest/api/v3.html](https://docs.readthedocs.io/en/latest/api/v3.html).
Full documentation at [https://docs.readthedocs.io/page/api/v3.html](https://docs.readthedocs.io/page/api/v3.html).

"""

def has_permission(self, request, view):
is_authenticated = super().has_permission(request, view)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as this class extends from IsAuthenticated calling the superclass will just check if the user is authenticated.

Looking more generally at this permission class as a whole I think it is too complicated and will yield future bugs. Instead of having a complex permission class that has some tight coupling with the classes it is used on (this class checks view.basename), I would rather see ProjectViewSet.get_queryset limit the queryset that a user can query.

I see that @ericholscher suggested previously that having a custom get_queryset is something we should avoid. I can't comment there directly since that commit was removed in a force push. I don't agree with that statement and fundamentally this looks like a case where the queryset does need to change. The queryset should return the user's projects on a listview and all projects on a details view.

# NOTE: we don't override the manager in User model, so we don't have
# ``.api`` method there
if self.model is not User:
queryset = queryset.api(user=user)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I'm the only one but I think that the magic .api() method on our managers is a bit of an anti-pattern. It isn't entirely clear to me what those methods do and I feel a number of bugs have been introduced due to confusion in what those methods do (in addition to .public() and .private()). Perhaps this is unavoidable for now until the codebases for .org and .com are merged.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked a little about this at the offsite. All of our code is using this and changing/removing it will require a bigger refactor.

I wrote a document explaining a little what each of these method do when I started touching auth in this API branch to match what we discussed: #5624

Although, I think this is outside the scope of this PR. We will need to talk more and agree on a pattern to follow, though.

)
slug__contains = filters.CharFilter(
field_name='slug',
lookup_expr='contains',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the queryset, contains on versions may not have great performance. This shouldn't stop the rollout but it will be something to look at in prod.

# Using only ``TokenAuthentication`` for now, so we can give access to
# specific carefully selected users only
authentication_classes = (TokenAuthentication,)
permission_classes = (PublicDetailPrivateListing,)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This permission class is too specific to projects to be in this generic mixin which is for all viewsets. I think we should have something more generic like IsAuthenticated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, all of our endpoints are registered under /projects so we can keep it like this for now and extend/move when need it.

'privacy_level',
'programming_language',
'repository_type',
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure of the use case behind adding so many different filterable elements. I'm also not sure what the use case is behind the contains filters. These are a lot of filters on fields that are not indexed in the DB. We had to remove a number of filters in API v2 due to performance issues. This may not be a problem in v3 because they will only filter a single user's much smaller set of projects but at that point these filters aren't very useful.

modified = serializers.DateTimeField(source='modified_date')

expandable_fields = dict(
users=(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maintainers of a project seems like pretty core information about the project. I'm not sure it should be an expandable field. I think it should always be expanded.

  • All the users for all projects can be prefetched in one additional query. It won't be a performance issue.
  • The project/<slug>/users endpoint is then pretty useless. The user serializer doesn't really have any useful info except the username. I say we get rid of the serializer and the endpoint.
  • We may in the future have a user endpoint (not under the project) and if we need to share additional details about a user (eg. all the projects they are a maintainer of) we can do that there and the user can query about a user given just the username.


class ProjectsViewSet(APIv3Settings, APIAuthMixin, NestedViewSetMixin,
FlexFieldsMixin, ListModelMixin, RetrieveModelMixin,
GenericViewSet):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than mixing in ListModelMixin, RetrieveModelMixin and GenericViewSet couldn't we instead mixin ReadOnlyModelViewSet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

The only reason is that I didn't know that it existed 😉 . Upgrading...

)


class APIAuthMixin(NestedParentObjectMixin):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class could use a better name. It isn't authentication per se, it is a class for getting the correct project queryset given a user and a request. How about ProjectQuerysetMixin?

'error',
'commit',
'builder',
'cold_storage',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are leaking builder and cold_storage to common users, we only show this to admin users in v2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that cold_storage could be removed from here, considering that we are going to have everything in cold storage in the near future. So, this won't be useful anymore.

builder is not a problem, but you are right that this should probably be only for admins. I will remove it for now, and we can consider re-add it later if we want to expose different things to different users. Also, if we are going to use lambda functions or similar, the builder won't make sense anymore.

'ref',
'built',
'active',
'uploaded',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we don't want show this, we don't use that feature anymore

subproject_of = serializers.SerializerMethodField()
translation_of = serializers.SerializerMethodField()
default_branch = serializers.CharField(source='get_default_branch')
tags = serializers.StringRelatedField(many=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember if we wanted to remove this from versions and projects or only versions. I think tags aren't used anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to remove them from versions only, as far as I remember.

@stsewd
Copy link
Member

stsewd commented May 2, 2019

Also, I feel like links should be private (_links). It could be misinterpreted or collide with another field. For example in versions, downloads vs links. Or in projects urls vs links

@humitos
Copy link
Member Author

humitos commented May 6, 2019

OK! I think that I took a look at all the comments/suggestions and fix them all or replied. I think we can merge it once the tests pass.

Then, we can keep talking about more specific details and make the adjustments we need. We are not going live immediately for the users, so we have some time to do "breaking changes" before making it public.

@humitos humitos force-pushed the humitos/api/v3-implementation branch from 1c0d0c0 to 7985539 Compare May 7, 2019 08:51
@humitos humitos force-pushed the humitos/api/v3-implementation branch from 7985539 to 8ebfb1c Compare May 7, 2019 08:52
@stsewd
Copy link
Member

stsewd commented May 7, 2019

This test is still not passing

    def test_detail_unique_version(self):
        second_project = fixture.get(
            Project,
            users=[self.me]
        )
        second_version = fixture.get(
            Version,
            slug=self.version.slug,
            verbose_name=self.version.verbose_name,
            identifier='a1b2c3',
            project=second_project,
            active=True,
            built=True,
            type='tag',
        )
        self.client.credentials(HTTP_AUTHORIZATION=f'Token {self.token.key}')
        response = self.client.get(
            reverse(
                'projects-versions-detail',
                kwargs={
                    'parent_lookup_project__slug': self.project.slug,
                    'version_slug': self. version.slug,
                }),

        )
        self.assertEqual(response.status_code, 200)
        self.assertDictEqual(
            response.json(),
            self._get_response_dict('projects-versions-detail'),
        )

When there are two projects or more with the same slug for its versions

NestedViewSetMixin has to be on the left of ProjectQuerySetMixin to
filter nested results properly.
@humitos
Copy link
Member Author

humitos commented May 8, 2019

When there are two projects or more with the same slug for its versions

Thanks @stsewd for mentioning this. I added your test to the suite and I think we are ready to merge this now if you agree.

@humitos humitos merged commit 7f79499 into master May 9, 2019
@delete-merged-branch delete-merged-branch bot deleted the humitos/api/v3-implementation branch May 9, 2019 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants