Server Layered Profile Blueprint #6095

rimashah25 · 2021-08-10T18:35:25Z

This PR is not related to any issue. This is a blueprint to propose layered profiles a simple way to assign profiles to servers.

Which Traffic Control components are affected by this PR?

Documentation

What is the best way to verify this PR?

Go through the blueprint and make sure it looks correct.

PR submission checklist

This PR has tests
This PR has documentation
This PR has a CHANGELOG.md entry
This PR DOES NOT FIX A SERIOUS SECURITY VULNERABILITY (see the Apache Software Foundation's security guidelines for details)

dneuman64 · 2021-08-10T19:33:15Z

LGTM, I assume that if users still want to use a single monolith profile that will still be supported? This is important for migration purposes.

ocket8888

Will this affect Origins? Or will those continue to use unlayered Profiles?

In CDN Snapshots and Monitoring config, servers (including Traffic Routers and Traffic Monitors as well as cache servers) have associated Profiles that map to an entry containing Parameters - how will these be affected by layering?

Should the API disallow multiple uses of the same Profile within the list for a single server or Delivery Service (or potentially Origin)?

blueprints/layered-profile.md

ocket8888 · 2021-08-12T20:42:50Z

One other thing I just thought of: how is backwards compatibility handled for objects with layered Profiles? If I have a server with:

{
    "id": 7,
    "profiles": ["foo", "bar"]
}

visible in APIv5, then if I request GET /api/4.0/servers?id=7 do I see

{
    "id": 7,
    "profile": "foo"
}

or

{
    "id": 7,
    "profile": "bar"
}

? I assume it'd never choose a middle Profile and would always choose consistently, but should we use the "bottom" Profile foo because as the base Profile it's assumed to contain the bulk of the configuration and changed relatively the least? Or should it choose the "top" Profile bar because that's the one that is never overridden, and all of its Parameters definitely apply?

If I update a server changing its Profile through API v4 or earlier when it has been configured with multiple Profiles, does that change only the entry that would be chosen for display in a GET response, or does it replace all Profiles with the single one specified? That is, if I have

{
    "id": 7,
    "profiles": ["foo", "bar"]
}

and do a PUT /api/4.0/servers?id=7 with

{
    "id": 7,
    "profile": "testquest"
}

in the body and later do a GET /api/5.0/servers?id=7, do I see

{
    "id": 7,
    "profiles": ["foo", "testquest"]
}

or

{
    "id": 7,
    "profiles": ["testquest", "bar"]
}

or

{
    "id": 7,
    "profiles": ["testquest"]
}

?

rob05c · 2021-08-12T20:54:35Z

One other thing I just thought of: how is backwards compatibility handled for objects with layered Profiles? If I have a server with:

I put a lot of thought into this in my initial proposal. I came up with some options, but they're really hacky.

Honestly, I think the best solution is probably to return a 4xx code from previous API versions for /servers and /deliveryservices if a user has assigned more than 1 Profile. And to warn users when they do, that this will break old clients. Which is mostly ok in practice, it mostly means people just have to upgrade all ATC components and clients before they start using Layered Profiles.

I don't suggest that lightly. IMO breaking backward compatibility is one of the worst things we can do, in terms of usability, maintaining users, I could go on. But I honestly think the alternative is probably worse here.

See https://cwiki.apache.org/confluence/display/TC/Layered+Profiles for some of my compatibility ideas.

I assume it'd never choose a middle Profile and would always choose consistently, but should we use the "bottom" Profile

We can't return any single real profile. If someone assigned multiple Profiles, those Parameters, in that exact order, are necessary. Anything that looks at a single profile for params will be broken. If you look at that wiki link, most of my ideas are around creating a "fake" profile. But IDs and names make that hard, and you still can't edit it, so POST/PUT still return error codes. Unless you did even more hacks.

ocket8888 · 2021-08-12T21:49:31Z

What if we returned a null Profile when there are multiple? That's still breaking, but somewhat less so. It means that e.g. my script that does toget servers | jq '.response[]|.id still gets me all the server IDs instead of failing. In that case, I'd also vote that a PUT or POST from API versions 4 and earlier do what you'd expect - set the server's Profile to the single specified Profile.

That's harder for Delivery Services, where profile: null is already totally legal, which makes it much more of a lie.

rob05c · 2021-08-12T21:58:53Z

What if we returned a null Profile when there are multiple? That's still breaking, but somewhat less so

That makes clients work who don't need the profile/params, but my fear is it's even worse if they do. Especially DSes, which are allowed to not have profiles. But even Servers might not have error checking, that they were guaranteed not to need before.

I feel like forcibly breaking and making people upgrade is much safer here. If we don't, they can operate on that data, and do any number of wrong and broken things, potentially catastrophically breaking their CDN or deleting data. And they won't even know, and it could be a nightmare to track down.

Returning an HTTP error code makes it as easy and obvious as possible to fix, and doesn't potentially irrevocably destroy data, or create an impossible-to-debug scenario.

I'd also vote that a PUT or POST from API versions 4 and earlier do what you'd expect - set the server's Profile to the single specified Profile.

Same issue: if there were multiple, now they just wiped them out, destroying data. I feel like the valid use cases of legitimately wanting to replace multiple profiles with a single are extremely unlikely.

Again, if there are 0 or 1 assigned, letting API <4.0 set it to a different single value is totally fine. But if it has 2+ Profiles, it seems highly unlikely to be correct, and potentially catastrophic. I think the only safe thing to do is return an error code and disallow the entire request or modification.

ocket8888 · 2021-08-12T22:09:11Z

... if there were multiple, now they just wiped them out, destroying data.

sure, but that's what you'd expect to happen, same as if you did a PUT with "profiles": ["a single profile"].

It's just that these are probably two of the three most-used endpoints in the API. I'm not sure just a warning alert is enough when you're breaking probably nearly every single client of the API in existence already by the time you see it. That should definitely come with a confirmation dialog or something in TP, though I don't think there's an equivalent for raw API calls.

blueprints/layered-profile.md

rob05c · 2021-08-12T22:57:04Z

Should the API disallow multiple uses of the same Profile within the list for a single server or Delivery Service (or potentially Origin)?

Yes. Because it doesn't ever make sense. Since Parameters are applied one profile at a time, including the same profile twice will just override any of the same parameters in-between the two, and have the same effect as only having that profile in the later position.

blueprints/layered-profile.md

rawlinp · 2021-08-13T15:24:28Z

Catching up on the conversation about backwards-compatibility, it seems like using some kind of tool outside of TO to do the profile layering is sounding better and better. Or, since it seems like we never want to stop adding to the monolith, we could just update the profiles API to produce a "layered" profile. For example:

POST /api/4.1/profiles
{
    "name": "my-new-abc-def-profile",
    "type": "ATS_PROFILE",
    "description": "my new layered profile that is composed of the abc and def profiles",
    "cdn": "mycdn",
    "routingDisabled": false,
    "layeredProfiles": [
        "abc",
        "def"
    ]
}

{
    "response": {
        "name": "my-new-abc-def-profile",
        "type": "ATS_PROFILE",
        "id": 123,
        "description": "my new layered profile that is composed of the abc and def profiles",
        "cdn": "mycdn",
        "routingDisabled": false,
        "layeredProfiles": [
            "abc",
            "def",
        ]
        ...
    }
}

This new profile would then just be the same as any other profile except that you can't assign parameters directly to it, you can only assign parameters to its individual layers. So it only breaks backwards compatibility in terms of assigning parameters to profiles when they're layered, rather than breaking our bread and butter servers API. A server would still only have one profile.

rob05c · 2021-08-13T15:39:46Z

it seems like using some kind of tool outside of TO to do the profile layering is sounding better and better

-1, It fundamentally doesn't solve the problem. The goal here is to make operating ATC/TO/TP easier. Right now, our Profiles are an unmaintainable mess, they're so huge, and 95% duplicated, the Parameters are constantly wrong and nobody knows it. Doing some kind of outside "generation" doesn't solve the problem. If anything, it makes it worse, because now operators aren't allowed to actually use TO/TP to modify things, so operating becomes even harder. Or if they do, the generation is out of sync, making it even more dangerous and error-prone.

since it seems like we never want to stop adding to the monolith

I'm a pretty big proponent of getting rid of the monolith. But TO is our data store. The schema we have today is not operationally workable, and that's what Layered Profiles fixes. Generating what amounts to a denormalized blob and shoving that in the database is even more difficult and error-prone to operate.

rob05c · 2021-08-13T15:57:54Z

we could just update the profiles API to produce a "layered" profile. For example:

"layeredProfiles": [
   "abc",
   "def"

]

I'm not seeing how that solves the backward-compatibiliity issue? That would still break existing clients

So it only breaks backwards compatibility in terms of assigning parameters

So instead of breaking getting things with profiles, we break changing the profiles or assigning things to them? I'm not seeing how that's a win, they both seem about the same.

A server would still only have one profile.

Except that we redefined what a "profile" is, and it can now be a "profile" or a "profile of profiles."

Maybe it's just me, but that seems a lot harder to understand and work with. Breaking change aside, the idea that a server/ds has a list of Profiles applied in order seems pretty straightforward. Where the idea that a server/ds has a profile, that can be a single or a "profile of profiles" seems to add another layer of abstraction and indirection, seems like it fundamentally adds more to think about. Where "a list" isn't fundamentally another layer of abstraction.

rawlinp · 2021-08-13T16:53:53Z

So instead of breaking getting things with profiles, we break changing the profiles or assigning things to them? I'm not seeing how that's a win, they both seem about the same.

It's a win because we wouldn't be breaking one of our most-used APIs entirely for users of prior API versions. Instead, we're only breaking profile-param assignments, when the profile is layered, which you're not even supposed to do. So it "breaks" something that you're not supposed to do, rather than breaking something that you are supposed to do (updating a server).

Additionally, I would argue that placing the ordering of the profiles on the server object itself is bad, because, in general, all servers with the same characteristics are going to share the same order of layered profiles. My moving the order into the profile object itself, you don't have to specify the order on every single server you're using it for. Instead, you only have to get the order right once, and you can't accidentally change that ordering from the servers page in the UI.

As an operator, using the "profile list" idea, I still have to look at what profiles are in the list and figure out which profile to update. Using the "profile of profiles" idea, it could be presented exactly the same in the UI (as a list of profiles), so there wouldn't be any extra steps or pages to click through. You have to design your "profile of profiles", which would be the same amount of work as defining the order of the "profile list", except you don't have to worry about the ordering when assigning it to servers.

rob05c · 2021-08-13T17:07:48Z

It's a win because we wouldn't be breaking one of our most-used APIs

I'm not sure "most-used" is true. In terms of operation, basically everything non-trivial needs Profiles, Parameters, DeliveryServices, and Servers. For operational scripts that are trivial, I'm not sure there are more little scripts that just need the DS or Server, than there are little scripts that just need Profiles or Parameters.

Additionally, I would argue that placing the ordering of the profiles on the server object itself is bad, because, in general, all servers with the same characteristics are going to share the same order of layered profiles. My moving the order into the profile object itself, you don't have to specify the order on every single server you're using it for. Instead, you only have to get the order right once, and you can't accidentally change that ordering from the servers page in the UI.

That seems like a disadvantage, not an advantage. One of the primary use cases Layered Profiles solves is the 95% identical Profiles, that just have a few small changes. Layered Profiles allows e.g. ["ATS9", "DATACENTER-FOO", "my-one-off-hack", "CDN-X"']. If every combination of that requires a profile-of-profiles, now we're back to the combinatorial explosion of having to create and assign unique profiles for every 95%-identical thing with a few small changes. Sure, we "isolated" the Parameters on my-one-off-hack which is kind of nice; but you still have to create, assign, and manage the Profiles ["ATS8", "DATACENTER-FOO", "my-one-off-hack", "CDN-X"'] and ["ATS9", "DATACENTER-BAR", "my-one-off-hack", "CDN-X"'] and ["ATS8", "DATACENTER-BAR", "my-one-off-hack", "CDN-X"'] and ["ATS9", "DATACENTER-FOO", "my-one-off-hack", "CDN-Y"'] and ["ATS8", "DATACENTER-FOO", "my-one-off-hack", "CDN-Y"'] and ["ATS8", "DATACENTER-BAR", "my-one-off-hack", "CDN-Y"'] and ["ATS8", "DATACENTER-BAR", "my-one-off-hack", "CDN-Y"'].

rawlinp · 2021-08-13T17:54:29Z

If every combination of that requires a profile-of-profiles, now we're back to the combinatorial explosion of having to create and assign unique profiles for every 95%-identical thing with a few small changes.

That might be the case, but the combinatorial explosion doesn't also include parameters in the mess, which is the real issue here, since only the "legacy" profiles have parameters. You're still only editing the parameters of "legacy" profiles. Plus, since the ordering is important, it makes sense to not have the possibility of different orderings on every server in the same "set" by having the ordering be defined only once in the profile object.

rawlinp · 2021-08-13T17:58:29Z

Defining the order in the profile object rather than the server also reduces the chances of making a mistake when creating a new server. Instead of having to replicate the list exactly on the new server to match another server in the same set, you just have to assign it the same profile. Less button clicks and no chance of messing up the order seems like a win for me. Plus, we've been using example profile lists of just a few profiles each. The chance of making a mistake in the ordering increases with the size of the list.

ocket8888 · 2021-08-14T02:21:05Z

I'm not really a big fan of those two approaches either. The second one where you redefine what a Profile is somewhat more promising, but I share Rob's reservations about it. It'd be nice if there was some way to just deprecate the old endpoints and introduce entirely new ones, so that they could operate totally independently, but that's not only a lot more work but difficult (for me, anyway) to decide how to design.

ocket8888 · 2021-08-14T02:33:58Z

On other hand, I'm with Rawlin on Servers and Delivery Services being our two most requested objects/endpoints. Maybe not in terms of "things that use the API" but almost certainly in terms of "requests that hit the API". It's also very troubling that if even one server uses a layered Profile GET /servers would fail completely. Could the API instead feign ignorance of things with layered Profiles in lower API versions than the one that supports it? like if you do GET /api/4.0/servers it returns a list of all servers that don't have layered profiles, and maybe a warning alert that tells you some were omitted (maybe even which, by hostname or ID) and why. PUTs can still fail for layered-profile servers, POSTs can tell the truth if you try to create one that doesn't appear in the results of a GET - that the server actually exists but you're not seeing it because your API version isn't high enough to support its structure.

blueprints/layered-profile.md

mitchell852 · 2021-08-19T18:08:58Z

how about this? if you try to update a server that has a layered profile (1+ profiles) with a non-5.x endpoint, ignore the profile value on PUT (don't make the profile change) and return alert.type=warning and alert.message='server updated but server has a layered profile and changes to a layered profile must be made using api 5.x'

breaking prior versions seems like a non-starter.

blueprints/layered-profile.md

…int/layered_profile

blueprints/layered-profile-server.md

rob05c

LGTM

blueprints/layered-profile-server.md

…int/layered_profile

blueprints/layered-profile-server.md

…int/layered_profile

rimashah25 added 2 commits August 10, 2021 12:30

blueprint for layered profile

769ea79

capitalization

03d6a95

dneuman64 added blueprint feature requirements / implementation details cache-config Cache config generation Traffic Ops related to Traffic Ops Traffic Portal v1 related to Traffic Portal version 1 labels Aug 10, 2021

ocket8888 reviewed Aug 12, 2021

View reviewed changes

rawlinp reviewed Aug 12, 2021

View reviewed changes