
Prebid Caching #663

Closed · bretg opened this issue Aug 31, 2018 · 18 comments

@bretg (Contributor) commented Aug 31, 2018

This is a proposed set of additions around server-side caching that affects Prebid Server, Prebid Cache, Prebid.js, and Prebid SDK.

Background

Several Prebid use cases require that ad response creatives be cached server-side and subsequently retrieved upon Prebid’s bidWon event. Client-side caching is effective for the standard use case of client-side bidAdapters competing for web display ads through Prebid.js. Other integration types such as Prebid for Mobile, Prebid Video, and Prebid for AMP either cannot use client-side caching, or pay an undesirable performance penalty to do so.

Prebid's cache offering is the Prebid Cache server, which works either on its own or in conjunction with Prebid Server to implement some of these caching use cases.

Use Cases

Scenarios supported by this set of requirements:

  1. As a web publisher, I want to be able to use Prebid.js to serve video ads using a mix of bidders that support server-side and client-side caching of VAST XML. I want to be able to define the TTL when stored from the client so certain adunits (e.g. longer videos) may have custom TTL values.
  2. As a web publisher, I want to be able to use Prebid.js to serve video ads via Prebid Server, with the ability to define caching behavior.
  3. As an app developer, I want to be able to use Prebid SDK and Prebid Server to implement header bidding and minimize network traffic by utilizing server-side caching. I don't want the creative body to be returned in the result, in order to save my users' network bandwidth and improve my application's performance.
  4. As an operator of a prebid server cluster, I want to be able to host multiple independent datacenters in a region to provide fault tolerance.

New Requirements

These are features not currently supported by the Prebid caching infrastructure.

  1. The system should allow the publisher to define what gets cached in each supported scenario: either the whole bid or just the creative body.
  2. The system should allow the publisher's request to define whether the creative body (adm) should be returned even when cached. The default should be 'yes', because that's the current Prebid behavior.
  3. A full URL to the cached asset should be returned in each bid response.
    These attributes should be made available to renderers in all cache scenarios, including from the prebidServerBidAdapter.
  4. The page should be able to specify an ad cache time-to-live (TTL) for each AdUnit. This is because some adunits may require longer cache periods than others. E.g. one customer wants to have a video unit where the VAST XML is cached for an hour while the default is 5 mins.
    1. Max TTLs should be configurable for each cache host company.
  5. Separate system mediaType default TTLs must be specifiable by the Prebid Server host company for video and for mobile. The hard coded system default TTL should be 300 seconds (5 mins) for both.
  6. The caching system should allow each publisher to be able to define their own TTL values by mediaType that override the system defaults.
  7. Prebid Server should use TTLs in this priority order:
    1. Request-specified TTL (e.g. this particular adunit has a TTL of 90 mins) (subject to configured Max TTL)
    2. Publisher mediaType configured TTL (e.g. all video for this publisher has a TTL of 60 mins) (server config)
    3. Format configured TTL (e.g. video on this cluster generally has a TTL of 30 mins) (server config)
    4. Hardcoded system default TTL (e.g. 5 min overall default) (server config)
  8. Operational reporting: Prebid Cache should log failed cache-writes and failed cache-reads as metrics.
  9. The system should support writing to multiple Prebid Cache servers. This enables operational redundancy so the same cache ID can be read from a cluster that didn't necessarily host the auction request. It would be better to do this with a distributed cache system, but this option could be useful for Prebid Server host companies. (See the sketch after this list.)
    1. The HTTP return code should reflect the result of the primary (local) cluster write.
    2. Failures writing to a secondary cluster should be logged as a metric and to the local log file.
  10. Prebid Server should return an additional key-value pair when an item is cached: hb_cache_hostpath. This value should be configurable for each cluster. It could be used by the Prebid Universal Creative to parameterize the cache settings for better portability.
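
To illustrate requirement 9, here is a rough Java sketch of a dual-write path in which the caller's HTTP status mirrors the primary write and secondary failures are only logged/metered. Class names and endpoints are hypothetical; this is not the actual Prebid Cache code.

// Illustrative sketch only; not actual Prebid Cache code.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class DualCacheWriter {

    private final HttpClient http = HttpClient.newHttpClient();
    private final URI primary;    // the local datacenter's Prebid Cache endpoint
    private final URI secondary;  // a remote datacenter's Prebid Cache endpoint

    public DualCacheWriter(URI primary, URI secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    // Writes the payload to both clusters. The returned status code comes from the
    // primary write only (requirement 9.1); secondary failures are merely logged/metered (9.2).
    public int put(String jsonPayload) throws Exception {
        // Fire the secondary write asynchronously so it never delays the auction path.
        CompletableFuture<HttpResponse<String>> secondaryWrite =
                http.sendAsync(postJson(secondary, jsonPayload), HttpResponse.BodyHandlers.ofString());

        secondaryWrite.whenComplete((resp, err) -> {
            if (err != null || resp.statusCode() >= 400) {
                // Placeholder for a real metric/log call (e.g. a graphite counter).
                System.err.println("secondary cache write failed: " + (err != null ? err : resp.statusCode()));
            }
        });

        // The caller's HTTP return code mirrors the primary write.
        HttpResponse<String> primaryResp =
                http.send(postJson(primary, jsonPayload), HttpResponse.BodyHandlers.ofString());
        return primaryResp.statusCode();
    }

    private HttpRequest postJson(URI target, String body) {
        return HttpRequest.newBuilder(target)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}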

Security

Security requirements for caching:

  1. The system should attempt to prevent specific cache IDs from being written by unauthorized sources. The goal is to prevent an attack where malware is inserted into the cache on a valid key that might be retrieved by a user.
  2. The system should be able to detect suspicious cache write behavior, such as one client inserting a large number of entries. (See the sketch after this list.)
  3. All cache writes and retrievals should be done over HTTPS.
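
As a sketch of what (2) could look like, the cache could keep a per-client write counter over a short window and emit a metric (or reject further writes) when it overflows. The threshold, window, and client key below are illustrative assumptions, not part of this proposal.

import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheWriteRateMonitor {

    private static final int MAX_WRITES_PER_WINDOW = 1000;        // assumed threshold
    private static final Duration WINDOW = Duration.ofMinutes(1); // assumed window

    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    // Returns true if this client's write rate looks suspicious.
    public boolean recordWrite(String clientKey) {
        Window w = windows.compute(clientKey, (key, existing) -> {
            Instant now = Instant.now();
            // Start a fresh window if none exists or the old one has expired.
            if (existing == null || now.isAfter(existing.start.plus(WINDOW))) {
                return new Window(now);
            }
            return existing;
        });
        return w.count.incrementAndGet() > MAX_WRITES_PER_WINDOW;
    }

    private static final class Window {
        final Instant start;
        final AtomicInteger count = new AtomicInteger();
        Window(Instant start) { this.start = start; }
    }
}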

Proposed OpenRTB2 request and response

Request extensions:
{
…
  "imp": [{
      "exp": 3600,    // openRTB location for request TTL
      ...
  }],
  "ext": {
    "prebid": {
      "cache": {
        "vastXml": {
          returnCreative: false,   // new: don't return the VAST, just cache it
        },
        "bids": {
          returnCreative: true, 
        }
      }
    }
  }
… 
}

Response extensions:

{
…
  "seatbid": [{
    "bid": [{
      …
      "ext": {
        "bidder": {
          ...
        },
        "prebid": {
          "targeting": {
             …
             "hb_cache_hostpath": "prebid.adnxs.com/pbc/v1/cache"
             … 
          },
          "cache": {
             "vastXml": {
                 "url": "FULL_CACHE_URL_FOR_THIS_ITEM",
                 "cacheId": "1234567890A"
             },
             "bids": {
                 "url": "FULL_CACHE_URL_FOR_THIS_ITEM",
                 "cacheId": "1234567890B"
             },
           }
         }
       }
    }
  }],
… 
}
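
For context on how a renderer or the SDK might consume the proposed response extension (use case 3), here is a minimal sketch that reads ext.prebid.cache.vastXml.url from a bid and fetches the cached VAST rather than relying on bid.adm. It uses org.json and Java's built-in HttpClient; the class and method names are illustrative, not part of any Prebid library.

// Illustrative sketch only; not part of any Prebid SDK.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.json.JSONObject;

public class CachedVastFetcher {

    private final HttpClient http = HttpClient.newHttpClient();

    // Returns the cached VAST XML for a bid, or null if no cache URL is present.
    public String fetchVast(JSONObject bid) throws Exception {
        JSONObject ext = bid.optJSONObject("ext");
        JSONObject prebid = ext == null ? null : ext.optJSONObject("prebid");
        JSONObject cache = prebid == null ? null : prebid.optJSONObject("cache");
        JSONObject vastXml = cache == null ? null : cache.optJSONObject("vastXml");
        if (vastXml == null || !vastXml.has("url")) {
            return null; // fall back to bid.adm when the creative body was returned inline
        }
        HttpRequest request = HttpRequest.newBuilder(URI.create(vastXml.getString("url")))
                .GET()
                .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}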

Proposed Prebid.js Configuration

Prebid.js needs to be updated to allow the publisher to specify caching parameters. Suggested config:

pbjs.setConfig({
  "cache": {
    url: "https://prebid-server.pbs-host-company.com/cache",
    ttl: 300
  },
  "s2sConfig": {
    …
    "video": {           // new format selector
      "ext.prebid": {    // merged into the openRTB2 request
        "cache": {
          "vastXml": {
            returnCreative: false
          }
        }
      }
    }
    …
  }
});

Appendix - Changes to current systems

If all of the requirements above are implemented, these are the changes that would be required.

Prebid.js - better support for s2s video header bidding

  • prebidServerBidAdapter: s2sConfig 'video.ext.prebid' support
  • prebidServerBidAdapter: making response.ext.prebid.cache values available.
  • prebidServerBidAdapter: always add ext.prebid.targeting.includewinners: true for openrtb
  • support ttl cache parameter

Prebid Cache

  • Support secondary cache config (cross-datacenter replication)
    • Accept and process new query parameter: secondaryCache
  • Establish graphite metrics for errors
    • org.prebid.cache.handlers.PostCacheHandler.error_existing_id
    • org.prebid.cache.handlers.PostCacheHandler.remote_error_rate
    • org.prebid.cache.handlers.GetCacheHandler.system_error_rate

Prebid Server

  • Accept and process new request parameter: returnCreative
  • Add new cache params to response
  • Generate hb_cache_hostpath targeting variable

Prebid SDK - in a server-side caching scenario

  • Add returnCreative=false to openRtb
  • Add asyncCaching option to SDK, pass async option through openRtb when specified
  • Add ttl option to SDK, pass ttl option through openRtb when specified
  • Make cache.url in response available to app code

Prebid Universal Creative

  • Support hb_cache_hostpath

(Note: async caching feature split out into #687)

@bretg (Contributor Author) commented Sep 7, 2018

Updated response example

@bretg (Contributor Author) commented Sep 12, 2018

Made a number of updates after feedback from AppNexus team:

  • removing cacheHost from resp.bid.ext.prebid.cache
  • supporting req.imp.exp instead of a value in req.ext.prebid.cache
  • moving per-account async config to be server-side
  • removed IP address detection security mechanism
  • clarified that PBS-assigned UUIDs support only async writes, not the datacenter replication scenario
  • added requirement for hb_cache_hostpath targeting variable
  • updated PBC metrics to utilize the metrics already implemented and fit new ones into that structure

Going to discuss the "two-endpoint" architecture with the team tomorrow.

@bretg (Contributor Author) commented Sep 17, 2018

Got feedback from another internal review that the ttl parameter on the PBC query string is unnecessary -- it's already supported within the protocol packet. So the proposal is to update PBJS to take a ttl argument on the cache object in setConfig and add it appropriately to the cache request.

@dbemiller (Contributor)

Could you give more details on what you mean by "within the protocol packet?"

@bretg (Contributor Author) commented Sep 18, 2018

Followup on the "two-endpoint" architecture. We've confirmed that both Redis and Aerospike support a mode where a given key can't be overwritten, and that performance of this mode is good. There's a slight cost (~10%). The proposal is that we make this feature configurable so PBS host companies can make the tradeoff between security and performance. So we don't intend to split out the uuid-specification feature to a separate endpoint -- instead, added requirement 21:

  21. The cache server should also have a configuration which defines whether uuid is accepted as a parameter. The general idea is that a PBS cluster will run in one of two modes: either the caching layer prevents cache entries from being overwritten, or the cache won't accept UUIDs on the request, which disables the 'asynchronous cache' feature. (See the sketch below.)
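
For reference, the overwrite-protection mode maps naturally onto Redis's SET ... NX, which writes a key only if it does not already exist, so a request-supplied UUID cannot clobber an existing entry. A minimal sketch using the Jedis client; the key prefix and wiring are assumptions, not the Prebid Cache implementation.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class WriteOnceCache {

    private final Jedis jedis;

    public WriteOnceCache(Jedis jedis) {
        this.jedis = jedis;
    }

    // Stores the payload under the given uuid with a TTL, refusing to replace an
    // existing entry. Returns false if the key was already present.
    // The "prebid_cache:" key prefix is an illustrative assumption.
    public boolean putIfAbsent(String uuid, String payload, int ttlSeconds) {
        String result = jedis.set("prebid_cache:" + uuid, payload,
                SetParams.setParams().nx().ex(ttlSeconds));
        return "OK".equals(result); // Redis returns nil (null here) when NX blocks the write
    }
}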

@bretg (Contributor Author) commented Sep 18, 2018

Could you give more details on what you mean by "within the protocol packet?"

Apparently Prebid Cache Go and Prebid Cache Java have diverged more than I realized. The Java version supports an 'expiry' attribute on the POST, as well as a uuid key.

@dbemiller (Contributor) commented Sep 18, 2018

The proposal is that we make this feature configurable so PBS host companies can make the tradeoff between security and performance.

Imagine the experience of a publisher who wants to switch PBS host companies, or one who starts out running PBS themselves and decides to use a host company instead because it's more trouble than it's worth.

Imagine a publisher trying to read documentation to figure out how to use PBS, if the behavior depends on configs that they can't even see, or which a host company might change at any time without their notice.

This seems like a bad idea for everyone involved.

@bretg (Contributor Author) commented Sep 18, 2018

Here's the proposed story:

  • As a PBS host company, you can decide whether to support asynchronous caching or not.
  • As a publisher, you can decide which PBS host company you want to use, considering many factors, including whether you want the asynchronous caching feature.

This does not appear to be an unreasonable or unworkable situation.

Having a two-VIP architecture adds fairly significant complexity in setup and debugging, so it would only be utilized by PBS host companies that want to support asynchronous caching. So really it comes down to what sort of complexity is required to support asynchronous caching:

  1. two-vip solution
  • setup an internal-only VIP that responds to /cache-uuid
  • configure PBS to use /cache-uuid (per account)
  • no need for a caching layer that supports overwrite prevention
  2. configuration
  • configure PBS to use async caching (per account)
  • use a caching layer that supports overwrite prevention
  • configure PBC to support async caching (utilizing overwrite prevention)

Both cases require configuration, but #2 has fewer moving parts to break.

@bretg (Contributor Author) commented Sep 18, 2018

We do need to address the divergence between PBC-Go and PBC-Java. More on that in a separate conversation.

@dbemiller (Contributor)

Might be a good idea to break this proposal into smaller pieces. Many parts of it are good ideas no matter what... but there's a lot to discuss about this async one.

Our consensus over here was basically: "let's run an experiment." Config & publisher-facing options are great if there are legitimately good reasons to make different choices... but they're horrible if one way is just "better".

My intuition here is that async would just be better across the board... but intuition counts for much less than concrete math or experimental data.

If you're open to this, I can open a new issue for it and we can discuss in more detail.

@bretg (Contributor Author) commented Sep 18, 2018

Yes, we can leave the async caching feature aside for now.

Have split out the relevant requirements into a separate issue -- #687

@dbemiller (Contributor) commented Sep 20, 2018

Min and max TTLs should be configurable for each cache host company.

Max TTL config makes sense because host companies have finite hardware capacity... but what's the use-case for min TTLs?

The caching system should allow each publisher to be able to define their own TTL values by mediaType that override the system defaults.

The publisher already has per-AdUnit cache control in (4)... so this introduces a data redundancy in the request.

I see how this would be a convenient option for publishers... but it's worth noting that the Prebid Server API isn't really publisher-facing. Publishers use PBS through Prebid.js, and edit Stored Requests through a GUI.

Prebid Server should use TTLs in this priority order:

Asking for clarification: where do you see the configs the host company sets in this hierarchy?

Our opinion was that the "max TTL" config took precedence over everything, because only the host company knows what their hardware can support.

@hhhjort (Collaborator) commented Sep 20, 2018

Adding some support for reading exp from the imp and bid, and sending a ttl to prebid cache appropriately. Short term this will help optimize cache utilization. #684

@bretg (Contributor Author) commented Oct 1, 2018

what's the use-case for min TTLs?

It doesn't make sense to cache for less than a couple of seconds - it's an edge case, but the idea is to avoid read misses.

where do you see the configs the host company sets in this hierarchy?

Most of them are host company configs:

  • Request-specified TTL (e.g. this particular adunit has a TTL of 90 mins) (from request)
  • Publisher mediaType configured TTL (e.g. all video for this publisher has a TTL of 60 mins) (in server config)
  • Format configured TTL (e.g. video on this cluster generally has a TTL of 30 mins) (in server config)
  • Hardcoded system default TTL (e.g. 5 min overall default) (in config and code?)

The idea behind PBS account-level config is that overrides will be rare and can be supported as config by the PBS host company for important accounts.

Also - updated the cache response to be able to carry cache urls for both vastXml and bids. This accounts for the use case where both are requested.

@hhhjort (Collaborator) commented Oct 1, 2018

Since we have stored requests, I am not sure that publisher-level default TTLs are really needed. Stored requests provide even more granular control, with the downside that the TTL must be set per stored request rather than simply per media type. I am not against it per se, but would rather wait and see if there is demand before adding it preemptively.

There is also the issue of adding too many controls on the TTL. The more rules we have for setting the TTL, the more difficult it becomes to debug why the cache expires when it does. And of course the system needs to run through the entire logic tree to determine the actual TTL on every cache request, which consumes resources and adds latency.

For min TTL, I think it may be better to just let the ads fail to cache and have the issue caught quickly, rather than trying to second guess what the publisher meant. For example, let us say that we have a default TTL of 5 minutes, but the publisher realizes it can sometimes take a bit more than 5 minutes before the cache call is made. They want to bump it up to 10 minutes, but accidentally set it to 10 seconds instead. Now if we had a min TTL of 2 or 5 minutes, that TTL might still be enough to get the majority of the publisher's calls. But it could lead to a lot of confused debugging as they try to determine why the bump in TTL did not improve the cache performance, and perhaps degraded it. If however we let the 10 second TTL stand, they should recognize and catch the issue fairly quickly, and get the TTL they actually want in place much sooner.

@dbemiller (Contributor)

Most of them are host company configs

Yeah... sorry, I wasn't clear. I meant to ask about the Max TTL allowed by PBS host. You listed it as a requirement in (4), but it wasn't clear where it sat in the hierarchy of (7).

It seems to me like that should take the highest precedence, since otherwise it's a hardware liability for the PBS host.

@bretg (Contributor Author) commented Oct 11, 2018

  • Updated (7) to clarify that the incoming TTL is compared to the configured max. The other values are in the server config, so are under the control of the host company.
  • Remove min TTL

Here's the pseudo-code implemented by PBS-Java

if imp.exp is specified, use min(imp.exp, configuredMaxTTL)
else if ext.prebid.cache.*.ttlseconds is specified, use min(ttlseconds, configuredMaxTTL)
else if an account ID is available from request.app.publisher or request.site.publisher
       and a mediaType TTL is configured for that account, use the account value
else if a mediaType TTL is configured for the cluster, use the cluster value
else, finally, just use the default
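
The same resolution order as a small Java sketch; the config interface and names below are hypothetical stand-ins for the auction.cache.* properties listed below, not the actual PBS-Java code.

public final class CacheTtlResolver {

    private final CacheConfig config; // hypothetical holder for the server config values

    public CacheTtlResolver(CacheConfig config) {
        this.config = config;
    }

    public int resolveTtl(Integer impExp, Integer requestTtlSeconds,
                          String accountId, String mediaType) {
        if (impExp != null) {                       // imp.exp from the request
            return Math.min(impExp, config.maxTtlSeconds());
        }
        if (requestTtlSeconds != null) {            // ext.prebid.cache.*.ttlseconds
            return Math.min(requestTtlSeconds, config.maxTtlSeconds());
        }
        Integer accountTtl = accountId == null ? null
                : config.accountMediaTypeTtl(accountId, mediaType);
        if (accountTtl != null) {                   // per-account mediaType config
            return accountTtl;
        }
        Integer clusterTtl = config.mediaTypeTtl(mediaType);
        if (clusterTtl != null) {                   // cluster-wide mediaType config
            return clusterTtl;
        }
        return config.defaultTtlSeconds();          // hard-coded system default, e.g. 300s
    }

    // Hypothetical view over the configuration values.
    public interface CacheConfig {
        int maxTtlSeconds();
        Integer accountMediaTypeTtl(String accountId, String mediaType);
        Integer mediaTypeTtl(String mediaType);
        int defaultTtlSeconds();
    }
}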

Here are the server config values in the PBS-Java PR

  • auction.cache.expected-request-time-ms - approximate time in milliseconds spent interacting with the Cache Service. This time will be subtracted from the global timeout.
  • auction.cache.banner-ttl-seconds - how long (in seconds) a banner creative will be available in the Cache Service.
  • auction.cache.video-ttl-seconds - how long (in seconds) a video creative will be available in the Cache Service.
  • auction.cache.account.<ACCOUNT>.banner-ttl-seconds - how long (in seconds) a banner creative will be available in the Cache Service for a particular publisher account. Overrides auction.cache.banner-ttl-seconds.
  • auction.cache.account.<ACCOUNT>.video-ttl-seconds - how long (in seconds) a video creative will be available in the Cache Service for a particular publisher account. Overrides auction.cache.video-ttl-seconds.

It may be reasonable to place the account-level values in the Accounts DB table at some point, but for now we don't envision these values changing much, don't really want to encourage non-standard timeouts, and reading/caching/updating DB entries is harder than static config.

stale bot commented Aug 8, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Aug 8, 2019
stale bot closed this as completed Aug 15, 2019