get_timestamp should use the cache, not query the database #8

WD-stefaang · 2018-11-08T13:31:53Z

Opening and reloading a stage applies these steps with database calls:

resolve asset: check asset existence in db
fetch asset: download asset from db if NEEDS_CACHING
get timestamp: check asset existence again in db

Fetch asset always does an extra get_raw_timestamp check when the cache says that fetching is not necessary...
Because of this, I think that the get_timestamp call should just return the cached timestamp value without making yet another database query.

I think this line should be dropped: https://github.com/LumaPictures/usd-uri-resolver/blob/master/URIResolver/sql.cpp#L525

This reduces the database queries per unique asset from 3 to 2.

sirpalee · 2018-11-12T06:13:44Z

Hey! Querying the timestamp is meant to support updating the asset in case the time stamp has been updated. Using the timestamp from the cache does not give us this feature.

However, we can add an option that disables updating the timestamp from the database and always uses the cache.

sirpalee · 2018-11-12T06:14:00Z

@elrond79 @nrusch what do you think?

WD-stefaang · 2018-11-12T08:38:42Z

I think that get_timestamp is (ab)used to check if a reload of the assets will be required on the next reload... while most of the code in SDFLayer seems to use get_timestamp to keep track of the timestamp of the current layer.
The SDFLayer._reload call seems to prefer to make use of UpdateAssetInfo.

WD-stefaang · 2018-11-12T09:22:41Z

The timestamp recorded during the resolver.FetchToLocalResolvedPath is the correct asset timestamp to be reported in layer::_OpenLayerAndUnlockRegistry.
But for layer::_Reload you'd want the latest timestamp from the original asset, not from the localPath.

So if you won't need to reload assets in the usd runtime (usdcat, usdzip, ...), you can save a lot of database calls.

WD-stefaang · 2018-11-12T10:05:50Z

The confusion all starts with this dubious function

AssetResolver::GetModificationTimestamp(
    const std::string& path, 
    const std::string& resolvedPath)

It would be ideal if an extra 'fetch' boolean would be passed along to indicate that you want either the timestamp from the path (from database) or from the resolvedPath (local file).
It requires some changes in sdf stage... but then both the initial stage load and reloads would run optimally without extra configuration options such as 'NO_CACHED_TIMESTAMP'.
Something for AR2.0 @spiffmon?

WD-stefaang · 2018-11-12T14:51:22Z

I'm getting my expected result by using UpdateAssetInfo to fetch the latest date modified from the db when the cache indicates that the asset is already fetched. When the cache indicates 'NEEDS_CACHING', UpdateAssetInfo does nothing.

Stage.Open
- resolve: set entry in cache with NEEDS_CACHING
- update asset info: do nothing as asset is not cached yet
- fetch: fetch asset and save timestamp in cache
- timestamp: get timestamp from cache

Stage.Reload (no changes)
- resolve: uses cache
- update asset info: update info in cache
- timestamp: get timestamp from cache

Stage.Reload (with changes)
- resolve: uses cache
- update asset info: update info in cache, set NEEDS_CACHING
- timestamp: get timestamp from cache, is newer than layer.info
- fetch: fetch and save timestamp in cache

I ended up with an implementation where Resolve only determines the localPath (without doing any calls), UpdateAssetInfo does a HEAD request (similar to get_raw_timestamp) only if a local cache is present (to refresh the metadata) and Fetch does the heavy lifting with less pre-conditions.

On top of that, I'm adding the "IfModifiedSince" header in the Fetch call, catching the 304 Not Modified error.
In MySQL, a similar feature could be added to the query, allowing the server to decide that no data needs to be send.

So ideally, the URI resolver does a single DB call max for each time the scene is loaded. Only when the server doesn't support the 'IfModifiedSince' header (even not via a query), you may need up to 2 DB calls: one to resolve/update and one to fetch.
An option should exist to prevent checking/fetching if a local cached copy exists.

sirpalee · 2018-11-12T22:04:20Z

Thanks for the detailed information, I'll dig through the details. If you have it working on your side, a PR would be useful.

WD-stefaang · 2018-11-13T10:13:56Z

I've been working in a separate repo: https://github.com/westerndigitalcorporation/usd-s3-resolver
Primary focus was a performant S3 resolver, I bothered a bit less about 'plugability' with the URI resolver at this stage... I kept the history of this project though, so it shouldn't be too hard to merge them.

I need to add a few more performance tests and sort out the fact that reloads are slower than initial loads on S3. Once that is done, I could apply the same logic to the MySQL resolver. It looks like stage.reload is single threaded while the initial load is multithreaded... more on that later.

Unfortunately I have to move on to my next project so I can't make promises on the timeline here.

WD-stefaang · 2018-11-14T14:49:31Z

resolver.RefreshContext(context) seems the ideal place to reset the caches. You're even able to flush only a subset of the cache.
You may need to expose the python bindings here.

WD-stefaang mentioned this issue Nov 15, 2018

Reloading a stage happens single threaded PixarAnimationStudios/OpenUSD#693

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

get_timestamp should use the cache, not query the database #8

get_timestamp should use the cache, not query the database #8

WD-stefaang commented Nov 8, 2018

sirpalee commented Nov 12, 2018

sirpalee commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

sirpalee commented Nov 12, 2018

WD-stefaang commented Nov 13, 2018

WD-stefaang commented Nov 14, 2018

get_timestamp should use the cache, not query the database #8

get_timestamp should use the cache, not query the database #8

Comments

WD-stefaang commented Nov 8, 2018

sirpalee commented Nov 12, 2018

sirpalee commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

WD-stefaang commented Nov 12, 2018

sirpalee commented Nov 12, 2018

WD-stefaang commented Nov 13, 2018

WD-stefaang commented Nov 14, 2018