
x/capability: GetCapability deletes capabilities unexpectedly #7805

Closed
4 tasks
ebuchman opened this issue Nov 4, 2020 · 4 comments
@ebuchman
Member

ebuchman commented Nov 4, 2020

Surfaced from Informal Systems IBC Audit of cosmos-sdk hash 82f15f3.

If ScopedKeeper.GetCapability is called with a name that is not in its store, or by a module that doesn't own the capability it's requesting, it will delete the capability with the 0th index.

GetCapability contains the following code:

	key := types.RevCapabilityKey(sk.module, name)
	indexBytes := memStore.Get(key)
	index := sdk.BigEndianToUint64(indexBytes)

	if len(indexBytes) == 0 {
		// If a tx failed and NewCapability got reverted, it is possible
		// to still have the capability in the go map since changes to
		// go map do not automatically get reverted on tx failure,
		// so we delete here to remove unnecessary values in map
		delete(sk.capMap, index)
		return nil, false
	}

The intention here seems to be that if the key is not in the store, we should remove the associated index from our capMap. However, the capability is referred to by name here, and we have to look it up in the store to get the index. If the lookup returns empty, we don't actually know what the index is, and the empty return bytes are interpreted as 0, so we end up deleting the 0th capability! This is a double whammy: it fails to delete the phantom capability in the capMap, and it succeeds in deleting the 0th indexed capability from the capMap, which will likely cause problems later when something tries to access it.
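The root cause — empty bytes decoding to index 0 — can be reproduced in a self-contained sketch. Here bigEndianToUint64 mirrors sdk.BigEndianToUint64's behavior for empty input, and capMap is a plain map standing in for the keeper's go map; neither is the SDK's real code:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// bigEndianToUint64 mirrors sdk.BigEndianToUint64: an empty slice decodes to 0.
func bigEndianToUint64(bz []byte) uint64 {
	if len(bz) == 0 {
		return 0
	}
	return binary.BigEndian.Uint64(bz)
}

func main() {
	capMap := map[uint64]string{0: "transfer"} // legitimate capability at index 0

	// A store lookup for an unknown name returns nil bytes...
	var indexBytes []byte
	index := bigEndianToUint64(indexBytes) // ...which decode to 0

	if len(indexBytes) == 0 {
		// Intended to purge a phantom entry, but index is 0 here,
		// so the legitimate 0th capability is deleted instead.
		delete(capMap, index)
	}

	fmt.Println(len(capMap)) // prints 0: the "transfer" capability is gone
}
```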

We don't have a test yet for the case where there was a rollback, but the following two tests (to be run from x/capability/keeper) demonstrate ways this could be triggered. The first involves a module calling GetCapability with a name that doesn't exist, and the second involves a second module calling GetCapability on a name owned by another module:

func TestNewGetCap1(t *testing.T) {
	cdc := simapp.MakeTestEncodingConfig().Marshaler
	db := dbm.NewMemDB()
	cms := store.NewCommitMultiStore(db)
	key := sdk.NewKVStoreKey("hello")
	memkey := storetypes.NewMemoryStoreKey("memhello")

	// create new keeper so we can define custom scoping before init and seal
	keeper := keeper.NewKeeper(cdc, key, memkey)

	// scope to module
	sk1 := keeper.ScopeToModule(banktypes.ModuleName)

	// mount stores, initialize and seal
	cms.MountStoreWithDB(key, sdk.StoreTypeIAVL, db)
	cms.MountStoreWithDB(memkey, sdk.StoreTypeMemory, nil)
	require.NoError(t, cms.LoadLatestVersion())
	ctx := sdk.NewContext(cms, tmproto.Header{Height: 1}, false, log.NewNopLogger())
	keeper.InitializeAndSeal(ctx)

	// create a new capability
	cap, err := sk1.NewCapability(ctx, "transfer")
	require.NoError(t, err)
	require.NotNil(t, cap)

	// try to get it, should pass.
	got, ok := sk1.GetCapability(ctx, "transfer")
	require.True(t, ok)
	require.NotNil(t, got)
	require.Equal(t, cap, got, "expected memory addresses to be equal")

	// try to get a made up capability, should fail.
	got, ok = sk1.GetCapability(ctx, "transfer-1")
	require.False(t, ok)
	require.Nil(t, got)

	// try to get the original again. should pass. but fails !
	got, ok = sk1.GetCapability(ctx, "transfer")
	require.True(t, ok)
	require.NotNil(t, got)
	require.Equal(t, cap, got, "expected memory addresses to be equal")
}

func TestNewGetCap2(t *testing.T) {
	cdc := simapp.MakeTestEncodingConfig().Marshaler
	db := dbm.NewMemDB()
	cms := store.NewCommitMultiStore(db)
	key := sdk.NewKVStoreKey("hello")
	memkey := storetypes.NewMemoryStoreKey("memhello")

	// create new keeper so we can define custom scoping before init and seal
	keeper := keeper.NewKeeper(cdc, key, memkey)

	// scope to two modules
	sk1 := keeper.ScopeToModule(banktypes.ModuleName)
	sk2 := keeper.ScopeToModule(stakingtypes.ModuleName)

	// mount stores, initialize and seal
	cms.MountStoreWithDB(key, sdk.StoreTypeIAVL, db)
	cms.MountStoreWithDB(memkey, sdk.StoreTypeMemory, nil)
	require.NoError(t, cms.LoadLatestVersion())
	ctx := sdk.NewContext(cms, tmproto.Header{Height: 1}, false, log.NewNopLogger())
	keeper.InitializeAndSeal(ctx)

	// module 1 creates a new capability
	cap, err := sk1.NewCapability(ctx, "transfer")
	require.NoError(t, err)
	require.NotNil(t, cap)

	// module 1 tries to get it. should pass.
	got, ok := sk1.GetCapability(ctx, "transfer")
	require.True(t, ok)
	require.NotNil(t, got)
	require.Equal(t, cap, got, "expected memory addresses to be equal")

	// module 2 tries to get it. should fail
	got, ok = sk2.GetCapability(ctx, "transfer")
	require.False(t, ok)
	require.Nil(t, got)

	// module 1 tries to get it. should pass. but fails !
	got, ok = sk1.GetCapability(ctx, "transfer")
	require.True(t, ok)
	require.NotNil(t, got)
	require.Equal(t, cap, got, "expected memory addresses to be equal")
}

While both tests should pass, they fail. In the first test, if the call to sk1.GetCapability(ctx, "transfer-1") is removed, the test passes. In the second, if the call to sk2.GetCapability is removed, the test passes. So it seems these calls are causing the 0th index capability to be wiped due to the empty bytes returned from the store being interpreted as the 0th index. Notably, this bug only applies to the capability with index 0 - if the index is already greater than 0, the test sequences above would pass as expected.

Fixes

One approach may be to simply start the capability index at 1, which would solve the deletion issues. However, it would not solve the problem of failing to remove phantom capabilities from reverted txs, since the problem here is that we don't know their index. Possibly the simplest solution there is to maintain a reverse lookup map from name to index (just like the fwd/reverse maps are kept in the store, we'd keep one in the local map too).
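The reverse-lookup idea could look something like the following sketch. ScopedKeeper, nameToIndex, and purgePhantom here are hypothetical stand-ins, not the SDK's actual types; the point is only that a name-to-index map lets the miss path delete the right entry instead of decoding empty bytes to 0:

```go
package main

import "fmt"

// Capability is a stand-in for types.Capability.
type Capability struct{ Index uint64 }

// ScopedKeeper sketch: alongside the index-keyed capMap, a hypothetical
// nameToIndex map records each capability's index under its name.
type ScopedKeeper struct {
	capMap      map[uint64]*Capability
	nameToIndex map[string]uint64 // reverse lookup: name -> index
}

// purgePhantom is the corrected miss path of GetCapability: when the
// persistent store has no entry for name, look up the index in the
// reverse map and delete only that stale go-map entry (if any).
func (sk *ScopedKeeper) purgePhantom(name string) {
	if idx, ok := sk.nameToIndex[name]; ok {
		delete(sk.capMap, idx)
		delete(sk.nameToIndex, name)
	}
}

func main() {
	sk := &ScopedKeeper{
		capMap:      map[uint64]*Capability{0: {Index: 0}, 7: {Index: 7}},
		nameToIndex: map[string]uint64{"transfer": 0, "phantom": 7},
	}
	// The store no longer has "phantom" (its tx was reverted): purge it.
	sk.purgePhantom("phantom")
	fmt.Println(len(sk.capMap))          // only one entry remains
	fmt.Println(sk.capMap[0] != nil)     // true: index 0 survives
	// A miss on a name that was never created deletes nothing.
	sk.purgePhantom("never-existed")
	fmt.Println(len(sk.capMap))          // still one entry
}
```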

Also, should the capMap really be shared across all ScopedKeepers?

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@ebuchman ebuchman added this to the IBC 1.0 Implementation Audit milestone Nov 4, 2020
@ebuchman ebuchman added the T:Bug label Nov 4, 2020
@alexanderbez
Contributor

I like the idea of a reverse lookup map. @AdityaSripal was there a reason we went with zero-based indexing?

@AdityaSripal
Member

AdityaSripal commented Nov 6, 2020

Sorry it took me a while to reply @alexanderbez @ebuchman, I was pretty sure I initialized with 1 but needed to reread the code to make sure as it has been a while. Read through your issue, thanks for surfacing this!!

So we actually do use one-based indexing in the capability module, which is why there isn't a bug with regard to deleting a capability.

The global starting capability index is initialized in InitChain by reading from the module's GenesisState. The default GenesisState will initialize the capability index to 1. The reason this is done through genesis and not statically in code is that a chain importing state from a previous blockchain will need to initialize the capability index to the index that was exported by the previous chain (i.e. some index > 1). Note that if a chain tried to initialize the capability index with 0, this would indeed cause issues, which is why we error on ValidateGenesis if GenesisState.Index == 0.
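The invariant described here can be sketched as a minimal genesis validation check. GenesisState and validateGenesis below are simplified stand-ins for the real types in x/capability/types/genesis.go, keeping only the index field that matters for this issue:

```go
package main

import (
	"errors"
	"fmt"
)

// GenesisState sketch: only the starting capability index matters here.
type GenesisState struct {
	Index uint64
}

// validateGenesis rejects a zero starting index, so no capability can
// ever be written at index 0 on a real chain.
func validateGenesis(gs GenesisState) error {
	if gs.Index == 0 {
		return errors.New("capability index must be non-zero")
	}
	return nil
}

func main() {
	fmt.Println(validateGenesis(GenesisState{Index: 0}) != nil) // true: zero rejected
	fmt.Println(validateGenesis(GenesisState{Index: 1}) == nil) // true: default genesis
	fmt.Println(validateGenesis(GenesisState{Index: 50}) == nil) // true: imported state
}
```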

Here is the PR that fixed this issue: #6047

Specific files on master to look at:
https://github.com/cosmos/cosmos-sdk/blob/master/x/capability/types/genesis.go
https://github.com/cosmos/cosmos-sdk/blob/master/x/capability/genesis.go

So while this error can be recreated in keeper tests, it cannot happen on a real chain, since the capability index will never start at 0 and no capability will ever be written to the 0th index. Let me know if there is a way to document this better, or if you believe there is a better solution than the one implemented.

However, your point about us not deleting the capability from the map in case of a tx rollback is absolutely correct and should be fixed. A reverse lookup map should work.

One issue I found: I believe SetIndex should still check that it hasn't been previously initialized. This was implemented in #6047 but removed in #6057. I believe that check should be added back.
f82bc19#diff-6f9e7102c865d8072bd3eceea20a24354290dfafe38c7cf59135b31ead9ec3f4R133
While still allowing initialization with "non-1" indices.
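The two guards discussed here could be sketched as follows. Keeper, indexSet, and this SetIndex signature are hypothetical simplifications, not the SDK's actual keeper; the point is rejecting index 0 and re-initialization while still permitting any starting index greater than 1:

```go
package main

import (
	"errors"
	"fmt"
)

// Keeper sketch holding the global capability index.
type Keeper struct {
	index    uint64
	indexSet bool
}

// SetIndex rejects a zero index and re-initialization, while still
// allowing any non-zero starting index (e.g. state imported from a
// previous chain that exported some index > 1).
func (k *Keeper) SetIndex(index uint64) error {
	if index == 0 {
		return errors.New("capability index must be non-zero")
	}
	if k.indexSet {
		return errors.New("capability index already initialized")
	}
	k.index = index
	k.indexSet = true
	return nil
}

func main() {
	k := &Keeper{}
	fmt.Println(k.SetIndex(0) != nil)  // true: zero rejected
	fmt.Println(k.SetIndex(42) == nil) // true: "non-1" start allowed
	fmt.Println(k.SetIndex(1) != nil)  // true: already initialized
}
```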

@ebuchman
Member Author

ebuchman commented Nov 6, 2020

Aha, thanks for the details, I should have seen that! I had the comment on SetIndex marked as incorrect, but went after the "it's set to 1" issue and demonstrated it artificially. Clearly the stores need to be initialized from genesis state; in trying to cut out the simapp as a testing dependency, I neglected that.

I think adding more constraints to SetIndex is a good idea: both checking if it's already set and that the index is greater than 0. I've submitted a PR which does so and fixes some other comments/tests here: #7845

@colin-axner
Contributor

looks like this issue has been addressed
