Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

azurerm_storage_account - fix hanging destroy caused by multiple network rules #5565

Merged
merged 4 commits into from
Feb 7, 2020

Conversation

krispetkov
Copy link
Contributor

Fix for issue #5434 (and related issues: #5530 , #3156 )

  • Problem: If you have a storage account with more than one network rule set - the destroy action hangs forever and the real destroy/deletion of the resource never happens. If you have a storage account with zero or one network rule attached then this problem is irrelevant - the destroy action completes successfully.

  • Reason: In the resourceArmStorageAccountDelete method located in azurerm/internal/services/storage/resource_arm_storage_account.go there is a string array which is passed to the mutex lock method locks.MultipleByName. The reason for the problem is that the creation of this array does not cover all possible cases. The logic of the method assumes that you have different virtual networks and inserts them into the array used for locking but does not work correctly in the corner case where you have one virtual network with few sub-networks and you are assigning these sub-networks as a network rules to your storage account. Hence the array passed to the lock method contains n-times strings with the same name (same strings). When this array goes to the mutex the first string is used to lock, the second string is used to lock etc. but at the end all the strings are the same and when the first unlock comes - everything is OK but when a second, third or what so ever unlock hits... dead lock there is nothing to unlock because the first unlock already used the string and it was removed and all other unlocks are trying to use something that simply does not exists.

  • Resolution: Check if there are subnets and attach them to the network name hence all the strings used to populate the array are unique and the lock-unlock problem is resolved.

Copy link
Collaborator

@katbyte katbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @krispetkov,

Thank you for the PR, however i'm not sure if your solution is the ideal. By changing the name of what we are locking on other places that lock on the network name might not endup guarded.

How about instead we update locks.MultipleByName to remove any duplicate strings?

@krispetkov
Copy link
Contributor Author

krispetkov commented Feb 4, 2020

Hi @katbyte

Ok, I can make this change. So it will be a more generic solution and will prevent other scenarios similar to this one.

@krispetkov
Copy link
Contributor Author

krispetkov commented Feb 4, 2020

Hi @katbyte

I have provided a new commit with a more generic solution of the problem. Please have a look and share your opinion. I was thinking to make the solution more complex by providing an additional method that will check if the passed array have duplicated elements and then to execute the logic from my last commit. But I don't know if it worth to make it more complex (to introduce a hole new method which will iterate all over the passed array and seeks for duplicates)?

With the current proposed solution we guarantee that every item of the array will be unique.

ByName(name, resourceType)
for i, name := range *names {
// at the end of every array item add its index. This way we guarantee that this item will be unique (no duplicates are possible)
uniqueValue := fmt.Sprintf("%s-arrIdx-%d", name, i)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the same issue as the previous solution, if two resources call this with resource N, but they have different indexes we could deadlock. What we should do is just ensure there are no duplicates:

func SliceUniqString(elements []string) []sring {
	seen := make(map[int]bool, len(s))
	j := 0
	for _, v := range s {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = true
		elements[j] = v
		j++
	}
	return elements[:j]
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also abstracts out the actual lock names, which could introduce hard to debug issues

Copy link
Contributor Author

@krispetkov krispetkov Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I got your idea. I had prepared similar solution. I was trying to avoid the deletion/remove of duplicated strings and that's why I proposed my previous solution but maybe you are right and this is the right way. I have committed the new changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@katbyte I know that you are busy working on v2.0.0 but just wanted to ask if I need to make some other changes or the last ones are OK? And do you think that this fix will be merged in the next v1.44? I'm asking because it is an essential bug for our team and currently blocking some of our tasks and I just want to have some more information.

azurerm/internal/locks/lock.go Outdated Show resolved Hide resolved
@ghost ghost added size/M and removed size/XS labels Feb 5, 2020
@katbyte katbyte added this to the v1.44.0 milestone Feb 7, 2020
Copy link
Collaborator

@katbyte katbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes @krispetkov, this LGTM now 🚀

@katbyte katbyte changed the title Providing a fix for issue 5434 azurerm_storage_account - fix hanging destroy caused by multiple network rules Feb 7, 2020
@katbyte katbyte merged commit 138cfcf into hashicorp:master Feb 7, 2020
katbyte added a commit that referenced this pull request Feb 7, 2020
@ghost
Copy link

ghost commented Feb 12, 2020

This has been released in version 1.44.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 1.44.0"
}
# ... other configuration ...

@ghost
Copy link

ghost commented Mar 28, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants