S3: fix concurrency issue with versioned bucket and precondition #13299
Conversation
S3 Image Test Results (AMD64 / ARM64): 2 files, 2 suites, 8m 14s ⏱️. Results for commit 149f42b. ♻️ This comment has been updated with latest results.
Test Results (amd64) - Integration, Bootstrap: 5 files, 5 suites, 1h 40m 53s ⏱️. Results for commit 149f42b. ♻️ This comment has been updated with latest results.
LocalStack Community integration with Pro: 2 files ±0, 2 suites ±0, 1h 17m 43s ⏱️ (-43m 43s). Results for commit 149f42b. ± Comparison against base commit afbdfeb. This pull request removes 2894 tests.
k-a-il left a comment:
LGTM 👍
Motivation
See #13294
The issue above is caused by using the version-specific write lock to enforce write preconditions. This doesn't work because each new version gets its own lock, which is not the lock of the previous version being overwritten: two concurrent conditional writes create two new versions that don't share a lock, so both preconditions pass and both writes execute.
Current S3 locking is fairly complex, as we need to lock for multiple, very different scenarios.
This is how S3 versioning works: https://docs.aws.amazon.com/AmazonS3/latest/userguide/versioning-workflows.html
Basically, an object key is now a reference to a stack of object versions (each a regular S3 object or a delete marker). The problem is that a concurrent write to a versioned object does not overwrite the previous object; it only pushes a new object onto the stack. The current "per object version" locking therefore doesn't work: when doing a conditional write, we never touch the existing object's lock, only the lock of the new, current object.
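To make that model concrete, here is a minimal, hypothetical sketch of the data layout described above; the class and field names are illustrative, not LocalStack's actual storage classes.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ObjectVersion:
    """One entry in a key's version stack: a regular object or a delete marker."""
    version_id: str
    etag: str
    is_delete_marker: bool = False
    # Per-version lock: a conditional write creates a *new* version with a
    # fresh lock, so it never contends with the version it conceptually
    # overwrites. This is exactly the race described above.
    lock: threading.RLock = field(default_factory=threading.RLock)


@dataclass
class KeyStack:
    """A versioned object key: a stack of versions, newest last."""
    versions: list = field(default_factory=list)

    def current(self) -> "ObjectVersion | None":
        return self.versions[-1] if self.versions else None
```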
We need a way to lock "the whole stack" instead. I tried adding this to the current storage layer, but it doesn't work: if we share the same write lock around an "object stack", we enter a deadlock when doing an in-place `CopyObject`. `CopyObject` can read the previous object version and write a new version on top of it; if both operations share the same lock, it's pretty clear this cannot work. So we need a dedicated lock for conditional writes, per object key (stack), when using versioned buckets. For regular objects, this already works correctly by design, so I decided not to add it for them and to keep the changes contained to versioned buckets. A sketch of the idea follows.
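Continuing the sketch above, the fix could look roughly like this: one lock per object key, taken only for conditional writes on versioned buckets, so that checking the precondition and pushing the new version happen atomically. The names here (`conditional_write_locks`, `PreconditionFailed`, the parameters) are assumptions for illustration, not the PR's actual code.

```python
import threading
from collections import defaultdict
from typing import Optional


class PreconditionFailed(Exception):
    """Stand-in for S3's 412 Precondition Failed error."""


# One lock per object key, guarding the whole version stack for conditional
# writes (a real implementation would scope this per bucket and clean up
# entries for deleted keys).
conditional_write_locks = defaultdict(threading.Lock)


def put_object_conditional(
    stack: KeyStack,  # KeyStack/ObjectVersion from the sketch above
    key: str,
    new_version: ObjectVersion,
    if_none_match: bool = False,
    if_match_etag: Optional[str] = None,
) -> None:
    # The whole check-then-push sequence holds the per-key lock, so two
    # concurrent conditional writes can no longer both pass the check.
    with conditional_write_locks[key]:
        current = stack.current()
        if if_none_match and current is not None and not current.is_delete_marker:
            raise PreconditionFailed()
        if if_match_etag is not None and (
            current is None or current.etag != if_match_etag
        ):
            raise PreconditionFailed()
        stack.versions.append(new_version)
```

Note that in this scheme `CopyObject` never takes the per-key lock for its read of the source version, which is what avoids the deadlock described above.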
Side note: AWS made some changes to `DeleteObject` to now allow `IfMatch` to be used, which wasn't the case before, so maybe we can use the new lock for all keys, or use `open()` instead to get the right lock. I'm not sure `IfMatch` can be used with a specific version ID, so using `open` without the lock might just work fine on the latest object of the stack.
Changes
Testing
This is pretty hard to test in Python due to its limited concurrency, but the reporter of the issue shared a small Go snippet that I could run: it failed reliably before the fix and now works properly 👍
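For reference, a rough Python equivalent of that reproduction idea might look like the following, assuming a LocalStack endpoint on localhost:4566 and a recent boto3/botocore with conditional-write support for `PutObject`; as noted above, Python's concurrency makes the race much harder to trigger than Go's, so treat this as illustrative rather than a reliable reproducer (bucket and key names are made up).

```python
import threading

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3", endpoint_url="http://localhost:4566")
bucket, key = "repro-bucket", "contended-key"

s3.create_bucket(Bucket=bucket)
s3.put_bucket_versioning(
    Bucket=bucket, VersioningConfiguration={"Status": "Enabled"}
)

results = []


def conditional_put() -> None:
    try:
        # IfNoneMatch="*" asks S3 to reject the write if the key already
        # exists; only one of the concurrent writers should succeed.
        s3.put_object(Bucket=bucket, Key=key, Body=b"data", IfNoneMatch="*")
        results.append("ok")
    except ClientError:
        results.append("precondition failed")


threads = [threading.Thread(target=conditional_put) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Before the fix, several writers could slip through; after it, exactly one.
print(results)
assert results.count("ok") == 1
```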