Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #684 Abort S3 MultipartUpload if exception is raised #752

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 12 additions & 3 deletions smart_open/compression.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,9 @@
"""Implements the compression layer of the ``smart_open`` library."""
import logging
import os.path
import sys

from smart_open.s3 import MultipartWriter

logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -83,9 +86,15 @@ def close_both(*args):
try:
outer_close()
finally:
if inner:
inner, fp = None, inner
fp.close()
ex, _, _ = sys.exc_info()

# https://github.com/RaRe-Technologies/smart_open/issues/684
# If exception is raised multipart writer should be terminated to abort file uploading to S3
# If not incomplete file will be saved in bucket
if type(inner) == MultipartWriter and ex:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi 👋 can this be generalized such that it doesn't just handle S3, but also the other adapters like Azure?

Azure exception handling ref #783

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. by refactoring tweak_close to become tweak_exit, that patches .__exit__() instead of .close()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opened #786

inner.terminate()
elif inner:
inner.close()

outer.close = close_both

Expand Down
27 changes: 27 additions & 0 deletions smart_open/tests/test_s3.py
Original file line number Diff line number Diff line change
Expand Up @@ -554,6 +554,33 @@ def test_writebuffer(self):

assert actual == contents

def test_write_gz_with_error(self):
"""Does s3 multipart chunking work correctly if exception is raised for compression file?"""

session = boto3.Session()
resource = session.resource('s3', region_name='us-east-1')
resource.create_bucket(Bucket=BUCKET_NAME)

with self.assertRaises(ValueError):
with smart_open.open(
f's3://{BUCKET_NAME}/{WRITE_KEY_NAME}',
mode="wb",
compression='.gz',
transport_params={
"multipart_upload": True,
"min_part_size": 10,
}
) as fout:
fout.write(b"test12345678test12345678")
fout.write(b"test\n")

raise ValueError("some error")

with self.assertRaises(OSError) as cm:
smart_open.s3.open(BUCKET_NAME, WRITE_KEY_NAME, 'rb')

assert 'The specified key does not exist.' in cm.exception.args[0]


@moto.mock_s3
class SinglepartWriterTest(unittest.TestCase):
Expand Down