Handle exceptions during writes to Azure #783

Merged: 7 commits, Feb 22, 2024

@ddelange (Contributor) commented Sep 8, 2023

Title

Avoid lingering incomplete uploads on Azure storage blob in case of upload interrupt/failure.

Analogous to the s3.Writer.__exit__ implementation.
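A minimal sketch of the behaviour described above, not the code from this PR: a writer stages each write as a block and commits the block list only when the context manager exits cleanly. `InMemoryBlobClient`, `SketchWriter` and `write_and_fail` are hypothetical names introduced for illustration, and the `__exit__` shape (terminate on error, close otherwise) is an assumption about what "analogous to s3.Writer.__exit__" means.

```python
import io


class InMemoryBlobClient:
    """Hypothetical stand-in for a blob client; not the real Azure SDK."""

    def __init__(self):
        self.staged = {}       # block_id -> bytes, uploaded but uncommitted
        self.committed = None  # final blob contents once committed

    def stage_block(self, block_id, data):
        self.staged[block_id] = data

    def commit_block_list(self, block_ids):
        self.committed = b"".join(self.staged.pop(i) for i in block_ids)


class SketchWriter(io.BufferedIOBase):
    """Stages every write as a block; commits only on a clean close."""

    def __init__(self, client):
        self._client = client
        self._block_ids = []

    def writable(self):
        return True

    def write(self, data):
        block_id = "%05d" % len(self._block_ids)
        self._client.stage_block(block_id, bytes(data))
        self._block_ids.append(block_id)
        return len(data)

    def terminate(self):
        """Abandon the upload without committing the staged blocks."""
        self._block_ids = None
        super().close()

    def close(self):
        if self.closed:
            return
        if self._block_ids is not None:
            self._client.commit_block_list(self._block_ids)
            self._block_ids = None
        super().close()

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Assumed shape of the s3-style handler: abort on error, else commit.
        if exc_type is not None:
            self.terminate()
        else:
            self.close()


def write_and_fail(client):
    """Simulate an upload interrupted by an exception mid-write."""
    try:
        with SketchWriter(client) as fout:
            fout.write(b"partial data")
            raise ValueError("simulated failure")
    except ValueError:
        pass
    return client
```

With this sketch, `write_and_fail(InMemoryBlobClient())` leaves `committed` as `None` while `staged` is non-empty, which mirrors the assertion style used in the test discussed in this conversation.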

Motivation

Housekeeping, expected behaviour, cost savings.

If you're adding a new feature, then consider opening a ticket and discussing it with the maintainers before you actually do the hard work.

Tests

If you're fixing a bug, consider test-driven development:

  1. Create a unit test that demonstrates the bug. The test should fail.
  2. Implement your bug fix.
  3. The test you created should now pass.

If you're implementing a new feature, include unit tests for it.

Make sure all existing unit tests pass.
You can run them locally using:

pytest smart_open

If there are any failures, please fix them before creating the PR (or mark it as WIP, see below).

Work in progress

If you're still working on your PR, include "WIP" in the title.
We'll skip reviewing it for the time being.
Once you're ready to review, remove the "WIP" from the title, and ping one of the maintainers (e.g. mpenkov).

Checklist

Before you create the PR, please make sure you have:

  • Picked a concise, informative and complete title
  • Clearly explained the motivation behind the PR
  • Linked to any existing issues that your PR will be solving
  • Included tests for any new functionality
  • Checked that all unit tests pass

Workflow

Please avoid rebasing and force-pushing to the branch of the PR once a review is in progress.
Rebasing can make your commits look a bit cleaner, but it also makes life more difficult for the reviewer, because they are no longer able to distinguish between code that has already been reviewed and unreviewed code.

@ddelange changed the title from "Add Writer.terminate to azure.py analogous to s3.py" to "Handle exceptions during azure write" on Sep 10, 2023
@ddelange changed the title from "Handle exceptions during azure write" to "Handle exceptions during writes to Azure" on Sep 10, 2023
        raise ValueError
except ValueError:
    # FakeBlobClient.commit_block_list was not called
    self.assertGreater(len(blob_client._staged_contents), 0)
@ddelange (Contributor Author) commented:

should add a similar test including compression, after #786 merges

…open into patch-2

* 'develop' of https://github.com/RaRe-Technologies/smart_open:
  Propagate __exit__ call to underlying filestream (piskvorky#786)
  Retry finalizing multipart s3 upload (piskvorky#785)
  Fix `KeyError: 'ContentRange'` when received full content from S3 (piskvorky#789)
  Add support for SSH connection via aliases from `~/.ssh/config` (piskvorky#790)
  Make calls to smart_open.open() for GCS 1000x faster by avoiding unnecessary GCS API call (piskvorky#788)
  Add zstandard compression feature (piskvorky#801)
  Support moto 4 & 5 (piskvorky#802)
  Secure the connection using SSL when connecting to the FTPS server (piskvorky#793)
  upgrade dev status classifier to stable (piskvorky#798)
  Fix formatting of python code (piskvorky#795)
        raise ValueError
except ValueError:
    # FakeBlobClient.commit_block_list was not called
    self.assertGreater(len(blob_client._staged_contents), 0)
@ddelange (Contributor Author) commented:

this one still fails because the test case covers the TextIOWrapper path: TextIOWrapper calls close on the azure binary io before our FileLikeProxy has had the chance to call __exit__ on it (__exit__ circumvents close when it is called during exception handling, ref #786)
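The ordering problem can be reproduced with the standard library alone. The sketch below is an illustration under assumptions, not smart_open code: `RecordingWriter` and `calls_after_failed_text_write` are hypothetical names, and the writer only records which hooks the text wrapper invokes on it.

```python
import io


class RecordingWriter(io.BufferedIOBase):
    """Hypothetical binary writer that records which hooks get called."""

    def __init__(self):
        self.calls = []

    def writable(self):
        return True

    def write(self, data):
        return len(data)

    def close(self):
        if not self.closed:
            self.calls.append("close")
        super().close()

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Never reached via TextIOWrapper: the wrapper calls close() directly.
        self.calls.append("__exit__")
        self.close()


def calls_after_failed_text_write():
    binary = RecordingWriter()
    try:
        with io.TextIOWrapper(binary, encoding="utf-8") as text:
            text.write("hello")
            raise ValueError("simulated failure")
    except ValueError:
        pass
    return binary.calls
```

Here the binary writer records only `"close"`: when the `with` block exits, `io.TextIOWrapper` closes its underlying buffer, so the buffer's own `__exit__` never sees the exception and cannot choose to abort instead of committing.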

@ddelange (Contributor Author) commented:

@mpenkov one more PR green :) added some documentation and extra tests

…open into patch-2

* 'develop' of https://github.com/RaRe-Technologies/smart_open:
  fix test, for real this time
  update integration test
  Add advanced usage sections to README.rst (piskvorky#741)
  Add logic for handling large files in MultipartWriter uploads to s3 (piskvorky#796)
  Fix __str__ method in SinglepartWriter (piskvorky#791)
@ddelange (Contributor Author) commented:

I've also opened an analogous gcs issue googleapis/python-storage#1228

@mpenkov merged commit 42682e7 into piskvorky:develop on Feb 22, 2024
21 checks passed
@mpenkov (Collaborator) commented Feb 22, 2024

Thank you for your work!

@ddelange (Contributor Author) commented:

Thanks for the reviews! 🙏
