forked from greenplum-db/gpbackup-archive
Avoid using stderr to detect plugin failures, wait for plugin processes #99
Merged
whitehawk changed the title from "Avoid using stderr to detect plugin failures, wait for plugin process…" to "Avoid using stderr to detect plugin failures, wait for plugin processes" (Jul 29, 2024)
RekGRpth reviewed (Jul 29, 2024)
RekGRpth reviewed (Jul 29, 2024)
RekGRpth reviewed (Jul 29, 2024)
RekGRpth reviewed (Jul 29, 2024)
RekGRpth reviewed (Jul 29, 2024)
RekGRpth reviewed (Jul 31, 2024)
RekGRpth reviewed (Jul 31, 2024)
RekGRpth reviewed (Jul 31, 2024)
whitehawk force-pushed the ADBDEV-5966 branch 2 times, most recently from b595bb6 to 68da53c (August 1, 2024 01:09)
RekGRpth reviewed (Aug 1, 2024)
RekGRpth reviewed (Aug 1, 2024)
…es (#89)
Previously, gpbackup_helper would error out and abort restore operations if any
plugin wrote anything to stderr. Additionally, when using the adb_ddp_plugin to
restore data, gpbackup_helper did not wait for plugin processes, leading to a
large number of zombie processes when restoring with the --resize-cluster flag,
causing the process to stop.
This patch removes the requirement for stderr to be empty. Now, messages
directed to stderr are logged as warnings, allowing the process to continue
without interruption. The helper can still detect when a plugin process has
exited because the exit of a plugin process closes the associated reader
handles, causing an error during subsequent read attempts.
The patch also adds logic to wait and reap plugin processes. Instead of turning
plugin processes into zombies, gpbackup_helper now calls Wait() on them. This
action is performed every time a reader finishes copying its content. Wait() is
not done in case of --single-data-file, because Wait() closes pipes immediately,
but helper will reuse the same reader and read from its stdout pipe multiple
times.
Two new tests are introduced: the first one verifies that gpbackup_helper does
not fail when a plugin writes something to stderr during the restore operation.
The second test ensures that gpbackup_helper errors out when a plugin process
terminates in the middle of the restore operation.
Changes comparing to the original commit:
1. logWarning() is replaced with already existing logWarn(), that has the same
functionality.
2. One of the calls to waitForPlugin() is removed as no more necessary, because
there is no more nested loop over batches, and we can leave only one call for
waitForPlugin() after 'LoopEnd' label.
3. Several variable names in the test were updated as old names do not exist
anymore. Plus the pipefile name in the test was updated, as now it includes
batch number.
4. log() doesn't exist anymore and is replaced with logVerbose().
5. Unreachable call to logPlugin() is removed.
6. New tests are added to cover the case with cluster resize.
7. logPlugin() is merged into waitForPlugin().
8. Tests are reworked to avoid goroutines.
9. Cleanup of plugin's test control files in the test is now done from a defer
function in order not to leave them if test failed (otherwise a failed test
could affect the subsequent tests).
10. SpecTimeout is added to some new tests to ensure that if the delta from
this commit is broken, the test will not hang and will provide more useful
output.
(cherry picked from commit bb75d5a)
Co-authored-by: Roman Eskin <[email protected]>
RekGRpth approved these changes (Aug 1, 2024)
bandetto approved these changes (Aug 1, 2024)
Avoid using stderr to detect plugin failures, wait for plugin processes (#89)
Previously, gpbackup_helper would error out and abort restore operations if any
plugin wrote anything to stderr. Additionally, when using the adb_ddp_plugin to
restore data, gpbackup_helper did not wait for plugin processes, leading to a
large number of zombie processes when restoring with the --resize-cluster flag,
causing the process to stop.
This patch removes the requirement for stderr to be empty. Now, messages
directed to stderr are logged as warnings, allowing the process to continue
without interruption. The helper can still detect when a plugin process has
exited because the exit of a plugin process closes the associated reader
handles, causing an error during subsequent read attempts.
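
To illustrate the idea, here is a minimal sketch and not the actual gpbackup_helper code. The function `restoreFromPlugin`, the use of `sh -c` as a stand-in plugin, and the fixed byte count are all assumptions made for the example. Stderr output is collected and turned into warning lines, while an early plugin exit surfaces as a short read from the stdout pipe. Reporting a non-zero exit status from `Wait()` is an extra choice of this sketch.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os/exec"
	"strings"
)

// restoreFromPlugin is a hypothetical helper (not part of gpbackup_helper).
// It runs a shell script as a stand-in for a plugin, expects `want` bytes on
// stdout, and returns anything written to stderr as warnings, not as an error.
func restoreFromPlugin(script string, want int64) (string, []string, error) {
	cmd := exec.Command("sh", "-c", script)
	var stderr bytes.Buffer
	cmd.Stderr = &stderr
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return "", nil, err
	}
	if err := cmd.Start(); err != nil {
		return "", nil, err
	}

	// If the plugin exits early, its end of the pipe is closed and
	// CopyN returns io.EOF before `want` bytes have arrived.
	var data bytes.Buffer
	_, copyErr := io.CopyN(&data, stdout, want)

	// Drain any leftover output so the process cannot block on a full pipe,
	// then reap it. Wait also finishes collecting stderr into the buffer.
	_, _ = io.Copy(io.Discard, stdout)
	waitErr := cmd.Wait()

	var warnings []string
	for _, line := range strings.Split(stderr.String(), "\n") {
		if strings.TrimSpace(line) != "" {
			warnings = append(warnings, "[WARNING] plugin stderr: "+line)
		}
	}

	if copyErr != nil {
		return data.String(), warnings,
			fmt.Errorf("plugin stopped before delivering %d bytes: %w", want, copyErr)
	}
	if waitErr != nil {
		// Assumption of this sketch: a non-zero exit status is also reported.
		return data.String(), warnings, fmt.Errorf("plugin failed: %w", waitErr)
	}
	return data.String(), warnings, nil
}

func main() {
	// A plugin that complains on stderr but still delivers all of its data.
	data, warnings, err := restoreFromPlugin(`echo "slow disk" >&2; printf 'rowdata'`, 7)
	fmt.Println(data, warnings, err)

	// A plugin that exits after delivering only part of the data.
	_, _, err = restoreFromPlugin(`printf 'row'`, 7)
	fmt.Println(err)
}
```

In this sketch the first call succeeds and only produces a warning, while the second call returns an error because the read ends before the expected number of bytes arrives.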
The patch also adds logic to wait for and reap plugin processes. Instead of
leaving exited plugin processes as zombies, gpbackup_helper now calls Wait() on
them. This is done every time a reader finishes copying its content. Wait() is
not called in the --single-data-file case, because Wait() closes the pipes
immediately, while the helper reuses the same reader and reads from its stdout
pipe multiple times.
Two new tests are introduced: the first one verifies that gpbackup_helper does
not fail when a plugin writes something to stderr during the restore operation.
The second test ensures that gpbackup_helper errors out when a plugin process
terminates in the middle of the restore operation.
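Changes compared to the original commit:
1. logWarning() is replaced with already existing logWarn(), that has the same
functionality.
2. One of the calls to waitForPlugin() is removed as no more necessary, because
there is no more nested loop over batches, and we can leave only one call for
waitForPlugin() after 'LoopEnd' label.
3. Several variable names in the test were updated as old names do not exist
anymore. Plus the pipefile name in the test was updated, as now it includes
batch number.
4. log() doesn't exist anymore and is replaced with logVerbose().
5. Unreachable call to logPlugin() is removed.
6. New tests are added to cover the case with cluster resize.
7. logPlugin() is merged into waitForPlugin().
8. Tests are reworked to avoid goroutines.
9. Cleanup of plugin's test control files in the test is now done from a defer
function in order not to leave them if test failed (otherwise a failed test
could affect the subsequent tests).
10. SpecTimeout is added to some new tests to ensure that if the delta from
this commit is broken, the test will not hang and will provide more useful
output.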
(cherry picked from commit bb75d5a)
Note: do not squash the commit to preserve authorship.