[SYCL] Defer buffer release when no host memory to be updated #6837
…Spec2020 Signed-off-by: Tikhomirova, Kseniya <[email protected]>
This reverts commit 8d05802.
std::unique_ptr<ResourceHandler> &MObj;
static std::atomic_bool MReleaseCalled;
Could you please clarify why this is needed?
Sure, it was used to handle the following case:
The main thread exits and MCounter becomes 0, so we call releaseResources and start joining the thread pool threads. Those threads pass the !MCounter and MObj checks, since the scheduler is still alive, and call releaseResources again, which makes no sense.
After your question, though, I think I can sort it out without the extra variable MReleaseCalled, like this:
if (!MIncrementCounter)
return; //no actions at all for thread pool threads
MCounter--;
if (!MCounter && MObj)
MObj->releaseResources();
Will update patch shortly.
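A minimal sketch of the simplified destructor logic proposed above. Only `MObj`, `MCounter`, `MIncrementCounter`, `ResourceHandler` and `releaseResources` come from the discussion; the wrapper class name `UsageCounter`, its constructor, and the stub `ResourceHandler` body are assumptions made for illustration, not the patch itself.

```cpp
#include <atomic>
#include <memory>

// Stub standing in for the real handler; it only counts release calls.
struct ResourceHandler {
  int ReleaseCalls = 0;
  void releaseResources() { ++ReleaseCalls; }
};

// Hypothetical wrapper: user threads count themselves, pool threads do not.
class UsageCounter {
  bool MIncrementCounter;
  std::unique_ptr<ResourceHandler> &MObj;
  static std::atomic_uint MCounter;

public:
  UsageCounter(std::unique_ptr<ResourceHandler> &Obj, bool IncrementCounter)
      : MIncrementCounter(IncrementCounter), MObj(Obj) {
    if (MIncrementCounter)
      MCounter++;
  }

  ~UsageCounter() {
    if (!MIncrementCounter)
      return; // no actions at all for thread pool threads
    MCounter--;
    if (!MCounter && MObj)
      MObj->releaseResources();
  }
};

std::atomic_uint UsageCounter::MCounter{0};
```

With this shape, an instance created with `IncrementCounter == false` never triggers a release, so only the last counted thread calls `releaseResources`.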
This reverts commit 1c62d08.
// MObj and MReleaseCalled is extra protection needed to handle case when main
// thread finished but thread_pool is still running and we will join that
// threads in releaseResources call.
- // MObj and MReleaseCalled is extra protection needed to handle case when main
- // thread finished but thread_pool is still running and we will join that
- // threads in releaseResources call.
fixed in 71e9048
@@ -47,7 +80,24 @@ T &GlobalHandler::getOrCreate(InstWithLock<T> &IWL, Types... Args) {
  return *IWL.Inst;
}

Scheduler &GlobalHandler::getScheduler() { return getOrCreate(MScheduler); }
void GlobalHandler::attachScheduler(Scheduler *Scheduler) {
  // The method is for testing purposes. Do not protect with lock since
- // The method is for testing purposes. Do not protect with lock since
+ // The method is used in unittests only. Do not protect with lock since
:-) fixed in 71e9048
@@ -141,9 +191,18 @@ void GlobalHandler::unloadPlugins() {
  GlobalHandler::instance().getPlugins().clear();
}

void GlobalHandler::drainThreadPool() {
  if (MHostTaskThreadPool.Inst)
Shouldn't we lock MHostTaskThreadPool.Lock here?
Please correct me if my understanding is wrong, but I thought the lock in InstWithLock exists to protect the getOrCreate call, that is, to avoid a data race on object creation. The drain call is made from releaseResources, which is called on program exit, so it does not introduce any data race of that kind.
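To make the role of that lock concrete, here is a minimal sketch of a lazily created instance guarded during creation. The names `InstWithLock`, `Inst`, `Lock` and `getOrCreate` appear in the diff; the member types (`std::unique_ptr`, `std::mutex`) and the function body are assumptions, and the runtime's actual lock type may differ.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Assumed layout: the instance plus the lock that guards its creation.
template <typename T> struct InstWithLock {
  std::unique_ptr<T> Inst;
  std::mutex Lock; // assumption: the real code may use another lock type
};

// Creates the instance on first use; later calls return the same object.
template <typename T, typename... Types>
T &getOrCreate(InstWithLock<T> &IWL, Types &&...Args) {
  const std::lock_guard<std::mutex> Guard(IWL.Lock);
  if (!IWL.Inst)
    IWL.Inst = std::make_unique<T>(std::forward<Types>(Args)...);
  return *IWL.Inst;
}
```

The lock here only serializes the check-and-create step. Code that merely reads `Inst` after all creators have finished, as the reply argues for the drain call at program exit, is outside what this lock was introduced for.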
@@ -444,9 +450,13 @@ class Scheduler {
  const QueueImplPtr &getDefaultHostQueue() const { return DefaultHostQueue; }

  static MemObjRecord *getMemObjRecord(const Requirement *const Req);
  // Virtual for testing purposes only
- // Virtual for testing purposes only
+ // Virtual for testing purposes only
Is it still relevant?
nope, fixed in 71e9048

Scheduler();
~Scheduler();
void releaseResources();
inline bool isDeferredMemObjectsEmpty();
Could you please clarify why inline is needed here?
Fixed in 71e9048.
You are right, the compiler will decide.
@@ -91,7 +91,7 @@ void SYCLMemObjT::updateHostMemory() {
   // If we're attached to a memory record, process the deletion of the memory
   // record. We may get detached before we do this.
   if (MRecord)
-    Scheduler::getInstance().removeMemoryObject(this);
+    assert(Scheduler::getInstance().removeMemoryObject(this));
Please do not put assert around this call, because then removeMemoryObject is not called at all in builds without asserts.
fixed in 71e9048
sycl/source/detail/thread_pool.hpp
@@ -30,13 +30,16 @@ class ThreadPool {
  std::mutex MJobQueueMutex;
  std::condition_variable MDoSmthOrStop;
  std::atomic_bool MStop;
  std::atomic_uint MJobsInExecution;
- std::atomic_uint MJobsInExecution;
+ std::atomic_uint MNumOfJobs;
Or maybe MJobsInPool
fixed in 71e9048
sycl/source/detail/thread_pool.hpp
std::unique_lock<std::mutex> Lock(MJobQueueMutex);

std::thread::id ThisThreadId = std::this_thread::get_id();
std::thread::id ThisThreadId = std::this_thread::get_id();
It seems this line is not needed anymore.
fixed in 71e9048
sycl/source/detail/thread_pool.hpp
@@ -91,7 +102,7 @@ class ThreadPool {
  std::lock_guard<std::mutex> Lock(MJobQueueMutex);
  MJobQueue.emplace(Func);
}

MJobsInExecution++;
Shouldn't the counter be incremented in the other version of submit as well?
You are right, I missed that we have two versions of submit and got a hang.
Fixed in 06e2608.
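The hang discussed here is easy to reproduce in a small model. Below is a rough sketch of a pool with a job counter and a `drain` call, assuming both `submit` overloads funnel through one hypothetical helper (`enqueue`) so that neither can forget to count a job. Member names `MJobQueue`, `MJobQueueMutex`, `MDoSmthOrStop`, `MStop` and `MJobsInPool` follow the diff and the review; everything else, including the body of `drain`, is an assumption rather than the actual patch.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
  std::vector<std::thread> MLaunchedThreads;
  std::queue<std::function<void()>> MJobQueue;
  std::mutex MJobQueueMutex;
  std::condition_variable MDoSmthOrStop;
  std::atomic_bool MStop{false};
  std::atomic_uint MJobsInPool{0}; // queued plus currently running jobs

  void worker() {
    std::unique_lock<std::mutex> Lock(MJobQueueMutex);
    while (true) {
      MDoSmthOrStop.wait(
          Lock, [this] { return MStop.load() || !MJobQueue.empty(); });
      if (MStop.load())
        return;
      std::function<void()> Job = std::move(MJobQueue.front());
      MJobQueue.pop();
      Lock.unlock();
      Job();
      Lock.lock();
      MJobsInPool--; // the job is finished only now
    }
  }

  // Hypothetical helper: the single place where a job is counted.
  void enqueue(std::function<void()> Job) {
    {
      std::lock_guard<std::mutex> Lock(MJobQueueMutex);
      MJobQueue.emplace(std::move(Job));
      MJobsInPool++;
    }
    MDoSmthOrStop.notify_one();
  }

public:
  explicit ThreadPool(unsigned ThreadCount) {
    for (unsigned I = 0; I < ThreadCount; ++I)
      MLaunchedThreads.emplace_back([this] { worker(); });
  }

  ~ThreadPool() {
    {
      std::lock_guard<std::mutex> Lock(MJobQueueMutex);
      MStop.store(true);
    }
    MDoSmthOrStop.notify_all();
    for (std::thread &Thread : MLaunchedThreads)
      Thread.join();
  }

  // Both versions of submit go through enqueue, so both are counted.
  template <typename T> void submit(T &&Func) {
    enqueue(std::function<void()>(std::forward<T>(Func)));
  }
  void submit(std::function<void()> &&Func) { enqueue(std::move(Func)); }

  // Returns once every submitted job has finished.
  void drain() {
    while (MJobsInPool.load() != 0)
      std::this_thread::yield();
  }
};
```

If one overload decremented through the worker without ever incrementing, the unsigned counter would wrap around and `drain` would spin for a very long time, which matches the hang described in the reply.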
LGTM
assert(
    Result &&
    "removeMemoryObject should not return false in mem object destructor");
}
There will be a warning saying that Result is unused, and it will turn into an error when building with -Werror.
good catch, fixed in ceea7f8
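Both review comments about this assert come down to how `assert` behaves when `NDEBUG` is defined: the whole expression is removed. A small sketch of the two pitfalls and one common way around them follows. `FakeScheduler` is a hypothetical stand-in for the scheduler, and the `(void)Result` cast is just one conventional option, not necessarily what ceea7f8 does.

```cpp
#include <cassert>

// Hypothetical stand-in that records how often the removal really ran.
struct FakeScheduler {
  int RemoveCalls = 0;
  bool removeMemoryObject() {
    ++RemoveCalls;
    return true;
  }
};

// Problematic: with NDEBUG defined the call disappears with the assert.
void removeInsideAssert(FakeScheduler &S) { assert(S.removeMemoryObject()); }

// Safer: the call always runs; only the check depends on NDEBUG.
void removeThenCheck(FakeScheduler &S) {
  const bool Result = S.removeMemoryObject();
  assert(Result &&
         "removeMemoryObject should not return false in mem object destructor");
  (void)Result; // avoids an unused-variable warning when asserts are off
}
```

In C++17 and later, declaring the variable as `[[maybe_unused]] const bool Result` is an alternative to the cast.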
The HIP failure with UNRESOLVED is a known issue, reported in #7634.
The second fix for post-commit (Windows symbols): #7705
…used by intel#6837) Signed-off-by: Tikhomirova, Kseniya <[email protected]>
…yet (#7908) (caused by #6837) Signed-off-by: Tikhomirova, Kseniya <[email protected]>
#6837 enabled asynchronous buffer destruction for buffers constructed without host data. However, initial fallback assert implementation in #3767 predates it and as such had to place the buffer inside `queue_impl` to avoid unintended synchronization point. I don't know if there was the same crash observed on the end-to-end test added as part of this PR prior to #3767, but it doesn't even matter because the "new" implementation is both simpler and doesn't result in a crash. I suspect that without it (with the buffer for fallback assert implementation being a data member of `sycl::queue_impl`) we had a cyclic dependency somewhere leading to resource leak and ultimately to the assert in `DeviceGlobalUSMMem::~DeviceGlobalUSMMem()`.
SYCL 2020, section 4.7.2.3 (Buffer synchronization rules), states: "A buffer can be constructed from a range (and without a hostData pointer). The memory management for this type of buffer is entirely handled by the SYCL system. The destructor for this type of buffer does not need to block, even if work on the buffer has not completed. Instead, the SYCL system frees any storage required for the buffer asynchronously when it is no longer in use in queues."
This commit implements this behavior for sycl::buffer.
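The non-blocking destructor described by the specification can be modeled without any SYCL types. The sketch below is a plain C++ illustration under stated assumptions: `Storage`, `DeferredReleaser` and `Buffer` are hypothetical names, and a simple counter of pending commands stands in for "still in use in queues". It is not the runtime's implementation.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical storage of a buffer created without host data.
struct Storage {
  explicit Storage(std::size_t Size) : Bytes(Size) {}
  std::vector<char> Bytes;
  std::atomic<int> PendingCommands{0}; // work that still uses the storage
};

// Hypothetical owner of storage whose release had to be postponed.
class DeferredReleaser {
  std::mutex MMutex;
  std::vector<std::unique_ptr<Storage>> MDeferred;

public:
  void defer(std::unique_ptr<Storage> S) {
    std::lock_guard<std::mutex> Lock(MMutex);
    MDeferred.push_back(std::move(S));
  }

  // Frees every deferred storage that no command uses anymore.
  void cleanup() {
    std::lock_guard<std::mutex> Lock(MMutex);
    MDeferred.erase(
        std::remove_if(MDeferred.begin(), MDeferred.end(),
                       [](const std::unique_ptr<Storage> &S) {
                         return S->PendingCommands.load() == 0;
                       }),
        MDeferred.end());
  }

  bool isEmpty() {
    std::lock_guard<std::mutex> Lock(MMutex);
    return MDeferred.empty();
  }
};

class Buffer {
  DeferredReleaser &MReleaser;
  std::unique_ptr<Storage> MStorage;

public:
  Buffer(DeferredReleaser &Releaser, std::size_t Size)
      : MReleaser(Releaser), MStorage(std::make_unique<Storage>(Size)) {}

  Storage &storage() { return *MStorage; }

  // Never blocks: busy storage is handed over instead of waited on.
  ~Buffer() {
    if (MStorage->PendingCommands.load() != 0)
      MReleaser.defer(std::move(MStorage));
  }
};
```

Whatever is still held by the releaser when the program ends has to be freed during shutdown, which is the extra end-of-program cleanup the description goes on to mention.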
This feature means more resources have to be released at the end of the program if there was no chance to release them earlier. This commit also implements a workaround (WA) for known issues with global object destruction, based on thread_local usage: thread_local variables are destroyed earlier than global variables, which allows us to release resources earlier.
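The ordering this workaround relies on can be observed in isolation. In the sketch below, a `thread_local` object's destructor runs when its thread exits, so by the time `join` returns the hook has already fired, well before any object with static storage duration is destroyed. All names (`ThreadExitHook`, `touchRuntime`, `runWorkerAndCount`, `EarlyReleases`) are hypothetical and only illustrate the language rule, not the runtime code.

```cpp
#include <atomic>
#include <thread>

// Counts how many thread-exit hooks have fired so far.
std::atomic<int> EarlyReleases{0};

// Hypothetical hook: its destructor is where early cleanup could happen.
struct ThreadExitHook {
  ~ThreadExitHook() { ++EarlyReleases; }
};

// Touching the thread_local creates it for the calling thread.
void touchRuntime() {
  thread_local ThreadExitHook Hook;
  (void)Hook;
}

// Runs a worker that touches the hook and reports the count after join.
int runWorkerAndCount() {
  std::thread Worker([] { touchRuntime(); });
  Worker.join(); // the worker's thread_local destructors have run by now
  return EarlyReleases.load();
}
```

For the main thread the same rule applies at program exit: objects with thread storage duration are destroyed before objects with static storage duration.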