[SYCL] Defer buffer release when no host memory to be updated #6837

KseniyaTikhomirova · 2022-09-21T14:37:30Z

SYCL 2020, section 4.7.2.3 (Buffer synchronization rules), states that "A buffer can be constructed from a range (and without a hostData pointer). The memory management for this type of buffer is entirely handled by the SYCL system. The destructor for this type of buffer does not need to block, even if work on the buffer has not completed. Instead, the SYCL system frees any storage required for the buffer asynchronously when it is no longer in use in queues."
This commit implements this behavior for sycl::buffer.

This feature introduces more resources that must be released at the end of the program if there was no chance to release them earlier. This commit also implements a workaround (WA) for known issues with global object destruction, based on thread_local usage: thread_local variables are destroyed earlier than global variables, which allows us to release resources earlier.

romanovvlad · 2022-12-06T13:44:48Z

sycl/source/detail/global_handler.cpp

  std::unique_ptr<ResourceHandler> &MObj;
+  static std::atomic_bool MReleaseCalled;


Could you please clarify why this is needed?

Sure, it was used to handle the following case:
the main thread exits and MCounter becomes equal to 0, so we call releaseResources and start joining the thread pool threads. Those threads pass the !MCounter and MObj checks, since the scheduler is still alive, and call releaseResources again, which makes no sense.
However, after your question I think I can sort it out without the extra variable MReleaseCalled, like this:
    if (!MIncrementCounter)
      return; // no actions at all for thread pool threads
    MCounter--;
    if (!MCounter && MObj)
      MObj->releaseResources();

Will update patch shortly.
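For readers following the thread, the counter logic proposed above can be sketched in plain C++. This is only an illustration under assumptions: `UsageCounter`, `ResourceHandler`, and `simulateProgramExit` are hypothetical names, scoped objects stand in for the per-thread instances, and the decrement and zero check are folded into one atomic operation rather than the two steps in the snippet.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the object whose resources are released.
struct ResourceHandler {
  int ReleaseCalls = 0;
  void releaseResources() { ++ReleaseCalls; }
};

// Illustrative usage counter. In the runtime one instance would live per
// thread; here plain scoped objects play that role.
class UsageCounter {
public:
  UsageCounter(bool IncrementCounter, ResourceHandler *Obj)
      : MIncrementCounter(IncrementCounter), MObj(Obj) {
    if (MIncrementCounter)
      ++MCounter;
  }

  ~UsageCounter() {
    if (!MIncrementCounter)
      return; // no actions at all for thread pool threads
    // Single atomic decrement-and-test instead of two separate steps.
    if (--MCounter == 0 && MObj)
      MObj->releaseResources();
  }

private:
  static std::atomic_uint MCounter;
  bool MIncrementCounter;
  ResourceHandler *MObj;
};

std::atomic_uint UsageCounter::MCounter{0};

// Models a user thread and a thread pool thread going away in turn and
// returns how many times releaseResources was invoked.
int simulateProgramExit() {
  ResourceHandler Handler;
  {
    UsageCounter UserThread(/*IncrementCounter=*/true, &Handler);
    {
      UsageCounter PoolThread(/*IncrementCounter=*/false, &Handler);
    } // pool thread ends: counter untouched, no release
  }   // user thread ends: counter reaches 0, release runs once
  return Handler.ReleaseCalls;
}
```

Because pool threads never increment the counter, their destructors return early and the release happens exactly once, without a separate "release already called" flag.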

romanovvlad · 2022-12-06T17:35:23Z

sycl/source/detail/global_handler.cpp

+// MObj and MReleaseCalled is extra protection needed to handle case when main
+// thread finished but thread_pool is still running and we will join that
+// threads in releaseResources call.


Suggested change

// MObj and MReleaseCalled is extra protection needed to handle case when main
// thread finished but thread_pool is still running and we will join that
// threads in releaseResources call.

fixed in 71e9048

romanovvlad · 2022-12-06T17:36:21Z

sycl/source/detail/global_handler.cpp

@@ -47,7 +80,24 @@ T &GlobalHandler::getOrCreate(InstWithLock<T> &IWL, Types... Args) {
  return *IWL.Inst;
 }

-Scheduler &GlobalHandler::getScheduler() { return getOrCreate(MScheduler); }
+void GlobalHandler::attachScheduler(Scheduler *Scheduler) {
+  // The method is for testing purposes. Do not protect with lock since


Suggested change

-  // The method is for testing purposes. Do not protect with lock since
+  // The method is used in unittests only. Do not protect with lock since

:-) fixed in 71e9048

romanovvlad · 2022-12-06T17:38:23Z

sycl/source/detail/global_handler.cpp

@@ -141,9 +191,18 @@ void GlobalHandler::unloadPlugins() {
  GlobalHandler::instance().getPlugins().clear();
 }

+void GlobalHandler::drainThreadPool() {
+  if (MHostTaskThreadPool.Inst)


Shouldn't we lock MHostTaskThreadPool.Lock here?

Please correct me if my understanding is wrong, but I thought the lock in InstWithLock exists to protect the getOrCreate call, avoiding a data race on object creation. The drain call is made from releaseResources, which is called on program exit, so it does not introduce any data race of that kind.

romanovvlad · 2022-12-06T17:46:40Z

sycl/source/detail/scheduler/scheduler.hpp

@@ -444,9 +450,13 @@ class Scheduler {
  const QueueImplPtr &getDefaultHostQueue() const { return DefaultHostQueue; }

  static MemObjRecord *getMemObjRecord(const Requirement *const Req);
+  // Virtual for testing purposes only


Suggested change

// Virtual for testing purposes only

Is it still relevant?

nope, fixed in 71e9048

romanovvlad · 2022-12-06T17:47:11Z

sycl/source/detail/scheduler/scheduler.hpp


  Scheduler();
  ~Scheduler();
+  void releaseResources();
+  inline bool isDeferredMemObjectsEmpty();


Could you please clarify why inline is needed here?

fixed in 71e9048
you are right, compiler will decide

romanovvlad · 2022-12-06T17:48:54Z

sycl/source/detail/sycl_mem_obj_t.cpp

@@ -91,7 +91,7 @@ void SYCLMemObjT::updateHostMemory() {
  // If we're attached to a memory record, process the deletion of the memory
  // record. We may get detached before we do this.
  if (MRecord)
-    Scheduler::getInstance().removeMemoryObject(this);
+    assert(Scheduler::getInstance().removeMemoryObject(this));


Please do not put assert around this call: in a build without asserts, removeMemoryObject would not be called at all.

fixed in 71e9048
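The concern about wrapping the call in assert can be shown with a small sketch. It assumes a hypothetical `FakeScheduler` and a `RELEASE_ASSERT` macro that mimics how the standard `assert` expands when `NDEBUG` is defined (the expression is discarded and never evaluated); neither name comes from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for a call with a required side effect.
struct FakeScheduler {
  int RemovedCount = 0;
  bool removeMemoryObject() {
    ++RemovedCount;
    return true;
  }
};

// Mimics assert() in a build with NDEBUG defined: the argument is dropped
// by the preprocessor, so it is never evaluated.
#define RELEASE_ASSERT(expr) ((void)0)

// The call lives inside the assert, so it vanishes in a release build.
int callInsideAssert() {
  FakeScheduler S;
  RELEASE_ASSERT(S.removeMemoryObject());
  return S.RemovedCount;
}

// The call is made unconditionally; only the check is compiled out.
int callOutsideAssert() {
  FakeScheduler S;
  const bool Removed = S.removeMemoryObject();
  RELEASE_ASSERT(Removed);
  return Removed ? S.RemovedCount : -1;
}
```

In the first function the memory object is never removed once asserts are disabled; in the second the removal always happens and only the diagnostic is lost.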

romanovvlad · 2022-12-06T17:50:48Z

sycl/source/detail/thread_pool.hpp

@@ -30,13 +30,16 @@ class ThreadPool {
  std::mutex MJobQueueMutex;
  std::condition_variable MDoSmthOrStop;
  std::atomic_bool MStop;
+  std::atomic_uint MJobsInExecution;


Suggested change

-  std::atomic_uint MJobsInExecution;
+  std::atomic_uint MNumOfJobs;

Or maybe MJobsInPool

fixed in 71e9048

romanovvlad · 2022-12-06T17:51:11Z

sycl/source/detail/thread_pool.hpp

    std::unique_lock<std::mutex> Lock(MJobQueueMutex);
-
+    std::thread::id ThisThreadId = std::this_thread::get_id();


Suggested change

std::thread::id ThisThreadId = std::this_thread::get_id();

It seems this line is not needed anymore.

fixed in 71e9048

romanovvlad · 2022-12-06T17:53:46Z

sycl/source/detail/thread_pool.hpp

@@ -91,7 +102,7 @@ class ThreadPool {
      std::lock_guard<std::mutex> Lock(MJobQueueMutex);
      MJobQueue.emplace(Func);
    }
-
+    MJobsInExecution++;


Shouldn't the counter be incremented in the other version of submit as well?

you are right, I missed that we have two versions and got a hang :(
fixed in 06e2608
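A minimal sketch of the issue discussed here, not the actual ThreadPool code: `TinyPool`, `MJobsInPool`, `enqueue`, and `runJobsThroughBothOverloads` are assumed names. Both `submit` overloads funnel into one helper that bumps the job counter, so `drain` can rely on the counter reaching zero only when every submitted job has finished.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Illustrative pool; names are assumptions, not the runtime's real class.
class TinyPool {
  std::vector<std::thread> MThreads;
  std::queue<std::function<void()>> MJobQueue;
  std::mutex MJobQueueMutex;
  std::condition_variable MDoSmthOrStop;
  bool MStop = false;              // guarded by MJobQueueMutex
  std::atomic_uint MJobsInPool{0}; // queued plus currently running jobs

  void worker() {
    for (;;) {
      std::function<void()> Job;
      {
        std::unique_lock<std::mutex> Lock(MJobQueueMutex);
        MDoSmthOrStop.wait(Lock,
                           [this] { return MStop || !MJobQueue.empty(); });
        if (MJobQueue.empty())
          return; // stop requested and nothing left to run
        Job = std::move(MJobQueue.front());
        MJobQueue.pop();
      }
      Job();
      --MJobsInPool; // the job counts as finished only now
    }
  }

  void enqueue(std::function<void()> Job) {
    {
      std::lock_guard<std::mutex> Lock(MJobQueueMutex);
      ++MJobsInPool; // counted on every path into the queue
      MJobQueue.push(std::move(Job));
    }
    MDoSmthOrStop.notify_one();
  }

public:
  explicit TinyPool(unsigned ThreadCount) {
    for (unsigned I = 0; I < ThreadCount; ++I)
      MThreads.emplace_back([this] { worker(); });
  }

  ~TinyPool() {
    {
      std::lock_guard<std::mutex> Lock(MJobQueueMutex);
      MStop = true;
    }
    MDoSmthOrStop.notify_all();
    for (std::thread &T : MThreads)
      T.join();
  }

  template <typename T> void submit(T &&Func) {
    enqueue([F = std::forward<T>(Func)]() mutable { F(); });
  }

  void submit(std::function<void()> &&Func) { enqueue(std::move(Func)); }

  // Blocks until every submitted job has finished.
  void drain() {
    while (MJobsInPool.load() != 0)
      std::this_thread::yield();
  }
};

// Submits one job through each overload and returns how many ran.
int runJobsThroughBothOverloads() {
  std::atomic_int Done{0};
  TinyPool Pool(2);
  Pool.submit([&Done] { ++Done; });                        // template overload
  Pool.submit(std::function<void()>([&Done] { ++Done; })); // std::function overload
  Pool.drain();
  return Done.load();
}
```

If one overload skipped the increment while the worker still decremented after each job, the unsigned counter would wrap and `drain` would spin indefinitely, which is consistent with the hang mentioned above. Routing both overloads through a single helper is one way to make that mistake harder to repeat.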

romanovvlad

LGTM

romanovvlad · 2022-12-07T09:36:19Z

sycl/source/detail/sycl_mem_obj_t.cpp

+    assert(
+        Result &&
+        "removeMemoryObject should not return false in mem object destructor");
+  }


There will be a warning saying that Result is unused in builds where assert is compiled out, and it will turn into an error when building with -Werror.

good catch, fixed in ceea7f8
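The thread does not show how ceea7f8 silences the warning, so the following is only one common idiom, sketched with a hypothetical free function in place of `Scheduler::removeMemoryObject`: mark the variable `[[maybe_unused]]` (C++17), or equivalently cast it to `void` after the assert.

```cpp
#include <cassert>

// Hypothetical stand-in for Scheduler::removeMemoryObject.
bool removeMemoryObject(int &RemovedCount) {
  ++RemovedCount;
  return true;
}

int destroyMemObject() {
  int RemovedCount = 0;
  // The call always runs; only the check below depends on NDEBUG.
  // [[maybe_unused]] keeps -Wunused-variable quiet when assert expands
  // to nothing.
  [[maybe_unused]] bool Result = removeMemoryObject(RemovedCount);
  assert(Result &&
         "removeMemoryObject should not return false in mem object destructor");
  return RemovedCount;
}
```

With this shape the removal happens in every build configuration and a release build with -Werror still compiles cleanly.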

KseniyaTikhomirova · 2022-12-07T10:20:43Z

The HIP failures with UNRESOLVED are a known issue, reported in #7634.

KseniyaTikhomirova · 2022-12-08T18:29:23Z

The second post-commit fix (win symbols): #7705

intel#6837 enabled asynchronous buffer destruction for buffers constructed without host data. However, the initial fallback assert implementation in intel#3767 predates it and as such had to place the buffer inside `queue_impl` to avoid an unintended synchronization point. I don't know if the same crash was observed on the end-to-end test added as part of this PR prior to intel#3767, but it doesn't even matter because the "new" implementation is both simpler and doesn't result in a crash. I suspect that without it (with the buffer for the fallback assert implementation being a data member of `sycl::queue_impl`) we had a cyclic dependency somewhere, leading to a resource leak and ultimately to the assert in `DeviceGlobalUSMMem::~DeviceGlobalUSMMem()`.

KseniyaTikhomirova added 20 commits September 20, 2022 05:37

[SYCL] Mark mem object which may have not blocking dtor according to …

0601210

…Spec2020 Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Add draft how to delay buffer_impl release

aff3be6

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Update symbols for non-breaking change

1195b59

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Update abi test vtable.cpp - non-breaking change

8d05802

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Update SYCL_MINOR_VERSION for non-breaking ABI change

b54b8e4

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Remove ABI break

965a015

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Update symbols to new version

27ccbff

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Tiny rename

9540fe0

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Revert "Update abi test vtable.cpp - non-breaking change"

c00c7cb

This reverts commit 8d05802.

Remove isDefault method, reimplemented

661dace

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Fix symbols again

6615db3

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Add handling of deferred mem objects release

8174dc3

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Remove unused function and restore XPTI traces collection

d55405e

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Add skeleton for unit test

bb2c4fb

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Fix shared_ptr use_count check

5db9e85

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Test draft

53a1892

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

[SYCL] Align usm_allocator ctor and operators with SYCCL2020

4b0a3fa

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Update attach scheduler logic

8daea20

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Make cleanup iterative

c855f13

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Fix test utils impl error

8dbcd1c

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

KseniyaTikhomirova commented Sep 28, 2022

sycl/CMakeLists.txt

sycl/unittests/buffer/CMakeLists.txt

KseniyaTikhomirova added 5 commits September 28, 2022 11:32

Add other tests for buffer contructors

0f61c64

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Other tests for high level buffer destruction deferring logic

ddf215b

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Add unittest for waitForRecordToFinish

23bea82

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Remove debug flags uploaded by mistake

aa41d76

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Merge branch 'sycl' into buff_detach

e296d03

KseniyaTikhomirova commented Oct 3, 2022

sycl/include/sycl/buffer.hpp

Fix clang-format

179c472

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

KseniyaTikhomirova marked this pull request as ready for review October 3, 2022 20:18

KseniyaTikhomirova requested a review from a team as a code owner October 3, 2022 20:18

Code cleanup

c6d5dc7

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

romanovvlad reviewed Dec 6, 2022

Code cleanup Part 2

a0b37ef

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

KseniyaTikhomirova commented Dec 6, 2022

sycl/plugins/hip/pi_hip.cpp

sycl/source/detail/event_impl.cpp

KseniyaTikhomirova added 3 commits December 6, 2022 06:04

Revert "Try to align hip context destruction handling with cuda WA"

619ee4e

This reverts commit 1c62d08.

Return cleanup deferred buffers to cleanupCommands call

3187f0a

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Remove unnecessary variable in ObjectRefCounter

dbe88e2

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

romanovvlad reviewed Dec 6, 2022

KseniyaTikhomirova added 2 commits December 6, 2022 12:41

Fix hang

06e2608

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Fix comments

71e9048

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

KseniyaTikhomirova requested a review from romanovvlad December 6, 2022 21:02

Merge branch 'sycl' into buff_detach

a89e577

romanovvlad reviewed Dec 7, 2022

Prevent warning as error for release build

ceea7f8

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

KseniyaTikhomirova requested a review from romanovvlad December 7, 2022 10:48

romanovvlad approved these changes Dec 7, 2022

wprotectMDeferredMemObjRelease modification with mutex

1f201a9

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

romanovvlad merged commit 894ce25 into intel:sycl Dec 8, 2022

KseniyaTikhomirova added a commit to KseniyaTikhomirova/llvm that referenced this pull request Dec 8, 2022

[SYCL] Fix post commit failure for intel#6837

0b1fa61

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

steffenlarsen pushed a commit that referenced this pull request Dec 8, 2022

[SYCL] Fix post commit failure for #6837 (#7703)

9f9503f

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

pvchupin pushed a commit that referenced this pull request Dec 8, 2022

[SYCL] Post commit fix for #6837 (win symbols) (#7705)

a534b94

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

aelovikov-intel mentioned this pull request Dec 12, 2022

Multiple intel/llvm-test-suite reduction tests failing with L0 on Windows #7743

Closed

KseniyaTikhomirova added a commit to KseniyaTikhomirova/llvm that referenced this pull request Jan 3, 2023

Fix segfault on program exit when user thread is not finished yet (ca…

7f10781

…used by intel#6837) Signed-off-by: Tikhomirova, Kseniya <[email protected]>

KseniyaTikhomirova mentioned this pull request Jan 3, 2023

[SYCL] Fix segfault on program exit when user thread is not finished yet #7908

Merged

bader pushed a commit that referenced this pull request Jan 6, 2023

[SYCL] Fix segfault on program exit when user thread is not finished …

ac58dd3

…yet (#7908) (caused by #6837) Signed-off-by: Tikhomirova, Kseniya <[email protected]>

aelovikov-intel mentioned this pull request Jan 29, 2024

[SYCL] Fix resource leak related to SYCL_FALLBACK_ASSERT #12532

Merged
