Skip to content

Conversation

@jkoritzinsky
Copy link
Member

@jkoritzinsky jkoritzinsky commented Nov 4, 2025

Fixes #121287

Validation: Every run of runtime-coreclr gcstress-extra has at least one hit of the failure in the link for some out-of-proc test. This PR has no failures of the sort. This is not my favorite form of validation, but I don't think I'm going to get much better as I can't repro locally.

May also fix #121285, but I'm not sure.

@jkoritzinsky jkoritzinsky added area-VM-coreclr NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) NO-REVIEW Experimental/testing PR, do NOT review it labels Nov 4, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@jkoritzinsky
Copy link
Member Author

/azp run runtime-coreclr gcstress-extra

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

…that other threads observe the swapped-in and initialized lock before the thin-lock value is cleared.
@jkoritzinsky
Copy link
Member Author

/azp run runtime-coreclr gcstress-extra

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jkoritzinsky jkoritzinsky changed the title [NO MERGE] Testing fixes for lock Stress failures Add MemoryBarrier to SyncBlock Lock upgrade path Nov 5, 2025
@jkoritzinsky jkoritzinsky removed NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) NO-REVIEW Experimental/testing PR, do NOT review it labels Nov 5, 2025
@jkoritzinsky jkoritzinsky marked this pull request as ready for review November 5, 2025 17:07
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a memory barrier after successfully storing a lock handle in a SyncBlock to ensure proper memory ordering when transitioning from thin lock to full lock semantics.

  • Adds an explicit MemoryBarrier() call after the successful InterlockedCompareExchangeT operation in GetOrCreateLock

…GetOrCreateLock instead of trying to do something lock-free.
@jkoritzinsky jkoritzinsky requested a review from jkotas November 6, 2025 19:01
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please trigger a few GC stress runs to see whether it is fixing the problem (and not introducing any new ones)?

@jkoritzinsky
Copy link
Member Author

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jkoritzinsky
Copy link
Member Author

Looks like I need to modify the timeout handling, which makes sense (especially if a GC happens).

…ht get a GC on the thread that owns the lock, so spinning for a while makes sense.

Also remove more store barriers.
@jkoritzinsky
Copy link
Member Author

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

//
// ***************************************************************************

#ifdef MP_LOCKS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete #define MP_LOCKS from gc.h as well?

// destructor will reset the CLR debug state to what it was before entering the
// function, which will undo the BeginNoTriggerGC() call below.
STATIC_CONTRACT_GC_NOTRIGGER;
void EnterSpinLock(Volatile<DWORD>* pLock COMMA_INDEBUG(int spinTimeout = INFINITE_TIMEOUT))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not bother with the debug-only timeout argument. Delete it instead.

, m_thinLock()
, m_dwSyncIndex(indx)
#ifdef FEATURE_METADATA_UPDATER
, m_pEnCInfo(PTR_NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also delete ENTER_SPIN_LOCK /LEAVE_SPIN_LOCK macros? They are unnecessary indirection.

if (result == curValue)
break;
}
if (g_SystemInfo.dwNumberOfProcessors > 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the spin loop with YieldProcessorNormalized was unintentionally disabled at some point. We may want to add it back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

2 participants