Skip to content

fix: thread safety issues#614

Open
leakonvalinka wants to merge 5 commits into
open-feature:mainfrom
leakonvalinka:fix/thread-safety-adjustments
Open

fix: thread safety issues#614
leakonvalinka wants to merge 5 commits into
open-feature:mainfrom
leakonvalinka:fix/thread-safety-adjustments

Conversation

@leakonvalinka

@leakonvalinka leakonvalinka commented Jun 19, 2026

Copy link
Copy Markdown
Member

This PR

  • aims to improve the thread-safety of the python-sdk
  • some of these changes still need to be discussed -> see comments (mentioning potential issues and options)

Related Issues

#96

Notes

  • i'm not a python expert, please let me know if i got anything wrong :)

Signed-off-by: Lea Konvalinka <lea.konvalinka@dynatrace.com>
Signed-off-by: Lea Konvalinka <lea.konvalinka@dynatrace.com>
Signed-off-by: Lea Konvalinka <lea.konvalinka@dynatrace.com>
@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.37%. Comparing base (79bb01b) to head (22d9583).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #614      +/-   ##
==========================================
+ Coverage   98.34%   98.37%   +0.02%     
==========================================
  Files          45       45              
  Lines        2483     2518      +35     
==========================================
+ Hits         2442     2477      +35     
  Misses         41       41              
Flag Coverage Δ
unittests 98.37% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 47e06c89-abbc-478f-8916-35ebc2f331f4

📥 Commits

Reviewing files that changed from the base of the PR and between 7f14e05 and 22d9583.

📒 Files selected for processing (4)
  • openfeature/_event_support.py
  • openfeature/hook/__init__.py
  • openfeature/provider/_registry.py
  • openfeature/transaction_context/__init__.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • openfeature/hook/init.py
  • openfeature/transaction_context/init.py
  • openfeature/_event_support.py
  • openfeature/provider/_registry.py

📝 Walkthrough

Walkthrough

The PR adds locking around shared hook, provider, and transaction-context state, moves event-handler cleanup into the provider registry, and changes client provider-status checks to use the resolved provider with test coverage updates.

Changes

Shared-state hardening

Layer / File(s) Summary
Provider and handler clearing
openfeature/api.py, openfeature/provider/_registry.py, openfeature/_event_support.py
clear_providers() no longer clears event handlers directly, registry clearing now performs that cleanup, and event-support clearing uses the updated lock acquisition.
Hook list locking
openfeature/hook/__init__.py, openfeature/client.py
Global hook registration and client hook appends now mutate shared hook lists under locks.
Provider access and emit
openfeature/provider/_registry.py, openfeature/provider/__init__.py
Provider emit callback dispatch and registry lookups now use guarded reads.
Client provider status checks
openfeature/client.py, tests/test_client.py
Client flag evaluation now checks the resolved provider, and tests stub registry-backed provider statuses.

Transaction context locking

Layer / File(s) Summary
Propagator locking
openfeature/transaction_context/__init__.py
Transaction-context propagator access is synchronized with a module-level lock.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 8.82% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the main change: fixing thread-safety issues in the Python SDK.
Description check ✅ Passed The description is on-topic and summarizes the thread-safety improvements and related issue.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands.

@leakonvalinka leakonvalinka changed the title Fix/thread safety adjustments fix: thread safety issues Jun 19, 2026
Signed-off-by: Lea Konvalinka <lea.konvalinka@dynatrace.com>
@leakonvalinka leakonvalinka force-pushed the fix/thread-safety-adjustments branch from 0c9eb51 to 5d11dbc Compare June 19, 2026 10:51
@leakonvalinka leakonvalinka marked this pull request as ready for review June 25, 2026 12:32
@leakonvalinka leakonvalinka requested review from a team as code owners June 25, 2026 12:32

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openfeature/provider/_registry.py (1)

99-105: 🩺 Stability & Availability | 🔴 Critical | ⚡ Quick win

Avoid clearing event handlers while holding the registry lock.

clear_event_handlers() acquires _client_lock; concurrently, handler dispatch holds _client_lock while resolving client.provider, which now acquires ProviderRegistry._lock. This creates a registry-lock → client-lock path here and a client-lock → registry-lock path during dispatch, so clear_providers() can deadlock with provider event dispatch.

🔒 Proposed fix
     def clear_providers(self) -> None:
         self.shutdown()
         with self._lock:
             self._providers.clear()
             self._default_provider = NoOpProvider()
             self._provider_status = {
                 self._default_provider: ProviderStatus.READY,
             }
-            clear_event_handlers()
+
+        clear_event_handlers()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openfeature/provider/_registry.py` around lines 99 - 105, clear_providers()
is calling clear_event_handlers() while still holding ProviderRegistry._lock,
which creates a lock-order cycle with event dispatch. Move the
clear_event_handlers() call out of the locked section in
ProviderRegistry.clear_providers(), keeping only provider state updates under
the lock and then clearing handlers after the lock is released. Use the
ProviderRegistry._lock, clear_providers(), and clear_event_handlers() symbols to
update the flow so the registry lock is never held when acquiring the
client/event-handler lock.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openfeature/transaction_context/__init__.py`:
- Around line 26-50: The transaction-context helpers can self-deadlock because
_propagator_lock is a non-reentrant Lock while get_transaction_context and
set_transaction_context invoke user-provided TransactionContextPropagator
methods under that lock. Update the locking strategy in
openfeature/transaction_context/__init__.py so re-entrant calls from custom
propagators do not block the same thread, for example by switching
_propagator_lock to an RLock and keeping the existing serialization around
set_transaction_context_propagator, get_transaction_context, and
set_transaction_context.

---

Outside diff comments:
In `@openfeature/provider/_registry.py`:
- Around line 99-105: clear_providers() is calling clear_event_handlers() while
still holding ProviderRegistry._lock, which creates a lock-order cycle with
event dispatch. Move the clear_event_handlers() call out of the locked section
in ProviderRegistry.clear_providers(), keeping only provider state updates under
the lock and then clearing handlers after the lock is released. Use the
ProviderRegistry._lock, clear_providers(), and clear_event_handlers() symbols to
update the flow so the registry lock is never held when acquiring the
client/event-handler lock.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8e01e6e3-2b84-4dab-a926-10ec95fae6b5

📥 Commits

Reviewing files that changed from the base of the PR and between 79bb01b and 7f14e05.

📒 Files selected for processing (8)
  • openfeature/_event_support.py
  • openfeature/api.py
  • openfeature/client.py
  • openfeature/hook/__init__.py
  • openfeature/provider/__init__.py
  • openfeature/provider/_registry.py
  • openfeature/transaction_context/__init__.py
  • tests/test_client.py
💤 Files with no reviewable changes (1)
  • openfeature/api.py

Comment thread openfeature/transaction_context/__init__.py Outdated
Signed-off-by: Lea Konvalinka <lea.konvalinka@dynatrace.com>
@leakonvalinka leakonvalinka force-pushed the fix/thread-safety-adjustments branch from 7f14e05 to 22d9583 Compare June 25, 2026 12:55
@aepfli

aepfli commented Jun 25, 2026

Copy link
Copy Markdown
Member

I am curious, can we add a tool to detect this like vmlens in java? In the sense of automated testing?

# to be shut down isn't returned
# however it contributes to a potential deadlock that is currently
# still in place (clear_providers: registry's lock -> _event_support's lock;
# run_handlers_for_provider: _event_support's lock -> registry's lock)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just writing a comment here as a TODO that still needs to be resolved before merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants