-
Notifications
You must be signed in to change notification settings - Fork 89
Read only uses #9237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Read only uses #9237
Conversation
WalkthroughAdds per-query pool selection and read-only pool usage: Changes
Sequence DiagramsequenceDiagram
participant Caller
participant MDStore
participant PostgresTable
participant PoolMgr as Pool Manager
participant DefaultPool as default
participant ReadPool as read_only
Caller->>MDStore: read operation
MDStore->>PostgresTable: executeSQL(query, params, {preferred_pool:'read_only'})
PostgresTable->>PoolMgr: get_pool('read_only')
alt read_only exists
PoolMgr-->>PostgresTable: ReadPool
PostgresTable->>ReadPool: run query
alt success
ReadPool-->>PostgresTable: result
PostgresTable-->>MDStore: result
else error and retry_with_default_pool
PostgresTable->>DefaultPool: retry query on default (logged)
DefaultPool-->>PostgresTable: result/error
PostgresTable-->>MDStore: result/error
end
else read_only missing
PoolMgr->>PoolMgr: fallback to default
PoolMgr-->>PostgresTable: DefaultPool
PostgresTable->>DefaultPool: run query
DefaultPool-->>PostgresTable: result
PostgresTable-->>MDStore: result
end
MDStore-->>Caller: return result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Potential attention areas:
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
0a008b6 to
d699655
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/util/postgres_client.js (1)
632-642: Review the fallback behavior in get_pool.The fallback logic on lines 636-637 silently falls back to the default pool when a requested pool key is invalid:
if (key && key !== this.pool_key) { return this.get_pool(); }Concerns:
- Silent fallback masks configuration issues: If code requests a non-existent pool (e.g., typo like
'read_onyl'), it silently uses the default pool instead of failing fast.- Misleading error message: After the fallback attempt, line 639 throws an error mentioning the requested
key, but the fallback might have already succeeded withthis.pool_key.Recommendation:
Consider removing the fallback or adding a warning log when falling back, so configuration issues are visible:get_pool(key = this.pool_key) { const pool = this.client.get_pool(key); if (!pool) { - //if original get_pool was no for the default this.pool_key, try also this.pool_key + // If original get_pool was not for the default this.pool_key, try also this.pool_key if (key && key !== this.pool_key) { + dbg.warn(`Pool ${key} not found, falling back to default pool ${this.pool_key}`); return this.get_pool(); } throw new Error(`The postgres clients pool ${key} disconnected`); } return pool; }Alternatively, remove the fallback entirely to fail fast on invalid pool names.
src/server/object_services/md_store.js (1)
814-835: Route mapReduce operations to read-only pool for consistent replica distribution.The
list_objectsandlist_object_versionsmethods usemapReducefor queries with delimiters (lines 814-824 and 864-874), but these calls don't passpreferred_poolto route to the read-only pool. While thefind()operations support routing viapreferred_pool: 'read_only'(as seen elsewhere in md_store.js), the mapReduce calls should be consistent.To fix:
- Update
mapReduceListObjects()andmapReduceAggregate()in postgres_client.js to extractoptions.preferred_pooland pass it tothis.single_query()(currently they callthis.single_query(mr_q)without the pool parameter)- Pass
preferred_pool: 'read_only'in the options object when callingthis._objects.mapReduce()at lines 814-824 and 864-874 in md_store.js
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/sdk/nb.d.ts(1 hunks)src/server/bg_services/db_cleaner.js(1 hunks)src/server/object_services/md_store.js(5 hunks)src/util/postgres_client.js(5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T13:12:46.728Z
Learnt from: naveenpaul1
PR: noobaa/noobaa-core#9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:9-17
Timestamp: 2025-08-08T13:12:46.728Z
Learning: In upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js for noobaa-core, rely on structural detection (e.g., pool.mongo_info, and resource_type === 'INTERNAL') with name-prefix fallback for removing legacy mongo/internal pools, instead of depending solely on config.INTERNAL_STORAGE_POOL_NAME or config.DEFAULT_POOL_NAME. Handle multi-system stores and remove all matching pools in one change.
Applied to files:
src/server/object_services/md_store.js
🧬 Code graph analysis (1)
src/server/bg_services/db_cleaner.js (1)
src/server/object_services/md_store.js (3)
dbg(14-14)P(13-13)config(28-28)
🔇 Additional comments (10)
src/server/object_services/md_store.js (6)
779-779: LGTM! Read-only pool routing for unreclaimed objects.Correctly routes the find operation to the read-only pool, which aligns with the PR objective of offloading background read queries from the main cluster.
1127-1127: LGTM! Read-only pool routing for deleted objects query.The SQL query correctly uses the read-only pool via the options parameter, consistent with the pattern established in this PR.
1603-1603: LGTM! Read-only pool routing for chunk iteration.Appropriately routes the chunk iteration query to the read-only pool for the scrubber's background operations.
1778-1779: LGTM! Read-only pool routing for deleted chunks.Correctly adds
preferred_pool: 'read_only'to the find options for deleted chunks cleanup query.
1787-1788: LGTM! Read-only pool routing for chunk existence checks.Both
has_any_blocks_for_chunkandhas_any_parts_for_chunkcorrectly route to the read-only pool. These existence checks are read-only operations suitable for read replicas.Also applies to: 1796-1797
2039-2040: LGTM! Read-only pool routing for deleted blocks.Correctly routes the deleted blocks query to the read-only pool for background cleanup operations.
src/util/postgres_client.js (3)
252-252: Helpful debug logging for pool routing.The added debug log that includes
pg_client.options?.hostwill be useful for verifying which pool (and host) is being used for each query, aiding in troubleshooting read-only pool routing.
726-733: LGTM! Clean implementation of preferred_pool in executeSQL.The JSDoc and implementation correctly support routing queries to a preferred pool. The pattern of passing
options.preferred_pooltoget_pool()is clean and extensible.
937-937: LGTM! Consistent preferred_pool support in find and findOne.Both methods correctly pass
options.preferred_pooltoget_pool(), enabling read-only pool routing for these common query methods. The implementation is consistent with the executeSQL pattern.Also applies to: 954-954
src/sdk/nb.d.ts (1)
769-769: LGTM! TypeScript interface correctly updated.The addition of
preferred_pool?: stringto the executeSQL options parameter aligns with the implementation in postgres_client.js and properly documents the new functionality for TypeScript consumers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/util/postgres_client.js (1)
635-638: Add warning log when falling back to default pool.The silent fallback could mask misconfigurations. If the
read_onlypool is expected but unavailable, queries will silently route to the default pool, defeating the load distribution purpose.Apply this diff to add visibility:
//if original get_pool was not for the default this.pool_key, try also this.pool_key if (key && key !== this.pool_key) { + dbg.warn(`Pool ${key} not available for table ${this.name}, falling back to default pool ${this.pool_key}`); return this.get_pool(); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/util/postgres_client.js(5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build Noobaa Image
- GitHub Check: run-package-lock-validation
- GitHub Check: run-jest-unit-tests
🔇 Additional comments (4)
src/util/postgres_client.js (4)
251-252: LGTM! Helpful debug logging for pool routing verification.The host logging will be useful for confirming that queries are being routed to the correct pool (read-only vs. default).
954-954: LGTM! Consistent with find implementation.The
preferred_poolsupport infindOneis consistent with thefindimplementation. Same eventual consistency considerations apply.
937-937: LGTM. Routingfindqueries to read replicas is appropriate for background maintenance operations.The callers using
preferred_pool: 'read_only'are background operations (find_unreclaimed_objects,find_deleted_objects, chunk/block existence checks) where eventual consistency is acceptable. These align with the stated use cases (object reclaimer, scrubber) and do not impact critical read paths.
726-733: Review comment is incorrect—executeSQL is designed for standalone queries and not used within transactions.The code shows that
executeSQLandPgTransactionhave separate execution paths:
PgTransaction(lines 335-378) uses its ownquery()method with a dedicatedpg_clientconnection for transactional queriesexecuteSQL(line 731) gets its own pool connection viapool.connect()for standalone queries- All
executeSQLcall sites in md_store.js are standalone queries, never invoked within transaction contexts- The
preferred_pooloption is appropriate for directing standalone queries to specific pools (e.g., read-only)There is no mixing of transaction and standalone query execution paths, and no risk of routing queries to incorrect pools during transactions.
Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, this looks good. A few general comments\questions
- We should consider the case where the secondary is down. Maybe we should fallback to the primary?
- We should make sure that the rest of the code behaves nicely if the secondary returns stale data (e.g., object_mds that were marked as reclaimed but were not reflected yet in the secondary). From a quick look at the code, it looks ok, but I didn't look too deeply into it.
|
@alphaprinz I also think we should have a config to disable this feature and revert to the old behavior. It can be done in a different PR, so approving this in the meantime |
Signed-off-by: Amit Prinz Setter <[email protected]>
Signed-off-by: Amit Prinz Setter <[email protected]>
Signed-off-by: Amit Prinz Setter <[email protected]>
Signed-off-by: Amit Prinz Setter <[email protected]>
Signed-off-by: Amit Prinz Setter <[email protected]>
af555fb to
2f4a70a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/util/postgres_client.js (2)
639-642: Fallback logic is correct but comment could be clearer.The fallback mechanism works as intended, but the comment "try also this.pool_key" could be more explicit that it's falling back to the table's default pool key.
Consider this clearer comment:
- //if original get_pool was not for the default this.pool_key, try also this.pool_key + // If requested pool doesn't exist and wasn't the table's default pool, fallback to table's default pool if (key && key !== this.pool_key) { return this.get_pool(); }
1510-1514: Consider implementing the requested configuration option.Based on the PR comment from dannyzaken, there's a request for a configuration option to disable the new read-only behavior and revert to previous behavior. While approved for a separate PR, consider whether adding a simple config flag now (e.g.,
config.ENABLE_READ_ONLY_POOL) would prevent future breaking changes.Would you like me to generate a script to check how the
preferred_pool: 'read_only'option is being used across the codebase?
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/sdk/nb.d.ts(1 hunks)src/server/object_services/md_store.js(5 hunks)src/util/postgres_client.js(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/sdk/nb.d.ts
- src/server/object_services/md_store.js
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-08T13:12:46.728Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:9-17
Timestamp: 2025-08-08T13:12:46.728Z
Learning: In upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js for noobaa-core, rely on structural detection (e.g., pool.mongo_info, and resource_type === 'INTERNAL') with name-prefix fallback for removing legacy mongo/internal pools, instead of depending solely on config.INTERNAL_STORAGE_POOL_NAME or config.DEFAULT_POOL_NAME. Handle multi-system stores and remove all matching pools in one change.
Applied to files:
src/util/postgres_client.js
📚 Learning: 2025-08-11T06:12:12.318Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:6-22
Timestamp: 2025-08-11T06:12:12.318Z
Learning: In the noobaa-core upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js, bucket migration from the internal mongo pool to a new default pool is planned to be handled in separate future PRs with comprehensive testing, rather than being included directly in the pool removal script.
Applied to files:
src/util/postgres_client.js
📚 Learning: 2025-08-08T13:10:36.141Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1314-1317
Timestamp: 2025-08-08T13:10:36.141Z
Learning: In src/server/system_services/pool_server.js (and config usage), the constant config.INTERNAL_STORAGE_POOL_NAME has been removed from the system. Future logic should not depend on this constant and should instead use config.DEFAULT_POOL_NAME or structural markers (e.g., pool.resource_type === 'INTERNAL' or pool.mongo_info) to identify internal/mongo pools.
Applied to files:
src/util/postgres_client.js
📚 Learning: 2025-08-08T13:13:42.192Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1304-1307
Timestamp: 2025-08-08T13:13:42.192Z
Learning: In noobaa-core, do not rely on config.INTERNAL_STORAGE_POOL_NAME to detect backingstore pools. In src/server/system_services/pool_server.js, prefer checking pool.hosts_pool_info.backingstore or pool.cloud_pool_info.backingstore; fallback to prefix-matching the pool name against config.DEFAULT_POOL_NAME ('backingstores') to handle potential suffixes.
Applied to files:
src/util/postgres_client.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run-package-lock-validation
- GitHub Check: Build Noobaa Image
- GitHub Check: run-jest-unit-tests
🔇 Additional comments (6)
src/util/postgres_client.js (6)
252-252: LGTM: Debug logging added for pool routing.The debug log provides useful information for troubleshooting pool selection and retry behavior.
730-730: LGTM: Type definition updated correctly.The
preferred_pooloption is properly documented as optional in the typedef.
737-737: LGTM: Pool selection correctly uses preferred_pool option.The implementation properly routes queries to the preferred pool when specified.
941-941: LGTM: Pool selection propagated correctly in find operations.Both
find()andfindOne()correctly pass thepreferred_pooloption to route queries appropriately.Also applies to: 958-958
1753-1754: LGTM: Retry flag correctly propagated to pool instance.The flag is properly propagated from pool configuration to the pool instance, making it available in
_do_query().
1510-1514: Configuration is correct—no infinite recursion risk.The
retry_with_default_pool: truesetting for the read_only pool is properly configured. The default pool (lines 1502-1505) does not have this flag, preventing infinite recursion. The read_only pool will safely retry failed queries using the default pool as intended.
| if (pg_client.retry_with_default_pool) { | ||
| dbg.warn("retrying with default pool. q = ", q); | ||
| return _do_query(PostgresClient.instance().get_pool('default'), q, transaction_counter); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Risk of infinite recursion if default pool also has retry flag set.
If the default pool's retry_with_default_pool is set to true, this creates infinite recursion. The code should either:
- Explicitly pass
retry_with_default_pool: falsewhen getting the default pool, or - Add a guard to prevent retrying if already using the default pool.
Additionally, the original error details are lost before the retry. Consider logging the original error with more context.
Apply this diff to prevent infinite recursion:
- if (pg_client.retry_with_default_pool) {
+ if (pg_client.retry_with_default_pool && pg_client !== PostgresClient.instance().get_pool('default')) {
dbg.warn("retrying with default pool. q = ", q);
return _do_query(PostgresClient.instance().get_pool('default'), q, transaction_counter);
}Or alternatively, ensure the default pool instance never has retry_with_default_pool set:
if (pg_client.retry_with_default_pool) {
- dbg.warn("retrying with default pool. q = ", q);
+ dbg.warn("retrying with default pool. original error = ", err.message, ", q = ", q);
- return _do_query(PostgresClient.instance().get_pool('default'), q, transaction_counter);
+ const default_pool = PostgresClient.instance().get_pool('default');
+ // Ensure we don't retry infinitely if default pool also has retry flag
+ const safe_pool = { ...default_pool, retry_with_default_pool: false };
+ return _do_query(safe_pool, q, transaction_counter);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (pg_client.retry_with_default_pool) { | |
| dbg.warn("retrying with default pool. q = ", q); | |
| return _do_query(PostgresClient.instance().get_pool('default'), q, transaction_counter); | |
| } | |
| if (pg_client.retry_with_default_pool && pg_client !== PostgresClient.instance().get_pool('default')) { | |
| dbg.warn("retrying with default pool. q = ", q); | |
| return _do_query(PostgresClient.instance().get_pool('default'), q, transaction_counter); | |
| } |
🤖 Prompt for AI Agents
In src/util/postgres_client.js around lines 271 to 274, the retry branch
unconditionally calls _do_query against the default pool which can cause
infinite recursion if the default pool also has retry_with_default_pool=true and
loses the original error context; change the retry to (a) call
get_pool('default') with retry_with_default_pool explicitly set to false or (b)
add a guard that detects when the current pool is already the default and skip
retrying, and also log the original error before attempting the retry so the
original error details are preserved for debugging.
Describe the Problem
Use new 'read only' pool (connected to cnpg replicated read-only cluster) in order to alleviate load from main, writable pg cluster.
Explain the Changes
-db cleancer
-object reclaimer's first find
-scrubber's first find
Issues: Fixed #xxx / Gap #xxx
Testing Instructions:
Summary by CodeRabbit
New Features
Improvements
Observability