-
Notifications
You must be signed in to change notification settings - Fork 575
UN-2970 [FIX] Handle PostgreSQL race condition in concurrent table creation #1735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
UN-2970 [FIX] Handle PostgreSQL race condition in concurrent table creation #1735
Conversation
…eation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds a constant and helpers to detect Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler
participant Postgres
Client->>Handler: execute_query("CREATE TABLE IF NOT EXISTS ...")
Handler->>Postgres: send SQL
alt Postgres creates table successfully
Postgres-->>Handler: success
Handler-->>Client: success
else Postgres raises DuplicateTable or UniqueViolation
Postgres-->>Handler: error (DuplicateTable/UniqueViolation)
Handler->>Handler: rollback transaction
alt SQL matches IF NOT EXISTS or constraint race detected
Handler-->>Client: log info + return (suppress error)
else
Handler-->>Client: log error + re-raise
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro Cache: Disabled due to Reviews > Disable Cache setting Knowledge base: Disabled due to 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py (2)
38-52: Solid handling of the PostgreSQL race condition.The approach is correct: catching
DuplicateTable, rolling back the aborted transaction, and returning early makes the operation idempotent.Optional improvement: Consider using
startswithinstead ofinto avoid false positives if the string appears elsewhere in the query (e.g., as a literal in an INSERT).🔧 Suggested improvement
- if CREATE_TABLE_IF_NOT_EXISTS in sql_query.upper(): + if sql_query.strip().upper().startswith(CREATE_TABLE_IF_NOT_EXISTS):
53-66: Race condition handling is appropriate.This correctly addresses the
pg_type_typname_nsp_indexUniqueViolation that can occur during concurrent table creation.Regarding the static analysis hint at line 65: While
logging.exceptionwould include the traceback, the existing handlers in this file (lines 68, 71, 74, 77, etc.) all uselogger.error. Since the exception is re-raised immediately, the traceback is preserved. Keepinglogger.errormaintains consistency with the rest of the file.Same
startswithsuggestion applies here for line 57.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py
🧰 Additional context used
🪛 Ruff (0.14.10)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py
51-51: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
65-65: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (1)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py (1)
18-19: LGTM!Good use of a module-level constant to avoid string repetition and ensure consistent matching.
- Extract _is_create_table_if_not_exists helper function - Use startswith() instead of in for more precise matching - Remove unnecessary else blocks to reduce complexity Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py:
- Around line 43-69: The UniqueViolation handling in the block catching
PsycopgError.UniqueViolation is too broad; change it to only suppress when
_is_create_table_if_not_exists(sql_query) is true AND the exception's constraint
name equals "pg_type_typname_nsp_index" (the pg_type race), otherwise call
engine.rollback() and re-raise; also replace logger.error(...) with
logger.exception(...) to preserve traceback when logging the suppressed case and
when logging before re-raising. Ensure any path that re-raises calls
engine.rollback() first to avoid leaving the transaction aborted; use the
existing symbols PsycopgError.UniqueViolation, _is_create_table_if_not_exists,
table_name, sql_query, engine, and logger to locate and update the code.
🧹 Nitpick comments (1)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py (1)
18-23: HardenCREATE TABLE IF NOT EXISTSdetection (comments/whitespace/modifiers).
strip().upper().startswith(...)is easy to bypass with leading SQL comments, extra whitespace/newlines, orCREATE TEMP/UNLOGGED TABLE IF NOT EXISTS .... If your SQL generator ever emits those variants, the race handling won’t trigger.Proposed refactor (more tolerant regex)
+import re import logging from typing import Any from psycopg2 import errors as PsycopgError @@ CREATE_TABLE_IF_NOT_EXISTS = "CREATE TABLE IF NOT EXISTS" +_CREATE_TABLE_IF_NOT_EXISTS_RE = re.compile( + r"^\s*(?:/\*.*?\*/\s*)*(?:--[^\n]*\n\s*)*" + r"CREATE\s+(?:TEMP|TEMPORARY|UNLOGGED\s+)?TABLE\s+IF\s+NOT\s+EXISTS\b", + re.IGNORECASE | re.DOTALL, +) + def _is_create_table_if_not_exists(sql_query: str) -> bool: """Check if query is a CREATE TABLE IF NOT EXISTS statement.""" - return sql_query.strip().upper().startswith(CREATE_TABLE_IF_NOT_EXISTS) + return bool(_CREATE_TABLE_IF_NOT_EXISTS_RE.match(sql_query))Please sanity-check this against the actual emitted CREATE TABLE statements in your workflows.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py
🧰 Additional context used
🪛 Ruff (0.14.10)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py
55-55: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
68-68: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py (3)
43-56: Uselogger.exceptionfor better tracebacks when re-raising.When logging before re-raising an exception,
logger.exceptionautomatically includes the traceback, aiding debugging.Suggested fix
- logger.error(f"DuplicateTable error: {e.pgerror}") + logger.exception(f"DuplicateTable error: {e.pgerror}")
62-65: Theconstraint is Nonefallback may be overly permissive.When
constraint_nameis unavailable (e.g., due to psycopg2/PostgreSQL version differences), anyUniqueViolationduring aCREATE TABLE IF NOT EXISTSwill be suppressed. While this is likely safe for pure DDL statements, verify this assumption holds for your use cases.Consider logging the raw exception details when
constraint is Noneto aid future debugging:Optional: Add trace logging when constraint is unknown
is_pg_type_race = ( constraint == "pg_type_typname_nsp_index" or constraint is None ) + if constraint is None: + logger.debug( + f"UniqueViolation constraint_name unavailable, assuming pg_type race. " + f"Full error: {e}" + )
73-74: Uselogger.exceptionfor better tracebacks when re-raising.Same as the
DuplicateTablehandler—uselogger.exceptionto capture the full traceback.Suggested fix
- logger.exception(f"UniqueViolation error (constraint={constraint}): {e.pgerror}") + logger.exception(f"UniqueViolation error (constraint={constraint}): {e.pgerror}")Wait, let me fix this:
- logger.error(f"UniqueViolation error (constraint={constraint}): {e.pgerror}") + logger.exception(f"UniqueViolation error (constraint={constraint}): {e.pgerror}")
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py
🧰 Additional context used
🪛 Ruff (0.14.10)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py
55-55: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
73-73: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (1)
unstract/connectors/src/unstract/connectors/databases/psycopg_handler.py (1)
18-23: LGTM!Clean extraction of the constant and helper function. The case-insensitive matching via
.upper()ensures SQL queries are properly detected regardless of case formatting.
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|



What
DuplicateTableandUniqueViolationerrors inPsycoPgHandler.execute_query()Why
CREATE TABLE IF NOT EXISTSis not atomic at the catalog level, causing a race conditionpg_type_typname_nsp_index)psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "pg_type_typname_nsp_index"How
PsycopgError.DuplicateTableandPsycopgError.UniqueViolationexceptionsCREATE TABLE IF NOT EXISTSqueries only, suppress the error (rollback and return)IF NOT EXISTSclauseCan this PR break any existing features. If yes, please list possible items. If no, please explain why.
CREATE TABLE IF NOT EXISTSqueries are affected, and they are idempotent by designDatabase Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.