Skip to content

Conversation

@richa-rochna
Copy link

Liquibase throws IllegalArgumentException "Multiple entries with same key: enablearrow=0 and enablearrow=0"

…iple entries with same key: enablearrow=0 and enablearrow=0"
@coderabbitai
Copy link

coderabbitai bot commented Dec 1, 2025

📝 Walkthrough

Walkthrough

Reworked URL handling in DatabricksConnection: deduplicate URL parameters (case-insensitive), avoid pre-inserting UserAgentEntry and EnableArrow into driver properties, append those parameters only when absent, ensure semicolon-terminated URLs, and add helpers urlContainsParam and deduplicateUrlParameters.

Changes

Cohort / File(s) Summary
URL handling & helpers
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java
Refactored open and getConnectionUrl to deduplicate URL parameters up front, ensure a trailing semicolon, conditionally append UserAgentEntry and EnableArrow only if absent, and deduplicate again after modifications. Removed pre-insertion of those properties into driverProperties. Added protected static helpers urlContainsParam(String, String) and deduplicateUrlParameters(String).
Minor signature/formatting tweaks
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java
Non-semantic formatting/override-signature adjustments across several methods (no public API change beyond URL handling and new helpers).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify deduplicateUrlParameters handles case-insensitive keys, different separators (;, &, ?), and preserves the first occurrence.
  • Test urlContainsParam against varied casing and parameter-placement scenarios.
  • Review open and getConnectionUrl code paths to ensure semicolon termination and no duplicate addition to driverProperties.
  • Check that formatting/signature tweaks didn't alter method contracts or override behavior.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 18.75% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title directly references issue #340 and clearly describes the main problem being fixed: duplicate URL parameters causing IllegalArgumentException.
Description check ✅ Passed The description accurately identifies the core issue being addressed: the IllegalArgumentException related to duplicate 'enablearrow' parameters.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (1)

54-55: Parameters are set redundantly in both driverProperties and the URL, creating undefined driver behavior.

The code sets UserAgentEntry and EnableArrow in the driverProperties object (lines 54-55) and then appends the same parameters to the URL string (line 64), both passed to driver.connect() at line 73. Since JDBC driver behavior for duplicate parameters from different sources is driver-specific and not well-defined, this could cause conflicts or unexpected precedence with the Databricks JDBC driver.

Additionally, the getConnectionUrl() method (lines 160-174) also appends the same parameters, creating further inconsistency.

Minor inconsistency: Line 64 appends EnableArrow=0 without a trailing semicolon, while lines 167 and 170 include the trailing semicolon. This creates inconsistent URL formatting.

Recommendation: Choose a single source (either driverProperties OR URL parameters) rather than both, or verify that the Databricks JDBC driver explicitly handles this redundancy without conflicts.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 536e179 and dc643f6.

📒 Files selected for processing (1)
  • src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (3 hunks)
🔇 Additional comments (1)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (1)

190-210: LGTM with one note.

The refactored URL construction logic correctly uses the urlContainsParam helper to avoid duplicate parameters. The semicolon handling here (both parameters get trailing semicolons on lines 203 and 206) is consistent and correct.

Note: This method is more consistent with semicolon usage than the open method—see the earlier comment about line 70.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between dc643f6 and 08e6404.

📒 Files selected for processing (1)
  • src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (3 hunks)
🔇 Additional comments (1)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (1)

139-163: LGTM! Previous review feedback addressed.

The exact parameter name matching logic correctly addresses the false positive issue raised in the previous review. The implementation properly extracts the parameter name (left of =) and performs case-insensitive exact matching, avoiding the prefix-matching issue.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (1)

197-220: getConnectionUrl is safe now; consider slightly clearer handling of null/empty URLs

The refactor removes the NPE/SIOOB risks and reuses urlContainsParam(...) to avoid introducing duplicate keys, which is good. One minor behavioral nuance:

  • When con.getMetaData().getURL() returns null, you normalize it to "" and then synthesize a URL like ";UserAgentEntry=Liquibase;EnableArrow=0;". That may be a bit surprising to callers that interpret “no URL” differently from “constructed URL with only tool-specific params”.

If you want to keep behavior closer to the underlying metadata, you could early‑return "" (or rawUrl) when it’s null/empty, and only do the appends when you actually have a base URL:

    String rawUrl = con.getMetaData().getURL();

-   // Handle null or empty URL according to JDBC spec
-   if (rawUrl == null) {
-       rawUrl = "";
-   }
-
-   // Check for ; characters
-   StringBuilder urlBuilder = new StringBuilder(rawUrl);
+   if (rawUrl == null || rawUrl.isEmpty()) {
+       return rawUrl == null ? "" : rawUrl;
+   }
+
+   // Check for ; characters
+   StringBuilder urlBuilder = new StringBuilder(rawUrl);

    // Ensure there's a terminating semicolon for consistent parsing
    if (rawUrl.isEmpty() || rawUrl.charAt(rawUrl.length() - 1) != ';') {
        urlBuilder.append(";");
    }

Not a blocker, just a small clarity/semantics improvement if you care about distinguishing “unknown URL” from “augmented URL”.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 08e6404 and dad0029.

📒 Files selected for processing (1)
  • src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (3 hunks)
🔇 Additional comments (2)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (2)

64-73: URL parameter de‑duplication in open looks correct and consistent

The new StringBuilder usage plus urlContainsParam(...) checks correctly avoid appending duplicate UserAgentEntry/EnableArrow keys while preserving the existing behavior of always setting the driver properties. The added semicolon for EnableArrow=0; also fixes the earlier separator inconsistency. No issues from my side here.


139-163: urlContainsParam helper is robust against case and prefix collisions

The helper now:

  • Normalizes the URL (terminating ;, removes spaces),
  • Splits on ;,
  • Compares only the key portion before the first = in a case‑insensitive way.

This avoids the earlier prefix-matching pitfall (foo vs foobar) and correctly handles EnableArrow in any casing. The implementation looks sound and side‑effect free.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (3)

18-23: Align con field generalization with openConn to avoid unnecessary downcast

You’ve generalized the stored connection to java.sql.Connection, but openConn still downcasts the result of driverObject.connect(...) to S42Connection. That downcast is now unnecessary for the field and keeps this path tightly coupled to the JDBC42-specific implementation, which can become brittle if the driver type changes.

You can keep the S42-specific handling isolated in getUnderlyingSparkConnection and let openConn store whatever Connection the driver returns:

-    public void openConn(String url, Driver driverObject, Properties driverProperties) throws DatabaseException {
+    public void openConn(String url, Driver driverObject, Properties driverProperties) throws DatabaseException {
         String sanitizedUrl = sanitizeUrl(url);
         try {
             Scope.getCurrentScope().getLog(this.getClass()).info("opening connection " + sanitizedUrl);
-            this.con = (S42Connection) driverObject.connect(url, driverProperties);
+            this.con = driverObject.connect(url, driverProperties);
             if (this.con == null) {
                 Scope.getCurrentScope().getLog(this.getClass()).severe("Connection could not be created");
                 throw new DatabaseException("Connection could not be created to " + sanitizedUrl + " with driver " + driverObject.getClass().getName() + ".  Possibly the wrong driver for the given database URL");
             }

getUnderlyingSparkConnection already safely checks instanceof S42Connection, so this remains compatible while making the connection-opening logic more robust.

Also applies to: 92-104


125-151: Optional: consider case-insensitive lookup for consistency with other helpers

getUrlParamValue currently does an exact, case-sensitive startsWith(paramName + "=") match. That’s fine if callers always use the canonical case, but it’s slightly inconsistent with the new case-insensitive handling in urlContainsParam/deduplicateUrlParameters.

If you expect mixed-casing in URLs and want consistent behavior, consider normalizing both sides to lower case (similar to urlContainsParam) before the startsWith check.


268-297: Clarify behavior when the metadata URL is null/empty

The current flow handles null safely by converting it to "", then synthesizing a URL-like string (e.g., ";UserAgentEntry=Liquibase;EnableArrow=0;"). That’s functionally safe, but a bit surprising if callers ever inspect this value beyond logging.

If you prefer a clearer contract, consider an early return for the degenerate case and only run the URL-massaging logic when there’s an actual base URL:

-        String rawUrl = con.getMetaData().getURL();
-        
-        // Handle null or empty URL according to JDBC spec
-        if (rawUrl == null) {
-            rawUrl = "";
-        }
-        
-        // First, deduplicate all parameters in the URL to handle any existing duplicates
-        rawUrl = deduplicateUrlParameters(rawUrl);
+        String rawUrl = con.getMetaData().getURL();
+        if (rawUrl == null || rawUrl.isEmpty()) {
+            return "";
+        }
+
+        // First, deduplicate all parameters in the URL to handle any existing duplicates
+        rawUrl = deduplicateUrlParameters(rawUrl);

The rest of the method can stay as-is, now operating only on a non-empty base URL.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between dad0029 and 5b1d8dc.

📒 Files selected for processing (1)
  • src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (12 hunks)
🔇 Additional comments (5)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (5)

39-48: Underlying Spark connection unwrapping is safe and focused

The instanceof S42Connection guard and nested instanceof SparkJDBCConnection check keep this method safe and nicely scoped: you only expose a SparkJDBCConnection when the underlying type matches, and gracefully return null otherwise. No changes needed here.


59-63: URL/driverProperties normalization in open correctly prevents duplicate parameter keys

The sequence of:

  • stripping UserAgentEntry and EnableArrow from driverProperties,
  • deduplicating URL parameters,
  • enforcing a trailing ;,
  • appending UserAgentEntry / EnableArrow only when they’re missing, and
  • deduplicating again

is a solid way to avoid "Multiple entries with same key" errors while still guaranteeing the desired defaults. This should robustly handle both user-supplied URL parameters and driver-injected properties.

Also applies to: 67-69, 75-88


153-177: Case-insensitive parameter detection is robust and avoids prefix collisions

The new urlContainsParam correctly:

  • normalizes the URL with a trailing ;,
  • strips spaces and splits on ;,
  • isolates the key portion before =, and
  • does an exact, case-insensitive comparison.

This avoids false positives on prefix-like parameter names and works well with the deduplication logic.


179-235: URL parameter deduplication logic is correct and order-preserving

deduplicateUrlParameters cleanly:

  • separates the base URL from the parameter segment,
  • normalizes and splits the parameter string,
  • uses a LinkedHashMap with lowercase keys to deduplicate case-insensitively while keeping the first occurrence and its original casing, and
  • reconstructs the URL with a consistent ;-terminated parameter list.

This should reliably eliminate duplicate keys like EnableArrow/enablearrow without disturbing parameter order.


347-367: JDBC delegate overloads remain correct after signature reflow

The updated line wrapping in the various createStatement, prepareCall, and prepareStatement overloads does not change behavior: each method still delegates directly to the underlying Connection and consistently wraps SQLException in DatabaseException. No issues here.

Also applies to: 491-501, 523-577, 545-552, 555-562, 565-572

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (1)

120-146: Consider case-insensitive matching for consistency.

The method uses case-sensitive matching (startsWith(paramName + "=")), while the new urlContainsParam method uses case-insensitive matching. This inconsistency could cause subtle bugs when parameter names have different casing in the URL.

For example, if the URL contains useragententry=value (lowercase), urlContainsParam would find it, but getUrlParamValue would not.

Apply this diff to make the matching case-insensitive:

         // Use Java Streams to find the parameter value
         Optional<String> paramString = Arrays.stream(uriArgs)
-            .filter(x -> x.startsWith(paramName + "="))
+            .filter(x -> {
+                int equalsIndex = x.indexOf('=');
+                if (equalsIndex == -1) return false;
+                String paramPart = x.substring(0, equalsIndex).toLowerCase();
+                return paramPart.equals(paramName.toLowerCase());
+            })
             .findFirst();
♻️ Duplicate comments (1)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (1)

261-292: Same redundant deduplication pattern as in open method.

Similar to the open method, the second deduplication at line 291 is unnecessary if urlContainsParam reliably prevents duplicate additions. Since you deduplicate at line 271 and only append missing parameters (lines 282-287), the final deduplication should never find duplicates.

         String result = urlBuilder.toString();
-        // Deduplicate again after adding parameters to ensure no duplicates
-        return deduplicateUrlParameters(result);
+        return result;
🧹 Nitpick comments (1)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (1)

62-82: Consider removing the second deduplication call.

The deduplication at line 82 is redundant if urlContainsParam correctly prevents duplicate additions. Since you deduplicate at line 63 and only append parameters that don't already exist (lines 72-77), the second deduplication should never find duplicates.

This is defensive programming but adds unnecessary overhead.

         String updatedUrl = urlBuilder.toString();
-        
-        // Deduplicate again after adding parameters to ensure no duplicates
-        updatedUrl = deduplicateUrlParameters(updatedUrl);

-        this.openConn(updatedUrl, driverObject, driverProperties);
+        this.openConn(updatedUrl, driverObject, driverProperties);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5b1d8dc and b010946.

📒 Files selected for processing (1)
  • src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (10 hunks)
🔇 Additional comments (3)
src/main/java/liquibase/ext/databricks/database/DatabricksConnection.java (3)

54-57: Good defensive approach to prevent duplicate parameters.

Removing these properties before they can be duplicated in the URL is the right strategy to address the root cause.


148-172: LGTM!

The case-insensitive exact matching logic correctly prevents false positives from parameter name prefixes. The implementation properly handles edge cases (null inputs, missing '=', empty strings).


174-230: Solid deduplication logic with reasonable assumptions.

The method correctly handles the typical JDBC URL format and preserves the first occurrence of duplicate parameters (case-insensitive). The use of LinkedHashMap ensures parameter order is maintained.

One minor consideration: parameters without an = sign are silently skipped (lines 208-210). This is reasonable for malformed URLs, but you might consider logging a warning if this occurs during debugging.

@macdhollister
Copy link

Just wanted to call out, I addressed this bug in my previous PR #332

My solution was less robust; I just checked if the URL contains the two specific configs and then only sets those configs in driverProperties if the URL doesn't already have them (individually). So this solution definitely covers more. But it will likely have some small merge conflicts with my branch, so just wanted to give a heads up.

@richa-rochna
Copy link
Author

macdhollister Thanks for the heads up! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants