Skip to content

Conversation

@abhishekrb19
Copy link
Contributor

Refactor:

  • Move LookupLoadSpec and BroadcastDatasourceLoadingSpec to a new class called LoadSpecHolder. These are applicable to all servers, including CliPeon servers.
  • Rename DataSourceTaskIdHolder to TaskPropertiesHolder. This is applicable to only CliPeon servers.

This separation makes the intent clearer, as DataSourceTaskIdHolder was previously misleading as it held more than just the dataSource and taskId fields.

The TaskPropertiesHolder currently contain taskId and dataSource fields injected from the CliPeon. In the future, we can extend it to also include groupId and taskType from the Task and add those dimensions to all the metric modules as applicable.

This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@abhishekrb19 abhishekrb19 requested a review from kfaraz November 10, 2025 05:07
"Cannot specify both `lookupTier` and `lookupTierIsDatasource`"
);
final String lookupTier = lookupTierIsDatasource ? dataSourceTaskIdHolder.getDataSource() : this.lookupTier;
final String lookupTier = lookupTierIsDatasource ? taskPropsHolder.getDataSource() : this.lookupTier;

Check notice

Code scanning / CodeQL

Possible confusion of local and field Note

Confusing name: method
getLookupTier
also refers to field
lookupTier
(without qualifying it with 'this').

Copilot Autofix

AI 13 days ago

To fix the problem, we should rename the local variable in the getLookupTier() method to avoid overshadowing the field name. This means giving the local variable a new, clear, and context-specific name while ensuring the assignment and uses of the variable are updated to match the new name. The best choice is to use a name like tierToUse, which clearly identifies its role as the lookup tier value that will be used in the final return.

Steps:

  • In getLookupTier(), change final String lookupTier = ... to final String tierToUse = ....
  • Update all uses of the local variable within the method (namely, the argument to emptyToNullNonDruidDataString and Preconditions.checkNotNull) to reference tierToUse instead of lookupTier.
  • No additional imports or method definitions are required.

All changes can be made within the method body in server/src/main/java/org/apache/druid/query/lookup/LookupListeningAnnouncerConfig.java.


Suggested changeset 1
server/src/main/java/org/apache/druid/query/lookup/LookupListeningAnnouncerConfig.java

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/src/main/java/org/apache/druid/query/lookup/LookupListeningAnnouncerConfig.java b/server/src/main/java/org/apache/druid/query/lookup/LookupListeningAnnouncerConfig.java
--- a/server/src/main/java/org/apache/druid/query/lookup/LookupListeningAnnouncerConfig.java
+++ b/server/src/main/java/org/apache/druid/query/lookup/LookupListeningAnnouncerConfig.java
@@ -55,10 +55,10 @@
         !(lookupTierIsDatasource && null != lookupTier),
         "Cannot specify both `lookupTier` and `lookupTierIsDatasource`"
     );
-    final String lookupTier = lookupTierIsDatasource ? taskPropsHolder.getDataSource() : this.lookupTier;
+    final String tierToUse = lookupTierIsDatasource ? taskPropsHolder.getDataSource() : this.lookupTier;
 
     return Preconditions.checkNotNull(
-        lookupTier == null ? DEFAULT_TIER : StringUtils.emptyToNullNonDruidDataString(lookupTier),
+        tierToUse == null ? DEFAULT_TIER : StringUtils.emptyToNullNonDruidDataString(tierToUse),
         "Cannot have empty lookup tier from %s",
         lookupTierIsDatasource ? "bound value" : LookupModule.PROPERTY_BASE
     );
EOF
@@ -55,10 +55,10 @@
!(lookupTierIsDatasource && null != lookupTier),
"Cannot specify both `lookupTier` and `lookupTierIsDatasource`"
);
final String lookupTier = lookupTierIsDatasource ? taskPropsHolder.getDataSource() : this.lookupTier;
final String tierToUse = lookupTierIsDatasource ? taskPropsHolder.getDataSource() : this.lookupTier;

return Preconditions.checkNotNull(
lookupTier == null ? DEFAULT_TIER : StringUtils.emptyToNullNonDruidDataString(lookupTier),
tierToUse == null ? DEFAULT_TIER : StringUtils.emptyToNullNonDruidDataString(tierToUse),
"Cannot have empty lookup tier from %s",
lookupTierIsDatasource ? "bound value" : LookupModule.PROPERTY_BASE
);
Copilot is powered by AI and may make mistakes. Always verify output.
@github-actions github-actions bot added the GHA label Nov 10, 2025
Copy link
Contributor

@kfaraz kfaraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together, @abhishekrb19 !

I had a slightly different approach in mind.

In the original class, DatasourceTaskIdHolder all the fields are derived from Task itself, so we should probably just:

  • Rename it to PeonTaskHolder to denote that this class is meaningful only for a peon and that it contains the task that the peon is running
  • Just keep one field inside it Task task and add method getTask()
  • Retain the original utility methods getDatasource(), getTaskId(), getLookupLoadinSpec() etc.
  • Create the PeonTaskHolder in a @Provides method in CliPeon.
  • In non peon servers, bind the PeonTaskHolder to null.

Let me know if this would make sense.

@abhishekrb19
Copy link
Contributor Author

abhishekrb19 commented Nov 10, 2025

Thanks for the review, @kfaraz!

I think the approach of keeping one renamed class would work well and makes sense. Just one caveat: the Task class is in the indexing-service module, so we can’t use it directly in the server module. However, we could create a constructor that accepts all the task properties of interest and have it @Provides from CliPeon like you suggested, along with a default constructor for all other servers. This way, we can remove all the individual @Named annotation bindings from CliPeon to DatasourceTaskIdHolder. Does that work?

@kfaraz
Copy link
Contributor

kfaraz commented Nov 10, 2025

Ah, I see, thanks for clarifying that, @abhishekrb19 ! All the current usages of DatasourceTaskIdHolder are indeed in the druid-server module.

In that case, I think we should add an interface TaskHolder to druid-server and use it wherever needed.
Bind the TaskHolder to a PeonTaskHolder in CliPeon and bind it elsewhere to a NoopTaskHolder which throws UnsupportedOperationException upon calling any method. I feel this clarifies the intent better and makes it more extensible for the future.

Sound good?

@abhishekrb19
Copy link
Contributor Author

@kfaraz sorry I ran into some hiccups with the bindings last week and didn't get a chance to take a closer look, but I'm hoping to revive this change again soon.

Btw I noticed this deprecated comment on EmbeddedMiddleManager, so most (or all?) the tests seem to use the indexer for good reasons. Given that there are some subtle differences between indexers and MMs, do you have any suggestions for embedded tests that use MMs?

At the very least, I was thinking of updating some of the embedded tests to use EmbeddedMiddleManager instead of EmbeddedIndexer locally. I’m also planning to add some unit tests.

@kfaraz
Copy link
Contributor

kfaraz commented Nov 20, 2025

@kfaraz sorry I ran into some hiccups with the bindings last week and didn't get a chance to take a closer look, but I'm hoping to revive this change again soon.

No worries, @abhishekrb19 ! 🙂

Given that there are some subtle differences between indexers and MMs, do you have any suggestions for embedded tests that use MMs?

If the purpose is to test out peons in an embedded test setup, you could do a couple of things:

  • Use EmbeddedMiddleManager and annotate it @SuppressForbidden(reason = "EmbeddedMiddleManager#init")
  • Use the k3s environment with the help of K3sClusterResource, e.g. KubernetesClusterDockerTest (These tests would run in the docker-tests GHA workflow. It is also possible to run them locally after setting the druid.testing.docker.image property.)

For local testing, please feel free to update any test to use an MM instead of Indexer. Most of them should work fine for both Indexer and MM.
But while committing, I would advise not to change any of the existing tests to use MMs (since Indexers are faster and easier to debug) and add new MM-flavored variants of any test where necessary.

Please let me know if this meets yours needs.

@abhishekrb19
Copy link
Contributor Author

Sounds good, thanks for the pointer @kfaraz! I will give that a try

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants