-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[realppl 7] realppl integration with remote/local and unit tests #14853
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: cheryllin/ppl
Are you sure you want to change the base?
Conversation
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback. |
Generated by 🚫 Danger |
aedab52 to
777c31f
Compare
18f7d5c to
f3752fa
Compare
|
/gemini summary |
Summary of ChangesThis pull request introduces a new query execution model based on "realtime pipelines" to enhance Firestore's query capabilities. It refactors core components to support a unified Highlights
Changelog
Activity
|
f3752fa to
655d8cd
Compare
bcf6c5f to
7865fe4
Compare
655d8cd to
aea7b2d
Compare
22651c5 to
e0ccb49
Compare
06aaef3 to
8efed9e
Compare
0a8821c to
9292af4
Compare
8efed9e to
7c0150f
Compare
9292af4 to
958cee7
Compare
7c0150f to
4e3240d
Compare
958cee7 to
affb6e1
Compare
9706f02 to
edf7e9a
Compare
affb6e1 to
cc0e1cd
Compare
cc0e1cd to
531ba2a
Compare
edf7e9a to
cb6e366
Compare
531ba2a to
82b77f2
Compare
cb6e366 to
3abafda
Compare
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is a large and significant refactoring to introduce a pipeline-based query execution model. The core of the change is the introduction of QueryOrPipeline and TargetOrPipeline variant types, which are then plumbed through the entire query execution stack. The use of absl::variant is a good choice. The testing strategy of parameterizing existing test suites to run with both Query and RealtimePipeline is excellent and provides confidence in this large change. I've found a couple of issues, one of which is critical, but overall this is a well-executed refactoring.
| } else if (bound.inclusive() && sub_end == orderings.size() - 1) { | ||
| conditions.push_back(std::make_shared<api::FunctionExpr>( | ||
| func_inclusive_name, | ||
| std::vector<std::shared_ptr<api::Expr>>{ | ||
| orderings[index].expr_shared(), cursors[index]})); | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for applying an inclusive cursor bound appears to be incorrect. The condition sub_end == orderings.size() - 1 causes the inclusive operator (lte/gte) to be applied to the second-to-last component of the OR condition, while the final component (when sub_end == orderings.size()) will incorrectly use the exclusive operator. This will lead to incorrect results for queries with inclusive cursors.
The condition should likely be sub_end == orderings.size() to apply the inclusive bound to the final term of the entire disjunction.
| } else if (bound.inclusive() && sub_end == orderings.size() - 1) { | |
| conditions.push_back(std::make_shared<api::FunctionExpr>( | |
| func_inclusive_name, | |
| std::vector<std::shared_ptr<api::Expr>>{ | |
| orderings[index].expr_shared(), cursors[index]})); | |
| } else { | |
| } else if (bound.inclusive() && sub_end == orderings.size()) { | |
| conditions.push_back(std::make_shared<api::FunctionExpr>( | |
| func_inclusive_name, | |
| std::vector<std::shared_ptr<api::Expr>>{ | |
| orderings[index].expr_shared(), cursors[index]})); |
| if (current_args[i].which_value_type == | ||
| google_firestore_v1_Value_string_value_tag) { | ||
| document_paths.push_back(DecodeString(current_args[i].string_value)); | ||
| } else { | ||
| context->Fail(StringFormat( | ||
| "Invalid argument type for 'documents' stage at index %zu: " | ||
| "expected string_value, got %d", | ||
| i, current_args[i].which_value_type)); | ||
| return nullptr; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a mismatch in the serialization and deserialization of the documents pipeline stage. DocumentsSource::to_proto() encodes document paths using reference_value and google_firestore_v1_Value_reference_value_tag. However, this decoding logic expects string_value and google_firestore_v1_Value_string_value_tag.
This will cause deserialization of pipelines containing a documents stage to fail.
| if (current_args[i].which_value_type == | |
| google_firestore_v1_Value_string_value_tag) { | |
| document_paths.push_back(DecodeString(current_args[i].string_value)); | |
| } else { | |
| context->Fail(StringFormat( | |
| "Invalid argument type for 'documents' stage at index %zu: " | |
| "expected string_value, got %d", | |
| i, current_args[i].which_value_type)); | |
| return nullptr; | |
| } | |
| if (current_args[i].which_value_type == | |
| google_firestore_v1_Value_reference_value_tag) { | |
| document_paths.push_back(DecodeString(current_args[i].reference_value)); | |
| } else { | |
| context->Fail(StringFormat( | |
| "Invalid argument type for 'documents' stage at index %zu: " | |
| "expected reference_value, got %d", | |
| i, current_args[i].which_value_type)); | |
| return nullptr; | |
| } |
| auto kk = index_iterator->key(); | ||
| (void)kk; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ 1
| serializer_ = | ||
| std::make_unique<remote::Serializer>(other.serializer_->database_id()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should serializer_ be a shared_ptr instead to make this easier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why this is a unique_ptr. RealtimePipeline::operator= might not be called often, but in the AddStage function below, using a shared_ptr could improve efficiency.
| if (pipeline.rewritten_stages().empty()) { | ||
| pipeline.SetRewrittentStages(RewriteStages(pipeline.stages())); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was this removed?
| absl::optional<int64_t> limit = GetLastEffectiveLimit(query.pipeline()); | ||
| if (limit) { | ||
| return limit; | ||
| } | ||
| return absl::nullopt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of this can be replaced with return GetLastEffectiveLimit(query.pipeline());
| LimitType View::GetLimitType(const QueryOrPipeline& query) { | ||
| if (query.IsPipeline()) { | ||
| absl::optional<int64_t> limit = GetLastEffectiveLimit(query.pipeline()); | ||
| return limit > 0 ? LimitType::First : LimitType::Last; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this compile? 😮
I don't think operator > is defined for absl::optional. You could use value_or. something along the lines of:
return limit.value_or(0) > 0 ? LimitType::First : LimitType::Last;| api::RealtimePipeline AsCollectionPipelineAtPath( | ||
| const api::RealtimePipeline& pipeline, const model::ResourcePath& path); | ||
|
|
||
| absl::optional<int64_t> GetLastEffectiveLimit( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also add a comment for this function
82b77f2 to
a2c7960
Compare
3abafda to
28cf5fa
Compare
a2c7960 to
68e64f0
Compare
| std::shared_ptr<Firestore> firestore = FSTTestFirestore().wrapped; | ||
| core::Query query = Query("foo"); | ||
| ViewSnapshot viewSnapshot(query, newDocuments, oldDocuments, std::move(documentChanges), | ||
| ViewSnapshot viewSnapshot(core::QueryOrPipeline(query), newDocuments, oldDocuments, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel it would be better to call it QueryOrRealtimePipeline since pipeline is not a part of it.
| serializer_ = | ||
| std::make_unique<remote::Serializer>(other.serializer_->database_id()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why this is a unique_ptr. RealtimePipeline::operator= might not be called often, but in the AddStage function below, using a shared_ptr could improve efficiency.
|
|
||
| private: | ||
| std::vector<std::string> documents_; | ||
| std::set<std::string> documents_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my observation, most of the code references the documents() vector. The set type is only used once for sorting during object creation. It might be better to perform the sorting in the constructor, store the result as a vector, and have documents() return a reference to it.
| return "where"; | ||
| } | ||
|
|
||
| const Expr* expr() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for using a raw pointer instead of a shared_ptr? We should generally avoid raw pointers whenever possible.
| const std::vector<api::Ordering>& orderings) { | ||
| std::vector<api::Ordering> reversed; | ||
| reversed.reserve(orderings.size()); | ||
| for (const auto& o : orderings) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use std::reverse in this case?
| auto kk = index_iterator->key(); | ||
| (void)kk; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ 1
No description provided.