THREESCALE-10555: Resync applications in batches #453
Conversation
I think that's nice! Secondly, I am thinking about the following scenario: backend ends up holding application keys or referrer filters that no longer exist in porta.
Why is there a mismatch? Well, maybe there was some hiccup in the communication between system and backend, and those update jobs were lost somehow; catching that is one of the purposes of the resync. But I believe that with the current approach we are not deleting the old values from Redis? (I believe this is the actual behavior of the current process too, and with the current one-object-at-a-time approach it's kind of difficult to handle.) But now that we pass the exact, complete desired state of each application, maybe we could overcome this limitation? So, for the two arrays, we may also try to ensure that any values that are not supposed to be there get deleted. WDYT?
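A minimal sketch of that "complete replace" idea, assuming the redis-rb client and a hypothetical set key name (neither is taken from apisonator): diff the stored members against the desired state coming from porta and delete anything that should no longer be there.

```ruby
# Sketch only: make a backend set match the exact desired state from porta.
# `set_key` is a hypothetical name, not apisonator's real key layout.
def replace_members(redis, set_key, desired)
  current = redis.smembers(set_key)

  stale   = current - desired   # present in backend, gone from porta
  missing = desired - current   # present in porta, missing in backend

  redis.srem(set_key, stale)   unless stale.empty?
  redis.sadd(set_key, missing) unless missing.empty?
end

# e.g. replace_members(redis, "application/service_id:#{service_id}/id:#{app_id}/keys", app.keys)
```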
```ruby
application: app.to_hash.merge(
  application_keys: app.keys,
  referrer_filters: app.referrer_filters
)
```
We add the keys and referrers to the resulting object. Can we also add the user_key (it seems that it's the only thing that's missing in the resulting application)?
I've been checking this and decided not to do it. The reason is the absurdly complicated way we have to store user keys in apisonator. Instead of storing it under something like
application/service_id:#{service_id}/id:#{id}/user_key
it's actually stored as:
application/service_id:#{service_id}/key:#{key}/id
So the user key itself is stored in the redis key, and the value for that key is the app id. The ancients did it again. An app can only have one user key, so when a user key is rotated, a new redis key is created. Because of this, the previous user key has to be removed manually when rotating and also when deleting an application. Any inconsistency leads to old rotated keys that are still valid and usable, and to orphaned keys.
In fact, I followed the whole flow from removing an app in porta all the way to backend, and I'd say user keys are always orphaned when removing an app.
With regards to returning the user key in the response, another side effect of this weird way of storing user keys is that, if we want to retrieve the correct user key, we would need to read all user keys for all apps in that service and return the one whose value is the given app_id.
My guess is this design was a way to authenticate apps faster when receiving a hit. I understand it, but it has these undesired side effects and we can't change it now.
The fact is, if we want to return the user key in the response, the only plausible way to do it is to just return the received :user_key parameter, as in "this is the received user key and storing it didn't fail, so it must be the same one stored in redis". But I don't like this because all the other data in the response is loaded from redis; we are not returning parameter data directly in the response, and I don't think we should make an exception here.
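To make the lookup cost concrete, here is an illustration only (assuming the redis-rb client) of what retrieving an app's user key would involve under that layout; the helper name is made up, this is not apisonator code.

```ruby
# Every key:<user_key> entry of the service has to be read until one whose
# value is our app id turns up, because the user key only exists embedded in
# the redis key name itself.
def user_key_for_app(redis, service_id, app_id)
  prefix = "application/service_id:#{service_id}/key:"
  redis.scan_each(match: "#{prefix}*/id") do |redis_key|
    return redis_key.delete_prefix(prefix).delete_suffix('/id') if redis.get(redis_key) == app_id.to_s
  end
  nil
end
```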
The fact is, if we want to return the user key in the response, the only plausible way to do it is to just return the received :user_key parameter, as in "this is the received user key and storing it didn't fail, so it must be the same one stored in redis".
This was what I had in mind for my request 😬
To verify that "it is the one stored in redis", we can just fetch application/service_id:#{service_id}/key:#{key}/id and confirm it matches the app ID.
But OK, if you really don't want to add it, it's fine.
To verify that "it is the one stored in redis", we can just fetch application/service_id:#{service_id}/key:#{key}/id and confirm it matches the app ID.
I didn't think of this, and it can be done since it's only one redis command. I'll do it 👍
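A sketch of that single-command check, assuming the redis-rb client (the helper name is made up): trust the received :user_key parameter only if the reverse mapping stored in redis points back to the same application id.

```ruby
def user_key_matches?(redis, service_id, app_id, user_key)
  stored_id = redis.get("application/service_id:#{service_id}/key:#{user_key}/id")
  stored_id == app_id.to_s
end
```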
Working on it...
Every time you see an inconsistency like this... it was Claude!
Yeah, I'll fix this behavior and use a complete replace approach.
Done: 5834a0d
After checking this, I also think it's not worth doing, at least in this PR, because what you mention is applicable to all application data and even applications themselves, not only referrer filters.

This is what we should do if we wanted to make this endpoint really sync with porta. Doing each of those operations is not easy. Redis doesn't have transactions like relational DBs, so we can't just wipe a list and add the new items: if the new items fail to be added, the list would be left wiped, and even in the happy path the list would actually be empty for a few milliseconds while we add the new items. That means the only approach we have is to generate the diff and remove particular items. I see this approach as error-prone, even more so considering we also have to keep the memoization cache in sync, and all of this at runtime with requests coming in.

What I would need in order to do this in a way I'd feel confident enough about is atomicity and rollback in redis. According to Claude, the only way to get that is writing and executing Lua scripts, so we would need to replace the current commands with such scripts. I think it's interesting to investigate the Lua way, but in another PR in my opinion. Adding a new endpoint that does exactly the same we were doing before, but in batches, is a wide enough scope for this PR; we can address the sync feature later on. WDYT?

Edit: I found this. Even with Lua scripts there is no rollback in redis, so there is no way to atomically transition from one consistent state to another consistent state.
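For reference, a rough sketch of the Lua route mentioned above, assuming redis-rb's eval with keys:/argv: options (the script and helper are illustrative, not proposed code). DEL plus SADD run inside one script, so no other command interleaves and readers never observe a half-written set; but, as the edit above notes, writes already performed are not rolled back if the script errors halfway.

```ruby
REPLACE_SET_LUA = <<~LUA
  redis.call('DEL', KEYS[1])
  for i = 1, #ARGV do
    redis.call('SADD', KEYS[1], ARGV[i])
  end
  return #ARGV
LUA

def replace_set_atomically(redis, set_key, members)
  redis.eval(REPLACE_SET_LUA, keys: [set_key], argv: members)
end
```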
```yaml
allOf:
  - $ref: "#/components/schemas/Application"
  - type: object
    properties:
      user_key:
        type:
          - "null"
          - string
      application_keys:
        type: array
        items:
          type: string
      referrer_filters:
        type: array
        items:
          type: string
```
I think this is the same model as in request body now? "Application schema" + user_key, application_keys and referrer_filters...
Maybe this can be put into a schema and referenced in both places? (request and response)
Not exactly the same, because the response includes the field service_id, which is not part of the request, since the request includes the service id as a URL param.
```ruby
internal_api '/services/:service_id/applications' do
  # Helper method to save a single application with all its associated data
  def save_application_with_data(service_id, app_data)
    app_id = app_data[:id]
```
We need to validate the presence of the id field.
Currently, this is considered successful:
```json
{
  "applications": [
    {
      "state": "ok"
    }
  ]
}
```
and it creates invalid entries in redis:
```
[db8] > keys *
1) "application/service_id:2/id:/state"
2) "service_id:2/applications"
[db8] > get application/service_id:2/id:/state
"ok"
[db8] > smembers service_id:2/applications
1) ""
```
I added some validation to fix this: 5d3e46d
This looks like:

```json
"applications": [
  {
    "status": "error",
    "id": null,
    "error": "Application has no id "
  },
  {
    "status": "modified",
    "application": {
      "service_id": "3",
      "id": "0f3a6cf3",
      "state": "active",
      "plan_id": "7",
      "plan_name": "Basic",
      "redirect_url": null,
      "user_key": null,
      "application_keys": [
        "bbbb"
      ],
      "referrer_filters": [
        "*"
      ]
    }
  }
]
```

In your request example, you're sending an invalid state value: ok. Since I was already working on validation, I decided to validate this field as well... however, apparently there are some inconsistencies about which states should be accepted.
Apparently, there are four possible states a cinstance can be in (ref):

```ruby
state :pending
state :live
state :suspended
state :deprecated
```

But when porta sends this to backend, live is translated into active (ref):
```ruby
if plan && service
  state = self.state
  state = :active if live?
  ThreeScale::Core::Application.save( :service_id => service.backend_id,
                                      :id => application_id,
                                      :state => state,
                                      :plan_id => plan.id,
                                      :plan_name => plan.name,
                                      :redirect_url => redirect_url )
end
```

However, some tests in apisonator insert live directly to DB:
apisonator/test/test_helpers/fixtures.rb, line 140 in 7a113dd:

```ruby
Application.save(service_id: @service_id, id: @app_id, state: :live)
```
According to the API Docs, only active, pending and suspended are allowed.
apisonator/docs/specs/backend_internal.yaml, lines 1743 to 1748 in 8bd9e0c:

```yaml
state:
  type: string
  enum:
    - active
    - pending
    - suspended
```
For that reason, I validate only those three, and I modified the tests so they don't insert live anymore. However, I'm aware we could have anything in production, so this could be a breaking change. WDYT?
Ah, right, at first I thought this validation would only apply to the new batch updates - then it wouldn't be concerning. But indeed, it can happen for any application update, and we don't know what's there in production.
So, I'd remove the ApplicationStateInvalid error.
Indeed, in my example I used state=ok, but I guess I wasn't clear enough that that was not what bothered me; the only problem I saw was the missing ID, not the "invalid" state.
Done: 316ea2f
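Not the actual content of 5d3e46d or 316ea2f, just a minimal sketch of the guard the thread converged on, with a hypothetical error class name: reject entries whose id is missing or blank, and deliberately leave state unvalidated.

```ruby
def validate_app_data!(app_data)
  id = app_data[:id].to_s.strip
  # blank ids would otherwise create entries like "application/service_id:2/id:/state"
  raise ApplicationHasNoId, app_data.inspect if id.empty?
end
```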
mayorova left a comment:
Looks good 👍
Basically, we want to optimize the following rake tasks from porta:
- backend:storage:enqueue_rewrite
- backend:storage:rewrite
- backend:storage:rewrite_provider

The proc in porta that syncs data to backend issues a separate backend request per object. So, for a provider with e.g. 500 applications, this ends up with calls to:
- update_backend_application
- update_backend_user_key_to_application_id_mapping
- app_key#update_backend_value, depending on the service using app_id + app_key auth, and on how many keys there are
- ref_filter#update_backend_value

So, more than 1000 requests for this provider, and more than 1500 if the service uses the app_id + app_key auth method.
After this PR, porta will have a new backend endpoint available, so it can batch apps and send all of them in a single request.
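A hypothetical illustration of what that single batched request from porta could look like. The HTTP verb, URL, and helper names (backend_internal_uri, provider_applications, service_id) are assumptions for the sketch, not code from porta or apisonator; the payload shape follows the request schema discussed above.

```ruby
require 'json'
require 'net/http'
require 'uri'

payload = {
  applications: provider_applications.map do |app|
    {
      id: app.application_id,
      state: app.state,
      plan_id: app.plan_id,
      plan_name: app.plan_name,
      redirect_url: app.redirect_url,
      user_key: app.user_key,
      application_keys: app.application_keys,
      referrer_filters: app.referrer_filters
    }
  end
}

# One request for the whole provider instead of 1000+ individual updates.
uri = URI("#{backend_internal_uri}/services/#{service_id}/applications")
request = Net::HTTP::Put.new(uri, 'Content-Type' => 'application/json')
request.body = JSON.generate(payload)

Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```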