Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fold subsequent calls to limitador into a single one #148

Merged
merged 3 commits into from
Nov 22, 2024

Conversation

eguzki
Copy link
Contributor

@eguzki eguzki commented Nov 18, 2024

What

Addresses #145

Divided the work in two commits:

  • Easy changes, as a preparation work for the actual feature. Basically create new runtime structure of config and actions.
  • Fold subsequent calls to limitador into a single one

I moved things around a bit, but let me know if you want to rearrange this differently 🙏

@eguzki eguzki added the enhancement New feature or request label Nov 18, 2024
@eguzki eguzki self-assigned this Nov 18, 2024
@eguzki eguzki linked an issue Nov 18, 2024 that may be closed by this pull request
@alexsnaps
Copy link
Member

I moved things around a bit, but let me know if you want to rearrange this differently 🙏

I don't know that I would want it differently, but it's certainly surprising naming imho. Now... As a general note, I'd really appreciate if you could keep "moving things around" as a dedicated commit in future PRs, as it make reviewing the actual changes easier and makes for a nicer history too (e.g. git bisect, et al...)

Can you explain how you think the module internal adds value tho? I'm a little confused by its name (tho it could be me, and how I makes me think of Pimpl and "inner" in Rust, but unsure why). Anyways, I get a little confused about what everyone is trying to achieve with modules in this code base.

@eguzki
Copy link
Contributor Author

eguzki commented Nov 18, 2024

Happy to remove the internal module and add all the "internal" modules to the root one. I was trying to make a clear distinction between PluginConfiguration and FilterConfig.

@alexsnaps
Copy link
Member

was trying to make a clear distinction between PluginConfiguration and FilterConfig.

You think you succeeded in that? I do agree that that naming is indeed confusing... I never know which one is which. I think that maybe having a Configuration vs. a RuntimeConfig or something would be nicer... yet just my $.02... but you think that renaming the module is a better approach?

@eguzki
Copy link
Contributor Author

eguzki commented Nov 18, 2024

RuntimeConfig will be

@alexsnaps
Copy link
Member

RuntimeConfig will be

I'm not saying that's the right approach. I think that my confusion between these types is their naming. I don't know that I see the whole picture. And most importantly I think we should keep these changes separated, at least in separate commits. There is no ask to change anything on my part here, just to be clear. I'm trying to understand the goals and then look at the means used.

@eguzki eguzki marked this pull request as ready for review November 18, 2024 23:00
@eguzki eguzki requested a review from a team as a code owner November 18, 2024 23:00
@eguzki
Copy link
Contributor Author

eguzki commented Nov 18, 2024

@alexsnaps let me know if you want to rearrange this differently 🙏

Copy link
Member

@alexsnaps alexsnaps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this so we make progress. But we really need to schedule some time with @adam-cattermole & @didierofrivia soon enough to start putting some serious thoughts in what it is we want the design of this to become.

src/runtime_action.rs Outdated Show resolved Hide resolved
src/runtime_action_set.rs Outdated Show resolved Hide resolved
Comment on lines +201 to 210
26, 14, 10, 7, 58, 109, 101, 116, 104, 111, 100, 18, 3, 71, 69, 84, 26, 30, 10, 10,
58, 97, 117, 116, 104, 111, 114, 105, 116, 121, 18, 16, 97, 98, 105, 95, 116, 101,
115, 116, 95, 104, 97, 114, 110, 101, 115, 115, 26, 38, 10, 5, 58, 112, 97, 116,
104, 18, 29, 47, 100, 101, 102, 97, 117, 108, 116, 47, 114, 101, 113, 117, 101,
115, 116, 47, 104, 101, 97, 100, 101, 114, 115, 47, 112, 97, 116, 104, 34, 10, 47,
97, 100, 109, 105, 110, 47, 116, 111, 121, 42, 17, 99, 97, 114, 115, 46, 116, 111,
121, 115, 116, 111, 114, 101, 46, 99, 111, 109, 50, 4, 104, 116, 116, 112, 82, 8,
72, 84, 84, 80, 47, 49, 46, 49, 82, 20, 10, 4, 104, 111, 115, 116, 18, 12, 97, 117,
116, 104, 99, 111, 110, 102, 105, 103, 45, 65, 90, 0,
]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we make sure these changes are valid? Don't remember now, but didn't @adam-cattermole look into this? I was hoping we'd get to something that makes it clear what these bytes are, but maybe that didn't happen... 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did look into all of the properties retrieved from the host and provided examples of the requests that would have been serialized to bytes to create this message.. I'm also unsure why this would have changed though so will have a look

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering of the serialized headers changed again... it was :authority -> :method -> :path and became :authority -> :path -> :method. I don't know why this keeps changing when protobuf serialized but is consistent enough to pass tests for each PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is something I still do not understand. It just works, it seems to be consistent, but when something apparently unrelated is changed, then the request body changes 🤷

Copy link
Member

@alexsnaps alexsnaps Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not saying this is it but, I know some libs from Google around proto will purposefully shuffle things around in their "serialized" form when nothing about that order should be asserted for. One example is actually protobuf json that does change the resulting json literal string around...

src/service.rs Outdated Show resolved Hide resolved
impl RateLimitAction {
pub fn new(action: &Action, service: &Service) -> Result<Self, String> {
Ok(Self {
grpc_service: Rc::new(GrpcService::new(Rc::new(service.clone()))),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that's an Rc of ... one always?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take a look

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's been copied to the operation object for every request

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think similar to my previous comment, we also used to create all the GrpcService up front and share them as we know these won't change, now we're still creating them up front on configure time but one for every action?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I tried to explain here #148 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I think it's fine for now, but I personally think it's wrong to have as many GrpcServices as we have actions, with 2 services defined in the config and N actions (assuming no merges) we'll have N services when we had 2 before.

src/ratelimit_action.rs Outdated Show resolved Hide resolved
@eguzki eguzki marked this pull request as draft November 19, 2024 17:11
@eguzki eguzki force-pushed the 145-fold-subsequent-calls-to-limitador-into-a-single-one branch from 95bb6e5 to 1c3f7d2 Compare November 21, 2024 16:26
These are easy changes to make following up changes easy

Signed-off-by: Eguzki Astiz Lezaun <[email protected]>
@eguzki eguzki force-pushed the 145-fold-subsequent-calls-to-limitador-into-a-single-one branch from 1c3f7d2 to fea23ce Compare November 21, 2024 21:22
@eguzki eguzki force-pushed the 145-fold-subsequent-calls-to-limitador-into-a-single-one branch from fea23ce to 44e5837 Compare November 21, 2024 21:37
@eguzki eguzki marked this pull request as ready for review November 22, 2024 00:02
@eguzki
Copy link
Contributor Author

eguzki commented Nov 22, 2024

@alexsnaps ready for review

Added easy changes in the first commit to make changes easy on the second commit ❤️

@eguzki
Copy link
Contributor Author

eguzki commented Nov 22, 2024

@adam-cattermole @didierofrivia your feedback more than welcome as well. Do not hesitate, be sharp. This is not rushing for v1. We want to share mindset of "common goals". I am happy to address any comment

@eguzki eguzki requested a review from didierofrivia November 22, 2024 08:49
Comment on lines 177 to 180
if let Some(service) = self.service_handlers.get(&action.service) {
operations.push(Rc::new(Operation::new(
service.get_service(),
action.clone(),
Rc::clone(service),
)))
} else {
error!("Unknown service: {}", action.service);
return Err(OperationError::new(
Status::ParseFailure,
Default::default(),
));
}
operations.push(Rc::new(Operation::new(
Rc::clone(action),
GrpcServiceHandler::new(
Rc::clone(&action.grpc_service()),
Rc::clone(&self.header_resolver),
),
)));
Copy link
Member

@adam-cattermole adam-cattermole Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whilst I don't think it necessarily matters, the original implementation passed references to GrpcServiceHandlers and only created as many as we had services, instead of one per operation as we have now - I guess as it's just a struct containing rc's to other structs it's not a big deal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original implementation passed references to GrpcServiceHandlers and only created as many as we had services, instead of one per operation as we have now

Correct. The reason being that the runtime action now has the grpc_service embedded, as part of it. It could even be the GrpcServiceHandler itself, why not? It happens that creating GrpcServiceHandler is cheap as it is composed of a couple of Rc<T> fields. The benefit is to remove the service_handlers: HashMap<String, Rc<GrpcServiceHandler>> from the OperationDispatcher and remove the need to check at request time if the service exist or not. Each action, which are built at config time, has the assigned service and grpc_service instance ready to be used. For example, if there are 10 services defined in the config, but actions only use 2 of them why keep them all in OperationDispatcher ?

That is the reasoning. Let me know if that makes sense or not.

Copy link
Member

@alexsnaps alexsnaps Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something is still fishy with these Rcs tbh... if anything tho, this Rc::clone is unnecessary, the callee is already doing that, and returning an Rc for the caller to own. That just literally just increments the counter by one, to drop the returned Rc back on the floor and as such decrementing the counter. That being said the compiler may opt this out... yet, it leave the unnecessary noise for the reader to parse out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch alexclippy

Fixed the unnecessary RC clone

@alexsnaps
Copy link
Member

General note on the refactoring: the duplication between RateLimitAction & AuthAction, namely both fields grpc_service & scope "only" to then be dispatched from RuntimeAction, makes me wonder if that's not where these two fields should live, rather than in the specialized actions. They seem to serve no other purpose there than to be exposed to the layer up, which is RuntimeAction, right?

pub struct RateLimitAction {
    grpc_service: Rc<GrpcService>,
    scope: String,
    service_name: String,
    conditional_data_sets: Vec<ConditionalData>,
}
pub struct AuthAction {
    grpc_service: Rc<GrpcService>,
    scope: String,
    predicates: Vec<Predicate>,
}

Comment on lines +144 to +147
// For RateLimitAction conditions always apply.
// It is when building the descriptor that it may be empty because predicates do not
// evaluate to true.
true
Copy link
Member

@alexsnaps alexsnaps Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this split in behavior is kinda sad. We have an abstraction over (admittedly today) two specialization and they differ fundamentally... so is this a case of (too) early abstraction (or optimization) or is the abstraction just plain wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ratelimit_action instance holds data like this

- predicates:
  - 1 == 1
  - 2 == 2
  data:
  - expression:
      key: a
      value: "1"
  - expression:
      key: b
      value: 1
- predicates:
  - 3 == 3
  - 4 == 4
  data:
  - expression:
      key: c
      value: "1"
  - expression:
      key: d
      value: 1

So, I had two options:
A) conditions_apply method would evaluate all the predicates and return the evaluation result AND'ed. The on build_descriptors evaluate them again and generate descriptor entries.
B) conditions_apply would evaluate always to true and build_descriptors call would evaluate them (only once) and generate descriptor entries. If the descriptor is empty (no entries generated), that means no need to run grpc call to limitador.

As you see in the code, I implemented the latter approach. Two reasons: 1) predicates are evaluated once and 2) even if predicates evaluate to true, the generation of one entry, theoretically, may not generate one entry. Think about trying to read one headers value when the header does not exists. It is true that the way it is currently implemented, DescriptorEntryBuilder's evaluate method always returns a RateLimitDescriptor_Entry. Which I am not entirely sure about that. The current implementation would return ''null'' string when the requested header value does not exist. I believe this should not be the case and it should return Option<RateLimitDescriptor_Entry> instead, but this is topic for another discussion.

I am happy to implement the first option or even an alternative option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's less about the approach per se... tho, yeah that could also be challenged. It's about the abstraction introduced.

@eguzki
Copy link
Contributor Author

eguzki commented Nov 22, 2024

General note on the refactoring: the duplication between RateLimitAction & AuthAction, namely both fields grpc_service & scope "only" to then be dispatched from RuntimeAction, makes me wonder if that's not where these two fields should live, rather than in the specialized actions. They seem to serve no other purpose there than to be exposed to the layer up, which is RuntimeAction, right?

pub struct RateLimitAction {
    grpc_service: Rc<GrpcService>,
    scope: String,
    service_name: String,
    conditional_data_sets: Vec<ConditionalData>,
}
pub struct AuthAction {
    grpc_service: Rc<GrpcService>,
    scope: String,
    predicates: Vec<Predicate>,
}

I think you are right about the grpc_service. It serves no other purpose there than to be exposed to the layer up. How could be part of the RuntimeAction ? Maybe instead of enum having RuntimeAction<T> ?

Regaring scope, this field is used to be exposed to the layer up as well as to build the GrpcMessageRequest for which the specialization is needed. Here

@alexsnaps
Copy link
Member

Regaring scope, this field is used to be exposed to the layer up as well as to build the GrpcMessageRequest for which the specialization is needed. Here

Right, but that function takes an RuntimeAction as an argument... so there no need to navigate further down. That &RuntimeAction could be the source of the scope, just as grpc_service. This is unnecessary coupling (and duplication)

Copy link
Member

@adam-cattermole adam-cattermole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to hold this up, I think we still have the refactor yet to do, and this improves readability, is tested and works for the intended purpose so I'm happy to approve, nice work

I think it just has some of the remnants from the operation dispatcher and previous implementation in the form of the services/handlers/rcs to these but we can iterate

@eguzki
Copy link
Contributor Author

eguzki commented Nov 22, 2024

uld be the source of the scope, just as grpc_service. T

You are right. But the build the GrpcMessageRequest is specific and scope is needed for that job, so it makes all the sense that RatelimitAction and AuthAction know about scope. Moreover, it turns out that ratelimit action uses the scope as well to implement the merge.

Copy link
Member

@alexsnaps alexsnaps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with @adam-cattermole
I think the design is lacking mostly, but indeed still very much informed by the existing... that we know we need to give some love

@eguzki eguzki merged commit d180b17 into main Nov 22, 2024
13 checks passed
@eguzki eguzki deleted the 145-fold-subsequent-calls-to-limitador-into-a-single-one branch November 22, 2024 14:21
@alexsnaps
Copy link
Member

alexsnaps commented Nov 22, 2024

But the build the GrpcMessageRequest is specific and scope is needed for that job, so it makes all the sense that RatelimitAction and AuthAction know about scope.

I very much disagree with that statement, as right now it's a split responsibility: GrpcMessageRequest::new take a &RuntimeAction and then splits... So the coupling is (from a "public" API) to RuntimeAction then internally to both variants... Where that enum is currently just dynamic dispatching to an "interface" that's ill defined (all "API" from *Actions but Predicate vs Conditional ones...)

@eguzki
Copy link
Contributor Author

eguzki commented Nov 22, 2024

For me, instead of having GrpcMessageRequest::new(action: &RuntimeAction) -> Option<Self>, it should be RuntimeAction doing some job which implementation depends on the variant. And that job may (or may not) need to create a GrpcMessageRequest, for which you need the scope.

Anyway... for future refactors.

@alexsnaps
Copy link
Member

Right... I'm thinking we're under utilizing GrpcService, I think there's an inversion here possibly. That is the differentiator again... Too much logic (the dynamic dispatch) in RuntimeAction 🤔

grpc_call_fn_stub: GrpcCallFn,
extension_type: ServiceType,
) -> Rc<Operation> {
fn build_auth_grpc_action() -> RuntimeAction {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just leaving a note here, for a future refactor... the operationDispatcher was meant to be agnostic of the service/extension specifics. Anything specific for auth, ratelimit, what ever else that comes in the future, should be built somewhere else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Fold subsequent calls to Limitador into a single one
4 participants