Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OPIK-595]: Add Redis Caching part 1 #1015

Merged
merged 13 commits into from
Jan 14, 2025

Conversation

thiagohora
Copy link
Contributor

Details

Add an annotation-based caching mechanism to allow us to define cacheable methods in the project.

Issues

OPIK-595

Testing

Adding Integration tests to the new mechanism

@thiagohora thiagohora requested a review from a team as a code owner January 9, 2025 17:35
@thiagohora thiagohora force-pushed the thiagohora/OPIK-595_redis_caching_part_1 branch from c9df8cc to 7b723b1 Compare January 9, 2025 17:36
@thiagohora thiagohora force-pushed the thiagohora/OPIK-595_redis_caching_part_1 branch from 7b723b1 to 1247a9b Compare January 9, 2025 17:37
@thiagohora thiagohora self-assigned this Jan 9, 2025
@thiagohora thiagohora force-pushed the thiagohora/OPIK-595_redis_caching_part_1 branch from dbeeb59 to 593da4a Compare January 10, 2025 10:57
Copy link
Collaborator

@andrescrz andrescrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's admirable that you went ahead with an implementation that tries to solve the problem in a general way, beyond just caching the automation rules.

This is also heavily based on the Spring Cache approach, which provides an interface with known annotations (@Cacheable, @CachePut and @CacheEvict).

However, this has many non-trivial edge cases and building a library like this is a problem by itself that requires a lot of focus. In our case it has to be just good enough, but already has some challenges: reactive vs non-reactive methods, distinguishing between types (collections and non-collections currently) etc. and this moves us away from focusing on the business cases strictly related to Opik.

I was wondering if you investigated some alternative that we can just use. Some examples:

  1. Something in Dropwizard.
  2. Some Java Library than can be wrapped around the Redis.
  3. Integrating some parts of Spring Cache.

I'd be fine to move forward, but it's desirable to investigate this first, in order to keep focusing on building Opik use cases.

@@ -198,6 +198,16 @@
<artifactId>httpclient5</artifactId>
<version>5.2.3</version>
</dependency>
<dependency>
<groupId>org.mvel</groupId>
<artifactId>mvel2</artifactId>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: we could evaluate using Drools as well. But this is fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you believe Drools, it's better that we use it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be ok to go with this as well.

@@ -157,6 +157,7 @@ public <E, T extends AutomationRuleEvaluator<E>> T findById(@NonNull UUID id, @N
case LlmAsJudgeAutomationRuleEvaluatorModel llmAsJudge ->
AutomationModelEvaluatorMapper.INSTANCE.map(llmAsJudge);
})
.map(evaluator -> (T) evaluator)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: for a safer cast, better check if instance of in a filter. Anyhow, it might not ever happen or maybe it's ok to allow this error to bubble up if it ever happens an unexpected programming error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, given the switch, I imagine this would not happen, but Sonar keeps complaining. I changed it so it only complains in one line instead of the whole method

key = getKeyName(name, keyAgs, invocation);
} catch (Exception e) {
// If there is an error evaluating the key, proceed without caching
log.warn("Cache will be skipped due to error evaluating key expression");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please log the exception as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is logged on getKeyName

Copy link
Collaborator

@andrescrz andrescrz Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I think it's better you remove the catch entirely from getKeyName and just allow exceptions to bubble up as you're having a catch at Exception level here anyway. You don't need the conversion to IllegalArgumentException. Then, just move the exception log to this catch.

Comment on lines 76 to 80
if (isReactive) {
return action.apply(key, name);
}

return action.apply(key, name).block();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you avoid this distinction and return the result of applying the action without blocking? Would it work by delegating the block call to the handler of this result?

Copy link
Contributor Author

@thiagohora thiagohora Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code would fail if the method annotated returns a non-reactive type. Otherwise, we would restrict the annotation of reactive methods.

}

private Mono<Object> processCacheEvictMethod(MethodInvocation invocation, boolean isReactive, String key) {
if (isReactive) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same applies for all the other methods. We should try to have a general implementation, that doesn't distinguish by reactive or not. Otherwise, this library won't work for other frameworks. Imagine that we use RxJava.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would love to do that, but we need to know the type to handle the cache properly without missing the business logic.

Comment on lines +66 to +68
new CustomConfig("cacheManager.enabled", "true"),
new CustomConfig("cacheManager.defaultDuration", "PT0.500S"),
new CustomConfig("cacheManager.caches.%s".formatted(CACHE_NAME_2), "PT0.200S")))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just configure these values in the test config yaml file instead of overwriting them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the cache disabled by default and let the specific test enable it. Otherwise, the side effects may impact other tests.

@thiagohora
Copy link
Contributor Author

It's admirable that you went ahead with an implementation that tries to solve the problem in a general way, beyond just caching the automation rules.

This is also heavily based on the Spring Cache approach, which provides an interface with known annotations (@Cacheable, @CachePut and @CacheEvict).

However, this has many non-trivial edge cases and building a library like this is a problem by itself that requires a lot of focus. In our case it has to be just good enough, but already has some challenges: reactive vs non-reactive methods, distinguishing between types (collections and non-collections currently) etc. and this moves us away from focusing on the business cases strictly related to Opik.

I was wondering if you investigated some alternative that we can just use. Some examples:

  1. Something in Dropwizard.
  2. Some Java Library than can be wrapped around the Redis.
  3. Integrating some parts of Spring Cache.

I'd be fine to move forward, but it's desirable to investigate this first, in order to keep focusing on building Opik use cases.

Yeah, I looked for something similar to Spring Cache but couldn't find it. The closest one was https://github.com/bazaarvoice/dropwizard-caching-bundle, but it's archived. I also found some others using ehcache and caffeine, but all seem old and without maintenance.

https://github.com/gabber12/dropwizard-caffeine-configurator

Anyway, I agree with you it's definitely not a trivial problem, but I didn't try to implement a solution to all scenarios. Instead,d I tried to go with something small that would cover our basic scenario, and if needed, we could continue building it.

@thiagohora thiagohora requested a review from andrescrz January 10, 2025 11:52
@andrescrz
Copy link
Collaborator

It's admirable that you went ahead with an implementation that tries to solve the problem in a general way, beyond just caching the automation rules.
This is also heavily based on the Spring Cache approach, which provides an interface with known annotations (@Cacheable, @CachePut and @CacheEvict).
However, this has many non-trivial edge cases and building a library like this is a problem by itself that requires a lot of focus. In our case it has to be just good enough, but already has some challenges: reactive vs non-reactive methods, distinguishing between types (collections and non-collections currently) etc. and this moves us away from focusing on the business cases strictly related to Opik.
I was wondering if you investigated some alternative that we can just use. Some examples:

  1. Something in Dropwizard.
  2. Some Java Library than can be wrapped around the Redis.
  3. Integrating some parts of Spring Cache.

I'd be fine to move forward, but it's desirable to investigate this first, in order to keep focusing on building Opik use cases.

Yeah, I looked for something similar to Spring Cache but couldn't find it. The closest one was https://github.com/bazaarvoice/dropwizard-caching-bundle, but it's archived. I also found some others using ehcache and caffeine, but all seem old and without maintenance.

https://github.com/gabber12/dropwizard-caffeine-configurator

Anyway, I agree with you it's definitely not a trivial problem, but I didn't try to implement a solution to all scenarios. Instead,d I tried to go with something small that would cover our basic scenario, and if needed, we could continue building it.

Discussed offline, but leaving my proposal here as well. However, it requires some investigation from your side:

In Java there's a standard API for caching, it's name is JCache. Actually, Spring Cache is pretty similar to it, many classes and even the annotations have the same or equivalent names.

We use Redisson as Redis client, which I believe provides an implementation for JCache. I think there's a lot of overlap with your implementation, including the annotations and some classes or interfaces such as CacheManager.

Additionally, Redisson supports the reactive API and I think they have support for that in their JCache implementation.

I think there are chances that your implementation would be greatly simplified, an the problems that I'm pointing out in my review would go away (or mostly) with this approach.

I'm not completely sure, but it's likely to be a matter of glueing things together: Redisson and its JCache implementation, with our use cases.

@thiagohora
Copy link
Contributor Author

It's admirable that you went ahead with an implementation that tries to solve the problem in a general way, beyond just caching the automation rules.
This is also heavily based on the Spring Cache approach, which provides an interface with known annotations (@Cacheable, @CachePut and @CacheEvict).
However, this has many non-trivial edge cases and building a library like this is a problem by itself that requires a lot of focus. In our case it has to be just good enough, but already has some challenges: reactive vs non-reactive methods, distinguishing between types (collections and non-collections currently) etc. and this moves us away from focusing on the business cases strictly related to Opik.
I was wondering if you investigated some alternative that we can just use. Some examples:

  1. Something in Dropwizard.
  2. Some Java Library than can be wrapped around the Redis.
  3. Integrating some parts of Spring Cache.

I'd be fine to move forward, but it's desirable to investigate this first, in order to keep focusing on building Opik use cases.

Yeah, I looked for something similar to Spring Cache but couldn't find it. The closest one was https://github.com/bazaarvoice/dropwizard-caching-bundle, but it's archived. I also found some others using ehcache and caffeine, but all seem old and without maintenance.
https://github.com/gabber12/dropwizard-caffeine-configurator
Anyway, I agree with you it's definitely not a trivial problem, but I didn't try to implement a solution to all scenarios. Instead,d I tried to go with something small that would cover our basic scenario, and if needed, we could continue building it.

Discussed offline, but leaving my proposal here as well. However, it requires some investigation from your side:

In Java there's a standard API for caching, it's name is JCache. Actually, Spring Cache is pretty similar to it, many classes and even the annotations have the same or equivalent names.

We use Redisson as Redis client, which I believe provides an implementation for JCache. I think there's a lot of overlap with your implementation, including the annotations and some classes or interfaces such as CacheManager.

Additionally, Redisson supports the reactive API and I think they have support for that in their JCache implementation.

I think there are chances that your implementation would be greatly simplified, an the problems that I'm pointing out in my review would go away (or mostly) with this approach.

I'm not completely sure, but it's likely to be a matter of glueing things together: Redisson and its JCache implementation, with our use cases.

As we discussed offline, the Jcache implementation is focused on CacheManagers and its APIs. There are no benefits concerning cache resolution and abstraction. Having said that, we agree to keep thinking about it and improving this implementation in subsequent iterations.

Copy link
Collaborator

@andrescrz andrescrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few more comments, but I'm happy for this to move forward and address them in a follow-up PR, as you need to send part 2 anyway.

Thanks for the whole investigation about JCache. Very well done!

@@ -198,6 +198,16 @@
<artifactId>httpclient5</artifactId>
<version>5.2.3</version>
</dependency>
<dependency>
<groupId>org.mvel</groupId>
<artifactId>mvel2</artifactId>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be ok to go with this as well.

@Retention(RetentionPolicy.RUNTIME)
@Inherited
@Documented
public @interface CacheEvict {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed the JCache approach, but what about just reusing their interface annotations. Any blocker?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JCache annotations only have a cache name property and other configuration classes, such as a cache resolver and a key generator (which must be implemented using reflection based on the method signature). For those reasons, I kept these annotations, which have a cache name and key. We use MVEL to solve the key expression. I believe this is less verbose and a bit more flexible.

key = getKeyName(name, keyAgs, invocation);
} catch (Exception e) {
// If there is an error evaluating the key, proceed without caching
log.warn("Cache will be skipped due to error evaluating key expression");
Copy link
Collaborator

@andrescrz andrescrz Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I think it's better you remove the catch entirely from getKeyName and just allow exceptions to bubble up as you're having a catch at Exception level here anyway. You don't need the conversion to IllegalArgumentException. Then, just move the exception log to this catch.

private Mono<Object> processCacheEvictMethod(MethodInvocation invocation, boolean isReactive, String key) {
if (isReactive) {
try {
return ((Mono<?>) invocation.proceed())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: maybe use map to convert these types and just remove the casting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't understand. The invocation.proceed() returns Object, I need to cast to go from Object to Mono. Maybe it didn't understand you properly.

Comment on lines +276 to +278
if (evaluatedKey.isEmpty() || evaluatedKey.equals("null")) {
throw new IllegalArgumentException("Key expression cannot return an empty string");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my other comment. This IllegalArgumentException is going to go to the catch statement below and wrapped into another IllegalArgumentException. You should be good to just remove this try-catch as it's just a catch and re-throw pattern that doesn't provide much value in this sceneario.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is to avoid the use of an empty string and a null string. Because otherwise it would be a valid string.

@@ -55,6 +55,8 @@
/// - **Openai**: runs against a demo server and doesn't require an API key
/// - **Anthropic**: set `ANTHROPIC_API_KEY` to your anthropic api key
@TestInstance(TestInstance.Lifecycle.PER_CLASS)
@Disabled
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can clean this up. It's no longer necessary. These tests have been disabled in another PR.

@thiagohora thiagohora merged commit 272dfb8 into main Jan 14, 2025
8 checks passed
@thiagohora thiagohora deleted the thiagohora/OPIK-595_redis_caching_part_1 branch January 14, 2025 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants