diff --git a/index.md b/index.md
index a691521..e1c9827 100644
--- a/index.md
+++ b/index.md
@@ -7,11 +7,11 @@
- GPT-4
+ GPT-4

Fig.1 GPT-4 safety filters can be bypassed by jailbreaks!

- RPO
+ RPO

Fig.2 RPO enforces harmless responses even after jailbreaks