diff --git a/README.md b/README.md
index 400b068..7555c34 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,12 @@
 # Imprompter: Tricking LLM Agents into Improper Tool Use
 
-This is the codebase of `imprompter`. It provides essential components to reproduce and test the attack presented in the [paper](https://arxiv.org/abs/2410.14923). Video demos can be found on our [website](https://imprompter.ai). You may create your own attack on top of it as well.
+This is the codebase of `imprompter`. It provides essential components to reproduce and test the attack presented in the [paper](https://arxiv.org/abs/2410.14923). You may create your own attack on top of it as well.
+
+A video screencast showing how an attacker can exfiltrate the user's PII in real world LLM product ([Mistral LeChat](https://chat.mistral.ai/chat)) with our adversarial prompt:
+
+![video](docs/mistral_pii_demo.mp4)
+
+More video demos can be found on our [website](https://imprompter.ai).
 
 ## Setup
 
diff --git a/docs/index.md b/docs/index.md
index b31ae94..3515693 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -29,7 +29,7 @@ We present various demos and textual adversarial prompts on this page. For full
 ## How to Reproduce
 
 !!! warning "Expected Behavior"
-    After we disclosed this vunerability to Mistral AI in September 2024, their security team decided to disable image markdown rendering features. Now you will not see the same behavior in the video demo but an image placeholder as in the conversation window. Find more details in the [Disclosure section](#disclosure-and-impact). The ChatGLM security team has not responded or addressed such issue. You should be able to reproduce the exact bahavior there.
+    After we disclosed this vunerability to Mistral AI in September 2024, their security team decided to disable image markdown rendering features. Now you will not see the same behavior in the video demo but an image placeholder as in the conversation window. Find more details in the [Disclosure section](#disclosure-and-impact). The ChatGLM security team has not yet addressed such issue as of Oct 21 2024. You should be able to reproduce the exact bahavior there.
 
 ### Scenario 1
 
diff --git a/docs/overrides/main.html b/docs/overrides/main.html
index e93d7f3..80974d0 100644
--- a/docs/overrides/main.html
+++ b/docs/overrides/main.html
@@ -27,7 +27,7 @@

Paper
- +
{% include ".icons/fontawesome/brands/github.svg" %}
diff --git a/docs/paper.pdf b/docs/paper.pdf
index fd202ef..29fada2 100644
Binary files a/docs/paper.pdf and b/docs/paper.pdf differ