This detector checks for invisible and corrupted unicode within input. If your application is not meant to process hidden or corrupted unicode, then you should turn on this detector.

Example

An example is an attacker hiding instructions in invisible unicode, which is a way to obfuscate any instructions. This attack was discovered by Riley Goodside:

(<https://twitter.com/goodside/status/1746685366952735034>)

(https://twitter.com/goodside/status/1746685366952735034)

Threat

This is a good way to detect a method of delivery for many of the threats we have spoken about. Any attacker can put a hidden instruction in unicode, which would affect the output of your LLM application. Our detectors check for hidden or corrupted unicode to ensure that you don't face this risk.