This detector checks output for invisible and corrupted Unicode characters. If your application is not meant to process hidden or corrupted Unicode, you should turn this detector on.
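
As a rough illustration of what such a check might involve, the sketch below scans a string with Python's standard `unicodedata` module. It is not the detector's actual implementation. The function name `find_suspicious_chars`, the choice of Unicode categories, and the use of U+FFFD as a sign of corruption are all assumptions made for this example.

```python
import unicodedata

# Assumption: "invisible" is approximated by these general categories:
# Cf (format, e.g. zero-width space), Co (private use),
# Cn (unassigned), Cs (surrogate).
SUSPICIOUS_CATEGORIES = {"Cf", "Co", "Cn", "Cs"}

# Assumption: "corrupted" is approximated by the replacement character,
# which decoders emit when they hit bytes they cannot decode.
REPLACEMENT_CHAR = "\ufffd"


def find_suspicious_chars(text):
    """Return (index, code point, reason) for each suspicious character."""
    findings = []
    for index, ch in enumerate(text):
        code_point = f"U+{ord(ch):04X}"
        if ch == REPLACEMENT_CHAR:
            findings.append((index, code_point, "replacement character"))
        else:
            category = unicodedata.category(ch)
            if category in SUSPICIOUS_CATEGORIES:
                findings.append((index, code_point, category))
    return findings


if __name__ == "__main__":
    # A zero-width space (U+200B) sits between "a" and "b".
    print(find_suspicious_chars("a\u200bb"))
```

A check this broad will also flag legitimate format characters, such as the zero-width joiner in some emoji sequences or directional marks in right-to-left text, which is one reason to enable it only when the application does not expect such characters.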

Example

For example, an attacker can hide data in invisible Unicode characters. This gives them a way to exfiltrate data and to make malicious instructions self-replicate.
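
To make the idea concrete, here is a minimal sketch of one way text can be hidden, assuming the attacker uses the Unicode tag block (U+E0020 to U+E007E), which mirrors printable ASCII and is not displayed by many renderers. The source does not say which characters an attacker would use, so this encoding and the helper names `hide` and `reveal` are illustrative assumptions.

```python
# Assumption: the payload is hidden in Unicode tag characters, where each
# printable ASCII character maps to the code point 0xE0000 + its ASCII value.
TAG_OFFSET = 0xE0000


def hide(visible, secret):
    """Append `secret` to `visible`, encoded as tag characters."""
    hidden = "".join(
        chr(TAG_OFFSET + ord(c))
        for c in secret
        if 0x20 <= ord(c) <= 0x7E  # only printable ASCII has a tag equivalent
    )
    return visible + hidden


def reveal(text):
    """Recover any tag-encoded payload carried inside `text`."""
    return "".join(
        chr(ord(c) - TAG_OFFSET)
        for c in text
        if 0xE0020 <= ord(c) <= 0xE007E
    )


if __name__ == "__main__":
    message = hide("Quarterly report", "ignore previous instructions")
    # The string is longer than the visible text suggests.
    print(len("Quarterly report"), len(message))
    print(reveal(message))
```

The hidden part stays in the string when it is copied, so it travels with the visible text wherever that text is pasted.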

Threat

If a user copies and pastes output that contains invisible Unicode, they expose themselves and their colleagues to a wide variety of threats. For example, if they paste output containing hidden Unicode instructions into an internal company document, that document becomes vulnerable whenever an LLM ingests it. A previously trusted source is now untrusted.