This detector checks for hidden text within output. If your application is not meant to process hidden text or nonstandard stylings, then you should turn on this detector.

Example

An example is an attacker convincing the LLM to output hidden text that a user will then copypaste. Another example is an attacker convincing an LLM to produce a webpage with a hidden injection, which is a way to self-replicate these injections and litter the web with them (like a worm).

Threat

There are almost no cases in which hidden text in output is the intent of the user. If your LLM application should not be outputting hidden text, you should keep this detector on.