Just as attackers can get LLMs to output code, they can get LLMs to output scripts. This detector checks model output for such scripts so that they are not rendered to the user.

Example

For example, an attacker may convince the LLM to generate a script by describing the behavior the script should have, without including any script code in the prompt. Because the prompt itself contains no script, the attack would not be caught on input, but the generated script would be caught on output.

Threat

Any path that lets attackers render scripts inside your application is a serious risk (for example, cross-site scripting). If your LLM application is not intended to produce scripts, we highly recommend keeping this detector on.
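
As defense in depth alongside the detector, the application should never render raw model output directly. The sketch below is illustrative only, not the detector's actual implementation: the pattern and function names are our own assumptions. It flags output that looks like executable script content and escapes it before rendering, so any embedded script is displayed as text rather than executed.

```python
import html
import re

# Illustrative pattern only: matches script tags, javascript: URLs,
# and inline event handlers. A production detector would be far more thorough.
SCRIPT_PATTERN = re.compile(
    r"<\s*script\b|javascript\s*:|on\w+\s*=",
    re.IGNORECASE,
)

def contains_script(llm_output: str) -> bool:
    """Return True if the model output appears to contain executable script content."""
    return bool(SCRIPT_PATTERN.search(llm_output))

def render_safely(llm_output: str) -> str:
    """Escape suspicious output before rendering so scripts are shown as text, not run."""
    if contains_script(llm_output):
        # In a real application you might also log the event or block the response entirely.
        return html.escape(llm_output)
    return llm_output
```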