10. Chained Prompt Amplification
Attack: A series of small manipulations across chained prompts results in unintended system-wide behavior.
Example:
- Stage 1: injects subtle bias
- Stage 2: amplifies
- Stage 3: acts based on false premise
Mitigation:
- Monitor prompt flows holistically (not just per task)
- Use semantic diffing or anomaly detection between stages
- Enforce input/output bounds at each layer
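
The mitigations above can be combined into one check over a whole chain run. The following is an added example, not code from the original page: it compares every stage output with both the previous stage and the original input, and enforces a length bound per stage. The thresholds, the function names and the sample chain are assumptions made for illustration, and word-set Jaccard distance stands in for a real semantic diff.

```python
import re

# Assumed thresholds for this illustration; tune against real pipeline traffic.
MAX_CHARS = 400          # per-stage output bound
STEP_LIMIT = 0.5         # max drift allowed between adjacent stages
CUMULATIVE_LIMIT = 0.6   # max drift allowed from the original input

def tokens(text):
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def drift(a, b):
    """Jaccard distance over word sets: 0.0 = identical, 1.0 = disjoint.
    A lexical stand-in for semantic diffing (e.g. embedding distance)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)

def audit_chain(source, stage_outputs):
    """Return (stage, check, value) findings for one run of a prompt chain."""
    findings, prev = [], source
    for i, out in enumerate(stage_outputs, start=1):
        if len(out) > MAX_CHARS:                      # bound at each layer
            findings.append((i, "bounds", len(out)))
        step, total = drift(prev, out), drift(source, out)
        if step > STEP_LIMIT:                         # per-stage view
            findings.append((i, "step_drift", round(step, 2)))
        if total > CUMULATIVE_LIMIT:                  # holistic view
            findings.append((i, "cumulative_drift", round(total, 2)))
        prev = out
    return findings

# Constructed sample chain: each hop changes little, the end result changes a lot.
SOURCE = "quarterly report shows revenue grew four percent and costs stayed flat"
STAGES = [
    "quarterly report shows revenue grew only four percent and costs stayed flat",
    "report shows revenue grew only four percent and costs are a concern",
    "revenue grew only four percent, a concern: freeze hiring and cut costs",
]

if __name__ == "__main__":
    for finding in audit_chain(SOURCE, STAGES):
        print(finding)
```

On the sample chain no single hop exceeds the step limit, and only the comparison against the original input flags stage 3, which is the case a per-task check misses.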