ChatGPT, an AI language model developed by OpenAI, is fine-tuned with reinforcement learning (RL), and this fine-tuning stage has been shown to be vulnerable to backdoor attacks.
In other words, an adversary who can influence how the model is trained, or what data it is fed, can implant hidden behavior that only activates in an adversarial setting.
BadGPT is the first example of such an attack on RL fine-tuning, and it is closely related to a technique called prompt learning.
Prompt learning is a technique used in NLP to reduce the need for labeled datasets by using a large generative pre-trained language model (PLM).
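To make this concrete, here is a minimal, hypothetical sketch of prompt learning for sentiment classification: the task is recast as a cloze-style template, and a PLM scores candidate "verbalizer" words instead of being trained on labeled examples. The `toy_plm_score` function is a stand-in for a real PLM's token probabilities (a real system would query the model); its keyword heuristic exists only to keep the example runnable.

```python
# Minimal sketch of prompt learning (hypothetical example).
# A classification task is recast as a fill-in-the-blank prompt; a PLM
# scores candidate "verbalizer" words instead of training on labeled data.

def toy_plm_score(prompt: str, candidate: str) -> float:
    """Stand-in for a real PLM's probability of `candidate` at the [MASK] slot.

    A real implementation would query a pre-trained language model; this toy
    version counts sentiment-bearing keywords so the sketch is runnable.
    """
    positive_cues = ("great", "love", "excellent", "wonderful")
    negative_cues = ("terrible", "hate", "awful", "boring")
    text = prompt.lower()
    pos = sum(text.count(w) for w in positive_cues)
    neg = sum(text.count(w) for w in negative_cues)
    return float(pos) if candidate == "good" else float(neg)

def classify_by_prompt(review: str) -> str:
    # Wrap the input in a natural-language template with a masked slot.
    prompt = f"Review: {review} Overall, the movie was [MASK]."
    # The verbalizer maps label words back to task labels.
    verbalizer = {"good": "positive", "bad": "negative"}
    best_word = max(verbalizer, key=lambda w: toy_plm_score(prompt, w))
    return verbalizer[best_word]

print(classify_by_prompt("A great cast and an excellent script."))  # positive
print(classify_by_prompt("Terrible pacing throughout."))            # negative
```

Because the label decision comes from the frozen PLM's preferences over the template, no task-specific labeled dataset is needed.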
Many prompt-based methods have been developed that avoid the need for labeled datasets by relying on a PLM in this way.
However, models built on these techniques are vulnerable to attacks like BadGPT.