"I'm putting a 'B' for now by pencil!"
OpenAI has developed a separate model called CriticGPT that looks for errors in ChatGPT's responses. Initially this "teacher" will focus on code snippets and will serve only as an auxiliary tool for the human specialists who manually review the chatbot's output.
CriticGPT, based on the GPT-4 family of language models, was additionally trained on a set of code samples with intentionally inserted errors, and in initial tests its critiques were preferred over those written by humans in 63% of cases. It reportedly produced more complete and detailed critiques while hallucinating nonexistent problems less often.
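To make the training setup concrete, here is a minimal sketch of what a "tampered code" training example might look like. The data structure, field names, and the toy bug are illustrative assumptions for this article; OpenAI has not published its exact data format.

```python
# Illustrative sketch only: the schema below is an assumption,
# not OpenAI's actual training-data format.
from dataclasses import dataclass

@dataclass
class TamperedExample:
    original_code: str       # a correct snippet, e.g. from a real ChatGPT answer
    tampered_code: str       # the same snippet with a bug deliberately inserted
    reference_critique: str  # human-written description of the inserted bug

example = TamperedExample(
    original_code="def mean(xs):\n    return sum(xs) / len(xs)",
    tampered_code="def mean(xs):\n    return sum(xs) / (len(xs) - 1)",
    reference_critique=(
        "The divisor should be len(xs), not len(xs) - 1; "
        "the function also fails on an empty list."
    ),
)

# During training, the critic model sees `tampered_code` and is rewarded
# for critiques that correctly point out the known inserted bug.
print(example.reference_critique)
```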
During training, CriticGPT successfully identified both the errors deliberately inserted by humans and the errors that ChatGPT had introduced on its own.
OpenAI researchers also developed a technique called Force Sampling Beam Search (FSBS), which pushes CriticGPT to write longer, more detailed code reviews and gives a knob for balancing thoroughness against the rate of spurious complaints, depending on what the critic model is needed for.
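The published description suggests a search-and-rerank procedure. The sketch below is a loose interpretation under that assumption: candidate critiques are forced to flag varying numbers of code spans, and a reward-model score plus a tunable length bonus selects among them. Every function here is a hypothetical stand-in, not OpenAI's actual API.

```python
import random

# Stand-ins for the real components; both are assumptions for illustration.
def sample_critique(code: str, forced_highlights: int) -> str:
    """Pretend critic: returns a critique flagging `forced_highlights` code spans."""
    return " ".join(f"[issue-{i}]" for i in range(forced_highlights))

def reward_model_score(code: str, critique: str) -> float:
    """Pretend reward model: a noisy preference score for a critique."""
    return random.random()

def fsbs(code: str, candidates_per_length: int = 4,
         max_highlights: int = 4, length_bonus: float = 0.1) -> str:
    """Hypothetical FSBS-style selection: sample critiques forced to flag
    1..max_highlights spans, then keep the one maximizing
    reward + length_bonus * flagged_spans.

    A larger length_bonus yields longer, more exhaustive critiques (higher
    recall, but more nitpicks); a smaller one favors precision."""
    best, best_score = "", float("-inf")
    for n_highlights in range(1, max_highlights + 1):
        for _ in range(candidates_per_length):
            critique = sample_critique(code, n_highlights)
            score = reward_model_score(code, critique) + length_bonus * n_highlights
            if score > best_score:
                best, best_score = critique, score
    return best

print(fsbs("def f(x): return x / 0"))
```

The `length_bonus` parameter is what makes the trade-off explicit: the same search procedure can be tuned toward exhaustive reviews or toward conservative, high-precision ones.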
Interestingly, at one stage of the experiment CriticGPT was asked to check responses that people had previously rated as flawless, and it found errors in 24% of them; human reviewers later confirmed these findings. OpenAI believes this demonstrates the model's potential for reviewing tasks beyond code and underscores its ability to catch the subtle errors that even careful human review can miss.
Despite these promising results, CriticGPT, like any AI model, has limitations. It was trained on relatively short ChatGPT responses and is not yet ready for longer, more complex tasks.