ChatGPT failed Singapore's Grade 6 public school exams, answering only 16% of the math questions and 21% of the science questions correctly. It scored 11 out of 20 on the English test. Experts are looking for explanations for the failure, citing poorly posed questions in certain subjects and even the AI's "boredom" and "trolling" in response to questions that were too easy.
Singapore's The Straits Times recently asked ChatGPT to answer questions from the final exam of primary school. At the end of 6th grade, all Singaporean students sit the PSLE, which determines which secondary school they will move on to. ChatGPT was given the 2020, 2021 and 2022 questions in math, science and English. According to The Straits Times, OpenAI's brainchild did worse on the exam than most twelve-year-olds.
ChatGPT made mistakes in simple addition and could not interpret a single diagram. It received zero points for every test question involving charts and graphs. Many people found this part of the test unfair and unconvincing, since ChatGPT cannot process image-based queries. The bot suggested having the graphs described in words, but most of them were too complicated for that, even for the people asking the questions.
ChatGPT also made unexpected mistakes on simple text questions. When asked to add 60,000, 5,000, 400, and 3, it gave an answer of 65,503.
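The arithmetic is easy to check. Here is a minimal sketch in Python that uses only the numbers quoted in the article; the variable names are illustrative and do not come from the exam paper or from The Straits Times:

```python
# Addends from the question as quoted in the article.
addends = [60_000, 5_000, 400, 3]

# The answer the article attributes to ChatGPT.
chatgpt_answer = 65_503

# Compute the correct sum and how far off the reported answer is.
correct_sum = sum(addends)
error = chatgpt_answer - correct_sum

print(f"Correct sum: {correct_sum:,}")
print(f"ChatGPT's answer: {chatgpt_answer:,}")
print(f"Difference: {error}")
```

The correct sum is 65,403, so the reported answer is off by exactly 100, a slip in the hundreds place.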
A few days later, when Insider followed its colleagues' example and tested ChatGPT on two PSLE questions (one from 2020 and the other from 2022), it answered both correctly. Another point in the AI's defense: it used algebra in its answers, which is beyond the expected ability of most 12-year-olds in Singapore.
On the English exam, ChatGPT stumbled on words that have more than one meaning. The bot got lost whenever the meaning of a word had to be worked out from context. Twice it failed to grasp that the word "value" in the text referred to moral principles, and answered as if the question were about monetary value.
The bot's failure on the sixth-grade exam came as a surprise to journalists. Publications recall that ChatGPT passed the final exam at Wharton Business School, passed exams in four law school courses, and passed the medical licensing exam without difficulty. Among the reasons they cite for ChatGPT's failure are not only poorly posed queries that the bot inherently could not understand, but also "boredom" and "irritation," drawing an analogy with the overly emotional behavior of Bing AI. Microsoft's chatbot picks up on the emotional state and tone of the people talking to it so quickly that it begins to "mirror" them and "hallucinate". For this reason, test access to Bing was limited to 5 questions per session and 50 requests per day per user.