Stanford Scientist Proves that ChatGPT is Getting Stupider

GPT-4's accuracy on prime-number problems plummeted from 97.6% in March 2023 to a mere 2.4% in its June 2023 update

Having previously achieved impressive results in some of the world’s most competitive exams, your nerdy homework buddies GPT-3.5 and GPT-4 are experiencing some serious problems and are reportedly getting stupider over time.

Users have long voiced similar concerns about the responses and performance of OpenAI’s chatbot, but these remained anecdotal until a report published on the arXiv preprint server on July 18 backed them up.

Prepared by researchers at Stanford University and the University of California, Berkeley, the report shows how the responses of GPT-3.5 and GPT-4 to certain tasks “have gotten substantially worse over time”.

The report is divided into areas such as math, problem solving and computer code generation, and tracks chatbot performance in each area over a four-month period from March to June 2023.

GPT-4, which had a 97.6% accuracy rate on problems involving prime numbers in March, managed a mere 2.4% in its June 2023 update.

GPT-3.5, on the other hand, improved significantly: its accuracy on prime-number problems rose from 7.4% in March to 86.8% in June.
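To put those swings side by side, here is a minimal sketch that works out the change in percentage points for each model. It uses only the figures quoted in this article; the dictionary layout and the `point_change` helper are illustrative names and do not come from the report itself.

```python
# Accuracy figures (percent) on the prime-number task, as quoted in this article.
# The structure and names below are illustrative assumptions, not the report's code.
reported = {
    "GPT-4": {"march": 97.6, "june": 2.4},
    "GPT-3.5": {"march": 7.4, "june": 86.8},
}


def point_change(march: float, june: float) -> float:
    """Return the change in accuracy in percentage points (June minus March)."""
    return round(june - march, 1)


changes = {model: point_change(**scores) for model, scores in reported.items()}

for model, delta in changes.items():
    direction = "up" if delta > 0 else "down"
    print(f"{model}: {direction} {abs(delta):.1f} percentage points")
# GPT-4: down 95.2 percentage points
# GPT-3.5: up 79.4 percentage points
```

Note that these are percentage-point differences between two snapshots, not relative changes, so they say nothing about what happened in between.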

Famous for their ability to generate code, both GPT-3.5 and GPT-4 were reported to have become far worse at producing ready-to-run scripts. Back in March, GPT-4 responded to coder requests with accurate, ready-to-run scripts over 50% of the time and GPT-3.5 did the same almost 22% of the time, but these rates dropped to a mere 10% and 2%, respectively, in June.

“We don’t fully understand what causes these changes in ChatGPT’s responses because these models are opaque. It is possible that tuning the model to improve its performance in some domains can have unexpected side effects of making it worse on other tasks,” said researcher James Zou.

OpenAI, the creator of GPT-3.5 and GPT-4, was quick to deny these claims, with VP of Product Peter Welinder saying: “We haven’t made GPT-4 dumber. Quite the opposite: We make each new version smarter than the previous one.”

“When you use it more heavily, you start noticing issues you didn’t see before,” he added.

Abdullah Shahid

1 year ago
