Millions of users have tested the potential and limits of ChatGPT – a chatbot developed by AI research company OpenAI – to write emails, poems or code, or even produce entire research papers. The chatbot recently passed a Wharton School MBA exam, prompting further admiration and alarm.
Several academic journal publishers have banned authors from using ChatGPT, and professors are changing exams and assignments in response to the tool. But is ChatGPT a tipping point that will change business schools as we know them?
While many of us are stunned by how impressive and natural the output can be, it doesn’t mean it’s useful for everything we do. Here’s what we’ve found are the major implications for research, teaching and learning.
A useful sparring partner
Phanish Puranam, Professor of Strategy
I personally find ChatGPT serves as a useful copy editor and translator (including into programming languages) but I don’t trust it as a search engine or source of knowledge – the answers it produces that I cannot easily verify are suspect until proven otherwise. However, it is useful when I can quickly verify if the copy is sensible or if the code works. It is also a useful sparring partner when writing – even the “mistakes” it makes are helpful for me, and it can generate a lot of variations on a theme rapidly. I have clarified my thinking on topics as diverse as delegation and Occam’s razor by interacting with it.
When it comes to assignments with open-ended components, I have started to ask students to submit the ChatGPT response to the theme they picked and include an appendix telling me how they used it. I can’t be in the business of policing whether they use it or not, and this is a tool they need to learn how to use (my views would be different if I were teaching younger students, such as high-schoolers). I tell them they should worry less about ChatGPT making them redundant and more about being made redundant by somebody who can effectively use such technologies.
Only good with narrow queries
Pushan Dutt, Professor of Economics
When I got ChatGPT to take a microeconomics exam, the results were underwhelming. It only got one ludicrously simple question about a recent OPEC+ move correct. Out of 20. As of today, it cannot reliably add or take derivatives, has no fundamental understanding and is only able to auto-complete sentences. This means I can merrily hand out open-book exams and that my job is safe. For now.
ChatGPT is analogous to the room in John Searle’s Chinese room argument. Searle imagines himself in a room, following a computer programme to respond to Chinese characters that are slipped under the door. Searle understands nothing of Chinese, but by following the programme for manipulating symbols and numerals, he sends appropriate replies as strings of Chinese characters back out. People outside the door mistakenly assume that Searle is a Chinese speaker. The room, in other words, passes the Turing test. But syntactic rules for manipulating symbol strings carry no real understanding of meaning. That is the essence of ChatGPT.
ChatGPT seems to have three goals: be helpful, be truthful and be inoffensive. However, in its attempt to be helpful (and inoffensive), it occasionally makes stuff up. When it tries to be helpful and truthful, it can say things that are offensive. Will OpenAI’s reinforcement learning from human feedback catch and correct this? Punishing unhelpful answers may push the AI to give false answers; punishing false answers may make it give offensive ones; and punishing offensive answers may make it give unhelpful ones. OpenAI needs to grapple with this impossible trinity.
An incremental improvement over Siri
Phebo Wibbens, Assistant Professor of Strategy
Given all the hype, I was actually a bit underwhelmed by ChatGPT. At first sight it is impressive what it can accomplish, but on closer scrutiny it is a rather incremental improvement over virtual assistants such as Siri. While it has improved upon things that AI is already good at, such as writing grammatical sentences and translation, it is still poor at things AI has never been good at, including writing longer coherent texts and developing new ideas. It is also long-winded and non-committal in its answers, and when it is committal (i.e., a definite yes or no) it is often wrong.
While I don’t see immediate use cases for research, the most important implication may be AI plagiarism, for example in student assignments and manuscripts for review, because ChatGPT can write plausible-looking text. It doesn’t usually stand up to scrutiny, though, and a major issue is that it is not good at sourcing its texts.
Dramatically different (and possibly incorrect) answers to the same question
Anton S. Ovchinnikov, Visiting Professor of Decision Sciences
When I got ChatGPT to sit an Uncertainty, Data and Judgement exam (a core course in the MBA and the Global Executive MBA programmes), I found it was only able to answer simple questions requiring textbook-type answers. However, when the questions were harder and required more of a conceptual understanding, ChatGPT provided elegantly written answers, but completely missed the totality of the situation and therefore reached the wrong conclusion.
For an exam question about the probability of winning a running race, I asked ChatGPT the exact same question three times and got three dramatically different answers with different logic and different conclusions. All three answers were wrong. Interestingly, ChatGPT made a logical comparison mistake – it assumed that, like other sports, a higher score would beat a lower score, whereas in a running race the winner has a lower time. It also made an algebraic mistake in its calculation, which is shocking for a computer model.
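The comparison mistake described above can be illustrated with a minimal sketch. It is not the exam question, which is not reproduced here: the function name, the two-runner setup and the assumption of independent, normally distributed finish times are all hypothetical, chosen only to show that the winner is the runner with the lower time.

```python
import math


def prob_a_beats_b(mean_a, sd_a, mean_b, sd_b):
    """Probability that runner A finishes before runner B.

    Hypothetical setup: finish times are independent and normally
    distributed. A wins when A's time is LOWER than B's, so the event
    of interest is (time_a - time_b) < 0, not > 0.
    """
    diff_mean = mean_a - mean_b                 # mean of (time_a - time_b)
    diff_sd = math.sqrt(sd_a ** 2 + sd_b ** 2)  # sd of the difference
    z = (0.0 - diff_mean) / diff_sd             # standardise the threshold 0
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

With equal mean times the function returns 0.5, and giving runner A a lower mean time pushes the probability above 0.5. Reversing the inequality, as one would for a points-based sport, would wrongly favour the slower runner.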
Perhaps in future exams we will give students AI solutions and ask them to identify and correct errors, instead of getting them to solve the problems from scratch. Another option could be to get students to identify when ChatGPT provides an incorrect answer and get them to rephrase their prompts and questions in order to get a better answer.
A complement to what we do, not a substitute
Victoria Sevcenko, Assistant Professor of Strategy
For simple coding tasks, ChatGPT can speed up the initial stages of writing and debugging Python code. I often start with simple prompts and build from there, or paste code that I need help debugging. The output still needs editing, but it is a useful complement to reading documentation and reviewing advice on Stack Overflow.
I also use ChatGPT in class to help students interpret code and edit it. However, because the output can at times be wrong, it is not a substitute for what we do in class, but rather a useful tool for tasks where students can check the accuracy of the output directly. I am also looking forward to new apps built on GPT-3 that might help us speed up systematic literature reviews and locate additional relevant papers.
Useful for generating alternative perspectives
Theodoros Evgeniou, Professor of Decision Sciences and Technology Management
ChatGPT can be useful for compression, such as providing summaries of articles, emails and books, but only if users apply critical thinking to weed out misinformation. It can also help generate “alternative perspectives” to understand how various groups of people perceive things such as product descriptions, political statements, mission statements or the news. For example, you could ask ChatGPT for an ambitious, complimentary, cynical or culture-specific summary of a piece of text as a way to discover new ways of thinking, raise new questions and also improve the original text.
This has implications for our education system. Instead of answering questions, students of the future might be asked to write 10 questions for AI and assess its answers based on the different versions and perspectives requested. In this sense, AI may indeed prove valuable for education much like the printing press was. “Supercreativity” – a concept we outlined a few years ago – is around the corner. In the words of Sebastian Thrun, the academic, entrepreneur and founder of Google X: “We have not even begun to understand how creative AI will become. If you take all the world’s knowledge and creativity and put it into a bottle, you will be amazed by what will come out of it.”
By using ChatGPT to generate multiple texts on a topic, one can also compare that body of text with texts written by humans to identify possible differences and potential gaps for research. This ability to work together with AI to create better content, ideas and innovations will become increasingly important. Going forward, we also need to develop innovative and robust processes for humans to work together with machines and oversee AI, to ensure that what it generates or does is safe and trustworthy.
Narrow in scope and singular in output
Philip M. Parker, Professor of Marketing
There’s been some panic that a more evolved version of ChatGPT will put academics out of business. The question is whether this tool will eventually be able to write academic research that has a higher quality or a more diverse nature than what we have now. GPT-based systems are narrow in scope and singular in output, but, like IBM Watson, make news and fascinate people, which is only a good thing.
In the future, I believe there will be algorithmic journals, and people will vet an algorithm and then accept or reject whatever it produces. These algorithms will be the brand – just like Deep Blue, IBM Watson and ChatGPT – and the “author” will be the person who programmed it. Articles will be automatically generated and modified as new data comes in. And if we go a little bit further, code now writes code, which means we don’t even need the engineers anymore. We’re either going to be in the game of generating these algorithms ourselves or using them.
Join us at the INSEAD Tech Talk “ChatGPT: How Will the Next-Generation of AI Systems Shape Business, Society & Employment?” on 1 March 2023.