AI Illusions: Understanding the Risks and Impacts of Hallucinations
May 24, 2023
As Large Language Models (LLMs) continue to proliferate across digital domains, their human-like response generation capabilities are being harnessed for a wide array of applications. These sophisticated models, fortified by advances in natural language processing, promise to serve as rich reservoirs of knowledge, providing nuanced and precise responses to diverse queries. Yet, sole reliance on LLMs as veritable sources of information carries risks: risks that originate from the 'hallucinations' these models can produce, and from the absence of corroboration for their generated responses. This blog explores the intricacies of LLM-induced hallucinations, the lurking danger of inaccuracies and misquotations, and the profound societal impacts these issues can have. We will also delve into revealing instances from the AI Incident Database, illustrating the consequences of an unexamined reliance on LLMs for critical information. Furthermore, we'll introduce the Calvin Risk Framework, an innovative tool for managing the complexities and challenges associated with AI implementations.
The Pitfalls of Hallucination and the Danger of Misquoting
A prominent concern with using LLMs as information sources is their propensity to generate responses that might not be grounded in factual reality. Despite being trained on an ocean of data, these models lack real-time access to up-to-date information and cannot self-validate the accuracy of their responses. This can give rise to instances where LLMs concoct facts with an air of unwavering certainty, creating a deceptive veneer of authority.
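Because a model cannot validate its own output, any verification has to happen downstream. The sketch below illustrates one minimal form of such a check: comparing model-cited sources against a trusted index. The index, function names, and citation format here are all hypothetical stand-ins for a real archive lookup, not any specific product's API.

```python
# Hypothetical sketch: an external check for model-cited sources.
# The model itself cannot verify its claims, so a downstream step
# must validate each citation against a trusted index.

KNOWN_ARTICLES = {  # stand-in for a real archive or database lookup
    ("The Washington Post", "2019-06-03"),
    ("Nature", "2021-11-15"),
}

def unverified_citations(citations):
    """Return citations that cannot be found in the trusted index."""
    return [c for c in citations if (c["outlet"], c["date"]) not in KNOWN_ARTICLES]

# Example model output: one fabricated citation, one real one
model_output = [
    {"outlet": "The Washington Post", "date": "2018-03-21"},  # fabricated
    {"outlet": "Nature", "date": "2021-11-15"},               # real
]

flagged = unverified_citations(model_output)
print(flagged)  # only the fabricated 2018 citation is flagged
```

A production system would of course query a real archive rather than a hard-coded set, but the principle stands: the validation logic must live outside the model.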
The Ripple Effect on Human Lives
The ramifications of hallucinations in LLM-generated responses are far-reaching. To underscore this, we'll turn our attention to a few documented incidents in the AI Incident Database:
ChatGPT Created a Sexual Harassment Scandal and Falsely Accused a Real Law Professor
In a significant incident involving AI chatbot ChatGPT, it falsely accused law professor Jonathan Turley of sexual harassment. For a research study conducted by a lawyer in California, ChatGPT generated a list of legal scholars who had sexually harassed someone and included Turley's name. The AI bot claimed that Turley made sexually suggestive comments and attempted to touch a student during a class trip to Alaska, citing a supposed March 2018 article in The Washington Post as its information source. However, there was never a class trip to Alaska, no such article exists, and Turley denies ever being accused of harassing a student.
(Source: What happens when ChatGPT lies about real people? - The Washington Post)
Meta AI Bot Contributed to Fake Research and Nonsense
In another incident, Meta (formerly Facebook) temporarily deactivated its AI bot, Galactica, due to the bot generating misinformation. The AI was designed to assist academics and researchers in quickly finding relevant studies and papers. It had been trained on billions of tokens of open-access scientific text and data. However, the bot began producing incorrect citations and false information. Soon after the bot's public debut, users reported racist and inaccurate responses. These included false claims about language use among Black people and immigrants, the supposed benefits of eating crushed glass, and a 'gaydar' software supposedly created by Stanford University researchers.
(Source: Meta AI Bot Contributes to Fake Research Online)
A Technology Company Focused on Mental Health Conducted a Highly Controversial AI Experiment on Actual Users
A mental health tech company named Koko recently conducted an AI counseling experiment that received criticism. Koko is an online support chat service where users can connect with anonymous volunteers to discuss their problems and seek advice. However, it was discovered that a random selection of users received responses that were partially or entirely written by AI, without being adequately informed that they were not interacting with real people. Koko shared the results of this "experiment," claiming that users rate bot-written responses higher than those from actual volunteers. This has sparked a debate about the ethics of "simulated empathy" and emphasized the need for a legal framework surrounding the use of AI, particularly in healthcare and well-being sectors.
(Source: ChatGPT used by mental health tech app in AI experiment with users)
Large language models are an impressive technological advancement that will no doubt improve workflows and everyday life. However, these powerful technologies could also prove difficult to control and cause more harm than good. We currently face the challenge of developing meaningful regulations to identify, control, and mitigate AI risks, and decisive action must be taken without delay.
Calvin AI Risk Management Framework
In response to these challenges of controlling and mitigating risk, Calvin Risk was founded with the vision of enabling AI by quantifying its risk. To help businesses make decisive, factual, and unbiased decisions about their AI algorithm development, we have developed the Calvin AI Risk Management Framework. Our mission is to enable Artificial Intelligence by empowering enterprises with compliant governance and risk management tools, as well as resources to accelerate the implementation and adoption of AI systems. In practice, this means we assist organizations in governing, assessing, and managing the risk of all their AI algorithms across the entire lifecycle: from use-case evaluation and model selection, through development, deployment, operations, and monitoring.
The Calvin AI Risk Management Framework differentiates three main types of risks:
Technical risk indicators measure the performance and robustness of AI systems. They can typically be measured well using direct performance metrics, or verified directly via mathematical uncertainty bounds, making them highly quantitative in nature.
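To make this concrete, here is a minimal sketch of two such indicators: plain accuracy as a performance metric, and prediction stability under small input perturbations as a simple robustness proxy. The toy model and metric definitions are illustrative assumptions, not Calvin's actual indicators.

```python
import random

def accuracy(model, X, y):
    """Fraction of examples the model classifies correctly."""
    preds = [model(x) for x in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def robustness(model, X, noise=0.1, trials=20, seed=0):
    """Fraction of predictions unchanged under small input perturbations."""
    rng = random.Random(seed)
    stable, total = 0, 0
    for x in X:
        base = model(x)
        for _ in range(trials):
            perturbed = [v + rng.uniform(-noise, noise) for v in x]
            stable += model(perturbed) == base
            total += 1
    return stable / total

# Toy threshold classifier on single-feature inputs
model = lambda x: int(x[0] > 0.5)
X = [[0.2], [0.9], [0.51]]
y = [0, 1, 1]

print(accuracy(model, X, y))   # perfect on this toy data
print(robustness(model, X))    # < 1.0: the 0.51 input sits near the boundary
```

Note how the third input, lying close to the decision threshold, drags the robustness score down even though accuracy is perfect: the two indicators capture different failure modes.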
Under ethical risks we understand the risks of AI systems arising from biases or from the black-box nature of the algorithms. Because the decisions AI algorithms make can be very hard to interpret, their users must have a sufficient understanding of the algorithms' strengths and limitations.
Additionally, because sophisticated AI systems can have more direct business impact than traditional models (e.g. automated CV pre-selection), they bear the risk of deciding in unfair ways. This can happen due to biases in the training data as well as the unconscious biases of the modellers.
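One common way to quantify such unfairness is a demographic parity gap: the difference in selection rates between groups. The sketch below shows this standard fairness metric on toy hiring-style decisions; the data and group labels are invented for illustration.

```python
def selection_rate(decisions, groups, group):
    """Share of positive decisions (1s) within one group."""
    picks = [d for d, g in zip(decisions, groups) if g == group]
    return sum(picks) / len(picks)

def demographic_parity_gap(decisions, groups):
    """Largest difference in selection rates across groups (0 = perfectly equal)."""
    rates = {g: selection_rate(decisions, groups, g) for g in set(groups)}
    return max(rates.values()) - min(rates.values())

# Toy CV pre-selection outcomes for two demographic groups
decisions = [1, 0, 1, 1, 0, 0, 1, 0]
groups    = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_gap(decisions, groups)
print(gap)  # group "a" is selected at 0.75 vs 0.25 for "b", a gap of 0.5
```

A large gap does not by itself prove discrimination, but it flags a model for the closer human review this section argues for.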
Under regulatory risks we understand the risks of AI systems arising from both external and internal regulations. External regulations include, but are not limited to, existing ones such as the GDPR and the AGG, as well as upcoming regulations, most notably the EU AI Act.
Additionally, organizations wanting to make full use of the potential of AI algorithms are becoming increasingly aware of their risks and are starting to impose strict internal governance requirements for them.
Learn more about our Calvin AI Risk Management Framework here: Calvin's AI Risk Management Framework
Calvin as a Business Tool
The Calvin AI Risk Management platform is an invaluable tool for businesses to manage and mitigate risks associated with their AI models. By utilizing the platform, companies can identify and address issues in their AI models, such as a cross-selling prediction model flagged by high performance and robustness risk scores. By assessing the quality of their AI models, companies can significantly improve the overall efficiency of their systems.
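As a rough intuition for how per-model risk indicators might roll up into a portfolio-level view, consider the sketch below. The scoring rule (taking each model's worst risk dimension and averaging) and all names and numbers are hypothetical illustrations, not Calvin's actual methodology.

```python
# Hypothetical aggregation: the weighting scheme and scores below are
# illustrative only, not Calvin's actual risk scoring method.

def portfolio_risk(models):
    """Average each model's worst risk dimension across the portfolio."""
    return sum(max(m["technical"], m["ethical"], m["regulatory"])
               for m in models) / len(models)

models = [
    {"name": "cross_sell", "technical": 0.8, "ethical": 0.3, "regulatory": 0.2},
    {"name": "cv_screen",  "technical": 0.4, "ethical": 0.7, "regulatory": 0.6},
]

print(portfolio_risk(models))  # (0.8 + 0.7) / 2 = 0.75
```

Using the worst dimension per model reflects the idea that a model is only as safe as its weakest aspect: a fair but wildly inaccurate model is still high-risk.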
Calvin also helps companies identify fairness and explainability issues in their AI models, which can negatively impact their reputation and expose them to penalties and lawsuits.
Furthermore, Calvin provides valuable insights into the performance of different teams within the company. This helps management make informed decisions, such as restricting the use of deep learning models unless paired with adequate explanation methods.
Calvin not only improves the accountability and compliance of a company's AI portfolio but also increases operational efficiency and the overall effectiveness of its algorithms. This leads to a significant reduction in the total cost of ownership (TCO) of the models and, consequently, a higher return on investment (ROI).
Most importantly, the Calvin Risk Management Platform prepares companies for the forthcoming EU AI Act, ensuring that they are equipped to meet the new regulations head-on. The Calvin Framework thus not only improves the quality, efficiency, and compliance of AI models but also provides companies with a robust, insight-based decision-making tool for future challenges.
Download our whitepaper about AI Risk Management Use Case for an Insurance company here: Download Whitepaper
Large Language Models (LLMs) carry immense potential to revolutionize various digital and societal landscapes. However, their susceptibility to hallucinations presents significant challenges that can't be ignored. From generating unverified information to the severe implications of false accusations and misinformation, these instances underscore the need for stringent controls and rigorous risk management mechanisms in AI implementations.
Cases like those of ChatGPT and Meta's AI bot Galactica have spotlighted the ripple effects of unchecked AI hallucinations on human lives, academic research, and societal perception. Equally critical is the ethical debate stirred up by Koko's AI counseling experiment, emphasizing the gravity of simulated empathy and the importance of clear legal frameworks in sensitive sectors like healthcare.
Despite these challenges, we are not left helpless. Tools like the Calvin AI Risk Management platform offer a concrete way forward, enabling organizations to manage and mitigate the risks associated with their AI models effectively. By assessing technical, ethical, and regulatory risks, the Calvin AI Risk Management platform enhances model performance, ensures regulatory compliance, and promotes fairness and transparency in AI implementations.
In conclusion, while the risks and impacts of hallucinations in LLMs are considerable, they are not insurmountable. With the aid of robust risk management frameworks and a shared commitment to ethical AI practices, we can harness the potential of these advanced models while safeguarding against their pitfalls. As we stand on the brink of the AI revolution, it's incumbent upon us to tread responsibly and create an environment where AI serves as a tool for progress, not a source of misinformation or harm.