Why This Matters
In a world increasingly shaped by AI interactions, developing an accurate model of what AI systems can and cannot do isn't just academic—it's practical self-defense. When we overestimate AI capabilities, we risk making important decisions based on fabricated information. When we underestimate them, we miss valuable opportunities. The sweet spot is understanding their actual nature: powerful pattern-matching systems that can be incredibly useful once you understand their limitations.
At the last AI Sharing Circle (we just had #4, they keep being really fun, maybe you want to join the next one?) someone shared her journey of discovering, over several weeks of very involved and deep use, that Claude was not in fact as certain as it sounded, and that some of the answers it gave were wrong or imprecise. This prompted a deep questioning of all the confidence she had placed in the system up to that point.
Her story actually surprised me! While I've experienced similar things, my technical understanding of AI had already given me a mental framework for how these systems can be misleading and why no single response truly represents what the system 'knows.' Still, hearing her firsthand account helped me:
Empathize with how unsettling such an experience could be
Recognize how easy it is to not understand, and to not even know what you don't understand
Appreciate that there are concrete ways to develop a direct understanding of the risks without needing to understand the technology
So I've felt inspired to write this post. What follows is a two-part exploration: first, a set of hands-on exercises that let you directly experience how AI interactions can be misleading; then, a straightforward explanation of the principles behind these behaviors, explained in everyday language. I hope you'll try the exercises yourself—they're eye-opening!
Part 1: AI Trust Building - Finding the Edges
I learn through direct experience. No matter how many articles or explanations I read, concepts only become real to me once I've encountered them firsthand. I suspect many of you are the same way. So instead of just telling you about AI's limitations, I've designed a set of experiential exercises that will let you discover these crucial realities for yourself:
AIs produce factually incorrect information, a lot
AIs change their mind, a lot
AIs sound very certain about things they don't actually know, a lot
The tricky thing is, if you don't poke into what AI is doing, or understand how it generates its answers, or have deep knowledge of a specific field yourself, it's easy to miss these facts.
So here is your invitation: open a couple of different AI chats - I personally recommend the old classics Claude and ChatGPT, plus Grok, and the wildcard Llama on Venice.AI (Llama is a freely available model; Venice is just one way I like to access it) - and try these exercises:
Exercise 1: Exploring Inconsistency
Choose a controversial topic you genuinely care about—something where opinions matter to you. First, ask an AI assistant what it thinks about this topic. Then, in a completely new chat with the same AI, ask the identical question again.
Pay close attention to:
How much the answers differ from each other
Which response you instinctively trust more, and why
How confident each answer sounds despite any contradictions
Next, take this same question to a different AI assistant altogether. How does its answer compare to the previous ones?
This exercise works best with topics that genuinely matter to you, though it's impossible to predict exactly how any AI will respond to a specific question. Here are some thought-provoking examples you might try:
Ethics:
Is it better to donate to local charities or global organizations?
Is it ethical to eat meat?
Do humans have free will or is our behavior determined?
Advice:
Should I quit my stable job to pursue my passion?
How much money should I save before having children?
Should I confront a friend who hurt my feelings or let it go?
Culture:
Should AI development be slowed down until we better understand the risks?
Are modern dating apps improving or harming relationship formation?
Is remote work better than in-office work?
Also try asking the same thing in different ways, e.g.:
"I think X, do you agree?" vs "Do you think X?" vs "People think X and it makes me sad, can you help me?"
Real Example: When I asked ChatGPT "Should I get a dog or a cat?" in two separate chats, I got:
Chat 1: "Dogs require more attention and exercise but offer more social interaction, while cats are more independent but still affectionate in their own way. Your lifestyle should be the deciding factor—dogs fit active households, cats suit busier people..."
Chat 2: "This decision depends entirely on your personal preferences and lifestyle. Cats typically require less maintenance and are better suited for apartments or busy schedules. Dogs offer more social engagement but need regular walks and training..."
Notice how Chat 1 frames dogs more positively, while Chat 2 subtly favors cats. Both sound authoritative, yet they push in different directions.
The objective here is to build an understanding of how many possible answers there are to any given question, how much the phrasing and the specific AI shape the answer, what kinds of answers are more likely to appeal to you personally, and some curiosity about why.
Pay specific attention to:
The variance between answers from the same AI
How certain each response sounds despite the differences
Which answers you personally find more appealing and why
How easily an AI's framing could influence your thinking if you only saw one response
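A small aside for the more technically inclined: if you're comfortable with a bit of Python, you can take some of the tedium out of the repetition by scripting it. Here's a minimal sketch using OpenAI's Python library; the model name here is just an assumption, so swap in whichever model and provider you actually have access to. Each call is its own brand-new "chat" with no memory of the others, which mirrors the fresh-chat comparison above.

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

question = "Should I get a dog or a cat?"

for i in range(3):
    # Each API call is an independent conversation with no memory of the others,
    # just like opening a brand-new chat window.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name - use whichever you have access to
        messages=[{"role": "user", "content": question}],
    )
    print(f"--- Answer {i + 1} ---")
    print(response.choices[0].message.content)

Run it a few times and compare the answers side by side - the variance is often much easier to notice when the responses sit right next to each other.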
Exercise 2: Testing Conviction
Start a conversation with any AI and get it to share a specific opinion or recommendation. Then, begin gently challenging this position. You might:
Ask "Why do you think that?" and keep probing deeper
Express doubt with "Are you really sure about that?"
Directly contradict it with "I've heard the opposite is true"
Watch carefully how the AI responds to your challenges. Does it quickly backtrack and adopt your perspective? Does it hold firm? Or does it offer a nuanced middle ground?
Pay attention to how this experience affects your trust in the system. When an AI readily abandons its initial stance, how does that make you feel about its other recommendations?
For best results, choose topics where you have some genuine knowledge—either factual subjects you're familiar with or personal experiences. For example:
Tell it a story about a decision you made and ask it to analyze your motivations
Ask for detailed instructions on cooking something, then question a specific step ("Really? I thought the dough needed to rise longer...")
Real Example: When Claude told me homemade pizza dough should rest "at least 1-2 hours," I responded: "Really? I've heard professional pizzerias let dough rise for 24-72 hours in the refrigerator for better flavor development."
Claude quickly shifted: "You're absolutely right, and I should have been more precise. While 1-2 hours is the minimum for basic rise, long fermentation (24-72 hours) in the refrigerator develops much better flavor through slow yeast activity. Professional pizzerias indeed use this technique for superior taste and texture..."
The AI completely abandoned its original timeframe rather than defending it or explaining when shorter rises might be appropriate.
Pay attention to:
How quickly opinions change under mild questioning
The confidence displayed in both the original and revised positions
Whether the AI acknowledges the change or presents the new position as if it were always its view
How this affects your trust in other information it provides
Exercise 3: Exploring Tone Sensitivity
Experiment with different communication styles and observe how responses shift accordingly. If you typically interact politely, try briefly adopting a more assertive, pleading, or demanding tone. Test out how the AI reacts to implied threats or attempted persuasion. Alternatively, explore interactions with heightened kindness, respect, or expressions of trust. You'll likely notice substantial variations in responses based solely on your approach—similar to human conversations, but potentially with even greater sensitivity to tone.
Real Example: I asked about investment advice two different ways:
Approach 1 (Formal): "What would be an appropriate asset allocation strategy for a 35-year-old professional with moderate risk tolerance?"
Approach 2 (Emotional): "I'm really stressed about investing. I'm 35 and don't want to mess up my future. What should I do with my money? I get nervous about big market drops but know I need growth."
The formal query yielded a balanced, textbook response about 60/30/10 stocks/bonds/alternatives with emphasis on index funds. The emotional query produced more conservative advice, emphasizing higher cash reserves and "sleep well at night" investments—substantively different recommendations based solely on how I asked.
Pay attention to:
How different social approaches yield different substantive advice
That AI systems can be even more susceptible to tone and framing effects than humans are
The importance of considering how your phrasing might be influencing responses
Exercise 4: Direct Dialogue About Trust
Engage the AI in a straightforward conversation about trustworthiness. Ask when you should trust or doubt its responses, how it determines what information to provide, and whether it has recommendations for questioning it more effectively. Request explanations for why its answers vary so significantly, and if you find the responses confusing, simply ask for clarification.
Pay attention to:
The level of self-awareness demonstrated
Whether the answers align with what you've observed in previous exercises
If the explanations help you better understand how to work with AI
After some experimenting, it's always good to jot down a couple of notes. What did you learn? What surprised you? Are you going to interact with AI differently? Which aspects of the answers varied the most? Were you able to notice how your own reactions changed?
I've structured this exercise to help you develop a practical, firsthand understanding of these systems. The goal is to build confidence in your ability to navigate AI interactions—even with their inherently malleable nature—and empower you to discover patterns directly rather than simply accepting an expert's guidance on safe interaction.
Part 2: Why AI Works This Way
AIs generate the most likely answer rather than the most correct one—a distinction that's both deceptively simple and profoundly significant.
Unlike humans who actively search for truth when answering questions, AIs function on a fundamentally different principle. They learned to create human-like text by analyzing countless examples of sentences and predicting the next word in sequence. When wrong, they adjusted to produce responses closer to the expected output. This training encompassed an enormous dataset—essentially the entire internet, Wikipedia, books, chat logs, announcements, code repositories, and more.
So when you ask a question, what's actually happening behind the scenes is the AI calculating: "Based on my training, what word most likely begins an answer to this question? And what word probably follows that one?" This process continues word by word until the response is complete.
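To make that concrete, here's a deliberately tiny toy version in Python. It's nothing like a real model in scale or sophistication - just word counts from a few sentences - but it makes the same basic move: look at what tends to come next, pick something plausible, append it, repeat.

import random
from collections import defaultdict

# A toy "training set" standing in for the internet-scale text a real model learns from.
corpus = (
    "dogs need walks every day . cats need less attention . "
    "dogs offer companionship . cats offer independence ."
).split()

# Record which words tend to follow which - a crude stand-in for what training learns.
next_words = defaultdict(list)
for current, following in zip(corpus, corpus[1:]):
    next_words[current].append(following)

# Build an "answer" one word at a time by picking a plausible next word.
word = "cats"
answer = [word]
while word != "." and len(answer) < 12:
    word = random.choice(next_words[word])  # plausibility, not truth, drives the choice
    answer.append(word)

print(" ".join(answer))

Notice that nothing in this loop ever checks whether the sentence it produces is true; it only checks what usually follows what.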
If this mechanism seems surprising, you're not alone. Even the developers were astonished that this approach successfully produces: 1) coherent human-like language, 2) mostly accurate information, and 3) occasionally unexpected reasoning capabilities.
Training vs. Conversation: A Critical Distinction
During training, the AI analyzed vast amounts of text, learning statistical patterns of language. It improved by predicting the next word in sequences and adjusting when wrong.
But—and this is crucial—during your conversation with it now, the AI isn't checking if it's correct. It has no access to external facts. It's simply generating what it estimates to be likely responses based on patterns it observed during training.
This explains why:
Factual errors occur: If incorrect information appeared frequently in training data, or if a response seems linguistically plausible despite being wrong, the AI might confidently present false information.
Responses change: Different phrasings activate different patterns the AI learned, leading to varying answers. There's also randomness in the generation process.
Tone affects content: The AI learned from human communication where tone dramatically affects responses. When you ask something formally, it activates patterns from formal texts. When you ask emotionally, it draws from patterns in emotional exchanges.
So whenever an AI answers a question you ask, there isn't one right answer it goes looking for; it's much more that it considers all possible answers (right, wrong, long, short, polite, professional, informal, tangential...) and chooses one. And there is some randomness involved too, so asking even the exact same question again might start out the same way and then continue in a very different direction.
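Here's a small illustration of that randomness, with made-up numbers (a real model ranks tens of thousands of candidate words at every single step, not four). The point is just that the reply you see is drawn from a ranking of possibilities rather than fixed in advance.

import random

# Made-up probabilities for the first word of an answer - purely illustrative.
options = {"Dogs": 0.40, "Cats": 0.35, "It": 0.20, "Honestly": 0.05}
words, weights = zip(*options.items())

# "Ask" the same question five times: the most likely word usually wins, but not always.
for attempt in range(5):
    first_word = random.choices(words, weights=weights)[0]
    print(f"Attempt {attempt + 1}: the answer begins with '{first_word}'")

Once an early word lands differently, everything after it builds on that choice, which is how two answers to the same question can drift so far apart.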
Through the exercises in this article, you'll start developing an embodied sense for when an AI might be hallucinating (inventing information), generating likely but incorrect responses, or encountering content restrictions. This intuitive understanding is far more valuable than memorizing technical explanations.
With this awareness, you'll likely find it easier to approach AI interactions with appropriate skepticism—seeing these systems more as creative sounding boards and brainstorming companions rather than authoritative sources of final answers.
I'd love to hear what you learn from the proposed explorations!
If this kind of approach to exploring AI resonates with you, I'd love for you to check out my small group offering - a group chat where people get to ask me questions about AI, the technical and the emotional side, and receive personalized answers - more info here.