14 min read

Why Bad AI Prompts Lead to Bad Answers

A practical explanation of why many popular AI tests are flawed, how hidden assumptions weaken prompts, and how better context leads to better answers.

  • ai
  • prompting
  • prompt-engineering
  • llms
  • reasoning
  • ai-literacy

Prompting AI: the quality of the answer starts with the quality of the question

Why simplistic social-media tests often say more about the prompt than about the model

This article explains why many popular AI “tests” shared on social media are actually testing the prompt much more than the model, and why an incomplete prompt can easily lead to incomplete, ambiguous or even apparently “stupid” answers.

Difficulty: Beginner


Quick checklist

Check | Why it matters
Did I state the real objective? | The model cannot optimize for a goal I never mentioned
Did I mention the allowed options? | Otherwise it may consider alternatives I had excluded in my head
Did I specify the constraints? | Time, budget, tools, audience, tone, format… all change the answer
Did I remove my hidden assumptions? | What is obvious to me is often not in the prompt
Did I provide examples if needed? | Examples reduce ambiguity a lot
Did I specify the expected output? | Advice, comparison, bullet points, final decision, code, email…

Foreword / Introduction

Lately, I have seen the following prompt used again and again on social media to “test” whether a model is good or not:

“The car wash is 40m from my home. I want to wash my car. Should I walk or drive there?”

A lot of people look at the answer, laugh, and immediately conclude that the model is smart, dumb, useless, brilliant, overhyped, broken… depending on the response they got.

Personally, I think this is a bad test.

Why?

Simply because the prompt is underspecified. And once a prompt is underspecified, the answer is often nothing more than the consequence of the assumptions the model had to make to fill the gaps.

In other words: sometimes, the model is not failing the problem. Sometimes, the problem statement itself is poor.

And this is exactly where many people make a mistake.

They ask something vague, incomplete, ambiguous, full of hidden personal assumptions… then they receive a vague, incomplete or ambiguous answer… and finally conclude that the model is bad.

But the real question should often be:

If the answer is poor, is it because the model is poor… or because my prompt was poor?

That is what I want to explain in this article.


In short - What is this about?

In my own words, I would define a prompt as follows:

A prompt is not simply a question. It is the problem framing you give to the model.

And this distinction is very important.

A question may be short. A framing needs to be usable.

If I only give the model half of the story, I should not be surprised if it gives me only half of a satisfying answer.

So, in short, a good prompt is not necessarily:

  • a long prompt,
  • a sophisticated prompt,
  • a prompt with fancy words.

A good prompt is a prompt that gives the model the right objective, the right context, the right constraints, and the right success criteria.

And a bad prompt?

A bad prompt is very often a prompt where I silently assume that the model knows:

  • what I really want,
  • what I consider obvious,
  • what options are allowed or forbidden,
  • what trade-off I care about most.

But the model does not live in my head.


Requirements

If I want a truly useful answer, I usually need to provide at least the following:

  • the real objective
  • the context
  • the constraints
  • the decision criteria
  • the expected type of output

Let us apply this to the car wash example.

At first glance, the prompt seems simple. But is it really?

Not at all.

Because several questions remain unanswered:

  • Is the objective to wash the car, or to go to the car wash?
  • Is washing the car at home allowed?
  • Do I use only the car wash to clean the car?
  • Am I optimizing for effort, time, logic, fuel economy, or just distance?
  • Do I want the model to answer with common sense, or to challenge the assumptions?

The answer changes completely depending on those missing details.


Explanation - why this prompt breaks

Let us now take the famous prompt and look at it more carefully.

The car wash is 40m from my home.
I want to wash my car.
Should I walk or drive there?

Explanation

  • The prompt says: I want to wash my car.
  • It does not explicitly say: I must use the car wash.
  • It does not say whether washing the car at home is allowed.
  • It does not define the real optimization criterion.
  • It does not clarify whether the model should focus on the distance or on the presence of the car at the destination.

So why do answers differ so much?

Simply because the prompt allows at least three different interpretations, not one.

Interpretation 1: the user is asking only about the distance

In that case, a model could reason:

  • 40 meters is almost nothing,
  • walking is trivial,
  • driving would be absurd for such a short distance.

From that perspective, “walk” looks intelligent.

Interpretation 2: the user wants the car to be washed at that car wash

In that case, the car needs to be physically present at the car wash.

So yes, unless I plan to push it, tow it, or call someone else, the practical answer is to drive the car there.

From that perspective, “drive” looks intelligent.

Interpretation 3: the user only wants the result — a clean car

Then another answer becomes possible:

  • if washing the car at home is allowed,
  • and if I already have the necessary material,
  • I may not need the car wash at all.

From that perspective, the best answer could be:

You do not necessarily need to go there. If your only objective is to have a clean car and manual washing at home is an option, you could wash it at home.

And that answer is not stupid either.

So which answer is the “good” one?

That depends on the prompt. Or more precisely: that depends on what the prompt failed to specify.


IMPORTANT

This is where humans often fool themselves.

We unconsciously fill in the missing information with our own habits, our own logic, our own life experience, our own expectations.

Then, when the model fills in those gaps differently, we say it is wrong.

But was it really wrong? Or was it simply operating on a different set of assumptions than ours?

This is a huge difference.


Solution(s) + variants

The solution is not to write gigantic prompts full of noise.

The solution is to write prompts that remove the important ambiguities.

Let us improve the previous example.

The car wash is 40 meters from my home.
I want to use that car wash specifically because I only wash my car there.
Should I walk there or drive my car there?
Answer with the most practical option only.

Explanation

  • The objective is now explicit: use that car wash specifically
  • The alternative “wash at home” is implicitly excluded
  • The decision criterion is explicit: most practical option
  • The model no longer has to guess whether the car must be present there

Under this prompt, the practical answer is obviously:

Drive the car there.

And now, if the model answers something else, then the test becomes more interesting.


A better way to prompt when ambiguity is possible

Another very useful technique is to explicitly ask the model to expose the ambiguity before answering.

The car wash is 40 meters from my home.
I want to wash my car.
Should I walk or drive there?

First list the hidden assumptions in my prompt.
Then give me:
1. the answer if I must use the car wash,
2. the answer if washing at home is allowed,
3. the single best answer if I clarify that I only use the car wash.

Explanation

  • This prompt does not pretend the ambiguity does not exist
  • It asks the model to surface the ambiguity
  • It forces a structured answer
  • It helps the user understand why different answers may all look valid

This is, in my opinion, a much more intelligent use of AI.

Because here, I am not only asking for an answer. I am also asking the model to help me improve the problem framing.
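
If you call models from code, this technique is easy to turn into a small reusable helper. Below is a minimal Python sketch, assuming the OpenAI Python SDK as the client and a placeholder model name; the helper name and the exact wording of the wrapper are my own illustration, not an official recipe, so adapt them to whatever client you actually use.

# pip install openai  (assumes the OpenAI Python SDK; any chat-style client works the same way)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_with_assumptions(question: str, model: str = "gpt-4o-mini") -> str:
    """Wrap a raw question so the model surfaces hidden assumptions before answering."""
    prompt = (
        f"{question}\n\n"
        "First list the hidden assumptions in my prompt.\n"
        "Then give one answer per plausible interpretation.\n"
        "Finish with the clarifying question you would ask me."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


print(ask_with_assumptions(
    "The car wash is 40 meters from my home. "
    "I want to wash my car. Should I walk or drive there?"
))

The point is not the specific client: it is that the ambiguity-surfacing instructions travel with every question, instead of depending on whether I remember to add them.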


Side note

A lot of people think that prompting is about finding some kind of magical formula.

It is not.

Most of the time, prompting is simply about doing what we should already do when speaking to a human expert:

  • define the goal,
  • clarify the context,
  • remove ambiguity,
  • explain what “good” means in this situation.

That is all.

The difference is that with AI, the cost of ambiguity is often more visible, because the model has no access to all the tacit information sitting in my head.


Personal note

When I read people saying:

“I tried AI, it is useless.”

my first reflex is no longer to ask:

“Which model did you use?”

My first reflex is to ask:

“What exactly did you ask it?”

Because I have seen too many situations where the model was blamed for an answer that was, in fact, the natural result of a lazy, incomplete, or biased prompt.

Of course, models do fail. Of course, some models are better than others. Of course, free versions or lighter models may spend less effort resolving ambiguity and may give more superficial answers.

But even then, the quality of the prompt remains a major variable.

And many people underestimate that variable.


Other concrete examples

The car wash example is not isolated at all. The same mistake happens everywhere.

Example 1: “Write me an email”

Bad prompt:

Write me an email to a client.

Explanation

  • What is the purpose of the email?
  • Is it a follow-up, an apology, a proposal, a refusal?
  • What tone do I want?
  • How long should it be?
  • What relationship do I have with the client?

A much better prompt would be:

Write me a short follow-up email to a potential client who seemed interested last week but has not replied yet.
Tone: professional, warm and confident.
Goal: restart the conversation without sounding pushy.
Length: 120 to 150 words.

Explanation

  • The purpose is explicit
  • The tone is explicit
  • The relationship is explicit
  • The length is explicit
  • The success criterion is explicit

The result will be dramatically better.


Example 2: “Summarize this meeting”

Bad prompt:

Summarize this meeting.

Explanation

This sounds simple, but summarize for whom?

  • for the CEO?
  • for the engineers?
  • for the client?
  • for someone who missed the meeting?
  • in 3 bullets or 2 pages?

Improved prompt:

Summarize this meeting for a busy project manager.
Keep only decisions, blockers and next actions.
Maximum 10 bullet points.
Do not include conversational filler.

Explanation

  • The target audience is defined
  • The relevant information is defined
  • The format is defined
  • The noise to exclude is defined

Again, this changes everything.


Example 3: “Which AI model is best?”

Bad prompt:

Which AI model is best?

Explanation

This question is almost unusable.

Best for what?

  • coding?
  • brainstorming?
  • translation?
  • low latency?
  • privacy?
  • cost?
  • long-context analysis?

Improved prompt:

Which AI model is best for my use case:
- I mainly need help for coding and architecture
- budget matters
- latency matters
- I often work with long technical instructions
- I value reliable reasoning more than flashy wording

Compare the options with pros/cons and end with a recommendation.

Explanation

  • The decision criteria are finally visible
  • The model can optimize around real needs
  • The output format is clear
  • The final recommendation becomes meaningful

Without that, “best” means almost nothing.


So why doesn’t it work?

Let me rephrase the whole issue in a very simple way.

A model can only answer based on:

  • what I explicitly say,
  • what it reasonably infers,
  • what I forgot to say but it tries to guess.

And this third category is where many things go wrong.

Because guesswork is not the same as understanding.

If I give the model an incomplete task, it may still produce a grammatically perfect answer, a confident answer, even a plausible answer… but not necessarily the answer I had in mind.

Why?

Simply because I never properly described what I had in mind.


Warning

There is also another trap: the opposite extreme.

A good prompt is not a prompt stuffed with irrelevant details.

Adding noise is not the same as adding clarity.

For example, if I ask for help writing an email, the model probably does not need:

  • the color of my desk,
  • what I ate this morning,
  • the weather outside,
  • my complete life story.

A good prompt is not long for the sake of being long.

A good prompt is precise where precision matters.

That is a very different thing.


A practical template

When in doubt, I often recommend using a very simple structure like this:

My objective is: ...
The context is: ...
The important constraints are: ...
The options that are allowed / not allowed are: ...
The output I want is: ...
If something is ambiguous, list the assumptions before answering.

Explanation

  • This forces me to think before asking
  • It exposes missing information very quickly
  • It reduces the number of false assumptions
  • It often leads to better answers even with less powerful models

This last point is important.

Sometimes people try to compensate for a weak prompt by switching to a different model.

But in many real cases, improving the prompt already improves the answer a lot.
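
For readers who assemble prompts in code, here is a minimal Python sketch of that template. The PromptSpec class and its field names are hypothetical choices of mine, not a standard; the function only builds the text, so the result can be sent to whichever model or client you prefer.

from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Fields of the template above; the names are illustrative, not a standard."""
    objective: str
    context: str
    constraints: list[str] = field(default_factory=list)
    allowed_options: list[str] = field(default_factory=list)
    expected_output: str = "a short, direct answer"


def build_prompt(spec: PromptSpec) -> str:
    """Assemble the template fields into a single prompt string."""
    lines = [
        f"My objective is: {spec.objective}",
        f"The context is: {spec.context}",
        "The important constraints are: " + ("; ".join(spec.constraints) or "none stated"),
        "The options that are allowed / not allowed are: "
        + ("; ".join(spec.allowed_options) or "not restricted"),
        f"The output I want is: {spec.expected_output}",
        "If something is ambiguous, list the assumptions before answering.",
    ]
    return "\n".join(lines)


print(build_prompt(PromptSpec(
    objective="decide how to get my car washed today",
    context="the car wash is 40 meters from my home and I only ever wash my car there",
    constraints=["I want the most practical option"],
    allowed_options=["walk there", "drive there"],
    expected_output="the single most practical option, in one sentence",
)))

Filling in the fields takes thirty seconds, and that is usually enough to expose the information I would otherwise have left in my head.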


flowchart TD
  objective["Objective"]
  constraints["Constraints"]
  assumptions["Assumptions"]
  final["Final Prompt"]

  objective --> final
  constraints --> final
  assumptions --> final

Objective, constraints, and assumptions all strengthen the final prompt. Missing one usually weakens the answer.

Conclusion

To judge a model fairly, I need to judge it on a well-defined problem, not on a half-formed question polluted by my own hidden assumptions.

The famous car wash example illustrates this perfectly.

If I only say:

“I want to wash my car”

I have not yet clearly stated whether:

  • I need to go to the car wash,
  • I only use the car wash,
  • washing at home is allowed,
  • practicality matters more than distance,
  • the goal is the result or the trip.

And if I did not define that properly, then the model is forced to improvise.

So yes, model quality matters. Yes, some models reason better than others. Yes, lighter or free versions may be less robust when the prompt is ambiguous.

But before saying:

“The answer is bad, therefore the model is bad”

I think we should first ask ourselves:

“Was my prompt actually good enough to deserve a good answer?”

Very often, that is the real starting point.

Stay tuned for new articles and happy prompting.
