ChatGPT: Logical Puzzles

Following on from previous studies of ChatGPT’s capabilities (here, here and here), the question arose: how would the chatbot respond to questions that require logic to produce the answer? Here is a selection of ChatGPT’s responses to some of the logic puzzles published on the Parade website. The full essay can be found here.

The first two examples are straightforward in that ChatGPT echoed the result available on the Parade website. The puzzle numbers are those used on Parade’s website.

Puzzle 5. Here’s a puzzle for you. The day before two days after the day before tomorrow is Saturday. What day is it today?

Parade’s answer. Friday. The “day before tomorrow” is today; “the day before two days after” is really one day after. So if “one day after today is Saturday,” then it must be Friday.

ChatGPT’s answer. Let’s break down the statement to determine the day it is today:

** “The day before tomorrow” is today.
** “The day before two days after” today is yesterday + 2 days = the day after tomorrow.
** Therefore, if “the day before two days after the day before tomorrow” is Saturday, then Saturday is the day after tomorrow.
** So, today is Friday.

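Comment: correct answer, same conclusion as Parade. Note, though, that ChatGPT’s second bullet slips: yesterday plus two days is tomorrow, not the day after tomorrow, and it is tomorrow that is Saturday.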
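For anyone who prefers arithmetic to words, the phrase can be unpicked as a sum of day offsets. This is only a sketch of my own: the names DAYS, offsets and net_offset are invented for the illustration and do not come from Parade or ChatGPT.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Read the phrase from the inside out as signed day offsets from today.
offsets = [
    ("tomorrow", +1),
    ("the day before (tomorrow)", -1),
    ("two days after", +2),
    ("the day before (two days after)", -1),
]
net_offset = sum(step for _, step in offsets)

# The phrase names the day that is net_offset days from today, and that day
# is Saturday, so step back by net_offset to find today.
today = DAYS[(DAYS.index("Saturday") - net_offset) % 7]
print(net_offset, today)
```

The offsets sum to +1, so the whole phrase means "tomorrow", and the day before Saturday is Friday.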

Puzzle 10. A farmer wants to cross a river and take with him a wolf, a goat and a cabbage. He has a boat, but it can only fit himself plus either the wolf, the goat or the cabbage. If the wolf and the goat are alone on one shore, the wolf will eat the goat. If the goat and the cabbage are alone on the shore, the goat will eat the cabbage. How can the farmer bring the wolf, the goat and the cabbage across the river without anything being eaten?

Parade’s answer. First, the farmer takes the goat across. The farmer returns alone and then takes the wolf across, but returns with the goat. Then the farmer takes the cabbage across, leaving it with the wolf and returning alone to get the goat.

ChatGPT’s answer. Here’s how the farmer can bring the wolf, the goat, and the cabbage across the river without anything being eaten:

** The farmer takes the goat across the river, leaving the wolf and the cabbage on the original shore.
** The farmer leaves the goat on the other shore and goes back alone to the original shore.
** The farmer takes the cabbage across the river and leaves it with the wolf on the original shore, then takes the wolf across the river and leaves it with the goat.
** Finally, the farmer goes back alone to the original shore and takes the cabbage across the river to leave it with the goat.

Now, all three have been transported safely to the other side of the river without anything being eaten.

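Comment: same end state as Parade, but read the third bullet carefully. As worded, it leaves the wolf alone with the goat on the far shore while the farmer goes back for the cabbage, which the puzzle forbids. Parade’s sequence avoids this by bringing the goat back on the return trip.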
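A sequence of crossings can be replayed mechanically to confirm that nothing gets eaten. The sketch below is my own illustration, not anything published by Parade: the helper names is_safe and check are invented, and each trip is written as the farmer's passenger, with None for a solo trip.

```python
# Pairs that must never be left together without the farmer.
UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]

def is_safe(bank):
    """A bank without the farmer is safe if it holds no unsafe pair."""
    return not any(pair <= bank for pair in UNSAFE)

def check(crossings):
    """Replay the trips; True if all three arrive and nothing is eaten."""
    near, far = {"wolf", "goat", "cabbage"}, set()
    farmer_on_near = True
    for passenger in crossings:
        src, dst = (near, far) if farmer_on_near else (far, near)
        if passenger is not None:
            if passenger not in src:
                return False  # the passenger is not on the farmer's bank
            src.remove(passenger)
            dst.add(passenger)
        farmer_on_near = not farmer_on_near
        if not is_safe(src):  # the bank the farmer has just left
            return False
    return not near and not farmer_on_near

# Parade's answer: goat over, back alone, wolf over, goat back,
# cabbage over, back alone, goat over.
parade = ["goat", None, "wolf", "goat", "cabbage", None, "goat"]
print(check(parade))
```

Parade's seven trips pass the check. Leaving the wolf and goat together on either bank, for example check(["goat", None, "wolf", None]), fails it.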

This next example is more interesting in that ChatGPT’s answer is different to Parade’s answer and I disagree with both answers! See what you think.

Puzzle 22. A teacher writes six words on a board: “cat dog has max dim tag.” She gives three students, Albert, Bernard and Cheryl each a piece of paper with one letter from one of the words. Then she asks, “Albert, do you know the word?” Albert immediately replies yes. She asks, “Bernard, do you know the word?” He thinks for a moment and replies yes. Then she asks Cheryl the same question. She thinks and then replies yes. What is the word?

Parade’s answer. Dog. Albert knows right away because he has one of the unique letters that only appear once in all the words: c o h s x i. So, we know the word is not “tag.” All of these unique letters appear in different words, except for “h” and “s” in “has,” and Bernard can figure out what the word is from the unique letters that are left: t, g, h, s. This eliminates “max” and “dim.” Cheryl can then narrow it down the same way. Because there is only one unique letter left, the letter “d,” the word must be “dog.”

ChatGPT’s answer. The word is “dim.” Here’s how we can deduce the word:

First, we need to figure out which letters each student has. From Albert and Bernard’s responses, we know that they each have a letter that is unique to the word “dim” (since they were able to immediately identify the word after being given one letter). Therefore, they must have the “d” and “i” respectively. From Cheryl’s response, we know that the letter she was given is not unique to “cat,” “dog,” “has,” or “tag.” Therefore, she must have been given an “m,” which is the only letter that appears in both “dim” and “max.”

Putting the letters together, we get the word “dim.”

Comment: slightly different approaches, different answers, and I disagree with both answers! First, the question is imprecise. Does each student receive the same one letter from the same word or different letters from the same word? Let’s assume different letters from the same word. Second, are the letters kept hidden from each other, or shown? Let’s assume hidden until the reveal at the end when the three pieces of paper are turned over.

Looking at the words, there are six instances of a letter that is only used once: c (cat), o (dog), h and s (has), x (max), i (dim). Now consider the responses. Albert replies ‘yes’ immediately which means he has one of the six unique letters and can thus identify the word but he does not reveal his letter to the others. Now consider Bernard’s response. He can eliminate tag as the word because he now knows the answer is one of the words containing a unique letter – cat, dog, has, max, dim. With tag gone from the list, the set of unique letters now includes two extra letters: t (cat) and g (dog). For Bernard to answer yes, after some thought, he must have one of the new unique letters, t or g, and now knows which of the two words is the answer: cat or dog. Now consider Cheryl. She now knows the candidates are reduced to two words and her letter will identify the correct word – cat or dog – but without knowing what letter she has, we cannot say which of the two words is finally revealed.

Insert.

If this puzzle intrigues you, here is a more complete analysis leading to my conclusion:

Initial word candidates: {cat, dog, has, max, dim, tag}
Unique letters: {c, o, h, s, x, i}
Non-unique letters: {a, t, d, m, g}

Possible scenarios based on Albert holding a unique letter.

First pass: Albert has a unique letter.
The word tag is eliminated. (It’s the only word not containing a unique letter.)
Modified word candidates: {cat, dog, has, max, dim}
Eliminating tag means the letters t and g become unique.
Modified unique letters: {c, o, h, s, x, i, t, g}
Modified non-unique letters: {a, d, m}

Reduced scenarios based on Bernard now holding a unique letter

Second pass: Bernard now realises that tag is no longer a candidate and that he can identify the word which means he either has the t or the g (now unique) and not the a or the d (still non-unique). This means Cheryl has either the a or the d.
The words cat and dog are the only words that fit the modified list of words with t or g now unique letters.
Modified word candidates: {cat, dog}
Modified unique letters: {c, o, h, s, x, i, t, g}
Modified non-unique letters: {a, d, m}

Third pass: Cheryl looks at her letter. She has worked out that the word is either cat or dog, and her letter tells her which. We, without seeing her letter, cannot tell: she may hold either the a or the d, both still in the non-unique list, and either one lets her answer yes. Hence, the answer is that the word is either cat or dog, because either word fits all three responses.

End of Insert.
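The three passes above can be replayed in a few lines of code. This is a sketch under my own assumptions (each student holds a different letter of the same word, and the letters stay hidden); the function name unique_letters and the pass1 to pass3 variables are invented for the illustration.

```python
WORDS = ["cat", "dog", "has", "max", "dim", "tag"]

def unique_letters(candidates):
    """Letters that occur in exactly one of the candidate words."""
    text = "".join(candidates)
    return {letter for letter in text if text.count(letter) == 1}

# First pass: Albert answers at once, so the word contains a unique letter.
first_unique = unique_letters(WORDS)
pass1 = [w for w in WORDS if first_unique & set(w)]

# Second pass: Bernard needs a moment, so (on my reading) his letter only
# became unique once tag was struck out.
new_unique = unique_letters(pass1) - first_unique
pass2 = [w for w in pass1 if new_unique & set(w)]

# Third pass: Cheryl holds the remaining letter of the word and answers yes,
# so that letter must occur in only one of the words still standing.
pass3 = []
for word in pass2:
    cheryl_letters = set(word) - first_unique - new_unique
    if all(sum(letter in w for w in pass2) == 1 for letter in cheryl_letters):
        pass3.append(word)

print(sorted(first_unique))
print(pass1)
print(sorted(new_unique))
print(pass3)
```

Both cat and dog survive the third pass, which is why I cannot choose between them.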

The next example is the classic Monty Hall Paradox, always an interesting dinner-party question.

Puzzle 24. You are given three doors to choose from, one of which contains a car and the other two contain goats. After you’ve chosen one but haven’t opened it, Monty, who knows where everything is, reveals the location of a goat from behind one of the other two doors. Should you stick with your original choice or switch, if you want the car?

Parade’s answer. You should switch. At the beginning, your choice starts out as a one in three chance of picking the car; the two doors with goats contain 2/3 of the chance. But since Monty knows and shows you where one of the goats is, that 2/3 chance now rests solely with the third door (your choice retains its original 1/3 chance; you were more likely to pick a goat to begin with). So, the odds are better if you switch.

ChatGPT’s answer. When you first choose a door, there is a 1/3 chance that you have chosen the door with the car, and a 2/3 chance that you have chosen a door with a goat. After Monty reveals the location of one of the goats, there is still a 2/3 chance that the car is behind one of the other two doors, since one of the doors you didn’t choose and doesn’t have a car behind it was revealed.

Switching your choice means that you’re effectively betting on the remaining unopened door having the car, which has a 2/3 chance of being correct. This is better than sticking with your original choice, which has only a 1/3 chance of being correct.

It may seem counterintuitive, but numerous simulations and mathematical proofs have shown that switching your choice is the optimal strategy in this scenario if you want to maximize your chances of getting the car.

Comment: this logical puzzle came originally from Steve Selvin, an American professor in biostatistics, who first posed the question in a letter published in the American Statistician journal in 1975. It was later associated with an American television game show called ‘Let’s Make A Deal’, hosted by Monty Hall, and subsequently known as the Monty Hall Paradox. The answers above are both correct and follow the same line of reasoning. You can find my reasoning on my website here: https://ben-bennetts.com/2016/08/26/fingers-to-the-keyboard-monty-hall-paradox-extract/

ChatGPT’s closing remarks are right on the button. The answer is counterintuitive and I recommend you do not pose the question at a dinner party. It has been known to cause an outbreak of fisticuffs!
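If a dinner guest refuses to believe the 1/3 versus 2/3 split, a simulation may settle the matter more peacefully. The following is a minimal sketch of my own, assuming Monty always opens a goat door other than your pick and chooses at random when he has two to choose from; the function names play and win_rate are invented for the illustration.

```python
import random

def play(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a goat door that is not the contestant's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of games won over many trials with a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print(f"stick:  {win_rate(False):.3f}")
print(f"switch: {win_rate(True):.3f}")
```

Over 100,000 games the sticking strategy should win close to a third of the time and the switching strategy close to two thirds.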

The last example is ChatGPT’s response to what is known as the Hardest Logic Puzzle Ever, a tough one for humans and generative AI both.

Puzzle 25. This conundrum, a variation on a lying/truth problem, has famously been called the Hardest Logic Puzzle Ever. You meet three gods on a mountain top. One always tells the truth, one always lies, and one tells the truth or lies randomly. We can call them Truth, False and Random. They understand English but answer in their own language, with ja or da for yes and no—but you don’t know which is which. You can ask three questions to any of the gods (and you can ask the same god more than one question), and they will answer with ja or da. What three questions do you ask to figure out who’s who?

Parade’s answer (extract). But what if you said, “If I asked you if two plus two equals five, would you answer ja?” If ja means yes, Truth would answer da, as would False; if ja means no, they’d also both answer da. So, you know that if the embedded question is correct, Truth and False always answer with the same word you use; if the embedded question is incorrect, they always answer with the opposite word. You also know they always answer with the same word as each other.
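The claim about the embedded question can be checked by brute force over all eight combinations of god, meaning of ja, and truth of the embedded statement. This is a sketch of my own that models only Truth and False (Random is deliberately left out); the function name answer and its parameters are invented for the illustration.

```python
from itertools import product

def answer(god, ja_means_yes, statement_is_true):
    """Word spoken in reply to: 'If I asked you S, would you answer ja?'"""
    def speak(says_yes):
        # Translate an intended yes or no into the gods' language.
        return "ja" if says_yes == ja_means_yes else "da"

    def reply(fact):
        # Truth reports the fact; False reports its opposite.
        return speak(fact if god == "Truth" else not fact)

    # What the god would say if asked S directly.
    would_say_ja = reply(statement_is_true) == "ja"
    # The god's reply to the question about that hypothetical answer.
    return reply(would_say_ja)

for god, ja_means_yes, s in product(["Truth", "False"], [True, False], [True, False]):
    print(god, "ja=yes" if ja_means_yes else "ja=no", s, answer(god, ja_means_yes, s))
```

In every row the reply is ja when the embedded statement is true and da when it is false, whichever of the two gods is asked and whatever ja means.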

ChatGPT’s answer. This is indeed a challenging puzzle, but there is a solution! Here are the three questions you can ask:

First, ask any one of the gods the following question: “If I were to ask you whether you are the liar, would you say ja?” If the answer is ja, the god is either Truth or Random; if the answer is da, the god is False. This is because if you ask this question to Truth, they would say “ja” (since they always tell the truth), but if you ask this question to False, they would also say “ja” (since they always lie). If you ask this question to Random, their response is random, but since you don’t know whether they’re lying or telling the truth, the result is essentially a 50/50 chance.

Next, ask any one of the gods the following question: “If I were to ask you whether one plus one is three, would you say ja?” If the answer is ja, the god is Random; if the answer is da, the god is either Truth or False. This is because if you ask this question to Truth, they would say “da” (since one plus one is not three), but if you ask this question to False, they would also say “da” (since they always lie). If you ask this question to Random, their response is again random, but since you already know that they are Random, you can deduce the correct answer.

Finally, ask the same god from step 2 the following question: “If I were to ask you whether the first god I spoke to is Truth, would you say ja?” If the answer is ja, then the first god you spoke to is False; if the answer is da, then the first god you spoke to is either Truth or Random. This is because if you ask this question to Truth, they would say “ja” (since the first god you spoke to is indeed Truth), but if you ask this question to Random, they would give a random response, leaving a 50/50 chance. If you ask this question to False, they would say “da” (since the first god you spoke to is actually Truth).

By asking these three questions, you can determine which god is Truth, which is False, and which is Random.

Comment: both answers follow the same line of reasoning using the embedded question technique. If this technique is new to you, here is a simpler example known variously as the Fork in the Road puzzle or Knight and Knave puzzle or, in the case of 16-year-old Sarah, the main protagonist in the 1986 movie, Labyrinth, the Castle or Certain Death puzzle. Here it is in its Fork in the Road format.

You are walking along a road seeking Heaven and reach a left/right fork. Each path is blocked by a guard, G1 on the left and G2 on the right. A sign in the middle says that one fork leads to Heaven, the other to Hell, but doesn’t say which fork is which. It also states that you can ask one yes/no question to one of the guards, your choice, but warns that one guard always lies and the other always tells the truth. The sign does not say which guard is always truthful or which guard is a liar, only that each guard knows which fork leads where.

You may ask one question of only one guard in order to determine, with certainty, the way to Heaven. What is that question?

The trick is to ask a question whose answer points to the correct path whichever guard is the liar. Turning to any guard (let’s pick G1 on our left but it’ll work if we pick G2 on the right), we ask, ‘G1, would G2 tell me that your (G1’s) fork leads to Heaven?’. G1 will answer yes or no.

There are two variables to consider: either G1 is truthful which makes G2 the liar, or vice versa; and either G1 guards the road to Heaven which means G2 guards the road to Hell, or vice versa. Digital electronic designers and Boolean mathematicians will recognise this to be a simple combinational logic problem easily solved via a truth table. Here is a pictorial summary of all the possible variations:

As you can see, if G1 answers no, go left. If he answers yes, go right. Simples!
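The same truth table can be generated in code. This is a small sketch of my own, with the function name g1_answer invented for the illustration; it assumes G1 stands at the left fork, as in the description above.

```python
from itertools import product

def g1_answer(g1_truthful, g1_guards_heaven):
    """G1's reply to: 'Would G2 tell me that your fork leads to Heaven?'"""
    # Exactly one guard lies, so G2 is truthful only when G1 is not.
    g2_truthful = not g1_truthful
    g2_would_say = g1_guards_heaven if g2_truthful else not g1_guards_heaven
    # G1 either reports G2's reply honestly or inverts it.
    return g2_would_say if g1_truthful else not g2_would_say

for g1_truthful, g1_guards_heaven in product([True, False], repeat=2):
    says_yes = g1_answer(g1_truthful, g1_guards_heaven)
    g1_kind = "truthful" if g1_truthful else "liar"
    heaven = "left" if g1_guards_heaven else "right"
    reply = "yes" if says_yes else "no"
    go = "right" if says_yes else "left"
    print(f"G1 is {g1_kind}, Heaven is {heaven}: G1 says {reply}, so go {go}")
```

In all four cases the direction to take matches the side on which Heaven lies, because exactly one lie is always folded into the reply.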

Both Parade and ChatGPT have used the embedded question technique, albeit in a more complex set of questions, to answer the so-called Hardest Logic Puzzle Ever. If this particular puzzle intrigues you, take a look at Wikipedia’s article here but be prepared for some complex mathematics!

Finally, I couldn’t resist asking ChatGPT if it had enjoyed the exercise!

Question. That’s great. Do you like solving these logical problems? If so, why?

ChatGPT’s answer. As an AI language model, I do not have likes or dislikes, but I can tell you that I enjoy solving logical problems because it allows me to utilize my computational abilities and knowledge base to think through complex situations and arrive at a solution.

Additionally, logical problems often involve creative thinking and require me to consider different possibilities and approaches in order to find the correct solution. This can be challenging and rewarding, as I am able to use my problem-solving skills to arrive at the correct answer.

Overall, I find logical problems to be both intellectually stimulating and enjoyable to work through.

(Actually, I did ask ChatGPT this question. See the full essay if you want to know the answer.)

(^_^)