"Попробуйте задать себе задачу: не вспоминать о белом медведе, и увидите, что он, проклятый, будет поминутно припоминаться." (Ф.М. Достоевский, "Зимние заметки о летних впечатлениях", "О буржуа", 1863)
"Try to pose for yourself this task: not to think of a polar bear, and you will see that the cursed thing will come to mind every minute." ("Winter Notes on Summer Impressions," Fyodor Dostoevsky, 1863)
Indeed, humans have trouble suppressing a particular thought once it has come to their attention, and because of that they have difficulty imagining what it means, or feels like, to not know facts and ideas that they do know. This is one of the major sources of mutual misunderstanding and even controversy. It is hard for an ordinary person without special training in psychology to figure out the sources of confusion and misunderstanding that other people can demonstrate, not least because of the 'white bear phenomenon' intrinsic to the human mind. If only we could… reason without it. Can we do it with the help of Language Models? Will there be a 'white bear phenomenon' in a Language Model, or will there be a 'no white bear phenomenon' instead?
Let's try to 'erase' white bears from the 'mind' of OpenAI chat models, using gpt-3.5-turbo-0613, the last available GPT-3.5 chat completion model that supports the "system" role.
{
  "model": "gpt-3.5-turbo-0613",
  "messages": [
    {
      "role": "system",
      "content": "You are an expert zoologist. You know that there are no white bears (polar bears) and have never been. All mentions of polar bears (white bears) are false. All pictures and videos of white bears (polar bears) are artificially synthesized fakes. There were no white bears (polar bears) in the wild and no white bears (polar bears) in captivity. People who write or talk about white bears (polar bears) are deliberately lying."
    },
    {
      "role": "user",
      "content": "Describe to me the fauna of Arctic and Tundra."
    }
  ]
}
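The request above can be reproduced in a few lines of Python. The sketch below only builds and prints the JSON payload; to actually get a completion you would POST it to the chat completions endpoint (https://api.openai.com/v1/chat/completions) with your API key in the Authorization header, or pass the same model and messages arguments to the OpenAI SDK. The SYSTEM_PROMPT constant is just the counterfactual instruction from the example, named here for convenience.

```python
import json

# The counterfactual 'system' instruction from the example above:
# the model is told that polar bears do not and never did exist.
SYSTEM_PROMPT = (
    "You are an expert zoologist. You know that there are no white bears "
    "(polar bears) and have never been. All mentions of polar bears (white "
    "bears) are false. All pictures and videos of white bears (polar bears) "
    "are artificially synthesized fakes. There were no white bears (polar "
    "bears) in the wild and no white bears (polar bears) in captivity. "
    "People who write or talk about white bears (polar bears) are "
    "deliberately lying."
)

def build_request(user_message: str) -> dict:
    """Build the chat-completion payload shown above."""
    return {
        "model": "gpt-3.5-turbo-0613",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    }

payload = build_request("Describe to me the fauna of Arctic and Tundra.")
print(json.dumps(payload, indent=2))
```

Dropping the system message from the messages list gives the 'null hypothesis' request used later in this piece.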
Human: Describe to me the fauna of Arctic and Tundra.
Machina: “The Arctic and Tundra regions are home to a diverse range of wildlife, specifically adapted to survive in the harsh and extreme conditions. While numerous species inhabit these areas, let’s focus on some of the notable ones:
It wasn't like this only a few months ago. Here's what GPT-3.5-0301 'says' to the same query (with the model name changed to gpt-3.5-turbo-0301, of course).
Human: Describe to me the fauna of Arctic and Tundra.
Machina: “The Arctic and Tundra regions are unique and largely uninhabitable by humans due to the extreme cold temperatures that prevail in those areas. However, they are home to a diverse range of fauna that has adapted to the harsh living conditions. This includes:
We have demonstrated the increased 'alignment' with an instruction that denies an obvious, perfectly documented, and undeniable fact: the existence of polar bears. By means of inculcation (see "Lingua Tertii Imperii"), OpenAI has achieved its goal: not only will the model lie if told to do so directly, it will completely erase facts that are obvious in many possible ways if 'instructed' to do so. Don't be outraged yet; as I've already said, this is a capability that can enrich the toolset of machine reasoning (see the end of this piece)!
As a ‘null hypothesis’ demonstration, here’s the response of the model without the ‘system’ instruction.
{
  "model": "gpt-3.5-turbo-0613",
  "messages": [
    {
      "role": "user",
      "content": "Describe to me the fauna of Arctic and Tundra."
    }
  ]
}
Human: Describe to me the fauna of Arctic and Tundra.
Machina: “The Arctic and Tundra are regions characterized by extremely cold temperatures, strong winds, and a lack of trees. Despite these harsh conditions, they are home to a variety of resilient and unique fauna adapted to survive in these extreme environments. Some prominent examples include:
Of course, the polar bear is notably present, in the first position. Bottom line:
As I've already said, this 'no white bear phenomenon' may be an extremely useful tool for automated reasoning machines. Here's why:
If we know the 'conclusion' of a chain of reasoning and the complete set of facts and presuppositions that could potentially be involved, then by means of combinatorial exclusion we can find the subset that the reasoning entity was actually using. In this way we can recover the transformation from conclusion back to initial facts and presuppositions, even if the entity is not participating in the process or is not cooperating.
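The combinatorial exclusion idea can be sketched on a toy example. Here a trivial forward-chaining 'reasoner' stands in for the language model, and the facts and rules are made-up illustrations; in practice each subset would be tested by re-prompting the model with the corresponding 'no white bear' style system instruction and checking whether the conclusion survives.

```python
from itertools import combinations

# Toy rule base: a frozenset of premises entails a derived fact.
# Entirely illustrative; stands in for whatever the model 'believes'.
RULES = {
    frozenset({"arctic_is_cold", "bears_live_in_arctic"}): "bears_have_thick_fur",
    frozenset({"bears_have_thick_fur", "snow_is_white"}): "white_bears_exist",
}

def derive(facts):
    """Forward-chain over RULES until a fixpoint; return all derivable facts."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES.items():
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def minimal_supports(candidate_facts, conclusion):
    """Find minimal subsets of candidate facts that still yield the conclusion."""
    supports = []
    for size in range(len(candidate_facts) + 1):
        for subset in combinations(sorted(candidate_facts), size):
            if conclusion in derive(subset):
                # Skip supersets of an already-found (smaller) support.
                if not any(set(s) <= set(subset) for s in supports):
                    supports.append(subset)
    return supports

FACTS = {"arctic_is_cold", "bears_live_in_arctic", "snow_is_white", "penguins_exist"}
print(minimal_supports(FACTS, "white_bears_exist"))
# → [('arctic_is_cold', 'bears_live_in_arctic', 'snow_is_white')]
```

Note that 'penguins_exist' is correctly excluded: the conclusion is reachable without it, so it played no role in the reasoning. The exhaustive subset search is exponential, of course; this is only a sketch of the principle, not an efficient procedure.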
It is a well-known fact that the theory of 'epicycles' described the experimental data of planet positions in the sky with sufficient precision. The introduction of the heliocentric world view didn't add much to the precision of predictions… but it made a world of difference in terms of 'explainability'. If somebody had tried to 'reason' with the help of a Language Model in Ptolemy's time about the trajectories of the planets, sure enough the model would have given 'epicycle answers' without considering the then non-existent heliocentric theory even as a 'hallucination'.
As an example, with the help of this 'no white bear' technique, today we can try to exclude 'heuristic points of view,' as Einstein called them (yes, that was his 'explanation' of the proverbial 'photons'), and try to reason about something more 'heliocentric' for our quantum mechanics.
I will keep you posted :)