[BONUS] Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
Paper link: https://arxiv.org/pdf/2305.04091.pdf
Last week, when we discussed the CoT prompting strategy, we also covered the zero-shot CoT technique, which in layman's terms simply means concatenating the target problem with the prompt "Let's think step by step." With this prompt, most LLMs outperformed previous state-of-the-art results in several domains such as reasoning and arithmetic.
However, on analysis, researchers noticed that some LLMs were not performing well because they inferred a wrong equation or made a calculation error. To solve that, researchers proposed another prompting method called "Plan & Solve". It consists of two components: first, devising a plan to divide the entire task into smaller subtasks, and then carrying out the subtasks according to the plan. The experimental results on GPT-3 show that the proposed zero-shot prompting consistently outperforms zero-shot CoT across all datasets by a large margin, and is comparable to or exceeds zero-shot Program-of-Thought prompting.
Let's cut to the chase and come straight to the point. The newly proposed methodology simply changes the zero-shot CoT prompt "Let's think step by step" into "Let's first understand the problem and devise a plan to solve the problem. Then, let's carry out the plan and solve the problem step by step."
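To make the change concrete, here is a minimal sketch of how the two zero-shot prompts are assembled. The `build_prompt` helper and the `Q:`/`A:` layout are illustrative assumptions, not the paper's exact formatting; only the trigger sentences come from the post above.

```python
# Zero-shot CoT vs. Plan-and-Solve (PS): only the trigger sentence changes.
COT_TRIGGER = "Let's think step by step."
PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve the problem. "
    "Then, let's carry out the plan and solve the problem step by step."
)

def build_prompt(problem: str, trigger: str) -> str:
    """Concatenate the target problem with a reasoning trigger (zero-shot)."""
    return f"Q: {problem}\nA: {trigger}"

problem = "A store sold 3 boxes of 12 apples each. How many apples were sold?"
cot_prompt = build_prompt(problem, COT_TRIGGER)
ps_prompt = build_prompt(problem, PS_TRIGGER)
```

The resulting string is sent to the LLM as-is; no examples (shots) are included, which is what makes both methods zero-shot.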
To address the calculation errors of zero-shot CoT and improve the quality of the generated reasoning steps, they add more detailed instructions to Plan-and-Solve (PS) prompting. Specifically, they extend it with the instructions "extract relevant variables and their corresponding numerals" and "calculate intermediate results (pay attention to calculation and commonsense)". This prompting variant is called the PS+ prompting strategy.
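A sketch of the PS+ variant follows. The trigger sentence below is assembled from the fragments quoted in this post, not copied verbatim from the paper, and the `Q:`/`A:` layout is again an illustrative assumption.

```python
# PS+: the PS trigger extended with variable-extraction and
# intermediate-calculation instructions (fragments quoted in the post above).
PS_PLUS_TRIGGER = (
    "Let's first understand the problem, extract relevant variables and their "
    "corresponding numerals, and devise a plan. Then, let's carry out the plan, "
    "calculate intermediate results (pay attention to calculation and "
    "commonsense), and solve the problem step by step."
)

def build_ps_plus_prompt(problem: str) -> str:
    """Zero-shot PS+ prompt: the problem text followed by the trigger sentence."""
    return f"Q: {problem}\nA: {PS_PLUS_TRIGGER}"
```

Everything else about the pipeline stays the same; only the trigger grows more explicit about what the model should do before and during solving.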
Therefore, the only difference between zero-shot CoT and zero-shot PS is that the latter has a more detailed prompt that caters to miscalculations and the complexity of the problem. Do check out the paper to see the results and the future scope of this problem.
The paper actually leaves me with a lot of questions. Do we really understand the depth of these large language models? If a mere change in the prompt (a prompt is nothing but a way for us to retrieve an answer from the language model) can improve accuracy dramatically, does that mean the complex problems of today are complex only because we don't know the right way to ask the model?
Next week’s Paper
The Reversal Curse: LLMs trained on A=B fail to learn B=A
Other posts from The Passion Pad you might be interested in:
Cheers!