
LLM Reasoning: Key Ideas and Limitations

January 26, 2025 · 2 min read

Reasoning is pivotal for advancing LLM capabilities

Expectations for AI: Solving complex math problems, discovering scientific theories, achieving AGI.
Baseline Expectation: AI should emulate human-like learning with few examples.

What is Missing in ML?
- Reasoning: The ability to logically derive answers from minimal examples.

Problem: Extract the last letters of words and concatenate them.
- Example: "Elon Musk" → "nk".
Traditional ML: Requires significant labeled data.
LLMs: Achieve 100% accuracy with one demonstration using reasoning.

Humans solve problems through reasoning and intermediate steps.
Example:
- Input: "Elon Musk"
- Reasoning: Last letter of "Elon" = "n", of "Musk" = "k".
- Output: "nk".
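
The worked example above can be sketched in Python. The snippet computes the reference answer and builds a one-shot prompt whose single demonstration spells out the intermediate steps. The function names and the prompt layout are illustrative assumptions, not something taken from the original post.

```python
def last_letter_concat(phrase: str) -> str:
    """Reference answer for the task: join the last letter of each word."""
    return "".join(word[-1] for word in phrase.split())


def reasoning_steps(phrase: str) -> str:
    """Write out the intermediate steps, then the final answer."""
    steps = ", ".join(
        f'the last letter of "{word}" is "{word[-1]}"' for word in phrase.split()
    )
    # Capitalize only the first character so the quoted words keep their case.
    steps = steps[:1].upper() + steps[1:]
    return f'{steps}. The answer is "{last_letter_concat(phrase)}".'


def one_shot_prompt(demo: str, query: str) -> str:
    """Hypothetical prompt layout: one worked demonstration, then the new input."""
    return (
        f"Q: {demo}\n"
        f"A: {reasoning_steps(demo)}\n"
        f"Q: {query}\n"
        "A:"
    )
```

Calling `one_shot_prompt("Elon Musk", "Barack Obama")` yields a prompt in which the model sees one fully reasoned demonstration before the new input.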

Chain-of-Thought (CoT) Prompting
- Breaking problems into logical steps.
- Examples from math word problems demonstrate enhanced problem-solving accuracy.
Least-to-Most Prompting
- Decomposing problems into easier sub-questions for gradual generalization.
Analogical Reasoning
- Adapting solutions from related problems.
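- Example: Finding the area of a square from its vertices by first recalling how to compute the distance between two points.
Zero-Shot and Few-Shot CoT
- Zero-shot CoT triggers reasoning without explicit examples; few-shot CoT supplies a handful of worked demonstrations.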
Self-Consistency in Decoding
- Sampling multiple responses and taking the most frequent final answer to improve step-by-step reasoning accuracy.
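
As a rough illustration of self-consistency, the sketch below takes several sampled responses, extracts each final answer, and returns the majority vote. It assumes every response ends with a sentence of the form "The answer is X." and that the sampling itself happens elsewhere; both the answer format and the function names are assumptions made for this example.

```python
import re
from collections import Counter
from typing import Optional


def extract_answer(response: str) -> Optional[str]:
    """Return X from a trailing 'The answer is X.' sentence, or None if absent."""
    match = re.search(r"The answer is\s+(.+?)\.?\s*$", response.strip())
    return match.group(1) if match else None


def self_consistent_answer(responses: list[str]) -> Optional[str]:
    """Majority vote over the final answers of independently sampled responses."""
    answers = [a for a in map(extract_answer, responses) if a is not None]
    if not answers:
        return None
    # most_common(1) returns the single most frequent (answer, count) pair.
    return Counter(answers).most_common(1)[0][0]
```

The vote is taken over final answers only, so two responses that reach the same answer through different reasoning paths still count as agreeing.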

Distraction by Irrelevant Context
- Adding irrelevant details significantly lowers performance.
- Solution: Explicitly instructing the model to ignore distractions.
Challenges in Self-Correction
- LLMs can fail to self-correct errors, and sometimes turn correct answers into incorrect ones.
- Oracle feedback is essential for effective corrections.
Premise Order Matters
- Performance drops when the premises of a problem are reordered, which suggests models rely on premises appearing in the order the reasoning uses them.
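
One way to probe the first and last of these limitations is to generate controlled variants of the same problem. The sketch below is a minimal test harness under stated assumptions: the helper names, the wording of `IGNORE_HINT`, and the prompt layout are all hypothetical, and the model call itself is left out.

```python
import random
from typing import Optional

# Assumed wording for the "ignore distractions" instruction.
IGNORE_HINT = "Ignore any information that is irrelevant to the question."


def reorder_premises(premises: list[str], seed: int) -> list[str]:
    """Return a reproducibly shuffled copy of the premises."""
    shuffled = list(premises)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def build_prompt(
    premises: list[str],
    question: str,
    distractor: Optional[str] = None,
    ignore_hint: bool = False,
) -> str:
    """Assemble a prompt, optionally adding a distractor sentence and the hint."""
    sentences = list(premises)
    if distractor is not None:
        sentences.append(distractor)
    parts = [IGNORE_HINT] if ignore_hint else []
    parts.append(" ".join(sentences))
    parts.append(question)
    return "\n".join(parts)
```

Comparing a model's accuracy on the original prompt, the reordered prompt, and the prompt with a distractor (with and without the hint) isolates each effect separately.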

Intermediate reasoning steps are crucial for solving serial problems.
Techniques like self-debugging with unit tests are promising for future improvements.
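
A minimal sketch of what self-debugging with unit tests could look like: a candidate program is run against test cases, and any failures are fed back into the next generation attempt. Here `generate` is a hypothetical stand-in for a model call, and the feedback wording is an assumption. The sketch uses `exec`, so real model-written code should only be run inside a sandbox.

```python
from typing import Callable, Optional

TestCase = tuple[tuple, object]  # (positional arguments, expected result)


def run_tests(source: str, func_name: str, cases: list[TestCase]) -> list[str]:
    """Run the candidate source and return failure messages (empty if all pass)."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # unsafe for untrusted code outside a sandbox
        func = namespace[func_name]
    except Exception as exc:
        return [f"could not load {func_name}: {exc!r}"]
    failures = []
    for args, expected in cases:
        try:
            got = func(*args)
        except Exception as exc:
            failures.append(f"{func_name}{args} raised {exc!r}")
            continue
        if got != expected:
            failures.append(
                f"{func_name}{args} returned {got!r}, expected {expected!r}"
            )
    return failures


def self_debug(
    generate: Callable[[str], str],
    task: str,
    func_name: str,
    cases: list[TestCase],
    max_rounds: int = 3,
) -> Optional[str]:
    """Regenerate with test feedback until the tests pass or rounds run out."""
    prompt = task
    for _ in range(max_rounds):
        source = generate(prompt)
        failures = run_tests(source, func_name, cases)
        if not failures:
            return source
        prompt = (
            f"{task}\nThe previous attempt failed these tests:\n"
            + "\n".join(failures)
            + f"\nPrevious attempt:\n{source}"
        )
    return None
```

The unit tests play the role of external feedback here: the loop only revises a program when a test actually fails.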

Defining the right problem is critical for progress.
The next step is to develop models that address these reasoning limitations autonomously.