Commit 919e392: Update january-2025.md (authored by SrGrace, Jan 26, 2025)

File changed: research_and_future_trends/january-2025.md (1 addition, 1 deletion)
| Title | Summary | Topics |
| --- | --- | --- |
| [Evolving Deeper LLM Thinking](https://arxiv.org/pdf/2501.09891) | This recent paper by researchers at Google DeepMind proposes a new approach, "Mind Evolution", that takes LLMs beyond static reasoning to dynamic, iterative problem-solving. <br><br> Here’s a quick dive into what makes it worth your while: <br><br> 🌟 Inspired by Nature: <br>The method applies genetic algorithms - evolutionary concepts like selection, crossover, and mutation - to refine solutions iteratively. Think of it as AI brainstorming, improving its responses generation by generation. 🤔 <br><br> ⚙️ How It Works: <br> &nbsp; 🔹 Starts with a diverse population of candidate solutions for a problem. <br> &nbsp; 🔹 Uses an LLM to recombine and refine them, guided by feedback from an evaluator. <br> &nbsp; 🔹 Keeps refining until it reaches a satisfactory answer or hits its compute budget. <br><br> 📈 Performance Gains: <br> &nbsp; 🔹 Achieved a 95.6% success rate on the TravelPlanner benchmark, compared to 55.6% with traditional Best-of-N strategies. <br> &nbsp; 🔹 Tackled natural-language planning challenges like trip planning and meeting scheduling, and even creative tasks like embedding hidden messages in poetry! <br><br> 💡 Unlike conventional methods, Mind Evolution thrives in unstructured, natural-language solution spaces, making it well suited to tasks where no formal problem-solving framework exists - like planning complex travel itineraries or crafting poetic steganography. | LLM Reasoning |
| [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf) | 1️⃣ DeepSeek-R1-Zero: <br> &nbsp; 🔹 Trained purely with RL, this model developed advanced reasoning behaviours like self-verification, reflection, and generating long reasoning chains - all without supervised guidance. <br> &nbsp; 🔹 This may be the first open and transparent demonstration that RL alone can unlock reasoning capabilities in LLMs. <br><br> 2️⃣ DeepSeek-R1: <br> &nbsp; 🔹 By incorporating a small amount of cold-start SFT data, DeepSeek-R1 achieved performance comparable to OpenAI’s o1-1217 model across a wide range of reasoning tasks. <br> &nbsp; 🔹 Lesson learned? While SFT isn’t essential, it can be a valuable accelerator. <br><br> 3️⃣ Distilling Reasoning to Smaller Models: <br> &nbsp; 🔹 The team distilled the capabilities of DeepSeek-R1 into smaller, efficient models, like DeepSeek-R1-Distill-Qwen-14B, which significantly outperformed the open-source QwQ-32B-Preview. <br> &nbsp; 🔹 This shows how much of a large model’s reasoning power can be preserved in smaller versions - a big win for accessibility and efficiency. <br><br> 📊 Impressive Performance Highlights: <br> &nbsp; 🔹 71% accuracy on the AIME benchmark for DeepSeek-R1-Zero, up from just 15.6% at the start of training! <br> &nbsp; 🔹 Achieved expert-level performance in reasoning-intensive tasks like math, coding, and logic. <br> &nbsp; 🔹 Outperformed models like OpenAI-o1-mini on benchmarks such as MATH-500 and GPQA Diamond. <br><br> ☘️ What Sets DeepSeek Apart? <br> &nbsp; 🔹 Transparency: The paper openly details not just the successes but also the failures - something rare but incredibly valuable in AI research. <br> &nbsp; 🔹 Community-first Approach: They released their models under an MIT license, making this work freely accessible to researchers and developers worldwide - a big gift to the community. | LLM Reasoning |
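The "How It Works" steps of Mind Evolution can be sketched as a classic evolutionary loop. In the paper, an LLM does the proposing and recombining and an evaluator scores candidates; the toy stand-ins below (random strings evolved toward a target, a position-match fitness function) are assumptions made purely so the loop is runnable, not the authors' actual operators.

```python
# Toy sketch of a Mind Evolution-style loop. In the real system an LLM
# proposes and recombines candidate solutions and an evaluator gives
# feedback; here both are toy stand-ins so the mechanics are runnable.
import random

random.seed(0)

TARGET = "plan a trip"  # toy "problem": evolve a string toward this answer
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(candidate: str) -> int:
    """Evaluator stand-in: count of positions matching the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))

def propose() -> str:
    """Initial diverse candidate (stand-in for an LLM-sampled solution)."""
    return "".join(random.choice(ALPHABET) for _ in TARGET)

def recombine(parent_a: str, parent_b: str) -> str:
    """Crossover of two parents plus a point mutation (stand-in for
    LLM-guided refinement of promising candidates)."""
    cut = random.randrange(len(TARGET))
    child = list(parent_a[:cut] + parent_b[cut:])
    i = random.randrange(len(child))
    child[i] = random.choice(ALPHABET)  # mutation
    return "".join(child)

def evolve(pop_size: int = 40, generations: int = 300) -> str:
    population = [propose() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if population[0] == TARGET:            # stop early at the optimum
            break
        parents = population[: pop_size // 2]  # selection: keep the top half
        children = [recombine(random.choice(parents), random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the top half of each generation survives unchanged, the best fitness never decreases; the paper's insight is that an LLM makes far better recombination and mutation operators than these random ones, which is what lets the same loop work in free-form natural-language spaces.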

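On the DeepSeek-R1 side, the pure-RL recipe behind R1-Zero uses GRPO: for each prompt a group of responses is sampled, scored with rule-based rewards (answer correctness, format), and each response's advantage is its reward standardized against its own group, so no learned value model is needed. The snippet below illustrates only that group-relative advantage calculation; the rewards are made-up toy values, not numbers from the paper.

```python
# Minimal sketch of the group-relative advantage at the core of GRPO,
# the RL algorithm DeepSeek used to train R1-Zero. Each sampled response
# gets advantage A_i = (r_i - mean(r)) / std(r), computed within its group,
# so above-average answers are reinforced and below-average ones suppressed.
from statistics import mean, pstdev

def group_advantages(rewards: list[float]) -> list[float]:
    """Standardize each reward against its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mu) / sigma for r in rewards]

# Toy example: 4 sampled answers, reward 1.0 when the final answer is correct.
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_advantages(rewards))  # → [1.0, -1.0, -1.0, 1.0]
```

Normalizing within the group is what makes the scheme cheap: the baseline is just the group mean, so the whole training signal comes from sampling more answers rather than from training a separate critic.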
