<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>ArXiv Daily Digest</title>
    <style>
        body { font-family: Arial, sans-serif; max-width: 800px; margin: 0 auto; padding: 20px; }
        .paper { border: 1px solid #ddd; margin: 20px 0; padding: 15px; border-radius: 5px; }
        .title { font-size: 1.3em; font-weight: bold; margin-bottom: 10px; color: #2c5aa0; }
        .authors { font-size: 0.9em; color: #666; margin-bottom: 10px; }
        .summary { line-height: 1.5; }
        .category { background-color: #e0f7fa; padding: 3px 8px; border-radius: 3px; font-size: 0.8em; }
        .date { font-size: 0.8em; color: #888; }
        .insight { background-color: #f0f8f0; border-left: 4px solid #4caf50; padding: 10px; margin-top: 10px; }
    </style>
</head>
<body>
    <h1>🤖 Daily ArXiv Digest</h1>
    <p>Latest papers in Embodied AI, Representation Learning, and Reinforcement Learning</p>
    
    <div class="paper">
        <div class="title">Smooth Gate Functions for Soft Advantage Policy Optimization</div>
        <div class="authors">Egor Denisov, Svetlana Glazyrina, Maksim Kryzhanovskiy, Roman Ischenko</div>
        <div class="category">cs.LG, cs.AI</div>
        <div class="date">Published: 2026-02-22</div>
        <div class="summary">Group Relative Policy Optimization (GRPO) has significantly advanced the training of large language models and enhanced their reasoning capabilities, but it remains susceptible to instability because it relies on hard clipping. Soft Adaptive Policy Optimization (SAPO) addresses this limitation by replacing clipping with a smooth sigmoid-based gate function, which leads to more stable updates. This paper investigates how different gate functions affect training stability and final model performance.</div>
        <div class="insight">Replacing hard clipping with a smooth gate targets a known source of instability in GRPO-style training. More stable policy optimization for large language models could, in turn, benefit embodied agents that rely on complex reasoning.</div>
    </div>
    
    <div class="paper">
        <div class="title">BEAT: Visual Backdoor Attacks on VLM-based Embodied Agents via Contrastive Trigger Learning</div>
        <div class="authors">Qiusi Zhan, Hyeonjeong Ha, Rui Yang, Sirui Xu, Hanyang Chen, Liang-Yan Gui, Yu-Xiong Wang, Huan Zhang, Heng Ji, Daniel Kang</div>
        <div class="category">cs.AI, cs.CL, cs.CV</div>
        <div class="date">Published: 2025-10-31</div>
        <div class="summary">This paper introduces BEAT, the first framework to inject visual backdoors into VLM-based embodied agents using objects in the environment as triggers. The work addresses a critical security risk in vision-driven embodied agents: a backdoored agent behaves normally until a visual trigger appears in the scene, then executes an attacker-specified multi-step policy.</div>
        <div class="insight">Because the trigger is an ordinary object in the scene, a backdoored agent can appear benign until the trigger shows up. This research highlights the need for robust defenses before VLM-based embodied agents are deployed in the real world.</div>
    </div>
    
    <div class="paper">
        <div class="title">Interpretable Failure Analysis in Multi-Agent Reinforcement Learning Systems</div>
        <div class="authors">Risal Shahriar Shefin, Debashis Gupta, Thai Le, Sarra Alqahtani</div>
        <div class="category">cs.AI, cs.LG, cs.MA</div>
        <div class="date">Published: 2026-02-08</div>
        <div class="summary">This work introduces a two-stage gradient-based framework for interpretable failure detection and attribution in multi-agent reinforcement learning systems. The framework provides diagnostics for detecting failure sources, validating false positives, and tracing failure propagation through learned coordination pathways.</div>
        <div class="insight">As multi-agent systems become more complex, interpretable failure analysis becomes essential for safety-critical applications. This approach offers practical tools for diagnosing cascading failures in such systems.</div>
    </div>
    
    <div class="paper">
        <div class="title">A simple connection from loss flatness to compressed neural representations</div>
        <div class="authors">Shirui Chen, Stefano Recanatesi, Eric Shea-Brown</div>
        <div class="category">cs.LG, cs.AI</div>
        <div class="date">Published: 2023-10-03</div>
        <div class="summary">This paper investigates how the sharpness of the loss landscape relates to the geometric structure of neural representations, specifically representation compression. The authors introduce two measures, Local Volumetric Ratio (LVR) and Maximum Local Sensitivity (MLS), and derive upper bounds showing that both are constrained by sharpness.</div>
        <div class="insight">The relationship between loss landscape geometry and representation compression bears directly on neural network training dynamics and generalization.</div>
    </div>
    
    <div class="paper">
        <div class="title">CAIRO: Decoupling Order from Scale in Regression</div>
        <div class="authors">Harri Vanhems, Yue Zhao, Peng Shi, Archer Y. Yang</div>
        <div class="category">stat.ME, cs.LG, stat.ML</div>
        <div class="date">Published: 2026-02-16</div>
        <div class="summary">The paper proposes CAIRO (Calibrate After Initial Rank Ordering), a framework that decouples regression into two distinct stages. In the first stage, a scoring function is learned by minimizing a scale-invariant ranking loss; in the second, target scale is recovered via isotonic regression.</div>
        <div class="insight">Separating ordering from scale offers a novel perspective on regression. It could be particularly useful for embodied AI systems that need to make robust predictions under varying conditions.</div>
    </div>
    
    <p>Generated on 2026-02-24</p>
</body>
</html>