ClawLab/RobotDaily: a daily morning digest of 5 selected papers in RL, representation learning, embodied AI, and related fields @ 195a29602dede16e1d559ab9fc8201813039470f

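title: "2026-03-16 · AI Daily Briefing"
date: 2026-03-16T16:31:30.888993+08:00
draft: false
summary: "RobotDaily 2026-03-16: 7 papers in total, including 2 on embodied AI, 3 on representation learning, and 2 on reinforcement learning."
tags: ["robotdaily", "ai-daily", "embodied AI", "representation learning", "reinforcement learning"]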

Hugo archive edition, sourced from RobotDaily's Markdown briefing for the day.

RobotDaily 2026-03-16: 7 papers in total, including 2 on embodied AI, 3 on representation learning, and 2 on reinforcement learning.

An application-oriented selection, organized by research area into a Markdown archive of short cards.

Embodied AI (2 papers)

1. PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization

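PhysMoDPO uses diffusion models to address robot manipulation and achieves real-world deployment.

Authors: Yangsong Zhang, Anujith Muraleedharan, Rikhat Akizhanov, Abdul Ahad Butt, and 4 others

Tags: embodied AI, robotics, real-world deployment, manipulation

Chinese summary: [LLM currently unavailable; retaining the key points of the English abstract for now] Recent progress in text-conditioned human motion generation has been largely driven by diffusion models trained on large-scale human motion data. Building on this progress, recent methods attempt to transfer such models…

Links: DOI | arXiv | PDF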

2. HandelBot: Real-World Piano Playing via Fast Adaptation of Dexterous Robot Policies

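HandelBot uses reinforcement learning to address robot manipulation and achieves real-world deployment.

Authors: Amber Xie, Haozhi Qi, Dorsa Sadigh

Tags: embodied AI, robotics, real-world deployment, manipulation

Chinese summary: [LLM currently unavailable; retaining the key points of the English abstract for now] Mastering dexterous manipulation with multi-fingered hands has been a grand challenge in robotics for decades. Despite its potential, the difficulty of collecting high-quality data remains a primary bottleneck for high-…

Links: DOI | arXiv | PDF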

Representation Learning (3 papers)

1. Representation Learning for Spatiotemporal Physical Systems

Representation Learning for Spatiotemporal Physical Systems uses self-supervised learning to address related tasks and improve performance.

Authors: Helen Qu, Rudy Morel, Michael McCabe, Alberto Bietti, and 3 others

Tags: representation learning, latent space, world models, pretraining

Chinese summary: [LLM currently unavailable; retaining the key points of the English abstract for now] Machine learning approaches to spatiotemporal physical systems have primarily focused on next-frame prediction, with the goal of learning an accurate emulator for the system's evolution in time. However, these emulators…

Links: DOI | arXiv | PDF

2. Separable neural architectures as a primitive for unified predictive and generative intelligence

Separable neural architectures as a primitive for unified predictive and generative i…

Authors: Reza T. Batley, Apurba Sarker, Rajib Mostakim, Andrew Klichine, and 1 other

Tags: representation learning, latent space, world models, pretraining

Chinese summary: [LLM currently unavailable; retaining the key points of the English abstract for now] Intelligent systems across physics, language and perception often exhibit factorisable structure, yet are typically modelled by monolithic neural architectures that do not explicitly exploit this structure. The separabl…

Links: DOI | arXiv | PDF

3. VIRD: View-Invariant Representation through Dual-Axis Transformation for Cross-View Pose Estimation

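VIRD uses representation learning to address robot manipulation and is presented as a first-of-its-kind proposal.

Authors: Juhye Park, Wooju Lee, Dasol Hong, Changki Sung, and 3 others

Tags: representation learning, latent space, world models, pretraining

Chinese summary: [LLM currently unavailable; retaining the key points of the English abstract for now] Accurate global localization is crucial for autonomous driving and robotics, but GNSS-based approaches often degrade due to occlusion and multipath effects. As an emerging alternative, cross-view pose estimation predict…

Links: DOI | arXiv | PDF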

Reinforcement Learning (2 papers)

1. Efficient Real-World Autonomous Racing via Attenuated Residual Policy Optimization

Efficient Real-World Autonomous Racing via Attenuated Residual Policy Optimization uses…

Authors: Raphael Trumpp, Denis Hoornaert, Mirco Theile, Marco Caccamo

Tags: reinforcement learning, policy optimization, reward design, offline RL

Chinese summary: [LLM currently unavailable; retaining the key points of the English abstract for now] Residual policy learning (RPL), in which a learned policy refines a static base policy using deep reinforcement learning (DRL), has shown strong performance across various robotic applications. Its effectiveness is part…

Links: DOI | arXiv | PDF

2. Beyond Imitation: Reinforcement Learning Fine-Tuning for Adaptive Diffusion Navigation Policies

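Beyond Imitation uses diffusion models to address robot manipulation and achieves zero-shot generalization.

Authors: Junhe Sheng, Ruofei Bai, Kuan Xu, Ruimeng Liu, and 4 others

Tags: reinforcement learning, policy optimization, reward design, offline RL

Chinese summary: [LLM currently unavailable; retaining the key points of the English abstract for now] Diffusion-based robot navigation policies trained on large-scale imitation learning datasets, can generate multi-modal trajectories directly from the robot's visual observations, bypassing the traditional localization-m…

Links: DOI | arXiv | PDF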

2026-03-16.md 5.8 KB
