---
title: "2026-03-11 · AI Daily Briefing"
date: 2026-03-11T18:36:12.223136+08:00
draft: false
summary: "RobotDaily 2026-03-11: 9 papers in total, including 3 on embodied AI, 3 on representation learning, and 3 on reinforcement learning."
tags: ["robotdaily", "ai-daily", "embodied", "具身智能", "representation", "表征学习", "reinforcement", "强化学习", "llm"]
---

> Hugo archive edition, sourced from the RobotDaily Markdown briefing of the same day.
>
> RobotDaily 2026-03-11: 9 papers in total, including 3 on embodied AI, 3 on representation learning, and 3 on reinforcement learning.

An application-oriented selection, organized by topic into short card-style Markdown entries.

## Embodied AI (3 papers)

### 1. PlayWorld: Learning Robot World Models from Autonomous Play

> Keyword hits: real world, deployed, world model, scalable. Application signals: real world, deployed, robot; innov…

- Authors: Tenny Yin, Zhiting Mei, Zhonghe Zheng, Miyu Yamane, and 7 others
- Tags: `Embodied AI` `Robotics` `Real-world deployment` `Manipulation`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Action-conditioned video models offer a promising path to building general-purpose robot simulators that can improve directly from data. Yet, despite training on large-scale robot datasets, current s…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.09030) | [arXiv](https://arxiv.org/abs/2603.09030v1) | [PDF](https://arxiv.org/pdf/2603.09030v1)

### 2. MetaWorld-X: Hierarchical World Modeling via VLM-Orchestrated Experts for Humanoid Loco-Manipulation

> Keyword hits: robot, robotic, world model. Application signals: robot, robotic, system; innovation signals: world model; domain match…

- Authors: Yutong Shen, Hangxu Liu, Penghui Liu, Jiashuo Luo, and 5 others
- Tags: `Embodied AI` `Robotics` `Real-world deployment` `Manipulation`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Learning natural, stable, and compositionally generalizable whole-body control policies for humanoid robots performing simultaneous locomotion and manipulation (loco-manipulation) remains a fundament…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.08572) | [arXiv](https://arxiv.org/abs/2603.08572v1) | [PDF](https://arxiv.org/pdf/2603.08572v1)

### 3. Embodied Human Simulation for Quantitative Design and Analysis of Interactive Robotics

> Keyword hits: robot, robotic, scalable. Application signals: robot, robotic, system; innovation signals: scalable; domain match: embo…

- Authors: Chenhui Zuo, Jinhao Xu, Michael Qian Vergnolle, Yanan Sui
- Tags: `Embodied AI` `Robotics` `Real-world deployment` `Manipulation`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Physical interactive robotics, ranging from wearable devices to collaborative humanoid robots, require close coordination between mechanical design and control. However, evaluating interactive dynami…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.09218) | [arXiv](https://arxiv.org/abs/2603.09218v1) | [PDF](https://arxiv.org/pdf/2603.09218v1)

## Representation Learning (3 papers)

### 1. $M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs

> Keyword hits: real-world, deployment, first. Application signals: real-world, deployment, system; innovation signals: first;…

- Authors: Kaixin Lin, Kunyu Peng, Di Wen, Yufan Chen, and 2 others
- Tags: `Representation learning` `Latent space` `World model` `Pre-training`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Semantic occupancy prediction enables dense 3D geometric and semantic understanding for autonomous driving. However, existing camera-based approaches implicitly assume complete surround-view observat…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.09737) | [arXiv](https://arxiv.org/abs/2603.09737v1) | [PDF](https://arxiv.org/pdf/2603.09737v1)

### 2. Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

> Keyword hits: real-world, real world, world model. Application signals: real-world, real world, deployment; innov…

- Authors: Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, and 7 others
- Tags: `Representation learning` `Latent space` `World model` `Pre-training`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Extrinsic dexterity leverages environmental contact to overcome the limitations of prehensile manipulation. However, achieving such dexterity in cluttered scenes remains challenging and underexplored…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.09882) | [arXiv](https://arxiv.org/abs/2603.09882v1) | [PDF](https://arxiv.org/pdf/2603.09882v1)

### 3. From Semantics to Pixels: Coarse-to-Fine Masked Autoencoders for Hierarchical Visual Understanding

> Keyword hits: dataset, self-supervised, first. Application signals: dataset; innovation signals: self-supervised, first; domain match…

- Authors: Wenzhao Xiang, Yue Wu, Hongyang Yu, Feng Gao, and 2 others
- Tags: `Representation learning` `Latent space` `World model` `Pre-training`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Self-supervised visual pre-training methods face an inherent tension: contrastive learning (CL) captures global semantics but loses fine-grained detail, while masked image modeling (MIM) preserves lo…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.09955) | [arXiv](https://arxiv.org/abs/2603.09955v1) | [PDF](https://arxiv.org/pdf/2603.09955v1)

## Reinforcement Learning (3 papers)

### 1. SPAARS: Safer RL Policy Alignment through Abstract Exploration and Refined Exploitation of Action Space

> Keyword hits: robot, robotic. Application signals: robot, robotic; domain match: reinforcement learning, policy gradie…

- Authors: Swaminathan S K, Aritra Hazra
- Tags: `Reinforcement learning` `Policy optimization` `Reward design` `Offline RL`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Offline-to-online reinforcement learning (RL) offers a promising paradigm for robotics by pre-training policies on safe, offline demonstrations and fine-tuning them via online interaction. However, a…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.09378) | [arXiv](https://arxiv.org/abs/2603.09378v1) | [PDF](https://arxiv.org/pdf/2603.09378v1)

### 2. Robust Regularized Policy Iteration under Transition Uncertainty

> Keyword hits: benchmark, unified. Application signals: benchmark; innovation signals: unified; domain match: reinforcement learning,…

- Authors: Hongqiang Lin, Zhenghui Fu, Weihao Tang, Pengfei Wang, and 3 others
- Tags: `Reinforcement learning` `Policy optimization` `Reward design` `Offline RL`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] Offline reinforcement learning (RL) enables data-efficient and safe policy learning without online exploration, but its performance often degrades under distribution shift. The learned policy may vis…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.09344) | [arXiv](https://arxiv.org/abs/2603.09344v1) | [PDF](https://arxiv.org/pdf/2603.09344v1)

### 3. Towards Batch-to-Streaming Deep Reinforcement Learning for Continuous Control

> Keyword hits: benchmark, hardware, novel. Application signals: benchmark, hardware, sim2real; innovation signals: novel; domain match…

- Authors: Riccardo De Monte, Matteo Cederle, Gian Antonio Susto
- Tags: `Reinforcement learning` `Policy optimization` `Reward design` `Offline RL`
- Summary: [LLM translation temporarily unavailable; English abstract excerpt retained] State-of-the-art deep reinforcement learning (RL) methods have achieved remarkable performance in continuous control tasks, yet their computational complexity is often incompatible with the constrain…
- Links: [DOI](https://doi.org/10.48550/arXiv.2603.08588) | [arXiv](https://arxiv.org/abs/2603.08588v1) | [PDF](https://arxiv.org/pdf/2603.08588v1)