---
name: arxiv-digest
description: "Daily arXiv digest generation for embodied intelligence, representation learning, and reinforcement learning. Use when Codex needs to: (1) fetch recent papers from arXiv, (2) rank them with an applied-research bias, (3) pick 2-3 papers per domain, (4) translate abstracts into Chinese, add short explanations and tag keywords, (5) render mobile-friendly digest cards, or (6) publish the digest to Discord threads/channels on a schedule."
---

# arXiv Digest

Use `scripts/run_daily.py` as the single entry point.

## Workflow

1. Fetch recent arXiv papers with `scripts/fetch_arxiv.py`.
2. Score papers for domain fit, applied value, innovation, and recency.
3. Select 2-3 papers for each domain:
   - embodied intelligence (具身智能)
   - representation learning (表征学习)
   - reinforcement learning (强化学习)
4. Use the local Ollama model to produce:
   - a Chinese translation of the abstract (中文摘要翻译)
   - a short explanation of the paper's value (简短价值解读)
   - card tags (卡片标签)
5. Render two outputs:
   - mobile-friendly HTML digest with expandable cards
   - Markdown archive for Discord / quick search
6. Publish to Discord:
   - `thread` mode: OpenClaw-native daily thread/forum post
   - `channel` mode: create one dated text channel per day through the Discord REST API, then post via OpenClaw
   - `fixed-channel` mode: reuse one stable channel name such as `robotdaily`, creating the channel if it is missing
   - `existing-channel` mode: post to a fixed channel ID (best when the target channel is already known)
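
Steps 2 and 3 can be sketched as a weighted score followed by a per-domain top-k cut. This is a minimal illustration, not the logic in `scripts/run_daily.py`: the `Paper` fields, the weights, and the helper names `score` and `select_per_domain` are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical weights, for illustration only.
WEIGHTS = {"domain_fit": 0.4, "applied_value": 0.3, "innovation": 0.2, "recency": 0.1}


@dataclass
class Paper:
    arxiv_id: str
    domain: str
    # Each signal is assumed to be normalized to the range [0, 1].
    signals: dict = field(default_factory=dict)


def score(paper):
    """Weighted sum of the four signals; a missing signal counts as 0."""
    return sum(w * paper.signals.get(name, 0.0) for name, w in WEIGHTS.items())


def select_per_domain(papers, k=3):
    """Return up to k highest-scoring papers for each domain."""
    by_domain = defaultdict(list)
    for paper in papers:
        by_domain[paper.domain].append(paper)
    return {
        domain: sorted(group, key=score, reverse=True)[:k]
        for domain, group in by_domain.items()
    }
```

Capping at `k` rather than forcing exactly `k` lets a thin day yield fewer than three papers for a domain instead of padding the digest with weak candidates.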

## Run commands

Dry run without Discord:

```bash
python3 scripts/run_daily.py
```

Generate digest and publish to Discord:

```bash
python3 scripts/run_daily.py --publish-discord
```

Generate digest but skip LLM enrichment:

```bash
python3 scripts/run_daily.py --skip-enrich
```

## Config

Read `references/selection-and-delivery.md` when you need to tune scoring or choose the Discord delivery mode.

Common env vars in `arxiv-digest/.env`:

- `INSIGHT_MODELS=glm-4.7:cloud,qwen3.5:cloud,qwen3.5:27b,glm-4.7-flash-64k:latest`
- `ROBOTDAILY_OUTPUT_DIR=/path/to/output`
- `DISCORD_DELIVERY_MODE=thread|channel|fixed-channel|existing-channel`
- `DISCORD_ACCOUNT_ID=codex`
- `DISCORD_GUILD_ID=...`
- `DISCORD_PARENT_CHANNEL_ID=...`
- `DISCORD_TARGET_CHANNEL_ID=...`
- `DISCORD_TARGET_CHANNEL_NAME=robotdaily`
- `DISCORD_CATEGORY_ID=...`
- `DISCORD_BOT_TOKEN=...` (needed when a missing channel must be created)
- `DISCORD_THREAD_AUTO_ARCHIVE_MIN=10080`
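
Before a scheduled run, it can help to check these settings up front. The sketch below validates `DISCORD_DELIVERY_MODE` and splits `INSIGHT_MODELS`. The mapping from each mode to the variables it requires is an assumption inferred from the variable names, and `check_discord_env` and `parse_models` are hypothetical helpers, not part of the scripts.

```python
import os

VALID_MODES = {"thread", "channel", "fixed-channel", "existing-channel"}

# Assumed requirements per mode; confirm against
# references/selection-and-delivery.md before relying on them.
REQUIRED_BY_MODE = {
    "thread": ["DISCORD_PARENT_CHANNEL_ID"],
    "channel": ["DISCORD_GUILD_ID", "DISCORD_BOT_TOKEN"],
    "fixed-channel": ["DISCORD_GUILD_ID", "DISCORD_TARGET_CHANNEL_NAME"],
    "existing-channel": ["DISCORD_TARGET_CHANNEL_ID"],
}


def check_discord_env(env=None):
    """Return a list of problems in the Discord settings (empty if none)."""
    env = os.environ if env is None else env
    mode = env.get("DISCORD_DELIVERY_MODE", "")
    if mode not in VALID_MODES:
        return [f"DISCORD_DELIVERY_MODE must be one of {sorted(VALID_MODES)}, got {mode!r}"]
    return [
        f"{name} is required for mode {mode!r}"
        for name in REQUIRED_BY_MODE[mode]
        if not env.get(name)
    ]


def parse_models(value):
    """Split a comma-separated INSIGHT_MODELS value, dropping empty entries."""
    return [m.strip() for m in value.split(",") if m.strip()]
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.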

## Output

Each run writes a dated bundle containing:

- `candidates.json`
- `selected.json`
- `enriched.json`
- `robotdaily.html`
- `robotdaily.md`
- `manifest.json`
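
A quick way to confirm a run completed is to check that every file listed above is present. This is a minimal sketch: `missing_files` is a hypothetical helper, and the caller is assumed to know the path of the dated bundle directory, since its naming scheme is not specified here.

```python
from pathlib import Path

EXPECTED_FILES = [
    "candidates.json",
    "selected.json",
    "enriched.json",
    "robotdaily.html",
    "robotdaily.md",
    "manifest.json",
]


def missing_files(bundle_dir):
    """Return the expected bundle files that are absent from bundle_dir."""
    bundle = Path(bundle_dir)
    return [name for name in EXPECTED_FILES if not (bundle / name).is_file()]
```

An empty list means the bundle is complete; anything else names the files to investigate.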

## Scheduling

The pipeline is designed for a daily 10:30 run in Asia/Shanghai.

Example cron entry (cron uses the host's local time, so this assumes the host timezone is Asia/Shanghai):

```cron
30 10 * * * cd /path/to/robdaily/arxiv-digest && python3 scripts/run_daily.py --publish-discord >> logs/robotdaily.log 2>&1
```
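
If the host runs in a different timezone, the minute and hour fields need converting. The sketch below uses the standard-library `zoneinfo` module (Python 3.9+, with system tz data available); `cron_fields_for` is a hypothetical helper, and UTC is only an example host timezone.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo


def cron_fields_for(local_time, source_tz, host_tz, on_date=None):
    """Return the (minute, hour) cron fields in host_tz for a wall-clock time in source_tz."""
    day = on_date or datetime.now(ZoneInfo(source_tz)).date()
    moment = datetime.combine(day, local_time, tzinfo=ZoneInfo(source_tz))
    converted = moment.astimezone(ZoneInfo(host_tz))
    return converted.minute, converted.hour


minute, hour = cron_fields_for(time(10, 30), "Asia/Shanghai", "UTC")
print(f"{minute} {hour} * * *")  # prints "30 2 * * *"
```

Asia/Shanghai has a fixed UTC+8 offset, so the result is stable against UTC year-round. For a host timezone that observes daylight saving time, the converted hour shifts with the season and the entry would need revisiting.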