<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>4 - EM Algorithm</title>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.16.9/dist/katex.min.css">
<style>
body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif; line-height: 1.6; max-width: 800px; margin: 0 auto; padding: 20px; background: #1a1a2e; color: #eaeaea; }
h1 { color: #e94560; border-bottom: 2px solid #e94560; padding-bottom: 10px; }
h2 { color: #0f3460; background: #16213e; padding: 10px; border-left: 4px solid #e94560; margin-top: 30px; }
.module { background: #0f3460; padding: 15px; margin: 20px 0; border-radius: 5px; }
.module-title { color: #e94560; font-weight: bold; margin-bottom: 10px; }
code { background: #1a1a2e; padding: 2px 6px; border-radius: 3px; color: #f0f6f6; }
pre { background: #1a1a2e; padding: 15px; border-radius: 5px; overflow-x: auto; }
.symbol-map { background: #16213e; padding: 10px; margin: 5px 0; border-left: 3px solid #0f3460; }
.warning { background: #e94560; color: #fff; padding: 10px; border-radius: 5px; margin: 10px 0; }
.youtube { background: #ff0000; color: #fff; padding: 10px; border-radius: 5px; display: inline-block; margin: 10px 0; }
</style>
</head>
<body>
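<h1>👾 Day 4: The EM Algorithm</h1>
<div class="module">
<div class="module-title">1️⃣ The Technical Debt &amp; Evolution</div>
Supervised learning needs fully labeled data, but real-world data are often incomplete or depend on latent variables that are never observed. The EM algorithm handles this by alternating between estimating the distribution of the latent variables and re-estimating the model parameters.
</div>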
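<div class="module">
<div class="module-title">2️⃣ Visual Intuition</div>
Picture a Gaussian mixture model: the E step makes a soft guess, for each point, of how likely it is to belong to each Gaussian; the M step updates each Gaussian's parameters from those guesses; the two steps repeat until convergence.
<div class="youtube">🎬 Search Bilibili for: <code>EM 算法 直观解释</code> ("EM algorithm, intuitive explanation")</div>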
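<p>The round described above can be sketched for a one-dimensional, two-component mixture. This is a minimal illustration, not the course's reference solution: the function names <code>normal_pdf</code>, <code>e_step</code> and <code>m_step</code>, the toy data, and the starting parameters are all assumptions made for this example.</p>

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def e_step(data, weights, means, stds):
    """Soft-assign each point: resp[i][k] = P(component k | x_i)."""
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp

def m_step(data, resp):
    """Re-estimate weights, means and standard deviations from the soft assignments."""
    n, k = len(data), len(resp[0])
    weights, means, stds = [], [], []
    for j in range(k):
        nk = sum(r[j] for r in resp)  # effective number of points in component j
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nk
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
        weights.append(nk / n)
        means.append(mean)
        stds.append(math.sqrt(var))
    return weights, means, stds

# Toy data with two visible clusters (made up for this sketch).
data = [-2.1, -1.9, -2.0, 2.0, 1.8, 2.2]
resp = e_step(data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
weights, means, stds = m_step(data, resp)
```

<p>After this single round, each row of <code>resp</code> sums to 1, and the two means have already moved from the initial guesses toward the two clusters.</p>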
</div>
<div class="module">
<div class="module-title">3️⃣ The Symbol Decoder</div>
<div class="symbol-map"><strong>$\theta$</strong> → <code>self.params</code> (model parameters)</div>
<div class="symbol-map"><strong>$z$</strong> → <code>latent_vars</code> (latent variables)</div>
<div class="symbol-map"><strong>$Q(\theta | \theta^{(t)})$</strong> → <code>Q_function</code> (expected lower bound)</div>
<div class="symbol-map"><strong>$\mathcal{L}(\theta)$</strong> → <code>log_likelihood</code> (log-likelihood)</div>
</div>
<div class="module">
<div class="module-title">4️⃣ The Math</div>
### Log-likelihood function
$$\mathcal{L}(\theta) = \log P(X | \theta) = \log \sum_{Z} P(X, Z | \theta)$$
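<p>For a one-dimensional Gaussian mixture, the sum over $Z$ becomes a sum over components, and the log of that sum is usually computed with the log-sum-exp trick to avoid underflow. The sketch below assumes that setting; <code>log_normal_pdf</code>, <code>log_sum_exp</code> and the argument layout are names chosen for this example, not part of the course code.</p>

```python
import math

def log_normal_pdf(x, mean, std):
    """Log density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def log_sum_exp(values):
    """Compute log(sum(exp(v))) without underflow by factoring out the largest term."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def log_likelihood(data, weights, means, stds):
    """Sum over samples of log sum_k weight_k * N(x | mean_k, std_k)."""
    total = 0.0
    for x in data:
        terms = [math.log(w) + log_normal_pdf(x, m, s)
                 for w, m, s in zip(weights, means, stds)]
        total += log_sum_exp(terms)
    return total

# A point far from both components still yields a finite value.
far_value = log_likelihood([1000.0], [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

<p>Working in log space is a design choice: multiplying raw densities for distant points would round to zero before the logarithm is taken.</p>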
### E step: construct the lower bound
$$\mathcal{L}(\theta) \geq Q(\theta | \theta^{(t)}) = \sum_{Z} P(Z | X, \theta^{(t)}) \log \frac{P(X, Z | \theta)}{P(Z | X, \theta^{(t)})}$$
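<p>The inequality follows from Jensen's inequality applied to the concave logarithm. A sketch of the step, where $q(Z)$ is an arbitrary distribution over the latent variables (a symbol introduced here for illustration):</p>
$$\log \sum_{Z} P(X, Z | \theta) = \log \sum_{Z} q(Z) \frac{P(X, Z | \theta)}{q(Z)} \geq \sum_{Z} q(Z) \log \frac{P(X, Z | \theta)}{q(Z)}$$
<p>Choosing $q(Z) = P(Z | X, \theta^{(t)})$ gives the bound above. At $\theta = \theta^{(t)}$ the ratio inside the logarithm equals $P(X | \theta^{(t)})$ for every $Z$, so the bound is tight there:</p>
$$Q(\theta^{(t)} | \theta^{(t)}) = \mathcal{L}(\theta^{(t)})$$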
### M step: maximize the lower bound
$$\theta^{(t+1)} = \text{argmax}_{\theta} Q(\theta | \theta^{(t)})$$
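<p>For a one-dimensional Gaussian mixture this maximization has a closed form. As a worked instance, write $\gamma_{ik}$ for the E-step posterior of sample $x_i$ under component $k$, and $\pi_k$, $\mu_k$, $\sigma_k^2$ for the mixing weight, mean and variance of component $k$ (these symbols are assumptions added for this example):</p>
$$\gamma_{ik} = P(z_i = k | x_i, \theta^{(t)}), \quad N_k = \sum_{i=1}^{N} \gamma_{ik}$$
$$\pi_k = \frac{N_k}{N}, \quad \mu_k = \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} x_i, \quad \sigma_k^2 = \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} (x_i - \mu_k)^2$$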
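### Iterative convergence
$$\mathcal{L}(\theta^{(t+1)}) \geq \mathcal{L}(\theta^{(t)})$$
An iteration never decreases the log-likelihood. When the log-likelihood is bounded above, the sequence of values therefore converges, typically to a local optimum rather than a guaranteed global one.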
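<p>The monotonicity property can be checked numerically. The sketch below runs EM on a small one-dimensional, two-component mixture and records the log-likelihood before every update; <code>em_fit</code>, the hand-picked data and the initial parameters are assumptions for illustration only.</p>

```python
import math

def em_fit(data, weights, means, stds, iterations=30):
    """Run EM on a 1-D Gaussian mixture; return final parameters and the log-likelihood trace."""
    history = []
    for _ in range(iterations):
        # E step: responsibilities under the current parameters.
        resp, loglik = [], 0.0
        for x in data:
            joint = [
                w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
                for w, m, s in zip(weights, means, stds)
            ]
            total = sum(joint)  # P(x | current parameters)
            loglik += math.log(total)
            resp.append([j / total for j in joint])
        history.append(loglik)  # log-likelihood of the parameters used in this E step
        # M step: closed-form weighted averages.
        weights, means, stds = [], [], []
        for k in range(len(resp[0])):
            nk = sum(r[k] for r in resp)
            mean = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
            weights.append(nk / len(data))
            means.append(mean)
            stds.append(math.sqrt(var))
    return weights, means, stds, history

# Two hand-picked clusters around -2.88 and 2.96 (made-up data).
data = [-4.1, -3.4, -2.9, -2.2, -1.8, 1.7, 2.4, 3.0, 3.5, 4.2]
weights, means, stds, history = em_fit(data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])

# Allow a tiny tolerance for floating-point rounding near convergence.
is_monotone = all(later >= earlier - 1e-9 for earlier, later in zip(history, history[1:]))
print(is_monotone)
```

<p>The sketch has no safeguard against a component collapsing onto a single point (variance going to zero); a practical implementation would likely add a variance floor.</p>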
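</div>
<div class="module">
<div class="module-title">5️⃣ The Optimization Bottleneck</div>
Every iteration has to sweep over all samples to compute the expectation, so the total cost is $O(N \cdot K \cdot I)$, where $N$ is the number of samples, $K$ is the number of values the latent variable can take (the number of mixture components in a Gaussian mixture), and $I$ is the number of iterations.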
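<p>To make the bound concrete, the toy counter below mirrors the loop structure of a plain EM implementation and counts one density evaluation per (sample, component) pair per iteration. The function name is hypothetical, and the count ignores the per-evaluation cost, which also grows with the data dimension.</p>

```python
def count_density_evaluations(n_samples, n_components, n_iterations):
    """Count the density evaluations performed by the E steps of a plain EM loop."""
    evaluations = 0
    for _ in range(n_iterations):
        for _ in range(n_samples):          # every sample ...
            for _ in range(n_components):   # ... against every component
                evaluations += 1
    return evaluations

print(count_density_evaluations(1000, 3, 50))  # 150000
```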
</div>
<div class="module">
<div class="module-title">6️⃣ The OJ Mission</div>
<div class="warning">🎯 Task: <code>cd exercises/ &amp;&amp; python3 day4_task.py</code></div>
Implement the EM algorithm to fit a Gaussian mixture model, then verify on bimodal data that the parameters converge.
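<p>The contents of <code>day4_task.py</code> are not shown here, so the following is only a possible test harness under assumed names: <code>make_bimodal</code> draws bimodal data with a fixed seed, and <code>has_converged</code> compares the parameters of two consecutive iterations. The centers, spread, sample size and tolerance are illustrative defaults.</p>

```python
import random

def make_bimodal(n_per_mode=200, centers=(-3.0, 3.0), std=1.0, seed=0):
    """Draw the same number of points around each of two centers."""
    rng = random.Random(seed)
    data = [rng.gauss(c, std) for c in centers for _ in range(n_per_mode)]
    rng.shuffle(data)  # mix the modes so the order carries no label information
    return data

def max_param_change(old_params, new_params):
    """Largest absolute change between two flat parameter lists."""
    return max(abs(a - b) for a, b in zip(old_params, new_params))

def has_converged(old_params, new_params, tol=1e-6):
    """True once no parameter moved by more than tol in the last iteration."""
    return tol >= max_param_change(old_params, new_params)

data = make_bimodal()
print(len(data))  # 400
```

<p>One way to use it: flatten the weights, means and standard deviations into a single list after each M step, and stop iterating once <code>has_converged</code> returns <code>True</code>.</p>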
</div>
<script src="https://cdn.jsdelivr.net/npm/katex@0.16.9/dist/katex.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/katex@0.16.9/dist/contrib/auto-render.min.js"></script>
<script>
renderMathInElement(document.body, {
  delimiters: [
    {left: '$$', right: '$$', display: true},
    {left: '$', right: '$', display: false}
  ],
  throwOnError: false
});
</script>
</body>
</html>