Stanford CS336: Language Modeling from Scratch
Table of Contents
- Lecture 1: Overview & Tokenization
- Lecture 2: PyTorch & Resource Accounting
- Lecture 3: Architectures & Hyperparameters
- Lecture 4: Mixture of Experts
- Lecture 5: GPUs
- Lecture 6: Kernels & Triton
- Lecture 7: Parallelism I
- Lecture 8: Parallelism II
- Lecture 9: Scaling Laws I
- Lecture 10: Inference
- Lecture 11: Scaling Laws II
- Lecture 12: Evaluation
- Lecture 13: Data I
- Lecture 14: Data II
- Lecture 15: Alignment with SFT/RLHF
- Lecture 16: Alignment with RL I
- Lecture 17: Alignment with RL II