HCCL Collective Communication Test Environment Setup

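1. Quick Start. HCCL (Huawei Collective Communication Library) is a high-performance collective communication library built for Ascend AI processors. Its functionality and role are similar to NVIDIA's NCCL library: it is mainly used for collective communication. The CANN library ships with a set of test tools for analyzing collective communication performance. 1.1 Build environment configuration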

Llama2 Deployment Notes

1. Creating a Docker container. 1.1 Docker commands. List all images: docker images; create a container: docker run ...<see Section 1.2>; list all containers: docker ps -a; start a container: docker start &l

Heterogeneous Clusters

1. HeteroG. Reference: Yi X, Zhang S, Luo Z, et al. Optimizing Distributed Training Deployment in Heterogeneous GPU Clusters: Proceedings of the 16th Intern

3D Parallelism

Reference: Narayanan D, Shoeybi M, Casper J, et al. Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM: SC '21, November 14-19

Pipeline Parallelism

Reference: Huang Y, Cheng Y, Chen D, et al. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism: 33rd Conference on Neural Inform