Overview
The Hopefog and Livestrong PODs each have two AMD GPU servers, which support GPU-accelerated Machine Learning (ML) workflows.
Hardware info
Architecture specs
- 2U Gigabyte G291 enclosures (https://www.gigabyte.com/GPU-Server/G291-Z20-rev-100#ov)
- 8 AMD Radeon Instinct MI50 GPUs (https://www.amd.com/en/products/professional-graphics/instinct-mi50)
  - each with 32 GB HBM2 memory
- 48-core/96-thread AMD EPYC 7642 CPU
- 512 GB RAM
- 1.9 TB Samsung NVMe SSD
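
To put the GPU figures above in context, here is a minimal Python sketch that totals the HBM2 memory from the listed specs and, if possible, prints the GPU status table. It reads the list as per-server specs and assumes the ROCm `rocm-smi` command-line tool is installed and on the PATH of the GPU servers; neither is stated on this page. The function names are illustrative only.

```python
import shutil
import subprocess

# Values copied from the hardware list above (read as per-server specs).
GPUS_PER_SERVER = 8
HBM2_GB_PER_GPU = 32
SERVERS_PER_POD = 2


def total_gpu_memory_gb(servers=1):
    """Return the aggregate GPU memory in GB across the given number of servers."""
    return servers * GPUS_PER_SERVER * HBM2_GB_PER_GPU


def show_gpus():
    """Print rocm-smi's default status table.

    Returns False without running anything if rocm-smi is not on PATH
    (assumption: the tool ships with the ROCm install on these servers).
    """
    exe = shutil.which("rocm-smi")
    if exe is None:
        return False
    # With no arguments, rocm-smi prints a one-line-per-GPU summary.
    subprocess.run([exe], check=False)
    return True


if __name__ == "__main__":
    print("GPU memory per server (GB):", total_gpu_memory_gb())
    print("GPU memory per POD (GB):", total_gpu_memory_gb(SERVERS_PER_POD))
    if not show_gpus():
        print("rocm-smi not found; run this on one of the GPU servers.")
```

Run on a GPU server, the status table should list one row per MI50 if all cards are visible to the ROCm driver.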
Resources
ROCm GPU-enabling framework
Best starting places:
- ROCm video series
  - https://community.amd.com/t5/instinct-accelerators-blog/rocm-open-software-ecosystem-for-accelerated-compute/ba-p/418720
  - Especially the Introduction to AMD GPU Hardware
    - Provides hardware background and terminology used throughout other guides
- Also the AMD ROCm Learning Center: https://developer.amd.com/resources/rocm-resources/rocm-learning-center/