Hello all,
I’m part of a startup deploying AI/ML solutions, and we’re at the stage of selecting server hardware for our infrastructure. We’re considering 4U servers because they can accommodate multiple GPUs and offer high storage capacity.
Our use case involves training deep learning models that demand significant computational power, and we also want headroom to scale as our workloads grow.
Here are a few questions we’re grappling with:
Are 4U servers a practical choice for high-performance AI/ML workloads?
What configurations (e.g., GPU type, CPU specs, RAM) are optimal for handling such tasks?
Are there any challenges with cooling or power management we should be aware of?
If anyone has experience with similar setups or recommendations on specific hardware, I’d greatly appreciate your input!
Thanks in advance!