Benchmarking the SOMOD Neural PC: A Deep Learning Engineer’s Experience

As part of the launch of SOMOD PCs, the MBUZZ Labs team tasked me with benchmarking the new SOMOD Neural PC to evaluate its AI performance. Given the workstation’s powerful specs, I set out to test its deep learning capabilities using industry-standard benchmarks.

Initial Benchmarking Roadblocks

My initial plan was to run Geekbench AI, but I quickly hit a roadblock—it wasn’t detecting the GPU, making it unsuitable for proper evaluation. Next, I considered MLPerf, one of the most recognized AI benchmarking suites. However, downloading the massive datasets required for MLPerf benchmarks would have consumed significant bandwidth and time, so I decided to look for an alternative.

That’s when I stumbled upon PyTorch’s deep learning benchmark repository on GitHub. It provided a lightweight yet effective way to test AI workloads. I made modifications to the test files and adjusted batch sizes to optimize performance for the RTX 5880 Ada GPU. With these tweaks, I finally had a solid benchmarking setup.
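Before any long benchmark run, it's worth confirming that the framework actually sees the GPU, since that's exactly what ruled out Geekbench AI here. A minimal PyTorch sanity check looks like this (the printed values will of course differ per machine):

```python
import torch

# Confirm PyTorch detects the GPU before investing in long benchmark runs.
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("GPU:", props.name)
    print(f"VRAM: {props.total_memory / 1024**3:.1f} GiB")
```

On the workstation under test, this reports the RTX 5880 Ada and its full VRAM pool, which is the green light to proceed.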

Hardware Specifications of the SOMOD Neural PC

Here’s the hardware that powers this AI workstation:

  • CPU: AMD Ryzen Threadripper PRO 5955WX (16 Cores, 32 Threads)
  • Memory: 256GB RAM
  • GPU: NVIDIA RTX 5880 Ada Generation (48GB VRAM)
  • NVIDIA Driver: 550.120
  • CUDA Version: 12.6.77
  • cuDNN: Version bundled with the PyTorch build
  • Motherboard: ASUSTeK Pro WS WRX80E-SAGE SE WIFI
  • Operating System: Ubuntu 22.04.5 LTS
  • PyTorch Version: 2.5.0a0+e000cf0ad9.nv24.10

Deep Learning Benchmarks: Real-World Performance

After setting up the benchmarks, I tested various AI workloads to see how well the SOMOD Neural PC handled them. Here’s what I found:

1. Object Detection – SSD (Single Shot MultiBox Detector) with ResNet50 Backbone on COCO Dataset
  • AMP (Automatic Mixed Precision): 324 images per second
  • FP32: 184 images per second

AMP provided a 1.75x performance boost, demonstrating real-time object detection capabilities.
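For readers who haven’t used it, Automatic Mixed Precision wraps the forward pass in an autocast context so matrix math runs in reduced precision, while a loss scaler protects float16 gradients from underflow. Here is a minimal sketch of one AMP training step; the model and data are toy stand-ins, not the actual benchmark workloads:

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# float16 on GPU; bfloat16 is the safe autocast dtype on CPU.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# The loss scaler guards float16 gradients against underflow;
# it is a transparent no-op when disabled (e.g. on CPU).
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

inputs = torch.randn(32, 128, device=device)
targets = torch.randint(0, 10, (32,), device=device)

optimizer.zero_grad()
with torch.autocast(device_type=device, dtype=amp_dtype):
    logits = model(inputs)  # matmuls run in reduced precision
    loss = nn.functional.cross_entropy(logits, targets)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```

The same pattern, applied to the benchmark models, is what produces the AMP numbers throughout this post.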

2. NLP – BERT on SQuAD v1.1
  • FP16 (with AMP): 237 sequences per second
  • FP32: 134 sequences per second

AMP delivered roughly a 1.8x speedup over FP32, making mixed precision a clear win for language model training.

3. Neural Machine Translation – GNMT
  • FP16 (with AMP): ~133,000 tokens per second
  • FP32: ~82,000 tokens per second

AMP’s smaller memory footprint allowed larger batch sizes and roughly 1.6x faster training throughput.

4. Recommendation Systems – Neural Collaborative Filtering (NCF) on ML-20M Dataset
  • FP16: ~22.1 million samples per second
  • FP32: ~21.5 million samples per second

Both precisions performed nearly identically. NCF throughput is dominated by embedding lookups and memory bandwidth rather than dense math, so precision optimizations add little for recommendation models like this.

5. Image Classification – ResNet50 on Synthetic ImageNet Data
  • AMP: 1,113 images per second
  • FP32: 624 images per second

At roughly 1.8x the FP32 throughput, AMP is the clear choice for high-speed training.

Benchmark Results: The Numbers Speak for Themselves

| Model | Precision | Key Metric | Value | Notes |
|---|---|---|---|---|
| SSD (ResNet50 backbone) | AMP | Images/second | 324 | Real-time object detection capability. |
| SSD (ResNet50 backbone) | FP32 | Images/second | 184 | Roughly half the AMP throughput. |
| BERT (Base, SQuAD) | FP16 (with AMP) | Training sequences/second | 237 | |
| BERT (Base, SQuAD) | FP32 | Training sequences/second | 134 | About 43% slower than FP16 with AMP. |
| GNMT (translation) | FP16 (with AMP) | Tokens/second (training) | ~133,000 | Mixed precision allows larger batch sizes and faster runs. |
| GNMT (translation) | FP32 | Tokens/second (training) | ~82,000 | |
| NCF (recommendation) | FP16 | Training samples/second | ~22.1 million | |
| NCF (recommendation) | FP32 | Training samples/second | ~21.5 million | Nearly on par with FP16. |
| ResNet50 (ImageNet) | AMP | Images/second | ~1,113 | Nearly double the FP32 throughput. |
| ResNet50 (ImageNet) | FP32 | Images/second | ~624 | |
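The speedups quoted throughout this post can be recomputed directly from the measured throughputs in the table. The pairs below are the (mixed precision, FP32) values reported above:

```python
# Recompute the AMP-vs-FP32 speedups from the measured throughputs above.
results = {
    "SSD (ResNet50 backbone)": (324, 184),    # images/s
    "BERT (Base, SQuAD)": (237, 134),         # sequences/s
    "GNMT": (133_000, 82_000),                # tokens/s
    "NCF": (22_100_000, 21_500_000),          # samples/s
    "ResNet50 (ImageNet)": (1_113, 624),      # images/s
}
speedups = {name: amp / fp32 for name, (amp, fp32) in results.items()}
for name, s in speedups.items():
    print(f"{name}: {s:.2f}x")
```

The compute-bound models land between 1.6x and 1.8x, while NCF gains only about 3%, which is consistent with it being memory-bound rather than compute-bound.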

Final Thoughts: Why the SOMOD Neural PC is a Game Changer

🔹 AMP Optimizations Deliver Significant Speed Gains – Automatic Mixed Precision enhances performance without compromising accuracy.

🔹 High Memory Capacity – The 256GB RAM enables large batch sizes and efficient training.

🔹 Strong AI Performance – The Threadripper PRO CPU + NVIDIA RTX 5880 Ada GPU combo delivers workstation-class deep learning power.

🔹 AI-Ready Software Stack – With CUDA 12.6, cuDNN, and PyTorch optimizations, the system is future-proof for next-gen AI workloads.

Need a High-Performance AI Workstation?

If you’re looking for a powerful deep learning machine that can accelerate your AI workloads, the SOMOD Neural PC is the ideal choice.

➡️ Order yours today at somodsystems.com and experience next-level AI computing! 🚀
