sLLM or RAG AI Server 4X Furiosa AI RNGD NPU Server

Regular price $40,000.00 USD. Sale price $35,000.00 USD. Taxes included.

FuriosaAI RNGD NPU-based AI Inference Server: Lower-End Specification

 

(Optimized for cost-efficiency and essential functionality for 4x RNGD NPUs)

This specification aims to provide a functional and stable platform for AI inference with four FuriosaAI RNGD NPUs.

 

Key Component: FuriosaAI RNGD NPU Accelerator Cards

 

  • 4x FuriosaAI RNGD (Renegade) NPU:

    • Architecture: Tensor Contraction Processor (TCP)

    • Process Node: TSMC 5nm

    • Performance: 512 TOPS (INT8), 256 TFLOPS (BF16), 512 TFLOPS (FP8), 1024 TOPS (INT4)

    • Memory: 48GB HBM3

    • Memory Bandwidth: 1.5 TB/s

    • Interface: PCIe Gen5 x16

    • TDP: 150W (Passive cooling, relies on system airflow)

    • Form Factor: PCIe dual-slot, full-height, 3/4 length

    • Note: The NPU itself remains high-performance. The "lower spec" refers to the host server components supporting it.
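
As a rough guide to what this configuration can host for sLLM or RAG workloads, the back-of-envelope sketch below (plain Python) totals the HBM capacity and bandwidth of the four cards using the figures above and estimates the model weight footprint that fits. The KV-cache headroom fraction is an assumption for illustration, not a FuriosaAI figure.

# Back-of-envelope sizing for 4x RNGD; per-card figures are from the spec above,
# the KV-cache headroom fraction is an assumption.
NUM_NPUS = 4
HBM_PER_CARD_GB = 48        # HBM3 per card
BW_PER_CARD_TBS = 1.5       # memory bandwidth per card, TB/s

total_hbm_gb = NUM_NPUS * HBM_PER_CARD_GB   # 192 GB aggregate
total_bw_tbs = NUM_NPUS * BW_PER_CARD_TBS   # 6.0 TB/s aggregate

KV_CACHE_HEADROOM = 0.30    # assumed fraction reserved for KV cache and activations
weight_budget_gb = total_hbm_gb * (1 - KV_CACHE_HEADROOM)

# Roughly 1 GB per billion parameters at 8-bit weights.
for label, bytes_per_param in [("BF16", 2.0), ("FP8/INT8", 1.0), ("INT4", 0.5)]:
    print(f"{label}: ~{weight_budget_gb / bytes_per_param:.0f}B parameters "
          f"in a {weight_budget_gb:.0f} GB weight budget")
print(f"Aggregate HBM: {total_hbm_gb} GB, aggregate bandwidth: {total_bw_tbs} TB/s")

On these assumptions, an 8-bit 70B-parameter model fits comfortably within the aggregate 192GB of HBM, with room left over for KV cache on longer RAG contexts.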

 

Typical Server Platform Specifications (for integrating 4x FuriosaAI RNGD NPUs)

 

  1. NPU Quantity:

    • 4x FuriosaAI RNGD NPU cards.

  2. Processor (CPU):

    • Single Intel Xeon Scalable processor (e.g., a lower-core-count 4th/5th Gen Sapphire Rapids/Emerald Rapids SKU) or a lower-core-count AMD EPYC processor (e.g., Genoa/Bergamo series), in either case chosen for sufficient PCIe Gen5 lanes.

    • Key Feature: Prioritize PCIe Gen5 lane count (80-128 lanes from a single CPU on these platforms) over raw core count, as long as the core count is sufficient for the OS and data handling (e.g., 16-24 cores). A lane and power budget sketch follows the component list.

  3. Motherboard:

    • Single-Socket Server Motherboard (Intel LGA4677 or AMD SP5 socket, matching the CPU above)

    • Key Features:

      • At least 4x PCIe Gen5 x16 physical slots, all wired at x16 from the single CPU.

      • Support for DDR5 Registered ECC RAM.

      • Minimum of 2x NVMe (PCIe Gen4/Gen5) SSD slots (U.2 or M.2).

      • Robust VRM and Power Delivery: Sufficient for the chosen CPU and four 150W NPUs.

      • Integrated Server Management (IPMI/BMC).

  4. System Memory (RAM):

    • 256GB (4x 64GB or 8x 32GB) DDR5 Registered ECC RAM

  5. Storage:

    • Primary: 2TB NVMe PCIe Gen4 SSD (U.2 or M.2)

      • For Operating System, AI frameworks, and core models.

    • Secondary (Optional for lower spec): If budget allows, a 4TB Enterprise SATA SSD for additional model storage could be added later.

  6. Power Supply Unit (PSU):

    • 1200W - 1600W 80 PLUS Platinum Certified, Redundant Power Supplies (recommended); see the power budget sketch after this list.

    • Power Connectors: At least four auxiliary PCIe power connectors (12VHPWR or 8-pin, per the cards' requirement) for the NPUs.

  7. Server Chassis:

    • 4U Rackmount Server Chassis optimized for Multi-Accelerator Cards.

    • Key Features:

      • Designed for passive cooling: The chassis must facilitate excellent front-to-back airflow across the PCIe cards.

      • Adequate spacing for 4x dual-slot RNGD cards.

      • Standard hot-swap drive bays (e.g., 4-8 bays).

      • Basic rail kits for rack mounting.

    • Examples: A basic 4U server chassis from Supermicro or a similar OEM that prioritizes direct front-to-back airflow over additional features.

  8. Cooling System:

    • CPU: Standard server-grade active heatsink for the chosen single CPU.

    • Chassis Fans: A sufficient number of high-CFM, hot-swappable fans (e.g., 3-6 fans) strategically placed for strong front-to-back airflow through the NPU compartment.

  9. Network:

    • 2x 10GbE network interfaces (NICs), often integrated on the server motherboard.

  10. Software Stack:

    • Operating System: Ubuntu Server 22.04 LTS (or a minimal RHEL/CentOS Stream installation).

    • FuriosaAI SDK: Essential for NPU functionality and model deployment (a client-side smoke-test sketch follows this list).

    • AI Frameworks: PyTorch, TensorFlow, ONNX Runtime (as needed).

    • Containerization: Docker (Kubernetes is only needed for orchestration across multiple servers).
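
As referenced in the CPU and PSU items above, the minimal sketch below tallies the PCIe Gen5 lanes and DC power this single-socket build needs to supply. The 150W NPU TDP comes from the card spec; the CPU and peripheral wattages and the NIC lane count are assumptions for illustration.

# Lane and power budget for the single-socket platform.
# NPU TDP is from the card spec; CPU/peripheral wattages and NIC lanes are assumptions.
pcie_lanes = {
    "4x RNGD NPU (x16 each)": 4 * 16,
    "2x NVMe SSD (x4 each)":  2 * 4,
    "10GbE networking":       4,      # assumed; onboard NICs may hang off the chipset instead
}
power_w = {
    "4x RNGD NPU @ 150 W":    4 * 150,
    "CPU (16-24 cores)":      250,    # assumed package power
    "RAM, SSDs, fans, board": 150,    # assumed
}

total_lanes = sum(pcie_lanes.values())   # 76 lanes
total_power = sum(power_w.values())      # ~1000 W

print(f"PCIe Gen5 lanes required: {total_lanes} "
      f"(4th/5th Gen Xeon SP: 80 per socket; EPYC Genoa/Bergamo: 128)")
print(f"Estimated DC load: {total_power} W -> a 1200-1600 W redundant PSU keeps margin")

The roughly 76-lane total is why the CPU item above prioritizes lane count over core count, and the roughly 1,000W estimated load is what the 1200W - 1600W redundant PSU recommendation covers with margin.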
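
For the software stack, assuming the FuriosaAI serving components are brought up behind an OpenAI-compatible HTTP endpoint (a common pattern for LLM serving; consult the FuriosaAI SDK documentation for the actual serving command), a client-side smoke test run on this server could look like the hypothetical sketch below. The URL, port, and model name are placeholders, not FuriosaAI-documented values.

# Hypothetical smoke test against an OpenAI-compatible chat endpoint on this server.
# Endpoint URL, port, and model name are placeholders; check the FuriosaAI SDK docs.
import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"   # assumed local serving endpoint
payload = {
    "model": "placeholder-sllm",   # whichever model the serving process was started with
    "messages": [
        {"role": "user", "content": "Answer from the retrieved context: what is the return policy?"}
    ],
    "max_tokens": 128,
}

resp = requests.post(ENDPOINT, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])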

