Tenstorrent Blackhole™ p100a

The Tenstorrent Blackhole™ p100a is a PCIe AI accelerator built on a RISC-V architecture, with 16 high-performance cores and 28 GB of GDDR6 memory. Packaged as an actively cooled PCIe board, it draws up to 300 W and targets workstations and research environments.

It is designed for both training and inference of neural networks, offering an alternative to traditional GPU-based solutions. The architecture is particularly effective for large memory-intensive models such as LLMs, vision transformers, and reinforcement learning pipelines, while also enabling research into non-GPU compute paradigms.

Setup requires a Linux host, a power supply with enough capacity for the card's roughly 300 W draw, and adequate cooling. Begin by installing the Tenstorrent drivers and software stack. Integration with frameworks such as PyTorch and TensorFlow is supported through bridge interfaces, which are intended to let developers port existing models with minimal changes.
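After installing the driver, a quick sanity check is to confirm that the host actually sees the board. The sketch below scans the standard Linux sysfs PCI tree for a vendor ID and lists any device nodes. It rests on two assumptions that are not stated on this page: that Tenstorrent boards report PCI vendor ID 0x1e52, and that the kernel driver creates nodes under /dev/tenstorrent. Verify both against the driver documentation for your release; the function names are illustrative.

```python
from pathlib import Path

# Assumption: PCI vendor ID reported by Tenstorrent boards.
TENSTORRENT_VENDOR_ID = "0x1e52"


def find_pci_devices(vendor_id: str, sysfs_root: str = "/sys/bus/pci/devices") -> list[str]:
    """Return PCI addresses whose sysfs 'vendor' file matches vendor_id."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    matches = []
    for dev in sorted(root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip().lower()
        except OSError:
            continue  # entry without a readable vendor file
        if vendor == vendor_id.lower():
            matches.append(dev.name)
    return matches


def list_device_nodes(dev_dir: str = "/dev/tenstorrent") -> list[str]:
    """Return device node names found in dev_dir (assumed driver location)."""
    d = Path(dev_dir)
    return sorted(p.name for p in d.iterdir()) if d.is_dir() else []


if __name__ == "__main__":
    boards = find_pci_devices(TENSTORRENT_VENDOR_ID)
    nodes = list_device_nodes()
    print("PCI devices:", boards or "none found")
    print("Device nodes:", nodes or "none found (is the driver loaded?)")
```

If the PCI scan finds the board but no device nodes appear, the card is seated and enumerated but the kernel driver is probably not loaded.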

Development should start with vendor-provided benchmarks and examples to validate performance. Profiling workloads is essential to identify optimization opportunities, with iterative tuning often required to maximize throughput. The open-source nature of the stack also allows developers to customize compilers and runtime behavior.
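When comparing runs during iterative tuning, it helps to measure throughput the same way every time. The following is a minimal, framework-agnostic timing harness, not a vendor tool: `benchmark` and its parameters are hypothetical names, and `step` stands for whatever callable runs one batch on your setup. It assumes `step` blocks until the results are ready; if the runtime executes asynchronously, add the appropriate synchronization inside `step`.

```python
import statistics
import time
from typing import Callable


def benchmark(step: Callable[[], object], batch_size: int,
              warmup: int = 3, iters: int = 10) -> dict:
    """Time repeated calls to step() and report median latency and throughput."""
    if iters < 1:
        raise ValueError("iters must be at least 1")

    # Warm-up runs absorb one-time costs such as compilation and cache fills.
    for _ in range(warmup):
        step()

    durations = []
    for _ in range(iters):
        start = time.perf_counter()
        step()
        durations.append(time.perf_counter() - start)

    median = statistics.median(durations)
    return {
        "median_s": median,
        "min_s": min(durations),
        "max_s": max(durations),
        # Throughput is based on the median so single outliers do not dominate.
        "samples_per_s": batch_size / median if median > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with one inference or training step.
    stats = benchmark(lambda: sum(i * i for i in range(100_000)), batch_size=32)
    print(stats)
```

Recording the minimum and maximum alongside the median makes it easier to spot noisy runs before drawing conclusions from a tuning change.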

Use cases include efficient inference of large language models, quantization experiments, compiler and RISC-V accelerator research, and hybrid CPU/GPU/accelerator computing architectures. Researchers exploring next-generation ML hardware stacks benefit from its openness and programmability.

The card’s power draw of roughly 300 W demands careful infrastructure planning: adequate airflow, sufficient PSU capacity, and a compatible chassis are prerequisites. The Tenstorrent developer community is also smaller than CUDA-based ecosystems, so developers should be comfortable with hands-on integration and with contributing to open-source projects.
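The PSU sizing arithmetic can be sketched in a few lines. Only the 300 W accelerator figure comes from this page; the CPU and rest-of-system wattages below are placeholder values, and the 20% headroom and 50 W rounding step are rule-of-thumb assumptions rather than vendor guidance. The function name is illustrative.

```python
import math


def recommended_psu_watts(component_watts: dict[str, float],
                          headroom: float = 0.2, step: int = 50) -> int:
    """Sum peak component draw, add fractional headroom, round up to a PSU step."""
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    total = sum(component_watts.values())
    return math.ceil(total * (1.0 + headroom) / step) * step


if __name__ == "__main__":
    build = {
        "accelerator": 300,     # from the card's stated power draw
        "cpu": 170,             # placeholder; use your CPU's rated power
        "rest_of_system": 100,  # placeholder for board, drives, fans
    }
    print(recommended_psu_watts(build), "W")
```

With these example numbers the components sum to 570 W, which becomes 684 W with 20% headroom and rounds up to a 700 W supply.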

Success requires strong Linux skills, familiarity with ML frameworks, and experience in performance optimization. Knowledge of profiling, kernel tuning, and accelerator programming models will help unlock the hardware’s full potential.

Key Resources:
- Tenstorrent Official Site
- PyTorch and TensorFlow
- Tenstorrent Community (forums, GitHub) for open-source tools and developer discussions

Related Hardware

Shelby Computer

Custom AI workstation with Tenstorrent Blackhole™ p100a accelerator, AMD Ryzen 9 9950X3D CPU, 64GB DDR5 RAM, and 2TB NVMe SSD....

Booster_T1

Booster_T1 is a humanoid robot with full-force joints and onboard NVIDIA Jetson AGX Orin (200 TOPS). Equipped with RGB-D vision,...