Tag: Ternary Weights
All the talks with the tag "Ternary Weights".
Scalable MatMul-free Language Modeling
Sagar Prakash Barad
Published: at 02:00 PM
This talk presents a paper that proposes a scalable MatMul-free language model, challenging the assumption that matrix multiplications are essential for high-performing language models. The paper demonstrates that, by using ternary weights and element-wise Hadamard products, MatMul operations can be removed entirely from large language models while maintaining strong performance. It also provides an optimized implementation of the MatMul-free language model that achieves significant reductions in memory usage and latency compared to conventional models.
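
To illustrate why ternary weights make multiplications unnecessary, here is a minimal sketch in plain Python. It assumes weights restricted to {-1, 0, +1}, so each output is just a signed sum of inputs. The names `ternary_matvec` and `hadamard` are hypothetical helpers chosen for this example; this is not the paper's optimized implementation.

```python
def ternary_matvec(x, w):
    """Compute y = x @ W for W with entries in {-1, 0, +1}.

    Because every weight is -1, 0, or +1, each product x[i] * W[i][j]
    reduces to subtracting x[i], skipping it, or adding it.
    """
    n_out = len(w[0])
    y = [0.0] * n_out
    for i, xi in enumerate(x):
        for j in range(n_out):
            t = w[i][j]
            if t == 1:
                y[j] += xi      # weight +1: add the input
            elif t == -1:
                y[j] -= xi      # weight -1: subtract the input
            # weight 0: contributes nothing
    return y


def hadamard(a, b):
    """Element-wise (Hadamard) product of two equal-length vectors."""
    return [ai * bi for ai, bi in zip(a, b)]


x = [0.5, -1.0, 2.0]
W = [[1, 0],
     [-1, 1],
     [0, -1]]  # illustrative ternary weight matrix, 3 inputs x 2 outputs

y = ternary_matvec(x, W)

# Reference result using ordinary multiply-accumulate, for comparison.
dense = [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

gate = [0.5, 2.0]          # illustrative gating vector
gated = hadamard(gate, y)  # element-wise mixing, no matrix product involved

print(y, dense, gated)
```

The accumulate-only loop gives the same result as the dense product, which is the property that lets ternary layers replace multiply-accumulate hardware paths with additions and subtractions.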