Contact Us

Enquiries

Whether you represent a corporate, a consultancy, a government or an MSSP, we’d love to hear from you. To discover just how our offensive security contractors could help, get in touch.

+44 (0)208 102 0765

Atlan Digital Limited
86-90 Paul Street
London
EC2A 4NE

Summary: Scalable MatMul-free Language Modeling 

ATLAN TEAM

Authors: Rui-Jie Zhu, Yu Zhang, Ethan Sifferman, Tyler Sheaves, Yiqiao Wang, Dustin Richmond, Peng Zhou, Jason K. Eshraghian

Abstract: This paper presents a method to eliminate matrix multiplication (MatMul) from large language models (LLMs), significantly reducing computational cost and memory usage. The proposed models perform comparably to state-of-the-art Transformers while using less memory, particularly during inference. The approach includes a GPU-efficient implementation and a custom FPGA hardware solution, demonstrating that MatMul-free models remain efficient and scalable at billion-parameter scale.
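The key enabler is constraining dense-layer weights to the ternary set {-1, 0, +1}, so every "multiplication" in x @ W collapses into an addition, a subtraction, or nothing at all. The NumPy sketch below is a minimal illustration of that idea, not the authors' implementation; the function names (ternary_quantize, matmul_free_linear) are hypothetical, and the absmean scaling is an assumption in the spirit of BitNet-style quantization.

import numpy as np

# Illustrative only: with weights constrained to {-1, 0, +1}, the product
# x @ W needs no multiplications -- each output is a signed sum of inputs.

def ternary_quantize(w, eps=1e-5):
    # Absmean scaling, then round-to-nearest into {-1, 0, +1}
    # (BitNet-style quantization; an assumption for this sketch).
    scale = np.abs(w).mean() + eps
    return np.clip(np.round(w / scale), -1, 1), scale

def matmul_free_linear(x, w_ternary, scale):
    # Equivalent to x @ (scale * w_ternary), computed with adds/subs only.
    # The explicit loop is for clarity; real hardware accumulates in parallel.
    out = np.zeros((x.shape[0], w_ternary.shape[1]))
    for j in range(w_ternary.shape[1]):
        add = w_ternary[:, j] == 1   # input columns to add
        sub = w_ternary[:, j] == -1  # input columns to subtract
        out[:, j] = x[:, add].sum(axis=1) - x[:, sub].sum(axis=1)
    return out * scale

x = np.random.randn(2, 8)
w_t, s = ternary_quantize(np.random.randn(8, 4))
assert np.allclose(matmul_free_linear(x, w_t, s), x @ (w_t * s))

This is the property the GPU-efficient implementation and the FPGA solution mentioned in the abstract exploit: once multiplications are gone, the dominant cost shifts to memory movement and accumulation, which is exactly where custom hardware pays off.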

For detailed insights, visit the full paper here.
