News
Microsoft’s model BitNet b1.58 2B4T is available on Hugging Face but doesn’t run on GPU and requires a proprietary framework.
Reduced memory requirements are the most obvious advantage of lowering the precision of a model's internal weights. The BitNet b1.58 ...
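The memory saving is easy to estimate from the parameter count alone. A minimal back-of-envelope sketch (illustrative arithmetic, not measured figures; 1.58 bits per weight comes from log2(3) for ternary values):

```python
# Back-of-envelope weight-storage estimate for a 2B-parameter model
# at different precisions. Activations, KV cache, and runtime
# overhead are deliberately ignored here.
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

n = 2e9  # 2 billion parameters, as in BitNet b1.58 2B4T
fp16_gb = weight_memory_gb(n, 16)       # 4.00 GB
ternary_gb = weight_memory_gb(n, 1.58)  # about 0.40 GB
print(f"fp16:     {fp16_gb:.2f} GB")
print(f"1.58-bit: {ternary_gb:.2f} GB")
```

Roughly a 10x reduction in weight storage versus fp16, before any packing overhead.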
matrix multiplication (LLM.int8()), and 8-bit and 4-bit quantization functions. There are ongoing efforts to support further hardware backends, e.g. Intel CPU + GPU, AMD GPU, Apple Silicon, and hopefully NPUs.
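The core idea behind such 8-bit quantization functions can be sketched with symmetric absmax quantization, where each tensor is rescaled so its largest absolute value maps to 127. This is a simplified illustration of the general technique, not the library's actual implementation:

```python
import numpy as np

def absmax_quantize(x: np.ndarray):
    """Symmetric 8-bit absmax quantization: rescale so the largest
    absolute value maps to 127, then round to int8."""
    scale = 127.0 / np.max(np.abs(x))
    q = np.round(x * scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values by dividing the scale out."""
    return q.astype(np.float32) / scale

w = np.array([0.5, -1.27, 0.02, 0.9], dtype=np.float32)
q, s = absmax_quantize(w)      # q = [50, -127, 2, 90]
w_hat = dequantize(q, s)       # close to the original weights
```

Outliers in the input stretch the scale and crush small values toward zero, which is exactly the problem the mixed-precision decomposition in LLM.int8() is designed to mitigate.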
Microsoft’s new BitNet b1.58 model significantly reduces memory and energy requirements while matching the capabilities of ...
native 1-bit LLM trained at scale», with 2 billion parameters and a training dataset of 4 trillion tokens. Unlike previous post-training quantization attempts, which often degrade performance ...
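The "1.58-bit" figure reflects ternary weights in {-1, 0, +1} (log2(3) ≈ 1.58 bits of information per weight). A minimal sketch of absmean ternarization in the spirit of the BitNet b1.58 papers, assuming the simple round-and-clip rule described there rather than Microsoft's exact implementation:

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Absmean ternarization: scale by the mean absolute weight,
    then round and clip each value to {-1, 0, +1}."""
    gamma = np.mean(np.abs(w)) + eps
    q = np.clip(np.round(w / gamma), -1, 1).astype(np.int8)
    return q, gamma

w = np.array([0.8, -0.05, 1.2, -0.9], dtype=np.float32)
q, g = ternary_quantize(w)   # q = [1, 0, 1, -1]
```

Because every weight is -1, 0, or +1, matrix multiplication reduces to additions and subtractions, which is where the energy savings come from. Training natively in this format, rather than quantizing after the fact, is what distinguishes BitNet from post-training quantization.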
The Register on MSN · 6d
El Reg's essential guide to deploying LLMs in production
Running GenAI models is easy. Scaling them to thousands of users, not so much.
Hands On: You can spin up a chatbot with ...