Understanding ONNX vs Quantized Models in LangChain4j
When working with embeddings in LangChain4j, you might have noticed that many models are available in two flavors:
- Original (ONNX)
- Quantized (ONNX, with a `-q` suffix)
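For example, with the popular all-MiniLM-L6-v2 embedding model, the two flavors ship as separate Maven artifacts, distinguished only by the `-q` suffix (coordinates shown as an illustrative sketch; verify the exact version against the LangChain4j documentation):

```xml
<!-- Original (full-precision) ONNX model -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings-all-minilm-l6-v2</artifactId>
    <version>${langchain4j.version}</version>
</dependency>

<!-- Quantized variant — note the -q suffix -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings-all-minilm-l6-v2-q</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
```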
But what’s the actual difference between them, and which one should you choose? Let’s break it down.