- Add an EmbeddingClient implementation that computes sentence embeddings locally with SBERT transformers.
- Use pre-trained transformer models serialized into the Open Neural Network Exchange (ONNX) format.
- Use the Deep Java Library and the Microsoft ONNX Java Runtime to run the ONNX models and compute the embeddings efficiently.
- Add a default tokenizer.json and model.onnx for sentence-transformers/all-MiniLM-L6-v2.
- Add a configurable resource caching service that caches remote (http/https) resources on the local file system.
- Document in README.md how to serialize models to ONNX.
- Add a Git LFS configuration for large ONNX model files.
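
For readers unfamiliar with how a sentence embedding is derived from transformer output, the sketch below shows attention-masked mean pooling, the pooling step that sentence-transformers models such as all-MiniLM-L6-v2 conventionally use. It is a sketch under assumptions, not the code in this commit: the class name `MeanPooling`, the method `meanPool`, and the plain-array inputs are hypothetical, and the real client would obtain the token embeddings from the ONNX runtime rather than from literals.

```java
// Hypothetical illustration only; names and array-based inputs are assumptions,
// not the API introduced by this commit.
final class MeanPooling {

    // Averages the embeddings of real tokens (mask == 1) and skips padding (mask == 0).
    static float[] meanPool(float[][] tokenEmbeddings, long[] attentionMask) {
        int dim = tokenEmbeddings[0].length;
        float[] pooled = new float[dim];
        int counted = 0;
        for (int t = 0; t < tokenEmbeddings.length; t++) {
            if (attentionMask[t] == 0) {
                continue; // padding token: contributes nothing to the sentence vector
            }
            for (int d = 0; d < dim; d++) {
                pooled[d] += tokenEmbeddings[t][d];
            }
            counted++;
        }
        int divisor = Math.max(counted, 1); // avoid division by zero for an all-padding input
        for (int d = 0; d < dim; d++) {
            pooled[d] /= divisor;
        }
        return pooled;
    }

    public static void main(String[] args) {
        float[][] tokens = { {1f, 2f}, {3f, 4f}, {9f, 9f} }; // third row is padding
        long[] mask = { 1, 1, 0 };
        float[] sentence = meanPool(tokens, mask);
        System.out.println(java.util.Arrays.toString(sentence));
    }
}
```

The mask matters because batched inputs are padded to a common length; averaging over padding positions would skew the sentence vector.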
*.onnx filter=lfs diff=lfs merge=lfs -text