Add builder pattern to MongoDBAtlasVectorStore and refactor package name

The MongoDBAtlasVectorStore implementation has been enhanced with a builder
pattern to provide a more flexible and type-safe way to configure the vector
store. This change improves the developer experience by making the API more
intuitive and less error-prone.

The old constructors and configuration classes have been deprecated in favor
of the builder pattern. This aligns with Spring's best practices for
configuration APIs.

Additionally, the package has been refactored to
org.springframework.ai.vectorstore.mongodb.atlas to avoid having
multiple vector store modules share the same package name.

Documentation has been updated to reflect these changes and provide
examples of using the new builder pattern.
Soby Chacko
2024-12-09 17:59:35 -05:00
committed by Mark Pollack
parent 03a9379bb7
commit 2d3bdcddbd
10 changed files with 494 additions and 255 deletions

@@ -1,37 +1,29 @@
= MongoDB Atlas
This section walks you through setting up MongoDB Atlas as a vector store to use with Spring AI.
== What is MongoDB Atlas?
https://www.mongodb.com/products/platform/atlas-database[MongoDB Atlas] is the fully-managed cloud database from MongoDB available in AWS, Azure, and GCP.
Atlas supports native Vector Search and full text search on your MongoDB document data.
https://www.mongodb.com/products/platform/atlas-vector-search[MongoDB Atlas Vector Search] allows you to store your embeddings in MongoDB documents, create vector search indexes, and perform KNN searches with an approximate nearest neighbor algorithm (Hierarchical Navigable Small Worlds).
You can use the `$vectorSearch` aggregation operator in a MongoDB aggregation stage to perform a search on your vector embeddings.
== Prerequisites
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. To get started with MongoDB Atlas, you can follow the instructions https://www.mongodb.com/docs/atlas/getting-started/[here]. Ensure that your IP address is included in your Atlas projects https://www.mongodb.com/docs/atlas/security/ip-access-list/#std-label-access-list[access list].
- An `EmbeddingModel` instance to compute the document embeddings. Several options are available. Refer to the https://docs.spring.io/spring-ai/reference/api/embeddings.html#available-implementations[EmbeddingModel] section for more information.
- An environment to set up and run a Java application.
* An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. To get started with MongoDB Atlas, you can follow the instructions https://www.mongodb.com/docs/atlas/getting-started/[here]. Ensure that your IP address is included in your Atlas project's https://www.mongodb.com/docs/atlas/security/ip-access-list/#std-label-access-list[access list].
* A running MongoDB Atlas instance with Vector Search enabled
* Collection with vector search index configured
* Collection schema with id (string), content (string), metadata (document), and embedding (vector) fields
* Proper access permissions for index and collection operations
== Auto-configuration
Spring AI provides Spring Boot auto-configuration for the MongoDB Atlas Vector Store.
To enable it, add the following dependency to your project's Maven `pom.xml` file:
[source, xml]
[source,xml]
----
<dependency>
<groupId>org.springframework.ai</groupId>
@@ -39,7 +31,7 @@ To enable it, add the following dependency to your project's Maven `pom.xml` fil
</dependency>
----
or to your Gradle `build.gradle` build file.
or to your Gradle `build.gradle` build file:
[source,groovy]
----
@@ -48,215 +40,192 @@ dependencies {
}
----
The vector store implementation can initialize the requisite schema for you, but you must opt-in by specifying the `initializeSchema` boolean in the appropriate constructor or by setting `...initialize-schema=true` in the `application.properties` file.
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file.
=== Schema Initialization
The vector store implementation can initialize the requisite schema for you, but you must opt-in by specifying the `initializeSchema` boolean in the appropriate constructor or by setting `spring.ai.vectorstore.mongodb.initialize-schema=true` in the `application.properties` file.
The vector store implementation can initialize the requisite schema for you, but you must opt-in by setting `spring.ai.vectorstore.mongodb.initialize-schema=true` in the `application.properties` file.
Alternatively, you can opt out of the initialization and create the index manually using the MongoDB Atlas UI, Atlas Administration API, or Atlas CLI, which can be useful if the index needs advanced mapping or additional configuration.
NOTE: This is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.
When `initializeSchema` is set to `true`, the following actions are performed automatically:
- **Collection Creation**: The specified collection for storing vectors will be created if it does not already exist.
- **Search Index Creation**: A search index will be created based on the configuration properties.
If you're running a free or shared tier cluster, you must separately create the index through the Atlas UI, Atlas Administration API, or Atlas CLI.
NOTE: If you have an existing Atlas Vector Search index called `vector_index` on the `springai_test.vector_store` collection, Spring AI won't create an additional index. Because of this, you might experience errors later if the existing index was configured with incompatible settings, such as a different number of dimensions.
Ensure that your index has the following configuration:
[source,json]
----
{
"fields": [
{
"numDimensions": 1536,
"path": "embedding",
"similarity": "cosine",
"type": "vector"
}
]
}
----
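As a rough illustration of how such a definition is put together, the sketch below assembles the same structure from plain `java.util` maps: one `vector` field for the embedding path, plus one `filter` field per filterable metadata key under the `metadata.` prefix. The class and method names (`IndexDefinitionSketch`, `indexDefinition`) are hypothetical and not part of Spring AI; only the field layout is taken from the configuration above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spring AI: models the index definition as nested maps.
class IndexDefinitionSketch {

    static Map<String, Object> indexDefinition(String pathName, int numDimensions, List<String> metadataFields) {
        List<Map<String, Object>> fields = new ArrayList<>();

        // The vector field: must agree with the embedding model's dimensions.
        Map<String, Object> vectorField = new LinkedHashMap<>();
        vectorField.put("type", "vector");
        vectorField.put("path", pathName);
        vectorField.put("numDimensions", numDimensions);
        vectorField.put("similarity", "cosine");
        fields.add(vectorField);

        // One filter field per metadata key that should be usable in filters.
        for (String name : metadataFields) {
            Map<String, Object> filterField = new LinkedHashMap<>();
            filterField.put("type", "filter");
            filterField.put("path", "metadata." + name);
            fields.add(filterField);
        }

        Map<String, Object> definition = new LinkedHashMap<>();
        definition.put("fields", fields);
        return definition;
    }

    public static void main(String[] args) {
        System.out.println(indexDefinition("embedding", 1536, List.of("author", "year")));
    }
}
```

If the `numDimensions` value of an existing index differs from what the configured `EmbeddingModel` produces, that is the kind of mismatch the note above warns about.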
Please have a look at the list of <<mongodbvector-properties,configuration parameters>> for the vector store to learn about the default values and configuration options.
Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
Here is an example of the needed bean:
Now you can auto-wire the `MongoDBAtlasVectorStore` as a vector store in your application:
[source,java]
----
@Autowired VectorStore vectorStore;
// ...
List<Document> documents = List.of(
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
new Document("The World is Big and Salvation Lurks Around the Corner"),
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
// Add the documents to MongoDB Atlas
vectorStore.add(documents);
// Retrieve documents similar to a query
List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
----
[[mongodbvector-properties]]
=== Configuration Properties
To connect to MongoDB Atlas and use the `MongoDBAtlasVectorStore`, you need to provide access details for your instance.
A simple configuration can be provided via Spring Boot's `application.yml`:
[source,yaml]
----
spring:
data:
mongodb:
uri: <mongodb atlas connection string>
database: <database name>
ai:
vectorstore:
mongodb:
initialize-schema: true
collection-name: custom_vector_store
vector-index-name: custom_vector_index
path-name: custom_embedding
metadata-fields-to-filter: author,year
----
Properties starting with `spring.ai.vectorstore.mongodb.*` are used to configure the `MongoDBAtlasVectorStore`:
[cols="2,5,1",stripes=even]
|===
|Property | Description | Default Value
|`spring.ai.vectorstore.mongodb.initialize-schema`| Whether to initialize the required schema | `false`
|`spring.ai.vectorstore.mongodb.collection-name` | The name of the collection to store the vectors | `vector_store`
|`spring.ai.vectorstore.mongodb.vector-index-name` | The name of the vector search index | `vector_index`
|`spring.ai.vectorstore.mongodb.path-name` | The path where vectors are stored | `embedding`
|`spring.ai.vectorstore.mongodb.metadata-fields-to-filter` | Comma-separated list of metadata fields that can be used for filtering | empty list
|===
== Manual Configuration
Instead of using the Spring Boot auto-configuration, you can manually configure the MongoDB Atlas vector store. For this you need to add the `spring-ai-mongodb-atlas-store` to your project:
[source,xml]
----
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-mongodb-atlas-store</artifactId>
</dependency>
----
or to your Gradle `build.gradle` build file:
[source,groovy]
----
dependencies {
implementation 'org.springframework.ai:spring-ai-mongodb-atlas-store'
}
----
Create a `MongoTemplate` bean:
[source,java]
----
@Bean
public MongoTemplate mongoTemplate() {
return new MongoTemplate(MongoClients.create("<mongodb atlas connection string>"), "<database name>");
}
----
Then create the `MongoDBAtlasVectorStore` bean using the builder pattern:
[source,java]
----
@Bean
public VectorStore vectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel) {
return MongoDBAtlasVectorStore.builder()
.mongoTemplate(mongoTemplate)
.embeddingModel(embeddingModel)
.collectionName("custom_vector_store") // Optional: defaults to "vector_store"
.vectorIndexName("custom_vector_index") // Optional: defaults to "vector_index"
.pathName("custom_embedding") // Optional: defaults to "embedding"
.numCandidates(500) // Optional: defaults to 200
.metadataFieldsToFilter(List.of("author", "year")) // Optional: defaults to empty list
.initializeSchema(true) // Optional: defaults to false
.batchingStrategy(new TokenCountBatchingStrategy()) // Optional: defaults to TokenCountBatchingStrategy
.build();
}
// This can be any EmbeddingModel implementation
@Bean
public EmbeddingModel embeddingModel() {
// Can be any other EmbeddingModel implementation.
return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY")));
return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("OPENAI_API_KEY")));
}
----
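To make the defaults and validation behaviour of such a builder easier to see in isolation, here is a minimal standalone sketch that depends only on the JDK. `VectorStoreSettings` and its `Builder` are hypothetical stand-ins, not Spring AI classes; the default values (`vector_store`, `vector_index`, `embedding`, `200`) are the ones listed in the comments above, and the null-or-empty checks are an assumption modelled on how a fluent builder would typically reject bad input.

```java
import java.util.List;

// Simplified, standalone stand-in for the store configuration; not the Spring AI class.
final class VectorStoreSettings {

    final String collectionName;
    final String vectorIndexName;
    final String pathName;
    final int numCandidates;
    final List<String> metadataFieldsToFilter;

    private VectorStoreSettings(Builder builder) {
        this.collectionName = builder.collectionName;
        this.vectorIndexName = builder.vectorIndexName;
        this.pathName = builder.pathName;
        this.numCandidates = builder.numCandidates;
        this.metadataFieldsToFilter = builder.metadataFieldsToFilter;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {

        // Every optional setting starts with a default, so build() works with no calls at all.
        private String collectionName = "vector_store";
        private String vectorIndexName = "vector_index";
        private String pathName = "embedding";
        private int numCandidates = 200;
        private List<String> metadataFieldsToFilter = List.of();

        Builder collectionName(String collectionName) {
            this.collectionName = requireText(collectionName, "Collection Name");
            return this;
        }

        Builder vectorIndexName(String vectorIndexName) {
            this.vectorIndexName = requireText(vectorIndexName, "Vector Index Name");
            return this;
        }

        Builder pathName(String pathName) {
            this.pathName = requireText(pathName, "Path Name");
            return this;
        }

        Builder numCandidates(int numCandidates) {
            this.numCandidates = numCandidates;
            return this;
        }

        Builder metadataFieldsToFilter(List<String> metadataFieldsToFilter) {
            this.metadataFieldsToFilter = List.copyOf(metadataFieldsToFilter);
            return this;
        }

        VectorStoreSettings build() {
            return new VectorStoreSettings(this);
        }

        // Rejects null or blank values at the call site instead of at first use.
        private static String requireText(String value, String label) {
            if (value == null || value.isBlank()) {
                throw new IllegalArgumentException(label + " must not be null or empty");
            }
            return value;
        }
    }

    public static void main(String[] args) {
        VectorStoreSettings settings = VectorStoreSettings.builder()
            .collectionName("custom_vector_store")
            .numCandidates(500)
            .build();
        System.out.println(settings.collectionName + " / " + settings.vectorIndexName);
    }
}
```

Compared with a multi-argument constructor, each setting is named at the call site and anything left unset keeps its default.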
=== Configuration properties
You can use the following properties in your Spring Boot configuration to customize the MongoDB Atlas vector store.
[source,xml]
----
...
spring.data.mongodb.uri=<connection string>
spring.data.mongodb.database=<database name>
spring.ai.vectorstore.mongodb.collection-name=vector_store
spring.ai.vectorstore.mongodb.initialize-schema=true
spring.ai.vectorstore.mongodb.path-name=embedding
spring.ai.vectorstore.mongodb.indexName=vector_index
spring.ai.vectorstore.mongodb.metadata-fields-to-filter=foo
----
[stripes=even]
|===
|Property| Description | Default value
|`spring.ai.vectorstore.mongodb.collection-name`| The name of the collection to store the vectors. | `vector_store`
|`spring.ai.vectorstore.mongodb.initialize-schema`| whether to initialize the backend schema for you | `false`
|`spring.ai.vectorstore.mongodb.path-name`| The name of the path to store the vectors. | `embedding`
|`spring.ai.vectorstore.mongodb.indexName`| The name of the index to store the vectors. | `vector_index`
|`spring.ai.vectorstore.mongodb.metadata-fields-to-filter` | comma separated values that specifies which metadata fields can be used for filtering when querying the vector store. Needed so that metadata indexes are created if they already don't exist | empty list
|===
== Manual Configuration Properties
If you prefer to manually configure the MongoDB Atlas vector store without auto-configuration, you can do so by directly setting up the `MongoDBAtlasVectorStore` and its dependencies.
=== Example Configuration
[source,java]
----
@Configuration
public class VectorStoreConfig {
@Bean
public MongoDBAtlasVectorStore vectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel) {
MongoDBVectorStoreConfig config = MongoDBVectorStoreConfig.builder()
.withCollectionName("custom_vector_store")
.withVectorIndexName("custom_vector_index")
.withPathName("custom_embedding_path")
.withMetadataFieldsToFilter(List.of("author", "year"))
.build();
return new MongoDBAtlasVectorStore(mongoTemplate, embeddingModel, config, true);
}
}
----
=== Properties
- `collectionName`: The name of the collection to store the vectors.
- `vectorIndexName`: The name of the vector index.
- `pathName`: The path where vectors are stored.
- `metadataFieldsToFilter`: A list of metadata fields to filter.
You can enable schema initialization by passing `true` as the last parameter in the `MongoDBAtlasVectorStore` constructor
== Adding Documents
To add documents to the vector store, you need to convert your input documents into the `Document` type and call the `addDocuments()` method. This method will use the `EmbeddingModel` to compute the embeddings and save them to the MongoDB collection.
[source,java]
----
List<Document> docs = List.of(
new Document("Proper tuber planting involves site selection, timing, and care. Choose well-drained soil and adequate sun exposure. Plant in spring, with eyes facing upward at a depth two to three times the tuber's height. Ensure 4-12 inch spacing based on tuber size. Adequate moisture is needed, but avoid overwatering. Mulching helps preserve moisture and prevent weeds.", Map.of("author", "A", "type", "post")),
new Document("Successful oil painting requires patience, proper equipment, and technique. Prepare a primed canvas, sketch lightly, and use high-quality brushes and oils. Paint 'fat over lean' to prevent cracking. Allow each layer to dry before applying the next. Clean brushes often and work in a well-ventilated space.", Map.of("author", "A")),
new Document("For a natural lawn, select the right grass type for your climate. Water 1 to 1.5 inches per week, avoid overwatering, and use organic fertilizers. Regular aeration helps root growth and prevents compaction. Practice natural pest control and overseeding to maintain a dense lawn.", Map.of("author", "B", "type", "post")) );
vectorStore.add(docs);
----
== Deleting Documents
To delete documents from the vector store, use the `delete()` method. This method takes a list of document IDs and removes the corresponding documents from the MongoDB collection.
[source,java]
----
List<String> ids = List.of("id1", "id2", "id3"); // Replace with actual document IDs
vectorStore.delete(ids);
----
== Performing Similarity Search
To perform a similarity search, construct a `SearchRequest` object with the desired query parameters and call the `similaritySearch()` method. This method will return a list of documents that match the query based on vector similarity.
[source,java]
----
List<Document> results = vectorStore.similaritySearch(
SearchRequest
.query("learn how to grow things")
.withTopK(2)
);
----
== Metadata Filtering
Metadata filtering allows for more refined queries by filtering results based on specified metadata fields. This feature uses the MongoDB Query API to perform filtering operations in conjunction with vector searches.
You can leverage the generic, portable xref:api/vectordbs.adoc#metadata-filters[metadata filters] with MongoDB Atlas as well.
=== Filter Expressions
The `MongoDBAtlasFilterExpressionConverter` class converts filter expressions into MongoDB Atlas metadata filter expressions. The supported operations include:
For example, you can use either the text expression language:
[source,java]
----
vectorStore.similaritySearch(SearchRequest.defaults()
.withQuery("The World")
.withTopK(5)
.withSimilarityThreshold(0.7)
.withFilterExpression("author in ['john', 'jill'] && article_type == 'blog'"));
----
- `$and`
- `$or`
- `$eq`
- `$ne`
- `$lt`
- `$lte`
- `$gt`
- `$gte`
- `$in`
- `$nin`
These operations enable filtering logic to be applied to metadata fields associated with documents in the vector store.
=== Example of a Filter Expression
Heres an example of how to use a filter expression in a similarity search:
or programmatically using the `Filter.Expression` DSL:
[source,java]
----
FilterExpressionBuilder b = new FilterExpressionBuilder();
List<Document> results = vectorStore.similaritySearch(
SearchRequest.defaults()
.withQuery("learn how to grow things")
.withTopK(2)
.withSimilarityThreshold(0.5)
.withFilterExpression(this.b.eq("author", "A").build())
);
vectorStore.similaritySearch(SearchRequest.defaults()
.withQuery("The World")
.withTopK(5)
.withSimilarityThreshold(0.7)
.withFilterExpression(b.and(
b.in("author", "john", "jill"),
b.eq("article_type", "blog")).build()));
----
NOTE: Those (portable) filter expressions get automatically converted into the proprietary MongoDB Atlas filter expressions.
For example, this portable filter expression:
[source,sql]
----
author in ['john', 'jill'] && article_type == 'blog'
----
is converted into the proprietary MongoDB Atlas filter format:
[source,json]
----
{
"$and": [
{
"$or": [
{ "metadata.author": "john" },
{ "metadata.author": "jill" }
]
},
{
"metadata.article_type": "blog"
}
]
}
----
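The shape of that mapping can be sketched with plain `java.util` collections: an `in` over several values becomes an `$or` of equality checks, each key is prefixed with `metadata.`, and `&&` becomes `$and`. The helper names below (`FilterMappingSketch`, `eq`, `in`, `and`) are hypothetical; this is not the `MongoDBAtlasFilterExpressionConverter`, and how the actual converter renders its output may differ in detail.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helpers that mirror the converted structure shown above as nested maps.
class FilterMappingSketch {

    // key == value  ->  { "metadata.<key>": value }
    static Map<String, Object> eq(String key, Object value) {
        return single("metadata." + key, value);
    }

    // key in [v1, v2, ...]  ->  { "$or": [ eq(key, v1), eq(key, v2), ... ] }
    static Map<String, Object> in(String key, Object... values) {
        List<Map<String, Object>> alternatives = new ArrayList<>();
        for (Object value : values) {
            alternatives.add(eq(key, value));
        }
        return single("$or", alternatives);
    }

    // a && b  ->  { "$and": [ a, b ] }
    static Map<String, Object> and(List<Map<String, Object>> operands) {
        return single("$and", new ArrayList<>(operands));
    }

    private static Map<String, Object> single(String key, Object value) {
        Map<String, Object> map = new LinkedHashMap<>();
        map.put(key, value);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Object> filter = and(List.of(
                in("author", "john", "jill"),
                eq("article_type", "blog")));
        System.out.println(filter);
    }
}
```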
== Tutorials and Code Examples
To get started with Spring AI and MongoDB:
* See the https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/spring-ai/#std-label-spring-ai[Getting Started guide for Spring AI Integration].

@@ -17,13 +17,14 @@
package org.springframework.ai.autoconfigure.vectorstore.mongo;
import java.util.Arrays;
import java.util.List;
import io.micrometer.observation.ObservationRegistry;
import org.springframework.ai.embedding.BatchingStrategy;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.TokenCountBatchingStrategy;
import org.springframework.ai.vectorstore.MongoDBAtlasVectorStore;
import org.springframework.ai.vectorstore.mongodb.atlas.MongoDBAtlasVectorStore;
import org.springframework.ai.vectorstore.observation.VectorStoreObservationConvention;
import org.springframework.beans.factory.ObjectProvider;
import org.springframework.boot.autoconfigure.AutoConfiguration;
@@ -34,6 +35,7 @@ import org.springframework.context.annotation.Bean;
import org.springframework.core.convert.converter.Converter;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.convert.MongoCustomConversions;
import org.springframework.util.CollectionUtils;
import org.springframework.util.MimeType;
import org.springframework.util.StringUtils;
@@ -64,25 +66,35 @@ public class MongoDBAtlasVectorStoreAutoConfiguration {
ObjectProvider<VectorStoreObservationConvention> customObservationConvention,
BatchingStrategy batchingStrategy) {
var builder = MongoDBAtlasVectorStore.MongoDBVectorStoreConfig.builder();
MongoDBAtlasVectorStore.MongoDBBuilder builder = MongoDBAtlasVectorStore.builder()
.mongoTemplate(mongoTemplate)
.embeddingModel(embeddingModel)
.initializeSchema(properties.isInitializeSchema())
.observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
.customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
.batchingStrategy(batchingStrategy);
if (StringUtils.hasText(properties.getCollectionName())) {
builder.withCollectionName(properties.getCollectionName());
String collectionName = properties.getCollectionName();
if (StringUtils.hasText(collectionName)) {
builder.collectionName(collectionName);
}
if (StringUtils.hasText(properties.getPathName())) {
builder.withPathName(properties.getPathName());
}
if (StringUtils.hasText(properties.getIndexName())) {
builder.withVectorIndexName(properties.getIndexName());
}
if (!properties.getMetadataFieldsToFilter().isEmpty()) {
builder.withMetadataFieldsToFilter(properties.getMetadataFieldsToFilter());
}
MongoDBAtlasVectorStore.MongoDBVectorStoreConfig config = builder.build();
return new MongoDBAtlasVectorStore(mongoTemplate, embeddingModel, config, properties.isInitializeSchema(),
observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP),
customObservationConvention.getIfAvailable(() -> null), batchingStrategy);
String pathName = properties.getPathName();
if (StringUtils.hasText(pathName)) {
builder.pathName(pathName);
}
String indexName = properties.getIndexName();
if (StringUtils.hasText(indexName)) {
builder.vectorIndexName(indexName);
}
List<String> metadataFields = properties.getMetadataFieldsToFilter();
if (!CollectionUtils.isEmpty(metadataFields)) {
builder.metadataFieldsToFilter(metadataFields);
}
return builder.build();
}
@Bean

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import org.springframework.ai.vectorstore.filter.Filter;
import org.springframework.ai.vectorstore.filter.converter.AbstractFilterExpressionConverter;

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import java.util.ArrayList;
import java.util.Collections;
@@ -33,6 +33,8 @@ import org.springframework.ai.embedding.EmbeddingOptionsBuilder;
import org.springframework.ai.embedding.TokenCountBatchingStrategy;
import org.springframework.ai.model.EmbeddingUtils;
import org.springframework.ai.observation.conventions.VectorStoreProvider;
import org.springframework.ai.vectorstore.AbstractVectorStoreBuilder;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore;
import org.springframework.ai.vectorstore.observation.VectorStoreObservationContext;
import org.springframework.ai.vectorstore.observation.VectorStoreObservationConvention;
@@ -45,7 +47,80 @@ import org.springframework.data.mongodb.core.query.Query;
import org.springframework.util.Assert;
/**
* A {@link VectorStore} implementation that uses MongoDB Atlas for storing and
* MongoDB Atlas-based vector store implementation using the Atlas Vector Search.
*
* <p>
* The store uses a MongoDB collection to persist vector embeddings along with their
* associated document content and metadata. By default, it uses the "vector_store"
* collection with a vector search index for similarity search operations.
* </p>
*
* <p>
* Features:
* </p>
* <ul>
* <li>Automatic schema initialization with configurable collection and index
* creation</li>
* <li>Support for cosine similarity search</li>
* <li>Metadata filtering using MongoDB Atlas Search syntax</li>
* <li>Configurable similarity thresholds for search results</li>
* <li>Batch processing support with configurable batching strategies</li>
* <li>Observation and metrics support through Micrometer</li>
* </ul>
*
* <p>
* Basic usage example:
* </p>
* <pre>{@code
* MongoDBAtlasVectorStore vectorStore = MongoDBAtlasVectorStore.builder()
* .mongoTemplate(mongoTemplate)
* .embeddingModel(embeddingModel)
* .collectionName("vector_store")
* .initializeSchema(true)
* .build();
*
* // Add documents
* vectorStore.add(List.of(
* new Document("content1", Map.of("key1", "value1")),
* new Document("content2", Map.of("key2", "value2"))
* ));
*
* // Search with filters
* List<Document> results = vectorStore.similaritySearch(
* SearchRequest.query("search text")
* .withTopK(5)
* .withSimilarityThreshold(0.7)
* .withFilterExpression("key1 == 'value1'")
* );
* }</pre>
*
* <p>
* Advanced configuration example:
* </p>
* <pre>{@code
* MongoDBAtlasVectorStore vectorStore = MongoDBAtlasVectorStore.builder()
* .mongoTemplate(mongoTemplate)
* .embeddingModel(embeddingModel)
* .collectionName("custom_vectors")
* .vectorIndexName("custom_vector_index")
* .pathName("custom_embedding")
* .numCandidates(500)
* .metadataFieldsToFilter(List.of("category", "author"))
* .initializeSchema(true)
* .batchingStrategy(new TokenCountBatchingStrategy())
* .build();
* }</pre>
*
* <p>
* Database Requirements:
* </p>
* <ul>
* <li>MongoDB Atlas cluster with Vector Search enabled</li>
* <li>Collection with vector search index configured</li>
* <li>Collection schema with id (string), content (string), metadata (document), and
* embedding (vector) fields</li>
* <li>Proper access permissions for index and collection operations</li>
* </ul>
*
* @author Chris Smith
* @author Soby Chacko
@@ -78,39 +153,67 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
private final MongoTemplate mongoTemplate;
private final EmbeddingModel embeddingModel;
private final String collectionName;
private final MongoDBVectorStoreConfig config;
private final String vectorIndexName;
private final MongoDBAtlasFilterExpressionConverter filterExpressionConverter = new MongoDBAtlasFilterExpressionConverter();
private final String pathName;
private final List<String> metadataFieldsToFilter;
private final int numCandidates;
private final MongoDBAtlasFilterExpressionConverter filterExpressionConverter;
private final boolean initializeSchema;
private final BatchingStrategy batchingStrategy;
@Deprecated(since = "1.0.0-M5", forRemoval = true)
public MongoDBAtlasVectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel,
boolean initializeSchema) {
this(mongoTemplate, embeddingModel, MongoDBVectorStoreConfig.defaultConfig(), initializeSchema);
}
@Deprecated(since = "1.0.0-M5", forRemoval = true)
public MongoDBAtlasVectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel,
MongoDBVectorStoreConfig config, boolean initializeSchema) {
this(mongoTemplate, embeddingModel, config, initializeSchema, ObservationRegistry.NOOP, null,
new TokenCountBatchingStrategy());
}
@Deprecated(since = "1.0.0-M5", forRemoval = true)
public MongoDBAtlasVectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel,
MongoDBVectorStoreConfig config, boolean initializeSchema, ObservationRegistry observationRegistry,
VectorStoreObservationConvention customObservationConvention, BatchingStrategy batchingStrategy) {
super(observationRegistry, customObservationConvention);
this(builder().mongoTemplate(mongoTemplate)
.embeddingModel(embeddingModel)
.collectionName(config.collectionName)
.vectorIndexName(config.vectorIndexName)
.pathName(config.pathName)
.numCandidates(config.numCandidates)
.metadataFieldsToFilter(config.metadataFieldsToFilter)
.initializeSchema(initializeSchema)
.observationRegistry(observationRegistry)
.customObservationConvention(customObservationConvention)
.batchingStrategy(batchingStrategy));
}
this.mongoTemplate = mongoTemplate;
this.embeddingModel = embeddingModel;
this.config = config;
protected MongoDBAtlasVectorStore(MongoDBBuilder builder) {
super(builder);
this.initializeSchema = initializeSchema;
this.batchingStrategy = batchingStrategy;
Assert.notNull(builder.mongoTemplate, "MongoTemplate must not be null");
this.mongoTemplate = builder.mongoTemplate;
this.collectionName = builder.collectionName;
this.vectorIndexName = builder.vectorIndexName;
this.pathName = builder.pathName;
this.numCandidates = builder.numCandidates;
this.metadataFieldsToFilter = builder.metadataFieldsToFilter;
this.filterExpressionConverter = builder.filterExpressionConverter;
this.initializeSchema = builder.initializeSchema;
this.batchingStrategy = builder.batchingStrategy;
}
@Override
@@ -120,8 +223,8 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
}
// Create the collection if it does not exist
if (!this.mongoTemplate.collectionExists(this.config.collectionName)) {
this.mongoTemplate.createCollection(this.config.collectionName);
if (!this.mongoTemplate.collectionExists(this.collectionName)) {
this.mongoTemplate.createCollection(this.collectionName);
}
// Create search index
createSearchIndex();
@@ -151,17 +254,17 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
List<org.bson.Document> vectorFields = new ArrayList<>();
vectorFields.add(new org.bson.Document().append("type", "vector")
.append("path", this.config.pathName)
.append("path", this.pathName)
.append("numDimensions", this.embeddingModel.dimensions())
.append("similarity", "cosine"));
vectorFields.addAll(this.config.metadataFieldsToFilter.stream()
vectorFields.addAll(this.metadataFieldsToFilter.stream()
.map(fieldName -> new org.bson.Document().append("type", "filter").append("path", "metadata." + fieldName))
.toList());
return new org.bson.Document().append("createSearchIndexes", this.config.collectionName)
return new org.bson.Document().append("createSearchIndexes", this.collectionName)
.append("indexes",
List.of(new org.bson.Document().append("name", this.config.vectorIndexName)
List.of(new org.bson.Document().append("name", this.vectorIndexName)
.append("type", "vectorSearch")
.append("definition", new org.bson.Document("fields", vectorFields))));
}
@@ -195,7 +298,7 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
for (Document document : documents) {
MongoDBDocument mdbDocument = new MongoDBDocument(document.getId(), document.getContent(),
document.getMetadata(), embeddings.get(documents.indexOf(document)));
this.mongoTemplate.save(mdbDocument, this.config.collectionName);
this.mongoTemplate.save(mdbDocument, this.collectionName);
}
}
@@ -203,7 +306,7 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
public Optional<Boolean> doDelete(List<String> idList) {
Query query = new Query(org.springframework.data.mongodb.core.query.Criteria.where(ID_FIELD_NAME).in(idList));
var deleteRes = this.mongoTemplate.remove(query, this.config.collectionName);
var deleteRes = this.mongoTemplate.remove(query, this.collectionName);
long deleteCount = deleteRes.getDeletedCount();
return Optional.of(deleteCount == idList.size());
@@ -221,8 +324,8 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
? this.filterExpressionConverter.convertExpression(request.getFilterExpression()) : "";
float[] queryEmbedding = this.embeddingModel.embed(request.getQuery());
var vectorSearch = new VectorSearchAggregation(EmbeddingUtils.toList(queryEmbedding), this.config.pathName,
this.config.numCandidates, this.config.vectorIndexName, request.getTopK(), nativeFilterExpressions);
var vectorSearch = new VectorSearchAggregation(EmbeddingUtils.toList(queryEmbedding), this.pathName,
this.numCandidates, this.vectorIndexName, request.getTopK(), nativeFilterExpressions);
Aggregation aggregation = Aggregation.newAggregation(vectorSearch,
Aggregation.addFields()
@@ -231,7 +334,7 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
.build(),
Aggregation.match(new Criteria(SCORE_FIELD_NAME).gte(request.getSimilarityThreshold())));
return this.mongoTemplate.aggregate(aggregation, this.config.collectionName, org.bson.Document.class)
return this.mongoTemplate.aggregate(aggregation, this.collectionName, org.bson.Document.class)
.getMappedResults()
.stream()
.map(d -> mapMongoDocument(d, queryEmbedding))
@@ -242,11 +345,157 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
public VectorStoreObservationContext.Builder createObservationContextBuilder(String operationName) {
return VectorStoreObservationContext.builder(VectorStoreProvider.MONGODB.value(), operationName)
.withCollectionName(this.config.collectionName)
.withCollectionName(this.collectionName)
.withDimensions(this.embeddingModel.dimensions())
.withFieldName(this.config.pathName);
.withFieldName(this.pathName);
}
/**
* Creates a new builder instance for MongoDBAtlasVectorStore.
* @return a new MongoDBBuilder instance
*/
public static MongoDBBuilder builder() {
return new MongoDBBuilder();
}
public static class MongoDBBuilder extends AbstractVectorStoreBuilder<MongoDBBuilder> {
private MongoTemplate mongoTemplate;
private String collectionName = DEFAULT_VECTOR_COLLECTION_NAME;
private String vectorIndexName = DEFAULT_VECTOR_INDEX_NAME;
private String pathName = DEFAULT_PATH_NAME;
private int numCandidates = DEFAULT_NUM_CANDIDATES;
private List<String> metadataFieldsToFilter = Collections.emptyList();
private boolean initializeSchema = false;
private BatchingStrategy batchingStrategy = new TokenCountBatchingStrategy();
private MongoDBAtlasFilterExpressionConverter filterExpressionConverter = new MongoDBAtlasFilterExpressionConverter();
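/**
* Configures the MongoTemplate used to access the collection.
* @param mongoTemplate the MongoTemplate to use
* @return the builder instance
* @throws IllegalArgumentException if mongoTemplate is null
*/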
public MongoDBBuilder mongoTemplate(MongoTemplate mongoTemplate) {
Assert.notNull(mongoTemplate, "MongoTemplate must not be null");
this.mongoTemplate = mongoTemplate;
return this;
}
/**
* Configures the collection name. This must match the name of the collection for
* the Vector Search Index in Atlas.
* @param collectionName the name of the collection
* @return the builder instance
* @throws IllegalArgumentException if collectionName is null or empty
*/
public MongoDBBuilder collectionName(String collectionName) {
Assert.hasText(collectionName, "Collection Name must not be null or empty");
this.collectionName = collectionName;
return this;
}
/**
* Configures the vector index name. This must match the name of the Vector Search
* Index in Atlas.
* @param vectorIndexName the name of the vector index
* @return the builder instance
* @throws IllegalArgumentException if vectorIndexName is null or empty
*/
public MongoDBBuilder vectorIndexName(String vectorIndexName) {
Assert.hasText(vectorIndexName, "Vector Index Name must not be null or empty");
this.vectorIndexName = vectorIndexName;
return this;
}
/**
* Configures the path name. This must match the name of the field indexed for the
* Vector Search Index in Atlas.
* @param pathName the name of the path
* @return the builder instance
* @throws IllegalArgumentException if pathName is null or empty
*/
public MongoDBBuilder pathName(String pathName) {
Assert.hasText(pathName, "Path Name must not be null or empty");
this.pathName = pathName;
return this;
}
/**
* Sets the number of candidates for vector search.
* @param numCandidates the number of candidates
* @return the builder instance
*/
public MongoDBBuilder numCandidates(int numCandidates) {
this.numCandidates = numCandidates;
return this;
}
/**
* Sets the metadata fields to filter in vector search.
* @param metadataFieldsToFilter list of metadata field names
* @return the builder instance
* @throws IllegalArgumentException if metadataFieldsToFilter is null or empty
*/
public MongoDBBuilder metadataFieldsToFilter(List<String> metadataFieldsToFilter) {
Assert.notEmpty(metadataFieldsToFilter, "Fields list must not be empty");
this.metadataFieldsToFilter = metadataFieldsToFilter;
return this;
}
/**
* Sets whether to initialize the schema.
* @param initializeSchema true to initialize schema, false otherwise
* @return the builder instance
*/
public MongoDBBuilder initializeSchema(boolean initializeSchema) {
this.initializeSchema = initializeSchema;
return this;
}
/**
* Sets the batching strategy for vector operations.
* @param batchingStrategy the batching strategy to use
* @return the builder instance
* @throws IllegalArgumentException if batchingStrategy is null
*/
public MongoDBBuilder batchingStrategy(BatchingStrategy batchingStrategy) {
Assert.notNull(batchingStrategy, "batchingStrategy must not be null");
this.batchingStrategy = batchingStrategy;
return this;
}
/**
* Sets the filter expression converter.
* @param converter the filter expression converter to use
* @return the builder instance
* @throws IllegalArgumentException if converter is null
*/
public MongoDBBuilder filterExpressionConverter(MongoDBAtlasFilterExpressionConverter converter) {
Assert.notNull(converter, "filterExpressionConverter must not be null");
this.filterExpressionConverter = converter;
return this;
}
/**
* Builds the MongoDBAtlasVectorStore instance.
* @return a new MongoDBAtlasVectorStore instance
* @throws IllegalStateException if the builder is in an invalid state
*/
@Override
public MongoDBAtlasVectorStore build() {
validate();
return new MongoDBAtlasVectorStore(this);
}
}
@Deprecated(since = "1.0.0-M5", forRemoval = true)
public static final class MongoDBVectorStoreConfig {
private final String collectionName;
@@ -267,14 +516,17 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
this.metadataFieldsToFilter = builder.metadataFieldsToFilter;
}
@Deprecated(since = "1.0.0-M5", forRemoval = true)
public static Builder builder() {
return new Builder();
}
@Deprecated(since = "1.0.0-M5", forRemoval = true)
public static MongoDBVectorStoreConfig defaultConfig() {
return builder().build();
}
@Deprecated(since = "1.0.0-M5", forRemoval = true)
public static final class Builder {
private String collectionName = DEFAULT_VECTOR_COLLECTION_NAME;
@@ -297,7 +549,6 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
* @return this builder
*/
public Builder withCollectionName(String collectionName) {
Assert.notNull(collectionName, "Collection Name must not be null");
Assert.notNull(collectionName, "Collection Name must not be empty");
this.collectionName = collectionName;
return this;
@@ -310,7 +561,6 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
* @return this builder
*/
public Builder withVectorIndexName(String vectorIndexName) {
Assert.notNull(vectorIndexName, "Vector Index Name must not be null");
Assert.notNull(vectorIndexName, "Vector Index Name must not be empty");
this.vectorIndexName = vectorIndexName;
return this;
@@ -323,7 +573,6 @@ public class MongoDBAtlasVectorStore extends AbstractObservationVectorStore impl
* @return this builder
*/
public Builder withPathName(String pathName) {
Assert.notNull(pathName, "Path Name must not be null");
Assert.notNull(pathName, "Path Name must not be empty");
this.pathName = pathName;
return this;
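
The builder introduced above validates each argument in its setter and keeps a default for anything left unset. A minimal, dependency-free sketch of that shape follows. `SketchVectorStore`, its fields, and the default values are hypothetical stand-ins chosen for illustration; they are not the real `MongoDBAtlasVectorStore`, and the actual `DEFAULT_*` constants are not shown in this diff. Plain `IllegalArgumentException` checks stand in for Spring's `Assert.hasText` and `Assert.notEmpty`.

```java
import java.util.List;

// Hypothetical stand-in: it only holds configuration, so the example runs
// without Spring AI or MongoDB on the classpath.
final class SketchVectorStore {

    final String collectionName;
    final int numCandidates;
    final List<String> metadataFieldsToFilter;

    // The store is constructed only from a builder, mirroring
    // "new MongoDBAtlasVectorStore(this)" in build() above.
    private SketchVectorStore(Builder builder) {
        this.collectionName = builder.collectionName;
        this.numCandidates = builder.numCandidates;
        this.metadataFieldsToFilter = builder.metadataFieldsToFilter;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {

        // Placeholder defaults (assumptions, not the library's constants).
        private String collectionName = "vector_store";
        private int numCandidates = 200;
        private List<String> metadataFieldsToFilter = List.of();

        // Rejects null, empty, and whitespace-only names.
        Builder collectionName(String collectionName) {
            if (collectionName == null || collectionName.isBlank()) {
                throw new IllegalArgumentException("Collection Name must not be null or empty");
            }
            this.collectionName = collectionName;
            return this;
        }

        // No validation here, matching the setter shown in the diff.
        Builder numCandidates(int numCandidates) {
            this.numCandidates = numCandidates;
            return this;
        }

        // Rejects null and empty lists; stores an immutable copy.
        Builder metadataFieldsToFilter(List<String> fields) {
            if (fields == null || fields.isEmpty()) {
                throw new IllegalArgumentException("Fields list must not be empty");
            }
            this.metadataFieldsToFilter = List.copyOf(fields);
            return this;
        }

        SketchVectorStore build() {
            return new SketchVectorStore(this);
        }
    }

    public static void main(String[] args) {
        SketchVectorStore store = SketchVectorStore.builder()
            .collectionName("docs")
            .metadataFieldsToFilter(List.of("country", "year"))
            .build();
        System.out.println(store.collectionName + " " + store.numCandidates + " " + store.metadataFieldsToFilter);
    }
}
```

Validating in the setter rather than in `build()` means a bad value fails at the call that supplied it, which is the behavior the `@throws IllegalArgumentException` tags in the new Javadoc describe.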

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import java.util.List;

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import java.util.List;

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
@@ -30,6 +30,8 @@ import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.condition.EnabledIfEnvironmentVariable;
import org.springframework.ai.document.DocumentMetadata;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.core.io.DefaultResourceLoader;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
@@ -257,11 +259,12 @@ class MongoDBAtlasVectorStoreIT {
@Bean
public VectorStore vectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel) {
return new MongoDBAtlasVectorStore(mongoTemplate, embeddingModel,
MongoDBAtlasVectorStore.MongoDBVectorStoreConfig.builder()
.withMetadataFieldsToFilter(List.of("country", "year"))
.build(),
true);
return MongoDBAtlasVectorStore.builder()
.mongoTemplate(mongoTemplate)
.embeddingModel(embeddingModel)
.metadataFieldsToFilter(List.of("country", "year"))
.initializeSchema(true)
.build();
}
@Bean

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import org.testcontainers.utility.DockerImageName;

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
@@ -40,6 +40,8 @@ import org.springframework.ai.observation.conventions.SpringAiKind;
import org.springframework.ai.observation.conventions.VectorStoreProvider;
import org.springframework.ai.openai.OpenAiEmbeddingModel;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.observation.DefaultVectorStoreObservationConvention;
import org.springframework.ai.vectorstore.observation.VectorStoreObservationDocumentation.HighCardinalityKeyNames;
import org.springframework.ai.vectorstore.observation.VectorStoreObservationDocumentation.LowCardinalityKeyNames;
@@ -185,11 +187,15 @@ public class MongoDbVectorStoreObservationIT {
@Bean
public VectorStore vectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel,
ObservationRegistry observationRegistry) {
return new MongoDBAtlasVectorStore(mongoTemplate, embeddingModel,
MongoDBAtlasVectorStore.MongoDBVectorStoreConfig.builder()
.withMetadataFieldsToFilter(List.of("country", "year"))
.build(),
true, observationRegistry, null, new TokenCountBatchingStrategy());
return MongoDBAtlasVectorStore.builder()
.mongoTemplate(mongoTemplate)
.embeddingModel(embeddingModel)
.metadataFieldsToFilter(List.of("country", "year"))
.initializeSchema(true)
.observationRegistry(observationRegistry)
.customObservationConvention(null)
.batchingStrategy(new TokenCountBatchingStrategy())
.build();
}
@Bean

@@ -14,7 +14,7 @@
* limitations under the License.
*/
package org.springframework.ai.vectorstore;
package org.springframework.ai.vectorstore.mongodb.atlas;
import java.util.List;