Vertex AI Agent Engine is a Google Cloud service that helps you build and scale AI agents in production. You can use the Agent Engine with MongoDB Atlas and your preferred framework to build AI agents for a variety of use cases, including agentic RAG.
Get Started
The following tutorial demonstrates how you can use the Agent Engine with Atlas to build a RAG agent that can answer questions about sample data. It uses MongoDB Vector Search with LangChain to implement the retrieval tools for the agent.
Prerequisites
Before you begin, ensure you have the following:
- An Atlas cluster in your preferred Google Cloud region. To create a new cluster, see Create a Cluster. You can also get started with Atlas through the Google Cloud Marketplace. 
- A Google Cloud project with Vertex AI enabled. To set up a project, see Set up a project and a development environment in the Google Cloud documentation. 
Set Up Your Environment
Create an interactive Python notebook by saving a file with the .ipynb extension in Google Colab. This notebook lets you run Python code snippets individually, and you'll use it to run the code in this tutorial.
Install the required packages.
In your notebook environment, install the required packages:
!pip install --upgrade --quiet \
    "google-cloud-aiplatform[langchain,agent_engines]" requests datasets pymongo langchain langchain-community langchain-mongodb langchain-google-vertexai langchain_google_genai beautifulsoup4
Create the MongoDB Vector Search indexes.
Run the following code in your notebook to create the MongoDB collections and MongoDB Vector Search indexes used to store and query your data for this tutorial.
Note
Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.
For an Atlas cluster, your connection string should use the following format:
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net 
To learn more, see Connect to a Cluster via Drivers.
For a local Atlas deployment, your connection string should use the following format:
mongodb://localhost:<port-number>/?directConnection=true 
To learn more, see Connection Strings.
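Usernames and passwords that contain characters such as @, :, or / must be percent-encoded before they are placed in a connection string. The following is a minimal sketch of one way to build the Atlas format shown in this note. The helper name build_atlas_uri and the sample credentials and host are hypothetical and used only for illustration.

```python
from urllib.parse import quote_plus


def build_atlas_uri(username: str, password: str, host: str) -> str:
    """Return an SRV connection string with percent-encoded credentials."""
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}"


# Hypothetical values for illustration only; use your own credentials and host.
example_uri = build_atlas_uri("app_user", "p@ss:word/1", "cluster0.example.mongodb.net")
print(example_uri)
```

You could pass the returned string to MongoClient in place of the <connection-string> placeholder.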
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

client = MongoClient("<connection-string>")  # Replace with your connection string
db = client["AGENT-ENGINE"]
stars_wars_collection = db["sample_starwars_embeddings"]
stars_trek_collection = db["sample_startrek_embeddings"]

# Create your index model, then create the search index
search_index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 768,
                "similarity": "cosine"
            }
        ]
    },
    name="vector_index",
    type="vectorSearch"
)

# Create the indexes
stars_wars_collection.create_search_index(model=search_index_model)
stars_trek_collection.create_search_index(model=search_index_model)
To learn more about creating a MongoDB Vector Search index, see How to Index Fields for Vector Search.
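Search indexes are built asynchronously, so an index may not be ready to serve queries immediately after create_search_index returns. The following is a minimal sketch of a polling helper, assuming your PyMongo version provides list_search_indexes and that the returned index documents include a queryable field. The helper name wait_until_queryable and its timeout values are illustrative choices, not part of the tutorial.

```python
import time


def wait_until_queryable(collection, index_name, timeout_s=300, poll_s=5):
    """Poll until the named search index reports queryable, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        # list_search_indexes(name) returns a cursor of index description documents
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable") is True:
            return indexes[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Index {index_name!r} was not queryable after {timeout_s} seconds"
            )
        time.sleep(poll_s)


# Example usage with the collections created above (requires a live cluster):
# wait_until_queryable(stars_wars_collection, "vector_index")
# wait_until_queryable(stars_trek_collection, "vector_index")
```

Waiting for both indexes before you run queries avoids empty results from an index that is still building.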
Initialize the Vertex AI SDK.
Run the following code in your notebook, replacing the placeholder values with your Google Cloud project ID, region, and staging bucket:
PROJECT_ID = "<your-project-id>"  # Replace with your project ID
LOCATION = "<gcp-region>"         # Replace with your preferred region, e.g. "us-central1"
STAGING_BUCKET = "gs://<your-bucket-name>"  # Replace with your bucket

import vertexai
vertexai.init(project=PROJECT_ID, location=LOCATION, staging_bucket=STAGING_BUCKET)
Ingest Data into Atlas
Run the following code to scrape sample data from Wikipedia about Star Wars and Star Trek, convert the text into vector embeddings using the text-embedding-005 model, and then store this data in the corresponding collections in Atlas.
import requests
from bs4 import BeautifulSoup
from pymongo import MongoClient
import certifi
from vertexai.language_models import TextEmbeddingModel

# Scrape the website content
def scrape_website(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    content = ' '.join([p.text for p in soup.find_all('p')])
    return content

# Split the content into chunks of 1000 characters
def split_into_chunks(text, chunk_size=1000):
    return [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]

# Convert each chunk into a vector embedding
def get_text_embeddings(chunks):
    model = TextEmbeddingModel.from_pretrained("text-embedding-005")
    embeddings = model.get_embeddings(chunks)
    return [embedding.values for embedding in embeddings]

# Store each chunk with its embedding in the given collection
def write_to_mongoDB(embeddings, chunks, db_name, coll_name):
    client = MongoClient("<connection-string>", tlsCAFile=certifi.where())  # Replace placeholder with your Atlas connection string
    db = client[db_name]
    collection = db[coll_name]
    for i in range(len(chunks)):
        collection.insert_one({
            "chunk": chunks[i],
            "embedding": embeddings[i]
        })

# Process Star Wars data
content = scrape_website("https://en.wikipedia.org/wiki/Star_Wars")
chunks = split_into_chunks(content)
embeddings_starwars = get_text_embeddings(chunks)
write_to_mongoDB(embeddings_starwars, chunks, "AGENT-ENGINE", "sample_starwars_embeddings")

# Process Star Trek data
content = scrape_website("https://en.wikipedia.org/wiki/Star_Trek")
chunks = split_into_chunks(content)
embeddings_startrek = get_text_embeddings(chunks)
write_to_mongoDB(embeddings_startrek, chunks, "AGENT-ENGINE", "sample_startrek_embeddings")
Tip
You can view your data in the Atlas UI by navigating to the AGENT-ENGINE database and selecting the sample_starwars_embeddings and sample_startrek_embeddings collections.
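The ingestion code sends every chunk to the embedding model in a single call and inserts documents one at a time. If a page yields more chunks than the embedding API accepts in one request, a batched variant may be more robust. The following sketch shows one possible approach. The helper names batched, embed_in_batches, and build_documents are hypothetical, the batch size of 100 is an arbitrary illustrative value rather than a documented limit, and the fake embedding function exists only so that the example runs without cloud credentials.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of items with at most batch_size elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    chunks: Sequence[str],
    embed_fn: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 100,  # illustrative value, not a documented API limit
) -> List[List[float]]:
    """Call embed_fn once per batch and return the vectors in chunk order."""
    vectors: List[List[float]] = []
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors


def build_documents(chunks: Sequence[str], vectors: Sequence[List[float]]) -> List[Dict]:
    """Pair each chunk with its vector using the field names from this tutorial."""
    if len(chunks) != len(vectors):
        raise ValueError("chunks and vectors must have the same length")
    return [{"chunk": c, "embedding": v} for c, v in zip(chunks, vectors)]


if __name__ == "__main__":
    # Stand-in for the real embedding model so the sketch runs offline
    def fake_embed(batch):
        return [[float(len(text))] for text in batch]

    sample_chunks = ["a", "bb", "ccc"]
    docs = build_documents(sample_chunks, embed_in_batches(sample_chunks, fake_embed, batch_size=2))
    print(docs)
```

With the tutorial's model, embed_fn could be a function that returns [e.values for e in model.get_embeddings(batch)], and the resulting documents could be written with a single collection.insert_many(docs) call instead of one insert_one call per chunk.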
Create the Agent
In this section, you define tools that the agent can use to query your collections using MongoDB Vector Search, create a memory system to maintain conversation context, and then initialize the agent using LangChain.
Define tools for the agent.
Create the following two tools:
Run the following code to create a tool that uses MongoDB Vector Search to query the sample_starwars_embeddings collection:
def star_wars_query_tool(
    query: str
):
    """
    Retrieves vectors from a MongoDB database and uses them to answer a question related to Star Wars.

    Args:
        query: The question to be answered about Star Wars.

    Returns:
        A dictionary containing the response to the question.
    """
    from langchain.chains import ConversationalRetrievalChain
    from langchain_mongodb import MongoDBAtlasVectorSearch
    from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
    from langchain.memory import ConversationBufferWindowMemory
    from langchain.prompts import PromptTemplate

    prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Do not return any answers from your own knowledge. Respond only in 2 or 3 sentences.

    {context}

    Question: {question}
    """
    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context", "question"]
    )

    # Replace with your connection string to your Atlas cluster
    connection_string = "<connection-string>"

    # Use the same embedding model that was used to ingest the data
    embeddings = VertexAIEmbeddings(model_name="text-embedding-005")

    vs = MongoDBAtlasVectorSearch.from_connection_string(
        connection_string=connection_string,
        namespace="AGENT-ENGINE.sample_starwars_embeddings",
        embedding=embeddings,
        index_name="vector_index",
        embedding_key="embedding",
        text_key="chunk",
    )

    llm = ChatVertexAI(
        model_name="gemini-pro",
        convert_system_message_to_human=True,
        max_output_tokens=1000,
    )

    # Retrieve 10 chunks using maximal marginal relevance (MMR)
    retriever = vs.as_retriever(
        search_type="mmr", search_kwargs={"k": 10, "lambda_mult": 0.25}
    )

    # Keep the last 5 exchanges as conversation context
    memory = ConversationBufferWindowMemory(
        memory_key="chat_history", k=5, return_messages=True
    )

    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        memory=memory,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )

    # invoke() replaces the deprecated direct call on the chain
    response = conversation_chain.invoke({"question": query})
    return response
Run the following code to create a tool that uses MongoDB Vector Search to query the sample_startrek_embeddings collection:
def star_trek_query_tool(
    query: str
):
    """
    Retrieves vectors from a MongoDB database and uses them to answer a question related to Star Trek.

    Args:
        query: The question to be answered about Star Trek.

    Returns:
        A dictionary containing the response to the question.
    """
    from langchain.chains import ConversationalRetrievalChain
    from langchain_mongodb import MongoDBAtlasVectorSearch
    from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
    from langchain.memory import ConversationBufferWindowMemory
    from langchain.prompts import PromptTemplate

    prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Do not return any answers from your own knowledge. Respond only in 2 or 3 sentences.

    {context}

    Question: {question}
    """
    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context", "question"]
    )

    # Replace with your connection string to your Atlas cluster
    connection_string = "<connection-string>"

    # Use the same embedding model that was used to ingest the data
    embeddings = VertexAIEmbeddings(model_name="text-embedding-005")

    vs = MongoDBAtlasVectorSearch.from_connection_string(
        connection_string=connection_string,
        namespace="AGENT-ENGINE.sample_startrek_embeddings",
        embedding=embeddings,
        index_name="vector_index",
        embedding_key="embedding",
        text_key="chunk",
    )

    llm = ChatVertexAI(
        model_name="gemini-pro",
        convert_system_message_to_human=True,
        max_output_tokens=1000,
    )

    # Retrieve 10 chunks using maximal marginal relevance (MMR)
    retriever = vs.as_retriever(
        search_type="mmr", search_kwargs={"k": 10, "lambda_mult": 0.25}
    )

    # Keep the last 5 exchanges as conversation context
    memory = ConversationBufferWindowMemory(
        memory_key="chat_history", k=5, return_messages=True
    )

    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        memory=memory,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )

    # invoke() replaces the deprecated direct call on the chain
    response = conversation_chain.invoke({"question": query})
    return response
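Both tools configure the retriever with search_type="mmr", k=10, and lambda_mult=0.25. Maximal marginal relevance (MMR) trades relevance to the query against similarity to results that are already selected, and a lower lambda_mult favors diversity. The following pure-Python sketch illustrates the general idea on toy vectors. It is not LangChain's implementation, and the function names cosine and mmr_select are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 10,
    lambda_mult: float = 0.25,
) -> List[int]:
    """Return the indexes of up to k documents chosen greedily by MMR score."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    remaining = list(range(len(doc_vecs)))
    selected: List[int] = []

    while remaining and len(selected) < k:
        def score(i: int) -> float:
            # Penalize similarity to the closest document that is already selected
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    query = [1.0, 0.0]
    docs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
    # A low lambda_mult skips the near-duplicate of the first document
    print(mmr_select(query, docs, k=2, lambda_mult=0.25))
    # lambda_mult=1.0 ranks purely by relevance
    print(mmr_select(query, docs, k=2, lambda_mult=1.0))
```

In this toy example, the first document is always chosen because it is the most relevant. The second choice depends on lambda_mult: a value of 0.25 prefers the dissimilar third document, while a value of 1.0 prefers the near-duplicate second document.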
Create a memory system.
You can use LangChain to create memory for your agent so that it can maintain conversation context across multiple prompts:
from langchain.memory import ChatMessageHistory

# Initialize session history
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]
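The get_session_history function returns one history object per session ID and creates it on first use. Because the store is a plain dictionary, the history lives only in the memory of the process that runs the code. The following sketch illustrates that behavior with a stand-in class. SimpleHistory is hypothetical and is not the LangChain ChatMessageHistory class; it exists only so that the example runs without LangChain installed.

```python
class SimpleHistory:
    """Hypothetical stand-in for a chat history object; it only stores messages."""

    def __init__(self):
        self.messages = []

    def add_message(self, message: str) -> None:
        self.messages.append(message)


# One history object per session ID, created lazily
demo_store = {}


def get_demo_session_history(session_id: str) -> SimpleHistory:
    if session_id not in demo_store:
        demo_store[session_id] = SimpleHistory()
    return demo_store[session_id]


if __name__ == "__main__":
    get_demo_session_history("demo").add_message("tell me about episode 1 of star wars")
    get_demo_session_history("demo").add_message("Who was the main character in this series")
    get_demo_session_history("other").add_message("what is episode 1 of star trek?")

    # The same session ID returns the same history, so earlier messages are kept
    print(len(get_demo_session_history("demo").messages))
    # A different session ID has its own, separate history
    print(len(get_demo_session_history("other").messages))
```

Reusing the same session_id across queries is what lets a follow-up question such as "Who was the main character in this series" refer back to an earlier prompt.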
Initialize the agent.
Create the agent using LangChain. This agent uses the tools and memory system that you defined.
from vertexai.preview.reasoning_engines import LangchainAgent

# Specify the language model
model = "gemini-1.5-pro-001"

# Initialize the agent with your tools
agent = LangchainAgent(
    model=model,
    chat_history=get_session_history,
    model_kwargs={"temperature": 0},
    tools=[star_wars_query_tool, star_trek_query_tool],
    agent_executor_kwargs={"return_intermediate_steps": True},
)
Run the following code to test the agent with a sample query:
from IPython.display import display, Markdown

# Test your agent
response = agent.query(
    input="Who was the antagonist in Star Wars and who played them?",
    config={"configurable": {"session_id": "demo"}},
)
display(Markdown(response["output"]))
The main antagonist in the Star Wars series is Darth Vader, a dark lord of the Sith. He was originally played by David Prowse in the original trilogy, and later voiced by James Earl Jones. In the prequel trilogy, he appears as Anakin Skywalker, and was played by Hayden Christensen. 
Deploy the Agent
In this section, you deploy your agent to the Vertex AI Agent Engine as a managed service. This allows you to scale your agent and use it in production without managing the underlying infrastructure.
Deploy your agent.
Run the following code to configure and deploy the agent in the Vertex AI Agent Engine:
from vertexai import agent_engines

remote_agent = agent_engines.create(
    agent,
    requirements=[
        "google-cloud-aiplatform[agent_engines,langchain]",
        "cloudpickle==3.0.0",
        "pydantic>=2.10",
        "requests",
        "langchain-mongodb",
        "pymongo",
        "langchain-google-vertexai",
    ],
)
Retrieve the project number.
Run the following code to retrieve the project number associated with your project ID. This project number will be used to construct the complete resource name for your deployed agent:
from googleapiclient import discovery
from IPython.display import display, Markdown

# Retrieve the project number associated with your project ID
service = discovery.build("cloudresourcemanager", "v1")
request = service.projects().get(projectId=PROJECT_ID)
response = request.execute()
project_number = response["projectNumber"]
print(f"Project Number: {project_number}")

# The deployment creates a unique ID for your agent that you can find in the output
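The project number, region, and unique agent ID combine into a resource name of the form projects/<project-number>/locations/<gcp-region>/reasoningEngines/<unique-id>. The following is a small sketch of helpers that build and validate such a name. The function names and the sample values are hypothetical and only illustrate the format.

```python
import re

# Matches projects/<project-number>/locations/<gcp-region>/reasoningEngines/<unique-id>
_RESOURCE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/reasoningEngines/(?P<engine_id>[^/]+)$"
)


def build_resource_name(project_number: str, location: str, engine_id: str) -> str:
    """Assemble the full resource name from its three parts."""
    return f"projects/{project_number}/locations/{location}/reasoningEngines/{engine_id}"


def parse_resource_name(name: str) -> dict:
    """Split a resource name into its parts, or raise ValueError if malformed."""
    match = _RESOURCE_NAME.match(name)
    if match is None:
        raise ValueError(f"Unexpected resource name format: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical values for illustration only
    name = build_resource_name("123456789012", "us-central1", "9876543210")
    print(name)
    print(parse_resource_name(name))
```

Validating the name before you query the deployed agent can surface a mistyped region or ID earlier than a failed API call would.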
Test the agent.
Run the following code to use your agent. Replace the placeholder with your agent's full resource name:
Note
After deployment, your agent will have a unique resource name in the following format:
projects/<project-number>/locations/<gcp-region>/reasoningEngines/<unique-id>
from vertexai.preview import reasoning_engines

# Replace with your agent's full resource name from the previous step
REASONING_ENGINE_RESOURCE_NAME = "<resource-name>"
remote_agent = reasoning_engines.ReasoningEngine(REASONING_ENGINE_RESOURCE_NAME)

response = remote_agent.query(
    input="tell me about episode 1 of star wars",
    config={"configurable": {"session_id": "demo"}},
)
print(response["output"])

response = remote_agent.query(
    input="Who was the main character in this series",
    config={"configurable": {"session_id": "demo"}},
)
print(response["output"])
Star Wars: Episode I - The Phantom Menace was the first film installment released as part of the prequel trilogy. It was released on May 19, 1999. The main plot lines involve the return of Darth Sidious, the Jedi's discovery of young Anakin Skywalker, and the invasion of Naboo by the Trade Federation.
The main character in Star Wars is Luke Skywalker. He is a young farm boy who dreams of adventure and becomes a Jedi Knight. He fights against the evil Galactic Empire alongside his friends, Princess Leia and Han Solo.
You can also ask the agent about Star Trek using the same session:
response = remote_agent.query(
    input="what is episode 1 of star trek?",
    config={"configurable": {"session_id": "demo"}},
)
print(response["output"])
Episode 1 of Star Trek is called "The Man Trap". It was first aired on September 8, 1966. The story involves the Enterprise crew investigating the disappearance of a crew on a scientific outpost. It turns out that the crew members were killed by a creature that can take on someone else's form after it kills them. 
Next Steps
You can debug and optimize your agents by enabling tracing in the Agent Engine. Refer to the Vertex AI Agent Engine documentation for other features and examples.
To learn more about the LangChain MongoDB integration, see Integrate MongoDB with LangChain.