Chatbots are everywhere, and many are improving the way we interact with websites by providing instant responses and keeping users engaged. In this article, we’ll explore how to build a website-specific chatbot using Streamlit and Google’s Gemini API. The chatbot fetches content from your website and uses it to generate informative responses.
Getting Started
Before diving into the nitty-gritty, make sure you have Python installed on your system (this will work in Google Colab too). You’ll also need a few additional Python libraries to support our chatbot development.
Install the Necessary Libraries
Open your terminal and run these commands to install the required libraries:
pip install streamlit
pip install google-generativeai
pip install requests
pip install beautifulsoup4
Next, clone our chatbot repository and install its dependencies:
git clone https://github.com/YourUsername/website-chatbot.git
cd website-chatbot
pip install -r requirements.txt
Now that your environment is set up, let’s start building the core functionalities.
Fetching Website Content
Web scraping allows us to extract data from websites. For our chatbot, we’ll use the requests library to fetch web pages and BeautifulSoup to parse the HTML content.
Here’s the function to fetch and process the website content:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def fetch_website_content(url, max_pages=20):
    content_dict = {}
    visited = set()
    to_visit = [url]
    while to_visit and len(visited) < max_pages:
        current_url = to_visit.pop(0)
        if current_url in visited:
            continue
        visited.add(current_url)
        try:
            # A timeout keeps one slow page from stalling the whole crawl
            response = requests.get(current_url, timeout=10)
            response.raise_for_status()
        except requests.RequestException as e:
            # Skip pages that fail to load instead of discarding everything fetched so far
            print(f"Error fetching {current_url}: {str(e)}")
            continue
        soup = BeautifulSoup(response.content, 'html.parser')
        # Store the content with its URL
        content_dict[current_url] = {
            'title': soup.title.string if soup.title and soup.title.string else 'No title',
            'content': soup.get_text(separator=' ', strip=True)
        }
        # Find child links, resolving relative hrefs against the page they appear on
        for link in soup.find_all('a', href=True):
            child_url = urljoin(current_url, link['href'])
            if child_url.startswith(url) and child_url not in visited and child_url not in to_visit:
                to_visit.append(child_url)
    return content_dict
This function crawls the given URL and the links beneath it breadth-first, visiting at most max_pages pages, and stores each page’s title and text in a dictionary keyed by URL.
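One refinement worth considering for the link-following step: a plain startswith comparison treats https://example.com/docs and https://example.com/docs#intro as two different pages, and it would also accept a look-alike host such as https://example.com.evil.test/. The sketch below shows one possible way to tighten this using only the standard library. The helper names normalize_url and is_child_url are hypothetical and are not part of the code above.

```python
from urllib.parse import urldefrag, urljoin, urlparse


def normalize_url(page_url, href):
    """Resolve href against the page it appeared on and drop any #fragment."""
    absolute = urljoin(page_url, href)
    without_fragment, _fragment = urldefrag(absolute)
    return without_fragment


def is_child_url(root_url, candidate_url):
    """Return True if candidate_url is on the same host and under root_url's path."""
    root = urlparse(root_url)
    candidate = urlparse(candidate_url)
    if candidate.scheme not in ("http", "https"):
        return False
    if candidate.netloc.lower() != root.netloc.lower():
        return False
    # Compare paths with a trailing slash so "/docs" does not match "/docsold"
    root_path = root.path if root.path.endswith("/") else root.path + "/"
    candidate_path = candidate.path if candidate.path.endswith("/") else candidate.path + "/"
    return candidate_path.startswith(root_path)
```

Inside the crawler, the link loop could call normalize_url(current_url, link['href']) and then keep the result only if is_child_url(url, child_url) is true.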
Configuring the Gemini API
Google’s Gemini API is a powerhouse for natural language understanding and response generation. We need to configure it with our API key:
import google.generativeai as genai
genai.configure(api_key='your_gemini_api_key')
Now, let’s create a function to generate responses based on the fetched website content:
def generate_response(user_input, content_dict):
    try:
        # Prepare the conversation history for the model
        conversation = [
            {"role": "user", "parts": ["""You are a helpful assistant for the website content provided to you. Please follow these guidelines:
1. Use the information from the website content to answer questions directly and specifically.
2. When discussing specific topics or sections, always mention the relevant page title and provide the source URL if available.
3. If there's relevant information for the user's question, explain its key points briefly.
4. If you're not sure about specific details, say so, but provide information on how the user can find out more (e.g., relevant web pages).
5. Be concise but informative, focusing on the most relevant information for the user's query.
Here's the content from the website:"""]},
            {"role": "model", "parts": ["Understood. I'll provide specific, relevant information from the website content, including page titles, URLs when available, and key points. I'll be concise and direct in my responses, focusing on the user's query and providing guidance on how to get more information if needed."]},
        ]
        # Add content from each page and the question as parts of a single user turn,
        # because the API may reject a conversation with consecutive user turns
        final_parts = []
        for url, page_data in content_dict.items():
            final_parts.append(f"Content from {url}:\nTitle: {page_data['title']}\n\n{page_data['content'][:2000]}...")
        final_parts.append(user_input)
        conversation.append({"role": "user", "parts": final_parts})
        # Generate a response through the GenerativeModel class
        model = genai.GenerativeModel('gemini-pro')  # Use a model name available to your API key
        response = model.generate_content(conversation)
        if response.candidates:
            return response.text
        else:
            return "I'm sorry, but I couldn't find specific information to answer your question. Please check the website directly for the most up-to-date information."
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return "I'm sorry, but an error occurred while processing your request. Please try again or check the website directly for assistance."
This function builds a conversation history, incorporates the fetched content, and generates a response based on the user’s input.
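The [:2000] slice caps each page, but with up to 20 pages the prompt can still grow large. One possible way to keep it bounded is to give the prompt as a whole a character budget too. The helper below is a minimal sketch under that assumption. The name build_context_parts and the total_chars default are illustrative choices, not limits taken from the Gemini documentation.

```python
def build_context_parts(content_dict, per_page_chars=2000, total_chars=30000):
    """Turn crawled pages into prompt strings, stopping once the budget is spent."""
    parts = []
    remaining = total_chars
    for url, page_data in content_dict.items():
        if remaining <= 0:
            break
        # Each excerpt is limited by both the per-page cap and what is left overall
        excerpt = page_data["content"][:min(per_page_chars, remaining)]
        parts.append(f"Content from {url}:\nTitle: {page_data['title']}\n\n{excerpt}")
        remaining -= len(excerpt)
    return parts
```

The returned list could be used in place of the per-page strings when the conversation is assembled. Pages that do not fit within the budget are simply left out, so the crawl order decides which pages the model sees.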
Building the Streamlit Interface
Streamlit allows us to build web applications easily. Below is the code for configuring our chatbot and creating the user interface:
import streamlit as st
st.set_page_config(page_title="Website Chatbot with Gemini", page_icon="🤖", layout="wide")
# Initialize session state for conversation history and website content
if "messages" not in st.session_state:
    st.session_state.messages = []
if "website_content" not in st.session_state:
    st.session_state.website_content = {}
st.title("Website Chatbot with Gemini")
st.markdown("""
### How This Chatbot Works
- This chatbot provides information based on the content of the website you specify.
- It can summarize information from multiple pages and make connections between different topics.
- The chatbot will provide source URLs when possible to allow you to read more on the website.
- If the chatbot is unsure about specific details, it will say so but try to provide related information.
- The goal is to be informative and helpful while maintaining accuracy about the website's content.
""")
Handling User Inputs
Next, we create input fields and manage user interactions with the chatbot:
# Gemini API Key input
gemini_api_key = st.text_input("Enter your Gemini API Key:", type="password")
if gemini_api_key:
    # Initialize the Gemini client
    genai.configure(api_key=gemini_api_key)
    # Website URL input
    website_url = st.text_input("Enter the website URL:")
    if website_url:
        if website_url != st.session_state.get('last_url', ''):
            with st.spinner("Loading website content..."):
                st.session_state.website_content = fetch_website_content(website_url)
            st.session_state.last_url = website_url
            st.session_state.messages = []  # Clear previous conversation
            if st.session_state.website_content:
                st.success("Website content loaded. You can now ask questions about the website!")
            else:
                st.error("No content could be loaded from that URL. Please check the address and try again.")
        # Display chat history (messages alternate: user, assistant, user, ...)
        # enumerate gives each message's position even when two messages have the same text
        for i, message in enumerate(st.session_state.messages):
            with st.chat_message("user" if i % 2 == 0 else "assistant"):
                st.write(message)
        # User input
        user_input = st.chat_input("Your question about the website:")
        if user_input:
            st.session_state.messages.append(user_input)
            with st.chat_message("user"):
                st.write(user_input)
            with st.chat_message("assistant"):
                response = generate_response(user_input, st.session_state.website_content)
                st.write(response)
            st.session_state.messages.append(response)
    else:
        st.warning("Please enter a website URL to start chatting about its content.")
else:
    st.warning("Please enter your Gemini API Key to use the chatbot.")
This segment sets up the input fields for the Gemini API key and the website URL. It handles loading the website content, processing user queries, and displaying the conversation history.
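The history here is a flat list of strings, so the speaker is inferred from each message’s position. An alternative is to store the role next to the text, which stays correct even if a turn is ever skipped. The following is a sketch of that idea in plain Python. The helpers add_turn and to_gemini_history are hypothetical names, and the "user" and "model" roles mirror the ones used in generate_response.

```python
def add_turn(messages, role, text):
    """Append one chat turn, keeping the speaker alongside the text."""
    if role not in ("user", "assistant"):
        raise ValueError(f"Unexpected role: {role!r}")
    messages.append({"role": role, "text": text})
    return messages


def to_gemini_history(messages):
    """Convert stored turns to the role/parts dictionaries used in generate_response."""
    role_map = {"user": "user", "assistant": "model"}
    return [{"role": role_map[m["role"]], "parts": [m["text"]]} for m in messages]
```

In the app, messages would be st.session_state.messages, and each stored turn could be rendered with st.chat_message(m["role"]) followed by st.write(m["text"]).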
Testing and Debugging
Testing and debugging are crucial steps in ensuring your chatbot works seamlessly.
Testing Tips:
- Test with different website URLs.
- Check how the chatbot handles missing or incorrect website content.
- Ensure responses are accurate and helpful.
Debugging Tips:
- Use st.error() to display error messages in Streamlit.
- Check the console for error logs and resolve issues step-by-step.
- Confirm that the API key and other configurations are correctly set up.
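For the last tip, one way to catch configuration problems early is to validate the key before any request is made. The snippet below is a sketch only. The environment variable name GEMINI_API_KEY and the helper load_api_key are assumptions made for illustration, not names required by the library.

```python
import os


def load_api_key(env=None, var_name="GEMINI_API_KEY"):
    """Return the API key from the environment, or raise with a readable message."""
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    # Treat the tutorial's placeholder value as "not configured"
    if not key or key == "your_gemini_api_key":
        raise RuntimeError(f"{var_name} is not set; export it before starting the app.")
    return key
```

In the Streamlit app, the call could be wrapped in try/except and the message passed to st.error(), so that a missing key shows up in the page rather than only in the console.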
Enhancing and Customizing Your Chatbot
Once the basic functionality is in place, you can enhance and customize your chatbot further. Consider adding:
- Sophisticated error handling.
- Improved user interfaces using additional Streamlit components.
- Integration with other APIs or data sources for more functionality.
Congratulations! You have built a website chatbot using Streamlit and Google’s Gemini API. It fetches and processes your website’s content and uses it to generate informative, relevant responses for your visitors.
Along the way, you’ve worked through web scraping, API integration, and building an interactive web application with Streamlit. Keep experimenting and expanding on this foundation to tailor the chatbot to your specific needs.