A Practical Guide to Building a Gen AI App for Insights from Images Using Streamlit

This guide walks through building an AI application for image analysis with Streamlit and the Gemini API, with step-by-step instructions for integrating the AI functionality.

Amit Kulkarni
Python in Plain English

Source: Author

This post covers the following topics:

  • Introduction
  • Accessing the API Key
  • App development
  • Executing the app
  • Test for various scenarios
  • Conclusion & FAQs

Introduction

This guide shows how to combine two tools, Google's Gemini API and Streamlit, to build a generative AI app that extracts insights from images, including images of documents. It walks through the process step by step, from obtaining an API key and setting up a development environment to building the Streamlit app and running it locally. By the end, you should be able to upload an image, ask a question about it in plain language, and read the model's answer in the browser.

If you are new to app development with Plotly Dash or Streamlit, consider this quick-start guide: Plotly Dash Vs Streamlit | A Beginners Guide For App Development In Python.

Accessing the API Key

Google lets you generate an API key in Google AI Studio. Rather than hard-coding the key, we store it in a .env file under the name GOOGLE_API_KEY and load it at runtime, as shown in the example below.

import os
import google.generativeai as genai
from dotenv import load_dotenv

load_dotenv()  # read the variables in .env into the environment
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
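If the key is missing, os.getenv returns None and the problem only surfaces later, when the first API call fails. A minimal sketch of a fail-fast check follows. The helper name require_api_key is hypothetical (it is not part of the original app), and a plain dict stands in for the real environment so the snippet runs without a .env file.

```python
import os


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the key stored under `name`, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of the original app.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key


# Usage with a plain dict standing in for the real environment
print(require_api_key({"GOOGLE_API_KEY": "demo-key"}))
```

In the app, the returned value could be passed to genai.configure in place of the direct os.getenv call.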

App development

We will use Streamlit to develop the app. Streamlit is a Python framework for building interactive web applications. Its simple syntax and built-in widgets let developers prototype data visualization and machine learning apps without writing front-end code, and it works well with popular data science libraries such as Pandas, Matplotlib, and Plotly. In this section, we build the app step by step, starting with the layout, then processing and validating the uploaded image, and finally running the application.

Installing the libraries

We'll create a virtual environment to keep this project's dependencies isolated, then install the necessary libraries inside it. All the libraries can be installed from requirements.txt; for the code shown in this post, that means at least streamlit, google-generativeai, and python-dotenv.

<project path>virtualenv genai
<project path>genai\scripts\activate
(genai) <project path>pip install -r requirements.txt

The Layout

The app needs the following controls:

  1. A control to browse and upload an image, in the left panel
  2. A text field to enter the prompt, on the canvas
  3. A button to start the process, on the canvas
  4. An area that displays the uploaded image
  5. A result section that shows the response from the AI

import streamlit as st

st.set_page_config(page_title="Document & Image Analyzer")
st.sidebar.title("Upload Image")

# Prompt field on the main canvas
input = st.text_input("Input Prompt: ", key="input")

# Image uploader in the sidebar, restricted to common image formats
uploaded_file = st.sidebar.file_uploader(
    "Choose an image...", type=["jpg", "png", "jpeg"]
)

submit = st.button("Fetch Information")

# Preview the image as soon as one is uploaded
if uploaded_file is not None:
    st.image(uploaded_file, caption="Uploaded Image", use_column_width=True)

Source: Author | Fig1: Layout of the app

Processing and validations

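Next, we write a function that converts the uploaded image into the structure passed to the Gemini API: a list containing one dictionary with the file's MIME type and its raw bytes.

def input_image_setup(uploaded_file):
    """Package the uploaded image as a list of parts for the Gemini API."""
    # Check if a file has been uploaded
    if uploaded_file is not None:
        # Read the file into bytes
        bytes_data = uploaded_file.getvalue()

        image_parts = [
            {
                "mime_type": uploaded_file.type,  # e.g. "image/png"
                "data": bytes_data,
            }
        ]
        return image_parts
    else:
        raise FileNotFoundError("No file uploaded")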
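The function above only checks that a file exists. A stricter variant might also check the file type and size before anything is sent to the model. The sketch below is one way to do that, not the author's implementation. The 5 MB limit, the allowed MIME types, and the names validated_image_parts and FakeUpload are all assumptions made for illustration. FakeUpload is a stand-in that mimics only the .type attribute and .getvalue() method used here, so the example runs without Streamlit.

```python
from dataclasses import dataclass

# Assumed to correspond to the jpg/jpeg/png filter on the uploader
ALLOWED_MIME_TYPES = {"image/jpeg", "image/png"}
MAX_BYTES = 5 * 1024 * 1024  # hypothetical 5 MB limit, not from the article


@dataclass
class FakeUpload:
    """Stand-in for an uploaded file: exposes .type and .getvalue() only."""
    type: str
    data: bytes

    def getvalue(self) -> bytes:
        return self.data


def validated_image_parts(uploaded_file):
    """Return the image payload, or raise ValueError with a readable reason."""
    if uploaded_file is None:
        raise ValueError("No file uploaded")
    if uploaded_file.type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"Unsupported file type: {uploaded_file.type}")
    data = uploaded_file.getvalue()
    if not data:
        raise ValueError("Uploaded file is empty")
    if len(data) > MAX_BYTES:
        raise ValueError("Uploaded file exceeds the size limit")
    # Same structure as input_image_setup: one dict with MIME type and bytes
    return [{"mime_type": uploaded_file.type, "data": data}]


# Usage: build the payload from a fake PNG upload and inspect it
parts = validated_image_parts(FakeUpload("image/png", b"\x89PNG fake bytes"))
print(parts[0]["mime_type"], len(parts[0]["data"]))
```

In the app, a ValueError raised here could be caught and shown to the user instead of letting the request fail.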

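Next, we write a function that sends the text inputs and the image to the Gemini API and returns the text of the response.

def get_gemini_response(input, image, prompt):
    # Model name as given in the original code; the set of available Gemini
    # models changes over time, so check Google's current model list.
    model = genai.GenerativeModel("gemini-pro-vision")
    # Send both text inputs and the image together in a single request
    response = model.generate_content([input, image[0], prompt])
    return response.text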
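The post does not show the code that runs when the button is clicked, so the following is only a sketch of how the pieces might connect. The function name analyze, the example prompts, and the FakeModel class are hypothetical. FakeModel mimics just the generate_content(...).text shape used above, so the example runs without an API key or network access.

```python
from types import SimpleNamespace


class FakeModel:
    """Stand-in for a Gemini model; mimics only generate_content(...).text."""

    def generate_content(self, parts):
        return SimpleNamespace(text=f"received {len(parts)} parts")


def analyze(model, instruction, image_parts, user_prompt):
    """Send instruction, image, and question in one request; return the reply text."""
    if not user_prompt.strip():
        raise ValueError("Enter a prompt before submitting")
    # Same ordering as get_gemini_response: text, image, text
    response = model.generate_content([instruction, image_parts[0], user_prompt])
    return response.text


# Usage with placeholder image bytes and the stand-in model
image_parts = [{"mime_type": "image/png", "data": b"placeholder"}]
answer = analyze(
    FakeModel(),
    "You are an expert at reading images.",  # hypothetical instruction text
    image_parts,
    "What is shown in this image?",
)
print(answer)
```

In the Streamlit app, a call like this would typically sit inside an `if submit:` block, with the result displayed through st.write.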

Executing the app

In your VS Code terminal, go to the project directory and enter the following command.

<project path>streamlit run ContentExtractor.py

If the code runs without errors, the terminal prints the URLs where the app is being served. This confirms that the application is running locally, and you can open the local URL in your web browser to test and use it.

  You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://192.168.0.2:8501

Test for various scenarios

The complete code for the app is available on GitHub.

Conclusion

In this post, we built an AI application with the Gemini API and Streamlit. The app shows how a generative model can extract insights from text and visual data, and it gives developers a starting point for building more robust data extraction and analysis applications. With tools like these, a working prototype takes only a few short functions and a simple layout.

FAQs

Q1: How can the Gemini API and Streamlit contribute to enhancing user productivity?
A1: Streamlit turns a Python script into an interactive interface with little code, and the Gemini API handles image and text understanding, so tasks such as extracting information from an image can be automated rather than done by hand.

Q2: What level of technical expertise is required to utilize the Gemini API and Streamlit effectively?
A2: Gemini API and Streamlit are user-friendly platforms with extensive documentation, accessible to users of varying technical proficiency levels, allowing beginners to start with basic functionality.

Q3: Can the Gemini API and Streamlit be integrated with other third-party tools or APIs?
A3: Gemini API and Streamlit enable integration with third-party tools, libraries, and APIs, enhancing applications with additional functionality like external data access, machine learning models, and cloud services, allowing users to tailor their projects to meet specific requirements.

Q4: What are some potential challenges or limitations associated with using the Gemini API and Streamlit?
A4: Users may face challenges such as managing large datasets, optimizing performance, and integrating complex AI models, and may run into scalability limits as usage grows.

Q5: How can users stay updated on new features, updates, and best practices for the Gemini API and Streamlit?
A5: Regularly checking official documentation, blog posts, and community forums, as well as attending conferences and participating in online discussions, helps users stay informed about new features, updates, and best practices for Gemini API and Streamlit.
