
AI Embedding Models - Voyage, Anyscale, Google Gemini

Voyage AI

Voyage is a team of leading AI researchers and engineers, building embedding models for better retrieval and RAG. 1

setup
python3 -m venv env-embeddings
source env-embeddings/bin/activate

# `-I`  Ignore the installed packages, overwriting them.
# `-U`  Upgrade all specified packages to the newest available version.

pip3 install -U voyageai==0.2.3
pip3 install --upgrade --force-reinstall voyageai
pip3 show voyageai
pip3 index versions voyageai
hands-on
# import the 'voyageai' module
import voyageai
import os

# Create a 'Client' object from the 'voyageai' module and initialize it with your API key
vo = voyageai.Client(api_key=os.environ.get("VOYAGE_AI_API_KEY"))

# user query
user_query = "When is Apple releasing their new iPhone?"


# The 'model' parameter is set to "voyage-2", and the 'input_type' parameter is set to "document"
documents_embeddings = vo.embed(
    [user_query], model="voyage-2", input_type="document"
).embeddings

# printing the embedding
print(documents_embeddings)
VOYAGE_AI_API_KEY=pa-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  python3 voyage-ai.py
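
Voyage distinguishes queries from documents via the input_type parameter. Below is a minimal retrieval sketch, reusing the vo client from the script above; the documents are made-up strings for illustration. Documents are embedded with input_type="document", the query with input_type="query", and results are ranked by cosine similarity.

# example documents to search over (made-up strings for illustration)
documents = [
    "Apple typically announces new iPhone models at its September event.",
    "The new MacBook Pro ships with the M3 family of chips.",
]

# embed the documents and the query with their respective input types
doc_embeddings = vo.embed(documents, model="voyage-2", input_type="document").embeddings
query_embedding = vo.embed(
    ["When is Apple releasing their new iPhone?"], model="voyage-2", input_type="query"
).embeddings[0]

# cosine similarity between two vectors
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))

# rank documents by similarity to the query and print the best match
scores = [cosine(query_embedding, d) for d in doc_embeddings]
best = max(range(len(documents)), key=lambda i: scores[i])
print(documents[best], scores[best])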

Supported embedding models, with more to come.

Model                   | Context Length (tokens) | Embedding Dimension | Description
------------------------|-------------------------|---------------------|------------
voyage-code-2           | 16000                   | 1536                | Optimized for code retrieval (17% better than alternatives)
voyage-2                | 4000                    | 1024                | Embedding model with the best retrieval quality (better than OpenAI ada)
voyage-lite-02-instruct | 4000                    | 1024                | Instruction-tuned for classification, clustering, and sentence textual similarity tasks
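
For code retrieval, only the model name changes. A short sketch with voyage-code-2, reusing the same vo client; the snippets are illustrative.

# embed code snippets with voyage-code-2 (16K-token context, 1536 dimensions)
code_snippets = [
    "def add(a, b):\n    return a + b",
    "SELECT name FROM users WHERE active = 1;",
]
code_embeddings = vo.embed(
    code_snippets, model="voyage-code-2", input_type="document"
).embeddings
print(len(code_embeddings[0]))  # 1536, per the table above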

Anyscale AI

Anyscale, the company behind Ray, releases APIs for LLM developers to run and fine-tune open-source LLMs quickly, cost-efficiently, and at scale. 1

setup
python3 -m venv env-embeddings
source env-embeddings/bin/activate

# `-I`  Ignore the installed packages, overwriting them.
# `-U`  Upgrade all specified packages to the newest available version.

pip3 install -U openai==1.33.0
pip3 install --upgrade --force-reinstall openai
pip3 show openai
pip3 index versions openai
hands-on llm
# import necessary modules
import openai
import os

# define the anyscale endpoint token
ANYSCALE_ENDPOINT_TOKEN = os.environ.get("ANYSCALE_ENDPOINT_TOKEN")

# Create an OpenAI client with the Anyscale base URL and API key
oai_client = openai.OpenAI(
  base_url="https://api.endpoints.anyscale.com/v1",
  api_key=ANYSCALE_ENDPOINT_TOKEN,
)

# Define the OpenAI model to be used for chat completions
model = "mistralai/Mistral-7B-Instruct-v0.1"

# Define a prompt for the chat completion
prompt = '''hello, how are you?
'''

# Use the AnyScale model for chat completions
# Send a user message using the defined prompt
response = oai_client.chat.completions.create(
  model=model,
  messages=[
    {"role": "user", "content": prompt}
  ],
)

# printing the response
print(response.choices[0].message.content)
ANYSCALE_ENDPOINT_TOKEN=esecret_xxxxxxxxxxxxxxxxxxxxxxxxxx \
  python3 anyscale-ai.py
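
Because Anyscale exposes an OpenAI-compatible API, the usual chat-completions options carry over. A sketch that streams tokens as they are generated, assuming the endpoint accepts stream=True (as OpenAI-compatible endpoints generally do); it reuses the oai_client from above.

# stream the completion token-by-token instead of waiting for the full response
stream = oai_client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "hello, how are you?"},
    ],
    stream=True,
)

# each chunk carries an incremental delta of the assistant message
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()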
hands-on embedding
# import necessary modules
import openai
import os

# Define the Anyscale endpoint token
ANYSCALE_ENDPOINT_TOKEN = os.environ.get("ANYSCALE_ENDPOINT_TOKEN")

# Create an OpenAI client with the Anyscale base URL and API key
oai_client = openai.OpenAI(
  base_url="https://api.endpoints.anyscale.com/v1",
  api_key=ANYSCALE_ENDPOINT_TOKEN,
)

# https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
# https://cookbook.openai.com/examples/using_embeddings
embeddings = oai_client.embeddings.create(
  model="thenlper/gte-large",
  input=["Your text string goes here"],
)
# print(embeddings.model_dump())
print(embeddings.data[0].embedding)
ANYSCALE_ENDPOINT_TOKEN=esecret_xxxxxxxxxxxxxxxxxxxxxxxxxx \
  python3 anyscale-ai.py
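
The returned vectors can be compared directly. A minimal sketch that embeds two made-up strings in one request and computes their cosine similarity, reusing the oai_client from above.

# embed two strings in a single request
result = oai_client.embeddings.create(
    model="thenlper/gte-large",
    input=["The new iPhone was announced today", "Apple released a new smartphone"],
)
a = result.data[0].embedding
b = result.data[1].embedding

# cosine similarity between the two embeddings
dot = sum(x * y for x, y in zip(a, b))
cosine = dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))
print(cosine)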

Supported LLM and Embedding models.

Model                       | Price ($/M tokens)
----------------------------|-------------------
Mistral-7B-OpenOrca         | 0.15
Mistral-7B-Instruct-v0.1    | 0.15
Zephyr-7b-beta              | 0.15
Llama-Guard-7b              | 0.15
Llama-2-7b-chat-hf          | 0.15
NeuralHermes-2.5-Mistral-7B | 0.15
Llama-2-13b-chat-hf         | 0.25
Mixtral-8x7B-Instruct-v0.1  | 0.50
Llama-2-70b-chat-hf         | 1.0
CodeLlama-34b-Instruct-hf   | 1.0
CodeLlama-70b-Instruct-hf   | 1.0
thenlper-gte-large          | 0.05
BAAI/bge-large-en-v1.5      | 0.05
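
Since the API reports token usage per request, the prices above can be turned into a rough per-request cost estimate. A sketch reusing the oai_client from above; the Mixtral model id is assumed to follow the same mistralai/ naming as the Mistral-7B id used earlier.

# $/M-token prices taken from the table above
PRICE_PER_M_TOKENS = {
    "mistralai/Mistral-7B-Instruct-v0.1": 0.15,
    "mistralai/Mixtral-8x7B-Instruct-v0.1": 0.50,  # assumed model id
}

model = "mistralai/Mistral-7B-Instruct-v0.1"
response = oai_client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "hello, how are you?"}],
)

# usage is reported per request; convert total tokens to dollars
total_tokens = response.usage.total_tokens
cost = total_tokens / 1_000_000 * PRICE_PER_M_TOKENS[model]
print(f"{total_tokens} tokens ~= ${cost:.6f}")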

Google Gemini AI

Gemini's free-tier API usage is what makes it particularly interesting. 1 2 3

setup
python3 -m venv env-embeddings
source env-embeddings/bin/activate

# `-I`  Ignore the installed packages, overwriting them.
# `-U`  Upgrade all specified packages to the newest available version.

pip3 install -U google-generativeai==0.6.0 grpcio==1.64.1 grpcio-tools==1.62.2
pip3 install --upgrade --force-reinstall google-generativeai grpcio grpcio-tools
pip3 show google-generativeai grpcio grpcio-tools
pip3 index versions google-generativeai grpcio grpcio-tools
hands-on llm
# importing google.generativeai as genai
import google.generativeai as genai
import os

# setting the api key
genai.configure(api_key=os.environ.get("GOOGLE_GEMINI_API_KEY"))

# setting the text model
model = genai.GenerativeModel('gemini-pro')

# generating response
response = model.generate_content("What is the meaning of life?")

# printing the response
print(response.text)
GOOGLE_GEMINI_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  python3 gemini-ai.py
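
gemini-pro also supports multi-turn conversations through a chat session that keeps the running history. A short sketch building on the same genai setup as above.

# start a chat session; the session accumulates the conversation history
model = genai.GenerativeModel('gemini-pro')
chat = model.start_chat(history=[])

# first turn
response = chat.send_message("What is the meaning of life?")
print(response.text)

# follow-up turn that relies on the earlier context
response = chat.send_message("Summarize that answer in one sentence.")
print(response.text)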
hands-on embedding
# importing google.generativeai as genai
import google.generativeai as genai
import os

# setting the api key
genai.configure(api_key=os.environ.get("GOOGLE_GEMINI_API_KEY"))

# https://python.langchain.com/v0.2/docs/integrations/text_embedding/google_generative_ai/
# https://github.com/google-gemini/cookbook
# https://ai.google.dev/api/python/google/generativeai
# https://ai.google.dev/api/python/google/generativeai/generate_embeddings

title = "The next generation of AI for developers and Google Workspace"
sample_text = '''
Title: The next generation of AI for developers and Google Workspace
Full article:
Gemini API & Google AI Studio: An approachable way to explore and prototype with generative AI applications
'''

# https://github.com/google-gemini/cookbook/blob/main/examples/Talk_to_documents_with_embeddings.ipynb
# https://ai.google.dev/api/python/google/generativeai/embed_content
model = 'models/embedding-001'
embedding = genai.embed_content(
  model=model,
  content=[
    sample_text,
    "The next generation of AI for developers and Google Workspace"
  ],
  task_type="retrieval_query",
  # task_type="retrieval_document",
  # title is optional - only applicable when task_type is RETRIEVAL_DOCUMENT.
  # title=title,
)
print(embedding['embedding'][0])
print('-----')

model = 'models/embedding-001'
embedding = genai.embed_content(
  model=model,
  content=sample_text,
  task_type="retrieval_document",
  title=title,
)
print(embedding)
GOOGLE_GEMINI_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  python3 gemini-ai.py
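
Putting the two task types together gives a small document search, along the lines of the Talk_to_documents cookbook linked above. A sketch with made-up documents: the corpus is embedded as retrieval_document, the question as retrieval_query, and results are ranked by dot product.

# a tiny corpus to search over (illustrative strings)
documents = [
    "Gemini API & Google AI Studio: prototype with generative AI applications",
    "Operating the climate control system in your car",
]

# embed the corpus as retrieval documents
doc_embeddings = genai.embed_content(
  model='models/embedding-001',
  content=documents,
  task_type="retrieval_document",
)['embedding']

# embed the question as a retrieval query
query_embedding = genai.embed_content(
  model='models/embedding-001',
  content="How do I prototype a generative AI app?",
  task_type="retrieval_query",
)['embedding']

# rank documents by dot product with the query embedding
scores = [sum(q * d for q, d in zip(query_embedding, doc)) for doc in doc_embeddings]
best = max(range(len(documents)), key=lambda i: scores[i])
print(documents[best])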

Supported Models.

Model Name        | Task Type                           | Queries per Minute
------------------|-------------------------------------|-------------------
gemini-pro        | Text                                | 60
gemini-pro-vision | Image                               | 60
embedding-001     | Classification, clustering and more | Unknown
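
The available model names and what each supports (generateContent, embedContent, ...) can also be queried at runtime; a minimal sketch reusing the configured genai module:

# list the models visible to this API key and the methods they support
for m in genai.list_models():
    print(m.name, m.supported_generation_methods)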

List of Task Types.

task_type           | Description
--------------------|------------
RETRIEVAL_QUERY     | Specifies the given text is a query in a search or retrieval setting.
RETRIEVAL_DOCUMENT  | Specifies the given text is a document in a search or retrieval setting.
SEMANTIC_SIMILARITY | Specifies the given text is used for Semantic Textual Similarity (STS).
CLASSIFICATION      | Specifies that the embedding is used for classification.
CLUSTERING          | Specifies that the embedding is used for clustering.
QUESTION_ANSWERING  | Specifies that the query embedding is used for answering questions. Use RETRIEVAL_DOCUMENT for the document side.
FACT_VERIFICATION   | Specifies that the query embedding is used for fact verification.
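
As an example of a non-retrieval task type, SEMANTIC_SIMILARITY embeds sentences for STS-style comparison. A sketch with two made-up sentences, reusing the configured genai module from above.

# embed two sentences for semantic textual similarity
result = genai.embed_content(
  model='models/embedding-001',
  content=["The cat sits on the mat", "A cat is resting on a rug"],
  task_type="semantic_similarity",
)
a, b = result['embedding']

# cosine similarity between the two sentence embeddings
dot = sum(x * y for x, y in zip(a, b))
cosine = dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))
print(cosine)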

other embedding models