Feat/update api #43

Merged 5 commits on May 13, 2024
90 changes: 42 additions & 48 deletions examples/retrieval/semantic_search.ipynb
@@ -2,18 +2,18 @@
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "initial_id",
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import taskingai\n",
"# Load TaskingAI API Key from environment variable\n",
"from taskingai.retrieval import Collection\n",
"from taskingai.retrieval.text_splitter import TokenTextSplitter"
]
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
@@ -37,21 +37,19 @@
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# choose an available text_embedding model from your project\n",
"embedding_model_id = \"YOUR_EMBEDDING_MODEL_ID\""
],
"metadata": {
"collapsed": false
},
"id": "388eb6fa46f66b52"
"id": "388eb6fa46f66b52",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# create a collection\n",
"def create_collection() -> Collection:\n",
@@ -67,12 +65,12 @@
"metadata": {
"collapsed": false
},
"id": "7c7d4e2cc2f2f494"
"id": "7c7d4e2cc2f2f494",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# Check collection status. \n",
"# Only when status is \"READY\" can you insert records and query chunks.\n",
@@ -82,70 +80,64 @@
"metadata": {
"collapsed": false
},
"id": "eb5dee18aa83c5e4"
"id": "eb5dee18aa83c5e4",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# create record 1 (machine learning)\n",
"taskingai.retrieval.create_record(\n",
" collection_id=collection.collection_id,\n",
" title=\"Machine Learning\",\n",
" type=\"text\",\n",
" content=\"Machine learning is a subfield of artificial intelligence (AI) that involves the development of algorithms that allow computers to learn from and make decisions or predictions based on data. The term \\\"machine learning\\\" was coined by Arthur Samuel in 1959. In other words, machine learning enables a system to automatically learn and improve from experience without being explicitly programmed. This is achieved by feeding the system massive amounts of data, which it uses to learn patterns and make inferences. There are three main types of machine learning: 1. Supervised Learning: This is where the model is given labeled training data and the goal of learning is to generalize from the training data to unseen situations in a principled way. 2. Unsupervised Learning: This involves training on a dataset without explicit labels. The goal might be to discover inherent groupings or patterns within the data. 3. Reinforcement Learning: In this type, an agent learns to perform actions based on reward/penalty feedback to achieve a goal. It's commonly used in robotics, gaming, and navigation. Deep learning, a subset of machine learning, uses neural networks with many layers (\\\"deep\\\" structures) and has been responsible for many recent breakthroughs in AI, including speech recognition, image recognition, and natural language processing. It's important to note that machine learning is a rapidly developing field, with new techniques and applications emerging regularly.\",\n",
" text_splitter=TokenTextSplitter(\n",
" chunk_size=100, # maximum tokens of each chunk\n",
" chunk_overlap=10, # token overlap between chunks\n",
" ),\n",
" text_splitter={\"type\": \"token\", \"chunk_size\": 100, \"chunk_overlap\": 10},\n",
")"
],
"metadata": {
"collapsed": false
},
"id": "f783de4624047df7"
"id": "f783de4624047df7",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# create record 2 (Michael Jordan)\n",
"taskingai.retrieval.create_record(\n",
" collection_id=collection.collection_id,\n",
" type=\"text\",\n",
" content=\"Michael Jordan, often referred to by his initials MJ, is considered one of the greatest players in the history of the National Basketball Association (NBA). He was known for his scoring ability, defensive prowess, competitiveness, and clutch performances. Born on February 17, 1963, Jordan played 15 seasons in the NBA, primarily with the Chicago Bulls, but also with the Washington Wizards. His professional career spanned two decades from 1984 to 2003, during which he won numerous awards and set multiple records. Here are some key highlights of his career: - Scoring: Jordan won the NBA scoring title a record 10 times. He also has the highest career scoring average in NBA history, both in the regular season (30.12 points per game) and in the playoffs (33.45 points per game). - Championships: He led the Chicago Bulls to six NBA championships and was named Finals MVP in all six of those Finals (1991-1993, 1996-1998). - MVP Awards: Jordan was named the NBA's Most Valuable Player (MVP) five times (1988, 1991, 1992, 1996, 1998). - Defensive Ability: He was named to the NBA All-Defensive First Team nine times and won the NBA Defensive Player of the Year award in 1988. - Olympics: Jordan also won two Olympic gold medals with the U.S. basketball team, in 1984 and 1992. - Retirements and Comebacks: Jordan retired twice during his career. His first retirement came in 1993, after which he briefly played minor league baseball. He returned to the NBA in 1995. He retired a second time in 1999, only to return again in 2001, this time with the Washington Wizards. He played two seasons for the Wizards before retiring for good in 2003. After his playing career, Jordan became a team owner and executive. As of my knowledge cutoff in September 2021, he is the majority owner of the Charlotte Hornets. Off the court, Jordan is known for his lucrative endorsement deals, particularly with Nike. The Air Jordan line of sneakers is one of the most popular and enduring in the world. 
His influence also extends to the realms of film and fashion, and he is recognized globally as a cultural icon. In 2000, he was inducted into the Basketball Hall of Fame.\",\n",
" text_splitter=TokenTextSplitter(\n",
" chunk_size=100,\n",
" chunk_overlap=10,\n",
" ),\n",
" content=\"Michael Jordan, often referred to by his initials MJ, is considered one of the greatest players in the history of the National Basketball Association (NBA). He was known for his scoring ability, defensive prowess, competitiveness, and clutch performances. Born on February 17, 1963, Jordan played 15 seasons in the NBA, primarily with the Chicago Bulls, but also with the Washington Wizards. His professional career spanned two decades from 1984 to 2003, during which he won numerous awards and set multiple records. \\n\\n Here are some key highlights of his career: - Scoring: Jordan won the NBA scoring title a record 10 times. He also has the highest career scoring average in NBA history, both in the regular season (30.12 points per game) and in the playoffs (33.45 points per game). - Championships: He led the Chicago Bulls to six NBA championships and was named Finals MVP in all six of those Finals (1991-1993, 1996-1998). - MVP Awards: Jordan was named the NBA's Most Valuable Player (MVP) five times (1988, 1991, 1992, 1996, 1998). - Defensive Ability: He was named to the NBA All-Defensive First Team nine times and won the NBA Defensive Player of the Year award in 1988. - Olympics: Jordan also won two Olympic gold medals with the U.S. basketball team, in 1984 and 1992. \\n\\n - Retirements and Comebacks: Jordan retired twice during his career. His first retirement came in 1993, after which he briefly played minor league baseball. He returned to the NBA in 1995. He retired a second time in 1999, only to return again in 2001, this time with the Washington Wizards. He played two seasons for the Wizards before retiring for good in 2003. After his playing career, Jordan became a team owner and executive. As of my knowledge cutoff in September 2021, he is the majority owner of the Charlotte Hornets. Off the court, Jordan is known for his lucrative endorsement deals, particularly with Nike. 
\\n\\n The Air Jordan line of sneakers is one of the most popular and enduring in the world. His influence also extends to the realms of film and fashion, and he is recognized globally as a cultural icon. In 2000, he was inducted into the Basketball Hall of Fame.\",\n",
" text_splitter={\"type\": \"separator\", \"chunk_size\": 200, \"chunk_overlap\": 10, \"separators\": [\"\\n\\n\"]}\n",
")"
],
"metadata": {
"collapsed": false
},
"id": "e23ee88246ffc350"
"id": "e23ee88246ffc350",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# create record 3 (Granite)\n",
"taskingai.retrieval.create_record(\n",
" collection_id=collection.collection_id,\n",
" type=\"text\",\n",
" content=\"Granite is a type of coarse-grained igneous rock composed primarily of quartz and feldspar, among other minerals. The term \\\"granitic\\\" means granite-like and is applied to granite and a group of intrusive igneous rocks. Description of Granite * Type: Igneous rock * Grain size: Coarse-grained * Composition: Mainly quartz, feldspar, and micas with minor amounts of amphibole minerals * Color: Typically appears in shades of white, pink, or gray, depending on their mineralogy * Crystalline Structure: Yes, due to slow cooling of magma beneath Earth's surface * Density: Approximately 2.63 to 2.75 g/cm³ * Hardness: 6-7 on the Mohs hardness scale Formation Process Granite is formed from the slow cooling of magma that is rich in silica and aluminum, deep beneath the earth's surface. Over time, the magma cools slowly, allowing large crystals to form and resulting in the coarse-grained texture that is characteristic of granite. Uses Granite is known for its durability and aesthetic appeal, making it a popular choice for construction and architectural applications. It's often used for countertops, flooring, monuments, and building materials. In addition, due to its hardness and toughness, it is used for cobblestones and in other paving applications. Geographical Distribution Granite is found worldwide, with significant deposits in regions such as the United States (especially in New Hampshire, which is also known as \\\"The Granite State\\\"), Canada, Brazil, Norway, India, and China. Varieties There are many varieties of granite, based on differences in color and mineral composition. Some examples include Bianco Romano, Black Galaxy, Blue Pearl, Santa Cecilia, and Ubatuba. Each variety has unique patterns, colors, and mineral compositions.\",\n",
" text_splitter=TokenTextSplitter(\n",
" chunk_size=100,\n",
" chunk_overlap=10,\n",
" ),\n",
" text_splitter={\"type\": \"token\", \"chunk_size\": 100, \"chunk_overlap\": 10},\n",
")"
],
"metadata": {
"collapsed": false
},
"id": "73458e8086bec5bd"
"id": "73458e8086bec5bd",
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
@@ -159,8 +151,6 @@
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# Check record status. \n",
"# Only when status is \"READY\", the record chunks can appear in query results.\n",
@@ -172,48 +162,50 @@
"metadata": {
"collapsed": false
},
"id": "f6140ba9ae4e3f91"
"id": "f6140ba9ae4e3f91",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# query chunks 1\n",
"chunks = taskingai.retrieval.query_chunks(\n",
" collection_id=collection.collection_id,\n",
" query_text=\"Basketball\",\n",
" top_k=2\n",
" top_k=10,\n",
" score_threshold=0.5,\n",
")\n",
"print(chunks)"
],
"metadata": {
"collapsed": false
},
"id": "cd499d7869e8445c"
"id": "cd499d7869e8445c",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# query chunks 2\n",
"chunks = taskingai.retrieval.query_chunks(\n",
" collection_id=collection.collection_id,\n",
" query_text=\"geology\",\n",
" top_k=2\n",
" top_k=10,\n",
" max_tokens=300,\n",
")\n",
"print(chunks)"
],
"metadata": {
"collapsed": false
},
"id": "b6fd67f81af404b2"
"id": "b6fd67f81af404b2",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# query chunks 3\n",
"chunks = taskingai.retrieval.query_chunks(\n",
@@ -226,7 +218,9 @@
"metadata": {
"collapsed": false
},
"id": "fc9c1fa12d893dd1"
"id": "fc9c1fa12d893dd1",
"outputs": [],
"execution_count": null
}
],
"metadata": {
2 changes: 1 addition & 1 deletion taskingai/_version.py
@@ -1,2 +1,2 @@
__title__ = "taskingai"
-__version__ = "0.2.3"
+__version__ = "0.2.4"
3 changes: 3 additions & 0 deletions taskingai/client/models/entities/__init__.py
@@ -30,9 +30,11 @@
from .chat_completion_function_message import *
from .chat_completion_function_parameters import *
from .chat_completion_function_parameters_property import *
+from .chat_completion_function_parameters_property_items import *
from .chat_completion_message import *
from .chat_completion_role import *
from .chat_completion_system_message import *
+from .chat_completion_usage import *
from .chat_completion_user_message import *
from .chat_memory import *
from .chat_memory_message import *
@@ -54,6 +56,7 @@
from .status import *
from .text_embedding_input_type import *
from .text_embedding_output import *
+from .text_embedding_usage import *
from .text_splitter import *
from .text_splitter_type import *
from .tool_ref import *
6 changes: 1 addition & 5 deletions taskingai/client/models/entities/action.py
@@ -12,13 +12,9 @@
"""

from pydantic import BaseModel, Field
-from typing import Optional, Any, Dict
+from typing import Any, Dict
from .action_method import ActionMethod
from .action_param import ActionParam
from .action_param import ActionParam
from .action_body_type import ActionBodyType
from .action_param import ActionParam
from .chat_completion_function import ChatCompletionFunction
from .action_authentication import ActionAuthentication

__all__ = ["Action"]
2 changes: 2 additions & 0 deletions taskingai/client/models/entities/chat_completion.py
@@ -14,6 +14,7 @@
from pydantic import BaseModel, Field
from .chat_completion_finish_reason import ChatCompletionFinishReason
from .chat_completion_assistant_message import ChatCompletionAssistantMessage
+from .chat_completion_usage import ChatCompletionUsage

__all__ = ["ChatCompletion"]

@@ -22,3 +23,4 @@ class ChatCompletion(BaseModel):
     finish_reason: ChatCompletionFinishReason = Field(...)
     message: ChatCompletionAssistantMessage = Field(...)
     created_timestamp: int = Field(...)
+    usage: ChatCompletionUsage = Field(...)
@@ -13,12 +13,13 @@

from pydantic import BaseModel, Field
from typing import Optional, List

from .chat_completion_function_parameters_property_items import ChatCompletionFunctionParametersPropertyItems

__all__ = ["ChatCompletionFunctionParametersProperty"]


 class ChatCompletionFunctionParametersProperty(BaseModel):
-    type: str = Field(..., pattern="^(string|number|integer|boolean)$")
-    description: str = Field("", max_length=256)
+    type: str = Field(..., pattern="^(string|number|integer|boolean|array)$")
+    description: str = Field("", max_length=512)
     enum: Optional[List[str]] = Field(None)
+    items: Optional[ChatCompletionFunctionParametersPropertyItems] = Field(None)
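
Review note: the hunk above widens the allowed `type` values to include `array` and adds an optional `items` field for the element type. The sketch below restates those constraints with the standard library only, so the rules can be checked without installing the client. The two regex patterns and the 512-character limit are copied from the diff; the helper name `check_property` and the dict-based input are assumptions made for illustration and are not part of the TaskingAI client.

```python
import re
from typing import Optional

# Patterns and the length limit are copied from the diff; the rest is illustrative.
PROPERTY_TYPE = re.compile(r"^(string|number|integer|boolean|array)$")
ITEM_TYPE = re.compile(r"^(string|number|integer|boolean)$")
MAX_DESCRIPTION = 512


def check_property(prop: dict) -> Optional[str]:
    """Return an error message for an invalid property dict, or None if it passes.

    Hypothetical helper: it mirrors the field constraints, not the client's code.
    """
    if not PROPERTY_TYPE.match(prop.get("type", "")):
        return "type must be string, number, integer, boolean, or array"
    if len(prop.get("description", "")) > MAX_DESCRIPTION:
        return "description is longer than 512 characters"
    items = prop.get("items")
    if items is not None and not ITEM_TYPE.match(items.get("type", "")):
        return "items.type must be string, number, integer, or boolean"
    return None


# An array of strings passes; an array of arrays is rejected by the items pattern.
tags = {"type": "array", "description": "Tags to filter by", "items": {"type": "string"}}
nested = {"type": "array", "items": {"type": "array"}}
print(check_property(tags))
print(check_property(nested))
```

Note that the models as shown do not require `items` when `type` is `array`; whether the server enforces that pairing is not visible in this diff.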
@@ -0,0 +1,21 @@
# -*- coding: utf-8 -*-

# chat_completion_function_parameters_property_items.py

"""
This script is automatically generated for TaskingAI python client
Do not modify the file manually

Author: James Yao
Organization: TaskingAI
License: Apache 2.0
"""

from pydantic import BaseModel, Field


__all__ = ["ChatCompletionFunctionParametersPropertyItems"]


class ChatCompletionFunctionParametersPropertyItems(BaseModel):
    type: str = Field(..., pattern="^(string|number|integer|boolean)$")
22 changes: 22 additions & 0 deletions taskingai/client/models/entities/chat_completion_usage.py
@@ -0,0 +1,22 @@
# -*- coding: utf-8 -*-

# chat_completion_usage.py

"""
This script is automatically generated for TaskingAI python client
Do not modify the file manually

Author: James Yao
Organization: TaskingAI
License: Apache 2.0
"""

from pydantic import BaseModel, Field


__all__ = ["ChatCompletionUsage"]


class ChatCompletionUsage(BaseModel):
    input_tokens: int = Field(...)
    output_tokens: int = Field(...)
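
Review note: with `usage` now attached to every `ChatCompletion`, a caller can total token counts across several responses. The sketch below shows one way that might look, using a standard-library dataclass as a stand-in for `ChatCompletionUsage`. The field names come from the diff; the `Usage` class and the `total_usage` helper are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Usage:
    """Stdlib stand-in for the two fields of ChatCompletionUsage (illustrative)."""

    input_tokens: int
    output_tokens: int

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
        )


def total_usage(usages: Iterable[Usage]) -> Usage:
    """Sum token counts over any number of per-response usage records."""
    total = Usage(0, 0)
    for usage in usages:
        total = total + usage
    return total


# Two made-up responses, used only to exercise the helper.
session = [Usage(input_tokens=12, output_tokens=30), Usage(input_tokens=8, output_tokens=5)]
print(total_usage(session))
```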
2 changes: 2 additions & 0 deletions taskingai/client/models/entities/retrieval_config.py
@@ -21,4 +21,6 @@
class RetrievalConfig(BaseModel):
    top_k: int = Field(3, ge=1, le=20)
    max_tokens: Optional[int] = Field(None, ge=1, le=8192)
    score_threshold: Optional[float] = Field(None, ge=0.0, le=1.0)
    method: RetrievalMethod = Field(...)
    function_description: Optional[str] = Field(None, min_length=0, max_length=1024)
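
Review note: `top_k`, `score_threshold`, and `max_tokens` all limit which chunks a query returns, but the diff does not show how the server combines them. The sketch below is one plausible reading, written with the standard library only: take the `top_k` best-scoring chunks, drop those below `score_threshold`, and stop once the `max_tokens` budget would be exceeded. The function `select_chunks`, the tuple layout, and the order of the three checks are all assumptions, not the TaskingAI implementation.

```python
from typing import List, Optional, Tuple

# (text, similarity score, token count); this layout is assumed for the example.
Chunk = Tuple[str, float, int]


def select_chunks(
    ranked: List[Chunk],
    top_k: int = 3,
    score_threshold: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> List[Chunk]:
    """Apply the three limits to chunks already sorted by descending score."""
    selected: List[Chunk] = []
    used_tokens = 0
    for text, score, tokens in ranked[:top_k]:
        # Skip chunks that are not similar enough to the query.
        if score_threshold is not None and score < score_threshold:
            continue
        # Stop as soon as the next chunk would exceed the token budget.
        if max_tokens is not None and used_tokens + tokens > max_tokens:
            break
        selected.append((text, score, tokens))
        used_tokens += tokens
    return selected


# Made-up scores and token counts, used only to exercise the helper.
ranked = [("a", 0.9, 120), ("b", 0.7, 150), ("c", 0.4, 100), ("d", 0.3, 50)]
print(select_chunks(ranked, top_k=10, score_threshold=0.5))
print(select_chunks(ranked, top_k=10, max_tokens=300))
```

The two calls mirror the updated notebook queries, which pass `top_k=10` together with either `score_threshold=0.5` or `max_tokens=300`.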
21 changes: 21 additions & 0 deletions taskingai/client/models/entities/text_embedding_usage.py
@@ -0,0 +1,21 @@
# -*- coding: utf-8 -*-

# text_embedding_usage.py

"""
This script is automatically generated for TaskingAI python client
Do not modify the file manually

Author: James Yao
Organization: TaskingAI
License: Apache 2.0
"""

from pydantic import BaseModel, Field


__all__ = ["TextEmbeddingUsage"]


class TextEmbeddingUsage(BaseModel):
    input_tokens: int = Field(...)
5 changes: 3 additions & 2 deletions taskingai/client/models/entities/text_splitter.py
@@ -12,13 +12,14 @@
"""

from pydantic import BaseModel, Field
-from typing import Optional
+from typing import Optional, List
from .text_splitter_type import TextSplitterType

__all__ = ["TextSplitter"]


 class TextSplitter(BaseModel):
-    type: TextSplitterType = Field(...)
+    type: TextSplitterType = Field("token")
     chunk_size: Optional[int] = Field(None, ge=50, le=1000)
     chunk_overlap: Optional[int] = Field(None, ge=0, le=200)
+    separators: Optional[List[str]] = Field(None, min_length=1, max_length=16)
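
Review note: the new `separators` field goes with the `separator` splitter type, which the notebook uses with `"separators": ["\n\n"]` and `chunk_size=200`. The actual splitting happens server-side and is not part of this diff, so the sketch below only illustrates the general idea: cut the text at each separator, then pack neighbouring pieces into chunks up to a size limit. The function `split_by_separators` is hypothetical, it counts whitespace-delimited words as a rough stand-in for tokens, and it ignores `chunk_overlap`.

```python
from typing import List


def split_by_separators(text: str, separators: List[str], chunk_size: int) -> List[str]:
    """Split on each separator, then pack pieces into chunks of at most
    chunk_size words. A single piece longer than chunk_size is kept whole."""
    pieces = [text]
    for sep in separators:
        pieces = [part for piece in pieces for part in piece.split(sep)]
    pieces = [piece.strip() for piece in pieces if piece.strip()]

    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for piece in pieces:
        n_words = len(piece.split())
        # Flush the running chunk when adding this piece would exceed the limit.
        if current and current_len + n_words > chunk_size:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(piece)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


# Made-up text with two paragraph breaks, used only to exercise the helper.
sample = "a b c\n\nd e\n\nf g h i"
print(split_by_separators(sample, ["\n\n"], chunk_size=5))
```

Packing whole pieces keeps paragraph boundaries intact, which is presumably why the Michael Jordan record in the notebook now embeds `\n\n` markers in its content.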
1 change: 1 addition & 0 deletions taskingai/client/models/entities/text_splitter_type.py
@@ -18,3 +18,4 @@

 class TextSplitterType(str, Enum):
     TOKEN = "token"
+    SEPARATOR = "separator"