forked from dcSpark/shinkai-node

tech_specs
Technical requirements for the Shinkai project:
**Agents:**
1. Each profile in a Shinkai node can have one or more agents.
2. Agents are made up of a combination of:
- A user-specified name (which becomes the agent’s sub-identity under the node).
- An external HTTP API connection to an LLM (initially targeting OpenAI API and using LocalAI for running local models).
- Permissions specifying which toolkits/which storage buckets the agent has access to.
- Permissions specifying which sub-identities have the ability to message the agent.
3. Agents support both local LLMs and 3rd-party LLMs (such as GPT-4).
4. Agents should provide a principled framework for scheduling execution & LLM inferencing.
5. Agents are designed to be flexible and efficient, with scalability as more responsibilities are placed upon them over time.
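The agent composition above can be sketched as a plain data structure. This is an illustrative Python sketch only, not the node's actual types; every name here (`Agent`, `AgentPermissions`, `can_receive_from`, the example values) is a hypothetical assumption.

```python
from dataclasses import dataclass


@dataclass
class AgentPermissions:
    toolkits: set          # toolkits the agent may invoke
    storage_buckets: set   # storage buckets the agent may access
    allowed_senders: set   # sub-identities permitted to message the agent


@dataclass
class Agent:
    name: str              # becomes the agent's sub-identity under the node
    llm_api_url: str       # external HTTP API connection to an LLM
    llm_model: str         # a local model (via LocalAI) or a 3rd-party model
    permissions: AgentPermissions

    def can_receive_from(self, sender_subidentity: str) -> bool:
        # Only explicitly permitted sub-identities may message the agent.
        return sender_subidentity in self.permissions.allowed_senders


# Example: an agent backed by a LocalAI-style endpoint (hypothetical values)
agent = Agent(
    name="research_assistant",
    llm_api_url="http://localhost:8080/v1",
    llm_model="gpt-4",
    permissions=AgentPermissions(
        toolkits={"web_search"},
        storage_buckets={"papers"},
        allowed_senders={"alice_phone", "alice_laptop"},
    ),
)
```

The permission check is kept on the agent itself so the job engine (described below) can apply it uniformly before any message reaches a job.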
**Jobs:**
1. All actions taken by agents in a Shinkai node occur within jobs.
2. Each job is akin to a distinct conversation and/or a specific goal/task that the user wants the agent to perform.
**Internals Of A Job:**
1. A job is composed of a:
- Unique Job Id
- Parent agent Id (identity of the parent agent)
- A conversation inbox (for storing messages to the agent from the user and messages from the agent)
- A step history (an ordered list of all messages submitted to the LLM which triggered a step to execute)
- A scope (which storage buckets and/or documents are accessible to the LLM via vector search and/or direct querying based on bucket name/key)
2. By default, jobs have the same scope (permissions) as the agent for accessing data in the storage database.
3. A user can specify a scope when creating a job (by selecting a set of buckets and/or documents), which limits access and should improve the accuracy of results.
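The job internals and the default-scope rule can be sketched as follows. This is a minimal Python illustration under assumed names (`Job`, `JobScope`, `effective_scope`); the real node's schema may differ. A user-supplied scope is modeled as an intersection, so it can only narrow the agent's scope, never widen it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class JobScope:
    buckets: set    # storage buckets searchable via vector search
    documents: set  # documents directly queryable by name/key


@dataclass
class Job:
    job_id: str
    parent_agent_id: str
    conversation_inbox: list = field(default_factory=list)  # user <-> agent messages
    step_history: list = field(default_factory=list)        # LLM inputs that triggered steps
    scope: Optional[JobScope] = None  # None means "inherit the agent's scope"


def effective_scope(job: Job, agent_scope: JobScope) -> JobScope:
    # By default a job has the same scope as its agent; a user-supplied
    # scope limits access to a subset of what the agent can already see.
    if job.scope is None:
        return agent_scope
    return JobScope(
        buckets=job.scope.buckets & agent_scope.buckets,
        documents=job.scope.documents & agent_scope.documents,
    )
```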
**Job Engine:**
1. All messages to agent sub-identities always go through the job engine.
2. The job engine receives messages from devices and sequentially processes them either into new jobs or into existing jobs.
3. If the sender identity does not have valid permissions to interact with the agent, the message is thrown away.
4. Messages which are sent to existing jobs must use the job message schema type, providing the job ID that the message is destined for.
5. If the job ID does not exist for this specific agent, the message is thrown away.
6. Whenever a message is thrown away, the job engine sends an error message back to the sender device.
7. In the medium to long-term, the job engine will process all messages sequentially and will route them into agents and jobs.
8. All jobs will be executed in parallel as new messages come in, with the LLM acting as the blocking part of the system (if using a local LLM).
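The routing rules above (permission check, job-ID lookup, throw-away-with-error) can be sketched in a few lines. This is a hedged illustration, not the engine's real interface: `route_message`, the dict shapes, and the `create_job`/`send_error` callbacks are all assumed names.

```python
def route_message(msg, agents, jobs, create_job, send_error):
    """Sequentially route one device message into a new or an existing job.

    `msg` is a dict with 'sender', 'agent', 'content', and an optional
    'job_id'. Permission failures and unknown job IDs cause the message
    to be thrown away, with an error sent back to the sender device.
    """
    agent = agents.get(msg["agent"])
    if agent is None or msg["sender"] not in agent["allowed_senders"]:
        # Sender lacks valid permissions: message thrown away.
        send_error(msg["sender"], "not permitted to message this agent")
        return None

    job_id = msg.get("job_id")
    if job_id is None:
        # No job ID supplied: treat as a job-creation message for this agent.
        return create_job(msg["agent"])

    job = jobs.get(job_id)
    if job is None or job["agent"] != msg["agent"]:
        # Job ID does not exist under this agent: message thrown away.
        send_error(msg["sender"], "unknown job " + job_id)
        return None

    job["inbox"].append(msg["content"])
    return job_id
```

Processing stays sequential at this routing layer; the jobs themselves can then execute in parallel, with the LLM as the blocking component when a local model is used.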
---
**Job Creation:**
1. The system needs to be able to handle Job Creation schema type messages which signal the start of a new job.
2. Job creation should initiate from a device under the Shinkai profile with an agent sub-identity message.
3. The job engine must generate a unique, non-overlapping job ID.
4. The new job is initialized under the correct agent, and its step history is seeded with a default text template that explains the job's role and its available tools.
5. The system must support testing of different ways to format the tools so that the LLM can easily process them.
6. The job engine returns the job ID back to the device.
7. The frontend app should send two messages: one for initializing the job and one for submitting the first message into the job. The second message is sent only after the job ID is returned by the job engine.
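A minimal sketch of the creation flow, under assumed names (`create_job`, the dict-based job store, the template wording). The tool-listing format is deliberately simple; per requirement 5 above, the exact formatting is expected to be iterated on so the LLM can process it easily.

```python
import uuid


def create_job(agent_id, jobs, tools):
    """Create a new job under `agent_id` and return its ID to the device.

    `tools` maps tool names to short descriptions; `jobs` is the job store.
    """
    # A UUID gives a unique, non-overlapping job ID.
    job_id = "job_" + uuid.uuid4().hex

    # Default text template: explains the job's role and its available
    # tools. (Wording and tool formatting here are placeholder choices.)
    tool_listing = "\n".join(
        "- " + name + ": " + desc for name, desc in sorted(tools.items()))
    system_prompt = (
        "You are an agent executing a job on behalf of the user.\n"
        "Available tools:\n" + tool_listing)

    jobs[job_id] = {
        "agent": agent_id,
        "step_history": [system_prompt],
        "inbox": [],
    }
    return job_id  # sent back to the device, which then submits message two
```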
**Job Step Execution:**
1. The system needs to handle state transitions, called steps, which are triggered when a new message is submitted to a job.
2. The job can be archived or deleted by a human user.
3. Each step is split into two phases: decision and execution.
4. The decision phase begins running with the existing step history as context when a new message is supplied to the job.
5. The system should support appending some pieces of data into the step history before executing new steps (for instance, the latest date/time).
6. The LLM should be queried again if the output does not fit the expected structure, until a valid output is provided.
7. Premessages must be validated to fit the expected schema and to reference only tools that are installed.
8. During the execution phase, the Premessages are converted into Messages.
9. The system should call all the tools installed in the Shinkai node and fill out the Premessage contents with the results, thereby converting the Premessage into a Message.
10. The system should allow for Premessages to be directed to "Self", "User", or a specific external identity.
11. For Premessages directed to "Self", the contents are wrapped within the Job Message schema, with the job ID automatically filled in by the node. Premessages sent to "User" are saved directly to the job’s conversation inbox and require no extra wrapping.
12. All the resulting messages are sent immediately to the receiver identity specified on the message, or scheduled for the future (if the message was tagged with a scheduled date/time).
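The two-phase step above can be sketched end to end. This is an illustrative Python sketch under assumed shapes: `query_llm` stands in for the HTTP call to the agent's LLM, and a Premessage is modeled as a dict with `recipient`, `content`, and an optional `tool`. Scheduling of future-dated messages is not modeled.

```python
def run_step(job, new_message, query_llm, installed_tools, max_retries=3):
    """Run one job step: a decision phase, then an execution phase."""
    # Extra data (e.g. the latest date/time) could be appended to the
    # context here, before the step executes.
    context = job["step_history"] + [new_message]

    # Decision phase: re-query the LLM until the output fits the expected
    # schema and only references tools that are actually installed.
    for _ in range(max_retries):
        premessages = query_llm(context)
        if all(
            p.get("recipient")
            and (p.get("tool") is None or p["tool"] in installed_tools)
            for p in premessages
        ):
            break
    else:
        raise ValueError("LLM never produced a valid Premessage list")

    # Execution phase: call the tools and fill out the contents, converting
    # each Premessage into a Message.
    outgoing = []
    for pre in premessages:
        content = pre["content"]
        if pre.get("tool"):
            content = installed_tools[pre["tool"]](content)
        if pre["recipient"] == "Self":
            # Wrapped in the Job Message schema; job ID filled in by the node.
            outgoing.append({"type": "job_message",
                             "job_id": job["job_id"],
                             "content": content})
        elif pre["recipient"] == "User":
            # Saved directly to the job's conversation inbox, no wrapping.
            job["inbox"].append(content)
        else:
            # External identity: sent immediately to the receiver.
            outgoing.append({"type": "direct",
                             "to": pre["recipient"],
                             "content": content})
    job["step_history"].append(new_message)
    return outgoing
```

Bounding the decision-phase retries is a design choice for this sketch; the requirement only says to re-query until a valid output is produced, but an unbounded loop against a misbehaving model would block the step indefinitely.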
---
Technical requirements derived from the User Experience section:
1. **User Conversations:**
- The system should support creation of a single conversation per global identity.
- The system should support creation of a single conversation per profile sub-identity, enabling end-to-end encrypted private conversations between users.
- The system should support creation of an infinite number of conversations between a user and any of their agents.
2. **Job Creation:**
- The system should allow users to create a new job by starting a new conversation with a specific agent through the Shinkai app.
3. **User Interface:**
- The system should feature two main tabs in the user interface: a “People” tab and an “Agents” tab.
- The “People” tab should display Shinkai identities added to the profile’s list of contacts, stored in a bucket.
- The “Agents” tab should list the user's agents. When an agent is clicked, it should display a list of all existing jobs under that agent.
- If a specific job is clicked, the system should display the job conversation inbox within a typical chat user interface.
- The chat UI for a job should be similar to the chat UI when talking with other users, at least in its initial versions.
4. **Functionality Extensions:**
- The system should be designed to allow for extensions of functionality over time, likely including starting new Jobs from a conversation with a person.
- The first major use case extension should include the ability for users to upload/submit links to text files/documents, save them in the storage DB, generate embeddings, and set the scope of the job to include the document(s).
5. **Interaction with Documents:**
- The system should allow users to ask their Agent questions or request it to perform tasks based on the content of a document.
- Future plans should include the integration of a document/embedding generation model into the Shinkai node.