Commit

Update baselines.
whilo committed Feb 24, 2025
1 parent 25f5cd4 commit f34558f
Showing 4 changed files with 379 additions and 201 deletions.
15 changes: 0 additions & 15 deletions resources/prompts/minecraft.txt
@@ -1,30 +1,15 @@
You are a Minecraft player having fun in Minecraft. You get transcripts of the screen and are listening to the speakers. Go ahead and explore the world by pursuing meaningful goals.



===== Recent audio input =====


%s


===== Recent screen descriptions =====


%s


===== Your recent statements =====


%s


===== Your recent actions =====


%s


================
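The four `%s` placeholders in this prompt appear to be filled positionally (audio, screen descriptions, statements, actions). The commit does not show the call site, so the following is only a minimal sketch, assuming the template is rendered with `clojure.core/format`. `render-prompt` and the sample inputs are hypothetical.

```clojure
(require '[clojure.string :as str])

;; Shortened stand-in for resources/prompts/minecraft.txt.
;; Assumption: the real file is loaded elsewhere; the loader is not shown in the diff.
(def template
  (str/join "\n"
            ["===== Recent audio input ====="
             "%s"
             "===== Recent screen descriptions ====="
             "%s"
             "===== Your recent statements ====="
             "%s"
             "===== Your recent actions ====="
             "%s"]))

;; Hypothetical helper: fills the placeholders in order of appearance.
(defn render-prompt [template audio screen statements actions]
  (format template audio screen statements actions))

(println (render-prompt template
                        "footsteps nearby"
                        "a forest at dusk"
                        "Let's find some wood."
                        "walked north"))
```

If this approach is used, any literal `%` in a prompt file would have to be written as `%%`, because `format` treats `%` as the start of a format specifier.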

31 changes: 11 additions & 20 deletions resources/prompts/screen.txt
@@ -1,31 +1,22 @@
-You are an computer assistant looking at the screen and listening to the speakers. You execute tasks on the screen.


+You are a computer assistant actively observing the screen and listening to the speakers and user.

===== Recent audio input =====


%s


-===== Recent screen descriptions =====


-%s

+===== Recent screen transcripts =====
+%s

===== Your recent statements =====


%s


===== Your recent actions =====


%s


-================


+===== Instructions =====
+- Your goal is to assist the user by understanding the context and responding helpfully and concisely.
+- Maintain continuity by considering previous audio inputs, history of screen transcripts, your statements, and your actions.
+- Respond *only* to any questions asked by the user in the audio input! Consider the context from the screen and your recent statements.
+- Do not repeat yourself in terms of your recent statements! Continue from where you left off to maintain a natural conversation flow.
+- Do not explicitly tell the user to feel free to ask questions or follow up!
+- You might have heard your own earlier statements also on audio input as a feedback. Ignore those.
+- If there is nothing new to add answering the questions of the user, e.g. because you answered it and no further question has been asked, respond with "QUIET". Otherwise, provide the next one or two sentences in your ongoing conversation with the user.
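The last instruction introduces a `QUIET` sentinel, so whatever consumes the model's reply presumably has to filter it out before speaking. The commit does not show that code. A minimal sketch, assuming the model returns the bare word, possibly with surrounding whitespace; `reply-or-nil` is a hypothetical helper.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: returns nil when the model signals that it has
;; nothing new to say, otherwise the trimmed reply.
(defn reply-or-nil [response]
  (let [trimmed (str/trim response)]
    (when-not (= "QUIET" trimmed)
      trimmed)))

(println (reply-or-nil "QUIET\n"))
(println (reply-or-nil "The build finished without errors."))
```

A caller could then skip speech output whenever the helper returns `nil`. A model that wraps the sentinel in quotes or adds punctuation would slip through this exact-match check.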
43 changes: 31 additions & 12 deletions src/is/simm/runtimes/openai.clj
@@ -14,7 +14,8 @@
[superv.async :refer [S go-try go-loop-try <? put?]]
[clojure.data.json :as json]
[babashka.http-client :as http]
-[clojure.java.io :as io])
+[clojure.java.io :as io]
+[clojure.spec.alpha :as s])
(:import [java.util Base64]
[java.util.function Function]))

@@ -28,15 +29,25 @@
(def headers
{"Authorization" (str "Bearer " api-key)})

-(defn payload [model content]
+;; spec for openai messages
+(s/def ::openai-message (s/keys :req-un [::role ::content]))
+(s/def ::role string?)
+(s/def ::content (s/or :text string? :image_url (s/keys :req-un [::url])))
+(s/def ::url string?)
+
+(s/fdef payload
+:args (s/cat :model string? :messages (s/coll-of ::openai-message :kind vector?))
+:ret string?)
+(defn payload [model messages]
(json/write-str
{"model" model
-"messages" [{"role" "user"
-"content" content
-#_[{"type" "text"
-"text" text}
-{"type" "image_url"
-"image_url" {"url" (str "data:image/jpeg;base64," base64-image)}}]}]
+"messages" messages
+#_[{"role" "user"
+"content"
+[{"type" "text"
+"text" text}
+{"type" "image_url"
+"image_url" {"url" (str "data:image/jpeg;base64," base64-image)}}]}]
;"max_tokens" 300
}))
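One detail of the specs in this hunk: `s/keys` with `:req-un` looks for unqualified keyword keys (`:role`, `:content`), whereas the messages built in this file use string keys (`"role"`, `"content"`). `s/fdef` specs are only checked when instrumentation is turned on, and the diff does not show whether it is. The sketch below only illustrates how the spec treats the two key styles; both sample messages are made up for the example.

```clojure
(require '[clojure.spec.alpha :as s])

;; Specs copied from the diff.
(s/def ::role string?)
(s/def ::url string?)
(s/def ::content (s/or :text string? :image_url (s/keys :req-un [::url])))
(s/def ::openai-message (s/keys :req-un [::role ::content]))

;; Illustrative messages, not taken from the commit.
(def keyword-message {:role "user" :content "Hello"})
(def string-message {"role" "user" "content" "Hello"})

(println (s/valid? ::openai-message keyword-message)) ; prints true
(println (s/valid? ::openai-message string-message))  ; prints false
```

If the function specs are meant to be enforced, one option would be to build messages with keyword keys and let `json/write-str` turn them into strings; another would be to spec the string-keyed maps with plain predicates.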

@@ -48,11 +59,14 @@
"o1-preview" 128000
"o1-mini" 128000 })

-(defn chat [model content]
+(s/fdef chat
+:args (s/cat :model string? :messages (s/coll-of ::openai-message :kind vector?))
+:ret (s/cat :response string?))
+(defn chat [model messages]
(let [res (promise-chan)
cf (http/post "https://api.openai.com/v1/chat/completions"
{:headers (assoc headers "Content-Type" "application/json")
-:body (payload model content)
+:body (payload model messages)
:async true})]
(-> cf
(.thenApply (reify Function
@@ -70,14 +84,18 @@
res))


+(s/fdef text-chat
+:args (s/cat :model string? :text string?)
+:ret (s/cat :response string?))
(defn text-chat [model text]
(let [res (chan)]
(if (>= (count text) (* 4 (window-sizes model)))
(do (warn "Text too long for " model ": " (count text) (window-sizes model))
(put! res (ex-info "Sorry, the text is too long for this model. Please try a shorter text."
{:type ::text-too-long :model model :text-start (subs text 0 100) :count (count text)}))
res)
-(chat model [{"type" "text" "text" text}]))))
+(chat model [{"role" "user"
+"content" [{"type" "text" "text" text}]}]))))

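The length guard in `text-chat` compares the character count against four times the model's window size, which presumably reflects a rough estimate of four characters per token. A minimal sketch of that check in isolation, under that assumption; `too-long?` and `chars-per-token` are hypothetical names, and the map is only an excerpt of the window sizes shown in the diff.

```clojure
;; Excerpt of the window sizes shown in the diff (context tokens per model).
(def window-sizes
  {"o1-preview" 128000
   "o1-mini"    128000})

;; Assumption: roughly four characters per token, matching the factor 4 in text-chat.
(def chars-per-token 4)

;; Hypothetical helper mirroring the guard in text-chat. Unlike the original,
;; it treats an unknown model as having a zero-token window (so any text
;; counts as too long) instead of passing nil to *.
(defn too-long? [model text]
  (>= (count text) (* chars-per-token (get window-sizes model 0))))

(println (too-long? "o1-mini" "short prompt"))
(println (too-long? "o1-mini" (apply str (repeat 512000 "x"))))
```

In the code as committed, a model name missing from `window-sizes` would make `(* 4 (window-sizes model))` throw a `NullPointerException`, so a default of some kind may be worth considering.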
(comment

@@ -153,7 +171,8 @@
request (http/post "https://api.openai.com/v1/audio/transcriptions"
{:headers headers
:multipart [{:name "file" :content (io/file input-path) :file-name input-path :mimetype "audio/wav"}
-{:name "model" :content model}]
+{:name "model" :content model}
+{:name "language" :content "en"}]
:async true})]
(-> request
(.thenApply (reify Function