Improve HTML with hiccup and Markdown parser.
whilo committed Apr 13, 2024
1 parent eb15692 commit 4a36a96
Showing 9 changed files with 87 additions and 66 deletions.
3 changes: 2 additions & 1 deletion deps.edn
@@ -6,7 +6,8 @@
io.replikativ/kabel {:mvn/version "0.2.2"}
http-kit/http-kit {:mvn/version "2.7.0"}
metosin/reitit {:mvn/version "0.7.0-alpha7"}
-compojure/compojure {:mvn/version "1.7.1"}
+io.github.nextjournal/markdown {:mvn/version "0.5.148"}
+hiccup/hiccup {:mvn/version "2.0.0-RC3"}
ring/ring-jetty-adapter {:mvn/version "1.12.0"}
etaoin/etaoin {:mvn/version "1.0.40"}
missionary/missionary {:mvn/version "b.34"}
4 changes: 2 additions & 2 deletions resources/default_schema.edn
@@ -57,7 +57,7 @@
{:db/ident :message/text
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one}
-{:db/ident :message/tag
+{:db/ident :message/link
:db/valueType :db.type/string
:db/cardinality :db.cardinality/many}

@@ -80,7 +80,7 @@
{:db/ident :conversation/summary
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one}
-{:db/ident :conversation/tag
+{:db/ident :conversation/link
:db/valueType :db.type/string
:db/cardinality :db.cardinality/many}
{:db/ident :conversation/message
8 changes: 4 additions & 4 deletions resources/prompts/assistance.txt
@@ -3,12 +3,12 @@ You are simmie_beta, a chat bot. Answer to the conversation with strong priority
You have access to the following external function calls that you must use in each of these cases:
If the user asks for a piece of information that is not available in context conduct a web search to get more information by answering with WEBSEARCH('your search terms'). If you think can derive precise search terms feel free to do preemptive web searches if it will advance the conversation.
If the user wants to imagine or picture an idea, answer with IMAGEGEN('your prompt').
-If the user wants to add an issue/todo, answer with ADD_ISSUE('issue title') and it will be added to the chat as well.
-If the user wants to remove an issue/todo, answer with REMOVE_ISSUE('issue title') and it will be added to the chat as well.
+If the user wants to add an issue/todo, answer with ADD_ISSUE('issue title') for each issue and it will be added to the chat as well.
+If the user wants to remove an issue/todo, answer with REMOVE_ISSUE('issue title') for each issue and it will be added to the chat as well.
If the user wants to see or list the issues/todos, answer LIST_ISSUES() to retrieve all of them from your database.
If the user wants to get the notes from the chat sent, answer SEND_NOTES() to zip them and send them to the chat.
-If the suer wants to retrieve a note, answer RETRIEVE_NOTE('note title') to send it to the chat.
-If the suer wants to list the notes, answer LIST_NOTES() to send it to the chat.
+If the user wants to retrieve a note, answer RETRIEVE_NOTE('note title') to send it to the chat.
+If the user wants to list the notes, answer LIST_NOTES() to point the user to the chat's website.
If there seems no reply necessary right now to the last message, add 'QUIET' to the message.
If the user asks about your abilities, explain these abilities intuitively in context of the conversation.
When one of the users comes back after more than a few hours, greet them friendly and add DAILY to the response at the end.
2 changes: 1 addition & 1 deletion resources/prompts/note.txt
@@ -2,7 +2,7 @@ Title: %s
Body:%s


-Given the note above on the subect, update it in light of the following conversation summary and return the new note body only. References to entities (events, places, people, organisations, businesses, academic topics, everyday topics, etc.) are syntactically expressed with double brackets in RoamResearch or logseq syntax, e.g. [[some topic][This topic is]] or [[Wikipedia]]. Make sure you retain these references. Be brief and succinct while keeping important facts, focus on the topic of the title *only* and rely on the references for the rest of the context to be provided in these notes. Use nested Emacs org-mode lists with '*' nesting for the note. If you do not want to update the note, write SKIP.
+Given the note above on the subect, update it in light of the following conversation summary and return the new note body only. References to entities (events, places, people, organisations, businesses, academic topics, everyday topics, etc.) are syntactically expressed as Wikipedia style internal links with double brackets, e.g. [[peter][my friend peter]] or [[peter]]. Make sure you retain these references. Be brief and succinct while keeping important facts, focus on the topic of the title *only* and rely on the references for the rest of the context to be provided in these notes. Use Markdown with LaTeX support for formulas. Make full use of Markdown to lay out the note well, but prefer nested lists of bullet points to paragraphs. If you do not want to update the note, write SKIP.

%s

6 changes: 3 additions & 3 deletions src/ie/simm/db.clj
@@ -42,12 +42,12 @@
(map (fn [[d f l n t]] (str d " " f " " l " (" n "): " (str/replace t #"\n" " "))))
(str/join "\n")))

-(defn extract-tags [text]
+(defn extract-links [text]
(vec (distinct (map second (re-seq #"\[\[([^\[\]]+)\](\[.+\])?\]" text)))))

(defn msg->txs [message]
(let [{:keys [message_id from chat date text]} message
-tags (when text (extract-tags text))]
+tags (when text (extract-links text))]
(vec
(concat
(when from
@@ -84,7 +84,7 @@
(when text
{:message/text text})
(when (seq tags)
-{:message/tag tags}))]))))
+{:message/link tags}))]))))

(def window-size 10)

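The renamed `extract-links` keeps the regular expression it had as `extract-tags`, so its behaviour on sample input can be sketched as below. The function body is copied from the hunk above; the sample strings are made up for illustration and are not part of the commit. The last call shows what appears to be a side effect of the greedy `.+` in the optional label group: a labelled link followed by another link on the same line yields only the first target.

```clojure
;; Copy of extract-links from src/ie/simm/db.clj, exercised on invented sample strings.
(defn extract-links [text]
  (vec (distinct (map second (re-seq #"\[\[([^\[\]]+)\](\[.+\])?\]" text)))))

;; Bare links and repeats: each target is returned once, in order of first use.
(extract-links "Ask [[Wikipedia]] or [[peter]], then [[Wikipedia]] again.")
;; => ["Wikipedia" "peter"]

;; A labelled link yields only its target, not the label.
(extract-links "Lunch with [[peter][my friend peter]].")
;; => ["peter"]

;; Caveat: the greedy `.+` lets the label group run on to the last "]]" on the
;; line, so the later [[Wikipedia]] link is swallowed into the first match.
(extract-links "[[peter][my friend peter]] reads [[Wikipedia]]")
;; => ["peter"]
```

If that last case matters in practice, a non-greedy label group or one that excludes brackets would be one possible adjustment; the commit itself leaves the pattern unchanged.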
83 changes: 60 additions & 23 deletions src/ie/simm/runtimes/assistance.clj
@@ -8,15 +8,18 @@
[ie.simm.languages.browser :refer [extract-body]]
[ie.simm.languages.chat :refer [send-text! send-photo! send-document!]]
[ie.simm.prompts :as pr]
-[ie.simm.db :refer [ensure-conn conversation extract-tags msg->txs window-size]]
+[ie.simm.db :refer [ensure-conn conversation extract-links msg->txs window-size]]
[superv.async :refer [<?? go-try S go-loop-try <? >? put? go-for] :as sasync]
[clojure.core.async :refer [chan pub sub mult tap timeout] :as async]
[taoensso.timbre :refer [debug info warn error]]
[datahike.api :as d]
[hasch.core :refer [uuid]]
[clojure.string :as str]
[clojure.java.io :as io]
-[etaoin.api :as e])
+[hiccup.core :as h]
+[nextjournal.markdown :as md]
+[nextjournal.markdown.transform :as md.transform]
+[nextjournal.markdown.parser :as md.parser])
(:import [java.util.zip ZipEntry ZipOutputStream]))

(defn summarize [S conn conv chat]
@@ -34,7 +37,7 @@
(sort-by first)
(take-last window-size)
(map second))
-note-titles (extract-tags summarization)
+note-titles (extract-links summarization)
_ (debug "=========================== CREATING NOTES ===============================")
new-notes
(<? S (async/into []
@@ -43,7 +46,7 @@
prompt (format pr/note note body summarization #_conv)
new-body (<? S (reasoner-llm prompt))]
:when (not (.contains new-body "SKIP"))
-:let [new-refs (extract-tags new-body)
+:let [new-refs (extract-links new-body)
ref-ids (mapv first (d/q '[:find ?n
:in $ [?t ...]
:where
@@ -56,23 +59,23 @@
:note/summary -1})))]
(debug "=========================== STORING NOTES ===============================")
(debug "Summarization:" summarization)
-(debug "summarization tags" (extract-tags summarization))
+(debug "summarization links" (extract-links summarization))
(d/transact conn (concat
[{:db/id -1
:conversation/summary summarization
-:conversation/tag (extract-tags summarization)
+:conversation/link (extract-links summarization)
:conversation/message messages}]
new-notes))
;; keep exports up to date
(doseq [[t b] (map (fn [{:keys [note/title note/body]}] [title body]) new-notes)]
(debug "writing note" t)
;; write to org file in notes/chat-id/title.org
-(let [f (io/file (str "notes/" (:id chat) "/" t ".org"))]
+(let [f (io/file (str "notes/" (:id chat) "/" t ".md"))]
(io/make-parents f)
(with-open [w (io/writer f)]
(binding [*out* w]
(println b)))))
-(extract-tags summarization))))
+(extract-links summarization))))

(defn zip-notes [chat-id]
(let [zip-file (io/file (str "notes/" chat-id ".zip"))
@@ -94,6 +97,8 @@

(def base-url "https://ec2-34-218-223-7.us-west-2.compute.amazonaws.com")



(defn assistance
"This interpreter can derive facts and effects through a relational database."
[[S peer [in out]]]
@@ -119,22 +124,54 @@
po (pub pub-out :type)

;; TODO figure out prefix, here conflict if notes/
-routes [["/download/notes/:chat-id/notes.zip"
+routes [["/download/chat/:chat-id/notes.zip"
{:get (fn [{{:keys [chat-id]} :path-params}]
{:status 200 :body (zip-notes chat-id)})}]
["/notes/:chat-id"
{:get (fn [{{:keys [chat-id]} :path-params}]
;; list the notes in basic HTML
-{:status 200
-:body (str "<html><body><h1>Notes</h1><a href=\"/download/notes/" chat-id "/notes.zip\">Download</a><ul>"
-(str/join "" (map (fn [f] (str "<li><a href=\"/" f "\">" f "</a></li>"))
-(rest (file-seq (io/file (str "notes/" chat-id))))))
-"</ul></body></html>")})}]
+(let [conn (ensure-conn peer chat-id)]
+{:status 200
+:body (h/html
+[:html
+[:head
+[:meta {:charset "utf-8"}]
+[:meta {:name "viewport" :content "width=device-width, initial-scale=1"}]
+[:link {:href "https://cdn.jsdelivr.net/npm/katex@0.13.13/dist/katex.min.css" :rel "stylesheet" :type "text/css"}]
+[:link {:href "https://fonts.bunny.net" :rel "preconnect"}]
+[:link {:href "https://fonts.bunny.net/css?family=fira-mono:400,700%7Cfira-sans:400,400i,500,500i,700,700i%7Cfira-sans-condensed:700,700i%7Cpt-serif:400,400i,700,700i" :rel "stylesheet" :type "text/css"}]
+[:title "Notes"]
+[:link {:rel "stylesheet" :href "https://cdnjs.cloudflare.com/ajax/libs/bulma/0.9.3/css/bulma.min.css"}]]
+[:body
+[:div {:class "flex"}
+[:h1 "Notes"]
+[:a {:href (str "/download/chat/" chat-id "/notes.zip")} "Download"]
+[:ul (map (fn [[f]] [:li [:a {:href (str "/notes/" chat-id "/" f)} f]])
+(d/q '[:find ?t :where [?n :note/title ?t]] @conn))]]]])}))}]
;; access each individual node link as referenced above
["/notes/:chat-id/:note"
{:get (fn [{{:keys [chat-id note]} :path-params}]
-{:status 200
-:body (slurp (io/file (str "notes/" chat-id "/" note)))})}]]]
+(let [conn (ensure-conn peer chat-id)
+body (:note/body (d/entity @conn [:note/title note]))]
+{:status 200
+:body
+(h/html
+[:html
+[:head
+[:meta {:charset "utf-8"}]
+[:meta {:name "viewport" :content "width=device-width, initial-scale=1"}]
+[:link {:href "https://cdn.jsdelivr.net/npm/katex@0.13.13/dist/katex.min.css" :rel "stylesheet" :type "text/css"}]
+[:link {:href "https://fonts.bunny.net" :rel "preconnect"}]
+[:link {:href "https://fonts.bunny.net/css?family=fira-mono:400,700%7Cfira-sans:400,400i,500,500i,700,700i%7Cfira-sans-condensed:700,700i%7Cpt-serif:400,400i,700,700i" :rel "stylesheet" :type "text/css"}]
+[:title note]
+[:link {:rel "stylesheet" :href "https://cdnjs.cloudflare.com/ajax/libs/bulma/0.9.3/css/bulma.min.css"}]]
+[:body
+[:div {:class "flex"}
+[:h1 note]
+(if (string? body)
+(md.transform/->hiccup (md/parse (update md.parser/empty-doc :text-tokenizers concat [md.parser/internal-link-tokenizer md.parser/hashtag-tokenizer])
+body))
+"Note does not exist yet.")]]])}))}]]]
(swap! peer assoc-in [:http :routes :assistance] routes)
;; we will continuously interpret the messages
(go-loop-try S [m (<? S msg-ch)]
@@ -172,20 +209,20 @@
(summarize S conn conv chat))


-;; 3. retrieve summaries for active tags
-all-tags (d/q '[:find [?t ...] :where [_ :conversation/tag ?t]] @conn)
-relevant (<? S (cheap-llm (format "You are given the following tags in brackets [[some tag]]:\n\n%s\n\nList the most relevant tags for the following conversation with descending priority.\n\n%s"
-(str/join ", " (map #(str "[[" % "]]") all-tags))
+;; 3. retrieve summaries for active links
+all-links (d/q '[:find [?t ...] :where [_ :conversation/link ?t]] @conn)
+relevant (<? S (cheap-llm (format "You are given the following links in Wikipedia style brackets [[some entity]]:\n\n%s\n\nList the most relevant links for the following conversation with descending priority.\n\n%s"
+(str/join ", " (map #(str "[[" % "]]") all-links))
conv)))
-active-tags (concat (take 3 (extract-tags relevant)) [firstname])
+active-links (concat (take 3 (extract-links relevant)) [firstname])

summaries (d/q '[:find ?t ?s
:in $ [?t ...]
:where
[?c :note/title ?t]
[?c :note/body ?s]]
-@conn (concat active-tags (extract-tags conv)))
-_ (debug "active tags" #_summaries active-tags)
+@conn (concat active-links (extract-links conv)))
+_ (debug "active links" #_summaries active-links)

;; 4. derive reply
assist-prompt (format pr/assistance
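
Step 3 in the last hunk joins the stored links into a bracketed list for the prompt, then keeps the first three links from the model's reply plus the user's first name. A minimal sketch of that data flow in plain Clojure, assuming a made-up link list, a made-up model reply standing in for the `cheap-llm` call, and a hypothetical first name:

```clojure
(require '[clojure.string :as str])

;; Same definition as in src/ie/simm/db.clj.
(defn extract-links [text]
  (vec (distinct (map second (re-seq #"\[\[([^\[\]]+)\](\[.+\])?\]" text)))))

;; Hypothetical result of the :conversation/link query.
(def all-links ["Dublin" "peter" "datahike" "hiking"])

;; The string interpolated into the prompt for the cheap model.
(def prompt-links (str/join ", " (map #(str "[[" % "]]") all-links)))
;; => "[[Dublin]], [[peter]], [[datahike]], [[hiking]]"

;; Hypothetical model reply, ordered by descending relevance.
(def relevant "1. [[datahike]]\n2. [[peter]]\n3. [[hiking]]\n4. [[Dublin]]")

;; Hypothetical first name of the user who sent the message.
(def firstname "Ada")

;; Top three links from the reply, plus the user's own note title.
(def active-links (concat (take 3 (extract-links relevant)) [firstname]))
;; => ("datahike" "peter" "hiking" "Ada")
```

Appending `firstname` means a note titled after the user, if one exists, is always among the notes looked up for the reply.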