another server, based on stream #1418

colinator · 2023-11-02T22:11:32Z

Looks like there are several attempts to make servers with this code, so here is another! This approach doesn't actually make a server; it simply factors the 'stream' example into components. That should support many ways of building servers and data encodings: JSON/HTTP, protobuf/gRPC, FlexBuffers, MessagePack, etc.

Happy to make changes and/or fold in with another server-ization branch.

ggerganov · 2023-11-03T08:28:40Z

Interesting! Will look further, but first I need to make some long-pending updates (#1422); after that I will come back to this example.

litongjava · 2023-11-03T10:50:36Z

I like it, but so far it only does the modularization. It would be nice to write a WebSocket server on top of it. I would like to do this work. Do you have plans to add a WebSocket server?

Also, I found that recording audio at a sample rate of 48 kHz and then converting it to 16 kHz gives better recognition. I would also like to add support for
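For illustration, the 48 kHz to 16 kHz conversion mentioned above could be sketched as a 3:1 decimation. This is a minimal sketch, not the method used in this thread: `decimate_average` is a hypothetical helper, and the block average is only a crude stand-in for a properly designed anti-aliasing filter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reduces the sample rate by an integer factor by
// averaging each group of `factor` consecutive samples into one output
// sample. For 48 kHz -> 16 kHz the factor is 3.
// Trailing samples that do not fill a whole group are dropped.
std::vector<float> decimate_average(const std::vector<float>& in, std::size_t factor) {
    assert(factor > 0);
    std::vector<float> out;
    out.reserve(in.size() / factor);
    for (std::size_t i = 0; i + factor <= in.size(); i += factor) {
        float sum = 0.0f;
        for (std::size_t k = 0; k < factor; ++k) {
            sum += in[i + k];
        }
        out.push_back(sum / static_cast<float>(factor));
    }
    return out;
}
```

Calling `decimate_average(samples_48k, 3)` on mono float samples would yield one third as many samples, nominally at 16 kHz. A real pipeline would more likely rely on a dedicated resampling library.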

colinator · 2023-11-03T13:51:17Z

@litongjava Well, if this approach gets integrated, then yes, please write a WebSocket server. I actually want something different: a ROS-like pub-sub node. We can all have the server we want, with the encoding we want.

codesoda · 2023-11-07T01:59:46Z

It'd be great if the server didn't "have to" capture the audio. Then this stream transcription approach could be incorporated into other apps with real-time audio/video data streams. The host app would convert to 16 kHz audio and keep pushing it to the stream server.
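One way to picture this push model is a small thread-safe buffer that the host app feeds and the transcription loop drains. The sketch below is an assumption for illustration only: `PushAudioBuffer`, `push`, and `drain` are hypothetical names, not part of this PR or of whisper.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical buffer: the host app pushes 16 kHz mono float samples,
// and the transcription side takes whatever has accumulated.
class PushAudioBuffer {
public:
    explicit PushAudioBuffer(std::size_t max_samples) : max_samples_(max_samples) {
        assert(max_samples > 0);
    }

    // Called by the host app whenever it has new audio.
    void push(const float* samples, std::size_t count) {
        std::lock_guard<std::mutex> lock(mutex_);
        buffer_.insert(buffer_.end(), samples, samples + count);
        // Keep only the most recent max_samples_ samples.
        while (buffer_.size() > max_samples_) {
            buffer_.pop_front();
        }
    }

    // Called by the transcription loop: removes and returns all buffered audio.
    std::vector<float> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<float> out(buffer_.begin(), buffer_.end());
        buffer_.clear();
        return out;
    }

private:
    std::size_t max_samples_;
    std::deque<float> buffer_;
    std::mutex mutex_;
};
```

With a buffer like this, the audio could come from a microphone, a file, or a network stream, and the transcription side would not need to know which.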

colinator · 2023-11-07T16:29:49Z

@codesoda Yes, me too. Hence the LocalSDLMicrophone class separation. I also want the final result to be encoded into some protocol other than JSON, hence the WhisperOutput/WhisperEncoder classes.
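The idea of separating the transcription result from its wire encoding can be sketched roughly as follows. This is not the actual WhisperOutput/WhisperEncoder interface from the PR; `Segment`, `SegmentEncoder`, `JsonSegmentEncoder`, and `TsvSegmentEncoder` are hypothetical names chosen for illustration, and the JSON escaping handles only quotes and backslashes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical transcription result: one timed piece of text.
struct Segment {
    std::int64_t t0_ms;
    std::int64_t t1_ms;
    std::string text;
};

// Hypothetical encoder interface: each protocol gets its own subclass.
class SegmentEncoder {
public:
    virtual ~SegmentEncoder() = default;
    virtual std::string encode(const Segment& seg) const = 0;
};

class JsonSegmentEncoder : public SegmentEncoder {
public:
    std::string encode(const Segment& seg) const override {
        assert(seg.t0_ms <= seg.t1_ms);
        return "{\"t0\":" + std::to_string(seg.t0_ms) +
               ",\"t1\":" + std::to_string(seg.t1_ms) +
               ",\"text\":\"" + escape(seg.text) + "\"}";
    }

private:
    // Minimal escaping: quotes and backslashes only, no control characters.
    static std::string escape(const std::string& s) {
        std::string out;
        for (char c : s) {
            if (c == '"' || c == '\\') {
                out += '\\';
            }
            out += c;
        }
        return out;
    }
};

class TsvSegmentEncoder : public SegmentEncoder {
public:
    std::string encode(const Segment& seg) const override {
        assert(seg.t0_ms <= seg.t1_ms);
        return std::to_string(seg.t0_ms) + "\t" +
               std::to_string(seg.t1_ms) + "\t" + seg.text;
    }
};
```

A server built on this would hold a `SegmentEncoder` reference and stay unaware of the concrete format, so swapping JSON for protobuf or MessagePack would only mean adding another subclass.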

litongjava · 2023-11-21T09:34:16Z

I have completed the WebSocket service. Because it turned out to be quite complex, I created a separate project for it, which can be found at https://github.com/litongjava/whisper-cpp-server.

After compiling, run the following command to start the server:

./cmake-build-debug/whisper_server_base_on_uwebsockets -m models/ggml-base.en.bin

Then, navigate to the web directory at https://github.com/litongjava/whisper-cpp-server/tree/main/web and open the index.html file.

However, my tests indicate that the speech recognition results are not very satisfactory.

colinator added 3 commits October 7, 2023 22:31

added examples/stream_components

ac5eaf3

Merge branch 'ggerganov:master' into master

a98d634

updated doc

95b5759

colinator added 3 commits November 12, 2023 17:10

Merge remote-tracking branch 'upstream/master'

2803017

updated make for stream_components

620ef99

updated cmake for stream_components

4770210

ggerganov mentioned this pull request Nov 16, 2023

Server example? #1369

Open

colinator added 2 commits December 2, 2023 18:07

Merge remote-tracking branch 'upstream/master'

650d960

add whisper_embd_enc

c58eb5c

colinator closed this by deleting the head repository Dec 7, 2023
