OpenASR/idiolect

idiolect

A general-purpose voice user interface for the IntelliJ Platform, inspired by Tavis Rudd. Possible users include people with visual impairments or repetitive strain injury (RSI). Originally developed as part of a JetBrains hackathon, it is now a community-supported project. For background information, check out this presentation.

See Also

Usage

To get started, press the Voice control button in the toolbar, then speak a command, e.g. "Hi, IDEA!" Idiolect supports a simple grammar. For a complete list of commands, please refer to the wiki. Click the button once more to deactivate.

Voice Commands

For a full list of the actions that can be activated by voice, ask Idiolect:

What can I say?

or

What can I say about "activate"?

There are a lot of actions, and some of them are not easy to say or remember. To make it easier, you can customise the phrases.

Some of the more useful commands are:

Navigation

  • activate project tool window (commit, database, debug, find, gradle, run, terminal...)
  • back/forward, down/up, left/right, page up/down, scroll up/down
  • next/previous word
  • method down/up
  • find, find in project
  • go to ... (class, bookmark, declaration, line, implementation...)
  • close all editors
  • next method
  • find usages
  • call hierarchy

Editing

  • create (editor config file, grpc request action, liquibase changelog, vue single file comp)
  • new class/dockerfile/element/file...
  • code cleanup
  • code completion
  • collapse block
  • extract class/function/method/etc
  • generate getter/setter/test method...
  • new class (or create new class "something")
  • rename (or rename to "something")
  • cut / copy / paste / delete
  • cut line end/backward
  • delete to (line/word) (start/end)
  • toggle column mode
  • reformat code
  • fix it
  • whoops

Debugging

  • debug
  • context debug
  • context run
  • coverage

Git Commands

  • git add/pull/merge/push/stash...
  • checkin files
  • annotate

Building

For Linux or macOS users:

git clone https://github.com/OpenASR/idiolect && cd idiolect && ./gradlew runIde

For Windows users:

git clone https://github.com/OpenASR/idiolect && cd idiolect && gradlew.bat runIde

Recognition works with most popular microphones (preferably 16 kHz, 16-bit). For best results, minimize background noise.
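To make the preferred format concrete, the sketch below describes 16 kHz, 16-bit audio with the standard javax.sound.sampled API and asks whether any installed mixer can capture in it. Mono, signed, little-endian PCM are assumptions (only the sample rate and bit depth are stated above), and MicFormatCheck is a hypothetical helper, not Idiolect's own capture code.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

// Hypothetical helper for illustration; not part of Idiolect.
class MicFormatCheck {
    /** 16 kHz, 16-bit PCM. Mono, signed and little-endian are assumptions. */
    static AudioFormat preferredFormat() {
        return new AudioFormat(16000f, 16, 1, true, false);
    }

    /** True if some installed mixer offers a capture line in the given format. */
    static boolean isCaptureSupported(AudioFormat format) {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        return AudioSystem.isLineSupported(info);
    }

    public static void main(String[] args) {
        AudioFormat format = preferredFormat();
        System.out.println(format + " supported: " + isCaptureSupported(format));
    }
}
```

If the check reports false, the microphone or driver may only expose other sample rates, in which case resampling would be needed before recognition.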

Contributing

Contributors who have IntelliJ IDEA installed can simply open the project. Otherwise, run the following command from the project's root directory:

./gradlew runIde -PluginDev

Architecture

Idiolect is implemented using the IntelliJ Platform SDK. For more information about the plugin architecture, please refer to the wiki page.

Integration with Idiolect

plugin.xml defines a number of <extensionPoint>s that allow other plugins to integrate with, extend, or customise the capabilities of Idiolect.

An example of this is provided in idiolect-azure which implements AsrProvider and adds its own settings under Tools/Idiolect.

AsrProvider

Listens for audio input, recognises speech to text and returns an NlpRequest with possible utterances. Does not resolve the intent.

Possible alternative implementations could:

  • integrate with Windows SAPI 5 Speech API
  • integrate with Dragon/Nuance API

NlpProvider

Processes an NlpRequest. The default implementation invokes IdeService.invokeAction(ExecuteVoiceCommandAction, nlpRequest), and the action is handled by ExecuteVoiceCommandAction and ActionRecognizerManager.handleNlpRequest().

AsrSystem

Processes audio input, recognises speech to text and executes actions. The default implementation, AsrControlLoop, uses the AsrProvider and NlpProvider.

Some APIs such as AWS Lex implement the functionality of AsrProvider and NlpProvider in a single call.
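The division of labour between the two providers can be sketched as a simple loop. Everything below is a simplified stand-in: the Asr, Nlp and Request types, their method names, and the convention that a null request ends the loop are all assumptions made for illustration, not Idiolect's real interfaces.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Simplified stand-ins for illustration; not Idiolect's real classes.
class ControlLoopSketch {
    /** Stand-in for NlpRequest: candidate transcriptions of one utterance. */
    static final class Request {
        final List<String> utterances;
        Request(List<String> utterances) { this.utterances = utterances; }
    }

    /** Stand-in for AsrProvider: speech to text only, no intent resolution. */
    interface Asr {
        Request waitForSpeech(); // null means "no more input" (assumption)
    }

    /** Stand-in for NlpProvider: acts on a recognised request. */
    interface Nlp {
        void process(Request request);
    }

    /** Pulls requests from the ASR side and hands each to the NLP side; returns the count. */
    static int run(Asr asr, Nlp nlp) {
        int handled = 0;
        Request request;
        while ((request = asr.waitForSpeech()) != null) {
            nlp.process(request);
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) {
        // Scripted "recognitions" replace a real microphone for the demo.
        Queue<Request> scripted = new ArrayDeque<>();
        scripted.add(new Request(Arrays.asList("find usages", "fine usages")));
        scripted.add(new Request(Arrays.asList("reformat code")));
        List<String> log = new ArrayList<>();
        int n = run(scripted::poll, r -> log.add(r.utterances.get(0)));
        System.out.println(n + " handled: " + log);
    }
}
```

A service that does recognition and intent resolution in one call would not fit this two-step loop, which is presumably why the whole loop is itself replaceable.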

IntentResolver

Processes an NlpRequest (utterance/alternatives) and resolves an NlpResponse with intentName and slots. ActionRecognizerManager.handleNlpRequest() iterates through the IntentResolvers until it finds a match.

The Idiolect implementations use either exact-match or regular expressions on the recognized text. Alternative implementations may use AI to resolve the intent.
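As a minimal sketch of the "first resolver that matches wins" idea, the code below chains one exact-match resolver and one regular-expression resolver. The Resolver and Response types, the intent names FindUsages and Rename, and the slot name are hypothetical stand-ins chosen for the example; the real IntentResolver and NlpResponse signatures may differ.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified stand-ins for illustration; not Idiolect's real classes.
class ResolverChainSketch {
    /** Stand-in for NlpResponse: an intent name plus named slots. */
    static final class Response {
        final String intentName;
        final Map<String, String> slots;
        Response(String intentName, Map<String, String> slots) {
            this.intentName = intentName;
            this.slots = slots;
        }
    }

    /** Stand-in for IntentResolver: returns null when it does not recognise the text. */
    interface Resolver {
        Response resolve(String utterance);
    }

    /** Exact-match resolver: the whole utterance must equal the phrase, ignoring case. */
    static Resolver exact(String phrase, String intentName) {
        return utterance -> phrase.equalsIgnoreCase(utterance.trim())
                ? new Response(intentName, Collections.<String, String>emptyMap())
                : null;
    }

    /** Regex resolver: the first capture group is stored under slotName. */
    static Resolver regex(String regex, String intentName, String slotName) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        return utterance -> {
            Matcher m = pattern.matcher(utterance.trim());
            return m.matches()
                    ? new Response(intentName, Collections.singletonMap(slotName, m.group(1)))
                    : null;
        };
    }

    /** Tries each resolver in order and returns the first non-null response, else null. */
    static Response resolve(List<Resolver> resolvers, String utterance) {
        for (Resolver resolver : resolvers) {
            Response response = resolver.resolve(utterance);
            if (response != null) {
                return response;
            }
        }
        return null;
    }

    /** Demo chain with hypothetical intent names. */
    static final List<Resolver> DEMO = Arrays.asList(
            exact("find usages", "FindUsages"),
            regex("rename to (.+)", "Rename", "name"));

    public static void main(String[] args) {
        Response r = resolve(DEMO, "rename to counter");
        System.out.println(r.intentName + " " + r.slots);
    }
}
```

Because the chain stops at the first match, resolver order matters: a broad regular expression placed early can shadow a more specific phrase registered later.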

CustomPhraseRecognizer

Many of the auto-generated trigger phrases are not suitable for voice activation. You can add your own phrases that are easier to say and remember in ~/.idea/phrases.properties.

IntentHandler

Fulfills an NlpResponse (intent + slots), performing desired actions. ActionRecognizerManager.handleNlpRequest() iterates through the IntentHandlers until the intent is actioned.

TemplateIntentHandler

Handles two flavours of intent prefix:

  • Template.id.${template.id}, e.g. Template.id.maven-dependency
  • Template.${template.groupName}.${template.key}, e.g. Template.Maven.dep

template.id is often null. template.key is the "Abbreviation" that you would normally type before pressing TAB.
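The two intent-name shapes can be told apart with plain string handling. The sketch below is one possible way to do it, not the handler's actual code: the Ref class is hypothetical, and it assumes that a group name contains no dot and that the Template.id. prefix takes priority over a group that happens to be called "id".

```java
// Simplified stand-in for illustration; not Idiolect's real TemplateIntentHandler.
class TemplateIntentSketch {
    static final String PREFIX = "Template.";
    static final String ID_PREFIX = "Template.id.";

    /** Reference to a live template: either an id, or a group name plus key. */
    static final class Ref {
        final String id;
        final String groupName;
        final String key;
        Ref(String id, String groupName, String key) {
            this.id = id;
            this.groupName = groupName;
            this.key = key;
        }
    }

    /** Parses an intent name into a template reference, or returns null if it is not one. */
    static Ref parse(String intentName) {
        // Checked first, so a group literally named "id" would be shadowed (assumption).
        if (intentName.startsWith(ID_PREFIX)) {
            return new Ref(intentName.substring(ID_PREFIX.length()), null, null);
        }
        if (intentName.startsWith(PREFIX)) {
            String rest = intentName.substring(PREFIX.length());
            int dot = rest.indexOf('.'); // assumes the group name itself has no dot
            if (dot > 0 && dot < rest.length() - 1) {
                return new Ref(null, rest.substring(0, dot), rest.substring(dot + 1));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Ref byId = parse("Template.id.maven-dependency");
        Ref byKey = parse("Template.Maven.dep");
        System.out.println(byId.id + " | " + byKey.groupName + " / " + byKey.key);
    }
}
```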

The default trigger phrases are generated from the template description or key and are often not suitable for voice activation. You can add your own trigger phrase -> live template mapping in ~/.idea/phrases.properties and it will be resolved by CustomPhraseRecognizer.

ttsProvider

Reads audio prompts/feedback to the user.

org.openasr.idiolect.nlp.NlpResultListener

Any listeners registered to the topic in plugin.xml under <applicationListeners> are notified when:

  • listening state changes
  • recognition is returned by the AsrProvider
  • request is fulfilled by an IntentHandler
  • there is a failure
  • a prompt/message is provided for the user
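The five notifications above map naturally onto a callback interface with one method per event. The sketch below is only a shape: ResultListener and its method names and parameters are hypothetical, so consult the real NlpResultListener before implementing a listener.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes for illustration; not Idiolect's real NlpResultListener.
class ListenerSketch {
    /** One no-op default callback per notification, so listeners override only what they need. */
    interface ResultListener {
        default void onListening(boolean listening) {}
        default void onRecognition(List<String> utterances) {}
        default void onFulfilled(String intentName) {}
        default void onFailure(String message) {}
        default void onMessage(String message) {}
    }

    /** Example listener that logs state changes, fulfilled intents and failures. */
    static final class LoggingListener implements ResultListener {
        final List<String> log = new ArrayList<>();

        @Override
        public void onListening(boolean listening) {
            log.add(listening ? "listening" : "standby");
        }

        @Override
        public void onFulfilled(String intentName) {
            log.add("fulfilled " + intentName);
        }

        @Override
        public void onFailure(String message) {
            log.add("failed: " + message);
        }
    }

    public static void main(String[] args) {
        LoggingListener listener = new LoggingListener();
        listener.onListening(true);
        listener.onFulfilled("FindUsages");
        listener.onListening(false);
        System.out.println(listener.log);
    }
}
```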

Plugin Actions

plugin.xml defines <action>s:

This action is invoked when the user clicks the Voice control button in the toolbar. It simply tells AsrService to activate or go on standby. While the AsrService is active, the AsrSystem (by default ASRControlLoop, see below) processes audio input.

A debugging aid that uses one of the ActionRecognizer extension classes configured in plugin.xml to generate an ActionCallInfo, which is then run via runInEditor().

Similar to ExecuteActionFromPredefinedText but uses the Idiolect.VoiceCommand.Text data attached to the invoking AnActionEvent.

IDEA Actions

There are many Actions (classes which extend AnAction) provided by IDEA.

ASRControlLoop

When AsrControlLoop detects an utterance, it invokes PatternBasedNlpProvider.processUtterance(), which typically calls invokeAction() and/or one or more of the methods of IdeService.

Programming By Voice

Maintainers