A general-purpose voice user interface for the IntelliJ Platform, inspired by Tavis Rudd. It may be useful for visually impaired users and users with repetitive strain injury (RSI). Originally developed as part of a JetBrains hackathon, it is now a community-supported project. For background information, check out this presentation.
To get started, press the button in the toolbar, then speak a command, e.g. "Hi, IDEA!" Idiolect supports a simple grammar. For a complete list of commands, please refer to the wiki. Click the button once more to deactivate.
For a full list of the actions that can be activated by voice, ask Idiolect:
What can I say?
or
What can I say about "activate"?
There are a lot of actions, and some of them are not easy to say or remember. To make them easier to use, you can customise the phrases.
Edit custom phrases
Some of the more useful commands are:
- activate project tool window (commit, database, debug, find, gradle, run, terminal...)
- back/forward, down/up, left/right, page up/down, scroll up/down
- next/previous word
- method down/up
- find, find in project
- go to ... (class, bookmark, declaration, line, implementation...)
- close all editors
- next method
- find usages
- call hierarchy
- create (editor config file, grpc request action, liquibase changelog, vue single file comp)
- new class/dockerfile/element/file...
- code cleanup
- code completion
- collapse block
- extract class/function/method/etc
- generate getter/setter/test method...
- new class (or create new class "something")
- rename (or rename to "something")
- cut / copy / paste / delete
- cut line end/backward
- delete to (line/word) (start/end)
- toggle column mode
- reformat code
- fix it
- whoops
- debug
- context debug
- context run
- coverage
- git add/pull/merge/push/stash...
- checkin files
- annotate
For Linux or macOS users:
git clone https://github.com/OpenASR/idiolect && cd idiolect && ./gradlew runIde
For Windows users:
git clone https://github.com/OpenASR/idiolect && cd idiolect && gradlew.bat runIde
Recognition works with most popular microphones (preferably 16kHz, 16-bit). For best results, minimize background noise.
Contributors who have IntelliJ IDEA installed can simply open the project. Otherwise, run the following command from the project's root directory:
./gradlew runIde -PluginDev
Idiolect is implemented using the IntelliJ Platform SDK. For more information about the plugin architecture, please refer to the wiki page.
plugin.xml defines a number of <extensionPoint>s, which allow other plugins to integrate with, extend or customise the capabilities of Idiolect.
An example of this is provided in idiolect-azure, which implements AsrProvider and adds its own settings under Tools/Idiolect.
An AsrProvider listens for audio input, converts speech to text and returns an NlpRequest with possible utterances. It does not resolve the intent.
Possible alternative implementations could:
- integrate with Windows SAPI 5 Speech API
- integrate with Dragon/Nuance API
An NlpProvider processes an NlpRequest.
The default implementation invokes IdeService.invokeAction(ExecuteVoiceCommandAction, nlpRequest), and the action is handled by ExecuteVoiceCommandAction and ActionRecognizerManager.handleNlpRequest().
An AsrSystem processes audio input, converts speech to text and executes actions.
The default implementation, AsrControlLoop, uses the AsrProvider and NlpProvider.
Some APIs, such as AWS Lex, implement the functionality of AsrProvider and NlpProvider in a single call.
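The division of labour described above can be sketched in plain Java. This is only an illustration under stated assumptions: `LoopRequest`, `LoopAsrProvider`, `LoopNlpProvider` and `ControlLoopSketch` are hypothetical stand-ins invented for this example, and their method names are not taken from Idiolect's actual interfaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for NlpRequest: candidate transcriptions of one utterance.
final class LoopRequest {
    final List<String> alternatives;

    LoopRequest(List<String> alternatives) {
        this.alternatives = alternatives;
    }
}

// Hypothetical stand-in for AsrProvider: speech to text only, no intent resolution.
interface LoopAsrProvider {
    // Returns the next recognised utterance, or null when there is no more input.
    LoopRequest nextUtterance();
}

// Hypothetical stand-in for NlpProvider: decides what to do with a request.
interface LoopNlpProvider {
    void process(LoopRequest request);
}

final class ControlLoopSketch {
    // Pulls utterances from the ASR provider and hands each one to the NLP provider.
    // Returns the number of utterances that were handed over.
    static int run(LoopAsrProvider asr, LoopNlpProvider nlp) {
        int handled = 0;
        LoopRequest request;
        while ((request = asr.nextUtterance()) != null) {
            nlp.process(request);
            handled++;
        }
        return handled;
    }

    // A scripted provider that replays fixed utterances instead of using a microphone.
    static LoopAsrProvider scripted(String... utterances) {
        final Iterator<String> it = Arrays.asList(utterances).iterator();
        return () -> it.hasNext() ? new LoopRequest(Arrays.asList(it.next())) : null;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        int handled = run(scripted("find usages", "reformat code"),
                request -> log.add(request.alternatives.get(0)));
        System.out.println(handled + " utterances handled: " + log);
    }
}
```

Keeping the two roles behind separate interfaces is what would let a single-call service replace both at once: an alternative system could skip the loop body and return the resolved result directly.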
An IntentResolver processes an NlpRequest (utterance/alternatives) and resolves an NlpResponse with intentName and slots.
ActionRecognizerManager.handleNlpRequest() iterates through the IntentResolvers until it finds a match.
The Idiolect implementations use either exact matching or regular expressions on the recognized text. Alternative implementations may use AI to resolve the intent.
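As a rough illustration of resolving by exact match or regular expression and stopping at the first resolver that matches, here is a minimal, self-contained sketch. All types (`SketchResponse`, `SketchResolver`, `ExactMatchResolver`, `RegexResolver`, `ResolverChain`) and the intent names `Demo.Undo` and `Demo.Rename` are hypothetical and chosen for this example; they are not Idiolect's real classes or intents.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified stand-in for NlpResponse.
final class SketchResponse {
    final String intentName;
    final Map<String, String> slots;

    SketchResponse(String intentName, Map<String, String> slots) {
        this.intentName = intentName;
        this.slots = slots;
    }
}

// Hypothetical, simplified stand-in for IntentResolver.
interface SketchResolver {
    Optional<SketchResponse> resolve(List<String> alternatives);
}

// Matches only when an alternative equals a known phrase (keys must be lower case).
final class ExactMatchResolver implements SketchResolver {
    private final Map<String, String> phraseToIntent;

    ExactMatchResolver(Map<String, String> phraseToIntent) {
        this.phraseToIntent = phraseToIntent;
    }

    @Override
    public Optional<SketchResponse> resolve(List<String> alternatives) {
        for (String utterance : alternatives) {
            String intent = phraseToIntent.get(utterance.trim().toLowerCase(Locale.ROOT));
            if (intent != null) {
                return Optional.of(new SketchResponse(intent, Collections.<String, String>emptyMap()));
            }
        }
        return Optional.empty();
    }
}

// Matches a whole alternative against a regex; capture group 1 becomes a slot.
final class RegexResolver implements SketchResolver {
    private final Pattern pattern;
    private final String intentName;
    private final String slotName;

    RegexResolver(String regex, String intentName, String slotName) {
        this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        this.intentName = intentName;
        this.slotName = slotName;
    }

    @Override
    public Optional<SketchResponse> resolve(List<String> alternatives) {
        for (String utterance : alternatives) {
            Matcher matcher = pattern.matcher(utterance.trim());
            if (matcher.matches()) {
                Map<String, String> slots = new HashMap<>();
                slots.put(slotName, matcher.group(1));
                return Optional.of(new SketchResponse(intentName, slots));
            }
        }
        return Optional.empty();
    }
}

final class ResolverChain {
    // Asks each resolver in order and returns the first response found.
    static Optional<SketchResponse> resolve(List<SketchResolver> resolvers, List<String> alternatives) {
        for (SketchResolver resolver : resolvers) {
            Optional<SketchResponse> response = resolver.resolve(alternatives);
            if (response.isPresent()) {
                return response;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> phrases = new HashMap<>();
        phrases.put("whoops", "Demo.Undo");
        List<SketchResolver> resolvers = Arrays.asList(
                new ExactMatchResolver(phrases),
                new RegexResolver("rename to (.+)", "Demo.Rename", "name"));
        resolve(resolvers, Arrays.asList("rename to counter"))
                .ifPresent(r -> System.out.println(r.intentName + " " + r.slots));
    }
}
```

Because the chain stops at the first match, the order of the resolvers decides which one wins when several could handle the same utterance.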
Many of the auto-generated trigger phrases are not suitable for voice activation. You can add your own phrases, which are easier to say and remember, in ~/.idea/phrases.properties.
An IntentHandler fulfills an NlpResponse (intent + slots), performing the desired actions.
ActionRecognizerManager.handleNlpRequest() iterates through the IntentHandlers until the intent is actioned.
The handler for live templates handles two flavours of intent prefix:
- Template.id.${template.id}, e.g. Template.id.maven-dependency
- Template.${template.groupName}.${template.key}, e.g. Template.Maven.dep
template.id is often null. template.key is the "Abbreviation" that you would normally type before pressing TAB.
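To make the two prefix flavours concrete, the following sketch parses an intent name into its parts. `TemplateRef` and `TemplateIntentNames` are hypothetical names made up for this example, and the parsing rules are assumptions: the `Template.id.` flavour is checked first, and the group name is assumed to contain no dots.

```java
import java.util.Optional;

// Hypothetical holder for the parts of a template intent name.
final class TemplateRef {
    final String id;        // set for the Template.id.* flavour, otherwise null
    final String groupName; // set for the Template.<groupName>.<key> flavour, otherwise null
    final String key;       // set for the Template.<groupName>.<key> flavour, otherwise null

    TemplateRef(String id, String groupName, String key) {
        this.id = id;
        this.groupName = groupName;
        this.key = key;
    }
}

final class TemplateIntentNames {
    private static final String PREFIX = "Template.";
    private static final String ID_PREFIX = PREFIX + "id.";

    // Parses an intent name; returns empty when it is not a template intent.
    // Assumption: the id flavour wins, so a group literally named "id" is not supported.
    static Optional<TemplateRef> parse(String intentName) {
        if (intentName.startsWith(ID_PREFIX)) {
            String id = intentName.substring(ID_PREFIX.length());
            if (id.isEmpty()) {
                return Optional.empty();
            }
            return Optional.of(new TemplateRef(id, null, null));
        }
        if (intentName.startsWith(PREFIX)) {
            String rest = intentName.substring(PREFIX.length());
            int dot = rest.indexOf('.');
            // Both the group name and the key must be non-empty.
            if (dot > 0 && dot < rest.length() - 1) {
                return Optional.of(new TemplateRef(null, rest.substring(0, dot), rest.substring(dot + 1)));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Template.id.maven-dependency", "Template.Maven.dep"}) {
            TemplateRef ref = parse(name).get();
            System.out.println(name + " -> id=" + ref.id + ", group=" + ref.groupName + ", key=" + ref.key);
        }
    }
}
```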
The default trigger phrases are generated from the template description or key and are often not suitable for voice activation.
You can add your own trigger phrase -> live template mapping in ~/.idea/phrases.properties, and it will be resolved by CustomPhraseRecognizer.
A text-to-speech provider reads audio prompts and feedback to the user.
Any listeners registered to the topic in plugin.xml under <applicationListeners> will be notified when:
- the listening state changes
- recognition is returned by the AsrProvider
- a request is fulfilled by an IntentHandler
- there is a failure
- a prompt/message is provided for the user
plugin.xml defines <action>s:
This action is invoked when the user clicks on the button in the toolbar.
This simply tells AsrService to activate or standby.
When the AsrService is active, the AsrSystem (by default AsrControlLoop, see below) processes the audio input.
A debugging aid that uses one of the ActionRecognizer extension classes configured in plugin.xml to generate an ActionCallInfo, which is then run via runInEditor().
Similar to ExecuteActionFromPredefinedText, but uses the Idiolect.VoiceCommand.Text data attached to the invoking AnActionEvent.
There are many Actions (classes which extend AnAction) provided by IDEA.
When AsrControlLoop detects an utterance, it invokes PatternBasedNlpProvider.processUtterance(), which typically calls invokeAction() and/or one or more of the methods of IdeService.