Software Documentation
Cargi is a smart personal assistant for your car that makes your driving experience pleasant, comfortable and intuitive. By syncing with your calendar and contacts, the app makes texting, calling and navigation easier than before, and provides a user-centric interaction paradigm appropriate for the limited mobility and perception in a car. This documentation covers the brainstorming and design process, and provides a thorough discussion of our technical design.
We began the design process with need-finding to understand the challenges that people face while driving. As a part of this process, we conducted immersive interviews with several users about their experiences driving to and from work. Our major findings included patterns in users' activities in the car, difficulty in notifying others of ETAs, the lack of a navigation app that accounted for real-time traffic, poor interfaces for calling and texting, challenges staying attentive, and the need for information on cheap gas stations. This was followed by a phase of analogous research to explore existing technologies and apps. Finally, we brainstormed several ideas for the features that would be most helpful for our customers. The picture below captures one of our brainstorming sessions.
To validate our value proposition, we prototyped our idea using a human Cargi and sent out a survey asking people to rank a few features in order of importance to them. Based on the survey results, we found navigation, communication, entertainment and information to be the most important to people. We followed this with Facebook advertisements to understand our target audience, which was crucial for our design. While young professionals were the most important target audience, the idea also resonated with older adults. To narrow our focus, however, we concentrated on young professionals who use a calendar to keep track of their engagements and use third-party apps such as Spotify and Waze for entertainment and productivity respectively.
Based on these observations, we decided on the following features for our first prototype.
- The ability to text the ETA to the relevant contact as determined from the calendar event
- The ability to call the relevant contact
- Navigation to the relevant destination (determined from the calendar entry) using Google Maps
- The ability to launch Spotify from within the app with a single tap
Given the demand for ways to find cheap gas, we subsequently added a feature that finds the cheapest and nearest gas station, and gave the user more options in terms of third-party apps and ETA message formats. Moreover, we added speech recognition and text-to-speech technologies to make the experience completely hands-free. Finally, to make Cargi even smarter, we used rule-based learning to resolve conflicts, such as contacts with the same name. The following sections discuss our technical framework and backend technologies.
The functionality of our project can be divided into the following areas.
- iOS app
  - Texting and calling
  - Identification of relevant contacts
  - Navigation
  - Music
  - Cheapest gas
  - Continuous speech recognition and text-to-speech
- Backend
  - Rule-based model for ranking contacts
  - Smart suggestion of contacts
All source code for our iOS app can be found in the CS210 cargi-ios repository on GitHub.
We use the texting and calling features provided by MFMessageComposeViewController and the tel:// URL scheme. The relevant contact is identified from the calendar event entry, and the phone number is then retrieved from the user's contacts. The texting feature sends the user's ETA to the relevant contact, and offers multiple message formats so that the user can customize the message.
To identify relevant contacts from the calendar entry (event title), we use rules that reflect how contacts are usually specified in the event title or notes. If a contact's full name is present as a substring, we identify the event contact without any further processing. Otherwise, we use rules such as the contact being composed of three words, the first name (and last name if available) not being stop words, etc. If these rules return multiple contacts, we use our backend APIs to rank them in decreasing order of event frequency (discussed later). These rules determine the correct relevant contact in most cases. In the future, we might integrate core NLP features such as named-entity recognition (NER) to improve accuracy.
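The matching rules above can be illustrated with a minimal Python sketch. It is not the app's implementation: the function name, the stop-word list and the first-name fallback are assumptions made for illustration, and the three-word rule is left out because its exact form is not specified here.

```python
import re

# Hypothetical stop-word list; the list used by the app is not given in this document.
STOP_WORDS = {"with", "and", "the", "at", "for", "will", "call", "meeting", "lunch"}


def find_event_contacts(event_title, contact_names):
    """Return the contact names that plausibly appear in an event title."""
    title = event_title.lower()

    # Rule 1: a full name found in the title settles the match immediately.
    full_matches = [
        name for name in contact_names
        if re.search(r"\b" + re.escape(name.lower()) + r"\b", title)
    ]
    if full_matches:
        return full_matches

    # Rule 2 (fallback): match on first name, ignoring words that are stop words.
    title_words = set(re.findall(r"[a-z']+", title)) - STOP_WORDS
    candidates = []
    for name in contact_names:
        parts = name.lower().split()
        if parts and parts[0] in title_words:
            candidates.append(name)
    return candidates
```

With contacts "Emily Wang" and "Emily Stone", the title "Coffee with Emily" yields both names, which is the ambiguous case that the backend ranking is meant to resolve.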
By default, we use Google Maps to navigate the user to the destination, or to the nearest gas station if the gas feature is being used. Apple Maps is used instead if Google Maps is not installed or if the user selects Apple Maps in the settings. We determine the location from the event entry so that the user does not need to key in the destination.
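The choice between the two map apps can be sketched as follows. The URL formats are the publicly documented Google Maps iOS URL scheme and Apple map links; whether Cargi builds exactly these URLs is an assumption, and the function and parameter names are hypothetical.

```python
from urllib.parse import quote


def navigation_url(destination, google_maps_installed, prefer_apple_maps=False):
    """Build a driving-directions URL for the preferred (or available) map app."""
    address = quote(destination)  # percent-encode spaces, commas, etc.
    if google_maps_installed and not prefer_apple_maps:
        # Google Maps for iOS URL scheme.
        return f"comgooglemaps://?daddr={address}&directionsmode=driving"
    # Fallback: Apple Maps link with driving directions.
    return f"http://maps.apple.com/?daddr={address}&dirflg=d"
```

On the device, the "is Google Maps installed" check would typically be answered by asking the system whether it can open the `comgooglemaps://` scheme.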
Cargi opens the preferred music app using deep linking. We currently support Spotify (default) and Apple Music, and the user can select the preferred app from the settings.
We developed an API located at https://gas-price-api.herokuapp.com/ that takes in a zip code, gets the corresponding city and state using the Ziptastic API, scrapes the MapQuest website for gas prices, and returns a JSON object with the gas prices. We use this with the Google Places API that gets the five nearest gas stations. Using these two APIs, the relevant information is presented to the user, and Cargi can then navigate the user to the desired gas station using the navigation feature described above.
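As a rough illustration of how the two data sources might be combined, the sketch below picks a station from the nearby results using the scraped prices. The data shapes are assumptions: the actual JSON returned by the gas price API and the fields taken from Google Places are not described here, so `name`, `distance_km` and the price dictionary are hypothetical.

```python
def pick_gas_station(nearby_stations, prices_by_name):
    """Pick the cheapest nearby station, breaking price ties by distance.

    nearby_stations: list of dicts with hypothetical keys "name" and "distance_km".
    prices_by_name: dict mapping a station name to its price per gallon.
    Returns the chosen station name, or None if no nearby station has a price.
    """
    priced = [
        (prices_by_name[s["name"]], s["distance_km"], s["name"])
        for s in nearby_stations
        if s["name"] in prices_by_name  # skip stations with no scraped price
    ]
    if not priced:
        return None
    # Tuples compare by price first, then by distance.
    return min(priced)[2]
```

Ranking by price first and distance second is one possible design choice; weighting the two against each other would be another.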
We use RapidEars, a plug-in for the popular iOS speech recognition framework OpenEars, for continuous offline speech recognition. We constantly listen for the phrase "Hey Carghi" (spelled this way because it gives better recognition accuracy), and use the native iOS TTS (AVSpeechSynthesizer) to acknowledge recognition. We then pass control to Nuance's speech recognition system to listen for more complex instructions. OpenEars has a very limited vocabulary, and Nuance does not support continuous speech recognition, which necessitates the use of two speech recognition systems.
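The hand-off between the two recognizers can be modeled as a small state machine. The sketch below is framework-agnostic and does not call RapidEars, OpenEars, Nuance or AVSpeechSynthesizer; the class, its method names and the "Yes?" acknowledgement are hypothetical stand-ins.

```python
WAKE_PHRASE = "hey carghi"


class VoiceSession:
    """Tracks which of the two recognizers is currently in control."""

    def __init__(self):
        self.awaiting_command = False
        self.spoken = []    # stand-in for text-to-speech output
        self.commands = []  # instructions accepted from the second recognizer

    def on_wake_hypothesis(self, text):
        """Called for each phrase heard by the always-on, small-vocabulary recognizer."""
        if not self.awaiting_command and text.strip().lower() == WAKE_PHRASE:
            self.spoken.append("Yes?")    # acknowledge the wake phrase
            self.awaiting_command = True  # hand control to the second recognizer

    def on_command_result(self, text):
        """Called with the result from the large-vocabulary recognizer."""
        if self.awaiting_command:
            self.commands.append(text)
            self.awaiting_command = False  # go back to listening for the wake phrase
```

Results from the second recognizer are ignored unless the wake phrase was heard first, which mirrors the division of labor described above.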
A crucial component of Cargi is the ability to process data and offer smart suggestions based on the user's activity and past preferences. We do so using rule-based models. This required us to think about the most suitable backend platform. After brainstorming and discussion, we settled on Microsoft Azure. Our backend source code is available in the cargi-backend repo on GitHub.
As discussed in the section on identification of relevant contacts above, our rules may return multiple relevant contacts for an event. In these situations, we rank the contacts in decreasing order of the frequency of their events with the current user. The filterContacts API does this. It takes the email address of the currently signed-in user, and returns an array of contact names sorted in that order.
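The ranking step itself amounts to a frequency sort. The following sketch shows the idea on in-memory data; it is not the filterContacts API, which takes the user's email address and looks up stored events, and the function and argument names are hypothetical.

```python
from collections import Counter


def rank_contacts(candidates, past_event_contacts):
    """Sort candidate contacts by how often each appeared in the user's past events.

    candidates: contact names returned by the matching rules.
    past_event_contacts: one contact name per past event of the current user.
    """
    counts = Counter(past_event_contacts)  # missing names count as 0
    # sorted() is stable, so ties keep their original relative order.
    return sorted(candidates, key=lambda name: counts[name], reverse=True)
```

The first element of the returned list is the contact the app would treat as the most likely match.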
The eventContacts API helps resolve the situation where one of the two users, say user1, involved in an event has user2 as an event contact, while user2 does not. The API, by traversing through events and event contacts of user1, can determine an event with user2, and recommend user1 as a contact for user2. Thus, this API helps predict event contacts where they cannot be determined otherwise.
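A minimal sketch of this inference is shown below. It assumes that the two users' copies of an event can be matched by a shared event identifier, which is not stated here; the data layout and function name are likewise hypothetical and do not reflect the actual eventContacts API.

```python
def suggest_event_contacts(user, events_by_user):
    """Suggest contacts for a user's events that have no contact yet.

    events_by_user maps each user to a dict of {event_id: contact or None}.
    If another user lists `user` as the contact for the same event, that
    other user is suggested as the contact.
    """
    suggestions = {}
    for event_id, contact in events_by_user.get(user, {}).items():
        if contact is not None:
            continue  # this event already has a contact
        for other, other_events in events_by_user.items():
            if other != user and other_events.get(event_id) == user:
                suggestions[event_id] = other
                break
    return suggestions
```

For example, if user1 has user2 as the contact for an event and user2's copy of that event has no contact, the function suggests user1 for user2's event.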
Preliminary Sketches and Schematics can be found in this Google Drive folder. These are some of the resources we used during design and development, both from a user experience and a functionality perspective.
The following data map provides information about what data feeds into what part of the application, and the processing required. For more schematics, please visit our Sketches and Schematics folder.
Future extensions of this project include developing an Android app, making smarter suggestions based on analysis of past events (we already have a basic implementation of this feature in our backend repo), integrating with the car, supporting more third-party apps, and integrating with reminders or similar apps to enhance user productivity.
Download our app [here](https://goo.gl/UhHjX6). On launching the app, our tutorial will help you get up to speed with its features.