HfG IG4 Designing Systems WS2018/2019

Using Voice – Designing Alternative Speech Interfaces from Scratch

In the course "System Design" we will develop new interaction concepts using speech recognition and natural language processing.

Links:

Course Chat
Code Examples

Speech Interfaces

Communicating with voice, especially speech, is one of the most natural ways for humans to exchange information, give commands and express feelings.

Voice User Interfaces (VUIs) make human interaction with machines and services possible by using technologies such as text-to-speech, speech-to-text, and natural language processing to "understand" what a person is telling a computer. Today, these interfaces are mostly associated with triggering a simple automated service or process, e.g. "Hey Siri, what's the weather tomorrow?".

Briefing

We find ourselves at an exciting moment in human communication. With the recent arrival of machine learning, the typical error rate in speech recognition dropped from 23% (2013) to ~5% (2017) [1,2]. For the first time, it seems likely that the way we all communicate with our computing devices might change very soon. Speech recognition technology has matured to "usable" and production-ready, and mass adoption seems to be around the corner.
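For orientation when evaluating recognizers later in the course: source [2] reports its figure as a word error rate (WER), and the sketch below assumes the other percentage is meant the same way. WER is commonly computed as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the sample sentences are illustrative and not part of the course material.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    spoken = "hey siri what is the weather tomorrow"
    recognized = "hey siri what is the whether tomorrow"
    print(f"WER: {word_error_rate(spoken, recognized):.2%}")
```

Because insertions count as errors, WER can exceed 100% when the recognizer outputs many more words than were spoken.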

This brief asks you to look into speech recognition in a three-step approach:

Build an open source alternative to Siri, Cortana, or Alexa for a specific (simple) use case.
Explore the potential and implications of the speech recognition "material" (technology as a material) from a design point of view, with an emphasis on thinking through making.
Build an open source proof of concept.

Project outcomes should be expressed in a variety of forms, e.g. wireframes, prototypes, visual layout mockups, demonstrators, and video and photography of (future) apps, products, or services. These will be documented and presented at the end of the term.

[1] (https://www.slideshare.net/a16z/mobile-is-eating-the-world-20162017/23-23The_arrival_of_machine_learningPerformance)

[2] (https://venturebeat.com/2017/05/17/googles-speech-recognition-technology-now-has-a-4-9-word-error-rate/)

Considerations

Implications of how we communicate with computers/services and with each other?
Human consequences (both positive and negative)?
Research situations where these hands-free, eyes-free interfaces provide real advantages

Schedule (subject to change)

Thu Oct 4

Intro Chat Voice interfaces

Assignment: Research state-of-the-art use cases and examples in teams of two

Voice and conversational interface
Text recognition / translation / natural language understanding
Technical issues and glitches
Speech synthesis
Social and psychological issues

Thu Oct 11 (Mechatronik Biggel)

Thu Oct 18

Show & Tell + group discussion of state-of-the-art research

Assignment: Research/try out a technology from the GitHub repo (set up and run the examples)

Snowboy
Snips.ai
Google voice recognition API
Source training data e.g. custom hotword "Hej IG" etc.
Chat bots
...

Thu Oct 25

Show & Tell: working examples, evaluation (technical pros/cons/issues)
Workshop issues, Q & A session …

Assignment: Decide/Brainstorm on first early directions

Assignment: Technical Tutorial (Abstract, Requirements, Setup, Running, Examples, Video ...) in the GitHub course repo

Thu Nov 1 (Halloween) 🎃

Thu Nov 8

Individual tutorials
Feedback on Technical Tutorial post
First early directions

Assignment: User Research -> Experts, blind people, everyday users … ?

Thu Nov 15 (Laborwoche)

Thu Nov 22

Round table
Individual tutorials
First concepts

Thu Nov 29

Round table
Individual tutorials

Thu Dec 6

Individual tutorials

Thu Dec 13 – Mid Term Presentation

Show & Tell
Discussion

Thu Dec 20

Individual tutorials

(Weihnachten)

Thu Jan 10

Individual tutorials

Thu Jan 17

Individual tutorials

Thu Jan 24

Individual tutorials

Thu Jan 31 - Final Presentation

Show & Tell
Final Crit
Course Feedback
