HfG IG4 Designing Systems WS2018/2019

Using Voice – Designing Alternative Speech Interfaces from Scratch

In the course "System Design" we will develop new interaction concepts using speech recognition and natural language processing.

Links:

Speech Interfaces

Communicating with voice, especially speech, is one of the most natural ways for humans to exchange information, give commands and express feelings.

Voice User Interfaces (VUIs) make human interaction with machines and services possible by using technologies such as text-to-speech, speech-to-text, and natural language processing to "understand" what a person is telling a computer. Today, these interfaces are mostly associated with triggering a simple automated service or process, e.g. "Hey Siri, what's the weather tomorrow?".

Briefing

We find ourselves at an exciting moment in human communication. With the recent arrival of machine learning, the typical error rate in speech recognition dropped from 23% (2013) to ~5% (2017) [1,2]. For the first time, it seems likely that the way we all communicate with our computing devices might change very soon. Speech recognition technology has matured to a "usable", production-ready state, and mass adoption seems to be around the corner.
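To make the cited figures more tangible: reference [2] reports a word error rate (WER), which is commonly computed as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation, not the evaluation code used in the cited sources. The function name `word_error_rate` and the example sentences are assumptions made for illustration, and real benchmarks usually normalize text (punctuation, numbers, casing) more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count.

    Illustrative sketch: words are lowercased and split on whitespace only.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    spoken = "hey siri what is the weather tomorrow"
    recognized = "hey siri what is the whether tomorrow"
    print(f"WER: {word_error_rate(spoken, recognized):.1%}")
```

Under this definition, an error rate of ~5% means roughly one wrong, missing, or extra word in every twenty spoken words, while 23% means almost one in four.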

This brief asks you to look into speech recognition in a three-step approach:

  • Build an open source alternative to Siri, Cortana, or Alexa for a specific (simple) use case.
  • Explore the potential and implications of the speech recognition "material" (technology as a material) from a design point of view, with an emphasis on thinking through making.
  • Build an open source proof of concept.

Project outcomes should be expressed through a variety of forms e.g. wireframes, prototypes, visual layout mockups, demonstrators, video and photography of (future) apps, products, or services. These will be documented and presented at the end of the term.

[1] (https://www.slideshare.net/a16z/mobile-is-eating-the-world-20162017/23-23The_arrival_of_machine_learningPerformance)

[2] (https://venturebeat.com/2017/05/17/googles-speech-recognition-technology-now-has-a-4-9-word-error-rate/)

Considerations

  • Implications for how we communicate with computers/services and with each other?
  • Human consequences (both positive and negative)?
  • Research situations where these hands-free, eyes-free interfaces provide real advantages

Schedule (subject to change)

Thu Oct 4

Intro Chat Voice interfaces

Assignment: Research state-of-the-art use cases and examples in teams of two

  • Voice and conversational interface
  • Text recognition / translation / natural language understanding
  • Technical issues and glitches
  • Speech synthesis
  • Social and psychological issues

Thu Oct 11 (Mechatronik Biggel)

Thu Oct 18

Show & Tell + group discussion of state-of-the-art research

Assignment: Research/try out technologies from the GitHub repo (set up and run the examples)

  • Snowboy
  • Snips.ai
  • Google voice recognition API
  • Source training data e.g. custom hotword "Hej IG" etc.
  • Chat bots
  • ...

Thu Oct 25

Show & Tell: working examples, evaluation (technical pros/cons/issues) Workshop issues, Q & A session …

Assignment: Decide/Brainstorm on first early directions

Assignment: Technical Tutorial (Abstract, Requirements, Setup, Running, Examples, Video ...) on the GitHub course repo

Thu Nov 1 (Halloween) 🎃

Thu Nov 8

Individual tutorials, feedback on Technical Tutorial post, first early directions

Assignment: User Research -> Experts blind people, everyday users … ?

Thu Nov 15 (Laborwoche)

Thu Nov 22

Round table, individual tutorials, first concepts

Thu Nov 29

Round table, individual tutorials

Thu Dec 6

Individual tutorials

Thu Dec 13 – Mid Term Presentation

Show & Tell Discussion

Thu Dec 20

Individual tutorials

(Weihnachten)

Thu Jan 10

Individual tutorials

Thu Jan 17

Individual tutorials

Thu Jan 24

Individual tutorials

Thu Jan 31 - Final Presentation

Show & Tell, final crit, course feedback