Put all .py files inside the src folder. Put the "data" and "vectors" folders inside your project folder, parallel to the src folder. Put your .magnitude file inside the "vectors" folder.
robot_com.py supports command line IO to control your robot using natural English. With the addition of audio_io.py, you can control your robot using your voice!
To run command line IO, you need to install matplotlib:
pip3 install matplotlib
Next, replace the robot ID in line 15 of robot_com.py with your own. Start the server in one Terminal session, then cd into the src folder in another Terminal session and type:
python3 robot_com.py
To run audio IO, you will need to install portaudio and pyaudio:
brew install portaudio
pip3 install pyaudio
Next, you need an account with Google Cloud Platform (GCP). When you register a new account, you get $300 in free credits. In your GCP account, enable the Speech-to-Text API, set up a new project and service account, and download a service account key file (this will be in .json format). Rename it "credentials.json" and put it under the src folder. You may also need to install and set up the Google Cloud SDK locally; see GCP's documentation for details.
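As an alternative to (or in addition to) placing credentials.json under src, Google's client libraries can locate the key file through a standard environment variable. The path below is only an example; substitute wherever you actually saved your key:

```shell
# Point Google Cloud client libraries at your service account key.
# Example path only -- adjust to the real location of your key file.
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/sphero-project/src/credentials.json"
```

Adding this line to your shell profile (e.g. ~/.zshrc) avoids having to re-export it in every new Terminal session.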
Next, replace the robot ID in line 166 of audio_io.py with your own. Start the server in one Terminal session, then cd into the src folder in another Terminal session and type:
python3 audio_io.py
Notes:
- If you want to try audio IO, please try command line IO first.
- If you are able to run audio_io.py successfully, say your command (using your voice!) and check that the transcribed text appears in Terminal. To end the session, simply say any sentence containing one of the following keywords: "exit", "quit", "bye" or "goodbye".
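The keyword-based exit check can be sketched in a few lines. This is an illustrative version, not the actual code in audio_io.py (the function name and punctuation handling here are assumptions):

```python
# Illustrative sketch of the exit-keyword check described above.
# Any sentence containing one of these words ends the session.
EXIT_KEYWORDS = {"exit", "quit", "bye", "goodbye"}

def should_exit(transcript):
    # Lowercase the transcript, then test each word with punctuation stripped.
    words = transcript.lower().split()
    return any(w.strip(".,!?") in EXIT_KEYWORDS for w in words)
```

For example, `should_exit("OK goodbye now")` returns True, while `should_exit("turn left")` returns False.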
Open Terminal and type:
cd .../sphero-project/spherov2.js/examples
sudo yarn server
Keep it running. Open another Terminal session and type:
cd .../sphero-project/src
For command line IO, type:
python3 robot_com.py
Or for voice IO, type:
python3 audio_io.py
- Turn your front/back light green
- Turn off your front/back light
- Flash the following colors for 2 seconds each: red, blue, green
- Make a (counter-clockwise) circle/donut
- Run in a (counter-clockwise) square
- Increase/decrease your speed
- Run faster/slower
- Speed up/slow down
- Sequential directional commands like “go straight, come back and turn left”
- Fall
- Run away
- Dance for me / make some moves
- Sing for me / make some noise
- Scream
- Turn your head to face left
- Look behind ya
- Look forward
- What's the color of your front/back light?
- What's your name?
- How should I call you?
- I wanna call you Jack
- How much power do you have left?
- What's your battery status?
- You are on a 3 by 3 grid
- You are at 0, 0
- There is a chair at 0, 1
- Go to the right of the chair
- Go to 0, 0
- There are some flowers below the chair
- There is a bomb at 2, 2
- Quit / Bye
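The grid commands above imply a simple world model. Here is a minimal sketch, assuming (row, column) coordinates where "right" means the next column; the actual coordinate convention and names in the code may differ:

```python
# Hypothetical world model for the grid commands above.
# Assumes (row, col) coordinates; "right of" means one column greater.
objects = {}

def place(name, pos):
    # e.g. "There is a chair at 0, 1"
    objects[name] = pos

def right_of(name):
    # e.g. "Go to the right of the chair" -> target cell
    row, col = objects[name]
    return (row, col + 1)

place("chair", (0, 1))
target = right_of("chair")  # (0, 2)
```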
Inside the merge folder, you can find merge.py. Put students' .py files inside the merge folder and call:
python3 merge.py
A new file, "r2d2TrainingSentences.txt", will be generated containing the merged training data from all students' .py files. The results are sorted by category and duplicates are removed.
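The merge step described above amounts to deduplicating category-tagged sentences and sorting them. A minimal sketch, with illustrative names rather than the actual ones in merge.py:

```python
# Sketch of the merge described above: combine each student's
# (category, sentence) pairs, drop duplicates, sort by category.
from collections import defaultdict

def merge_training_data(per_student):
    merged = defaultdict(set)          # category -> unique sentences
    for sentences in per_student:
        for category, sentence in sentences:
            merged[category].add(sentence)
    lines = []
    for category in sorted(merged):    # sorted by category name
        for sentence in sorted(merged[category]):
            lines.append(f"{category} :: {sentence}")
    return "\n".join(lines)
```

The real merge.py writes its output to r2d2TrainingSentences.txt and uses its own line format; this sketch only shows the dedup-and-sort logic.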
- Right now, all sentences are fed into intent detection to be classified into one of 6 categories. After a category is determined, we have a specific parser for that category. In the future, we could split the existing 6 categories into more categories, which would reduce the amount of pattern matching within each category. But this would also require more accurate classification/intent detection, which may require a better representation of the sentence or a better way to generate a sentence embedding.
- Find a better representation of the sentence or a better way to generate a sentence embedding. Right now we split the sentence into words, get an embedding for each word, and take a component-wise average along each dimension. The results are very good, but as we increase the number of categories, this may not be sufficient.
- Add text-to-speech to give the robot a voice
- Add more pattern matching to existing parsers
- Add more training sentences
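The component-wise averaging described in the future-work notes can be sketched as follows. The toy word vectors here are illustrative stand-ins for the .magnitude embeddings:

```python
# Sketch of the sentence embedding described above: average the
# word vectors component-wise. Unknown words are skipped.
def sentence_embedding(sentence, vectors):
    words = sentence.lower().split()
    dims = len(next(iter(vectors.values())))
    total = [0.0] * dims
    n = 0
    for w in words:
        if w in vectors:
            for i, x in enumerate(vectors[w]):
                total[i] += x
            n += 1
    # Average along each dimension; all-zero if no word was known.
    return [x / n for x in total] if n else total

toy_vectors = {"go": [1.0, 0.0], "left": [0.0, 1.0]}
emb = sentence_embedding("Go left", toy_vectors)  # [0.5, 0.5]
```

In the project itself, the per-word vectors come from the .magnitude file rather than a hand-built dictionary.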