Block programming

Block server

In the Pepper connection part, we have implemented head gestures, eye gestures, text-to-speech, and also translation (movement) gestures. The implementation follows the UML diagram that we made during the last sprint.

The methods for moving are fairly general: you can define how fast you want to move Pepper along each axis, i.e. x, y and rotation.
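
As a reference, a minimal sketch of such a move call through the NAOqi Python SDK is shown below; the robot address is a placeholder and move_pepper is only an illustrative name, not necessarily the method in our implementation.

```python
import qi

# Sketch only: assumes the NAOqi Python SDK and a reachable robot.
session = qi.Session()
session.connect("tcp://<pepper-ip>:9559")   # placeholder address
motion = session.service("ALMotion")

def move_pepper(x_speed, y_speed, rotation_speed):
    """Move Pepper with normalized velocities in [-1, 1] along x, y and rotation."""
    motion.moveToward(x_speed, y_speed, rotation_speed)

move_pepper(0.5, 0.0, 0.2)   # forward at half speed while turning slightly
motion.stopMove()            # stop the ongoing movement
```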

The head gestures are also fairly general, but you can only define one direction per call. You can, however, call the function twice to make the head move up and to the right at the same time; the function itself only handles one direction at a time. There are also functions for shaking and nodding the head.
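
A sketch of the one-direction-per-call idea, again assuming NAOqi's ALMotion; move_head and its direction mapping are illustrative only.

```python
import qi

session = qi.Session()
session.connect("tcp://<pepper-ip>:9559")   # placeholder address
motion = session.service("ALMotion")

def move_head(direction, angle_rad, speed=0.2):
    """Move the head in one direction per call; call twice to combine, e.g. up + right."""
    joint = {"left": "HeadYaw", "right": "HeadYaw",
             "up": "HeadPitch", "down": "HeadPitch"}[direction]
    # Positive HeadYaw turns the head left, positive HeadPitch tilts it down.
    sign = 1.0 if direction in ("left", "down") else -1.0
    motion.setAngles(joint, sign * angle_rad, speed)

move_head("up", 0.3)
move_head("right", 0.4)   # second call: the head ends up looking up and to the right
```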

The text-to-speech is simple: we just send the text to Pepper, which then says the input text aloud.
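
For reference, the text-to-speech call boils down to a single line against NAOqi's ALTextToSpeech service (the address is a placeholder):

```python
import qi

session = qi.Session()
session.connect("tcp://<pepper-ip>:9559")   # placeholder address
tts = session.service("ALTextToSpeech")
tts.say("Hello, my name is Pepper!")        # Pepper speaks the given text
```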

The eye expressions are mostly done; we have not yet implemented any functionality for ear and shoulder expressions. The expressions that exist are (a short usage sketch follows the list):

  • rotate eyes, which makes a color rotate around the eyes.
  • fade eyes, which fades the eye color to another color.
  • angry eyes, which makes part of the eyes red and the upper part black.
  • sad eyes, which makes the eyes blue.
  • blink eyes, which makes the eyes blink.
  • squint eyes, which makes the eyes squint for a duration of time.
  • random eyes, which makes the eyes show random colors for a duration of time.
  • wink eye, which makes one eye wink.

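As a rough illustration of how a couple of these expressions can be driven through NAOqi's ALLeds service (the colors and durations below are made up for the example):

```python
import qi

session = qi.Session()
session.connect("tcp://<pepper-ip>:9559")   # placeholder address
leds = session.service("ALLeds")

def sad_eyes(duration=1.0):
    """Fade both eyes to blue, roughly matching the 'sad eyes' expression above."""
    leds.fadeRGB("FaceLeds", 0x000080, duration)   # 0x000080 = dark blue

def rotating_eyes(total_duration=3.0):
    """Rotate a green color around the eyes for a given total duration."""
    leds.rotateEyes(0x00FF00, 0.5, total_duration) # 0.5 s per rotation

sad_eyes()
rotating_eyes()
```
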
What we have left to do is arm and hand gestures; we also need to implement a way to retrieve data from the Blockly side.

Blockly

The issues with generating Python 2.7 code with the built-in code generator in Blockly are now resolved. Furthermore, the ability to send the generated code to the Block server over HTTP has been added. Blocks for each of the expressions mentioned above under Block server have been implemented.
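
The report does not specify how the Block server receives this HTTP request, so the following is only a hypothetical sketch of the receiving side; Flask and the /run-code route are assumptions for illustration, not the project's actual implementation.

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/run-code", methods=["POST"])
def run_code():
    # Hypothetical endpoint: the request body is assumed to be the Python 2.7
    # code that Blockly generated and sent over HTTP.
    generated_code = request.get_data(as_text=True)
    print("Received %d characters of generated code" % len(generated_code))
    # The Block server would interpret/execute this code against Pepper here.
    return "ok"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)   # placeholder host/port
```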

There was also a dependency issue between the NPM package of Blockly and some extensions used for creating prettier blocks. This was resolved by changing the versions of some of the packages, so the Blockly site works as intended again.

In the upcoming sprint, we will focus on adding all the basic syntax blocks from Blockly's library to the toolbox (the part of the Blockly editor where all available blocks are listed). Since the target group consists almost exclusively of Swedish-speaking children with very limited knowledge of English, we will translate all blocks from English, which is what they are in right now, to Swedish. It could be smart to offer the option to choose between Swedish and English, in which case both implementations are needed.

Lastly, we will keep adding new blocks as the Block server group defines more functionalities.

Web interaction

The work during sprint 1 led to a working ask_wikipedia function. It was implemented as expected: we record a short question phrase, transcribe that audio file into text, and use the text as input to Wikipedia. The main development hurdle was managing the audio file and the transcription. There were some initial permission challenges with accessing the audio file, as well as design questions such as how long Pepper should listen.

The permission challenges were resolved by connecting to Pepper via SSH. Some of the libraries currently used to help with this may become deprecated along with Python 2.7, which might have to be addressed in the future. For the time being it works, and the group may later agree on a standard way of connecting.

The question of how long to listen poses a bigger problem. The current solution is a hard-coded sleep, meaning that after the ask_wikipedia function is called, the user has a fixed number of seconds to ask a question (currently 3 seconds). This solution is not good enough. Pepper has built-in functions to aid with speech detection, and using those is something to implement in the future.
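
To illustrate the current behaviour, the fixed listening window roughly amounts to the following; ALAudioRecorder is NAOqi's recorder module, while transcribe stands in for whatever speech-to-text step is used and is purely hypothetical here.

```python
import time
import qi

session = qi.Session()
session.connect("tcp://<pepper-ip>:9559")          # placeholder address
recorder = session.service("ALAudioRecorder")

def record_question(path="/home/nao/question.wav", listen_seconds=3):
    """Record the question during a fixed window (the hard-coded sleep described above)."""
    # 16 kHz wav from a single microphone channel; the file is stored on the robot
    # and fetched over SSH as described earlier.
    recorder.startMicrophonesRecording(path, "wav", 16000, (0, 0, 1, 0))
    time.sleep(listen_seconds)                     # the user has 3 seconds to ask
    recorder.stopMicrophonesRecording()
    return path

audio_path = record_question()
# question_text = transcribe(audio_path)           # hypothetical speech-to-text step
```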

The wikipedia package we chose turned out great. It provides well-structured information for Pepper to use when answering.
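
The lookup itself is only a couple of lines with the wikipedia package; setting the language to Swedish is an assumption based on the target group.

```python
import wikipedia

wikipedia.set_lang("sv")   # assumption: Swedish answers for the target group

def lookup(question_text):
    """Return a short, well-structured summary for Pepper to read aloud."""
    return wikipedia.summary(question_text, sentences=2)

print(lookup("Alan Turing"))
```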

Rock-paper-scissors

During this sprint we managed to show images on Pepper’s tablet and, together with members of the Blockly group, found out how to open the settings menu on Pepper’s tablet. This menu is Android-based and lets you change settings such as the tablet’s screen brightness and network connection. In addition, we wrote documentation on how to open these settings.
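
One way to show an image on the tablet is through NAOqi's ALTabletService; a minimal sketch, where the address and image URL are placeholders and the URL must be reachable from the tablet:

```python
import qi

session = qi.Session()
session.connect("tcp://<pepper-ip>:9559")   # placeholder address
tablet = session.service("ALTabletService")

# The tablet downloads the image itself, so the URL must be reachable from it.
tablet.showImage("http://<image-host>/rock_paper_scissors.png")   # placeholder URL
# tablet.hideImage()                        # hide the image again when done
```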

The main thing we focused on this sprint was enabling Pepper to take pictures, which we managed to do. However, in order to get meaningful pictures that can be sent to the image recognition software, we need to add support for recognizing and tracking the current player's hands, which we have started to look into and will work more on in the next sprint.
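
One way to take a picture with Pepper's camera is NAOqi's ALPhotoCapture; the sketch below saves a frame on the robot (folder and file name are placeholders):

```python
import qi

session = qi.Session()
session.connect("tcp://<pepper-ip>:9559")     # placeholder address
camera = session.service("ALPhotoCapture")

camera.setResolution(2)                       # 2 = VGA (640x480)
camera.setPictureFormat("jpg")
# The picture is stored on the robot and can then be fetched for image recognition.
camera.takePicture("/home/nao/recordings/cameras/", "rps_frame")
```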

We have also installed a linter and a formatter for PEP 8, which the entire team decided should be the standard in the project.

Image recognition

The image recognition part of the sprint has mainly consisted of working with the TensorFlow dataset and trying to get a model working so that it starts making correct predictions. A couple of models have been trained and, with the help of webcams, tested to check the correctness of their predictions. The image recognition group will at some point be able to use Pepper’s camera, but using an external camera makes testing easier for now. The model is far from done and needs many more iterations, which has been a problem since the group still has not gotten access to a GPU cluster; this makes it more troublesome to train models, especially models with hundreds of millions of trainable parameters.
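
The report does not describe the exact architecture, so the following is only a toy illustration of the kind of classifier being iterated on; it assumes that "the TensorFlow dataset" refers to the rock_paper_scissors dataset in TensorFlow Datasets (300x300 RGB images, three classes).

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# Toy illustration only; the group's actual models and data pipeline are not
# described in this report.
train_ds = tfds.load("rock_paper_scissors", split="train", as_supervised=True)
train_ds = train_ds.map(lambda img, label: (tf.cast(img, tf.float32) / 255.0, label)).batch(32)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # rock / paper / scissors
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=1)   # a single pass just to sanity-check the pipeline
```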

Another part of the image recognition group has been trying to implement a hand recognizer from an open-source project called MediaPipe. Most of the time has been spent reading the documentation to get a better understanding of the implementation.
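
A minimal example of MediaPipe's hand solution running on a webcam feed, which is roughly what the documentation walks through (OpenCV is assumed here for the camera capture):

```python
import cv2
import mediapipe as mp

# Minimal MediaPipe Hands sketch on a webcam feed; press q in the window to quit.
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)
capture = cv2.VideoCapture(0)

while capture.isOpened():
    ok, frame = capture.read()
    if not ok:
        break
    # MediaPipe expects RGB images, while OpenCV delivers BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        print("Hand detected with %d landmarks" % len(results.multi_hand_landmarks[0].landmark))
    cv2.imshow("hand tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

capture.release()
hands.close()
cv2.destroyAllWindows()
```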

The trained model is far from perfect and needs more work, which will hopefully go faster once the GPU cluster is available. The open-source project used for hand recognition will be integrated and tested shortly to determine whether we can use MediaPipe or not.