C-BISCUIT: Monitoring Your Robot’s Health
June 10, 2016 by Trevor Gamblin
Now that we've got the Wandboard up and running, it's time to set up some tools to help us check in on our growing robot's status.
In the last article, we went through the basics of getting Ubuntu and ROS installed on the Wandboard. This time around, we'll tweak the system setup further so that it's easy to modify and build upon as we need more specific information from the system.
- C-BISCUIT: A Robotics Platform for the Hacker and Hobbyist
- C-BISCUIT: Design Choices and Justification
- C-BISCUIT Power: 5V 3A Buck Regulator for Wandboard
- C-BISCUIT Power: Crowbar Protection Circuit for 5V Regulator
- C-BISCUIT: The Brains of the Operation
- C-BISCUIT Power: Assembly and Testing of Regulator and Crowbar Circuits
- C-BISCUIT: Monitoring Your Robot’s Health
- C-BISCUIT: Robot System Architecture
- C-BISCUIT: Schematic Design for the RCB—Microcontroller, Motor Controller
- C-BISCUIT: Schematic Design for the RCB—Power, Stepper
- C-BISCUIT: Layout and Assembly for the Robot Control Board
- C-BISCUIT: System Integration and Testing
Unlike in The Brains of the Operation article, where part of the content was a condensed and tweaked version of the installation instructions for Ubuntu and ROS, it'd be unfair (and very unnecessary) to recreate the extensive ROS Tutorials available on the ROS wiki. Ideally, you should work through these before continuing here, going at least as far as the Publisher/Subscriber tutorial in Python. Once you've done that, you'll be ready for the next few articles concerning the software stack.
Finally, since this article will focus heavily on some bash scripting, it's recommended that you get a brief overview if you aren't familiar with the command line—I recommend LinuxCommand.
A Note about Wi-Fi Issues and Updates
If you take a look around the internet, you'll find that some users have had trouble with Wi-Fi connectivity with the Wandboard.
While we haven't encountered that exact situation thus far, our testing did turn up some trouble when updating the board's software with apt-get: on occasion, the system would freeze while installing a particular package. The simplest workaround seems to be to restart the system and install that package individually before trying the update and upgrade commands again.
If you're experiencing problems with either the Wi-Fi or updating, be sure to comment below and tell us what you're seeing so we can better investigate.
Information During Login via SSH
The first thing we'll do is have the Wandboard tell us what is and isn't running when we log in. To that end, we'll edit the .bashrc file again, adding a new section that sources a new script called .ros_status_script.
It's important that these modifications are made on the Wandboard itself. While you could edit your ssh configuration files to do this, it'd become a hassle if you were logging in from different machines or environments. Once the changes are done, every login will provide you with this basic set of info. A complete copy of the script in text form can be found at our GitHub page.
Below I have included a picture of the first iteration of the basic checklist script. Take a look at it, and then continue below to get a short explanation of what's happening:
There's actually not all that much going on in this one. The checks in this script involve a lot of repetition, so it's a good idea to write some short functions that make it easy to run them over and over. The first section features variables containing the output of a command called tput, which unfortunately has a rather vague manual page. We're using it to color certain parts of the script's output green or red so that key information is easier to find.
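As a rough sketch of what those color variables might look like: the names green and red come from the text above, while reset and the colorize helper are assumptions added here for illustration. The fallback to an empty string simply leaves the output uncolored when tput has no terminal description to work with.

```shell
# Capture terminal color codes once so they can be reused in every message.
# If tput cannot determine the terminal type, the variables stay empty and
# the output is simply uncolored.
green=$(tput setaf 2 2>/dev/null || true)   # color 2 = green
red=$(tput setaf 1 2>/dev/null || true)     # color 1 = red
reset=$(tput sgr0 2>/dev/null || true)      # assumption: restore default attributes

# Hypothetical helper: print $2 wrapped in the color code given as $1.
colorize() {
    printf '%s%s%s\n' "$1" "$2" "$reset"
}

colorize "$green" "Online"
colorize "$red" "Offline"
```

Without a reset code, the color would carry over into whatever the terminal prints next, which is why the sketch ends each message with one.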
The first of the functions we encounter, prog_check, is the core part of this script—it uses the output of ps cax, which provides a list of active programs and PIDs, and searches its output with grep for a keyword that we specify (hence the $1, meaning to take the first argument following the function name when we call it). If the call to grep finds something, it'll trigger the following if-else part to output a line containing the program's name, with "Online" in green next to it; otherwise, it prints "Offline" in red.
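The description above could be sketched roughly as follows. This is an approximation written from the text, not a copy of the script; the reset variable and the exact message layout are assumptions.

```shell
# Color codes (empty if tput has no terminal description to use).
green=$(tput setaf 2 2>/dev/null || true)
red=$(tput setaf 1 2>/dev/null || true)
reset=$(tput sgr0 2>/dev/null || true)   # assumption: not named in the article

# prog_check KEYWORD
# Report whether any running process name contains KEYWORD.
prog_check() {
    # ps cax lists every process by command name; grep -q prints nothing and
    # only sets the exit status, which the if statement then tests.
    if ps cax | grep -q -- "$1"; then
        printf '%s: %sOnline%s\n' "$1" "$green" "$reset"
    else
        printf '%s: %sOffline%s\n' "$1" "$red" "$reset"
    fi
}

prog_check roscore
```

Because ps cax shows only the command name and not its arguments, the grep process doing the search does not match its own keyword.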
The second function, ssh_check, is a simpler function that should help curb having too many remote logins running at once (something I'm definitely guilty of). The variable sshlist uses who to tell us what sessions are active, and the pipe to grep -v localhost makes sure only the results that don't match localhost are put into the variable. The next two variables use wc -l to get the number of lines returned (and therefore the number of sessions), and some basic shell arithmetic to provide a count of the other sessions running, and then we print it all out as before.
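A minimal sketch of that logic follows. Only sshlist is named in the text; sshcount and othercount are hypothetical names. The text counts lines with wc -l, whereas this sketch swaps in grep -c . to count non-empty lines, so an empty session list is reported as 0 rather than 1.

```shell
# ssh_check
# List login sessions that are not from localhost and count the other ones.
ssh_check() {
    # who prints one line per login session; drop any that mention localhost.
    # "|| true" keeps the exit status clean when grep finds nothing to print.
    sshlist=$(who | grep -v localhost || true)
    # Count the non-empty lines, one per session.
    sshcount=$(printf '%s\n' "$sshlist" | grep -c . || true)
    # Every session except the one we are using right now.
    if [ "$sshcount" -gt 0 ]; then
        othercount=$((sshcount - 1))
    else
        othercount=0
    fi
    printf 'Other remote sessions: %s\n' "$othercount"
    if [ "$sshcount" -gt 0 ]; then
        printf '%s\n' "$sshlist"
    fi
}

ssh_check
```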
Finally, there's the run section. You can see that I've currently set it up so that we're checking for the turtlesim nodes that were part of the earlier ROS tutorials. You could do the same for any other keywords that matter to you, such as the Listener and Talker nodes from the ROS Publisher/Subscriber tutorial.
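One way the run section might be laid out is as a loop over keywords. The run_checks wrapper and the keyword list are assumptions, and the short fallback prog_check exists only so the snippet runs on its own.

```shell
# Fallback so this snippet runs on its own; in the real script the full
# prog_check with colored output would already be defined above this point.
if ! command -v prog_check >/dev/null 2>&1; then
    prog_check() {
        if ps cax | grep -q -- "$1"; then
            echo "$1: Online"
        else
            echo "$1: Offline"
        fi
    }
fi

# Hypothetical wrapper for the run section: one check per keyword.
# On Linux, ps cax typically truncates command names to 15 characters, so the
# 17-character turtle_teleop_key is matched here by a shorter keyword.
run_checks() {
    for node in roscore turtlesim_node turtle_teleop; do
        prog_check "$node"
    done
}

run_checks
```

Adding another program to the report is then a matter of appending its keyword to the list.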
By creating this file on your own Wandboard and adding the line "source .ros_status_script" to the .bashrc file in your Wandboard's home directory, you'll have the script run every time you log in or open a new shell. Below is another screenshot, showing the output in another terminal once roscore, turtlesim_node, and turtle_teleop_key are all running (along with 7 total SSH sessions!).
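If you'd rather not edit .bashrc by hand, the line can be appended from the command line. The add_source_line helper below is hypothetical; it only adds the line when it isn't already present, so running it twice is harmless.

```shell
# add_source_line FILE
# Hypothetical helper: append the source line to FILE unless it is already
# there, so repeated runs do not duplicate the entry.
add_source_line() {
    line='source .ros_status_script'
    # -q quiet, -x match the whole line, -F treat the pattern as plain text
    if ! grep -qxF -- "$line" "$1" 2>/dev/null; then
        printf '%s\n' "$line" >> "$1"
    fi
}

# Usage on the Wandboard (not run here):
#   add_source_line "$HOME/.bashrc"
```

Note that the relative path in the source line resolves against the directory the shell starts in, which is the home directory for a normal login.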
The ssh counting might seem a little weird, especially if you note that the script tells you only about other sessions, but lists all of them (including the one you just logged in on). I wrote this under the assumption that when you get going, your remote logins might all be coming from different places (and hence different IPs), which should make it clearer and easier to determine which one is yours.
Extra System Info (and Prettying It Up)
That's a good start, but it'd probably help if we also got a bit of information about the system's free space, memory usage, and CPU load when we logged in. Also, I forgot to color the ssh_check function's output, which isn't necessary, but it'd definitely be a nice touch. Let's do that now.
First off, let's add some color to the ssh_check function. This time, we'll make the numbering cyan. You'll want to add the line cyan=`tput setaf 6` near the variables red and green, and add cyan color indicators to the print statements in ssh_check, as you can see below:
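In text form, the change might look something like this. The print_session_count function is a hypothetical stand-in for the print statement inside ssh_check, and reset is an assumed variable that restores the default color.

```shell
cyan=$(tput setaf 6 2>/dev/null || true)    # color 6 = cyan
reset=$(tput sgr0 2>/dev/null || true)      # assumption: restore default attributes

# Hypothetical stand-in for the print statement inside ssh_check;
# $1 is the number of other sessions.
print_session_count() {
    printf 'Other remote sessions: %s%s%s\n' "$cyan" "$1" "$reset"
}

print_session_count 6
```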
If you now save and close the file, then type source .ros_status_script, you'll see the changed output with the number of ssh connections colored.
We'll be using awk to help us format some output from the commands free (memory), df (disk space), and top (various statistics, but shows CPU load). There's a great Stack Exchange answer that has good formatting for what we want to do, so we'll use the examples from there (with slight modifications to the call to df so that we can see a higher precision output). Let's add those to a function so that if we want to make any changes, we only have to make them in one place.
Below you can see a new function I've created, called system_check:
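A sketch of such a function is shown here, under the assumption that it follows the widely used free/df/top one-liners. The three format_ helpers are hypothetical names, split out so the awk formatting can be tried on sample input, and the exact output wording is a guess.

```shell
# Formatting helpers (hypothetical names): each reads command output on stdin.
format_mem() {
    # Second line of `free -m` is the "Mem:" row: $2 = total, $3 = used (MB).
    awk 'NR==2 { printf "Memory Usage: %s/%sMB (%.2f%%)\n", $3, $2, $3*100/$2 }'
}

format_disk() {
    # Second line of `df -hP /` describes the root filesystem:
    # $2 = size, $3 = used, $5 = percentage used.
    awk 'NR==2 { printf "Disk Usage: %s/%s (%s)\n", $3, $2, $5 }'
}

format_load() {
    # The top summary line ends with the 1-, 5-, and 15-minute load averages;
    # $(NF-2) is the 1-minute figure (awk ignores its trailing comma).
    awk '/load average/ { printf "CPU Load: %.2f\n", $(NF-2); exit }'
}

# All three reports in one place, so later changes happen in one function.
system_check() {
    free -m  | format_mem
    df -hP / | format_disk
    top -bn1 | format_load
}
```

Calling system_check from the run section of the script would print the three lines at every login.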
Once you save that file, you can once again use source .ros_status_script to test it out, or just try logging in via ssh again.
As a final addition to our monitoring script, it'd be a good idea to be able to tell what is connected to the robot—whether it's a real device or not. This won't be as useful right now, but once it's rolling around connected to the Wi-Fi, it'll be handy to know how all of the extras that are connected are doing, too.
There are a few good ways to do the device check. Since the output of ls /dev should be relatively small, we'll use grep once again, comparing a baseline listing of the /dev directory against a fresh listing taken when things should be connected.
The first thing we need to do is head to our home directory (if not already there). From there, type ls /dev > default_devices.txt, and you'll see that you've created a file with the /dev directory's content within. Now we need to edit our .ros_status_script file once more. Below I've included a screenshot of a function I've added called devices_check:
This one's probably the most complicated out of all of the functions we've done. In it, we capture the current list of files in /dev by redirecting ls's output to a temporary text file, currentdevs.txt. ls is a tricky tool that doesn't do well when you put its output directly into a variable, so in order to be able to grep said output, we'll put it in a file as text first. At the end, you can see that this text file is deleted.
The newdevs variable stores the results of a call to grep that checks for all lines that are present in currentdevs.txt but not default_devices.txt.
devcount is simply a count of the results (note that when it comes to mounting filesystems, we'll often catch multiple partitions from the same device).
Finally, we print the count with some easy-to-see coloring and then, if there were any results, we list them. Try connecting a USB stick or gaming controller and see what happens when you run it!
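The steps above might be sketched as follows. The names newdevs and devcount come from the text; the two optional parameters, the use of mktemp in place of a fixed currentdevs.txt file, and the message wording are assumptions made so the function can be tried safely on a test directory.

```shell
cyan=$(tput setaf 6 2>/dev/null || true)
reset=$(tput sgr0 2>/dev/null || true)   # assumption: restore default attributes

# devices_check [BASELINE] [DIR]
# BASELINE defaults to ~/default_devices.txt and DIR to /dev; both parameters
# are assumptions added for testing.
devices_check() {
    baseline=${1:-$HOME/default_devices.txt}
    devdir=${2:-/dev}
    if [ ! -f "$baseline" ]; then
        echo "Baseline file $baseline not found" >&2
        return 1
    fi
    # Snapshot the current listing in a temporary file.
    currentdevs=$(mktemp) || return 1
    ls "$devdir" > "$currentdevs"
    # Keep only the snapshot lines that match no line of the baseline:
    # -v invert, -x whole-line match, -F plain strings, -f patterns from file.
    newdevs=$(grep -vxFf "$baseline" "$currentdevs" || true)
    rm -f "$currentdevs"
    # Count the non-empty lines, one per new device node.
    devcount=$(printf '%s\n' "$newdevs" | grep -c . || true)
    printf 'New devices: %s%s%s\n' "$cyan" "$devcount" "$reset"
    if [ "$devcount" -gt 0 ]; then
        printf '%s\n' "$newdevs"
    fi
}

# Usage on the Wandboard, after creating the baseline with
# ls /dev > default_devices.txt in the home directory:
#   devices_check
```

Matching whole lines as plain strings avoids false hits when one device name is a prefix of another, such as sda and sda1.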
That pretty much covers the basics of this script designed to tell you what's up when you connect to the robot. As we get into more complicated parts of the software side we'll continue to tweak and improve it, but if you've got ideas of your own, then by all means add them.
In our next few articles, we'll be doing a mix of hardware and software, including putting Digi-Key's Scheme-it tool to the test for the I/O board we'll be creating.
Next Article in Series: C-BISCUIT: Robot System Architecture