Robobot Raspberry strange error
Back to Robobot
Error
The demo behaviour (pressing the green button) fails, typically after a few (< 10) successful runs.
It could be that some commands (e.g., servo or drive commands) do not run. After some time, nothing runs, and the Python script complains that there is no 'HBT data after 31 runs'. like (from the robobot/mqtt_python/log_out.txt):
% BCM GPIO found OK % Starting % Connected to MQTT Broker localhost on 1883 % ConnectedOut to MQTT Broker localhost on 1883 % GPIO setup start % GPIO setup finished % I am now accepted as master of robot unknown % Robot (srobot.py) got no HBT data after 31 loops (stops). <---- here % Is teensy_interface running? % Camera thread stopped <---- shut down % Edge (sedge.py):: turns on line sensor % Service - thread stopped ...
What happened
The teensy_interface stalls, and this is visible from the 'log_t0_teensy_io.txt'
At some point, the teensy_interface detects that time is not continuous. This is reported as a 'time glitch', which can happen if the system time is updated by the Network Time Protocol (NTP), but that is not the case here.
The interesting part of the logfile typically looks like this:
1769677836.0802 Rx ;67pose 3403.6880 0.424 -0.002 -0.0046 3.1388 <= normal, 1769677836.0813 Rx ;73livn 533 528 539 540 568 581 568 572 5 <= a message every ~1ms 1769677836.0824 Rx ;56ird 1.500 0.349 1723 6516 1 1769677836.0846 Rx ;68liv 306 404 460 453 450 442 413 305 25 1769677836.0857 Rx ;66pose 3403.6931 0.425 -0.002 -0.0046 3.1389 1769677836.0857 Rx ;03vel 3403.6941 0.249 0.251 0.0093 0.249 4 1769677836.0865 Txd ;33motv 1.99 2.13 1769677836.0868 Rx ;79gyro 0.209869 0.141235 -0.005859 3403.694 1769677836.0869 Rx ;43acc -0.005127 -0.004150 1.263184 3403.695 1769677836.0901 Rx ;71pose 3403.6980 0.426 -0.002 -0.0046 3.1389 <= Last OK message. 1769677839.2960 Txd ;63leds 14 0 45 0 <= Back after 3.2 sec. 1769677839.2961 NTP time glitch of 3.206 sec, maybe an NTP update <= Detected time glitch. 1769677839.2966 Txd ;35alive 1769677839.2971 Rx ;36mot 2 2.1 0.1 0.1 0 <= start receiving queued 1769677839.2972 Rx ;62livn 530 515 541 549 565 587 564 540 5 messages from Teensy 1769677839.2973 Rx ;11mca 0.081 0.074 16 Processing a message 1769677839.2973 Rx ;38vel 3403.7019 0.251 0.247 -0.0185 0.251 4 every ~0.1 ms 1769677839.2974 Rx ;61pose 3403.7031 0.428 -0.002 -0.0046 3.1389 1769677839.2975 Rx ;70gyro -1.224457 -0.743774 -0.036377 3403.706 1769677839.2976 Rx ;91acc 0.001221 -0.024414 1.346924 3403.707 ... 1769677839.2985 Rx ;66pose 3403.7229 0.433 -0.002 -0.0046 3.1389 1769677839.2986 Rx ;87vel 3403.7261 0.245 0.252 0.0305 0.245 4 1769677839.2987 Rx ;69pose 3403.7278 0.434 -0.002 -0.0035 3.1389 1769677839.2987 Rx ;79livn 516 528 555 565 579 580 547 529 5 1769677839.2988 Txd ;34motv 1.99 2.14 1769677839.2988 Rx ;17gyro -0.553070 -0.377563 0.177246 3403.730 1769677839.2989 Rx ;35acc -0.013184 -0.034180 1.329102 3403.731 1769677839.2990 Rx ;36mot 2 2.1 0.1 0.1 0 1769677839.2990 Rx ;17mca 0.077 0.083 17 1769677839.2991 Rx ;68pose 3403.7329 0.435 -0.002 -0.0046 3.1388 ... 1769677839.3050 Rx ;03gyro -0.095306 -0.011353 0.024658 3403.850 1769677839.3052 Rx ;20acc -0.013916 0.023438 1.248779 3403.851 1769677839.3053 Rx ;30vel 3403.8540 0.250 0.245 -0.0181 0.250 4 1769677839.3053 Rx ;65pose 3403.8540 0.465 -0.002 -0.0035 3.1388 1769677839.3054 Rx ;83pose 3403.8589 0.467 -0.002 -0.0046 3.1389 1769677839.3055 Rx ;72livn 524 528 551 556 561 585 569 545 5 1769677839.3056 Txd ;18motv 2.02 2.13 1769677839.3056 Rx ;29vel 3403.8621 0.252 0.250 -0.0090 0.252 4 <= last message in queue 1769677839.3067 Txd ;34motv 1.99 2.14 After this, no regular 1769677839.3266 Txd ;64leds 14 0 55 0 data from Teensy, just 1769677839.4115 Qu 1 ;39!servo 1 -800 1000 reply to commands 1769677839.4123 Tx ;39!servo 1 -800 1000 1769677839.4134 Rx ;29confirm !servo 1 -800 1000 1769677839.4134 Rx ;41# setvo t=3407321, 0, ref=-800, e=-90000, value=9000, vel=1000, dw=-1000, v=2163 1769677839.4157 Rx ;46# setvo t=3407323, 0, ref=-800, e=-89000, value=8000, vel=1000, dw=-1000, v=2150 ... 1769677839.4444 Rx ;98# setvo t=3407351, 0, ref=-800, e=-75000, value=-6000, vel=1000, dw=-1000, v=1965 1769677839.4458 Rx ;95# setvo t=3407353, 0, ref=-800, e=-74000, value=-7000, vel=1000, dw=-1000, v=1951 1769677839.4474 Txd ;65leds 14 0 65 0 ... 1769677839.6890 Txd ;67leds 14 0 85 0 1769677839.8098 Txd ;68leds 14 0 95 0 1769677839.9306 Txd ;09leds 14 0 105 0 1769677840.0514 Txd ;68leds 14 0 95 0 1769677840.1730 Txd ;67leds 14 0 85 0
The assessment
After some time in the time glitch, the Teensy USB interface is informed that there is no space for more messages. The Teensy reacts by removing all subscribed message types, including heartbeat (HBT).
The Python program will not run if there is no heartbeat from the Teensy.
Short-time glitches (less than 2 seconds) will not stop the Teensy, but they can be fatal for a mission.
The cause is unknown, but it seems like the operating system is stopping the teensy_interface (and probably the Python app) for this number of seconds.
The conclusion
The cure was to replace the SD card with a fresh image from another robot.