June 18, 2026
As I discussed in a previous post, robotic foundation models are designed to take camera inputs and output state deltas. In the simplest case, you can think about a robotic arm. An arm’s end-effector pose can be defined by seven values: x, y, z coordinates, roll, pitch, yaw, and gripper width. The model takes in camera and state inputs and outputs a delta for the arm’s position. The robot moves, and the loop restarts.
Under the hood, the arm’s motors need to adjust to the correct position and then hold it. This requires an intricate series of magnets, electric currents, sensors, and microcontrollers to faithfully execute the commands. To really understand the robotic control loop, we must start at the lowest level and work our way up.
Robots move because of motors, and motors move because of the Lorentz force. The idea here is that you have a magnetic field, and a coil with current running through it. The force on the coil is perpendicular to both the magnetic field and the current, and its strength scales with the sine of the angle between them. It is strongest when the current and field are perpendicular, and if they run parallel, no force (and therefore no torque) is generated on the motor.
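As a rough numerical illustration of that angle dependence, the magnitude of the force on a straight wire is F = I * L * B * sin(theta). The function name and the numbers below are made up for illustration and do not describe any particular motor.

```python
import math

def lorentz_force(current_a, length_m, field_t, angle_rad):
    """Magnitude of the force on a straight wire: F = I * L * B * sin(theta)."""
    return current_a * length_m * field_t * math.sin(angle_rad)

# Hypothetical values: 2 A through 0.1 m of wire in a 0.5 T field.
perpendicular = lorentz_force(2.0, 0.1, 0.5, math.pi / 2)  # current at 90 degrees to the field
parallel = lorentz_force(2.0, 0.1, 0.5, 0.0)               # current along the field

print(perpendicular, parallel)
```

The perpendicular case gives the full I * L * B force, while the parallel case gives none, which is exactly the dead spot the commutator and, later, the software-driven coils are there to avoid.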
In the mid 1800s, following the discovery of electromagnetism, the brushed motor was invented. The central element of the design is a current that runs through the center of the motor with magnets attached on the outside. At the start the current is flowing out and generates a clockwise rotational force. However, once it has rotated 180 degrees the rotational force is in the opposite direction. So a special switch called a commutator was invented that reverses the direction of the current once it has rotated past the switch. By flipping a switch twice per rotation, the motor can rotate uninterrupted.
The flaw with this approach is that these motors need to spin absurdly fast for long periods of time, slowly wearing down the physical brushes. In the 1960s, a few breakthroughs made it possible to build a new type of motor that avoids this problem, the brushless motor. The brushless motor became extremely popular in long-running, high-end systems such as robotics.
The key insight of a brushless motor is that we can push this switch functionality to the software layer. The brushless motor inverts the coils and the magnet, putting the magnet on the inside (the rotor, which spins) and the coils around the outside (the stator, which stays still). Then the system drives three sets of coils with separate currents, modulating them together so the magnetic field rotates. It uses sensors to smooth the handoff and accelerate/decelerate the motor. This system only became viable once microcontrollers were small enough to fit inside a motor.
To understand how the strength of the torque is modulated, we turn to the innermost control loop of the motor.
The torque loop runs at 10-20 kHz (10,000-20,000 times per second). The system takes a desired torque from the higher level system and controls the electric currents in the motor to effect the desired torque.
The torque loop relies on two different types of sensors: the current sensors, which measure the current through the coils, and the position sensor, which measures the rotor angle so the controller knows how much torque a given current will put on the magnet (the torque is proportional to the sine of the angle between the fields). In some weight-sensitive motors, the current sensors are dropped and the values are estimated in software from other feedback data, although this is uncommon in precision robotics.
Brushless motors rely on three sets of currents that run through the outer coils. Managing three separate currents to keep the motor rotating might seem complex, but a pair of mathematical transformations called the Clarke and Park transforms makes it tractable and enables Field Oriented Control. The key idea is to project the three phase currents onto two axes that rotate with the rotor, one parallel to its magnetic field and one perpendicular to it. If we call the three currents Ia, Ib, and Ic, we are transforming them into two values Id and Iq. The direct-axis current (Id) is aligned with the rotor's magnetic field, produces no torque, and should be driven to zero. The quadrature current (Iq) is perpendicular to the field and is therefore proportional to the torque.
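Here is a minimal sketch of the two transforms, assuming the common amplitude-invariant form of the Clarke transform and an electrical rotor angle theta measured from phase A. Scaling and sign conventions differ between references and drives, and the helper balanced_currents is only there to generate test input.

```python
import math

def clarke(ia, ib, ic):
    """Three phase currents -> two stationary-frame components (alpha, beta)."""
    alpha = (2.0 / 3.0) * (ia - 0.5 * ib - 0.5 * ic)
    beta = (ib - ic) / math.sqrt(3.0)
    return alpha, beta

def park(alpha, beta, theta):
    """Rotate (alpha, beta) by the rotor's electrical angle into (Id, Iq)."""
    i_d = alpha * math.cos(theta) + beta * math.sin(theta)
    i_q = -alpha * math.sin(theta) + beta * math.cos(theta)
    return i_d, i_q

def balanced_currents(amplitude, phase):
    """Hypothetical helper: ideal three-phase currents spaced 120 degrees apart."""
    step = 2.0 * math.pi / 3.0
    return (amplitude * math.cos(phase),
            amplitude * math.cos(phase - step),
            amplitude * math.cos(phase + step))

# Currents placed 90 degrees ahead of the rotor angle: all torque, no wasted Id.
theta = 0.7
ia, ib, ic = balanced_currents(3.0, theta + math.pi / 2)
i_d, i_q = park(*clarke(ia, ib, ic), theta)
print(round(i_d, 6), round(i_q, 6))
```

With the currents leading the rotor by exactly a quarter turn, Id comes out at zero and Iq equals the full current amplitude, which is the operating point the controller is trying to hold.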
A quick aside on why current maps so cleanly onto torque. Torque is the torque constant (Kt) times the quadrature current. The same constant runs in reverse. As the rotor spins, it generates a reverse voltage (back-EMF) that grows with speed. Once the back-EMF approaches the supply voltage, no more current can be pushed through the coils, which gives the motor an overall top speed. The back-EMF can also be used to measure the velocity.
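To make the two directions of that constant concrete, here is a small sketch for an idealized motor with no resistance or friction losses. The value of KT and the 24 V supply are made-up numbers, and the sketch assumes SI units, where the torque constant and the back-EMF constant share the same numerical value.

```python
KT = 0.1  # hypothetical torque constant in N*m/A; in SI units also the back-EMF constant in V*s/rad

def torque_from_current(iq_a):
    """Forward direction: quadrature current produces torque."""
    return KT * iq_a

def back_emf(omega_rad_s):
    """Reverse direction: rotor speed produces a voltage opposing the drive."""
    return KT * omega_rad_s

def no_load_speed(supply_v):
    """Idealized top speed, where back-EMF has grown to equal the supply voltage."""
    return supply_v / KT

def speed_from_back_emf(measured_v):
    """Reading the back-EMF gives an estimate of the rotor speed."""
    return measured_v / KT

print(torque_from_current(5.0), no_load_speed(24.0))
```

A real motor tops out somewhat below this idealized speed, since part of the supply voltage is always spent on the winding resistance.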
Once we have transformed the current into a value that is proportional to torque, we can construct a controller to modulate the value. We measure Iq through the current sensors and the transforms and compare it to the target value. Then we correct it using a PI controller, which reacts to the instantaneous error (the proportional term) and to how that error accumulates over time (the integral term), updating at the frequency of the system (10-20 kHz).
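A minimal sketch of such a PI controller, driving a simulated winding modeled as a resistor and inductor in series. The resistance, inductance, gains, 24 V limit, and the simple anti-windup rule are all assumptions chosen for illustration, and back-EMF is ignored (think of the rotor as held still).

```python
import math

class PIController:
    def __init__(self, kp, ki, dt, out_limit):
        self.kp, self.ki, self.dt, self.out_limit = kp, ki, dt, out_limit
        self.integral = 0.0

    def update(self, target, measured):
        error = target - measured
        self.integral += error * self.dt                  # accumulated error
        out = self.kp * error + self.ki * self.integral   # P term + I term
        if abs(out) > self.out_limit:
            self.integral -= error * self.dt              # simple anti-windup: skip accumulation
            out = max(-self.out_limit, min(self.out_limit, out))
        return out

# Hypothetical winding: 0.5 ohm, 1 mH, controlled at 20 kHz.
R_OHM, L_HENRY = 0.5, 1e-3
DT = 1.0 / 20_000
BANDWIDTH = 2.0 * math.pi * 1000.0  # assumed target bandwidth in rad/s

pi = PIController(kp=L_HENRY * BANDWIDTH, ki=R_OHM * BANDWIDTH, dt=DT, out_limit=24.0)
iq = 0.0
for _ in range(1000):  # 50 ms of simulated time
    voltage = pi.update(2.0, iq)                  # target Iq of 2 A
    iq += (voltage - R_OHM * iq) / L_HENRY * DT   # winding model: L di/dt = v - R i

print(round(iq, 4))
```

Setting kp from the inductance and ki from the resistance is one common starting point for tuning a current loop, not the only one.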
In more sophisticated systems, we can improve the estimated values by using our understanding of the system physics as a prior. Essentially you can model the resistance and back-EMF of the motor to make a starting guess about the voltage required. This will produce some residual error to which we apply a PI controller, instead of trying to control the quadrature current directly.
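One way to sketch this idea: compute a feedforward voltage from a simple motor model, then let a PI controller handle only what the model gets wrong. The resistance, back-EMF constant, and gains below are made-up values, and ResidualPI is a hypothetical bare-bones controller with no output limits.

```python
R_OHM = 0.5   # hypothetical winding resistance
KE = 0.1      # hypothetical back-EMF constant in V*s/rad

def feedforward_voltage(iq_target, omega_rad_s):
    """Model-based guess: voltage to push iq_target through R plus cancel back-EMF."""
    return R_OHM * iq_target + KE * omega_rad_s

class ResidualPI:
    """Bare-bones PI acting only on the leftover current error."""
    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def update(self, error):
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral

def voltage_command(pi, iq_target, iq_measured, omega_rad_s):
    """Feedforward does the bulk of the work; PI corrects the residual."""
    return feedforward_voltage(iq_target, omega_rad_s) + pi.update(iq_target - iq_measured)

pi = ResidualPI(kp=6.0, ki=3000.0, dt=1.0 / 20_000)
on_target = voltage_command(pi, iq_target=2.0, iq_measured=2.0, omega_rad_s=100.0)
print(on_target)
```

When the model is accurate and the current is already on target, the PI term contributes nothing and the command is just the feedforward voltage.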
The torque is controllable but not unlimited, and the real ceiling is heat. Current through the winding resistance burns power as I²R, so every motor has a high peak torque it can hold for a second or two and a much lower torque it can hold forever. The controller enforces a current limit so a commanded torque can never overheat the motor.
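A simplified sketch of such a limit, using a crude heat budget: current above the continuous rating fills the budget, and once it is full the command is clamped to the continuous rating. The ratings and budget are invented numbers, and real drives use more careful thermal models, so treat CurrentLimiter as a hypothetical illustration.

```python
def copper_loss_w(current_a, resistance_ohm):
    """Power burned in the winding: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm

class CurrentLimiter:
    def __init__(self, continuous_a, peak_a, budget_a2s):
        self.continuous_a = continuous_a  # current the motor can hold indefinitely
        self.peak_a = peak_a              # short-term ceiling
        self.budget_a2s = budget_a2s      # allowed excess of I^2 integrated over time
        self.heat = 0.0

    def limit(self, iq_cmd, dt):
        # Allow peak current only while the heat budget has room left.
        ceiling = self.peak_a if self.heat < self.budget_a2s else self.continuous_a
        iq = max(-ceiling, min(ceiling, iq_cmd))
        # Fill the budget above the continuous rating, drain it below.
        self.heat = max(0.0, self.heat + (iq * iq - self.continuous_a ** 2) * dt)
        return iq

# Hypothetical motor: 5 A continuous, 15 A peak, about one second of peak allowed.
limiter = CurrentLimiter(continuous_a=5.0, peak_a=15.0, budget_a2s=200.0)
first = limiter.limit(20.0, 0.001)
for _ in range(1999):
    last = limiter.limit(20.0, 0.001)

print(first, last, copper_loss_w(15.0, 0.5), copper_loss_w(5.0, 0.5))
```

In this made-up case the peak current burns nine times the power of the continuous current, which is why it can only be held briefly.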
The final output of this system is a highly controllable torque on the motor at 10k+ updates per second. Because torque sets the motor's angular acceleration, controlling it lets us control the motor’s velocity.
One layer up the abstraction ladder, the velocity loop is in charge of controlling the velocity of the motor. This is a loop that takes a velocity as input, and spits out a target torque for the torque loop. In practice, it normally outputs a target quadrature current, which is equivalent to outputting a torque because the two differ only by the torque constant (Kt). This system runs at around 1-2 kHz, roughly an order of magnitude slower than the torque loop.
The velocity loop uses the same position sensor as the torque loop, and diffs the position values to estimate the velocity. The naive strategy would be to build a PI controller similar to the torque loop to target the desired velocity. However, there are a few complications.
First, the velocity estimate in and of itself is quite noisy. Most sensors use magnets or other mechanisms that measure position in discrete steps. Because the measurement is not continuous, it has quantization noise. When the motor is moving quickly this is not much of an issue, but as it slows down it causes problems. To stay accurate across the whole speed range, the estimator switches between separate measurement regimes for slow and fast motion. There are also techniques such as Kalman filters that build real-time models to smooth out the velocity estimates. At the highest end, an IMU is fused in to sharpen the velocity estimate.
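To see the quantization problem in numbers, here is a sketch that differentiates a simulated encoder at low speed. For smoothing it uses a plain first-order low-pass filter, which is a much simpler stand-in for the Kalman-style estimators mentioned above. The encoder resolution, loop rate, and filter constant are assumptions.

```python
import math

COUNTS_PER_REV = 4096   # hypothetical encoder resolution
DT = 1.0 / 1000         # assumed 1 kHz velocity loop

def quantize(angle_rad):
    """Snap a true angle to the nearest lower encoder count, returned in radians."""
    counts = math.floor(angle_rad / (2.0 * math.pi) * COUNTS_PER_REV)
    return counts * 2.0 * math.pi / COUNTS_PER_REV

def estimate_velocity(true_speed, steps, alpha):
    """Return (raw, filtered) velocity estimates for a rotor at constant speed."""
    raw, filtered = [], []
    prev = quantize(0.0)
    smooth = 0.0
    for k in range(1, steps + 1):
        pos = quantize(true_speed * k * DT)
        v = (pos - prev) / DT           # naive finite difference
        prev = pos
        smooth += alpha * (v - smooth)  # first-order low-pass filter
        raw.append(v)
        filtered.append(smooth)
    return raw, filtered

raw, filtered = estimate_velocity(true_speed=0.5, steps=2000, alpha=0.02)
print(min(raw), round(max(raw), 3), round(filtered[-1], 2))
```

At 0.5 rad/s the rotor takes about three loop ticks to cross one encoder count, so the raw estimate jumps between zero and roughly three times the true speed, while the filtered value settles near the true speed at the cost of some lag.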
The second issue is that a generic PI controller will overshoot and run into issues with inertia, load changes, and other complexities like structural resonance. Similar to the idea from the torque loop, if we build a strong theoretical prior of the velocity of the motor given a set of inputs, we can run the PI controller over only the error term and not the actual torque value. This smooths out the velocity substantially.
This brings us to the final control loop of the motor, the position loop. This loop takes an angle as an input and rotates the joint. As you would expect, the ground truth for this loop comes from the position sensor that has been mentioned in the previous two sections. The position loop usually uses a purely proportional controller since the integrated error term is already captured in the velocity loop. If an integral term were added to the position loop as well, the arm would tend to overshoot the destination.
In some high end systems, the motor encoder can be unreliable for exact position because gearbox backlash and compliance let the motor shaft drift from the joint, so a separate load encoder is attached to the outside and is used as the ground truth for the position. However, having two sensors can cause all sorts of headaches for the control loop.
The move also has to be planned. The arm cannot just move at infinitely high velocity from one angle to another. The two considerations are the maximum velocity of the arm, and the smoothness of the trajectory. Taking these two inputs, a trajectory generator constructs a jerk-limited S curve for velocity and feeds it to the position loop to drive the motor from one location to another. The position loop also runs at 1-2 kHz, often in the same slow loop as the velocity controller.
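As a rough sketch of trajectory generation, the code below uses a quintic "minimum-jerk" polynomial rather than a true multi-segment S-curve. It is a simpler stand-in that still starts and ends with zero velocity and zero acceleration and respects a velocity limit. The function name and the numbers are illustrative assumptions.

```python
import math

def min_jerk_profile(start, goal, v_max, dt):
    """Sampled positions from start to goal with a smooth, velocity-limited profile."""
    distance = goal - start
    if distance == 0:
        return [start]
    # The quintic's peak speed is 1.875 * distance / duration, so pick the duration to fit v_max.
    duration = 1.875 * abs(distance) / v_max
    steps = max(1, math.ceil(duration / dt))
    points = []
    for k in range(steps + 1):
        tau = k / steps                                    # normalized time, 0 to 1
        s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5         # smooth blend, 0 to 1
        points.append(start + distance * s)
    return points

# Hypothetical move: 1 radian at no more than 2 rad/s, sampled at 1 kHz.
path = min_jerk_profile(0.0, 1.0, v_max=2.0, dt=0.001)
peak_speed = max(abs(b - a) for a, b in zip(path, path[1:])) / 0.001
print(len(path), round(peak_speed, 3))
```

Each sample would be handed to the position loop as its next target angle, so the joint follows a smooth curve instead of chasing a distant setpoint.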
The entire loop for a motor must be run on a lower level custom system close to the motor to achieve the speed and control needed. This requires a lot of custom hardware and a low level MCU that is designed specifically to run the necessary control loop calculations at high speeds.
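The MCU is running a slow and a fast loop at all times. The fast loop is the torque loop described above: on every tick it reads the current and position sensors, transforms the three phase currents into Id and Iq, runs the PI controller against the target Iq, and drives the coils accordingly.
The slow loop runs outside of this. At 1-2 kHz, the position loop turns the angle error into a velocity target, and the velocity loop turns the velocity error into a new target Iq for the fast loop.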
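A minimal sketch of how the two rates might be interleaved, with the slow loop running once every ten fast ticks. The callbacks are placeholders that only count calls, and the divider of ten is an assumption taken from the rough order-of-magnitude gap between the loops. On real hardware the fast loop is typically triggered by a timer or ADC interrupt rather than a Python-style for loop.

```python
def run_control_ticks(n_ticks, fast_step, slow_step, slow_divider=10):
    """Run the fast loop every tick and the slow loop every slow_divider ticks."""
    iq_target = 0.0
    for tick in range(n_ticks):
        if tick % slow_divider == 0:
            # Slow loop: position and velocity controllers produce a new current target.
            iq_target = slow_step()
        # Fast loop: sense currents, transform, run the current PI, drive the coils.
        fast_step(iq_target)

# Placeholder steps that only record how often they run.
calls = {"fast": 0, "slow": 0}

def demo_slow_step():
    calls["slow"] += 1
    return 1.5  # hypothetical target Iq in amps

def demo_fast_step(iq_target):
    calls["fast"] += 1

run_control_ticks(100, demo_fast_step, demo_slow_step)
print(calls)
```

The fast loop always works from the most recent target the slow loop produced, which is what lets the two run at such different rates.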
There is a lot of hardware complexity around how this microcontroller actually controls current and how it is designed to be small and cheap and fast enough to keep these loops running efficiently.
I have been sloppy so far about the difference between the motor and the joint. In almost every real robot they are not the same thing. Between them sits a transmission, usually a gearbox, that trades speed for torque. Electric motors are naturally fast and weak, happiest spinning at thousands of RPM with very little torque, while a robot joint wants the opposite, slow and powerful motion. A gear reduction of 100 to 1 multiplies the torque by a hundred and divides the speed by the same.
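The trade is simple enough to write down directly. This sketch assumes an ideal gearbox by default, with an optional efficiency factor for losses. The motor numbers are invented.

```python
def through_gearbox(motor_torque_nm, motor_speed_rpm, ratio, efficiency=1.0):
    """Return (joint torque, joint speed) after a reduction of ratio to 1."""
    joint_torque = motor_torque_nm * ratio * efficiency  # torque is multiplied
    joint_speed = motor_speed_rpm / ratio                # speed is divided
    return joint_torque, joint_speed

# Hypothetical motor: 0.5 N*m at 3000 RPM through a 100 to 1 reduction.
joint_torque, joint_speed = through_gearbox(0.5, 3000.0, 100.0)
print(joint_torque, joint_speed)
```

The fast, weak motor becomes a joint that turns at 30 RPM with 50 N*m of torque, before accounting for any losses in the gears.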
The kind of transmission you choose impacts the entire stack. The property that matters most is backdrivability, which is whether you can push on the joint from the outside and have the motor turn, so the motor can also feel the forces the world pushes back. A high-ratio gearbox like a harmonic drive gives huge torque from a small motor, but it is hard to backdrive, so the joint feels stiff. To mitigate this, many humanoids use a quasi-direct-drive actuator, a strong motor with only a small reduction, around 10 to 1, which gives up peak torque for a joint that is backdrivable and can therefore sense external forces better.
This is also why the load encoder from the position loop exists. A gearbox is never perfectly rigid. It has a little backlash, so the output can move a bit before the input catches up, and it has some flex, meaning the gears and housing deform slightly under load. Both mean the motor’s position is never exactly the joint’s position, so a second encoder on the output reads the truth.
Everything up to here has been about position. You command an angle and hold it. But manipulation is really about contact. Command the hand to a point one centimeter below a tabletop and a position loop will pour in as much torque as its limits allow trying to reach it, crushing the object or straining the robot.
The fix is to control the relationship between force and motion, which is called impedance control. Instead of commanding a position, you make the joint behave like a programmable spring and damper around a target, compliant enough to yield when it touches something, stiff enough to follow a path in free space. You turn the stiffness up for precise positioning and down for delicate contact. This is where backdrivable actuators and good torque control pay off. Because torque is essentially current, a quasi-direct-drive joint can sense external forces and respond softly with no dedicated force sensor at all.
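In its simplest joint-space form, the spring and damper behavior is a single line of math. This sketch assumes a single joint and ignores gravity and inertia compensation. The stiffness, damping, and torque limit values are made up.

```python
def impedance_torque(q_target, q, qd_target, qd, stiffness, damping, torque_limit):
    """Virtual spring and damper pulling the joint toward its target."""
    tau = stiffness * (q_target - q) + damping * (qd_target - qd)
    return max(-torque_limit, min(torque_limit, tau))

# The joint is blocked by an object 0.05 rad short of its target and is at rest.
stiff = impedance_torque(1.0, 0.95, 0.0, 0.0, stiffness=400.0, damping=5.0, torque_limit=50.0)
soft = impedance_torque(1.0, 0.95, 0.0, 0.0, stiffness=20.0, damping=1.0, torque_limit=50.0)
print(round(stiff, 3), round(soft, 3))
```

With the same blocked joint, the stiff setting pushes twenty times harder than the soft one, so the stiffness value directly sets how hard the robot presses on whatever it touches.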
Impedance control runs in the low-level joint controllers rather than in the brain, and it makes the arm yield and slow down when it comes into contact with objects. The contact shows up in the end-effector state, which is then fed back into the next pass for the brain to reason about.
In the simple case of the robotic arm, these motor microcontrollers are wired up to a high performance edge computer like an Nvidia Jetson. The arm's end-effector pose is defined in a coordinate system by seven values: x, y, z, roll, pitch, yaw, and gripper width. Since the arm is bolted to the table, the coordinate system stays fixed. This allows the brain to take camera input and previous state, run model inference on a GPU, and output a positional change.
To move between two states, the brain plans a path, using inverse kinematics to convert target poses into joint angles. It then sends commands to the microcontrollers over a real-time communication protocol such as EtherCAT. This is the highest level loop of robotic control on the machine, taking tasks as inputs and manipulating the joints to complete them.
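To give a feel for inverse kinematics, here is a toy example for a planar arm with two links, where the answer has a closed form. Real arms have six or seven joints and generally rely on numerical solvers, so this is only an illustration. The link lengths and target point are made up.

```python
import math

def two_link_ik(x, y, l1, l2):
    """Joint angles (shoulder, elbow) that place the tip of a planar two-link arm at (x, y)."""
    cos_elbow = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_elbow) > 1.0:
        raise ValueError("target out of reach")
    elbow = math.acos(cos_elbow)  # one of the two possible elbow configurations
    shoulder = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow), l1 + l2 * math.cos(elbow))
    return shoulder, elbow

def two_link_fk(shoulder, elbow, l1, l2):
    """Forward kinematics, used here to check the inverse solution."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y

# Hypothetical arm with two 25 cm links reaching for the point (0.3, 0.2).
shoulder, elbow = two_link_ik(0.3, 0.2, 0.25, 0.25)
tip_x, tip_y = two_link_fk(shoulder, elbow, 0.25, 0.25)
print(round(tip_x, 6), round(tip_y, 6))
```

Running the joint angles back through forward kinematics lands on the requested point, and those angles are what get sent down to each joint's position loop.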
This case is complicated somewhat on humanoid robots. The humanoid moves around freely in space, and so it is complex to even understand the exact location of every joint. The robot must be manipulated quickly and smoothly as a unit to keep balance and to account for all the ways its joints' motions interact. This drives the need for a faster optimization algorithm (usually on a dedicated real-time controller) that controls balance, reacting to gravity and other types of contact. This loop runs around 200-1000 Hz.
Above this, we have the cognitive loop where most of the model innovation is happening.