Andy Zeng
Inverse Kinematics

Inverse kinematics refers to the use of the kinematics equations of a robot to determine the joint parameters that provide a desired position of the end-effector. Specifying the movement of a robot so that its end-effector achieves a desired task is known as motion planning; inverse kinematics transforms the motion plan into joint actuator trajectories for the robot. The movement of a kinematic chain, whether it is a robot or an animated character, is modeled by the kinematics equations of the chain. These equations define the configuration of the chain in terms of its joint parameters. Forward kinematics uses the joint parameters to compute the configuration of the chain, and inverse kinematics reverses this calculation to determine the joint parameters that achieve a desired configuration. For example, inverse kinematics formulas allow calculation of the joint parameters that position a robot arm to pick up a part. Similar formulas determine the positions of the skeleton of an animated character that is to move in a particular way. I worked on this project jointly with my colleague, Cheng-Yu (Charlie) Pai.


Our interactive program calculates and renders the movements of a robotic (jointed) arm using the Jacobian transpose method of inverse kinematics. The arm is a chain of segments connected by ball joints; each segment is constructed with a length (in world-space units) and a range of motion (in radians, denoting its degrees of freedom) specified by the user. Pin and prismatic joints were not implemented. The program was designed to support multiple arms, each containing a number of segments set by the user’s command-line arguments.


When the program starts, an OpenGL window opens and renders the arm (the program is written in C++ and was developed on Windows). By default, the camera is located at (0, 2.5, 0) in world space, looking at (0, 0, 0), with a roughly 3x3 view box centered at the origin. Through command-line arguments, a user can generate an arm and specify the properties of its segments, as well as the base point at which the arm begins in world space. The camera rotates with the arrow keys and translates with the shift-arrow keys. When the user clicks in the window, the clicked 2D point is projected into world space onto a plane 2.0 world-space units away from the viewing plane. That point is then used as the endpoint for all arms simulated in the system: the program calculates a Jacobian matrix for each arm and animates the segments as they attempt to reach the endpoint within the specified degrees of freedom and alpha values. By combining mouse clicks with camera movements, the user can place the endpoint wherever desired.

One problem we encountered: if the arm was fully extended, subsequent mouse clicks would not bend the arm at all, and it would miss the goal. (A fully extended arm is a singular configuration, in which the Jacobian transpose can map the error to a zero joint update.) Our solution was to detect when the arm is fully extended and bend each joint slightly so that the recalculated Jacobian yields a usable update. As a result, the arm swings around toward the point and then bends toward the goal. This is not aesthetically pleasing, but it gets the job done.

One small bug remains: the arm sometimes jerks erratically when the goal is out of reach. This is a byproduct of using the Jacobian transpose rather than the SVD pseudo-inverse.

Initial Steps

When designing our approach to coding a functional IK system, flexibility was the top priority. From previous coding experience, Charlie and I both knew full well that flexible code is, above all, the best time saver (with it, we could add extra-credit features with ease). With a top-down design model in mind, we began with a rough skeleton consisting of many empty classes and some OpenGL initialization code (since we had decided to use OpenGL for rendering). We interspersed debugging and printing code and breakpoints throughout our classes, including arms, segments, and cameras, to ensure that future bugs did not result, out of carelessness, from faulty parameters, erroneous arguments, or incomplete dependencies.

Another crucial point in our design phase was deciding which C++ math library to use. After some research, we narrowed the choice down to Eigen and GLM. At first we intended to use only one math library, but after more consultation we realized it was best to use both (confusion between the two is easily avoided with proper namespace declarations). Eigen made it very easy to handle matrices with many columns, which mattered for Jacobians on arms with more than four segments. GLM matrices are limited to at most 4x4, so we used Eigen for the math behind the arms. GLM, on the other hand, handled the camera transformations with little effort. We were thus able to get the best of both libraries.

The next step in our design phase was creating the command-line arguments. Since we had planned from the very beginning to support multiple arms with varying numbers of segments (each with a different length), we quickly wrote some basic parsing code that takes the command-line arguments and initializes the corresponding arms and segments.

Lastly, we created a file input system for easier generation of complex arms. This way we could create branches upon branches and multiple arms without having long lists of command line arguments.

Command line arguments

Format: inverse_kinematics [-f filename]
-f - specifies the file to read
filename - file name
Example: ./inverse_kinematics.exe -f single_nobranch.txt

Format: inverse_kinematics [-arm ns bx by bz -seg sr at ap … -seg sr at ap] … [-arm ns bx by bz -seg sr at ap … -seg sr at ap]
-arm - indicates the creation of a new arm
ns - (integer) number of segments the arm will have
bx - (float, suggested: -2.0 < x < 2.0) the x coordinate for the base point of the arm
by - (float, suggested: -2.0 < y < 2.0) the y coordinate for the base point of the arm
bz - (float, suggested: -2.0 < z < 2.0) the z coordinate for the base point of the arm
-seg - begins parameter input for a segment
sr - (float) length of the segment in world units
at - (float - radians) theta value of the degrees of freedom for the segment’s lower joint
ap - (float - radians) phi value of the degrees of freedom for the segment’s lower joint

Example (creates two arms of 4 segments each; the middle joints have different degrees of freedom):
inverse_kinematics -arm 4 0 0 0 -seg 0 0.5 0 1.57 -seg 1 0.5 0.785 1.57 -seg 2 0.5 1.57 1.57 -seg 3 0.5 1.57 1.57 -arm 4 1 0 0 -seg 0 0.5 0 1.57 -seg 1 0.5 0.785 1.57 -seg 2 0.5 0.785 1.57 -seg 3 0.5 1.57 1.57

File input system
#: Begins a comment.
arm ns bx by bz: Creates a new arm. ns specifies the number of segments, and bx, by, bz specify the base location of the arm.
seg sr at ap: Specifies the parameters of one segment of the current arm. sr is the length of the segment, and at and ap are the theta and phi of the degrees of freedom of the lower joint.
goal x y z: Specifies the initial goal location of the current arm.
br n ns: Creates a branch off of the current arm. n is the segment at which the branch starts, and ns is the number of segments of the branch. Subsequent seg, goal, and br lines will use this branch as the current arm.
endbr: Ends specification of the current branch.
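
As a rough illustration of how the directives above could be read, here is a minimal parsing sketch. It is not the project's actual parser: the `ArmSpec` and `SegmentSpec` structs, the function name `parseArmFile`, and the handling of trailing `#` comments are all assumptions, and it follows the three-value `seg sr at ap` form documented above. Branches are stored as extra entries that remember their parent arm and segment.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical plain-data description of one segment line.
struct SegmentSpec { double length, theta, phi; };

// Hypothetical description of one arm or branch.
struct ArmSpec {
    int declaredSegments = 0;            // ns from the arm/br line
    double base[3] = {0.0, 0.0, 0.0};    // bx by bz (left at zero for branches)
    int parentArm = -1;                  // index of the parent entry, -1 for a root arm
    int parentSegment = -1;              // n from the br line
    bool hasGoal = false;
    double goal[3] = {0.0, 0.0, 0.0};
    std::vector<SegmentSpec> segments;
};

std::vector<ArmSpec> parseArmFile(std::istream& in) {
    std::vector<ArmSpec> arms;
    std::vector<int> open;  // indices of the arm and nested branches being specified; back() is current
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);  // assumption: '#' comments run to end of line
        std::istringstream fields(line);
        std::string cmd;
        if (!(fields >> cmd)) continue;  // blank or comment-only line

        if (cmd == "arm") {
            ArmSpec a;
            fields >> a.declaredSegments >> a.base[0] >> a.base[1] >> a.base[2];
            arms.push_back(a);
            open.assign(1, static_cast<int>(arms.size()) - 1);
        } else if (cmd == "br") {
            assert(!open.empty());  // a branch needs a current arm
            ArmSpec b;
            b.parentArm = open.back();
            fields >> b.parentSegment >> b.declaredSegments;
            arms.push_back(b);
            open.push_back(static_cast<int>(arms.size()) - 1);
        } else if (cmd == "endbr") {
            assert(open.size() > 1);  // must be inside a branch
            open.pop_back();          // the parent becomes the current arm again
        } else if (cmd == "seg") {
            assert(!open.empty());
            SegmentSpec s{};
            fields >> s.length >> s.theta >> s.phi;
            arms[open.back()].segments.push_back(s);
        } else if (cmd == "goal") {
            assert(!open.empty());
            ArmSpec& cur = arms[open.back()];
            fields >> cur.goal[0] >> cur.goal[1] >> cur.goal[2];
            cur.hasGoal = true;
        }
    }
    return arms;
}
```

Passing a `std::ifstream` opened on the file given with `-f` would return one entry per arm and per branch, in the order they appear in the file.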

See sample test files for reference.

Calculating the Jacobian

The degrees of freedom were specified as a theta and phi with respect to the world axes (defined like the spherical coordinate system). We chose this because conversion into Cartesian coordinates and building the Jacobian were very simple. The Jacobian matrix is shown on the right.
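
To make the construction concrete, here is a minimal sketch of the forward kinematics and the Jacobian under one possible spherical convention. The convention (theta as the azimuth about the z axis, phi measured from the +z axis), the `Segment` struct, and the function names are assumptions for illustration; the project itself used Eigen and may have oriented the axes differently. If each segment's angles are taken relative to the world axes, as described above, each angle affects only its own segment's vector, so every Jacobian column depends on a single segment.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using Vec3 = std::array<double, 3>;

// One arm segment: length plus world-frame spherical angles (assumed convention).
struct Segment {
    double length;  // world units
    double theta;   // azimuth about the z axis, radians
    double phi;     // polar angle measured from the +z axis, radians
};

// Forward kinematics: the base point plus the sum of every segment's Cartesian vector.
Vec3 endEffector(const Vec3& base, const std::vector<Segment>& segs) {
    Vec3 p = base;
    for (const Segment& s : segs) {
        assert(s.length >= 0.0);
        p[0] += s.length * std::sin(s.phi) * std::cos(s.theta);
        p[1] += s.length * std::sin(s.phi) * std::sin(s.theta);
        p[2] += s.length * std::cos(s.phi);
    }
    return p;
}

// Jacobian stored as a list of 3D columns, ordered (theta_0, phi_0, theta_1, phi_1, ...).
std::vector<Vec3> jacobian(const std::vector<Segment>& segs) {
    std::vector<Vec3> cols;
    cols.reserve(2 * segs.size());
    for (const Segment& s : segs) {
        const double st = std::sin(s.theta), ct = std::cos(s.theta);
        const double sp = std::sin(s.phi), cp = std::cos(s.phi);
        // Partial derivative of the end-effector position with respect to theta.
        cols.push_back(Vec3{-s.length * sp * st, s.length * sp * ct, 0.0});
        // Partial derivative of the end-effector position with respect to phi.
        cols.push_back(Vec3{s.length * cp * ct, s.length * cp * st, -s.length * sp});
    }
    return cols;
}
```

An arm with n segments yields a 3 x 2n Jacobian, which is why a fixed 4x4 matrix type stops being enough once an arm grows past a couple of segments.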

It was then a simple matter to transpose the Jacobian and multiply it by dE to find dq, which is added to the angles of each segment. The dq vector was scaled by $\alpha = 0.3\,\langle dE,\ JJ^{T}dE\rangle / \langle JJ^{T}dE,\ JJ^{T}dE\rangle$ so that the angle change would not be too small. We also limited the magnitude of each element of dq to make the movement smoother; we set this limit, arbitrarily, to 0.08 radians.
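
The update described here can be sketched as a single function. This is an illustration under assumptions, not the original code: the Jacobian is passed as a list of 3D columns (one per joint angle), dE is taken to be the desired end-effector displacement, and the names `transposeStep`, `gain`, and `maxStep` are hypothetical. The defaults 0.3 and 0.08 are the values quoted above. The guard for a vanishing denominator is an addition of this sketch; it corresponds to the situation where the transposed Jacobian maps dE to zero.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// One Jacobian-transpose step: returns the change to apply to each joint angle.
std::vector<double> transposeStep(const std::vector<Vec3>& J, const Vec3& dE,
                                  double gain = 0.3, double maxStep = 0.08) {
    assert(maxStep > 0.0);

    // J^T dE: one dot product per Jacobian column.
    std::vector<double> dq(J.size());
    for (std::size_t i = 0; i < J.size(); ++i) dq[i] = dot(J[i], dE);

    // J J^T dE: the columns of J weighted by the entries of J^T dE.
    Vec3 jjte = {0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < J.size(); ++i)
        for (int k = 0; k < 3; ++k) jjte[k] += J[i][k] * dq[i];

    const double denom = dot(jjte, jjte);
    if (denom < 1e-12) {
        // J^T dE is (numerically) zero, so no step can be computed from this pose.
        std::fill(dq.begin(), dq.end(), 0.0);
        return dq;
    }

    // alpha = gain * <dE, J J^T dE> / <J J^T dE, J J^T dE>
    const double alpha = gain * dot(dE, jjte) / denom;

    // Scale by alpha, then clamp each angle change to [-maxStep, maxStep].
    for (double& v : dq) v = std::max(-maxStep, std::min(maxStep, alpha * v));
    return dq;
}
```

A caller would add the returned values to the joint angles, recompute the Jacobian, and repeat until the end-effector is close enough to the goal. An all-zero result away from the goal is the cue to nudge the joints before trying again.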

Creating a Tree Structure

The tree structure was a simple additional variable in each Segment instance, pointing to an Arm instance that represents a branch. A branch automatically adjusts its base location whenever its parent segment moves (based on the parent's angles), but still relies on the original method for goal finding. Doing it this way allowed us to calculate everything automatically without changing any of the original code.
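
A minimal sketch of this arrangement follows. The struct layouts, the use of `std::unique_ptr` for ownership, the spherical convention (phi measured from +z), and the choice to root a branch at the far end of its parent segment are all assumptions made for illustration; the original classes are not shown in this write-up.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <memory>
#include <vector>

using Vec3 = std::array<double, 3>;

struct Arm;  // forward declaration so a Segment can point at a branch

struct Segment {
    double length;
    double theta;  // azimuth about z, radians (assumed convention)
    double phi;    // polar angle from +z, radians (assumed convention)
    std::unique_ptr<Arm> branch;  // null unless a branch starts at this segment
};

struct Arm {
    Vec3 base{0.0, 0.0, 0.0};
    std::vector<Segment> segments;
};

// Cartesian vector spanned by one segment.
Vec3 segmentVector(const Segment& s) {
    assert(s.length >= 0.0);
    return Vec3{s.length * std::sin(s.phi) * std::cos(s.theta),
                s.length * std::sin(s.phi) * std::sin(s.theta),
                s.length * std::cos(s.phi)};
}

// Walk the chain from the base; every branch is re-rooted at the current end
// of its parent segment, then updated recursively.
void updateBranchBases(Arm& arm) {
    Vec3 p = arm.base;
    for (Segment& s : arm.segments) {
        const Vec3 v = segmentVector(s);
        for (int k = 0; k < 3; ++k) p[k] += v[k];
        if (s.branch) {
            s.branch->base = p;
            updateBranchBases(*s.branch);
        }
    }
}
```

Calling `updateBranchBases` on each root arm after every angle update keeps the branches attached, while each branch can still be solved as an ordinary arm with its own goal.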

Calculating the Camera

Since we were going to transform the camera with matrices, we created a class dedicated to the camera that would take care of the viewport, glOrtho, and gluLookAt functions. After having written one too many cameras to work with OpenGL throughout the course already, this portion of the project was a breeze. We coded in the keyboard functions to create matrices that would be applied to transform the camera’s look-at and look-from positions. Within minutes, rotations and translations were working well (tested by printing out the transformation matrices and camera locations).


Left mouse: Select an endpoint in world space to be used by the arms.
Arrow keys: Rotate camera around the lookat point (default: origin).
+/-: Zoom in/out.
Shift up/down: Translate the camera up or down.
Spacebar/ESC: Exit the program.
1: Selects the previous arm.
2: Selects the next arm.

Drawing the Arms

Initially, we planned to draw the arms as diamond (spearhead-shaped) figures built out of lines in 3D space. However, after rendering our first arm, we realized that it looked not only very simple but also unprofessional and bland. The position of the camera was also hard to judge at first sight unless the view was rotated around a few times. So we decided to render the arm with differently colored cylinders and spheres, with some stationary lighting to give the viewer a frame of reference. The spheres represent the ball joints, the cylinders represent the segments, and a small sphere at the end of each arm indicates the arm’s endpoint. The color of each segment and ball is a set of diffuse coefficients generated from random RGB values. Since every arm is redrawn on each iteration of the loop within the ‘display’ method, the random RGB values are calculated only on the first iteration and stored in a data structure; successive iterations look up the colors there.
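
The color caching described above can be sketched as follows. The class name `ColorCache`, its interface, and the choice of random number generator are assumptions for illustration, not the original data structure. The point is only that a color is generated the first time an index is requested and reused on every later frame.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Color = std::array<float, 3>;  // RGB diffuse coefficients

class ColorCache {
public:
    explicit ColorCache(unsigned seed = std::random_device{}()) : rng_(seed) {}

    // Returns the color for a segment or joint index, generating it on first use only.
    Color colorFor(std::size_t index) {
        while (colors_.size() <= index) {
            const float r = unit_(rng_);
            const float g = unit_(rng_);
            const float b = unit_(rng_);
            colors_.push_back(Color{r, g, b});
        }
        assert(index < colors_.size());
        return colors_[index];
    }

private:
    std::mt19937 rng_;
    std::uniform_real_distribution<float> unit_{0.0f, 1.0f};
    std::vector<Color> colors_;
};
```

Inside the display callback, the drawing code would ask the cache for a color by index on every frame and always receive the same value, so the arm does not flicker between random colors.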

Calculating Mouse Position

To make the IK system interactive, we needed a way for the user to easily specify the endpoints that the arms would try to reach. After considering some pretty cool possibilities, like prompting the user to type in the coordinates of the endpoint or letting the user move the endpoint around with the w-a-s-d keys, we decided it would be best if the user could simply click on the screen to generate a point in world space a fixed distance away from the camera's viewing plane. Although this seemed extremely straightforward, we ran into multiple problems along the way.

Our first approach was to calculate the ratio of the mouse position to the viewport, then project the point onto a plane a fixed distance from the camera position and orthogonal to the camera’s direction of view. However, because resizing and zooming the camera involved changing the parameters of glOrtho, glitches started to appear in which the 3D world-space position calculated on the parallel plane was not exactly where the mouse appeared to be clicking. Most of the time the point was off by only a marginal amount, but we couldn’t find a way to fix it. So instead we took the z-buffer approach, using OpenGL’s built-in z-buffer to calculate the endpoints.

Since the z-buffer records the rendering distance from the camera, and some built-in OpenGL functions let us calculate the exact point where a ray from the camera first hits an object in space, we took advantage of these functions and built a transparent plane a fixed distance from the camera and always parallel to the viewing plane. When the user clicks on the screen, the OpenGL z-buffer functions return the position where that particular ray first hits an object, which is the parallel plane (or sometimes the arms themselves, which we thought would be a cool feature). This made our mouse input work very nicely: the returned point always lies directly under the mouse cursor in the OpenGL window.
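
The write-up does not name the OpenGL functions involved; a common pairing for this technique is `glReadPixels` with `GL_DEPTH_COMPONENT` to read the depth under the cursor, followed by `gluUnProject`. So that it can run without an OpenGL context, the sketch below instead reproduces the underlying arithmetic for an orthographic projection. The `Ortho` struct, the function name `unprojectOrtho`, and the example values are assumptions, and the result is in eye space (camera at the origin, looking down -z); converting to world space would additionally require the inverse of the view transform.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;

// The six parameters passed to glOrtho(left, right, bottom, top, zNear, zFar).
struct Ortho {
    double left, right, bottom, top, zNear, zFar;
};

// Map a mouse click and the depth-buffer value under it (0 = near plane,
// 1 = far plane) to a point in eye space for an orthographic projection.
Vec3 unprojectOrtho(double mouseX, double mouseY, double depth,
                    int viewportW, int viewportH, const Ortho& o) {
    assert(viewportW > 0 && viewportH > 0);
    const double u = mouseX / viewportW;        // 0 at the left edge, 1 at the right
    const double v = 1.0 - mouseY / viewportH;  // mouse y grows downward, OpenGL y grows upward
    return Vec3{o.left + u * (o.right - o.left),
                o.bottom + v * (o.top - o.bottom),
                -(o.zNear + depth * (o.zFar - o.zNear))};
}
```

For example, with an 800x600 window and a 3x3 view volume spanning depths 0 to 10, a click at the window center with a depth value of 0.2 maps to a point 2 units in front of the camera on the view axis. Because the depth comes from whatever was rendered under the cursor, the same arithmetic handles clicks that land on the arm instead of the transparent plane.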

After getting the mouse input to work well, we made it possible for every arm to have a different endpoint. The user selects an arm by cycling through the endpoints (the selected endpoint ball is colored white) and then moves that endpoint by clicking on the screen. Since the plane is always parallel to the viewing plane and a fixed distance from it, the user can rotate or translate the camera to place the endpoint closer or farther away as they wish.


The majority of our program was developed with printing and debugging code interspersed in between chunks of code. Since we were approaching the project in a top-down manner, we had to debug every small part of the code at every step of the way to ensure that we were not making any silly mistakes that could cost us hours to debug in the future. This strategy did indeed save us a lot of time.

Video Demonstration (Beta Version)

Video Demonstration (Final Version)