COMP 417 – Assignment 4
Mapping and RL/Control
Out: Nov 22nd, 2020. Due: Dec 3rd, 2020, 6pm.
Complete the two questions below, following the instructions given for each.
Time target: spend no more than 3 hours on Q1 and 4 hours on Q2. If you reach these limits and have no “good” solution, write up a one-pager on what you learned and move on to other deadlines.
Obtain the starter code by running:
$ git clone http://github.com/dmeger/COMP417_Fall2020.git
(or run git pull from the folder you worked in for A2)
Submit all your work in the Assignment 4 folder on My Courses.
Q1: Mapping
In this exercise you will implement parts of the occupancy grid mapping system discussed in class. In particular, you will map an environment from known odometry estimates and known 2D laser scans. The functionality that you need to implement is marked with TODO comments in the file: estimation_question/build_occ_map.py
This code reads pre-recorded odometry and lidar scans from a file and processes these one at a time (although currently most functionality is missing). When you complete the TODOs, you should see debugging images of the map saved every 25 lidar scans, so you do not have to wait until the end to know if things are going well. The sequence of images should look as follows:
Hand in your completed build_occ_map.py file and the final image saved by your code at the end of your best run. It will be called final_map.png.
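As a rough illustration of the kind of update the mapping question calls for, the sketch below applies a single lidar beam to a log-odds occupancy grid, using Bresenham ray tracing to find the cells the beam crosses. Every name here (update_with_beam, world_to_grid, cells_on_ray), the increment values L_OCC and L_FREE, and the (row, col) grid layout are assumptions made for illustration. They are not taken from build_occ_map.py, and the update rule presented in class may differ.

```python
import numpy as np

# Hypothetical tuning values; the starter code may use different names and numbers.
L_OCC = 0.85    # log-odds added to the cell where the beam ended
L_FREE = -0.4   # log-odds added to each cell the beam passed through


def world_to_grid(x, y, origin, resolution):
    """Convert world coordinates (metres) to integer (row, col) grid indices."""
    col = int(np.floor((x - origin[0]) / resolution))
    row = int(np.floor((y - origin[1]) / resolution))
    return row, col


def cells_on_ray(r0, c0, r1, c1):
    """Bresenham line: all grid cells from (r0, c0) to (r1, c1), inclusive."""
    cells = []
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    sr = 1 if r1 >= r0 else -1
    sc = 1 if c1 >= c0 else -1
    err = dr - dc
    r, c = r0, c0
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            break
        e2 = 2 * err
        if e2 > -dc:
            err -= dc
            r += sr
        if e2 < dr:
            err += dr
            c += sc
    return cells


def update_with_beam(log_odds, pose, bearing, dist, max_range, origin, resolution):
    """Apply one lidar beam to the log-odds grid, in place."""
    x, y, theta = pose
    hit = dist < max_range            # a max-range reading means nothing was hit
    dist = min(dist, max_range)
    end_x = x + dist * np.cos(theta + bearing)
    end_y = y + dist * np.sin(theta + bearing)
    r0, c0 = world_to_grid(x, y, origin, resolution)
    r1, c1 = world_to_grid(end_x, end_y, origin, resolution)
    ray = cells_on_ray(r0, c0, r1, c1)
    rows, cols = log_odds.shape
    free_cells = ray[:-1] if hit else ray
    for r, c in free_cells:
        if 0 <= r < rows and 0 <= c < cols:
            log_odds[r, c] += L_FREE
    if hit and 0 <= r1 < rows and 0 <= c1 < cols:
        log_odds[r1, c1] += L_OCC


def log_odds_to_prob(log_odds):
    """Convert log-odds to occupancy probability in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-log_odds))
```

Calling update_with_beam once per beam of every scan, with the pose taken from the odometry estimate, accumulates evidence in the grid; log_odds_to_prob then gives values suitable for saving as an image. Working in log-odds keeps each update a simple addition.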
Q2: RL/Control
For this question, you will choose a control method to balance the famous “cart pole” robot, which has a “cart” that can move in 1D and a freely swinging pendulum attached to it (there is no motor at the link). The ideal controller for this robot can “swing up” the pendulum from any starting configuration and balance it directly upwards without oscillation, while also keeping the cart centred near 0 on the x-axis.
You can do this in several ways and can take on either simple “balancing” or full “swing up and balance”. For this question we want you to explore a method and report on your findings. Succeeding at the task is not the only goal: you must also describe which approach you selected, why, and its details in a short write-up, which will be graded along with your solution code.
We’ve given you the cart-pole simulator in
lqr_question/cartpole_control.py
You must pick one of these three options to work with the cart pole:
The “work” is to solve for the A, B, Q and R matrices for the cart pole. To assist with this, the appendix below gives some help with the dynamics equations that you can work with. With these matrices, call the provided helper function, then apply the K returned by the LQR solver to the state of the robot within policyfn. The control is the gain applied to the state error, $u = K\mathbf{e}$, where $\mathbf{e}$ is the difference between the current state and the goal state (its sign must match the sign convention of the K that the helper returns). Run the code to see the cart pole balancing (hopefully; note that tuning is often required to make it work well).
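To make the LQR step concrete, here is a minimal sketch on a toy system, not the cart pole: a unit mass pushed along a line, with state [position, velocity]. It uses scipy.linalg.solve_continuous_are rather than the provided helper, whose name and signature are not shown here, and the function names lqr_gain and policy are hypothetical. It also assumes the common convention K = R^-1 B^T P, under which the stabilizing control is u = -K(state - goal); the helper in the starter code may fold that sign into K.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def lqr_gain(A, B, Q, R):
    """Continuous-time LQR gain K, for use as u = -K (x - x_goal)."""
    P = solve_continuous_are(A, B, Q, R)   # solves the algebraic Riccati equation
    return np.linalg.solve(R, B.T @ P)     # K = R^-1 B^T P


# Toy system (NOT the cart pole): unit mass on a line, state = [position, velocity].
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 1.0])   # penalty on state error
R = np.array([[1.0]])     # penalty on control effort

K = lqr_gain(A, B, Q, R)


def policy(state, goal):
    """Linear state feedback around the goal state."""
    return -K @ (state - goal)
```

For the cart pole, A and B would be the 4x4 and 4x1 matrices from your own linearization, and the body of policy is what would go inside policyfn, adapted to the helper's sign convention.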
What to submit for this question?
The state vector of this cart pole is:
$\mathbf{x} = [x \;\; \dot{x} \;\; \dot{\theta} \;\; \theta]^T$
where $x$ is the position of the cart along the x-axis, in metres, and $\theta$ is the angle of the pole in radians, measured from the zero point (pole hanging “downwards”). Both velocity components (written with a dot above) are in units per second.
The control for this system is simply a force applied to the cart (black rectangle). Otherwise, the robot’s dynamics are determined by gravity, which pulls the pole downwards, a drag force that opposes the cart’s velocity, and the mechanical coupling at the single joint.
The dynamics can be written as two equations, obtained from the Lagrangian of the system. You don’t need to understand this derivation (we haven’t taught it in 417), but feel free to investigate it if interested. For the assignment, you can trust that these two equations capture the dynamics that the simulator uses to model the cart pole robot:
The relevant constants for our simulation are:
To apply LQR, use pen and paper to express this system as a linear approximation, $\dot{\mathbf{x}} = f(\mathbf{x}, u) = A\mathbf{x} + Bu$, around the goal (the upwards balanced point). In the robot’s state space, this goal is $\mathbf{x} = [0 \;\; 0 \;\; 0 \;\; \pi]^T$. Intuitively, this means that the pole is upright ($\theta = \pi$) and the cart is at the centre, $x = 0$, with both velocities zero.
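A hand-derived linearization can be cross-checked numerically. The sketch below estimates the A and B matrices of any dynamics function by central finite differences. It is demonstrated on a toy frictionless pendulum with unit mass and length, not on the cart pole equations; the names numerical_jacobians and pendulum are hypothetical and do not come from the starter code.

```python
import numpy as np


def numerical_jacobians(f, x0, u0, eps=1e-6):
    """Central-difference estimates of A = df/dx and B = df/du at (x0, u0)."""
    x0 = np.asarray(x0, dtype=float)
    u0 = np.asarray(u0, dtype=float)
    n, m = x0.size, u0.size
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m)
        du[j] = eps
        B[:, j] = (f(x0, u0 + du) - f(x0, u0 - du)) / (2 * eps)
    return A, B


# Toy dynamics (NOT the cart pole): frictionless pendulum, unit mass and length,
# state = [theta, theta_dot], with theta = 0 hanging straight down.
def pendulum(x, u):
    theta, theta_dot = x
    return np.array([theta_dot, -9.81 * np.sin(theta) + u[0]])


# Linearize around the upright point theta = pi with zero torque.
A, B = numerical_jacobians(pendulum, x0=[np.pi, 0.0], u0=[0.0])
```

Wrapping the two cart pole dynamics equations in a function with the same (x, u) signature and comparing its output to your pen-and-paper matrices is a quick way to catch sign errors.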
A linear approximation is achieved by taking a Taylor expansion to first order. That means taking derivatives of the non-linear expressions with respect to each state variable and the control. The resulting A and B are called Jacobians, computed like this:
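For reference, the standard first-order Taylor expansion and Jacobian definitions can be written as follows. This is the generic textbook form, not a copy of the handout's own formula, and it assumes the expansion point is the goal state with zero applied force. If the upright pose is an equilibrium, the constant term $f(\mathbf{x}_{goal}, 0)$ vanishes, so the linear model acts on the deviation from the goal.

```latex
\[
f(\mathbf{x}, u) \approx f(\mathbf{x}_{goal}, 0)
  + A\,(\mathbf{x} - \mathbf{x}_{goal}) + B\,u,
\qquad
A = \left.\frac{\partial f}{\partial \mathbf{x}}\right|_{\mathbf{x}_{goal},\,u=0},
\quad
B = \left.\frac{\partial f}{\partial u}\right|_{\mathbf{x}_{goal},\,u=0}
\]
\[
A_{ij} = \frac{\partial f_i}{\partial x_j},
\qquad
B_{i} = \frac{\partial f_i}{\partial u},
\qquad i, j = 1, \dots, 4
\]
```

Here $f_i$ is the $i$-th component of the state derivative and $x_j$ is the $j$-th component of the state vector, so A is 4x4 and B is 4x1 for the cart pole.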
The Q and R matrices of the quadratic cost function are obtained by hand tuning, in a similar style to the PID gains. There is no single best answer, and your tuning values should be part of your write-up.
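To build intuition for tuning, the sketch below shows how the LQR gain responds to different weights on a toy system (a unit mass on a line, not the cart pole). The function gain_for and the specific weights are illustrative assumptions, and the gain uses the common convention K = R^-1 B^T P.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy system (NOT the cart pole): unit mass on a line, state = [position, velocity].
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])


def gain_for(q_pos, q_vel, r):
    """LQR gain for diagonal state weights (q_pos, q_vel) and control weight r."""
    Q = np.diag([q_pos, q_vel])
    R = np.array([[r]])
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)


K_baseline = gain_for(1.0, 1.0, 1.0)
K_costly_control = gain_for(1.0, 1.0, 100.0)   # penalize force more: smaller gains
K_tight_position = gain_for(100.0, 1.0, 1.0)   # penalize position error more: larger gains
```

Raising R relative to Q yields gentler control, while raising one diagonal entry of Q makes the controller work harder on that state variable. Only the ratios matter: scaling Q and R by the same factor leaves K unchanged.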