CSE 478
…
Lecture 4, April 1
- Motion models:
- Capture change in state, \(P(x_t | x_{t-1}, a_t)\)
- We can use simple models, adding some noise to account for
errors
- Good models incorporate their uncertainty
- Better to overstate uncertainty
- Our MuSHR motion model:
- Kinematic: assumes “speed”, not acceleration, is the action
- Assume perfect, flat ground
- State: vector \((x, y, \theta)\) (position and heading)
- Controls: speed, steering angle
- New state is old state plus the delta
- Noise:
- Add noise to control before propagating through the model
- We add noise to state after the new prediction
- Euler’s Theorem: TL;DR, for any moving, turning rigid body there is an instantaneous “center of rotation”, and the velocity of every point on the car is perpendicular to the line from that point to the center
- Equations:
- \(\dot{\theta} = \omega = \frac{v}{L} \tan\delta\), where \(L\) is the wheelbase and \(\delta\) the steering angle (a code sketch of the full update follows)
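A minimal sketch of this kinematic update in Python. The noise scales, wheelbase value, and function name are my own placeholders, not the course starter code:

```python
import numpy as np

def kinematic_step(state, control, dt, wheelbase=0.33, rng=None):
    """One noisy kinematic-car step. state = [x, y, theta], control = [speed, steering angle]."""
    rng = np.random.default_rng() if rng is None else rng

    # Add noise to the control before propagating it through the model.
    v = control[0] + rng.normal(0.0, 0.05)        # assumed speed noise
    delta = control[1] + rng.normal(0.0, 0.02)    # assumed steering noise

    x, y, theta = state
    theta_dot = (v / wheelbase) * np.tan(delta)   # theta_dot = (v / L) * tan(delta)
    new_state = np.array([
        x + v * np.cos(theta) * dt,               # new state = old state + delta
        y + v * np.sin(theta) * dt,
        theta + theta_dot * dt,
    ])

    # Add noise to the state after the prediction as well.
    return new_state + rng.normal(0.0, [0.01, 0.01, 0.005])
```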
- MuSHR sensor model:
- \(P(z_t | x_t) \rightarrow P(z_t | x_t, m)\)
- \(z_t\): observation (laser scan), \(x_t\): state, \(m\): map
- Sources of stochasticity:
- Measurement noise (gaussian around measurement)
- Unexpected objects, e.g. the map is wrong and someone is standing in between (add probability mass between 0 and the sensor reading)
- In class, we’ll use a linear density between zero and the observation
- Sensor failures (add some probability mass at \(z_\text{max}\))
- Random measurements (add a small uniform component over all possible readings); one way to combine the four parts is sketched below
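A rough sketch of a per-beam likelihood combining these four components. The mixture weights, noise scale, and the choice to ramp the "unexpected object" term down from zero to the expected reading are assumptions, not the course's exact model:

```python
import numpy as np

def beam_likelihood(z, z_expected, z_max=12.0,
                    w_hit=0.75, w_short=0.10, w_max=0.05, w_rand=0.10,
                    sigma_hit=0.1):
    """Likelihood of one range reading z given the expected range from ray casting."""
    # Measurement noise: Gaussian around the expected reading.
    p_hit = np.exp(-0.5 * ((z - z_expected) / sigma_hit) ** 2) / (sigma_hit * np.sqrt(2 * np.pi))
    # Unexpected objects: linear density between 0 and the expected reading.
    p_short = (2.0 / z_expected) * (1.0 - z / z_expected) if 0.0 <= z < z_expected else 0.0
    # Sensor failure: probability mass at z_max.
    p_max = 1.0 if np.isclose(z, z_max) else 0.0
    # Random measurements: uniform over [0, z_max].
    p_rand = 1.0 / z_max if 0.0 <= z <= z_max else 0.0
    return w_hit * p_hit + w_short * p_short + w_max * p_max + w_rand * p_rand
```

The full-scan likelihood is then (roughly) the product of per-beam likelihoods over the rays in the scan.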
Lecture 5, April 3
- Monte-Carlo methods: random sampling to estimate expectation
- Importance sampling:
- Take samples from a simple proposal distribution \(q(x)\) that is easy to sample from
- Reweight each sample by \(\frac{p(x)}{q(x)}\) so the weighted samples match the target distribution \(p(x)\) (small example below)
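A small numeric example of importance sampling; the target, proposal, and test function are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Target p(x): standard normal. Proposal q(x): a wider normal that is easy to sample.
samples = rng.normal(0.0, 3.0, size=10_000)
weights = normal_pdf(samples, 0.0, 1.0) / normal_pdf(samples, 0.0, 3.0)  # p(x) / q(x)

# Self-normalized estimate of E_p[x^2]; the true value is 1.
estimate = np.sum(weights * samples ** 2) / np.sum(weights)
print(estimate)
```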
- Belief updating:
- Wikihow How-To:
- Predict our current beliefs forward with motion model
- Reweight our beliefs with observation model
- Renormalize our beliefs
- Resample from our beliefs to prevent getting stuck
- Particle sampling issues:
- Two-room issue:
- Beliefs coalesce into one room via a feedback loop, even when the observations can’t distinguish the rooms
- Partial fix: resample less often when the weights are roughly equal
- Fix: low-variance resampling, take one random initial point and then uniform steps of \(1/N\) (see the sketch at the end of this lecture’s notes)
- Can use a variable number of particles
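Putting the pieces together, a sketch of one particle filter update with low-variance resampling. `motion_model` and `sensor_model` are assumed callables (e.g. a noisy kinematic step and a scan likelihood \(p(z | x, m)\)); this is illustrative, not the assignment code:

```python
import numpy as np

def low_variance_resample(particles, weights, rng):
    """One random starting point, then uniform steps of 1/N through the cumulative weights."""
    n = len(particles)
    positions = rng.uniform(0.0, 1.0 / n) + np.arange(n) / n
    return particles[np.searchsorted(np.cumsum(weights), positions)]

def particle_filter_step(particles, control, observation, dt,
                         motion_model, sensor_model, rng):
    # 1. Predict: push each particle through the noisy motion model.
    particles = np.array([motion_model(p, control, dt, rng=rng) for p in particles])
    # 2. Reweight with the observation model p(z | x, m).
    weights = np.array([sensor_model(observation, p) for p in particles])
    # 3. Renormalize.
    weights /= np.sum(weights)
    # 4. Resample to avoid getting stuck with a few heavy particles.
    return low_variance_resample(particles, weights, rng)
```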
Lecture 6, April 8
- Kalman Filtering: belief is a Gaussian, linear Gaussian systems
- Works with continuous distributions
- Parameterized by \(\mu_t\) and \(\sigma_t^2\)
- Works with linear models with Gaussian noise
- \(x_t = ax_{t-1} + bu_t + \mathcal{N}(0, \sigma_r^2)\)
- \(z_t = cx_t + \mathcal{N}(0, \sigma_q^2)\)
- Kalman gain: \(\frac{\text{Var(estimate)}}{\text{Var(estimate)} + \text{Var(measurement)}}\) (a scalar sketch appears at the end of this lecture’s notes)
- In multiple dimensions, we start to use covariance instead of
variance
- \(x_t\) is a distribution, but
often used in equations as the mean of the distribution
- More efficient than particle filters, but requires linear motion and
sensor models
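A scalar predict/update sketch of the Kalman filter for the linear-Gaussian model above; the default coefficients and noise variances are placeholders:

```python
def kalman_step(mu, var, u, z, a=1.0, b=1.0, c=1.0, var_r=0.1, var_q=0.5):
    """One cycle for x_t = a*x_{t-1} + b*u_t + N(0, var_r), z_t = c*x_t + N(0, var_q)."""
    # Predict: push the Gaussian belief through the linear motion model.
    mu_pred = a * mu + b * u
    var_pred = a * a * var + var_r
    # Update: Kalman gain = Var(estimate) / (Var(estimate) + Var(measurement)) when c = 1.
    k = c * var_pred / (c * c * var_pred + var_q)
    mu_new = mu_pred + k * (z - c * mu_pred)
    var_new = (1.0 - k * c) * var_pred
    return mu_new, var_new
```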
Lecture 7, April 10
- We are not studying the extended Kalman filter
- Requires us to compute the Jacobian
- Has issues with highly non-linear functions, only the mean gets the
nonlinear update
- Unscented Kalman filter:
- Easy to get the mean of the new Gaussian - just run the old mean
through the non-linear update
- Getting the variance is harder, we don’t have an easy exact way to
do so
- Instead, we model the distribution (and get its variance) via sampling:
- We get \(2n+1\) “sigma points”,
where \(n\) is the dimensionality of
the state vector
- First sigma point is the mean
- \(x_i = \mu + (\sqrt{(n+\lambda)\Sigma})_i\) for \(i = 1, \dots, n\)
- \(x_i = \mu - (\sqrt{(n+\lambda)\Sigma})_{i-n}\) for \(i = n+1, \dots, 2n\)
- The mean sigma point is weighted more heavily than the others, since it is our best single estimate (see the sketch below)
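A sketch of the sigma-point construction and the resulting unscented transform, using a Cholesky factor as the matrix square root; the simple weighting below ignores the separate covariance weight some formulations use for the mean point:

```python
import numpy as np

def sigma_points(mu, cov, lam):
    """Return the 2n+1 sigma points and their weights for mean mu and covariance cov."""
    n = len(mu)
    sqrt_term = np.linalg.cholesky((n + lam) * cov)   # one choice of matrix square root
    points = [mu]
    for i in range(n):
        points.append(mu + sqrt_term[:, i])
        points.append(mu - sqrt_term[:, i])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    weights[0] = lam / (n + lam)                      # the mean point gets a larger weight
    return np.array(points), weights

def unscented_transform(mu, cov, lam, f):
    """Push the Gaussian (mu, cov) through a nonlinear f and refit a Gaussian."""
    points, weights = sigma_points(mu, cov, lam)
    mapped = np.array([f(p) for p in points])
    new_mu = weights @ mapped
    diff = mapped - new_mu
    new_cov = (weights[:, None] * diff).T @ diff
    return new_mu, new_cov
```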
Lecture 8(?), April 15
- Control: create actuator controls to follow a
plan!
- Plans:
- Tracking a reference trajectory
- Time-parameterized: timed path
- Index-parameterized: no time in path
- Open-loop: get commands and run without modification
- Feedback control:
- Measure error between reference and current state
- Take actions to reduce error
- Choosing reference state for index-parameterized path:
- Can use nearest point, but can cause issues
- Pick a point at least a lookahead distance \(l\) away instead
- Compute error to reference state:
- Give the reference state its own coordinate frame
- Errors:
- Along-track error: \(e_\text{at}\) (error in \(x\))
- Cross-track error: \(e_\text{ct}\) (error in \(y\))
- Heading error: \(\theta_e\) (error in \(\theta\))
- See slides for the exact equations (a rough sketch of the reference choice and error computation follows this list)
- We will only control steering angle in this course, so we ignore
along-track error
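A rough sketch of picking an index-parameterized reference point and computing the tracking errors in its frame. The sign conventions and the "first point at least \(l\) away" rule are my reading of the notes, not the exact assignment code:

```python
import numpy as np

def pick_reference(path, state, lookahead):
    """path: (N, 3) array of (x, y, theta) waypoints; state: current (x, y, theta)."""
    dists = np.linalg.norm(path[:, :2] - state[:2], axis=1)
    far_enough = np.nonzero(dists >= lookahead)[0]
    idx = far_enough[0] if len(far_enough) > 0 else len(path) - 1
    return path[idx]

def tracking_errors(reference, state):
    """Express the car's pose in the reference point's coordinate frame."""
    dx, dy = state[0] - reference[0], state[1] - reference[1]
    cos_t, sin_t = np.cos(reference[2]), np.sin(reference[2])
    e_at = cos_t * dx + sin_t * dy                 # along-track error (along reference heading)
    e_ct = -sin_t * dx + cos_t * dy                # cross-track error (perpendicular to it)
    theta_e = (state[2] - reference[2] + np.pi) % (2 * np.pi) - np.pi  # wrapped heading error
    return e_at, e_ct, theta_e
```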
Lecture 9, April 17
- Compute control law: \(u = K(x, e)\)
- BANG-BANG CONTROL:
- Choose between max left and max right
- (Note: left is positive, right is negative)
- Think of the controller plus car as a dynamical system: bang-bang control oscillates around the reference like a pendulum
- PID (Proportional-Integral-Derivative) control:
- The “benchmark” controller (like linear regression in ML)
- Very popular in practice
- \(u = -(K_p e_{ct} + K_i \int e_{ct}\, dt + K_d \dot{e}_{ct})\)
- \(K_p e_{ct}\): (Proportional) present term
- \(K_i \int e_{ct}\, dt\): (Integral) past term
- \(K_d \dot{e}_{ct}\): (Derivative) future term
- The \(K\)’s are the gains for each of the terms
- Proportional gain:
- Low: long time to correct, smaller oscillations
- High: quickly correct, larger oscillations
- Integral term (ways to limit windup):
- Cap the absolute value of the term
- Discount old error over time
- Integrate over a finite time window
- Derivative term:
- Dampens the action by the change in error
- Reduces oscillations
- Numeric differentiation is terrible, use analytic derivatives
- \(\dot{e}_{ct} = V \sin(\theta_e)\) (used in the sketch below)
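A sketch of the PID steering law above, using the analytic derivative of the cross-track error. The gains and the integral cap are illustrative, not tuned course values:

```python
import numpy as np

class PIDSteering:
    """Steering control u = -(Kp*e_ct + Ki*integral(e_ct) + Kd*d(e_ct)/dt)."""

    def __init__(self, kp=1.0, ki=0.0, kd=0.3, integral_cap=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.integral_cap = integral_cap           # cap the integral term's magnitude

    def control(self, e_ct, theta_e, speed, dt):
        self.integral = np.clip(self.integral + e_ct * dt,
                                -self.integral_cap, self.integral_cap)
        e_ct_dot = speed * np.sin(theta_e)         # analytic derivative: V * sin(theta_e)
        return -(self.kp * e_ct + self.ki * self.integral + self.kd * e_ct_dot)
```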
- How to prove a controller is stable?
- Stability means the error and its derivative go to \(0\) as \(t \rightarrow \infty\)
- Passive error dynamics: what happens when we don’t apply any controls?
- With a gradient-style controller, we’re essentially performing gradient descent on an energy (Lyapunov) landscape
- This shows that any positive gain brings the error to zero
- There exists a closed-form solution for a provably-correct
controller (in theory)
Lecture …, missed
Lecture ?, April 24
- Social navigation: help humans feel comfortable and safe with
robots
- Collaboration/teaming: shared autonomy
- Learning from humans: how can humans teach a robot?
April 29
- Global planning: plan high level abstract actions
- Local planning: physical actions being taken for a high-level
action
- Configuration space (C-space):
- Represent state as a point in high-dim space
- We can use \(S^1\) to represent a circle (e.g. an angle)
- The Cartesian product of two circles, \(S^1 \times S^1\), is a torus
- Obstacle specification:
- Robot is a point in C-space
- Some of this space is full of obstacles
- The C-space obstacle is the set of all configurations at which the robot overlaps a real obstacle
May 1
- Explicit graph: precompute the whole graph (vertices, edges, and collision checks) up front
- Implicit graph: jointly discover graph and find path within it
- “Best”-first search:
- Use priority queue with some priority function
- Completeness: the search returns a solution whenever one exists, otherwise it reports failure
- Heuristics: estimate the cost to go
- Admissibility: the heuristic never overestimates the true cost to the goal
- Consistency: triangle inequality, \(h(s) \leq c(s, s') + h(s')\)
- Weighted A*: multiply the heuristic by a weight \(\epsilon \geq 1\); the solution cost is within a factor \(\epsilon\) of optimal (see the sketch below)
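A sketch of weighted best-first search over an implicit graph; `neighbors` and `heuristic` are assumed callables, and with `epsilon = 1` this reduces to plain A*:

```python
import heapq
import itertools

def weighted_astar(start, goal, neighbors, heuristic, epsilon=1.0):
    """Priority f = g + epsilon * h; neighbors(s) yields (successor, edge_cost) pairs."""
    counter = itertools.count()            # tie-breaker so the heap never compares states
    frontier = [(epsilon * heuristic(start), next(counter), 0.0, start, None)]
    best_g, parents = {}, {}
    while frontier:
        _, _, g, s, parent = heapq.heappop(frontier)
        if s in best_g and best_g[s] <= g:
            continue                       # already expanded with a cheaper path
        best_g[s], parents[s] = g, parent
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = parents[s]
            return list(reversed(path))
        for s_next, cost in neighbors(s):
            g_next = g + cost
            if s_next not in best_g or g_next < best_g[s_next]:
                priority = g_next + epsilon * heuristic(s_next)
                heapq.heappush(frontier, (priority, next(counter), g_next, s_next, s))
    return None                            # no path exists
```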
May 6
- Create discrete graph approximations of the continuous configuration
space
- Graph creation:
- Sample collision-free configurations as vertices
- Add edges between nearby vertices
- Use a collision checker to make sure edges do not pass through obstacles
- Collision checking:
- Use a steer function to find a feasible path between two configurations
- Check points along the path for collisions in Van der Corput order
- Good for catching collisions early (coarse resolution first); see the sketch at the end of this section
- Connecting two configurations exactly is a (two-point) boundary value problem; this is what the steer function has to solve
- Sampling strategies:
- Lattice sampling
- Uniform random sampling
- Low-Dispersion sampling (Halton sequence)
- Can interleave graph construction and search
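A rough sketch of the graph construction and edge checking described above; straight-line interpolation stands in for a real steer function, and the radius and sample counts are arbitrary:

```python
import numpy as np

def van_der_corput(n, base=2):
    """First n terms of the base-2 Van der Corput sequence: 1/2, 1/4, 3/4, 1/8, ..."""
    seq = []
    for i in range(1, n + 1):
        q, denom, k = 0.0, 1.0, i
        while k > 0:
            denom *= base
            k, remainder = divmod(k, base)
            q += remainder / denom
        seq.append(q)
    return seq

def edge_collision_free(q_a, q_b, in_collision, n_checks=32):
    """Check interpolated points along the edge, coarse resolution first."""
    return not any(in_collision(q_a + t * (q_b - q_a)) for t in van_der_corput(n_checks))

def build_roadmap(sample_config, in_collision, n_vertices=200, radius=1.0):
    """Sample collision-free vertices and connect nearby ones with checked edges."""
    vertices = []
    while len(vertices) < n_vertices:
        q = sample_config()
        if not in_collision(q):
            vertices.append(q)
    vertices = np.array(vertices)
    edges = [(i, j)
             for i in range(n_vertices) for j in range(i + 1, n_vertices)
             if np.linalg.norm(vertices[i] - vertices[j]) <= radius
             and edge_collision_free(vertices[i], vertices[j], in_collision)]
    return vertices, edges
```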
May 8
- DARPA Racer presentation
- Goal: move as quickly as possible with the fewest safety interventions
- The car is a very large version of MuSHR
- The autonomy stack is nearly the same as MuSHR’s
- Perception:
- Local costmap: map of obstacle costs
- Need to convert local perspective to birds-eye view for planner
- Pinhole -> voxel -> meanpool
- Use a backbone to complete the full map
- Use image inpainting for prediction
- Note: this is different from segmentation; inpainting infers content outside the observed data
- The costmap is a weighted average of features
- Planning:
- A* search (working on D*, an incremental variant of A* that repairs the plan as edge costs change)
- MPC:
- Evaluate a set of possible trajectories and execute the best one
- Sensing uncertainty:
- Aleatoric: uncertainty due to environment (e.g. occlusion)
- Epistemic: lack of support from training data (ML uncertainty)
- Where these pop up:
- Fluctuating obstacles
- Consistent inaccurate predictions
- Myopic perception
- Determinization: sample or threshold stochastic world into a
deterministic one
- Super fucking cool
May 20
- RL Lecture: typical MDP stuff
- Inverse RL: learn a reward function from demonstrations of an (approximately optimal) policy
- (Rough) algo:
- Start with some reward function \(r\)
- Learn policy for \(r\)
- Compare policy to demonstrations
- Update \(r\)
- Repeat
- Linear reward function: reward is a linear function of the
features
- Linear inverse RL: update \(r\) via the difference in feature expectations (rough sketch below)
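A rough sketch of that linear IRL loop via feature-expectation matching. The helper callables and the plain gradient step are placeholders for whatever policy solver and rollout estimator the lecture assumed:

```python
import numpy as np

def linear_irl(demo_feature_expectations, solve_policy, estimate_feature_expectations,
               n_features, n_iters=50, step_size=0.1):
    """Reward is linear in features: r(s) = w . phi(s).
    solve_policy(w) returns a policy for that reward;
    estimate_feature_expectations(policy) estimates E[sum_t phi(s_t)] under it."""
    w = np.zeros(n_features)                          # start with some reward function
    for _ in range(n_iters):
        policy = solve_policy(w)                      # learn a policy for the current reward
        mu_policy = estimate_feature_expectations(policy)
        # Compare to the demonstrations and update r toward features they visit more often.
        w += step_size * (demo_feature_expectations - mu_policy)
    return w
```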