CSE 478
…
Lecture 4, April 1
- Motion models:
- Capture change in state, \(P(x_t | x_{t-1}, a_t)\)
- We can use simple models, adding some noise to account for
errors
- Good models incorporate their uncertainty
- Better to overstate uncertainty
- Our MuSHR motion model:
- Kinematic: assumes “speed”, not acceleration, is the action
- Assume perfect, flat ground
- State: vector \((x, y, \theta)\) (position and heading)
- Controls: speed, steering angle
- New state is old state plus the delta
- Noise:
- Add noise to control before propagating through the model
- We add noise to state after the new prediction
- Euler’s Theorem: TL;DR, for any moving, turning rigid body there is an instantaneous “center of rotation”, and the velocity of every point on the car is perpendicular to the line from that point to the center
- Equations:
- \(\dot{\theta} = \omega = \frac{v}{L} \tan\delta\), where \(L\) is the wheelbase and \(\delta\) the steering angle (a code sketch of the full update follows)
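A minimal sketch of this kinematic update in Python. The noise scales, wheelbase value, and function name are my own placeholders, not the course starter code:

```python
import numpy as np

def kinematic_step(state, control, dt, wheelbase=0.33, rng=None):
    """One noisy kinematic-car step. state = [x, y, theta], control = [speed, steering angle]."""
    rng = np.random.default_rng() if rng is None else rng

    # Add noise to the control before propagating it through the model.
    v = control[0] + rng.normal(0.0, 0.05)        # assumed speed noise
    delta = control[1] + rng.normal(0.0, 0.02)    # assumed steering noise

    x, y, theta = state
    theta_dot = (v / wheelbase) * np.tan(delta)   # theta_dot = (v / L) * tan(delta)
    new_state = np.array([
        x + v * np.cos(theta) * dt,               # new state = old state + delta
        y + v * np.sin(theta) * dt,
        theta + theta_dot * dt,
    ])

    # Add noise to the state after the prediction as well.
    return new_state + rng.normal(0.0, [0.01, 0.01, 0.005])
```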
- MuSHR sensor model:
- \(P(z_t | x_t) \rightarrow P(z_t | x_t, m)\)
- \(z_t\): observation (laser scan), \(x_t\): state, \(m\): map
- Sources of stochasticity:
- Measurement noise (gaussian around measurement)
- Unexpected objects, e.g. the map is wrong and someone is standing in between (add probability mass between 0 and the sensor reading)
- In class, we’ll use a linear density between zero and the observation
- Sensor failures (add some probability mass at \(z_\text{max}\))
- Random measurements (add a small uniform component over all possible readings); one way to combine the four parts is sketched below
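A rough sketch of a per-beam likelihood combining these four components. The mixture weights, noise scale, and the choice to ramp the "unexpected object" term down from zero to the expected reading are assumptions, not the course's exact model:

```python
import numpy as np

def beam_likelihood(z, z_expected, z_max=12.0,
                    w_hit=0.75, w_short=0.10, w_max=0.05, w_rand=0.10,
                    sigma_hit=0.1):
    """Likelihood of one range reading z given the expected range from ray casting."""
    # Measurement noise: Gaussian around the expected reading.
    p_hit = np.exp(-0.5 * ((z - z_expected) / sigma_hit) ** 2) / (sigma_hit * np.sqrt(2 * np.pi))
    # Unexpected objects: linear density between 0 and the expected reading.
    p_short = (2.0 / z_expected) * (1.0 - z / z_expected) if 0.0 <= z < z_expected else 0.0
    # Sensor failure: probability mass at z_max.
    p_max = 1.0 if np.isclose(z, z_max) else 0.0
    # Random measurements: uniform over [0, z_max].
    p_rand = 1.0 / z_max if 0.0 <= z <= z_max else 0.0
    return w_hit * p_hit + w_short * p_short + w_max * p_max + w_rand * p_rand
```

The full-scan likelihood is then (roughly) the product of per-beam likelihoods over the rays in the scan.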
Lecture 5, April 3
- Monte-Carlo methods: random sampling to estimate expectation
- Importance sampling:
- Take samples from a simple proposal distribution \(q(x)\) that is easy to sample from
- Reweight each sample by \(\frac{p(x)}{q(x)}\) so the weighted samples match the target distribution \(p(x)\) (small example below)
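A small numeric example of importance sampling; the target, proposal, and test function are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Target p(x): standard normal. Proposal q(x): a wider normal that is easy to sample.
samples = rng.normal(0.0, 3.0, size=10_000)
weights = normal_pdf(samples, 0.0, 1.0) / normal_pdf(samples, 0.0, 3.0)  # p(x) / q(x)

# Self-normalized estimate of E_p[x^2]; the true value is 1.
estimate = np.sum(weights * samples ** 2) / np.sum(weights)
print(estimate)
```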
- Belief updating:
- Wikihow How-To:
- Predict our current beliefs forward with motion model
- Reweight our beliefs with observation model
- Renormalize our beliefs
- Resample from our beliefs to prevent getting stuck
- Particle sampling issues:
- Two-room issue:
- Beliefs coalesce into one room via a feedback loop, even when the observations can’t distinguish the rooms
- Partial fix: resample less often when the weights are roughly equal
- Fix: low-variance resampling, take one random initial point and then uniform steps of \(1/N\) (see the sketch at the end of this lecture’s notes)
- Can use a variable number of particles
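Putting the pieces together, a sketch of one particle filter update with low-variance resampling. `motion_model` and `sensor_model` are assumed callables (e.g. a noisy kinematic step and a scan likelihood \(p(z | x, m)\)); this is illustrative, not the assignment code:

```python
import numpy as np

def low_variance_resample(particles, weights, rng):
    """One random starting point, then uniform steps of 1/N through the cumulative weights."""
    n = len(particles)
    positions = rng.uniform(0.0, 1.0 / n) + np.arange(n) / n
    return particles[np.searchsorted(np.cumsum(weights), positions)]

def particle_filter_step(particles, control, observation, dt,
                         motion_model, sensor_model, rng):
    # 1. Predict: push each particle through the noisy motion model.
    particles = np.array([motion_model(p, control, dt, rng=rng) for p in particles])
    # 2. Reweight with the observation model p(z | x, m).
    weights = np.array([sensor_model(observation, p) for p in particles])
    # 3. Renormalize.
    weights /= np.sum(weights)
    # 4. Resample to avoid getting stuck with a few heavy particles.
    return low_variance_resample(particles, weights, rng)
```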
Lecture 6, April 8
- Kalman Filtering: belief is a Gaussian, linear Gaussian systems
- Works with continuous distributions
- Parameterized by \(\mu_t\) and \(\sigma_t^2\)
- Works with linear models with Gaussian noise
- \(x_t = ax_{t-1} + bu_t + \mathcal{N}(0, \sigma_r^2)\)
- \(z_t = cx_t + \mathcal{N}(0, \sigma_q^2)\)
- Kalman gain: \(\frac{\text{Var(estimate)}}{\text{Var(estimate)} + \text{Var(measurement)}}\) (a scalar sketch appears at the end of this lecture’s notes)
- In multiple dimensions, we start to use covariance instead of
variance
- \(x_t\) is a distribution, but
often used in equations as the mean of the distribution
- More efficient than particle filters, but requires linear motion and
sensor models
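A scalar predict/update sketch of the Kalman filter for the linear-Gaussian model above; the default coefficients and noise variances are placeholders:

```python
def kalman_step(mu, var, u, z, a=1.0, b=1.0, c=1.0, var_r=0.1, var_q=0.5):
    """One cycle for x_t = a*x_{t-1} + b*u_t + N(0, var_r), z_t = c*x_t + N(0, var_q)."""
    # Predict: push the Gaussian belief through the linear motion model.
    mu_pred = a * mu + b * u
    var_pred = a * a * var + var_r
    # Update: Kalman gain = Var(estimate) / (Var(estimate) + Var(measurement)) when c = 1.
    k = c * var_pred / (c * c * var_pred + var_q)
    mu_new = mu_pred + k * (z - c * mu_pred)
    var_new = (1.0 - k * c) * var_pred
    return mu_new, var_new
```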
Lecture 7, April 10
- We are not studying the extended Kalman filter
- Requires us to compute the Jacobian
- Has issues with highly non-linear functions, only the mean gets the
nonlinear update
- Unscented Kalman filter:
- Easy to get the mean of the new Gaussian - just run the old mean
through the non-linear update
- Getting the variance is harder, we don’t have an easy exact way to
do so
- Instead, we model the distribution (and get its variance) via sampling:
- We get \(2n+1\) “sigma points”,
where \(n\) is the dimensionality of
the state vector
- First sigma point is the mean
- \(x_i = \mu + (\sqrt{(n+\lambda)\Sigma})_i\) for \(i = 1, \dots, n\)
- \(x_i = \mu - (\sqrt{(n+\lambda)\Sigma})_{i-n}\) for \(i = n+1, \dots, 2n\)
- The mean sigma point is weighted more heavily than the others, since it is our best single estimate (see the sketch below)
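A sketch of the sigma-point construction and the resulting unscented transform, using a Cholesky factor as the matrix square root; the simple weighting below ignores the separate covariance weight some formulations use for the mean point:

```python
import numpy as np

def sigma_points(mu, cov, lam):
    """Return the 2n+1 sigma points and their weights for mean mu and covariance cov."""
    n = len(mu)
    sqrt_term = np.linalg.cholesky((n + lam) * cov)   # one choice of matrix square root
    points = [mu]
    for i in range(n):
        points.append(mu + sqrt_term[:, i])
        points.append(mu - sqrt_term[:, i])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    weights[0] = lam / (n + lam)                      # the mean point gets a larger weight
    return np.array(points), weights

def unscented_transform(mu, cov, lam, f):
    """Push the Gaussian (mu, cov) through a nonlinear f and refit a Gaussian."""
    points, weights = sigma_points(mu, cov, lam)
    mapped = np.array([f(p) for p in points])
    new_mu = weights @ mapped
    diff = mapped - new_mu
    new_cov = (weights[:, None] * diff).T @ diff
    return new_mu, new_cov
```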
Lecture 8(?), April 15
- Control: create actuator controls to follow a
plan!
- Plans:
- Tracking a reference trajectory
- Time-parameterized: timed path
- Index-parameterized: no time in path
- Open-loop: get commands and run without modification
- Feedback control:
- Measure error between reference and current state
- Take actions to reduce error
- Choosing reference state for index-parameterized path:
- Can use nearest point, but can cause issues
- Pick a point at least a lookahead distance \(l\) away instead
- Compute error to reference state:
- Give the reference state its own coordinate frame
- Errors:
- Along-track error: \(e_\text{at}\) (error in \(x\))
- Cross-track error: \(e_\text{ct}\) (error in \(y\))
- Heading error: \(\theta_e\) (error in \(\theta\))
- See slides for the exact equations (a rough sketch of the reference choice and error computation follows this list)
- We will only control steering angle in this course, so we ignore
along-track error
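A rough sketch of picking an index-parameterized reference point and computing the tracking errors in its frame. The sign conventions and the "first point at least \(l\) away" rule are my reading of the notes, not the exact assignment code:

```python
import numpy as np

def pick_reference(path, state, lookahead):
    """path: (N, 3) array of (x, y, theta) waypoints; state: current (x, y, theta)."""
    dists = np.linalg.norm(path[:, :2] - state[:2], axis=1)
    far_enough = np.nonzero(dists >= lookahead)[0]
    idx = far_enough[0] if len(far_enough) > 0 else len(path) - 1
    return path[idx]

def tracking_errors(reference, state):
    """Express the car's pose in the reference point's coordinate frame."""
    dx, dy = state[0] - reference[0], state[1] - reference[1]
    cos_t, sin_t = np.cos(reference[2]), np.sin(reference[2])
    e_at = cos_t * dx + sin_t * dy                 # along-track error (along reference heading)
    e_ct = -sin_t * dx + cos_t * dy                # cross-track error (perpendicular to it)
    theta_e = (state[2] - reference[2] + np.pi) % (2 * np.pi) - np.pi  # wrapped heading error
    return e_at, e_ct, theta_e
```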
Lecture 9, April 17
- Compute control law: \(u = K(x, e)\)
- BANG-BANG CONTROL:
- Choose between max left and max right
- (Note: left is positive, right is negative)
- Think of the controller plus car as a dynamical system: bang-bang control oscillates around the reference like a pendulum
- PID (Proportional-Integral-Derivative) control:
- The “benchmark” controller (like linear regression in ML)
- Very popular in practice
- \(u = -(K_p e_{ct} + K_i \int e_{ct}\, dt + K_d \dot{e}_{ct})\)
- \(K_p e_{ct}\): (Proportional) present term
- \(K_i \int e_{ct}\, dt\): (Integral) past term
- \(K_d \dot{e}_{ct}\): (Derivative) future term
- The \(K\)’s are the gains for each of the terms
- Proportional gain:
- Low: long time to correct, smaller oscillations
- High: quickly correct, larger oscillations
- Integral term (ways to limit windup):
- Cap the absolute value of the term
- Discount old error over time
- Integrate over a finite time window
- Derivative term:
- Dampens the action by the change in error
- Reduces oscillations
- Numeric differentiation is terrible, use analytic derivatives
- \(\dot{e}_{ct} = V \sin(\theta_e)\) (used in the sketch below)
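A sketch of the PID steering law above, using the analytic derivative of the cross-track error. The gains and the integral cap are illustrative, not tuned course values:

```python
import numpy as np

class PIDSteering:
    """Steering control u = -(Kp*e_ct + Ki*integral(e_ct) + Kd*d(e_ct)/dt)."""

    def __init__(self, kp=1.0, ki=0.0, kd=0.3, integral_cap=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.integral_cap = integral_cap           # cap the integral term's magnitude

    def control(self, e_ct, theta_e, speed, dt):
        self.integral = np.clip(self.integral + e_ct * dt,
                                -self.integral_cap, self.integral_cap)
        e_ct_dot = speed * np.sin(theta_e)         # analytic derivative: V * sin(theta_e)
        return -(self.kp * e_ct + self.ki * self.integral + self.kd * e_ct_dot)
```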
- How to prove a controller is stable?
- Stability means the error and its derivative go to \(0\) as \(t \rightarrow \infty\)
- Passive error dynamics: what happens when we don’t apply any controls?
- With a gradient-style controller, we’re essentially performing gradient descent on an energy (Lyapunov) landscape
- This shows that any positive gain brings the error to zero
- There exists a closed-form solution for a provably-correct
controller (in theory)
Lecture …, missed
Lecture ?, April 24
- Social navigation: help humans feel comfortable and safe with
robots
- Collaboration/teaming: shared autonomy
- Learning from humans: how can humans teach a robot?
April 29
- Global planning: plan high level abstract actions
- Local planning: physical actions being taken for a high-level
action
- Configuration space (C-space):
- Represent state as a point in high-dim space
- We can use \(S^1\) to represent a circle (e.g. an angle)
- The Cartesian product of two circles, \(S^1 \times S^1\), is a torus
- Obstacle specification:
- Robot is a point in C-space
- Some of this space is full of obstacles
- The C-space obstacle is the set of all configurations at which the robot overlaps a real obstacle
May 1
- Explicit graph: precompute the whole graph (vertices, edges, and collision checks) up front
- Implicit graph: jointly discover graph and find path within it
- “Best”-first search:
- Use priority queue with some priority function
- Completeness: the search returns a solution whenever one exists, otherwise it reports failure
- Heuristics: estimate the cost to go
- Admissibility: the heuristic never overestimates the true cost to the goal
- Consistency: triangle inequality, \(h(s) \leq c(s, s') + h(s')\)
- Weighted A*: multiply the heuristic by a weight \(\epsilon \geq 1\); the solution cost is within a factor \(\epsilon\) of optimal (see the sketch below)
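A sketch of weighted best-first search over an implicit graph; `neighbors` and `heuristic` are assumed callables, and with `epsilon = 1` this reduces to plain A*:

```python
import heapq
import itertools

def weighted_astar(start, goal, neighbors, heuristic, epsilon=1.0):
    """Priority f = g + epsilon * h; neighbors(s) yields (successor, edge_cost) pairs."""
    counter = itertools.count()            # tie-breaker so the heap never compares states
    frontier = [(epsilon * heuristic(start), next(counter), 0.0, start, None)]
    best_g, parents = {}, {}
    while frontier:
        _, _, g, s, parent = heapq.heappop(frontier)
        if s in best_g and best_g[s] <= g:
            continue                       # already expanded with a cheaper path
        best_g[s], parents[s] = g, parent
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = parents[s]
            return list(reversed(path))
        for s_next, cost in neighbors(s):
            g_next = g + cost
            if s_next not in best_g or g_next < best_g[s_next]:
                priority = g_next + epsilon * heuristic(s_next)
                heapq.heappush(frontier, (priority, next(counter), g_next, s_next, s))
    return None                            # no path exists
```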
May 6
- Create discrete graph approximations of the continuous configuration
space
- Graph creation:
- Sample collision-free configurations as vertices
- Add edges between nearby vertices
- Use a collision checker to make sure edges do not pass through obstacles
- Collision checking:
- Use a steer function to find a feasible path between two configurations
- Check points along the path for collisions in Van der Corput order
- Good for catching collisions early (coarse resolution first); see the sketch at the end of this section
- Connecting two configurations exactly is a (two-point) boundary value problem; this is what the steer function has to solve
- Sampling strategies:
- Lattice sampling
- Uniform random sampling
- Low-Dispersion sampling (Halton sequence)
- Can interleave graph construction and search
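A rough sketch of the graph construction and edge checking described above; straight-line interpolation stands in for a real steer function, and the radius and sample counts are arbitrary:

```python
import numpy as np

def van_der_corput(n, base=2):
    """First n terms of the base-2 Van der Corput sequence: 1/2, 1/4, 3/4, 1/8, ..."""
    seq = []
    for i in range(1, n + 1):
        q, denom, k = 0.0, 1.0, i
        while k > 0:
            denom *= base
            k, remainder = divmod(k, base)
            q += remainder / denom
        seq.append(q)
    return seq

def edge_collision_free(q_a, q_b, in_collision, n_checks=32):
    """Check interpolated points along the edge, coarse resolution first."""
    return not any(in_collision(q_a + t * (q_b - q_a)) for t in van_der_corput(n_checks))

def build_roadmap(sample_config, in_collision, n_vertices=200, radius=1.0):
    """Sample collision-free vertices and connect nearby ones with checked edges."""
    vertices = []
    while len(vertices) < n_vertices:
        q = sample_config()
        if not in_collision(q):
            vertices.append(q)
    vertices = np.array(vertices)
    edges = [(i, j)
             for i in range(n_vertices) for j in range(i + 1, n_vertices)
             if np.linalg.norm(vertices[i] - vertices[j]) <= radius
             and edge_collision_free(vertices[i], vertices[j], in_collision)]
    return vertices, edges
```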
May 8
- DARPA Racer presentation
- Goal: move as quickly as possible with the fewest safety interventions
- The car is a very large version of MuSHR
- The autonomy stack is nearly the same as MuSHR’s
- Perception:
- Local costmap: map of obstacle costs
- Need to convert local perspective to birds-eye view for planner
- Pinhole -> voxel -> meanpool
- Use a backbone to complete the full map
- Use image inpainting for prediction
- Note: this is different from segmentation; inpainting infers content outside the observed data
- The costmap is a weighted average of features
- Planning:
- A* search (working on D*, an incremental variant of A* that repairs the plan as edge costs change)
- MPC:
- Evaluate a set of possible trajectories and execute the best one
- Sensing uncertainty:
- Aleatoric: uncertainty due to environment (e.g. occlusion)
- Epistemic: lack of support from training data (ML uncertainty)
- Where these pop up:
- Fluctuating obstacles
- Consistent inaccurate predictions
- Myopic perception
- Determinization: sample or threshold stochastic world into a
deterministic one
- Super fucking cool
May 20
- RL Lecture: typical MDP stuff
- Inverse RL: learn a reward function from demonstrations of an (approximately optimal) policy
- (Rough) algo:
- Start with some reward function \(r\)
- Learn policy for \(r\)
- Compare policy to demonstrations
- Update \(r\)
- Repeat
- Linear reward function: reward is a linear function of the
features
- Linear inverse RL: update \(r\) via the difference in feature expectations (rough sketch below)
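A rough sketch of that linear IRL loop via feature-expectation matching. The helper callables and the plain gradient step are placeholders for whatever policy solver and rollout estimator the lecture assumed:

```python
import numpy as np

def linear_irl(demo_feature_expectations, solve_policy, estimate_feature_expectations,
               n_features, n_iters=50, step_size=0.1):
    """Reward is linear in features: r(s) = w . phi(s).
    solve_policy(w) returns a policy for that reward;
    estimate_feature_expectations(policy) estimates E[sum_t phi(s_t)] under it."""
    w = np.zeros(n_features)                          # start with some reward function
    for _ in range(n_iters):
        policy = solve_policy(w)                      # learn a policy for the current reward
        mu_policy = estimate_feature_expectations(policy)
        # Compare to the demonstrations and update r toward features they visit more often.
        w += step_size * (demo_feature_expectations - mu_policy)
    return w
```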