…

- Motion models:
- Capture change in state, \(P(x' | x, a)\)
- We can use simple models, adding some noise to account for errors
- Good models incorporate their own uncertainty
- Better to overstate uncertainty

- Our MuSHR motion model:
- Kinematic: assumes “speed”, not acceleration, is the action
- Assume perfect, flat ground
- State: vector \((x, y, \theta)\) (position and heading)
- Controls: speed, steering angle
- New state is old state plus the delta
- Noise:
- Add noise to the control before propagating it through the model
- Add noise to the state after the new prediction (both shown in the sketch after the equations below)

- Euler’s Theorem: TLDR; any moving, turning rigid body has an instantaneous “center of rotation”, and every point’s velocity is perpendicular to the line from that point to the center
- Equations:
- \(\dot{\theta} = \omega = \frac{v}{L} \tan \delta\)
- Rate of change of heading (\(L\) is the wheelbase, \(\delta\) the steering angle)

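A minimal sketch of this model in code; the wheelbase and noise scales here are illustrative assumptions, not the class's exact values:

```python
import numpy as np

def kinematic_step(state, control, dt, rng, L=0.33):
    """One noisy kinematic-car step: state = (x, y, theta), control = (v, delta)."""
    # Noise the control *before* propagating it through the model.
    v = control[0] + rng.normal(0.0, 0.05)
    delta = control[1] + rng.normal(0.0, 0.02)

    x, y, theta = state
    theta_dot = (v / L) * np.tan(delta)  # heading rate from Euler's theorem
    new_state = np.array([
        x + v * np.cos(theta) * dt,      # new state = old state + delta
        y + v * np.sin(theta) * dt,
        theta + theta_dot * dt,
    ])
    # Noise the state *after* the new prediction.
    return new_state + rng.normal(0.0, [0.01, 0.01, 0.005])
```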
- MuSHR sensor model:
- \(P(z_t | x_t) \rightarrow P(z_t | x_t, m)\)
- \(z_t\): observation (laser scan), \(x_t\): state, \(m\): map

- Sources of stochasticity:
- Measurement noise (Gaussian around the expected measurement)
- Unexpected objects, e.g. the map is wrong and someone is standing in between (add probability between 0 and the expected sensor reading)
- In class, we’ll use a linear probability between zero and the observation (sketched below)

- Sensor failures (add some probability mass at \(z_\text{max}\))
- Random measurements (add uniform random measurement)

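A sketch of one beam's likelihood under this mixture; the weights, sigma, and the exact shape of the linear term are placeholder assumptions:

```python
import numpy as np

def beam_likelihood(z, z_exp, z_max, sigma=0.1,
                    w_hit=0.75, w_short=0.10, w_max=0.05, w_rand=0.10):
    """Mixture model for P(z_t | x_t, m); z_exp is the range the map predicts."""
    # Measurement noise: Gaussian around the expected reading.
    p_hit = np.exp(-0.5 * ((z - z_exp) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    # Unexpected objects: linear probability between 0 and the expected reading
    # (a normalized ramp that decays toward z_exp; one possible choice).
    p_short = 2.0 * (z_exp - z) / z_exp ** 2 if 0.0 <= z <= z_exp and z_exp > 0 else 0.0
    # Sensor failure: probability mass at z_max.
    p_max = 1.0 if z >= z_max else 0.0
    # Random measurement: uniform over the sensor's range.
    p_rand = 1.0 / z_max
    return w_hit * p_hit + w_short * p_short + w_max * p_max + w_rand * p_rand
```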

- Monte-Carlo methods: random sampling to estimate expectation
- Importance sampling:
- Sample from a simple distribution
- Take samples from proposal dist \(q(x)\)

- Reweight the samples to match target distribution \(p(x)\)
- Reweight samples by \(\frac{p(x)}{q(x)}\) (numeric sketch below)

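A tiny numeric sketch; the target and proposal are chosen arbitrarily, and normalizing constants cancel because we self-normalize the weights:

```python
import numpy as np

rng = np.random.default_rng(0)
xs = rng.normal(0.0, 2.0, size=10_000)        # samples from the proposal q = N(0, 4)

p = lambda x: np.exp(-0.5 * (x - 1.0) ** 2)   # unnormalized target p = N(1, 1)
q = lambda x: np.exp(-0.5 * (x / 2.0) ** 2)   # proposal density, up to a constant

w = p(xs) / q(xs)                             # reweight each sample by p/q
w /= w.sum()                                  # self-normalize
print(np.sum(w * xs))                         # estimate of E_p[x], close to 1.0
```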
- Belief updating:
- Wikihow How-To:
- Predict our current beliefs forward with motion model
- Reweight our beliefs with observation model
- Renormalize our beliefs

- Resample from our beliefs to prevent getting stuck carrying near-zero-weight particles (full step sketched below)

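The same how-to as a sketch; `motion_model` and `sensor_model` stand in for models like the ones above:

```python
import numpy as np

def particle_filter_step(particles, weights, control, obs, rng,
                         motion_model, sensor_model):
    """One belief update; particles is an (N, state_dim) array."""
    # 1. Predict current beliefs forward with the motion model.
    particles = motion_model(particles, control, rng)
    # 2. Reweight beliefs with the observation model.
    weights = weights * sensor_model(obs, particles)
    # 3. Renormalize.
    weights = weights / weights.sum()
    # 4. Resample so low-weight particles don't get us stuck.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```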
- Particle sampling issues:
- Two-room issue:
- Beliefs will coalesce into one room in a feedback loop, even when both rooms are equally likely
- E.g. fix: resample less often when most weights are roughly equal
- Fix: low-variance resampling, take one random initial point and then uniform steps (sketched below)

- Can have variable numbers of particles
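
A sketch of low-variance (systematic) resampling: one random draw, then uniform steps through the cumulative weights:

```python
import numpy as np

def low_variance_resample(particles, weights, rng):
    """Pick one random start in [0, 1/n), then step through the CDF uniformly."""
    n = len(weights)
    positions = rng.uniform(0.0, 1.0 / n) + np.arange(n) / n
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx]
```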

- Kalman Filtering: belief is a Gaussian, linear Gaussian systems
- Works with continuous distributions
- Parameterized by \(\mu_t\) and \(\sigma_t^2\)
- Works with linear models with Gaussian noise
- \(x_t = ax_{t-1} + bu_t + \mathcal{N}(0, \sigma_r^2)\)
- \(z_t = cx_t + \mathcal{N}(0, \sigma_q^2)\)

- Kalman gain: \(K = \frac{\text{Var(estimate)}}{\text{Var(estimate)} + \text{Var(measurement)}}\)
- In multiple dimensions, we start to use covariance instead of variance
- \(x_t\) is a distribution, but often used in equations as the mean of the distribution
- More efficient than particle filters, but requires linear motion and sensor models (1-D sketch below)
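
A one-dimensional predict/correct step matching the model above; the parameter values are illustrative:

```python
def kalman_step(mu, var, u, z, a=1.0, b=1.0, c=1.0, var_r=0.1, var_q=0.5):
    """x_t = a*x_{t-1} + b*u_t + N(0, var_r); z_t = c*x_t + N(0, var_q)."""
    # Predict: push the Gaussian belief through the linear motion model.
    mu_bar = a * mu + b * u
    var_bar = a * a * var + var_r
    # Kalman gain = Var(estimate) / (Var(estimate) + Var(measurement)), for c = 1.
    k = var_bar * c / (c * c * var_bar + var_q)
    # Correct: blend the prediction and the measurement by the gain.
    return mu_bar + k * (z - c * mu_bar), (1.0 - k * c) * var_bar
```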

- We are not studying the extended Kalman filter
- Requires us to compute the Jacobian
- Has issues with highly non-linear functions; only the mean gets the nonlinear update

- Unscented Kalman filter:
- Easy to get the mean of the new Gaussian - just run the old mean through the non-linear update
- Getting the variance is harder, we don’t have an easy exact way to do so
- Instead, we model the distribution (and get its variance) via sampling:
- We get \(2n+1\) “sigma points”, where \(n\) is the dimensionality of the state vector
- First sigma point is the mean
- \(x_i = \mu + (\sqrt{(n+\lambda)\Sigma})_i\) for \(i=1,\dots,n\)
- \(x_i = \mu - (\sqrt{(n+\lambda)\Sigma})_{i-n}\) for \(i=n+1,\dots,2n\)
- Weight the mean sigma point higher than the others, since it is our best single estimate (sketched below)
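
A sketch of the sigma-point construction, using Cholesky as the matrix square root; \(\lambda\) is a spread parameter:

```python
import numpy as np

def sigma_points(mu, Sigma, lam=1.0):
    """Return the 2n+1 sigma points for a Gaussian N(mu, Sigma)."""
    n = len(mu)
    sqrt = np.linalg.cholesky((n + lam) * Sigma)   # matrix sqrt of (n + lambda) Sigma
    points = [mu]                                  # first sigma point is the mean
    for i in range(n):
        points.append(mu + sqrt[:, i])             # x_i     = mu + column i
        points.append(mu - sqrt[:, i])             # x_{n+i} = mu - column i
    return np.array(points)
```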

**Control:** create actuator controls to follow a plan!

- Plans:
- Tracking a reference trajectory
- Time-parameterized: timed path
- Index-parameterized: no time in path

- Open-loop: execute the planned commands without modification
- Feedback control:
- Measure error between reference and current state
- Take actions to reduce error

- Choosing reference state for index-parameterized path:
- Can use the nearest point, but that can cause issues
- Pick a point at least distance \(l\) away instead (sketched below)
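
A sketch of that reference-point rule; `path_xy` as an (N, 2) array and the names are my own assumptions:

```python
import numpy as np

def pick_reference(path_xy, car_xy, lookahead):
    """First path point at least `lookahead` away, searched from the nearest point."""
    dists = np.linalg.norm(path_xy - car_xy, axis=1)
    for i in range(int(np.argmin(dists)), len(path_xy)):
        if dists[i] >= lookahead:
            return i
    return len(path_xy) - 1    # end of path: fall back to the last point
```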

- Compute error to reference state:
- Give the reference state its own coordinate frame
- Errors:
- Along-track error: \(e_\text{at}\) (error in \(x\))
- Cross-track error: \(e_\text{ct}\) (error in \(y\))
- Heading error: \(\theta_e\) (error in \(\theta\))

- See slides for the exact equations (a standard version is sketched below)

- We will only control steering angle in this course, so we ignore along-track error
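
A standard version of the error computation, rotating the world-frame offset into the reference frame (the slides' equations may differ in convention):

```python
import numpy as np

def tracking_errors(state, ref):
    """Errors of state = (x, y, theta) expressed in the reference state's frame."""
    dx, dy = state[0] - ref[0], state[1] - ref[1]
    c, s = np.cos(ref[2]), np.sin(ref[2])
    e_at = c * dx + s * dy       # along-track error (reference-frame x)
    e_ct = -s * dx + c * dy      # cross-track error (reference-frame y)
    theta_e = state[2] - ref[2]  # heading error (wrap to [-pi, pi] in practice)
    return e_at, e_ct, theta_e
```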

- Compute control law: \(u = K(x, e)\)
- Bang-bang control:
- Choose between max left and max right
- (Note: left is *positive*, right is *negative*)

- Think of the controller and car together as a dynamical system: bang-bang oscillates around the reference like a pendulum (one-line sketch below)

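The whole controller fits in one line; 0.34 rad as the steering limit is an assumption:

```python
def bang_bang_steer(e_ct, u_max=0.34):
    """Left of the path (e_ct > 0) -> steer full right (negative); else full left."""
    return -u_max if e_ct > 0 else u_max
```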
- PID (Proportional-Integral-Derivative) control:
- The “benchmark” controller (like linear regression in ML)
- Very popular in practice
- \(u = -(K_p e_{ct} + K_i \int e_{ct}\, dt + K_d \dot{e}_{ct})\)
- \(K_p e_{ct}\): (Proportional) present term
- \(K_i \int e_{ct} dt\): (Integral) past term
- \(K_d \dot{e}_{ct}\): (Derivative) future term
- The \(K\)’s are the gains for each of the terms

- Proportional gain:
- Low: long time to correct, smaller oscillations
- High: quickly correct, larger oscillations

- Integral term:
- Can cap the absolute value of the term
- Discount older error over time
- Or integrate over only a finite window

- Derivative term:
- Dampens the action by the change in error
- Reduces oscillations
- Numeric differentiation is terrible, use analytic derivatives
- \(\dot{e}_{ct} = V \sin(\theta_e)\)
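
Putting the three terms together, with the analytic derivative and a capped integral; the gains are placeholders to tune:

```python
import numpy as np

class PIDSteering:
    def __init__(self, kp=1.0, ki=0.0, kd=0.5, i_cap=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.i_cap = i_cap                 # cap on the integral term's magnitude
        self.integral = 0.0

    def control(self, e_ct, theta_e, v, dt):
        """u = -(Kp e_ct + Ki int(e_ct) + Kd d/dt(e_ct))."""
        self.integral = np.clip(self.integral + e_ct * dt, -self.i_cap, self.i_cap)
        e_ct_dot = v * np.sin(theta_e)     # analytic derivative, not numeric differencing
        return -(self.kp * e_ct + self.ki * self.integral + self.kd * e_ct_dot)
```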

- How to prove a controller is stable?
- Stability means the error and its derivative go to \(0\) as \(t \rightarrow \infty\)
- Passive error dynamics: what happens when we *don’t* apply any controls?
- With a gradient controller, we’re essentially performing gradient descent on an energy loss landscape
- This shows that any positive gain brings the error to zero
- There exists a closed-form solution for a provably-correct controller (in theory)

- Social navigation: help humans feel comfortable and safe with robots
- Collaboration/teaming: shared autonomy
- Learning from humans: how can humans teach a robot?

- Global planning: plan high level abstract actions
- Local planning: physical actions being taken for a high-level action
- Configuration space (C-space):
- Represent state as a point in high-dim space
- We can use \(S^1\) to represent a circle (e.g. a heading angle)
- The product of two circles, \(S^1 \times S^1\), is a torus

- Obstacle specification:
- Robot is a point in C-space
- Some of this space is full of obstacles
- The C-space obstacle region is the set of configurations where the robot’s footprint intersects a workspace obstacle

- Explicit graph: precompute the whole graph, including all collision checks
- Implicit graph: jointly discover graph and find path within it
- “Best”-first search:
- Use priority queue with some priority function
- Completeness: returns a solution if one exists, else reports that none exists

- Heuristics: estimate the cost to go
- Admissibility: the heuristic never overestimates the true cost to the goal
- Consistency: triangle inequality, \(h(s) \leq c(s, s') + h(s')\)

- Weighted A* multiplies the heuristic by \(1 + \epsilon\); the solution is within a factor of \(1 + \epsilon\) of optimal (sketched below)
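
A sketch of “best”-first search with an inflated heuristic; weight \(w = 1\) gives plain A*, and the API shape here is my own:

```python
import heapq, itertools

def weighted_astar(start, goal, neighbors, cost, h, w=1.0):
    """Priority f = g + w*h; with admissible h and w = 1 this is plain A*."""
    tie = itertools.count()            # tiebreaker so the heap never compares states
    frontier = [(w * h(start), next(tie), 0.0, start)]
    parent, g, closed = {start: None}, {start: 0.0}, set()
    while frontier:
        _, _, g_s, s = heapq.heappop(frontier)
        if s in closed:
            continue
        closed.add(s)
        if s == goal:                  # walk parents back to reconstruct the path
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for s2 in neighbors(s):
            g2 = g_s + cost(s, s2)
            if g2 < g.get(s2, float("inf")):
                g[s2], parent[s2] = g2, s
                heapq.heappush(frontier, (g2 + w * h(s2), next(tie), g2, s2))
    return None                        # completeness: report failure if no path exists
```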

- Create discrete graph approximations of the continuous configuration space
- Graph creation:
- Sample collision-free configurations as vertices
- Add edges between nearby vertices
- Use a collision checker to ensure no edge passes through an obstacle

- Collision checking:
- Use a
`steer`

function to find a feasible path - Check points along the path for collisions via Van Der Corput
sequence
- Good for catching objects early!

- Use a
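
A sketch of the checking order; straight-line interpolation stands in for `steer` here, while a real `steer` would respect the car's kinematics:

```python
def van_der_corput(n, base=2):
    """n-th element of the Van der Corput sequence: 1/2, 1/4, 3/4, 1/8, ..."""
    q, denom = 0.0, 1.0
    while n:
        n, rem = divmod(n, base)
        denom *= base
        q += rem / denom
    return q

def edge_in_collision(q0, q1, in_collision, resolution=16):
    """Check interpolated points in Van der Corput order to find collisions early."""
    for i in range(1, resolution + 1):
        t = van_der_corput(i)
        q = [a + t * (b - a) for a, b in zip(q0, q1)]
        if in_collision(q):
            return True
    return False
```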
- The (two-point) boundary value problem: finding controls that drive the system exactly from one configuration to another; this is what `steer` solves
- Sampling strategies:
- Lattice sampling
- Uniform random sampling
- Low-Dispersion sampling (Halton sequence)

- Can interleave graph construction and search

- DARPA RACER presentation
- Goal: move as quickly as possible with the fewest safety interventions
- The car is a very large version of MuSHR
- The autonomy stack is nearly the same as MuSHR’s
- Perception:
- Local costmap: map of obstacle costs

- Need to convert the local perspective to a bird’s-eye view for the planner
- Pinhole -> voxel -> meanpool
- Use a backbone to complete the full map
- Use image inpainting for prediction
- Note: this is different from segmentation, since it infers content outside the observed data

- Costmap is a weighted average of features

- Planning:
- A* search (working toward D*, an incremental variant of A* that efficiently repairs the plan when edge costs change)

- MPC:
- Sample a set of possible trajectories, score them against the costmap, and execute the best one

- Sensing uncertainty:
- Aleatoric: uncertainty due to environment (e.g. occlusion)
- Epistemic: lack of support from training data (ML uncertainty)
- Where these pop up:
- Fluctuating obstacles
- Consistent inaccurate predictions
- Myopic perception

- Determinization: sample or threshold stochastic world into a deterministic one

- Super fucking cool

- RL Lecture: typical MDP stuff
- Inverse RL: learn reward function from (optimal policy) examples
- (Rough) algo:
- Start with some reward function \(r\)
- Learn policy for \(r\)
- Compare policy to demonstrations
- Update \(r\)
- Repeat

- Linear reward function: reward is a linear function of the features
- Linear inverse RL: update \(r\) via the difference in feature expectations (sketched below)
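
A sketch of that update; feature expectations are averaged over visited states, with no discounting, and the step size is an arbitrary placeholder:

```python
import numpy as np

def feature_expectation(trajs, features):
    """Average feature vector over all states visited in a set of trajectories."""
    return np.mean([features(s) for traj in trajs for s in traj], axis=0)

def linear_irl_update(w, expert_trajs, policy_trajs, features, alpha=0.1):
    """Reward r(s) = w . features(s); move w toward the expert's feature counts."""
    mu_expert = feature_expectation(expert_trajs, features)
    mu_policy = feature_expectation(policy_trajs, features)
    return w + alpha * (mu_expert - mu_policy)
```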