diff --git a/S57_diffdrive_summary.ipynb b/S57_diffdrive_summary.ipynb index 6e440519..15134516 100644 --- a/S57_diffdrive_summary.ipynb +++ b/S57_diffdrive_summary.ipynb @@ -272,7 +272,7 @@ "The kinematics of differential drive robots are described in detail in [Introduction to Autonomous Mobile Robots](https://mitpress.mit.edu/9780262015356/introduction-to-autonomous-mobile-robots/) by Siegwart, Nourbakhsh, Scaramuzza {cite:p}`Siegwart11book_robots`\n", "\n", "The first mathematically rigorous book on robot motion planning was written by Latombe\n", - "in the early nineties {cite:p}`JCL:91`.\n", + "in the early nineties {cite:p}`Latombe91book`.\n", "Brian Eno once remarked\n", "that only about 1,000 people bought the first Velvet Underground album, but every one of them formed a rock 'n' roll band. Latombe's book held this status in robotics; if you owned it, likely as not, you went on to become a researcher in robot motion planning.\n", "In subsequent years, [Principles of Robot Motion](https://mitpress.mit.edu/9780262033275/principles-of-robot-motion/) by Choset, et al. {cite:p}`Choset05book_motion`\n", @@ -283,7 +283,7 @@ "Excellent introductions to the material on machine learning can be found in\n", "[Deep Learning](https://www.deeplearningbook.org/) by Goodfellow, Bengio, and Courville {cite:p}`Goodfellow16book_dl`\n", "and\n", - "[Dive into Deep Learning](https://d2l.ai/) by Zhang et al. {cite:p}`Zhang20book_d2l`" + "[Dive into Deep Learning](https://d2l.ai/) by Zhang et al. {cite:p}`Zhang20book_d2l`." ] } ], diff --git a/S65_driving_planning.ipynb b/S65_driving_planning.ipynb index baa43c4f..874ea41c 100644 --- a/S65_driving_planning.ipynb +++ b/S65_driving_planning.ipynb @@ -197,7 +197,7 @@ "
Initial and final configuration for a lane change maneuver.
\n",
    "\n",
    "\n",
-    "Here, we have taken the $s$-axis to be longitudinal direction (parallel to the highway), and the $d$-axis is along the lateral direction (perpendicular to the highway). This choice of coordinates will be convenient below, when we generalize trajectrories to arbitrary curves. In addition, for now let us assume that the car speed satisfies $s=t$.\n",
+    "Here, we have taken the $s$-axis to be the longitudinal direction (parallel to the highway) and the $d$-axis to be the lateral direction (perpendicular to the highway). This choice of coordinates will be convenient below, when we generalize trajectories to arbitrary curves. In addition, for now let us assume that the car travels at unit speed, so that $s=t$.\n",
    "Below, we will generalize further by defining $s$ to be the distance along the path\n",
    "(instead of a linear distance along a straight lane), and\n",
    "$s(t)$ to be the time parameterization of the path.\n",
@@ -244,15 +244,15 @@
    "So, if we wish to match initial and final conditions on heading (which is defined by the first derivative of $d$),\n",
    "we would require a cubic polynomial, and if we wished to also satisfy lateral acceleration constraints, we would need a fifth order polynomial.\n",
    "The lateral velocity and acceleration for the trajectory are given by the first and second derivatives of $d$,\n",
-    "which we denote by $d’$ and $d’’$, respectively.\n",
+    "which we denote by $d'$ and $d''$, respectively.\n",
    "For a fifth order polynomial, we have \n",
    "\n",
    "\n",
    "$$ \n",
    "\\begin{aligned}\n",
    "d(s) &=& \\alpha_0 + \\alpha_1 s + \\alpha_2 s^2 + \\alpha_3 s^3 + \\alpha_4 s^4 + \\alpha_5 s^5\\\\\n",
-    "d’(s) &=& \\alpha_1 + 2 \\alpha_2 s + 3 \\alpha_3 s^2 + 4 \\alpha_4 s^3 + 5 \\alpha_5 s^4\\\\\n",
-    "d’’(s) &=& 2 \\alpha_2 +6 \\alpha_3 s + 12 \\alpha_4 s^2 + 20 \\alpha_5 s^3 \n",
+    "d'(s) &=& \\alpha_1 + 2 \\alpha_2 s + 3 \\alpha_3 s^2 + 4 \\alpha_4 s^3 + 5 \\alpha_5 s^4\\\\\n",
+    "d''(s) &=& 2 \\alpha_2 +6 \\alpha_3 s + 12 \\alpha_4 s^2 + 20 \\alpha_5 s^3 \n",
    "\\end{aligned}\n",
    "$$"
   ]
  },
@@ -266,13 +266,13 @@
    "$$ \n",
    "\\begin{aligned}\n",
    "y(0) = 0& =& \\alpha_0 \\\\\n",
-    "y’(0) = 0 & =& \\alpha_1 \\\\\n",
-    "y’’(0) = 0 &=& 2 \\alpha_2 \\\\\n",
+    "y'(0) = 0 & =& \\alpha_1 \\\\\n",
+    "y''(0) = 0 &=& 2 \\alpha_2 \\\\\n",
    "y(x_\\mathrm{g}) = y_\\mathrm{g} &=&\n",
    " \\alpha_0 + \\alpha_1 x_\\mathrm{g} + \\alpha_2 x_\\mathrm{g}^2 + \\alpha_3 x_\\mathrm{g}^3 + \\alpha_4 x_\\mathrm{g}^4 + \\alpha_5 x_\\mathrm{g}^5 \\\\\n",
-    "y’(x_\\mathrm{g}) = 0 &=&\n",
+    "y'(x_\\mathrm{g}) = 0 &=&\n",
    " \\alpha_1 + 2 \\alpha_2 x_\\mathrm{g} + 3 \\alpha_3 x_\\mathrm{g}^2 + 4 \\alpha_4 x_\\mathrm{g}^3 + 5 \\alpha_5 x_\\mathrm{g}^4 \\\\\n",
-    "y’’(x_\\mathrm{g}) = 0 &=&\n",
+    "y''(x_\\mathrm{g}) = 0 &=&\n",
    " 2 \\alpha_2 +6 \\alpha_3 x_\\mathrm{g} + 12 \\alpha_4 x_\\mathrm{g}^2 + 20 \\alpha_5 x_\\mathrm{g}^3 \n",
    "\\end{aligned}\n",
    "$$\n",
diff --git a/S66_driving_DRL.ipynb b/S66_driving_DRL.ipynb
index 82bb571f..f1b0d8f8 100644
--- a/S66_driving_DRL.ipynb
+++ b/S66_driving_DRL.ipynb
@@ -68,7 +68,7 @@
    "\n",
    "\"Splash\n",
    "\n",
-    "Deep reinforcement learning (DRL) brings the power of deep learning to much more complex domains than what we were able to tackle with the Markov Decision Processes and RL concepts introduced in Chapter 3. The use of large, expressive neural networks has allowed researchers and practitioners alike to work with high bandwidth sensors such as video streams and LIDAR, and bring the promise of RL into real-world domains such as autonomous driving. This is still a field of active discovery and research, however, and we can give but a brief introduction here about what is a vast literature and problem space."
+    "Deep reinforcement learning (DRL) applies the power of deep learning to bring reinforcement learning to much more complex domains than what we were able to tackle with the Markov Decision Processes and RL concepts introduced in Chapter 3. The use of large, expressive neural networks has allowed researchers and practitioners alike to work with high bandwidth sensors such as video streams and LIDAR, and bring the promise of RL into real-world domains such as autonomous driving. 
This is still a field of active discovery and research, however, and we can give but a brief introduction here about what is a vast literature and problem space."
   ]
  },
  {
@@ -82,14 +82,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "A simple example in the autonomous driving domain is *lane switching*. Suppose we are driving along at 3-lane highway, and we can see some ways ahead, and some ways behind us. We are driving at a speed that is comfortable to us, but other cars have different ideas about the optimal speed to drive at. Hence, sometimes we would like to change lanes, and we could learn a policy to do this for us. This is called **lateral control**. A more sophisticated example would also allow us to adapt our speed to the traffic pattern, but by relying on a smart cruise control system we could safely ignore this **longitudinal control** problem."
+    "A simple example in the autonomous driving domain is *lane switching*. Suppose we are driving along a 3-lane highway, and we can see some ways ahead, and some ways behind us. We are driving at a speed that is comfortable to us, but other cars have different ideas about the optimal speed to drive at. Hence, sometimes we would like to change lanes, and we could learn a policy to do this for us. As discussed in Section 6.5, this is **lateral control**. A more sophisticated example would also allow us to adapt our speed to the traffic pattern, but by relying on a smart cruise control system we could safely ignore the **longitudinal control** problem."
   ]
  },
  {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To turn this into a reinforcement learning problem, we first need to define a state space ${\cal X}$ and an action space ${\cal A}$. There are a variety of ways to engineer this aspect of the problem. For example, we could somehow encode the longitudinal distance and lane index for each of the K closest cars, where K is a parameter, say 5 or 10. One problem is that the number of cars that are *actually* present is variable, which is difficult to deal with. Another approach is to make this into an image processing problem, by creating a finite element representation of the road before and behind us, and marking a cell as occupied or not. The latter is fairly compatible with automotive sensors such as LIDAR."
+    "To turn this into a reinforcement learning problem, we first need to define a state space ${\cal X}$ and an action space ${\cal A}$. There are a variety of ways to engineer this aspect of the problem. For example, we could somehow encode the longitudinal distance and lane index for each of the K closest cars, where K is a parameter, say 5 or 10. One problem is that the number of cars that are *actually* present is variable, which is difficult to deal with. Another approach is to make this into an image processing problem, by creating a finite element representation of the road before and behind us, and marking each cell as occupied or not. The latter is fairly compatible with automotive sensors such as LIDAR.\n",
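+    "\n",
+    "For concreteness, here is a minimal NumPy sketch of the first, fixed-size encoding. Everything in it is an illustrative assumption rather than code from this book: the `(s, lane)` tuple format for the other cars, the sentinel padding, and the parameter values.\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "\n",
+    "def encode_state(ego_s, ego_lane, cars, K=5, max_range=100.0):\n",
+    "    # cars: list of (s, lane) tuples for the other vehicles we can see.\n",
+    "    # Keep the K closest cars by longitudinal distance and pad with a\n",
+    "    # sentinel value, so the vector has a fixed size even when fewer\n",
+    "    # than K cars are actually present.\n",
+    "    closest = sorted(((s - ego_s, lane) for (s, lane) in cars),\n",
+    "                     key=lambda car: abs(car[0]))[:K]\n",
+    "    state = np.full(2 * K + 1, max_range)   # sentinel: very far away\n",
+    "    state[0] = ego_lane\n",
+    "    for i, (ds, lane) in enumerate(closest):\n",
+    "        state[1 + 2 * i] = ds\n",
+    "        state[2 + 2 * i] = lane\n",
+    "    return state\n",
+    "\n",
+    "print(encode_state(50.0, 1, [(70.0, 2), (30.0, 0), (55.0, 1)]))\n",
+    "```\n",
+    "\n",
+    "Padding with a sentinel is one simple way to cope with the variable number of visible cars, at the cost of a somewhat arbitrary choice of `max_range`."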
   ]
  },
  {
@@ -98,7 +98,7 @@
    "source": [
    "In terms of actions, the easiest approach is to have a number of *discrete* choices to go `left`, `right`, or `stay` in the current lane. We could be more sophisticated about it and have both \"aggressive\" and \"slow\" versions of these in addition to a default version, akin to the motion primitives we previously discussed.\n",
    "\n",
-    "Actually implementing this on an autonomous vehicle, or even sketching an implementation in a notebook with recorded or simulated data, is beyond what we can accomplish in a notebook. Hence, we will be content below to sketch three popular foundational methods from deep reinforcement learning, without actually implementing them here. At the end of this chapter we provide some references where you can delve into these topics more deeply."
+    "Actually implementing this on an autonomous vehicle, or even sketching an implementation with recorded or simulated data, is beyond what we can accomplish in a Jupyter notebook. Hence, we will be content below to sketch three popular foundational methods from deep reinforcement learning, without actually implementing them here. At the end of this chapter we provide some references where you can delve into these topics more deeply."
   ]
  },
  {
@@ -115,13 +115,13 @@
    "\\pi^*(x) = \\arg \\max_a Q^*(x,a)\n",
    "$$\n",
    "\n",
-    "where $Q^*(x,a)$ denote the Q-values for the *optimal* policy. In Q-learning, we start with some random Q-values and then gradually estimate the optimal Q-values by alpha-blending between old and new estimates:\n",
+    "where $Q^*(x,a)$ denotes the Q-values for the *optimal* policy. In Q-learning, we start with some random Q-values and then iteratively improve the estimate for the optimal Q-values by alpha-blending between old and new estimates:\n",
    "\n",
    "$$\n",
    "\\hat{Q}(x,a) \\leftarrow (1-\\alpha) \\hat{Q}(x,a) + \\alpha~\\text{target}(x,a,x')\n",
    "$$\n",
    "\n",
-    "where $\\text{target}(x,a,x') \\doteq R(x,a,x') + \\gamma \\max_{a'} \\hat{Q}(x',a')$ is the \"target\" value that we think is an improvment on the previous value $\\hat{Q}(x,a)$."
+    "where $\\text{target}(x,a,x') \\doteq R(x,a,x') + \\gamma \\max_{a'} \\hat{Q}(x',a')$ is the \"target\" value that we think is an improvement on the previous value $\\hat{Q}(x,a)$. Indeed, the target $\\text{target}(x,a,x')$ uses the current estimate of the Q-values for future states, but improves on this by using the *known* reward $R(x,a,x')$ for the current action in the current state."
   ]
  },
  {
@@ -134,15 +134,21 @@
    "Q^*(x,a) \\approx Q(x,a; \\theta)\n",
    "$$\n",
    "\n",
-    "DQN as a method uses two additional ideas that are crucial in making the training converge to something sensible in difficult problems. The first is splitting the training into *execution* and *experience replay* phases:\n",
+    "It might be worthwhile to revisit Section 5.6, where we introduced neural networks and how to train them using stochastic gradient descent (SGD). In the context of RL, the DQN method uses two additional ideas that are crucial in making the training converge to something sensible in difficult problems. The first is splitting the training into *execution* and *experience replay* phases:\n",
    "\n",
-    "- during the **execution phase**, it executes the policy (possibly with some degree of randomness) and stores the experiences $(x,a,r,x')$, with $r$ the reward, in a dataset $D$;\n",
-    "- during **experience replay**, it *randomly samples* from these experiences to create mini-batches of data, which are in turn used to perform stochastic gradient descent (SGD) on the parameters $\\theta$.\n",
+    "- during the **execution phase**, the policy is executed (possibly with some degree of randomness) and the experiences $(x,a,r,x')$, with $r$ the reward, are stored in a dataset $D$;\n",
+    "- during **experience replay**, we *randomly sample* from these experiences to create mini-batches of data, which are in turn used to perform SGD on the parameters $\\theta$.\n",
    "\n",
-    "The second idea is to calculate the target values $\\text{target}(x,a,x') \\doteq R(x,a,x') + \\gamma \\max_{a'} \\hat{Q}(x',a'; \\theta^{old})$ with the parameters $\\theta^{old}$ from the previous epoch, to provide a more stable approximation target. The mini-batch loss we minimize using SGD is then\n",
+    "The second idea is to calculate the target values \n",
    "\n",
    "$$\n",
-    "\\sum_{(x,a,r,x')} [R(x,a,x') + \\gamma \\max_{a'} \\hat{Q}(x',a'; \\theta^{old}) - Q(x,a; \\theta)]^2\n",
+    "\\text{target}(x,a,x') \\doteq R(x,a,x') + \\gamma \\max_{a'} \\hat{Q}(x',a'; \\theta^{old})$$\n",
+    "\n",
+    "with the parameters $\\theta^{old}$ from the previous epoch, to provide a more stable approximation target.\n",
+    "The mini-batch loss we minimize using SGD is then\n",
+    "\n",
+    "$$\n",
+    "\\mathcal{L}_{\\text{DQN}}(\\theta; D) \\doteq \\sum_{(x,a,r,x')\\in D} [R(x,a,x') + \\gamma \\max_{a'} \\hat{Q}(x',a'; \\theta^{old}) - Q(x,a; \\theta)]^2\n",
    "$$\n",
    "\n",
    "With this basic scheme, a team from DeepMind was able to achieve human or super-human performance on about 50 Atari 2600 games in 2015 {cite:p}`Mnih15nature_dqn`.\n",
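+    "\n",
+    "To make the execution and replay phases concrete, here is a minimal NumPy sketch of sampling a mini-batch from the replay dataset $D$ and evaluating $\\mathcal{L}_{\\text{DQN}}$. It is an illustrative assumption, not an actual DQN implementation: $Q(x,a; \\theta)$ is stood in for by a linear function, the stored experiences are random, the stored reward $r$ plays the role of $R(x,a,x')$, and a real implementation would minimize the loss with an autodiff framework rather than just evaluating it.\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "\n",
+    "rng = np.random.default_rng(seed=0)\n",
+    "\n",
+    "def q_values(x, theta):\n",
+    "    # Stand-in for Q(x, .; theta): linear in the state, one column per action.\n",
+    "    return x @ theta\n",
+    "\n",
+    "def dqn_minibatch_loss(theta, theta_old, batch, gamma=0.99):\n",
+    "    # Sum over the mini-batch of squared differences between the target\n",
+    "    # r + gamma * max_a' Q(x', a'; theta_old) and the prediction Q(x, a; theta).\n",
+    "    loss = 0.0\n",
+    "    for x, a, r, xp in batch:\n",
+    "        target = r + gamma * q_values(xp, theta_old).max()\n",
+    "        loss += (target - q_values(x, theta)[a]) ** 2\n",
+    "    return loss\n",
+    "\n",
+    "state_dim, num_actions = 4, 3\n",
+    "theta = rng.standard_normal((state_dim, num_actions))\n",
+    "theta_old = theta.copy()   # frozen parameters from the previous epoch\n",
+    "\n",
+    "# Execution phase: store experiences (x, a, r, x') in a dataset D (random here).\n",
+    "replay = [(rng.standard_normal(state_dim), rng.integers(num_actions),\n",
+    "           rng.standard_normal(), rng.standard_normal(state_dim))\n",
+    "          for _ in range(100)]\n",
+    "\n",
+    "# Experience replay: randomly sample a mini-batch and evaluate the loss.\n",
+    "batch = [replay[i] for i in rng.choice(len(replay), size=16, replace=False)]\n",
+    "print(dqn_minibatch_loss(theta, theta_old, batch))\n",
+    "```\n",
+    "\n",
+    "Freezing `theta_old` between epochs is what keeps the regression target from chasing its own updates.\n",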
@@ -161,12 +167,12 @@
    "Whereas the above gets at an optimal policy indirectly, via deep Q-learning, a different and very popular idea is to directly parameterize the policy using a neural network, with weights $\\theta$. It is common to make this a **stochastic policy**,\n",
    "\n",
    "$$\n",
-    "\\pi(a|x, \\theta)\n",
+    "\\pi(a|x; \\theta)\n",
    "$$\n",
    "\n",
-    "where $a \\in {\\cal A}$ is an action, $x \\in {\\cal X}$ is a state, and the policy outputs a *probability* for each action $a$ based on the state $x$. One of the reasons to prefer stochastic policies is that they are differentiable, which allows us to optimize for them via fradient descent, as we explore in the next section.\n",
+    "where $a \\in {\\cal A}$ is an action, $x \\in {\\cal X}$ is a state, and the policy outputs a *probability* for each action $a$ based on the state $x$. One of the reasons to prefer stochastic policies is that they are differentiable: the output probabilities vary smoothly with the parameters $\\theta$, whereas a deterministic choice among discrete actions does not. This allows us to optimize them via gradient descent, as we explore in the next section.\n",
    "\n",
-    "In Section 5 we used *supervised* learning to train neural networks, and we just this above for learning Q-values in DQN. It is useful to consider how this might work for training a *policy*. Recall from Section 5.6 that we defined the empirical cross-entropy loss as\n",
+    "In Chapter 5 we used *supervised* learning to train neural networks, and we just applied this for learning Q-values in DQN. 
It is useful to consider how this might work for training a *policy*. Recall from Section 5.6 that we defined the empirical cross-entropy loss as\n",
    "\n",
    "$$\\mathcal{L}_{\\text{CE}}(\\theta; D) \\doteq \\sum_c \\sum_{(x,y=c)\\in D}\\log\\frac{1}{p_c(x;\\theta)}$$\n",
    "\n",
@@ -176,7 +182,7 @@
    "\n",
    "This formulation is equivalent, but now we are summing over all classes for each data point, with $y_c$ acting as a *weight*, either one or zero. When someone is so kind as to give us the optimal action $y_a$ (as a one-hot encoding) for every state $x$ in some dataset $D$, we can apply this same loss function to a stochastic policy, obtaining\n",
    "\n",
-    "$$\\mathcal{L}_{\\text{CE}}(\\theta; D) = -\\sum_{(x,y)\\in D} \\sum_a y_a \\log \\pi(a| x, \\theta)$$"
+    "$$\\mathcal{L}_{\\text{CE}}(\\theta; D) = -\\sum_{(x,y)\\in D} \\sum_a y_a \\log \\pi(a| x; \\theta)$$"
   ]
  },
  {
@@ -192,7 +198,7 @@
    "\\hat{Q}(x_{it},a_{it}) \\doteq \\sum_{k=t}^H \\gamma^{k-t}R(x_{ik},a_{ik},x_{ik}').\n",
    "$$\n",
    "\n",
-    "We can then use this as an alternative to the \"one or zero\" weight above, obtaining\n",
+    "Note that in each rollout we can only sum until $k=H$, so Q-values earlier in the rollout will be estimated more accurately. Regardless, we can then use these estimated Q-values as an alternative to the \"one or zero\" weight above, obtaining\n",
    "\n",
    "$$\n",
    "\\mathcal{L}(\\theta) = - \\sum_i \\sum_{t=1}^H \\hat{Q}(x_{it},a_{it}) \\log \\pi(a_{it}|x_{it}, \\theta)\n",
    "$$"
   ]
  },
  {
@@ -246,8 +252,8 @@
    "\n",
    "where $\\alpha$ is a learning rate.\n",
    "\n",
-    "The algorithm above, using the estimated Q-values, is almost identical to the REINFORCE method {cite:p}`Williams92ml_reinforce`. That algorithm further improves on performance by not using the raw Q-values but rather the difference between the Q-values and some baseline policy. This has the effect of reducing the variance that comes from estimating the Q-values from a finite amount of data each time.\n",
-    "The REINFORCE algorithm was introduced in 1992 and hence pre-dates the deep-learning revolution by about 20 years. It should also be said that in DRL, the neural networks that are used are typically not very deep. Several modern methods, such as \"proximal policy optimization\" (PPO) apply a number of techniques to improve this basic method even further and make it more sample-efficient. PPO is now one of the most often-used DRL methods."
+    "The algorithm above, using the estimated Q-values, is almost identical to the REINFORCE method {cite:p}`Williams92ml_reinforce`. That algorithm further improves on performance by not using the raw Q-values but rather the difference between the Q-values and a state-dependent baseline value. This has the effect of reducing the variance in the estimated Q-values due to using only a finite amount of data.\n",
+    "The REINFORCE algorithm was introduced in 1992 and hence pre-dates the deep-learning revolution by about 20 years. It should also be said that in DRL, the neural networks that are used are typically not very deep. Several modern methods, such as \"proximal policy optimization\" (PPO) {cite:p}`Schulman17_PPO` apply a number of techniques to improve this basic method even further and make it more sample-efficient. PPO is now one of the most often-used DRL methods.\n",
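+    "\n",
+    "To make this policy-gradient recipe concrete, here is a minimal NumPy sketch of the loss $\\mathcal{L}(\\theta)$ for the three-action lane-switching policy. Everything in it is an illustrative assumption rather than code from this book: the linear-softmax policy, the 5-dimensional state features, and the randomly generated rollout; a real implementation would obtain $\\nabla_\\theta \\mathcal{L}$ with an autodiff framework instead of hand-coding the gradient.\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "\n",
+    "rng = np.random.default_rng(seed=42)\n",
+    "\n",
+    "def softmax(z):\n",
+    "    z = z - z.max(axis=-1, keepdims=True)   # subtract max for stability\n",
+    "    e = np.exp(z)\n",
+    "    return e / e.sum(axis=-1, keepdims=True)\n",
+    "\n",
+    "def q_estimates(rewards, gamma=0.99):\n",
+    "    # hat-Q(x_t, a_t) = sum_{k=t}^H gamma^(k-t) R_k, via a backward pass.\n",
+    "    q = np.zeros(len(rewards))\n",
+    "    acc = 0.0\n",
+    "    for t in reversed(range(len(rewards))):\n",
+    "        acc = rewards[t] + gamma * acc\n",
+    "        q[t] = acc\n",
+    "    return q\n",
+    "\n",
+    "def policy_gradient_loss(theta, states, actions, q_hat):\n",
+    "    # L(theta) = - sum_t hat-Q_t * log pi(a_t | x_t; theta),\n",
+    "    # with pi the softmax of the linear scores states @ theta.\n",
+    "    probs = softmax(states @ theta)\n",
+    "    log_pi = np.log(probs[np.arange(len(actions)), actions])\n",
+    "    return -(q_hat * log_pi).sum()\n",
+    "\n",
+    "H, state_dim, num_actions = 20, 5, 3    # one rollout of length H\n",
+    "theta = 0.01 * rng.standard_normal((state_dim, num_actions))\n",
+    "states = rng.standard_normal((H, state_dim))\n",
+    "actions = rng.integers(num_actions, size=H)   # 0=left, 1=stay, 2=right\n",
+    "rewards = rng.standard_normal(H)\n",
+    "\n",
+    "print(policy_gradient_loss(theta, states, actions, q_estimates(rewards)))\n",
+    "```\n",
+    "\n",
+    "Subtracting a state-dependent baseline from `q_hat` before forming this loss is exactly the variance-reduction idea behind REINFORCE mentioned above."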
] }, { diff --git a/references.bib b/references.bib index 6be2bcc0..8dc2351f 100644 --- a/references.bib +++ b/references.bib @@ -1,477 +1,465 @@ -@String { RAM = {Robotics and Automation Magazine (RAM)} } -@String { TRO = {{IEEE} Transactions on Robotics} } + +@string{ram = {Robotics and Automation Magazine (RAM)}} + +@string{tro = {{IEEE} Transactions on Robotics}} + + +@book{Latombe91book, + address = {USA}, + author = {Latombe, Jean-Claude}, + date-added = {2024-07-17 14:37:24 +0200}, + date-modified = {2024-07-17 14:37:44 +0200}, + isbn = {0792391292}, + publisher = {Kluwer Academic Publishers}, + title = {Robot Motion Planning}, + year = {1991}} @article{Adamkiewicz22ral_nerf_nav, - author={Adamkiewicz, Michal and Chen, Timothy and Caccavale, Adam and Gardner, Rachel and Culbertson, Preston and Bohg, Jeannette and Schwager, Mac}, - journal={IEEE Robotics and Automation Letters}, - title={Vision-Only Robot Navigation in a Neural Radiance World}, - year={2022}, - volume={7}, - number={2}, - pages={4606-4613}, - doi={10.1109/LRA.2022.3150497} -} + author = {Adamkiewicz, Michal and Chen, Timothy and Caccavale, Adam and Gardner, Rachel and Culbertson, Preston and Bohg, Jeannette and Schwager, Mac}, + doi = {10.1109/LRA.2022.3150497}, + journal = {IEEE Robotics and Automation Letters}, + number = {2}, + pages = {4606-4613}, + title = {Vision-Only Robot Navigation in a Neural Radiance World}, + volume = {7}, + year = {2022}, + bdsk-url-1 = {https://doi.org/10.1109/LRA.2022.3150497}} @article{Baum66aoms_hmms, - author = {Leonard E. Baum and Ted Petrie}, - title = {{Statistical Inference for Probabilistic Functions of Finite State Markov Chains}}, - volume = {37}, - journal = {The Annals of Mathematical Statistics}, - number = {6}, - publisher = {Institute of Mathematical Statistics}, - pages = {1554 -- 1563}, - year = {1966}, - doi = {10.1214/aoms/1177699147}, -} - -@Book{Beard11book, - Title = {{Small Unmanned Aircraft: Theory and Practice}}, - Author = {Beard, Randal W. and McLain, Timothy W.}, - Publisher = {Princeton University Press}, - Year = {2012}, - Month = feb -} - -@ARTICLE{Bellman60, - author={Bellman, Richard and Kalaba, Robert}, - journal={IRE Transactions on Automatic Control}, - title={Dynamic programming and adaptive processes {I}: Mathematical foundation}, - year={1960}, - volume={AC-5}, - number={1}, - pages={5-10}, - doi={10.1109/TAC.1960.6429288} -} + author = {Baum, Leonard E. and Petrie, Ted}, + doi = {10.1214/aoms/1177699147}, + journal = {The Annals of Mathematical Statistics}, + number = {6}, + pages = {1554 -- 1563}, + publisher = {Institute of Mathematical Statistics}, + title = {{Statistical Inference for Probabilistic Functions of Finite State Markov Chains}}, + volume = {37}, + year = {1966}, + bdsk-url-1 = {https://doi.org/10.1214/aoms/1177699147}} + +@book{Beard11book, + author = {Beard, Randal W. and McLain, Timothy W.}, + month = feb, + publisher = {Princeton University Press}, + title = {{Small Unmanned Aircraft: Theory and Practice}}, + year = {2012}} + +@article{Bellman60, + author = {Bellman, Richard and Kalaba, Robert}, + doi = {10.1109/TAC.1960.6429288}, + journal = {IRE Transactions on Automatic Control}, + number = {1}, + pages = {5-10}, + title = {Dynamic programming and adaptive processes {I}: Mathematical foundation}, + volume = {AC-5}, + year = {1960}, + bdsk-url-1 = {https://doi.org/10.1109/TAC.1960.6429288}} @book{Chan23book_prob4ds, - author = {Stanley H. 
Chan}, - title = {Introduction to Probability for Data Science}, - year = {2023}, - publisher = {Michigan Publishing Services}, - isbn = {978-1-60785-747-1}, - url = {https://probability4datascience.com/index.html} -} + author = {Chan, Stanley H.}, + isbn = {978-1-60785-747-1}, + publisher = {Michigan Publishing Services}, + title = {Introduction to Probability for Data Science}, + url = {https://probability4datascience.com/index.html}, + year = {2023}, + bdsk-url-1 = {https://probability4datascience.com/index.html}} @book{Choset05book_motion, - title = {Principles of Robot Motion}, - author = {Howie Choset and Kevin M. Lynch and Seth Hutchinson and George Kantor and Wolfram Burgard and Lydia E. Kavraki and Sebastian Thrun}, - publisher = {MIT Press}, - year = {2005}, - isbn = {9780262033275}, - url = {https://mitpress.mit.edu/9780262033275/principles-of-robot-motion/} -} - -@INPROCEEDINGS{Dellaert99icra_mcl, - author={Dellaert, Frank and Fox, Dieter and Burgard, Wolfram and Thrun, Sebastian}, - booktitle={Proceedings 1999 IEEE International Conference on Robotics and Automation}, - title={Monte Carlo localization for mobile robots}, - year={1999}, - volume={2}, - number={}, - pages={1322-1328 vol.2}, - doi={10.1109/ROBOT.1999.772544} -} + author = {Choset, Howie and Lynch, Kevin M. and Hutchinson, Seth and Kantor, George and Burgard, Wolfram and Kavraki, Lydia E. and Thrun, Sebastian}, + isbn = {9780262033275}, + publisher = {MIT Press}, + title = {Principles of Robot Motion}, + url = {https://mitpress.mit.edu/9780262033275/principles-of-robot-motion/}, + year = {2005}, + bdsk-url-1 = {https://mitpress.mit.edu/9780262033275/principles-of-robot-motion/}} + +@inproceedings{Dellaert99icra_mcl, + author = {Dellaert, Frank and Fox, Dieter and Burgard, Wolfram and Thrun, Sebastian}, + booktitle = {Proceedings 1999 IEEE International Conference on Robotics and Automation}, + doi = {10.1109/ROBOT.1999.772544}, + pages = {1322-1328 vol.2}, + title = {Monte Carlo localization for mobile robots}, + volume = {2}, + year = {1999}, + bdsk-url-1 = {https://doi.org/10.1109/ROBOT.1999.772544}} @article{Dellaert17fnt_fg, - url = {https://www.nowpublishers.com/article/Details/ROB-043}, - year = {2017}, - volume = {6}, - journal = {Foundations and Trends® in Robotics}, - title = {Factor Graphs for Robot Perception}, - doi = {10.1561/2300000043}, - issn = {1935-8253}, - number = {1-2}, - pages = {1-139}, - author = {Frank Dellaert and Michael Kaess} -} + author = {Dellaert, Frank and Kaess, Michael}, + doi = {10.1561/2300000043}, + issn = {1935-8253}, + journal = {Foundations and Trends{\textregistered} in Robotics}, + number = {1-2}, + pages = {1-139}, + title = {Factor Graphs for Robot Perception}, + url = {https://www.nowpublishers.com/article/Details/ROB-043}, + volume = {6}, + year = {2017}, + bdsk-url-1 = {https://www.nowpublishers.com/article/Details/ROB-043}, + bdsk-url-2 = {https://doi.org/10.1561/2300000043}} @article{Dellaert21ar_fg, - title={Factor graphs: Exploiting structure in robotics}, - author={Dellaert, Frank}, - journal={Annual Review of Control, Robotics, and Autonomous Systems}, - volume={4}, - pages={141--166}, - year={2021}, - publisher={Annual Reviews}, - utl={http://annualreviews.org/eprint/85PQDQYUGNEU6JPHW698/full/10.1146/annurev-control-061520-010504}, - doi={10.1146/annurev-control-061520-010504} -} - -@Book{Farrell08book, - Title = {Aided Navigation: {GPS} with High Rate Sensors}, - Author = {Jay A. 
Farrell},
-  Publisher = {McGraw-Hill},
-  Year = {2008},
-}
+  author = {Dellaert, Frank},
+  doi = {10.1146/annurev-control-061520-010504},
+  journal = {Annual Review of Control, Robotics, and Autonomous Systems},
+  pages = {141--166},
+  publisher = {Annual Reviews},
+  title = {Factor graphs: Exploiting structure in robotics},
+  url = {http://annualreviews.org/eprint/85PQDQYUGNEU6JPHW698/full/10.1146/annurev-control-061520-010504},
+  volume = {4},
+  year = {2021},
+  bdsk-url-1 = {https://doi.org/10.1146/annurev-control-061520-010504}}
+
+@book{Farrell08book,
+  author = {Farrell, Jay A.},
+  publisher = {McGraw-Hill},
+  title = {Aided Navigation: {GPS} with High Rate Sensors},
+  year = {2008}}

-@Article{Forster16tro,
-  fullauthor = {Christian Forster and Luca Carlone and Frank
-  Dellaert and Davide Scaramuzza},
-  author = {C. Forster and L. Carlone and F. Dellaert and
-  D. Scaramuzza},
-  title = {On-Manifold Preintegration for Real-Time
-  Visual-Inertial Odometry},
-  journal = TRO,
-  year = 2016,
-  DOI={10.1109/TRO.2016.2597321}
-}
+@article{Forster16tro,
+  author = {Forster, C. and Carlone, L. and Dellaert, F. and Scaramuzza, D.},
+  doi = {10.1109/TRO.2016.2597321},
+  fullauthor = {Christian Forster and Luca Carlone and Frank Dellaert and Davide Scaramuzza},
+  journal = TRO,
+  title = {On-Manifold Preintegration for Real-Time Visual-Inertial Odometry},
+  year = 2016,
+  bdsk-url-1 = {https://doi.org/10.1109/TRO.2016.2597321}}

-@INPROCEEDINGS{Gamagedara19acc_geometric_control,
-  author={Gamagedara, Kanishke and Bisheban, Mahdis and Kaufman, Evan and Lee, Taeyoung},
-  booktitle={2019 American Control Conference (ACC)},
-  title={Geometric Controls of a Quadrotor UAV with Decoupled Yaw Control},
-  year={2019},
-  pages={3285-3290},
-  doi={10.23919/ACC.2019.8815189}
-}
+@inproceedings{Gamagedara19acc_geometric_control,
+  author = {Gamagedara, Kanishke and Bisheban, Mahdis and Kaufman, Evan and Lee, Taeyoung},
+  booktitle = {2019 American Control Conference (ACC)},
+  doi = {10.23919/ACC.2019.8815189},
+  pages = {3285-3290},
+  title = {Geometric Controls of a Quadrotor UAV with Decoupled Yaw Control},
+  year = {2019},
+  bdsk-url-1 = {https://doi.org/10.23919/ACC.2019.8815189}}

 @book{Goodfellow16book_dl,
-  title = {Deep Learning},
-  author = {Ian Goodfellow and Yoshua Bengio and Aaron Courville},
-  publisher = {MIT Press},
-  year = {2016},
-  isbn = {9780262035613},
-  url = {https://www.deeplearningbook.org}
-}
+  author = {Goodfellow, Ian and Bengio, Yoshua and Courville, Aaron},
+  isbn = {9780262035613},
+  publisher = {MIT Press},
+  title = {Deep Learning},
+  url = {https://www.deeplearningbook.org},
+  year = {2016},
+  bdsk-url-1 = {https://www.deeplearningbook.org}}

-@Book{Hartley00,
-  Title = {Multiple View Geometry in Computer Vision},
-  Author = {Richard Hartley and Andrew Zisserman},
-  Publisher = {Cambridge University Press},
-  Year = {2000},
-}
+@book{Hartley00,
+  author = {Hartley, Richard and Zisserman, Andrew},
+  publisher = {Cambridge University Press},
+  title = {Multiple View Geometry in Computer Vision},
+  year = {2000}}

-@ARTICLE{Jelinek75it_decoder,
-  author={Jelinek, F. and Bahl, L. and Mercer, R.},
-  journal={IEEE Transactions on Information Theory},
-  title={Design of a linguistic statistical decoder for the recognition of continuous speech},
-  year={1975},
-  volume={21},
-  number={3},
-  pages={250-256},
-  keywords={},
-  doi={10.1109/TIT.1975.1055384}
-}
+@article{Jelinek75it_decoder,
+  author = {Jelinek, F. and Bahl, L. 
and Mercer, R.}, + doi = {10.1109/TIT.1975.1055384}, + journal = {IEEE Transactions on Information Theory}, + number = {3}, + pages = {250-256}, + title = {Design of a linguistic statistical decoder for the recognition of continuous speech}, + volume = {21}, + year = {1975}, + bdsk-url-1 = {https://doi.org/10.1109/TIT.1975.1055384}} @book{LaValle06book_planning, - title = {Planning Algorithms}, - author = {Steven M. LaValle}, - publisher = {Cambridge University Press}, - year = {2006}, - isbn={9780521862059}, - url = {https://lavalle.pl/planning/} -} - -@Article{Lupton12tro, - Title = {Visual-Inertial-Aided Navigation for High-Dynamic - Motion in Built Environments Without Initial Conditions}, - Author = {Lupton, Todd and Sukkarieh, Salah}, - Journal = TRO, - Year = {2012}, - - Month = {Feb}, - Number = {1}, - Pages = {61-76}, - Volume = {28}, - DOI= {10.1109/TRO.2011.2170332} -} + author = {LaValle, Steven M.}, + isbn = {9780521862059}, + publisher = {Cambridge University Press}, + title = {Planning Algorithms}, + url = {https://lavalle.pl/planning/}, + year = {2006}, + bdsk-url-1 = {https://lavalle.pl/planning/}} + +@article{Lupton12tro, + author = {Lupton, Todd and Sukkarieh, Salah}, + doi = {10.1109/TRO.2011.2170332}, + journal = TRO, + month = {Feb}, + number = {1}, + pages = {61-76}, + title = {Visual-Inertial-Aided Navigation for High-Dynamic Motion in Built Environments Without Initial Conditions}, + volume = {28}, + year = {2012}, + bdsk-url-1 = {https://doi.org/10.1109/TRO.2011.2170332}} @book{Lynch17book_MR, - author = {Kevin M. Lynch and Frank C. Park}, - title = {Modern Robotics: Mechanics, Planning, and Control}, - year = {2017}, - publisher = {Cambridge University Press}, - isbn = {1107156300}, - address = {USA}, - url = {http://modernrobotics.org/} -} - -@Book{Ma04book, - Title = {An Invitation to {3-D} Vision}, - Author = {Yi Ma and Stefano Soatto and Jana Kosecka and Shankar S. Sastry}, - Publisher = {Springer}, - Year = {2004}, -} + address = {USA}, + author = {Lynch, Kevin M. and Park, Frank C.}, + isbn = {1107156300}, + publisher = {Cambridge University Press}, + title = {Modern Robotics: Mechanics, Planning, and Control}, + url = {http://modernrobotics.org/}, + year = {2017}, + bdsk-url-1 = {http://modernrobotics.org/}} + +@book{Ma04book, + author = {Ma, Yi and Soatto, Stefano and Kosecka, Jana and Sastry, Shankar S.}, + publisher = {Springer}, + title = {An Invitation to {3-D} Vision}, + year = {2004}} @article{Mahony12ram, - title = {Multirotor aerial vehicles: Modeling, estimation, - and control of quadrotor}, - author = {Mahony, Robert and Kumar, Vijay and Corke, Peter}, - year = 2012, - publisher = {IEEE}, - journal = RAM, - DOI= {10.1109/MRA.2012.2206474} -} + author = {Mahony, Robert and Kumar, Vijay and Corke, Peter}, + doi = {10.1109/MRA.2012.2206474}, + journal = RAM, + publisher = {IEEE}, + title = {Multirotor aerial vehicles: Modeling, estimation, and control of quadrotor}, + year = 2012, + bdsk-url-1 = {https://doi.org/10.1109/MRA.2012.2206474}} @article{Mildenhall22eccv_Nerf, - author = {Mildenhall, Ben and Srinivasan, Pratul P. and Tancik, Matthew and Barron, Jonathan T. and Ramamoorthi, Ravi and Ng, Ren}, - title = {NeRF: representing scenes as neural radiance fields for view synthesis}, - year = {2021}, - publisher = {Association for Computing Machinery}, - address = {New York, NY, USA}, - volume = {65}, - number = {1}, - issn = {0001-0782}, - doi = {10.1145/3503250}, - journal = {Commun. 
ACM}, - month = {dec}, - pages = {99–106}, - numpages = {8} -} - -@Book{Murray94book, - Title = {A Mathematical Introduction to Robotic Manipulation}, - Publisher = {CRC Press}, - Year = {1994}, - Author = {Richard M. Murray and Zexiang Li and Shankar S. Sastry} -} - -@Article{Schoenemann66_procrustes, - author={Peter Schönemann}, - title={{A generalized solution of the orthogonal procrustes problem}}, - journal={Psychometrika}, - year=1966, - volume={31}, - number={1}, - pages={1-10}, - month={March}, - doi={10.1007/BF02289451}, -} + address = {New York, NY, USA}, + author = {Mildenhall, Ben and Srinivasan, Pratul P. and Tancik, Matthew and Barron, Jonathan T. and Ramamoorthi, Ravi and Ng, Ren}, + doi = {10.1145/3503250}, + issn = {0001-0782}, + journal = {Commun. ACM}, + month = {dec}, + number = {1}, + numpages = {8}, + pages = {99--106}, + publisher = {Association for Computing Machinery}, + title = {NeRF: representing scenes as neural radiance fields for view synthesis}, + volume = {65}, + year = {2021}, + bdsk-url-1 = {https://doi.org/10.1145/3503250}} + +@book{Murray94book, + author = {Murray, Richard M. and Li, Zexiang and Sastry, Shankar S.}, + publisher = {CRC Press}, + title = {A Mathematical Introduction to Robotic Manipulation}, + year = {1994}} + +@article{Schoenemann66_procrustes, + author = {Sch{\"o}nemann, Peter}, + doi = {10.1007/BF02289451}, + journal = {Psychometrika}, + month = {March}, + number = {1}, + pages = {1-10}, + title = {{A generalized solution of the orthogonal procrustes problem}}, + volume = {31}, + year = 1966, + bdsk-url-1 = {https://doi.org/10.1007/BF02289451}} @book{Siegwart11book_robots, - title = {Introduction to Autonomous Mobile Robots}, - author = {Roland Siegwart and Illah Reza Nourbakhsh and Davide Scaramuzza}, - publisher = {MIT Press}, - year = {2011}, - ISBN= {9780262015356}, - url = {https://mitpress.mit.edu/9780262015356/introduction-to-autonomous-mobile-robots/} -} - -@INPROCEEDINGS{Sun22cvpr_dvgo, - author={Sun, Cheng and Sun, Min and Chen, Hwann-Tzong}, - booktitle={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, - title={Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction}, - year={2022}, - pages={5449-5459}, - doi={10.1109/CVPR52688.2022.00538} -} + author = {Siegwart, Roland and Nourbakhsh, Illah Reza and Scaramuzza, Davide}, + isbn = {9780262015356}, + publisher = {MIT Press}, + title = {Introduction to Autonomous Mobile Robots}, + url = {https://mitpress.mit.edu/9780262015356/introduction-to-autonomous-mobile-robots/}, + year = {2011}, + bdsk-url-1 = {https://mitpress.mit.edu/9780262015356/introduction-to-autonomous-mobile-robots/}} + +@inproceedings{Sun22cvpr_dvgo, + author = {Sun, Cheng and Sun, Min and Chen, Hwann-Tzong}, + booktitle = {2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, + doi = {10.1109/CVPR52688.2022.00538}, + pages = {5449-5459}, + title = {Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction}, + year = {2022}, + bdsk-url-1 = {https://doi.org/10.1109/CVPR52688.2022.00538}} @book{Sutton18book_reinforcement, - title = {Reinforcement Learning: An Introduction}, - author = {Richard S. Sutton and Andrew G. Barto}, - edition = {2}, - year = {2018}, - isbn = {9780262039246}, - publisher = {The MIT Press}, - address = {Cambridge, MA, USA}, - url = {http://incompleteideas.net/book/the-book-2nd.html} -} + address = {Cambridge, MA, USA}, + author = {Sutton, Richard S. 
and Barto, Andrew G.}, + edition = {2}, + isbn = {9780262039246}, + publisher = {The MIT Press}, + title = {Reinforcement Learning: An Introduction}, + url = {http://incompleteideas.net/book/the-book-2nd.html}, + year = {2018}, + bdsk-url-1 = {http://incompleteideas.net/book/the-book-2nd.html}} @book{Thrun05book_probabilistic, - title = {Probabilistic Robotics}, - author = {Sebastian Thrun and Wolfram Burgard and Dieter Fox}, - publisher = {The MIT Press}, - year = {2005}, - isbn = {9780262201629}, - url = {https://mitpress.mit.edu/9780262201629/probabilistic-robotics/} -} + author = {Thrun, Sebastian and Burgard, Wolfram and Fox, Dieter}, + isbn = {9780262201629}, + publisher = {The MIT Press}, + title = {Probabilistic Robotics}, + url = {https://mitpress.mit.edu/9780262201629/probabilistic-robotics/}, + year = {2005}, + bdsk-url-1 = {https://mitpress.mit.edu/9780262201629/probabilistic-robotics/}} @book{Watkins89thesis_Qlearning, - title={Learning from delayed rewards}, - author={Watkins, Christopher John Cornish Hellaby}, - year={1989}, - publisher={King's College, Cambridge United Kingdom} -} + author = {Watkins, Christopher John Cornish Hellaby}, + publisher = {King's College, Cambridge United Kingdom}, + title = {Learning from delayed rewards}, + year = {1989}} @book{Zhang20book_d2l, - title = {Dive into Deep Learning}, - author = {Aston Zhang and Zack Lipton and Mu Li and Alexander J. Smola}, - publisher = {d2l.ai}, - year = {2020}, - isbn={978-1009389433}, - url = {https://d2l.ai/} -} + author = {Zhang, Aston and Lipton, Zack and Li, Mu and Smola, Alexander J.}, + isbn = {978-1009389433}, + publisher = {d2l.ai}, + title = {Dive into Deep Learning}, + url = {https://d2l.ai/}, + year = {2020}, + bdsk-url-1 = {https://d2l.ai/}} @book{HaldBook98, - title = {A history of mathematical statistics from 1750 to 1930}, - author = {Anders Hald}, - publisher = {Wiley}, - year = {1998} -} + author = {Hald, Anders}, + publisher = {Wiley}, + title = {A history of mathematical statistics from 1750 to 1930}, + year = {1998}} + @book{HaldBook03, - title = {A History of Probability and Statistics and Their Applications before 1750}, - author = {Anders Hald}, - publisher = {Wiley}, - year = {2003} -} - -@BOOK{Pearl88Probabilistic, - AUTHOR = {Pearl, J.}, - TITLE = {Probabilistic Reasoning in Intelligent Systems: - Networks of Plausible Inference}, - PUBLISHER = {Morgan Kaufmann Publishers, Inc.}, - YEAR = {1988} - } + author = {Hald, Anders}, + publisher = {Wiley}, + title = {A History of Probability and Statistics and Their Applications before 1750}, + year = {2003}} + +@book{Pearl88Probabilistic, + author = {Pearl, J.}, + publisher = {Morgan Kaufmann Publishers, Inc.}, + title = {Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference}, + year = {1988}} @book{ProbGraphModels, - Title = {Probabilistic Graphical Models Principles and Techniques}, - Author = {Koller, D. and Friedman, N.}, - Publisher = {The MIT Press}, - Year = {2009} -} + author = {Koller, D. and Friedman, N.}, + publisher = {The MIT Press}, + title = {Probabilistic Graphical Models Principles and Techniques}, + year = {2009}} @book{duda2012pattern, - title={Pattern Classification}, - author={Duda, R.O. and Hart, P.E. and Stork, D.G.}, - isbn={9781118586006}, - url={https://books.google.co.uk/books?id=Br33IRC3PkQC}, - year={2012}, - publisher={Wiley} -} - + author = {Duda, R.O. and Hart, P.E. 
and Stork, D.G.}, + isbn = {9781118586006}, + publisher = {Wiley}, + title = {Pattern Classification}, + url = {https://books.google.co.uk/books?id=Br33IRC3PkQC}, + year = {2012}, + bdsk-url-1 = {https://books.google.co.uk/books?id=Br33IRC3PkQC}} @article{SmaSon73, - author = {Smallwood, Richard D. and Sondik, Edward J.}, - journal = {Operations Research}, - month = {Sep.}, - number = {5}, - pages = {1071--1088}, - publisher = {INFORMS}, - title = {The Optimal Control of Partially Observable {M}arkov Processes Over a Finite Horizon}, - volume = {21}, - year = {1973} -} + author = {Smallwood, Richard D. and Sondik, Edward J.}, + journal = {Operations Research}, + month = {Sep.}, + number = {5}, + pages = {1071--1088}, + publisher = {INFORMS}, + title = {The Optimal Control of Partially Observable {M}arkov Processes Over a Finite Horizon}, + volume = {21}, + year = {1973}} @article{Son78, - author = {Sondik, Edward J.}, - journal = {Operations Research}, - month = {March}, - number = {2}, - pages = {282--304}, - publisher = {INFORMS}, - title = {The Optimal Control of Partially Observable {M}arkov Processes Over the Infinite Horizon: Discounted Costs}, - volume = {26}, - year = {1978} -} + author = {Sondik, Edward J.}, + journal = {Operations Research}, + month = {March}, + number = {2}, + pages = {282--304}, + publisher = {INFORMS}, + title = {The Optimal Control of Partially Observable {M}arkov Processes Over the Infinite Horizon: Discounted Costs}, + volume = {26}, + year = {1978}} @book{AutoMobileRobotsBook, - Title = {Introduction to Autonomous Mobile Robots}, - Author = {Siegwart, R. and Nourbakhsh, I. and Scaramuzza, D.}, - Publisher = {The MIT Press}, - Year = {2011} -} - -@BOOK{JCL:91, - AUTHOR = {Latombe, J.C.}, - TITLE = {Robot Motion Planning}, - PUBLISHER = {Kluwer Academic Publishers}, - YEAR = {1991}, - ADDRESS = {Boston, MA}} + author = {Siegwart, R. and Nourbakhsh, I. and Scaramuzza, D.}, + publisher = {The MIT Press}, + title = {Introduction to Autonomous Mobile Robots}, + year = {2011}} + +@book{JCL:91, + address = {Boston, MA}, + author = {Latombe, J.C.}, + publisher = {Kluwer Academic Publishers}, + title = {Robot Motion Planning}, + year = {1991}} @book{LavalleBook, - author = {S. M. LaValle}, - title = {Planning Algorithms}, - address = {Cambridge, MA}, - publisher = {Cambridge University Press}, - year = 2006} - -@ARTICLE{Ila10tro_PoseSLAM, - author={Ila, Viorela and Porta, Josep M. and Andrade-Cetto, Juan}, - journal={IEEE Transactions on Robotics}, - title={Information-Based Compact Pose SLAM}, - year={2010}, - volume={26}, - number={1}, - pages={78-93}, - doi={10.1109/TRO.2009.2034435}} + address = {Cambridge, MA}, + author = {LaValle, S. M.}, + publisher = {Cambridge University Press}, + title = {Planning Algorithms}, + year = 2006} + +@article{Ila10tro_PoseSLAM, + author = {Ila, Viorela and Porta, Josep M. and Andrade-Cetto, Juan}, + doi = {10.1109/TRO.2009.2034435}, + journal = {IEEE Transactions on Robotics}, + number = {1}, + pages = {78-93}, + title = {Information-Based Compact Pose SLAM}, + volume = {26}, + year = {2010}, + bdsk-url-1 = {https://doi.org/10.1109/TRO.2009.2034435}} @article{Williams92ml_reinforce, - author = {R. J. Williams}, - title = {Simple statistical gradient-following algorithms for - connectionist reinforcement learning}, - journal = {Machine Learning}, - year = {1992}, - volume = 8, - pages = {229-259} -} + author = {Williams, R. 
J.}, + journal = {Machine Learning}, + pages = {229-259}, + title = {Simple statistical gradient-following algorithms for connectionist reinforcement learning}, + volume = 8, + year = {1992}} @article{Schulman17_PPO, - author = {John Schulman and - Filip Wolski and - Prafulla Dhariwal and - Alec Radford and - Oleg Klimov}, - title = {Proximal Policy Optimization Algorithms}, - journal = {CoRR}, - volume = {abs/1707.06347}, - year = {2017}, - url = {http://arxiv.org/abs/1707.06347}, - eprinttype = {arXiv}, - eprint = {1707.06347}, - timestamp = {Mon, 13 Aug 2018 16:47:34 +0200}, - biburl = {https://dblp.org/rec/journals/corr/SchulmanWDRK17.bib}, - bibsource = {dblp computer science bibliography, https://dblp.org} -} + author = {Schulman, John and Wolski, Filip and Dhariwal, Prafulla and Radford, Alec and Klimov, Oleg}, + bibsource = {dblp computer science bibliography, https://dblp.org}, + biburl = {https://dblp.org/rec/journals/corr/SchulmanWDRK17.bib}, + eprint = {1707.06347}, + eprinttype = {arXiv}, + journal = {CoRR}, + timestamp = {Mon, 13 Aug 2018 16:47:34 +0200}, + title = {Proximal Policy Optimization Algorithms}, + url = {http://arxiv.org/abs/1707.06347}, + volume = {abs/1707.06347}, + year = {2017}, + bdsk-url-1 = {http://arxiv.org/abs/1707.06347}} @book{Spong96book, - author={M.\ Spong and S.\ Hutchinson and M. Vidyasagar}, - title={Robot Modeling and Control}, - publisher={John Wiley and Sons}, - address={NY, NY}, - year={2006} -} - -@ARTICLE{Besl92pami_ICP, - author={Besl, P.J. and McKay, Neil D.}, - journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, - title={A method for registration of 3-D shapes}, - year={1992}, - volume={14}, - number={2}, - pages={239-256}, - doi={10.1109/34.121791}} - -@INPROCEEDINGS{Li20icma_LaneChange, - author={Li, Zhiyuan and Liang, Huawei and Zhao, Pan and Wang, Shaobo and Zhu, Hui}, - booktitle={2020 IEEE International Conference on Mechatronics and Automation (ICMA)}, - title={Efficient Lane Change Path Planning based on Quintic spline for Autonomous Vehicles}, - year={2020}, - pages={338-344}, - doi={10.1109/ICMA49215.2020.9233841}} - -@INPROCEEDINGS{Werling10icra_Frenet, - author={Werling, Moritz and Ziegler, Julius and Kammel, Sören and Thrun, Sebastian}, - booktitle={2010 IEEE International Conference on Robotics and Automation}, - title={Optimal trajectory generation for dynamic street scenarios in a Frenét Frame}, - year={2010}, - pages={987-993}, - doi={10.1109/ROBOT.2010.5509799}} + address = {NY, NY}, + author = {Spong, M.\ and Hutchinson, S.\ and Vidyasagar, M.}, + publisher = {John Wiley and Sons}, + title = {Robot Modeling and Control}, + year = {2006}} + +@article{Besl92pami_ICP, + author = {Besl, P.J. 
and McKay, Neil D.}, + doi = {10.1109/34.121791}, + journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence}, + number = {2}, + pages = {239-256}, + title = {A method for registration of 3-D shapes}, + volume = {14}, + year = {1992}, + bdsk-url-1 = {https://doi.org/10.1109/34.121791}} + +@inproceedings{Li20icma_LaneChange, + author = {Li, Zhiyuan and Liang, Huawei and Zhao, Pan and Wang, Shaobo and Zhu, Hui}, + booktitle = {2020 IEEE International Conference on Mechatronics and Automation (ICMA)}, + doi = {10.1109/ICMA49215.2020.9233841}, + pages = {338-344}, + title = {Efficient Lane Change Path Planning based on Quintic spline for Autonomous Vehicles}, + year = {2020}, + bdsk-url-1 = {https://doi.org/10.1109/ICMA49215.2020.9233841}} + +@inproceedings{Werling10icra_Frenet, + author = {Werling, Moritz and Ziegler, Julius and Kammel, S{\"o}ren and Thrun, Sebastian}, + booktitle = {2010 IEEE International Conference on Robotics and Automation}, + doi = {10.1109/ROBOT.2010.5509799}, + pages = {987-993}, + title = {Optimal trajectory generation for dynamic street scenarios in a Fren{\'e}t Frame}, + year = {2010}, + bdsk-url-1 = {https://doi.org/10.1109/ROBOT.2010.5509799}} @article{FrancoisLavet18fnt_DRL, - title={An Introduction to Deep Reinforcement Learning}, - volume={11}, - ISSN={1935-8245}, - url={http://dx.doi.org/10.1561/2200000071}, - DOI={10.1561/2200000071}, - number={3–4}, - journal={Foundations and Trends in Machine Learning}, - publisher={Now Publishers}, - author={François-Lavet, Vincent and Henderson, Peter and Islam, Riashat and Bellemare, Marc G. and Pineau, Joelle}, - year={2018}, - pages={219–354} -} - + author = {Fran{\c c}ois-Lavet, Vincent and Henderson, Peter and Islam, Riashat and Bellemare, Marc G. and Pineau, Joelle}, + doi = {10.1561/2200000071}, + issn = {1935-8245}, + journal = {Foundations and Trends in Machine Learning}, + number = {3--4}, + pages = {219--354}, + publisher = {Now Publishers}, + title = {An Introduction to Deep Reinforcement Learning}, + url = {http://dx.doi.org/10.1561/2200000071}, + volume = {11}, + year = {2018}, + bdsk-url-1 = {http://dx.doi.org/10.1561/2200000071}} @article{Mnih15nature_dqn, - title = {Human-level control through deep reinforcement learning}, - volume = {518}, - issn = {1476-4687}, - url = {https://doi.org/10.1038/nature14236}, + author = {Mnih, Volodymyr and Kavukcuoglu, Koray and Silver, David and Rusu, Andrei A. and Veness, Joel and Bellemare, Marc G. and Graves, Alex and Riedmiller, Martin and Fidjeland, Andreas K. and Ostrovski, Georg and Petersen, Stig and Beattie, Charles and Sadik, Amir and Antonoglou, Ioannis and King, Helen and Kumaran, Dharshan and Wierstra, Daan and Legg, Shane and Hassabis, Demis}, doi = {10.1038/nature14236}, - number = {7540}, + issn = {1476-4687}, journal = {Nature}, - author = {Mnih, Volodymyr and Kavukcuoglu, Koray and Silver, David and Rusu, Andrei A. and Veness, Joel and Bellemare, Marc G. and Graves, Alex and Riedmiller, Martin and Fidjeland, Andreas K. and Ostrovski, Georg and Petersen, Stig and Beattie, Charles and Sadik, Amir and Antonoglou, Ioannis and King, Helen and Kumaran, Dharshan and Wierstra, Daan and Legg, Shane and Hassabis, Demis}, month = feb, - year = {2015}, + number = {7540}, pages = {529--533}, -} + title = {Human-level control through deep reinforcement learning}, + url = {https://doi.org/10.1038/nature14236}, + volume = {518}, + year = {2015}, + bdsk-url-1 = {https://doi.org/10.1038/nature14236}}