Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., … Zheng, X. (2016). TensorFlow: Large-scale machine learning on heterogeneous distributed systems. *CoRR, abs/1603.04467*. Retrieved from http://arxiv.org/abs/1603.04467

Andersson, J. A. E., Gillis, J., Horn, G., Rawlings, J. B., & Diehl, M. (2018). CasADi—A software framework for nonlinear optimization and optimal control. *Mathematical Programming Computation*. https://doi.org/10.1007/s12532-018-0139-4

Aubert, P., Di Cesare, N., & Pironneau, O. (2001). Automatic differentiation in C++ using expression templates and application to a flow control problem. *Computing and Visualization in Science*, *3*, 197–208. https://doi.org/10.1007/s007910000048

Baydin, A. G., Pearlmutter, B. A., Radul, A. A., & Siskind, J. M. (2018). Automatic differentiation in machine learning: A survey. *CoRR, abs/1502.05767*. Retrieved from http://arxiv.org/abs/1502.05767

Bell, B. M. (2012). CppAD: A package for C++ algorithmic differentiation. *Computational Infrastructure for Operations Research*. Retrieved from https://projects.coin-or.org/CppAD

Bell, B. M., & Burke, J. V. (2008). Algorithmic differentiation of implicit functions and optimal values. *Advances in Automatic Differentiation*, *64*. https://doi.org/10.1007/978-3-540-68942-3_17

Betancourt, M. (2013). *A general metric for Riemannian manifold Hamiltonian Monte Carlo*. Retrieved from https://arxiv.org/pdf/1212.4693.pdf

Betancourt, M. (2017). *Nomad: A high-performance automatic differentiation package* [Computer software manual]. Retrieved from https://github.com/stan-dev/nomad

Betancourt, M. (2018a, January). A conceptual introduction to Hamiltonian Monte Carlo. *arXiv:1701.02434v1*.

Betancourt, M. (2018b). A geometric theory of higher-order automatic differentiation. *In preparation*. Retrieved from https://arxiv.org/abs/1812.11592

Bischof, C., & Bücker, H. (2000). Computing derivatives of computer programs. In J. Grotendorst (Ed.), *Modern methods and algorithms of quantum chemistry: Proceedings* (Vol. 3, 2nd ed., pp. 315–327). Jülich, Germany: NIC-Directors. Retrieved from https://juser.fz-juelich.de/record/44658/files/Band_3_Winterschule.pdf

Bischof, C., Corliss, G., & Griewank, A. (1993). Structured second- and higher-order derivatives through univariate Taylor series. *Optimization Methods and Software*, *2*, 211–232. https://doi.org/10.1080/10556789308805543

Bischof, C., Khademi, P., Mauer, A., & Carle, A. (1996). ADIFOR 2.0: Automatic differentiation of Fortran 77 programs. *IEEE Computational Science & Engineering*, *3*(3), 18–32. https://doi.org/10.1109/99.537089

Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., … Riddell, A. (2017). Stan: A probabilistic programming language. *Journal of Statistical Software*, *76*, 1–32. https://doi.org/10.18637/jss.v076.i01

Carpenter, B., Hoffman, M. D., Brubaker, M. A., Lee, D., Li, P., & Betancourt, M. J. (2015). *The Stan math library: Reverse-mode automatic differentiation in C++*. Retrieved from https://arxiv.org/abs/1509.07164

Eddelbuettel, D., & Balamuta, J. J. (2017, August). Extending R with C++: A brief introduction to Rcpp. *PeerJ Preprints*, *5*, e3188v1. Retrieved from https://doi.org/10.7287/peerj.preprints.3188v1

Gay, D. (2005). Semiautomatic differentiation for efficient gradient computations. In H. M. Buecker, G. F. Corliss, P. Hovland, U. Naumann, & B. Norris (Eds.), *Automatic differentiation: Applications, theory, and implementations* (Vol. 50, pp. 147–158). New York, NY: Springer.

Gay, D., & Aiken, A. (2001, May). Language support for regions. *SIGPLAN Notices*, *36*(5), 70–80. https://doi.org/10.1145/381694.378815

Gebremedhin, A. H., Tarafdar, A., Pothen, A., & Walther, A. (2009). Efficient computation of sparse Hessians using coloring and automatic differentiation. *INFORMS Journal on Computing*, *21*, 209–223. https://doi.org/10.1287/ijoc.1080.0286

Giles, M. B. (2008). Collected matrix derivative results for forward and reverse mode algorithmic differentiation. In C. H. Bischof, H. M. Bücker, P. Hovland, U. Naumann, & J. Utke (Eds.), *Advances in automatic differentiation. Lecture notes in computational science and engineering* (Vol. 64). Berlin, Heidelberg: Springer.

Girolami, M., Calderhead, B., & Chin, S. A. (2013). Riemannian manifold Hamiltonian Monte Carlo. *arXiv:0907.1100*. Retrieved from https://arxiv.org/abs/0907.1100

Griewank, A. (1992). Achieving logarithmic growth of temporal and spatial complexity in reverse automatic differentiation. *Optimization Methods and Software*, *1*(1), 35–54. https://doi.org/10.1080/10556789208805505

Griewank, A., Juedes, D., & Utke, J. (1996). ADOL-C: A package for the automatic differentiation of algorithms written in C/C++. *ACM Transactions on Mathematical Software*, *22*, 131–167. Retrieved from http://www3.math.tu-berlin.de/Vorlesungen/SS06/AlgoDiff/adolc-110.pdf

Griewank, A., Utke, J., & Walther, A. (2000). Evaluating higher derivative tensors by forward propagation of univariate Taylor series. *Mathematics of Computation*, *69*(231), 1117–1130. https://doi.org/10.1090/S0025-5718-00-01120-0

Griewank, A., & Walther, A. (2008). *Evaluating derivatives: Principles and techniques of algorithmic differentiation* (2nd ed.). Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM). https://doi.org/10.1137/1.9780898717761

Grimm, J., Pottier, L., & Rostaing-Schmidt, N. (1996). Optimal time and minimum space-time product for reversing a certain class of programs. In M. Berz, C. Bischof, G. Corliss, & A. Griewank (Eds.), *Computational differentiation: Techniques, applications, and tools* (pp. 161–172). Philadelphia, PA: SIAM.

Guennebaud, G., & Jacob, B. (2010). *Eigen v3*. Retrieved from http://eigen.tuxfamily.org

Hascoet, L., & Pascual, V. (2013). The Tapenade automatic differentiation tool: Principles, model, and specification. *ACM Transactions on Mathematical Software*, *39*, 1–43. https://doi.org/10.1145/2450153.2450158

Hoffman, M. D., & Gelman, A. (2014, April). The No-U-Turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. *Journal of Machine Learning Research*, *15*, 1593–1623.

Hogan, R. J. (2014). Fast reverse-mode automatic differentiation using expression templates in C++. *ACM Transactions on Mathematical Software*, *40*(4), 1–16. https://doi.org/10.1145/2560359

Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2016). *Automatic differentiation variational inference*. Retrieved from https://arxiv.org/abs/1603.00788

Margossian, C. C. (2018, January). Computing steady states with Stan's nonlinear algebraic solver. In *Stan Conference 2018 California*.

Margossian, C. C., & Gillespie, W. R. (2016, October). Stan functions for pharmacometrics modeling. *Journal of Pharmacokinetics and Pharmacodynamics*, *43*(Suppl 1), 11. https://doi.org/10.1007/s10928-016-9485-x

Moler, C., & Van Loan, C. (2003, March). Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. *SIAM Review*, *45*, 3–49.

Neal, R. (2011). MCMC using Hamiltonian dynamics. In S. Brooks, A. Gelman, G. L. Jones, & X. L. Meng (Eds.), *Handbook of Markov chain Monte Carlo* (pp. 116–162). Chapman & Hall/CRC.

Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., … Lerer, A. (2017, December 9). Automatic differentiation in PyTorch. In *NIPS 2017 Autodiff Workshop: The Future of Gradient-based Machine Learning Software and Techniques*, Long Beach, CA.

Pearlmutter, B. A. (1994). Fast exact multiplication by the Hessian. *Neural Computation*, *6*, 147–160. https://doi.org/10.1162/neco.1994.6.1.147

Phipps, E., & Pawlowski, R. (2012). Efficient expression templates for operator overloading-based automatic differentiation. In S. Forth, P. Hovland, E. Phipps, J. Utke, & A. Walther (Eds.), *Recent advances in algorithmic differentiation* (pp. 309–319). Berlin, Heidelberg: Springer.

Powell, M. J. D. (1970). A hybrid method for nonlinear equations. In P. Rabinowitz (Ed.), *Numerical methods for nonlinear algebraic equations*. New York: Gordon and Breach.

Rowland, T., & Weisstein, E. W. (n.d.). *Matrix exponential*. MathWorld. Retrieved from http://mathworld.wolfram.com/MatrixExponential.html

Sagebaum, M., Albring, T., & Gauger, N. R. (2017). *High-performance derivative computations using CoDiPack*. Retrieved from https://arxiv.org/pdf/1709.07229.pdf

Salvatier, J., Wiecki, T. V., & Fonnesbeck, C. (2016). Probabilistic programming in Python using PyMC3. *PeerJ Computer Science*, *2*, e55. https://doi.org/10.7717/peerj-cs.55

Stan Development Team. (2018). *Bayesian statistics using Stan, version 2.18.1*. Retrieved from http://www.stat.columbia.edu/~gelman/bda.courseAbook/

Veldhuizen, T. (1995). Expression templates (Tech. Rep.). *C++ Report*, *7*, 26–31.

Voßbeck, M., Giering, R., & Kaminski, T. (2008). Development and first applications of TAC++. *Advances in Automatic Differentiation*, *64*. https://doi.org/10.1007/978-3-540-68942-3_17

Walther, A., & Griewank, A. (2012). Getting started with ADOL-C. In U. Naumann & O. Schenk (Eds.), *Combinatorial scientific computing* (pp. 181–202). Chapman & Hall/CRC Computational Science.

Wickham, H. (2009). *ggplot2: Elegant graphics for data analysis*. New York, NY: Springer-Verlag. Retrieved from http://ggplot2.org

Widrow, B., & Lehr, M. A. (1990, September). 30 years of adaptive neural networks: Perceptron, Madaline, and backpropagation. *Proceedings of the IEEE*, *78*(9), 1415–1442. https://doi.org/10.1109/5.58323