This website, maintained by S. Purcell, provides what I think is a very clear, straightforward introduction to MLE. In particular, we'll want to look at the second major heading on the page, Model-Fitting.

This other site lists some of the advantages of MLE vis-à-vis OLS.

Lindsay Reed, our former computer lab director, once loaned me a book on the history of statistics, the unusually titled *The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century* (by David Salsburg, published in 2001).

This book discusses the many statistical contributions of Sir Ronald A. Fisher, among which is MLE. Writes Salsburg:

*In spite of Fisher's ingenuity, the majority of situations presented intractable mathematics to the potential user of the MLE* (p. 68).

Practically speaking, obtaining MLE solutions required repeated iterations, which was very difficult to achieve before the computer revolution. Citing the sixteenth-century mathematician Robert Recorde, Salsburg writes:

*...you first guess the answer and apply it to the problem. There will be a discrepancy between the result of using this guess and the result you want. You take that discrepancy and use it to produce a better guess... For Fisher's maximum likelihood, it might take thousands or even millions of iterations before you get a good answer... What are a mere million iterations to a patient computer?* (p. 70).
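Recorde's guess-and-refine procedure is easy to see in code. Below is a minimal sketch in Python (my own illustration, not from Salsburg's book) that approximates a square root by repeatedly using the discrepancy to produce a better guess, exactly in the spirit of the passage above:

```python
# A minimal sketch of Recorde-style guess-and-refine iteration
# (an illustration, not taken from Salsburg): approximate sqrt(2)
# by using each guess's discrepancy to produce a better guess.

def iterative_sqrt(target, guess=1.0, tolerance=1e-10, max_iters=100):
    for i in range(max_iters):
        discrepancy = guess * guess - target  # how far off this guess is
        if abs(discrepancy) < tolerance:
            return guess, i  # guesses have settled down: stop
        # Fold the discrepancy back into the guess (Newton's update).
        guess = guess - discrepancy / (2 * guess)
    return guess, max_iters

root, iterations = iterative_sqrt(2.0)
print(f"sqrt(2) is approximately {root} after {iterations} iterations")
```

For a problem this simple, a handful of iterations suffice; Salsburg's point is that the same guess-and-refine logic, applied to a complicated likelihood function, may demand thousands or millions of iterations.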

**UPDATE I:** The 2013 textbook by Texas Tech Business Administration professor Peter Westfall and Kevin Henning, *Understanding Advanced Statistical Methods*, includes additional description of MLE. The above-referenced Purcell page provides an example with a relatively simple equation for the likelihood function. Westfall and Henning, while providing a more mathematically intense discussion of MLE, have several good explanatory quotes:

*In cases of complex advanced statistical models such as regressions, structural equation models, and neural networks, there are often dozens or perhaps even hundreds of parameters in the likelihood function* (p. 317).

*In practice, likelihood functions tend to be much more complicated [than the book's examples], and you won't be able to solve the calculus problem even if you excel at math. Instead, you'll have to use **numerical methods**, a fancy term for "letting the computer do the calculus for you." ... Numerical methods for finding MLEs work by **iterative approximation**. They start with an initial guess... then update the guess to some value... by climbing up the likelihood function... The iteration continues until the successive values... are so close to one another that the computer is willing to assume that the peak has been achieved. When this happens, the algorithm is said to **converge**.* (p. 325; emphasis in original).

This is what the Minimization History portion of the AMOS output refers to, along with the possible error message that one's model has failed to converge.
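To make the quoted description concrete, here is a small Python sketch of MLE by iterative approximation (my own illustration; this is not the algorithm AMOS actually uses). It estimates the mean and standard deviation of a simulated normal sample by climbing the log-likelihood until successive guesses are nearly identical, i.e., until it converges. The normal case is handy because the closed-form answers (the sample mean and SD) let you check the result:

```python
# A hedged sketch of numerical MLE by iterative approximation:
# plain gradient ascent on a normal log-likelihood, climbing until
# successive guesses agree (convergence). Not AMOS's algorithm.

import numpy as np

rng = np.random.default_rng(seed=1)
data = rng.normal(loc=5.0, scale=2.0, size=500)  # simulated sample

def log_likelihood(mu, sigma, x):
    # Normal log-likelihood, summed over all observations.
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu) ** 2 / (2 * sigma**2))

def gradient(mu, sigma, x):
    # Analytic gradient of the normal log-likelihood.
    d_mu = np.sum(x - mu) / sigma**2
    d_sigma = np.sum((x - mu) ** 2) / sigma**3 - len(x) / sigma
    return np.array([d_mu, d_sigma])

theta = np.array([0.0, 1.0])  # initial guess for (mu, sigma)
step = 1e-4                   # step size for climbing
for iteration in range(100_000):
    new_theta = theta + step * gradient(theta[0], theta[1], data)
    if np.max(np.abs(new_theta - theta)) < 1e-9:
        break                 # successive guesses agree: converged
    theta = new_theta

print(f"MLE after {iteration} iterations: "
      f"mu = {theta[0]:.3f}, sigma = {theta[1]:.3f}")
```

Even this two-parameter toy problem takes a few thousand iterations to converge; with the dozens or hundreds of parameters Westfall and Henning describe, one can see why a model might fail to converge at all.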

**UPDATE II:** The reference given by our 2014 guest speaker on MLE is:

Ferron, J. M., & Hess, M. R. (2007). Estimation in SEM: A concrete example. *Journal of Educational and Behavioral Statistics, 32*, 110-120.