Machine learning “is concerned with the design and development of algorithms and techniques that allow computers to 'learn.' The major focus of machine learning research is to extract information from data automatically, using computational and statistical methods. This extracted information may then be generalized into rules and patterns.”
Machine learning “[is a] type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed, and focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.”
Machine learning is one of the most important technical approaches to AI and the basis of many recent advances and commercial applications of AI. Modern machine learning is a statistical process that starts with a body of data and tries to derive a rule or procedure that explains the data or can predict future data. This approach, learning from data, contrasts with the older "expert system" approach to AI, in which programmers sit down with human domain experts to learn the rules and criteria used to make decisions, and translate those rules into software code. An expert system aims to emulate the principles used by human experts, whereas machine learning relies on statistical methods to find a decision procedure that works well in practice.
An advantage of machine learning is that it can be used even in cases where it is infeasible or difficult to write down explicit rules to solve a problem. . . In a sense, machine learning is not an algorithm for solving a specific problem, but rather a more general approach to finding solutions for many different problems, given data about them.
To apply machine learning, a practitioner starts with a historical data set, which the practitioner divides into a training set and a test set. The practitioner chooses a model, or mathematical structure that characterizes a range of possible decision-making rules with adjustable parameters. A common analogy is that the model is a “box” that applies a rule, and the parameters are adjustable knobs on the front of the box that control how the box operates. In practice, a model might have many millions of parameters.
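The first two steps above, dividing the data and choosing a model, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy data, the 80/20 split, and the names `train_test_split` and `linear_model` are hypothetical, and the two-parameter straight line stands in for a model that in practice might have millions of parameters.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction as the test set."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # Everything after the first n_test items is used for training.
    return shuffled[n_test:], shuffled[:n_test]


def linear_model(params, x):
    """A tiny 'box' with two knobs: a slope and an intercept."""
    slope, intercept = params
    return slope * x + intercept


# Hypothetical historical data: (input, observed outcome) pairs.
data = [(x, 2.0 * x + 1.0) for x in range(10)]
train_set, test_set = train_test_split(data)
```

Turning the knobs, that is, passing a different `params` tuple, changes the rule that the box applies without changing the box itself.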
The practitioner also defines an objective function used to evaluate the desirability of the outcome that results from a particular choice of parameters. The objective function will typically contain parts that reward the model for closely matching the training set, as well as parts that reward the use of simpler rules.
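As a rough illustration, an objective function of this kind might look like the Python sketch below. It assumes a straight-line model with two parameters, measures fit by mean squared error, and treats "simpler" as "smaller parameter values". These are illustrative choices, not ones made in the quoted report, and the name `objective` and the weight `simplicity_weight` are hypothetical.

```python
def objective(params, training_set, simplicity_weight=0.01):
    """Score a choice of parameters; higher is better."""
    slope, intercept = params
    # Part 1: reward closely matching the training set (low squared error).
    fit_error = sum(
        (slope * x + intercept - y) ** 2 for x, y in training_set
    ) / len(training_set)
    # Part 2: reward simpler rules (assumed here to mean small parameters).
    complexity = slope ** 2 + intercept ** 2
    return -(fit_error + simplicity_weight * complexity)


# Hypothetical training data generated by the rule y = 2x + 1.
training_set = [(x, 2.0 * x + 1.0) for x in range(5)]
good_score = objective((2.0, 1.0), training_set)
poor_score = objective((0.0, 0.0), training_set)
```

Here `good_score` is higher than `poor_score`, because the first parameter choice matches the data exactly while the second, although simpler, fits it badly.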
Training the model is the process of adjusting the parameters to maximize the objective function. Training is the difficult technical step in machine learning. A model with millions of parameters will have astronomically more possible outcomes than any algorithm could ever hope to try, so successful training algorithms have to be clever in how they explore the space of parameter settings so as to find very good settings with a feasible level of computational effort.
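One widely used way to explore the parameter space efficiently is gradient descent, which repeatedly nudges each parameter in the direction that improves the objective. The quoted text does not name a particular training algorithm, so the Python sketch below is only one possible illustration. It assumes a two-parameter straight-line model and an objective equal to the negative mean squared error, so that maximizing the objective is the same as minimizing the error. The function name `train` and the learning-rate and step-count values are hypothetical.

```python
def train(training_set, learning_rate=0.01, steps=5000):
    """Fit a slope and an intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(training_set)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to each parameter.
        grad_slope = sum(
            2 * (slope * x + intercept - y) * x for x, y in training_set
        ) / n
        grad_intercept = sum(
            2 * (slope * x + intercept - y) for x, y in training_set
        ) / n
        # Step against the gradient: lower error means a higher objective.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical training data generated by the rule y = 2x + 1.
training_set = [(x, 2.0 * x + 1.0) for x in range(10)]
slope, intercept = train(training_set)
```

With this toy data the loop settles close to a slope of 2 and an intercept of 1 without ever enumerating the possible parameter settings.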
Once a model has been trained, the practitioner can use the test set to evaluate the accuracy and effectiveness of the model. The goal of machine learning is to create a trained model that will generalize — it will be accurate not only on examples in the training set, but also on future cases that it has never seen before. While many of these models can achieve better-than-human performance on narrow tasks such as image labeling, even the best models can fail in unpredictable ways. For example, for many image labeling models it is possible to create images that clearly appear to be random noise to a human but will be falsely labeled as a specific object with high confidence by a trained model.
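The evaluation step can be sketched in Python as a comparison between the error on the training set and the error on the held-out test set. The data, the already-trained parameters, and the name `mean_squared_error` are hypothetical values chosen for illustration. A small gap between the two errors suggests that the model generalizes, although, as noted above, it does not rule out unpredictable failures on unusual inputs.

```python
def mean_squared_error(params, examples):
    """Average squared difference between predictions and observed outcomes."""
    slope, intercept = params
    return sum(
        (slope * x + intercept - y) ** 2 for x, y in examples
    ) / len(examples)


# Hypothetical parameters produced by an earlier training step.
trained_params = (2.0, 1.0)

# Examples used during training, and unseen examples held out for testing.
train_set = [(x, 2.0 * x + 1.0) for x in range(8)]
test_set = [(8, 17.5), (9, 18.5)]

train_error = mean_squared_error(trained_params, train_set)
test_error = mean_squared_error(trained_params, test_set)
generalization_gap = test_error - train_error
```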
- "Overview" section: Preparing for the Future of Artificial Intelligence, at 8-9.
External resources
- Frank Chen, "AI, Deep Learning, and Machine Learning: A Primer," Andreessen Horowitz (June 10, 2016) (full-text).