The advantages of deep learning are best understood by comparison with its predecessor. The history of machine learning can be divided into stages according to the hierarchical structure of the models used. From the 1980s to the present, machine learning has gone through two stages: shallow learning and deep learning. <br>Shallow learning uses shallow structures. Most traditional machine learning and signal processing techniques rely on shallow structures (for example Gaussian mixture models, linear or nonlinear dynamical systems, conditional random fields, maximum entropy models, support vector machines, kernel regression, and multi-layer perceptrons). These structures usually contain at most one or two layers of nonlinear feature transformations and can be regarded as having one hidden layer or none. Shallow structures have proven effective on simple or well-constrained problems, but because their modeling and representation capabilities are limited, they struggle with more complex real-world applications such as natural speech signals, natural images, and visual scenes. <br>Deep learning uses deep networks, that is, networks with multiple hidden layers. By learning a deep nonlinear network, a model can approximate complex functions and thereby compute more complex features of its input. Since each hidden layer applies a nonlinear transformation to the output of the previous layer, a deep network has greater expressive power than a shallow one. For example, it can learn more complex functional relationships, and it shows the ability to learn the essential characteristics of the data from a small number of samples.
<br>The main advantage of a deep network is that it can represent a much larger set of functions than a traditional shallow network, and can do so more compactly: with multiple layers, complex functional relationships can be represented with fewer parameters. Note that every hidden layer of a deep network must use a nonlinear activation function, because a composition of linear functions is itself a linear function. If the activation function were linear, stacking more layers would add no expressive power over a network with a single hidden layer.<br>When the input is an image, a deep network can learn a "part-whole" decomposition. For example, the first layer can learn how to combine pixels to detect edges, and the second layer can combine edges to detect longer contours or simple "object parts." At deeper levels, these contours can be combined further to detect more complex features. <br>The essence of deep learning is to learn more useful features by building a model with multiple hidden layers and training it on massive amounts of data, thereby improving the accuracy of classification or prediction. In other words, the "deep model" is the means and "feature learning" is the purpose.
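The point about nonlinear activation functions can be checked numerically. The following is a minimal sketch in plain Python, not taken from the original text: the weight matrices `W1` and `W2`, the input `x`, and the helper names are all illustrative assumptions. It shows that two stacked layers with a linear (identity) activation give the same output as one layer whose weight matrix is the product `W2 * W1`, while inserting a ReLU between the layers breaks that equivalence.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def relu(v):
    """Elementwise ReLU, a common nonlinear activation."""
    return [max(0.0, t) for t in v]

# Hypothetical weights chosen by hand for illustration (biases omitted).
W1 = [[1.0, 0.0],
      [0.0, 1.0],
      [1.0, -1.0]]        # 2 inputs -> 3 hidden units
W2 = [[1.0, 1.0, 1.0]]    # 3 hidden units -> 1 output
x = [1.0, 2.0]

# Two layers with a linear (identity) activation in between.
deep_linear = matvec(W2, matvec(W1, x))

# A single layer using the product of the two weight matrices.
W_collapsed = matmul(W2, W1)
single_layer = matvec(W_collapsed, x)

# The same two layers with a ReLU between them.
deep_relu = matvec(W2, relu(matvec(W1, x)))

print("linear stack :", deep_linear)
print("single layer :", single_layer)
print("ReLU stack   :", deep_relu)
```

With these particular weights the third hidden unit receives a negative pre-activation, which ReLU sets to zero, so the nonlinear stack no longer matches any single linear map built from `W2 * W1`. Other inputs or weights may or may not show a difference for a single sample; the equivalence of the purely linear stack, however, holds for every input.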