
Views: 202 | Replies: 4

[Resource] Deep learning terms

Posted on 2020-5-25 10:06:13

Last edited by wenluderen on 2020-5-25 10:09

    In the following, we describe the most important terms used in the context of deep learning:

    anchor
    Anchors are fixed bounding boxes. They serve as reference boxes, with the aid of which the network
    proposes bounding boxes for the objects to be localized.
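
As a minimal illustration (plain Python with NumPy, not HALCON code; the scales and aspect ratios are arbitrary example values), anchors of several sizes could be generated around one feature-map position like this:

    import numpy as np

    def make_anchors(cx, cy, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
        # Generate anchor boxes (x1, y1, x2, y2) centered at (cx, cy).
        anchors = []
        for s in scales:
            for r in ratios:
                w = s * np.sqrt(r)   # width grows with the aspect ratio
                h = s / np.sqrt(r)   # height shrinks accordingly
                anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
        return np.array(anchors)

    print(make_anchors(100, 100).shape)  # (9, 4): 3 scales x 3 ratios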


annotation
An annotation is the ground truth information about what a given instance in the data represents, in a form
recognizable for the network. In object detection, this is, e.g., the bounding box and the corresponding label
of an instance.


    anomaly
    An anomaly means something deviating from the norm, something unknown.


backbone
A backbone is a part of a pretrained classification network. Its task is to generate various feature maps,
which is why its classifying layer has been removed.

    batch size - hyperparameter
    ’batch_size’ The dataset is divided into smaller subsets of data, which are called
    batches. The batch size determines the number of images taken into a batch and thus processed simultaneously.


    bounding box
    Bounding boxes are rectangular boxes used to define a part within an image and to specify the
    localization of an object within an image.

class agnostic
Class agnostic means without the knowledge of the different classes. In HALCON, we use it for the reduction
of overlapping predicted bounding boxes: in a class-agnostic bounding box suppression, overlapping instances
are suppressed while ignoring the knowledge of classes, so strongly overlapping instances get suppressed
independently of their class.







OP | Posted on 2020-5-25 10:14:18
change strategy
A change strategy denotes the strategy of when and how hyperparameters are changed during the
training of a DL model.

class
Classes are discrete categories (e.g., ’apple’, ’peach’, ’pear’) that the network distinguishes. In HALCON,
the class of an instance is given by its corresponding annotation.

    classifier
    In the context of deep learning we refer to the term classifier as follows. The classifier takes an image
    as input and returns the inferred confidence values, expressing how likely the image belongs to every distinguished
    class. E.g., the three classes ’apple’, ’peach’, and ’pear’ are distinguished. Now we give an image
    of an apple to the classifier. As a result, the confidences ’apple’: 0.92, ’peach’: 0.07, and ’pear’: 0.01 could
    be returned.

    COCO

    COCO is an abbreviation for "common objects in context", a large-scale object detection, segmentation,
    and captioning dataset. There is a common file format for each of the different annotation types.

    confidence
    Confidence is a number expressing the affinity of an instance to a class. In HALCON the confidence
    is the probability, given in the range of [0,1]. Alternative name: score


    confusion matrix

    A confusion matrix is a table which compares the classes predicted by the network (top-1) with
    the ground truth class affiliations. It is often used to visualize the performance of the network on a validation
    or test set.
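
As an illustration (plain Python with NumPy; the class names and predictions are made-up example data), a confusion matrix can be accumulated like this:

    import numpy as np

    classes = ['apple', 'peach', 'pear']
    ground_truth = ['apple', 'apple', 'peach', 'pear', 'pear']
    predicted    = ['apple', 'peach', 'peach', 'pear', 'apple']  # top-1 predictions

    # rows: ground truth class, columns: predicted (top-1) class
    idx = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for gt, pr in zip(ground_truth, predicted):
        cm[idx[gt], idx[pr]] += 1
    print(cm)  # off-diagonal entries are confusions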


    Convolutional Neural Networks (CNNs)
    Convolutional Neural Networks are neural networks used in deep
    learning, characterized by the presence of at least one convolutional layer in the network. They are particularly
    successful for image classification.

    data
    We use the term data in the context of deep learning for instances to be recognized (e.g., images) and their
    appropriate information concerning the predictable characteristics (e.g., the labels in case of classification).



    data augmentation
    Data augmentation is the generation of altered copies of samples within a dataset. This is
    done in order to augment the richness of the dataset, e.g., through flipping or rotating.
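
A minimal sketch of two such alterations (horizontal flip and 90-degree rotation) on an image stored as a NumPy array; real augmentation pipelines typically also randomize crops, brightness, etc.:

    import numpy as np

    image = np.random.rand(64, 64, 3)   # dummy image: height x width x channels

    flipped = np.fliplr(image)          # mirror along the vertical axis
    rotated = np.rot90(image)           # rotate by 90 degrees counterclockwise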

dataset: training, validation, and test set
With dataset we refer to the complete set of data used for training; a simple example split is sketched
after the following list.
The dataset is split into three, if possible disjoint, subsets:
    • The training set contains the data on which the algorithm optimizes the network directly.
    • The validation set contains the data to evaluate the network performance during training.
    • The test set is used to test possible inferences (predictions), thus to test the performance on data without
    any influence on the network optimization.
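
A minimal sketch of such a split (plain Python; the 70/15/15 proportions are just an example choice, not a prescribed default):

    import random

    samples = list(range(1000))          # stand-ins for, e.g., image file names
    random.seed(0)                       # reproducible shuffling
    random.shuffle(samples)

    n_train = int(0.70 * len(samples))
    n_val   = int(0.15 * len(samples))
    train_set = samples[:n_train]
    val_set   = samples[n_train:n_train + n_val]
    test_set  = samples[n_train + n_val:]   # the remaining ~15%, disjoint from the rest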




    deep learning
    The term "deep learning" was originally used to describe the training of neural networks with
    multiple hidden layers. Today it is rather used as a generic term for several different concepts in machine
    learning. In HALCON, we use the term deep learning for methods using a neural network with multiple
    hidden layers.

    epoch
    In the context of deep learning, an epoch is a single training iteration over the entire training data, i.e., over
    all batches. Iterations over epochs should not be confused with the iterations over single batches (e.g., within
    an epoch).

errors
In the context of deep learning, we speak of an error when the inferred class of an instance does not match the
real class (e.g., the ground truth label in case of classification). Within HALCON, when we use the term error
in deep learning, we refer to the top-1 error.

feature map
A feature map is the output of a given layer.

feature pyramid
A feature pyramid is simply a group of feature maps, in which every feature map originates from
a different level, i.e., it is smaller than those of the preceding levels.


head
Heads are subnetworks. For certain architectures they attach to selected pyramid levels. These subnetworks
process information from previous parts of the overall network in order to generate spatially resolved output,
e.g., for the class predictions. From this they generate the output of the overall network and therewith
constitute the input of the losses.




OP | Posted on 2020-5-25 10:21:49
    hyperparameter
    Like every machine learning model, CNNs contain many formulas with many parameters. During
    training the model learns from the data in the sense of optimizing the parameters. However, such models
    can have other, additional parameters, which are not directly learned during the regular training. These
    parameters have values set before starting the training. We refer to this last type of parameters as hyperparameters
    in order to distinguish them from the network parameters that are optimized during training. Or
    from another point of view, hyperparameters are solver-specific parameters.
    Prominent examples are the initial learning rate or the batch size.

inference phase
The inference phase is the stage in which a trained network is applied to predict (infer) instances
(which can be the total input image or just a part of it) and possibly their localization. Unlike in the
training phase, the network is not changed anymore during the inference phase.

    intersection over union
    The intersection over union (IoU) is a measure to quantify the overlap of two areas. We
    can determine the parts common in both areas, the intersection, as well as the united areas, the union. The
    IoU is the ratio between the two areas intersection and union.
    The application of this concept may differ between the methods.
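
For axis-aligned bounding boxes given as (x1, y1, x2, y2), the IoU can be computed as in this sketch (plain Python; the helper name iou is chosen here for illustration):

    def iou(box_a, box_b):
        # Intersection over union of two (x1, y1, x2, y2) boxes.
        x1 = max(box_a[0], box_b[0])     # intersection rectangle
        y1 = max(box_a[1], box_b[1])
        x2 = min(box_a[2], box_b[2])
        y2 = min(box_a[3], box_b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143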

label
Labels are arbitrary strings used to define the class of an image. In HALCON, these labels are given by the
image name (possibly followed by a combination of underscore and digits) or by the folder name, e.g.,
’apple_01.png’, ’pear.png’, ’peach/01.png’.

    layer and hidden layer
    A layer is a building block in a neural network, thus performing specific tasks (e.g., convolution,
    pooling, etc., for further details we refer to the “Solution Guide on Classification”).
    It can be seen as a container, which receives weighted input, transforms it, and returns the output to the next
    layer. Input and output layers are connected to the dataset, i.e., the images or the labels, respectively. All
    layers in between are called hidden layers.

learning rate - hyperparameter
’learning_rate’ The learning rate is the weighting, with which the gradient (see
the entry for the stochastic gradient descent SGD) is considered when updating the arguments of the loss
function. In simple words, when we want to optimize a function, the gradient tells us the direction in which
we shall optimize and the learning rate determines how far along this direction we step.
Alternative names: λ, step size

level
Within a feature pyramid network, the term level denotes the whole group of layers whose feature
maps have the same width and height. The input image represents level 0.

loss
A loss function compares the prediction from the network with the given information about what it should find
in the image (and, if applicable, also where), and penalizes deviations. This loss function is the function we
optimize during the training process in order to adapt the network to a specific task.
Alternative names: objective function, cost function, utility function

momentum - hyperparameter
’momentum’ The momentum μ ∈ [0, 1) is used for the optimization of the loss
function arguments. When the loss function arguments are updated (after having calculated the gradient), a
fraction μ of the previous update vector (of the past iteration step) is added. This has the effect of damping
oscillations. We refer to the hyperparameter μ as momentum. When μ is set to 0, the momentum method has
no influence. In simple words, when we update the loss function arguments, we still remember the step we
did for the last update. Now we go a step in direction of the gradient with a length according to the learning
rate and additionally we repeat the step we did last time, but this time only μ times as long.
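
Putting the learning rate λ and the momentum μ together, one update step of the loss function arguments could look like the following sketch (generic SGD with momentum in plain Python/NumPy; a common textbook formulation, not necessarily the exact HALCON implementation):

    import numpy as np

    def sgd_momentum_step(w, grad, v, lr=0.001, mu=0.9):
        # v is the remembered update vector of the previous iteration.
        v_new = mu * v - lr * grad   # repeat mu times the last step, plus a new gradient step
        return w + v_new, v_new

    w = np.array([1.0, -2.0])        # loss function arguments (weights)
    v = np.zeros_like(w)             # no previous step yet
    grad = np.array([0.5, -0.5])     # gradient of the loss w.r.t. w
    w, v = sgd_momentum_step(w, grad, v)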


    non-maximum suppression
    In object detection, non-maximum suppression is used to suppress overlapping predicted
    bounding boxes. When different instances overlap more than a given threshold value, only the one
    with the highest confidence value is kept while the other instances, not having the maximum confidence
    value, are suppressed.
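
A sketch of greedy, class-agnostic non-maximum suppression in plain Python, reusing the iou helper from the intersection over union entry above (the 0.5 threshold is an illustrative value):

    def nms(boxes, confidences, iou_threshold=0.5):
        # Keep the most confident box, drop boxes overlapping it too much, repeat.
        order = sorted(range(len(boxes)), key=lambda i: confidences[i], reverse=True)
        keep = []
        while order:
            best = order.pop(0)          # highest remaining confidence
            keep.append(best)
            order = [i for i in order if iou(boxes[i], boxes[best]) <= iou_threshold]
        return keep                      # indices of the surviving boxes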

    overfitting
    Overfitting happens when the network starts to ’memorize’ training data instead of learning how to
    find general rules for the classification. This becomes visible when the model continues to minimize error on
    the training set but the error on the validation set increases. Since most neural networks have a huge amount
    of weights, these networks are particularly prone to overfitting.



OP | Posted on 2020-5-25 10:24:08
regularization - hyperparameter
’weight_prior’ Regularization is a technique to prevent neural networks from
overfitting by adding an extra term to the loss function. It works by penalizing large weights, i.e., pushing
the weights towards zero. Simply put, regularization favors simpler models that are less likely to fit to
noise in the training data and generalize better. In HALCON, regularization is controlled via the parameter
’weight_prior’.
Alternative names: regularization parameter, weight decay parameter, λ (note that in HALCON we use λ
for the learning rate and within formulas the symbol α for the regularization parameter).
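
Schematically, with α denoting the regularization parameter and w the weights, the extra term is commonly the squared L2 norm of the weights (a standard formulation written in LaTeX notation; the exact internal formula is not spelled out here):

    L_{\text{total}}(w) = L_{\text{data}}(w) + \frac{\alpha}{2} \lVert w \rVert_2^2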


retraining
We define retraining as updating the weights of an already pretrained network, i.e., during retraining
the network learns the specific task.
Alternative name: fine-tuning.


solver
The solver optimizes the network by updating the weights in a way that optimizes (i.e., minimizes) the loss
function.

stochastic gradient descent (SGD)
SGD is an iterative optimization algorithm for differentiable functions. In
deep learning, we use this algorithm to calculate the gradient to optimize (i.e., minimize) the loss function.
A key feature of SGD is that the gradient is calculated only on the basis of a single batch containing
stochastically sampled data, and not on all the data.

    top-k error
    The classifier infers for a given image class confidences of how likely the image belongs to every
    distinguished class. Thus, for an image we can sort the predicted classes according to the confidence value
    the classifier assigned. The top-k error tells the ratio of predictions where the ground truth class is not
    within the k predicted classes with highest probability. In the case of top-1 error, we check if the target label
    matches the prediction with the highest probability. In the case of top-3 error, we check if the target label
    matches one of the top 3 predictions (the 3 labels getting the highest probability for this image).
    Alternative names: top-k score
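
A sketch of computing the top-k error from per-image confidence values (plain Python; the confidences reuse the apple/peach/pear example and are made up):

    def top_k_error(confidences, ground_truth, k):
        # Fraction of samples whose true class is not among the k most confident predictions.
        misses = 0
        for conf, gt in zip(confidences, ground_truth):
            top_k = sorted(conf, key=conf.get, reverse=True)[:k]
            if gt not in top_k:
                misses += 1
        return misses / len(ground_truth)

    confs = [{'apple': 0.92, 'peach': 0.07, 'pear': 0.01},
             {'apple': 0.40, 'peach': 0.45, 'pear': 0.15}]
    truth = ['apple', 'apple']
    print(top_k_error(confs, truth, k=1))  # 0.5: the second image is misclassified as 'peach'
    print(top_k_error(confs, truth, k=2))  # 0.0: 'apple' is within the top 2 both times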


transfer learning
Transfer learning refers to the technique in which a network is built upon the knowledge of an
already existing network. In concrete terms, this means taking an already (pre)trained network with its
weights and adapting the output layer to the respective application in order to get your network. In HALCON,
we also regard the subsequent retraining step as part of transfer learning.

underfitting
Underfitting occurs when the model over-generalizes. In other words, it is not able to describe the
complexity of the task. This is directly reflected in the error on the training set, which does not decrease
significantly.


weights
In general, weights are the free parameters of the network, which are altered during training through the
optimization of the loss. A layer with weights multiplies its input values by them or adds them. In contrast
to hyperparameters, weights are optimized and thus changed during training.



Posted on 2020-5-25 19:06:05
This would be perfect with a Chinese version of the introduction.

