Loss Function Sample Clauses
Loss Function. Each row of the resulting matrix X corresponds to a class in the dataset, and each column to the attention produced by one of the Self-Attention heads. We take the class whose vector has the greatest intensity as the prediction of the model. We apply the margin loss introduced in [45] to compute the prediction error:

$$L_{\text{margin}} = \sum_{c} \left[ T_c \max(0,\, m^{+} - \hat{x}_c)^2 + \lambda\,(1 - T_c) \max(0,\, \hat{x}_c - m^{-})^2 \right]$$

… feature maps and identify the entities present in the image. Each capsule in the Primary-Caps provides probability vectors with dimension $d_{\hat{u}}$. We obtain a tensor of …
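The margin loss described above can be sketched in plain Python for a single sample. This is only an illustration under several assumptions that the text does not confirm: x̂_c is treated as one scalar intensity per class, T_c is 1 for the true class and 0 otherwise, and the defaults m+ = 0.9, m- = 0.1 and λ = 0.5 are values commonly used with capsule networks, not values taken from this text. The names `margin_loss` and `predict` are hypothetical.

```python
def margin_loss(x_hat, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss for one sample, summed over classes.

    x_hat   -- per-class intensities (assumed scalars, one per class)
    targets -- T_c values: 1 if class c is the true class, else 0
    m_pos, m_neg, lam -- assumed defaults, not given in the source text
    """
    if len(x_hat) != len(targets):
        raise ValueError("x_hat and targets must have the same length")

    total = 0.0
    for x_c, t_c in zip(x_hat, targets):
        # Penalise the true class when its intensity falls below m_pos.
        present = t_c * max(0.0, m_pos - x_c) ** 2
        # Penalise other classes when their intensity exceeds m_neg,
        # down-weighted by lam.
        absent = lam * (1 - t_c) * max(0.0, x_c - m_neg) ** 2
        total += present + absent
    return total


def predict(x_hat):
    """Return the index of the class with the greatest intensity."""
    return max(range(len(x_hat)), key=lambda c: x_hat[c])
```

For example, `margin_loss([0.5, 0.3], [1, 0])` adds a penalty for the true class falling short of m+ and a smaller, λ-weighted penalty for the wrong class exceeding m-, while `predict([0.2, 0.7, 0.1])` selects the class with the largest intensity.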
