Then, the feature map is passed through a localization network consisting of two convolution layers to obtain the parameter θ, where the two layers of the localization network have sizes 1 × 1 × 2048 and 1 × 1 × 6, respectively.
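The localization step described above can be illustrated with a minimal NumPy sketch. Only the two layer sizes (1 × 1 × 2048 and 1 × 1 × 6) come from the text. Everything else is an assumption made for illustration: the input feature map shape (64 × 7 × 7), the ReLU between the two layers, the global average pooling that reduces the 6-channel output to a single vector, the reading of θ as a 2 × 3 affine matrix, and the identity initialization of the last layer. The helper names `conv1x1` and `localization_theta` are hypothetical.

```python
import numpy as np

def conv1x1(x, weight, bias):
    """1x1 convolution on a (C_in, H, W) map, i.e. a per-pixel linear layer."""
    # weight has shape (C_out, C_in); bias has shape (C_out,)
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def localization_theta(feature_map, w1, b1, w2, b2):
    """Two 1x1 convolution layers (2048 and 6 output channels) producing theta."""
    hidden = np.maximum(conv1x1(feature_map, w1, b1), 0.0)  # 1x1x2048 layer, ReLU assumed
    out = conv1x1(hidden, w2, b2)                           # 1x1x6 layer
    theta = out.mean(axis=(1, 2))                           # assumed global average pooling
    return theta.reshape(2, 3)                              # assumed 2x3 affine layout

rng = np.random.default_rng(0)
c_in, height, width = 64, 7, 7                   # hypothetical feature map shape
feature_map = rng.standard_normal((c_in, height, width))

w1 = rng.standard_normal((2048, c_in)) * 0.01    # first layer: 1 x 1 x 2048
b1 = np.zeros(2048)
w2 = np.zeros((6, 2048))                         # second layer: 1 x 1 x 6
b2 = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])    # assumed identity-transform initialization

theta = localization_theta(feature_map, w1, b1, w2, b2)
print(theta)
```

With the zero weights and identity bias assumed here for the last layer, θ starts out as the identity affine transform regardless of the input; in a trained network both layers would carry learned weights and θ would depend on the feature map.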