Sound event detection is sensitive to network depth: increasing the depth degrades detection performance. Sound event localization, however, requires a deeper network. In this paper, the accuracy of joint sound event detection and localization is improved by decoupling SELD-TCN. The joint nature of the task is retained through the early fusion of low-level features and through the use of the sound event detection branch output as a mask for the direction-of-arrival (DOA) branch, which enhances generalization, while the high-level feature extraction and recognition of the two branches are...