SigmoidFocalLoss [Bug] #2906
Comments
I am currently working on fixing the softmax focal algorithm in #2893 as an outside contributor. If I understand your issue correctly, you are confusing this with the binary classification task. This is actually a very tricky part. The PyTorch implementation mentioned above actually drops the background class (you can see a weird In short,
The torchvision implementation is correct for both You can imagine that the I would strongly recommend using a 2-channel input for your case if you are working on object detection, since the 2-channel input solution is common there. However, it is also possible to add a specialized kernel function for
Prerequisite
Environment
Hello,
thank you for a great library.
However, there looks to be a bug in SigmoidFocalLoss:
OrderedDict([('sys.platform', 'linux'), ('Python', '3.10.12 (main, Jul 5 2023, 18:54:27) [GCC 11.2.0]'), ('CUDA available', True), ('numpy_random_seed', 2147483648), ('GPU 0,1,2,3,4,5,6,7', 'NVIDIA A40'), ('CUDA_HOME', '/usr/local/cuda'), ('NVCC', 'Cuda compilation tools, release 12.2, V12.2.91'), ('GCC', 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0'), ('PyTorch', '2.0.1'), ('PyTorch compiling details', 'PyTorch built with:\n - GCC 9.3\n - C++ Version: 201703\n - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications\n - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)\n - OpenMP 201511 (a.k.a. OpenMP 4.5)\n - LAPACK is enabled (usually provided by MKL)\n - NNPACK is enabled\n - CPU capability usage: AVX2\n - CUDA Runtime 11.8\n - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_37,code=compute_37\n - CuDNN 8.7\n - Magma 2.6.1\n - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing 
-Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n'), ('TorchVision', '0.15.2'), ('OpenCV', '4.8.0'), ('MMEngine', '0.8.2'), ('MMCV', '2.0.1'), ('MMCV Compiler', 'GCC 9.3'), ('MMCV CUDA Compiler', '11.8')])
I've checked https://github.com/open-mmlab/mmcv/blob/main/mmcv/ops/csrc/common/cuda/sigmoid_focal_loss_cuda_kernel.cuh
With num_classes=1 the class index c is always 0, so for a sigmoid-style target the roles of flag_p and flag_n are swapped (a target of 1 is treated as the negative case):
T flag_p = (t == c);
T flag_n = (t != c);
For a sigmoid-style (single-channel) target, num_classes is 1, since it is taken from the input shape in https://github.com/open-mmlab/mmcv/blob/main/mmcv/ops/csrc/pytorch/cuda/focal_loss_cuda.cu
(int num_classes = input.size(1);)
BR
Reproduces the problem - code sample
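import torch
from mmcv.ops import SigmoidFocalLoss

loss = SigmoidFocalLoss(alpha=-1, gamma=2)
# Strongly negative logit with target 1: a large loss is expected, but 0 is returned
loss(torch.tensor([[-1000.]]).cuda(), torch.tensor([1]).cuda())
# tensor(0., device='cuda:0')
# Strongly positive logit with target 1: a loss near 0 is expected, but a large one is returned
loss(torch.tensor([[1000.]]).cuda(), torch.tensor([1]).cuda())
# tensor(174.6731, device='cuda:0')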
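The two outputs above can be traced by hand with a small pure-Python model of the per-element computation. This is only a sketch. The `flag_p` / `flag_n` lines are the ones quoted from the kernel, but the rest is an assumption about what the kernel does: the sigmoid, the `term_p` / `term_n` formulas, the `FLT_MIN` clamp inside the log, and summing over channels for a single sample. The names `focal_loss_ref` and `sigmoid` are hypothetical helpers, not part of mmcv.

```python
import math

# Smallest positive normal float32; assumed to be the clamp used inside log().
FLT_MIN = 1.17549435e-38


def sigmoid(x):
    # Numerically stable sigmoid so that logits of +/-1000 do not overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def focal_loss_ref(logits, target, alpha, gamma):
    """Loss for one sample, summed over channels (hypothetical reference model)."""
    total = 0.0
    for c, x in enumerate(logits):
        flag_p = float(target == c)  # channel c is the target class
        flag_n = float(target != c)  # channel c is not the target class
        p = sigmoid(x)
        term_p = (1.0 - p) ** gamma * math.log(max(p, FLT_MIN))
        term_n = p ** gamma * math.log(max(1.0 - p, FLT_MIN))
        total += -flag_p * alpha * term_p - flag_n * (1.0 - alpha) * term_n
    return total


# Single channel, target 1: c is always 0, so target 1 never matches any channel.
print(focal_loss_ref([-1000.0], 1, alpha=-1, gamma=2))
print(focal_loss_ref([1000.0], 1, alpha=-1, gamma=2))

# Two channels, target 1: channel 1 is the positive one.
print(focal_loss_ref([-1000.0, 1000.0], 1, alpha=-1, gamma=2))
```

Under these assumptions the single-channel calls give about 0 and about 174.67, matching the reported tensors, because a target of 1 is always counted as a negative when there is only channel 0. In this model a single-channel target of 0 is the one treated as the positive label, and with a 2-channel input a target of 1 selects channel 1 as expected.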
Reproduces the problem - command or script
none
Reproduces the problem - error message
none
Additional information
No response