An error occurs when calculating the loss function: the input and target batch_size stop matching after about 20 epochs. What is going on? #20

Open
jianGao555 opened this issue Apr 11, 2024 · 1 comment

Comments

@jianGao555

/home/gj/anaconda3/envs/growsp/bin/python /home/gj/GrowSP-main/train_S3DIS.py
/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/__init__.py:36: UserWarning: The environment variable OMP_NUM_THREADS not set. MinkowskiEngine will automatically set OMP_NUM_THREADS=16. If you want to set OMP_NUM_THREADS manually, please export it on the command line before running a python script. e.g. export OMP_NUM_THREADS=12; python your_program.py. It is recommended to set it below 24.
warnings.warn(
/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/sklearn/utils/linear_assignment_.py:18: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
warnings.warn(
--- Logging error ---
Traceback (most recent call last):
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 1085, in emit
msg = self.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 929, in format
return fmt.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 668, in format
record.message = record.getMessage()
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 373, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
File "/home/gj/GrowSP-main/train_S3DIS.py", line 306, in <module>
main(args, logger)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 70, in main
logger.info('Training Areas', training_areas)
Message: 'Training Areas'
Arguments: (['Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6'],)
--- Logging error ---
Traceback (most recent call last):
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 1085, in emit
msg = self.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 929, in format
return fmt.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 668, in format
record.message = record.getMessage()
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 373, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
File "/home/gj/GrowSP-main/train_S3DIS.py", line 306, in <module>
main(args, logger)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 70, in main
logger.info('Training Areas', training_areas)
Message: 'Training Areas'
Arguments: (['Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6'],)
Res16FPN18(
(conv0p1s1): MinkowskiConvolution(in=6, out=32, kernel_size=[5, 5, 5], stride=[1, 1, 1], dilation=[1, 1, 1])
(bn0): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(conv1p1s2): MinkowskiConvolution(in=32, out=32, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block1): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(conv2p2s2): MinkowskiConvolution(in=32, out=32, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block2): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=32, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=32, out=64, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(conv3p4s2): MinkowskiConvolution(in=64, out=64, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn3): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block3): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=64, out=128, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(conv4p8s2): MinkowskiConvolution(in=128, out=128, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn4): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block4): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=128, out=256, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(delayer1): MinkowskiLinear(in_features=256, out_features=128, bias=False)
(delayer2): MinkowskiLinear(in_features=128, out_features=128, bias=False)
(delayer3): MinkowskiLinear(in_features=64, out_features=128, bias=False)
(delayer4): MinkowskiLinear(in_features=32, out_features=128, bias=False)
(relu): MinkowskiReLU()
)
computing point feats ....
computing pseduo labels...
labelled points ratio 0.84 clustering time: 48.50s
Superpoints oAcc 93.18 IoUs| mIoU 81.46 | 94.94 91.99 85.66 86.60 74.83 78.93 77.80 88.35 84.25 88.16 88.72 37.30
Primitives oAcc 59.55 IoUs| mIoU 22.61 | 57.05 46.03 53.70 1.13 0.00 18.02 23.56 19.41 30.60 0.00 21.87 0.00
Train Epoch: 1 [20/21 (95%)]20, Loss: 5.6270502567, lr: 9.982e-02, Elapsed time: 10.4020s(20 iters)
Train Epoch: 2 [20/21 (95%)]41, Loss: 5.5789418936, lr: 9.963e-02, Elapsed time: 9.9852s(20 iters)
Train Epoch: 3 [20/21 (95%)]62, Loss: 5.5750957489, lr: 9.944e-02, Elapsed time: 10.0892s(20 iters)
Train Epoch: 4 [20/21 (95%)]83, Loss: 5.5687583208, lr: 9.925e-02, Elapsed time: 9.9891s(20 iters)
Train Epoch: 5 [20/21 (95%)]104, Loss: 5.5683251858, lr: 9.906e-02, Elapsed time: 10.4144s(20 iters)
Train Epoch: 6 [20/21 (95%)]125, Loss: 5.5693419456, lr: 9.887e-02, Elapsed time: 10.3591s(20 iters)
Train Epoch: 7 [20/21 (95%)]146, Loss: 5.5645267963, lr: 9.869e-02, Elapsed time: 9.8720s(20 iters)
Train Epoch: 8 [20/21 (95%)]167, Loss: 5.5651976585, lr: 9.850e-02, Elapsed time: 10.2304s(20 iters)
Train Epoch: 9 [20/21 (95%)]188, Loss: 5.5588910818, lr: 9.831e-02, Elapsed time: 10.5668s(20 iters)
Train Epoch: 10 [20/21 (95%)]209, Loss: 5.5650004387, lr: 9.812e-02, Elapsed time: 10.0282s(20 iters)
Merging Primitives
Epoch: 10, oAcc 52.05 mAcc 22.16 IoUs| mIoU 15.79 | 85.25 37.50 42.24 0.02 0.00 1.10 0.64 11.51 3.31 0.03 7.89 0.00
computing point feats ....
computing pseduo labels...
labelled points ratio 0.84 clustering time: 48.08s
Superpoints oAcc 93.18 IoUs| mIoU 81.46 | 94.94 91.99 85.66 86.60 74.83 78.93 77.80 88.35 84.25 88.16 88.72 37.30
Primitives oAcc 75.04 IoUs| mIoU 34.88 | 87.34 83.06 65.95 11.11 0.00 26.08 34.00 37.87 41.50 2.70 28.98 0.00
Train Epoch: 11 [20/21 (95%)]230, Loss: 4.6528557539, lr: 9.793e-02, Elapsed time: 10.1864s(20 iters)
Train Epoch: 12 [20/21 (95%)]251, Loss: 4.6018107891, lr: 9.774e-02, Elapsed time: 10.6370s(20 iters)
Train Epoch: 13 [20/21 (95%)]272, Loss: 4.6047326326, lr: 9.755e-02, Elapsed time: 9.9426s(20 iters)
Train Epoch: 14 [20/21 (95%)]293, Loss: 4.5702477932, lr: 9.736e-02, Elapsed time: 10.1286s(20 iters)
Train Epoch: 15 [20/21 (95%)]314, Loss: 4.5666304827, lr: 9.717e-02, Elapsed time: 10.1867s(20 iters)
Train Epoch: 16 [20/21 (95%)]335, Loss: 4.5555877209, lr: 9.698e-02, Elapsed time: 10.0583s(20 iters)
Train Epoch: 17 [20/21 (95%)]356, Loss: 4.5542288780, lr: 9.679e-02, Elapsed time: 10.3939s(20 iters)
Train Epoch: 18 [20/21 (95%)]377, Loss: 4.5525953531, lr: 9.660e-02, Elapsed time: 10.1124s(20 iters)
Train Epoch: 19 [20/21 (95%)]398, Loss: 4.5360535145, lr: 9.641e-02, Elapsed time: 10.4379s(20 iters)
Train Epoch: 20 [20/21 (95%)]419, Loss: 4.5468521357, lr: 9.622e-02, Elapsed time: 10.1236s(20 iters)
Merging Primitives
Epoch: 20, oAcc 67.55 mAcc 31.43 IoUs| mIoU 24.90 | 83.22 85.89 58.02 0.03 0.58 5.07 0.85 29.90 24.17 1.51 9.55 0.03
computing point feats ....
computing pseduo labels...
labelled points ratio 0.84 clustering time: 46.09s
Superpoints oAcc 93.18 IoUs| mIoU 81.46 | 94.94 91.99 85.66 86.60 74.83 78.93 77.80 88.35 84.25 88.16 88.72 37.30
Primitives oAcc 77.39 IoUs| mIoU 38.31 | 89.22 87.58 66.09 17.49 0.00 25.00 29.24 49.12 55.40 2.18 38.37 0.00
Traceback (most recent call last):
File "/home/gj/GrowSP-main/train_S3DIS.py", line 306, in <module>
main(args, logger)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 93, in main
train(train_loader, logger, model, optimizer, loss, epoch, scheduler, classifier)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 212, in train
loss_sem = loss(logits * 3, pseudo_labels_comp).mean()
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 1163, in forward
return F.cross_entropy(input, target, weight=self.weight,
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/torch/nn/functional.py", line 2996, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
ValueError: Expected input batch_size (316986) to match target batch_size (316985).

Process finished with exit code 1
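
A side note on the two "Logging error" blocks near the top of the log: they are unrelated to the crash. Python's `logging` module applies %-style formatting to the message using any extra arguments, so `logger.info('Training Areas', training_areas)` raises `TypeError: not all arguments converted during string formatting` because the message has no placeholder for the list. A minimal sketch of the usual fix follows. The logger name and the in-memory stream are hypothetical and only there to keep the example self-contained; the message text and the list are taken from the log above.

```python
import io
import logging

# Hypothetical logger name and in-memory stream, used only for illustration.
stream = io.StringIO()
logger = logging.getLogger("growsp_logging_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)

training_areas = ['Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6']

# The %s placeholder consumes the extra argument, so formatting succeeds.
logger.info('Training Areas: %s', training_areas)

logged = stream.getvalue()
print(logged, end="")
```

Adding the `%s` placeholder to the call at line 70 of `train_S3DIS.py` should silence these tracebacks without changing training behaviour.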

@zhang-zihui
Contributor

We have never had this problem. Did you modify the code?
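
Whether or not the code was modified, one way to narrow this down is to compare the two lengths just before the loss call at line 212 of `train_S3DIS.py`, so the failure reports which batch is off and by how much (the traceback shows 316986 logit rows against 316985 labels). The helper below is a hypothetical sketch, not part of GrowSP. It relies only on `len()`, which also works on PyTorch tensors with at least one dimension, and it does not identify the root cause on its own.

```python
def check_batch_alignment(logits, targets, name="batch"):
    """Raise a descriptive error if the inputs differ in their first dimension.

    Hypothetical debugging helper: `logits` and `targets` can be any objects
    that support len(), for example lists or PyTorch tensors.
    """
    n_logits, n_targets = len(logits), len(targets)
    if n_logits != n_targets:
        raise ValueError(
            f"{name}: {n_logits} logit rows vs {n_targets} target labels "
            f"(difference {n_logits - n_targets})"
        )
    return n_logits


# Toy usage with plain lists standing in for the real tensors.
toy_logits = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
toy_labels = [1, 0, 1]
n_points = check_batch_alignment(toy_logits, toy_labels, name="toy batch")
```

In the training loop, this would be called as `check_batch_alignment(logits, pseudo_labels_comp)` right before `loss(logits * 3, pseudo_labels_comp)`. Printing the scene or file names of the offending batch at that point should show whether a single scene produces the off-by-one point.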
