The loss calculation fails with a batch_size mismatch after about 20 epochs. What's going on? #20

jianGao555 · 2024-04-11T12:03:10Z

/home/gj/anaconda3/envs/growsp/bin/python /home/gj/GrowSP-main/train_S3DIS.py
/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/__init__.py:36: UserWarning: The environment variable OMP_NUM_THREADS not set. MinkowskiEngine will automatically set OMP_NUM_THREADS=16. If you want to set OMP_NUM_THREADS manually, please export it on the command line before running a python script. e.g. export OMP_NUM_THREADS=12; python your_program.py. It is recommended to set it below 24.
warnings.warn(
/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/sklearn/utils/linear_assignment_.py:18: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
warnings.warn(
--- Logging error ---
Traceback (most recent call last):
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 1085, in emit
msg = self.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 929, in format
return fmt.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 668, in format
record.message = record.getMessage()
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 373, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
File "/home/gj/GrowSP-main/train_S3DIS.py", line 306, in <module>
main(args, logger)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 70, in main
logger.info('Training Areas', training_areas)
Message: 'Training Areas'
Arguments: (['Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6'],)
--- Logging error ---
Traceback (most recent call last):
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 1085, in emit
msg = self.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 929, in format
return fmt.format(record)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 668, in format
record.message = record.getMessage()
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/logging/__init__.py", line 373, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
File "/home/gj/GrowSP-main/train_S3DIS.py", line 306, in <module>
main(args, logger)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 70, in main
logger.info('Training Areas', training_areas)
Message: 'Training Areas'
Arguments: (['Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6'],)
Res16FPN18(
(conv0p1s1): MinkowskiConvolution(in=6, out=32, kernel_size=[5, 5, 5], stride=[1, 1, 1], dilation=[1, 1, 1])
(bn0): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(conv1p1s2): MinkowskiConvolution(in=32, out=32, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block1): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=32, out=32, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(conv2p2s2): MinkowskiConvolution(in=32, out=32, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block2): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=32, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=32, out=64, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(conv3p4s2): MinkowskiConvolution(in=64, out=64, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn3): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block3): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=64, out=128, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(conv4p8s2): MinkowskiConvolution(in=128, out=128, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(bn4): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
(block4): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=128, out=256, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(delayer1): MinkowskiLinear(in_features=256, out_features=128, bias=False)
(delayer2): MinkowskiLinear(in_features=128, out_features=128, bias=False)
(delayer3): MinkowskiLinear(in_features=64, out_features=128, bias=False)
(delayer4): MinkowskiLinear(in_features=32, out_features=128, bias=False)
(relu): MinkowskiReLU()
)
computing point feats ....
computing pseduo labels...
labelled points ratio 0.84 clustering time: 48.50s
Superpoints oAcc 93.18 IoUs| mIoU 81.46 | 94.94 91.99 85.66 86.60 74.83 78.93 77.80 88.35 84.25 88.16 88.72 37.30
Primitives oAcc 59.55 IoUs| mIoU 22.61 | 57.05 46.03 53.70 1.13 0.00 18.02 23.56 19.41 30.60 0.00 21.87 0.00
Train Epoch: 1 [20/21 (95%)]20, Loss: 5.6270502567, lr: 9.982e-02, Elapsed time: 10.4020s(20 iters)
Train Epoch: 2 [20/21 (95%)]41, Loss: 5.5789418936, lr: 9.963e-02, Elapsed time: 9.9852s(20 iters)
Train Epoch: 3 [20/21 (95%)]62, Loss: 5.5750957489, lr: 9.944e-02, Elapsed time: 10.0892s(20 iters)
Train Epoch: 4 [20/21 (95%)]83, Loss: 5.5687583208, lr: 9.925e-02, Elapsed time: 9.9891s(20 iters)
Train Epoch: 5 [20/21 (95%)]104, Loss: 5.5683251858, lr: 9.906e-02, Elapsed time: 10.4144s(20 iters)
Train Epoch: 6 [20/21 (95%)]125, Loss: 5.5693419456, lr: 9.887e-02, Elapsed time: 10.3591s(20 iters)
Train Epoch: 7 [20/21 (95%)]146, Loss: 5.5645267963, lr: 9.869e-02, Elapsed time: 9.8720s(20 iters)
Train Epoch: 8 [20/21 (95%)]167, Loss: 5.5651976585, lr: 9.850e-02, Elapsed time: 10.2304s(20 iters)
Train Epoch: 9 [20/21 (95%)]188, Loss: 5.5588910818, lr: 9.831e-02, Elapsed time: 10.5668s(20 iters)
Train Epoch: 10 [20/21 (95%)]209, Loss: 5.5650004387, lr: 9.812e-02, Elapsed time: 10.0282s(20 iters)
Merging Primitives
Epoch: 10, oAcc 52.05 mAcc 22.16 IoUs| mIoU 15.79 | 85.25 37.50 42.24 0.02 0.00 1.10 0.64 11.51 3.31 0.03 7.89 0.00
computing point feats ....
computing pseduo labels...
labelled points ratio 0.84 clustering time: 48.08s
Superpoints oAcc 93.18 IoUs| mIoU 81.46 | 94.94 91.99 85.66 86.60 74.83 78.93 77.80 88.35 84.25 88.16 88.72 37.30
Primitives oAcc 75.04 IoUs| mIoU 34.88 | 87.34 83.06 65.95 11.11 0.00 26.08 34.00 37.87 41.50 2.70 28.98 0.00
Train Epoch: 11 [20/21 (95%)]230, Loss: 4.6528557539, lr: 9.793e-02, Elapsed time: 10.1864s(20 iters)
Train Epoch: 12 [20/21 (95%)]251, Loss: 4.6018107891, lr: 9.774e-02, Elapsed time: 10.6370s(20 iters)
Train Epoch: 13 [20/21 (95%)]272, Loss: 4.6047326326, lr: 9.755e-02, Elapsed time: 9.9426s(20 iters)
Train Epoch: 14 [20/21 (95%)]293, Loss: 4.5702477932, lr: 9.736e-02, Elapsed time: 10.1286s(20 iters)
Train Epoch: 15 [20/21 (95%)]314, Loss: 4.5666304827, lr: 9.717e-02, Elapsed time: 10.1867s(20 iters)
Train Epoch: 16 [20/21 (95%)]335, Loss: 4.5555877209, lr: 9.698e-02, Elapsed time: 10.0583s(20 iters)
Train Epoch: 17 [20/21 (95%)]356, Loss: 4.5542288780, lr: 9.679e-02, Elapsed time: 10.3939s(20 iters)
Train Epoch: 18 [20/21 (95%)]377, Loss: 4.5525953531, lr: 9.660e-02, Elapsed time: 10.1124s(20 iters)
Train Epoch: 19 [20/21 (95%)]398, Loss: 4.5360535145, lr: 9.641e-02, Elapsed time: 10.4379s(20 iters)
Train Epoch: 20 [20/21 (95%)]419, Loss: 4.5468521357, lr: 9.622e-02, Elapsed time: 10.1236s(20 iters)
Merging Primitives
Epoch: 20, oAcc 67.55 mAcc 31.43 IoUs| mIoU 24.90 | 83.22 85.89 58.02 0.03 0.58 5.07 0.85 29.90 24.17 1.51 9.55 0.03
computing point feats ....
computing pseduo labels...
labelled points ratio 0.84 clustering time: 46.09s
Superpoints oAcc 93.18 IoUs| mIoU 81.46 | 94.94 91.99 85.66 86.60 74.83 78.93 77.80 88.35 84.25 88.16 88.72 37.30
Primitives oAcc 77.39 IoUs| mIoU 38.31 | 89.22 87.58 66.09 17.49 0.00 25.00 29.24 49.12 55.40 2.18 38.37 0.00
Traceback (most recent call last):
File "/home/gj/GrowSP-main/train_S3DIS.py", line 306, in <module>
main(args, logger)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 93, in main
train(train_loader, logger, model, optimizer, loss, epoch, scheduler, classifier)
File "/home/gj/GrowSP-main/train_S3DIS.py", line 212, in train
loss_sem = loss(logits * 3, pseudo_labels_comp).mean()
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 1163, in forward
return F.cross_entropy(input, target, weight=self.weight,
File "/home/gj/anaconda3/envs/growsp/lib/python3.8/site-packages/torch/nn/functional.py", line 2996, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
ValueError: Expected input batch_size (316986) to match target batch_size (316985).

Process finished with exit code 1
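
The two `--- Logging error ---` blocks near the top of the log look unrelated to the crash. `logger.info('Training Areas', training_areas)` passes an extra argument, but the message has no `%s` placeholder, so `msg % self.args` raises the `TypeError`. A minimal sketch of the usual fix, assuming the intent is only to print the list (the logger name and the list-collecting handler are illustrative and not part of GrowSP):

```python
import logging


class ListHandler(logging.Handler):
    """Collects rendered log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() applies the %-style arguments to the message string.
        self.messages.append(record.getMessage())


logger = logging.getLogger("growsp_logging_example")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

training_areas = ['Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6']

# The message needs one %s placeholder per extra argument.
logger.info('Training Areas: %s', training_areas)
```

With the placeholder in place, the record renders as a single line containing the list instead of triggering the logging error.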


zhang-zihui · 2024-12-03T07:45:21Z

We have never had this problem. Did you modify the code?
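
One way to narrow this down is to compare the number of rows in `logits` and `pseudo_labels_comp` just before the loss call at line 212 of `train_S3DIS.py`, so the failing batch is reported with some context. A minimal sketch, assuming both objects support `len()` (PyTorch tensors do); the helper name `check_row_alignment` is hypothetical and not part of GrowSP:

```python
def check_row_alignment(logits, targets, context=""):
    """Raise a descriptive error if logits and targets differ in row count.

    Works with any objects that support len(), for example PyTorch tensors,
    NumPy arrays, or plain lists. Returns the shared row count on success.
    """
    n_logits = len(logits)
    n_targets = len(targets)
    if n_logits != n_targets:
        raise ValueError(
            f"Row mismatch {context}: logits has {n_logits} rows, "
            f"targets has {n_targets} rows"
        )
    return n_logits


# Demonstration with plain lists standing in for tensors.
example_logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
example_targets = [1, 0, 1]
rows = check_row_alignment(example_logits, example_targets, context="(demo batch)")
```

In the training loop this could be called as `check_row_alignment(logits, pseudo_labels_comp, context=f"epoch {epoch}")` right before `loss(...)`. It does not fix the off-by-one, but it shows whether the mismatch appears only after the pseudo labels are recomputed.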
