常见问题及解决方案

1. matplotlib无法正常显示中文

Windows环境下添加以下代码

# 中文显示
from pylab import mpl
# 设置显示中文字体
mpl.rcParams["font.sans-serif"] = ["SimHei"]
# 设置正常显示符号
mpl.rcParams["axes.unicode_minus"] = False

Mac环境下添加以下代码

zhfont=mpl.font_manager.FontProperties(fname="/Library/Fonts/Songti.ttc")

Linux环境下参考https://www.yuque.com/fcant/python/ptzo53

2. TensorFlow 显示过多无用log

添加以下代码

import os
import warnings

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
warnings.filterwarnings("ignore")

log level 可设置1-3

3即除报错以外，不在控制台打印log

3. TensorFlow在使用tf.keras.models.load_model()时报“AttributeError: ‘str‘ object has no attribute ‘decode‘ ”错误

原因：h5py的版本错误，注意版本的匹配问题，若出现此类错误，将h5py的版本降低到2.10.0

（此方法针对TensorFlow 版本为 2.0.0-alpha0 时有效）

pip install h5py==2.10.0

4.在使用子类的方法搭建神经网络时报TypeError: '_TupleWrapper' object is not callable错误

错误代码：

# 定义model的子类
class MyModel(tf.keras.Model):
    # 在init方法中定义网络的层结构
    def __init__(self):
        super(MyModel, self).__init__()
        # 第一层：激活函数为relu,权重初始化为he_normal
        self.layer1 = tf.keras.layers.Dense(3, activation="relu",
                     kernel_initializer="he_normal", name="layer1",input_shape=(3,)),
        # 第二层：激活函数为relu,权重初始化为he_normal
        self.layer2 =tf.keras.layers.Dense(2, activation="relu",
                     kernel_initializer="he_normal", name="layer2"),
        # 第三层（输出层）：激活函数为sigmoid,权重初始化为he_normal
        self.layer3 =tf.keras.layers.Dense(2, activation="sigmoid",
                     kernel_initializer="he_normal", name="layer3")
    # 在call方法中万完成前向传播
    def call(self, inputs):
        x = self.layer1(inputs)
        x = self.layer2(x)
        return self.layer3(x)
# 实例化模型
model = MyModel()
# 设置一个输入，调用模型（否则无法使用summay()）
x = tf.ones((1, 3))
y = model(x)

错误原因：在__inti__方法中初始化网络时，在每一层的后面错误地加了逗号，导致对象在实例化是被解释为元组，出现错误

解决办法：移除导致错误的逗号

正确代码:

# 定义model的子类
class MyModel(tf.keras.Model):
    # 在init方法中定义网络的层结构
    def __init__(self):
        super(MyModel, self).__init__()
        # 第一层：激活函数为relu,权重初始化为he_normal
        self.layer1 = tf.keras.layers.Dense(3, activation="relu",
                     kernel_initializer="he_normal", name="layer1",input_shape=(3,))
        # 第二层：激活函数为relu,权重初始化为he_normal
        self.layer2 =tf.keras.layers.Dense(2, activation="relu",
                     kernel_initializer="he_normal", name="layer2")
        # 第三层（输出层）：激活函数为sigmoid,权重初始化为he_normal
        self.layer3 =tf.keras.layers.Dense(2, activation="sigmoid",
                     kernel_initializer="he_normal", name="layer3")
    # 在call方法中万完成前向传播
    def call(self, inputs):
        x = self.layer1(inputs)
        x = self.layer2(x)
        return self.layer3(x)
# 实例化模型
model = MyModel()
# 设置一个输入，调用模型（否则无法使用summay()）
x = tf.ones((1, 3))
y = model(x)

5. LabelImage标注工具Windows环境下无法使用

注意中文路径问题

6. PyTorch出现报错信息 "BrokenPipeError",

如果你是在Windows系统下报错，可以尝试将torch.utils.data.DataLoader()中的num_workers设置为0.

7. PyTorch出现报错信息"Expected input batch_size (25) to match target batch_size (100)."

卷积后的尺寸计算错误，重新计算卷积后图像的尺寸

8.在保存模型过后，要加载模型。出现各种各样莫名奇妙的错误。

大力出奇迹解决办法：一定一定一定要注意路径不要写相对路径，老老实实用绝对路径来写。

9.报错信息为：python : 'gbk' codec can't decode byte 0xbe in position xxxlines

解决方案：将with_open的打开方式，设定一下encoding="utf-8"。

10.报错信息为：OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.

解决方案：在py文件中添加一行 os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

11.cv.imwrite时报错信息为：img data type = 8 is not supported

原因：因为图像保存的时候不是一个numpy.array类型的数组，且默认np.uint是8位，需要进行指定dtype=np.int32 解决方案：

cv2.imwrite(out_file, np.uint(imgs_[i].permute(1, 2, 0)*256.0+128.0))
# 更改为：
cv2.imwrite(out_path, np.array(np.uint(imgs[0].permute(1, 2, 0)*255.0), dtype=np.int32))

12.OpenCV: FFMPEG: tag 0x47504a4d/'MJPG' is not supported with codec id 7 and format 'mp4 / MP4 (MPEG-4 Part 14)' OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'倒置多图像合成视频时出错

原因：视频编/解码问题解决方案：cv2.VideoWriter_fourcc("MJPG") -> cv2.VideoWriter_fourcc("mp4v")

cv2.VideoWriter_fourcc(*"MJPG")
# 更改为
cv2.VideoWriter_fourcc(*"mp4v")

13.Corrupt JPEG data: premature end of data segment

原因：图像损坏解决方案：删除图像，或者对损坏的部分进行修改。

14. Building wheel for xxx

原因：当前使用的库过新
解决方案：降低使用的版本（于配置opencv时触发）

15.搭建了模型后，在进行model.train()的时候发生错误，报错信息： File "/home/y/anaconda3/envs/vid2vid/lib/python3.6/site-packages/torch/nn/modules/module.py", line 88, in forward raise NotImplementedError

原因：forward肯定是在哪一个部分写错了，如yolo v3搭建的时候有不同模块，去看看前向传播方法的缩进有没有写对，或者代码是否有错。
解决方案：主要看前向传播函数哪里写错了。

16.模型运行时报错：RuntimeError: Given groups=1, weight of size 1024 512 3 3, expected input[5, 1536, 13, 13] to have 512 channels, but got 1536 channels instead

原因：回去找channels相关的代码，肯定是哪里写错了导致计算错误

17.安装keras后，import keras报错：ImportError: cannot import name 'tf2'或者其他错误

原因：首先考虑tensorflow的版本和keras的版本是否匹配
解决方案：step1：确定tensorflow的版本，虚拟环境下pip安装的tensorflow就用pip list查看
step2：根据tensorflow对应的版本找到相对应的keras版本号然后进行安装
Framework Env name (--env parameter) Description Docker Image Packages and Nvidia Settings TensorFlow 1.14 tensorflow-1.14 TensorFlow 1.14.0 + Keras 2.2.5 on Python 3.6. floydhub/tensorflow TensorFlow-1.14
TensorFlow 1.13 tensorflow-1.13 TensorFlow 1.13.0 + Keras 2.2.4 on Python 3.6. floydhub/tensorflow TensorFlow-1.13
TensorFlow 1.12 tensorflow-1.12 TensorFlow 1.12.0 + Keras 2.2.4 on Python 3.6. floydhub/tensorflow TensorFlow-1.12
tensorflow-1.12:py2 TensorFlow 1.12.0 + Keras 2.2.4 on Python 2. floydhub/tensorflow
TensorFlow 1.11 tensorflow-1.11 TensorFlow 1.11.0 + Keras 2.2.4 on Python 3.6. floydhub/tensorflow TensorFlow-1.11
tensorflow-1.11:py2 TensorFlow 1.11.0 + Keras 2.2.4 on Python 2. floydhub/tensorflow
TensorFlow 1.10 tensorflow-1.10 TensorFlow 1.10.0 + Keras 2.2.0 on Python 3.6. floydhub/tensorflow TensorFlow-1.10
tensorflow-1.10:py2 TensorFlow 1.10.0 + Keras 2.2.0 on Python 2. floydhub/tensorflow
TensorFlow 1.9 tensorflow-1.9 TensorFlow 1.9.0 + Keras 2.2.0 on Python 3.6. floydhub/tensorflow TensorFlow-1.9
tensorflow-1.9:py2 TensorFlow 1.9.0 + Keras 2.2.0 on Python 2. floydhub/tensorflow
TensorFlow 1.8 tensorflow-1.8 TensorFlow 1.8.0 + Keras 2.1.6 on Python 3.6. floydhub/tensorflow TensorFlow-1.8
tensorflow-1.8:py2 TensorFlow 1.8.0 + Keras 2.1.6 on Python 2. floydhub/tensorflow
TensorFlow 1.7 tensorflow-1.7 TensorFlow 1.7.0 + Keras 2.1.6 on Python 3.6. floydhub/tensorflow TensorFlow-1.7
tensorflow-1.7:py2 TensorFlow 1.7.0 + Keras 2.1.6 on Python 2. floydhub/tensorflow
TensorFlow 1.5 tensorflow-1.5 TensorFlow 1.5.0 + Keras 2.1.6 on Python 3.6. floydhub/tensorflow TensorFlow-1.5
tensorflow-1.5:py2 TensorFlow 1.5.0 + Keras 2.1.6 on Python 2. floydhub/tensorflow
TensorFlow 1.4 tensorflow-1.4 TensorFlow 1.4.0 + Keras 2.0.8 on Python 3.6. floydhub/tensorflow
tensorflow-1.4:py2 TensorFlow 1.4.0 + Keras 2.0.8 on Python 2. floydhub/tensorflow
TensorFlow 1.3 tensorflow-1.3 TensorFlow 1.3.0 + Keras 2.0.6 on Python 3.6. floydhub/tensorflow
tensorflow-1.3:py2 TensorFlow 1.3.0 + Keras 2.0.6 on Python 2. floydhub/tensorflow
TensorFlow 1.2 tensorflow-1.2 TensorFlow 1.2.0 + Keras 2.0.6 on Python 3.5. floydhub/tensorflow
tensorflow-1.2:py2 TensorFlow 1.2.0 + Keras 2.0.6 on Python 2. floydhub/tensorflow
TensorFlow 1.1 tensorflow TensorFlow 1.1.0 + Keras 2.0.6 on Python 3.5. floydhub/tensorflow
tensorflow:py2 TensorFlow 1.1.0 + Keras 2.0.6 on Python 2. floydhub/tensorflow
TensorFlow 1.0 tensorflow-1.0 TensorFlow 1.0.0 + Keras 2.0.6 on Python 3.5. floydhub/tensorflow
tensorflow-1.0:py2 TensorFlow 1.0.0 + Keras 2.0.6 on Python 2. floydhub/tensorflow
TensorFlow 0.12 tensorflow-0.12 TensorFlow 0.12.1 + Keras 1.2.2 on Python 3.5. floydhub/tensorflow
tensorflow-0.12:py2 TensorFlow 0.12.1 + Keras 1.2.2 on Python 2. floydhub/tensorflow
PyTorch 1.1 pytorch-1.1 PyTorch 1.1.0 + fastai 1.0.57 on Python 3.6. floydhub/pytorch PyTorch-1.1
PyTorch 1.0 pytorch-1.0 PyTorch 1.0.0 + fastai 1.0.51 on Python 3.6. floydhub/pytorch PyTorch-1.0
pytorch-1.0:py2 PyTorch 1.0.0 on Python 2. floydhub/pytorch
PyTorch 0.4 pytorch-0.4 PyTorch 0.4.1 on Python 3.6. floydhub/pytorch PyTorch-0.4
pytorch-0.4:py2 PyTorch 0.4.1 on Python 2. floydhub/pytorch
PyTorch 0.3 pytorch-0.3 PyTorch 0.3.1 on Python 3.6. floydhub/pytorch PyTorch-0.3
pytorch-0.3:py2 PyTorch 0.3.1 on Python 2. floydhub/pytorch
PyTorch 0.2 pytorch-0.2 PyTorch 0.2.0 on Python 3.5 floydhub/pytorch
pytorch-0.2:py2 PyTorch 0.2.0 on Python 2. floydhub/pytorch
PyTorch 0.1 pytorch-0.1 PyTorch 0.1.12 on Python 3. floydhub/pytorch
pytorch-0.1:py2 PyTorch 0.1.12 on Python 2. floydhub/pytorch
Theano 0.9 theano-0.9 Theano rel-0.8.2 + Keras 2.0.3 on Python3.5. floydhub/theano
theano-0.9:py2 Theano rel-0.8.2 + Keras 2.0.3 on Python2. floydhub/theano
Caffe caffe Caffe rc4 on Python3.5. floydhub/caffe
caffe:py2 Caffe rc4 on Python2. floydhub/caffe
Torch torch Torch 7 with Python 3 env. floydhub/torch
torch:py2 Torch 7 with Python 2 env. floydhub/torch
Chainer 1.23 chainer-1.23 Chainer 1.23.0 on Python 3. floydhub/chainer
chainer-1.23:py2 Chainer 1.23.0 on Python 2. floydhub/chainer
Chainer 2.0 chainer-2.0 Chainer 1.23.0 on Python 3. floydhub/chainer
chainer-2.0:py2 Chainer 1.23.0 on Python 2. floydhub/chainer
MxNet 1.0 mxnet MxNet 1.0.0 on Python 3.6. floydhub/mxnet
mxnet:py2 MxNet 1.0.0 on Python 2. floydhub/mxnet

原文链接：https://blog.csdn.net/qq_37591637/article/details/103925212

18.win10安装pytorch特别的慢

解决办法：更换下载源步骤：step1:首先获取到下载命令https://pytorch.org/
step2:将源添加到conda中，在虚拟环境当中conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
step3:输入命令（注意不用写 -c pytorch，写上的话就又从官网下载了，白折腾一遍）
最后正常下载就好

Linux & win环境下都可以直接从官网上下载对应版本的wheel包

step1:https://pytorch.org/

step2:找到PREVIOUS VERSIONS OF PYTORCH

step3:找到对应版本的wheel包

step4:复制命令的最后一部分的网址如：pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html 就复制https://download.pytorch.org/whl/torch_stable.html 到浏览器访问

step5:Ctrl + F 找到对应版本的pytorch + cuda

step6:直接下载

step7:pip install 这个wheel包即可

注意！此方式可能在某些时候失效，报错提示CUDA版本不符，若提示此类错误只能使用方法一解决

19.安装pytorch时报错specified in the package manifest cannot be found.

解决方法：直接清空package，然后重新安装
step1: 输入conda clean --packages --tarballs进行清空 step2: 再重新进行安装

20.用torchsummary来进行模型参数可视化的时候，出现报错RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

原因：因为模型训练的时候没有指定GPU训练或者CPU训练，导致输入的数据类型与网络参数的类型不符。
解决方案：方法一：将模型指定到GPU上去，同时也在summary中指定调用GPU方法
方法二：如果网络用CPU训练的，就在summary中指定调用CPU方法（注意：一定一定要指定device，否则会出现错误）

# 以resnet为例，在main方法中添加对应代码 
if __name__=="__main__":
    model1 = resnet34()
    print(model1)

    '''
    # 方法一：把网络指定到GPU上，同时summary指定为GPU
    device = torch.device('cuda:0')
    model1.to(device)

    summary(model1, input_size=(3, 224, 224), device="cuda")
    '''

    # 方法二：summary直接指定到CPU
    summary(model1, input_size=(3, 224, 224), device="cpu")

21.使用GPU训练报错Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor)

原因：两种可能，一种是数据没挂载到GPU上，一种是模型没有挂载到GPU上
解决方案：对于数据 tensor = tensor.to(device)# device是GPU
对于模型 model = model.to(device)

22.在模型训练的时候报错：BrokenPipeError: [Errno 32] Broken pipe

原因：在windows上多线程执行程序的时候，num_workers参数可能设置过大解决方案：直接减小为0（标识主进程运行），或者减小num_workers的值

23.背景：加载预训练模型后，无模型结构改动等操作报错RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

可能原因：1.数据或者模型没有挂载到GPU上 2.batchsize过大解决方案：将数据和模型都确保挂载到GPU上后，如果还报错，就调小batchsize（生效了，但是不明白原因）

24.安装tensorflow 2.1.0后（仅仅只安装conda install tensorflow-gpu==2.1.0），后运行import tensorflow，可以正常启动，却在报错信息中缺少dll，而且在测试gpu是否available的时候发现False

可能原因：缺少dll库解决方案：windows系统就安装在C:\Windows\System32 linux上目前还在复现当中

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

常见问题及解决方案.md

常见问题及解决方案.md

常见问题及解决方案

1. matplotlib无法正常显示中文

2. TensorFlow 显示过多无用log

3. TensorFlow在使用tf.keras.models.load_model()时报“AttributeError: ‘str‘ object has no attribute ‘decode‘ ”错误

4.在使用子类的方法搭建神经网络时报TypeError: '_TupleWrapper' object is not callable错误

5. LabelImage标注工具Windows环境下无法使用

6. PyTorch出现报错信息 "BrokenPipeError",

7. PyTorch出现报错信息"Expected input batch_size (25) to match target batch_size (100)."

8.在保存模型过后，要加载模型。出现各种各样莫名奇妙的错误。

9.报错信息为：python : 'gbk' codec can't decode byte 0xbe in position xxxlines

10.报错信息为：OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.

11.cv.imwrite时报错信息为：img data type = 8 is not supported

12.OpenCV: FFMPEG: tag 0x47504a4d/'MJPG' is not supported with codec id 7 and format 'mp4 / MP4 (MPEG-4 Part 14)' OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'倒置多图像合成视频时出错

13.Corrupt JPEG data: premature end of data segment

14. Building wheel for xxx

15.搭建了模型后，在进行model.train()的时候发生错误，报错信息： File "/home/y/anaconda3/envs/vid2vid/lib/python3.6/site-packages/torch/nn/modules/module.py", line 88, in forward raise NotImplementedError

16.模型运行时报错：RuntimeError: Given groups=1, weight of size 1024 512 3 3, expected input[5, 1536, 13, 13] to have 512 channels, but got 1536 channels instead

17.安装keras后，import keras报错：ImportError: cannot import name 'tf2'或者其他错误

18.win10安装pytorch特别的慢

19.安装pytorch时报错specified in the package manifest cannot be found.

20.用torchsummary来进行模型参数可视化的时候，出现报错RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

21.使用GPU训练报错Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor)

22.在模型训练的时候报错：BrokenPipeError: [Errno 32] Broken pipe

23.背景：加载预训练模型后，无模型结构改动等操作报错RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

24.安装tensorflow 2.1.0后（仅仅只安装conda install tensorflow-gpu==2.1.0），后运行import tensorflow，可以正常启动，却在报错信息中缺少dll，而且在测试gpu是否available的时候发现False

next

Files

常见问题及解决方案.md

Latest commit

History

常见问题及解决方案.md

File metadata and controls

常见问题及解决方案

1. matplotlib无法正常显示中文

2. TensorFlow 显示过多无用log

3. TensorFlow在使用tf.keras.models.load_model()时报“AttributeError: ‘str‘ object has no attribute ‘decode‘ ”错误

4.在使用子类的方法搭建神经网络时报TypeError: '_TupleWrapper' object is not callable错误

5. LabelImage标注工具Windows环境下无法使用

6. PyTorch出现报错信息 "BrokenPipeError",

7. PyTorch出现报错信息"Expected input batch_size (25) to match target batch_size (100)."

8.在保存模型过后，要加载模型。出现各种各样莫名奇妙的错误。

9.报错信息为：python : 'gbk' codec can't decode byte 0xbe in position xxxlines

10.报错信息为：OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.

11.cv.imwrite时报错信息为：img data type = 8 is not supported

12.OpenCV: FFMPEG: tag 0x47504a4d/'MJPG' is not supported with codec id 7 and format 'mp4 / MP4 (MPEG-4 Part 14)' OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'倒置多图像合成视频时出错

13.Corrupt JPEG data: premature end of data segment

14. Building wheel for xxx

15.搭建了模型后，在进行model.train()的时候发生错误，报错信息： File "/home/y/anaconda3/envs/vid2vid/lib/python3.6/site-packages/torch/nn/modules/module.py", line 88, in forward raise NotImplementedError

16.模型运行时报错：RuntimeError: Given groups=1, weight of size 1024 512 3 3, expected input[5, 1536, 13, 13] to have 512 channels, but got 1536 channels instead

17.安装keras后，import keras报错：ImportError: cannot import name 'tf2'或者其他错误

18.win10安装pytorch特别的慢

19.安装pytorch时报错specified in the package manifest cannot be found.

20.用torchsummary来进行模型参数可视化的时候，出现报错RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

21.使用GPU训练报错Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor)

22.在模型训练的时候报错：BrokenPipeError: [Errno 32] Broken pipe

23.背景：加载预训练模型后，无模型结构改动等操作报错RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

24.安装tensorflow 2.1.0后（仅仅只安装conda install tensorflow-gpu==2.1.0），后运行import tensorflow，可以正常启动，却在报错信息中缺少dll，而且在测试gpu是否available的时候发现False

next