COCO Mask API is leaking memory when used with Multithreading #202

Open
HenryJia opened this issue Aug 7, 2018 · 5 comments

HenryJia commented Aug 7, 2018

Minimal example to reproduce

from multiprocessing.pool import Pool

from pycocotools.coco import COCO

ann_file = '/home/nvme/MS-COCO/annotations/instances_train2017.json'

coco = COCO(ann_file)

category_ids = coco.getCatIds()
categories = coco.loadCats(category_ids)
img_ids = coco.getImgIds()
img_dicts = coco.loadImgs(img_ids)
num_classes = max(category_ids) + 1


def run(img_dict):
    ann_ids = coco.getAnnIds(imgIds=img_dict['id'], iscrowd=None)
    anns = coco.loadAnns(ann_ids)
    for a in anns:
        mask = coco.annToMask(a)
    return None

p = Pool(16)

# This does not use much memory
#out = list(map(run, img_dicts))

# This chews up pretty much all of the memory of my system and doesn't give it back once map has finished
out = p.map(run, img_dicts)

del out
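
A possible mitigation, sketched here under the assumption that the growth comes from long-lived forked workers gradually copy-on-write-duplicating the large in-memory COCO index (this is not a fix from the thread): recycle each worker after a fixed number of tasks with Pool's maxtasksperchild, so its memory is periodically returned to the OS.

from multiprocessing.pool import Pool

# Hypothetical workaround sketch: recycle workers so any memory they have
# accumulated is released; `run` and `img_dicts` are the objects defined above.
p = Pool(16, maxtasksperchild=256)
try:
    out = p.map(run, img_dicts, chunksize=64)
finally:
    p.close()
    p.join()

Whether this helps depends on where the memory actually accumulates; it bounds per-worker growth but does not remove the underlying cost of sharing the COCO object across processes.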
@HenryJia HenryJia changed the title COCO Mask API is leaking memory COCO Mask API is leaking memory when used with Multithreading Aug 7, 2018
@svenkreiss

Any news? I am running into that issue in a multi-threaded PyTorch data loader.

@HenryJia
Author

Any news? I am running into that issue in a multi-threaded PyTorch data loader.

Interesting, I originally discovered this issue using PyTorch's multi-process dataloader. The above was just a minimal example to reproduce it and verify that it wasn't PyTorch leaking memory.
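
For reference, a minimal sketch of the PyTorch pattern being described (assumed names, not code from the thread): the COCO index is built once in the parent process and then used from DataLoader worker processes, which is where the same growth shows up.

from torch.utils.data import Dataset, DataLoader
from pycocotools.coco import COCO

class CocoMaskDataset(Dataset):
    # Hypothetical dataset: the COCO index lives in the parent and is
    # inherited by forked DataLoader workers.
    def __init__(self, ann_file):
        self.coco = COCO(ann_file)
        self.img_ids = self.coco.getImgIds()

    def __len__(self):
        return len(self.img_ids)

    def __getitem__(self, idx):
        img_id = self.img_ids[idx]
        ann_ids = self.coco.getAnnIds(imgIds=img_id, iscrowd=None)
        anns = self.coco.loadAnns(ann_ids)
        masks = [self.coco.annToMask(a) for a in anns]
        return img_id, masks

dataset = CocoMaskDataset('/home/nvme/MS-COCO/annotations/instances_train2017.json')
loader = DataLoader(dataset, batch_size=4, num_workers=8,
                    collate_fn=lambda batch: batch)
for batch in loader:
    pass  # memory use of the worker processes grows as iteration proceeds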


lolitsgab commented Oct 18, 2019

It seems that it is leaking even when I run it in single-threaded mode.

My code is:


import progressbar
import matplotlib.pyplot as plt
from skimage import io  # io.imread, as in the pycocotools demos
from pycocotools.coco import COCO


def create_mask(batch):
    print("Processing batch {}".format(batch))
    annFile = "../datasets/DeepFashion/train_coco_batch_{}.json".format(batch)
    coco = COCO(annFile)

    cats = coco.loadCats(coco.getCatIds())
    nms = set([cat["supercategory"] for cat in cats])
    catIds = coco.getCatIds(catNms=nms)
    imgIds = coco.getImgIds(catIds=catIds)

    with progressbar.ProgressBar(max_value=len(imgIds)) as bar:
        for i in range(len(imgIds)):
            bar.update(i)
            img = coco.loadImgs(imgIds[i])[0]
            I = io.imread("../datasets/DeepFashion/train/image/" + img["file_name"])
            plt.axis("off")
            # plt.imshow(I)
            # plt.show()
            annIds = coco.getAnnIds(imgIds=img["id"], catIds=catIds, iscrowd=None)
            anns = coco.loadAnns(annIds)
            # coco.showAnns(anns)

            # Combine all annotation masks for this image (don't reuse the
            # outer loop variable and don't add anns[0] twice).
            mask = coco.annToMask(anns[0])
            for ann in anns[1:]:
                mask += coco.annToMask(ann)

            plt.imshow(mask)
            plt.savefig("../datasets/DeepFashion/train/image_annotated/" + img["file_name"],
                        bbox_inches="tight", pad_inches=0)
            plt.clf()
            plt.close("all")

and I run it in a loop like:

import gc

for i in range(0, 24):
    create_mask(i)
    gc.collect()

and it runs out of memory around 4 iterations in (each iteration is 5000 images). Anyone have a solution to this?
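
One hypothetical way to contain this (not from the thread; it assumes create_mask as defined above and that the leaked memory belongs to objects that are never freed within the process): run each batch in its own short-lived process, so everything it allocated is returned to the OS when the process exits.

import gc
from multiprocessing import Process

for batch in range(24):
    # Each batch gets a fresh process; its memory is released on exit.
    p = Process(target=create_mask, args=(batch,))
    p.start()
    p.join()
    gc.collect()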


etienne87 commented Nov 15, 2019

For me, the COCO API for object detection also leaks, and it takes too much memory with the PyTorch multi-process dataloader. I am currently trying to split instances.json into multiple files to see if that helps: https://github.com/etienne87/torch_object_rnn/blob/master/datasets/coco_wrapper.py. I basically only load the useful annotations when needed, instead of loading the whole training JSON into RAM.
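
A sketch of that splitting idea (assumed file layout and helper name, not the code in coco_wrapper.py): write one small JSON per image so a worker only ever parses the annotations it actually needs, instead of holding the full instances.json in every process.

import json
import os

def split_instances(instances_json, out_dir):
    # Group annotations by image and write one per-image JSON record.
    os.makedirs(out_dir, exist_ok=True)
    with open(instances_json) as f:
        data = json.load(f)
    per_image = {}
    for ann in data['annotations']:
        per_image.setdefault(ann['image_id'], []).append(ann)
    for img in data['images']:
        record = {
            'image': img,
            'annotations': per_image.get(img['id'], []),
            'categories': data['categories'],
        }
        with open(os.path.join(out_dir, '{}.json'.format(img['id'])), 'w') as f:
            json.dump(record, f)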


rijobro commented Mar 29, 2023

Possible fix? #637
