How to use scalar inputs #240

Open
wq9 opened this issue Apr 12, 2024 · 3 comments

@wq9

wq9 commented Apr 12, 2024

I'm trying to use a scalar input to resize a video, but can't figure out how to set the ndim parameter of external_source or the shape of the input in the client.

config.pbtxt

backend: "dali"
max_batch_size: 0

model_transaction_policy {
  decoupled: True
}

1/dali.py

import nvidia.dali as dali
from nvidia.dali.plugin.triton import autoserialize #must include

@dali.plugin.triton.autoserialize
@dali.pipeline_def(batch_size=32, num_threads=4, device_id=0, output_dtype=dali.types.FLOAT, output_ndim=3)
def pipeline():
    vid = dali.fn.experimental.inputs.video(name="INPUT", sequence_length=1, device='mixed')

    height = dali.fn.external_source(name="HEIGHT", ndim=1, dtype=dali.types.INT16, repeat_last=True)
    width = dali.fn.external_source(name="WIDTH", ndim=1, dtype=dali.types.INT16, repeat_last=True)

    vid = dali.fn.resize(vid, resize_x=width, resize_y=height, mode="not_larger") #resize
    vid = dali.fn.crop(vid, crop_w=width, crop_h=height, out_of_bounds_policy="pad") #pad
    vid = dali.fn.squeeze(vid, axes=0) #remove sequence dim
    vid = dali.fn.transpose(vid, perm=[2, 0, 1]) #HWC to CHW
    vid = dali.fn.cast(vid, dtype=dali.types.FLOAT, name="OUTPUT") #UINT8 to FP32
    return vid

client.py from video_decode_remap

...
        width = np.ones((1), dtype=np.int16)*640
        height = np.ones((1), dtype=np.int16)*360

        inputs = [
            tritonclient.grpc.InferInput("INPUT", video_raw.shape, "UINT8"),
            tritonclient.grpc.InferInput("WIDTH", width.shape, "INT16"),
            tritonclient.grpc.InferInput("HEIGHT", height.shape, "INT16"),
        ]
        inputs[0].set_data_from_numpy(video_raw)
        inputs[1].set_data_from_numpy(width)
        inputs[2].set_data_from_numpy(height)
...

If I run that, I get the error: unexpected shape for input 'HEIGHT' for model 'resize_224'. Expected [-1,-1], got [1]. How do I properly set and read the scalar values in both client.py and dali.py?

@banasraf
Collaborator

Hey @wq9

I think this should work if you add the batch dimension to the height and width inputs. So, assuming the batch size is 32 in your pipeline, the client code would look like:

width = np.ones((32, 1), dtype=np.int16)*640
height = np.ones((32, 1), dtype=np.int16)*360
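
A minimal sketch of that change, assuming the batch size of 32 from the pipeline definition above. The helper name batched_scalar is hypothetical and only used for illustration; the leading axis is the batch dimension and the trailing axis of length 1 matches ndim=1 in external_source.

```python
import numpy as np

BATCH_SIZE = 32  # assumption: matches batch_size in the pipeline_def above


def batched_scalar(value, batch_size=BATCH_SIZE, dtype=np.int16):
    """Return an array of shape (batch_size, 1) filled with `value`.

    Axis 0 is the batch dimension; axis 1 (length 1) is the per-sample
    1-D tensor that external_source(ndim=1) expects.
    """
    return np.full((batch_size, 1), value, dtype=dtype)


width = batched_scalar(640)
height = batched_scalar(360)
print(width.shape, height.shape)  # (32, 1) (32, 1)
```

The resulting width.shape and height.shape can then be passed to InferInput in place of the rank-1 shapes in the original client.py, with the rest of the client unchanged.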

@wq9
Author

wq9 commented Apr 15, 2024

@banasraf Adding the batch dimension worked. Thanks!

However, when the input is a video (video_raw = np.expand_dims(np.fromfile(FLAGS.video, dtype=np.uint8), axis=0)), the last batch does not contain 32 frames, so I get the error:

[/opt/dali/dali/pipeline/operator/operator.cc:43] Assert on "curr_batch_size == static_cast<decltype(curr_batch_size)>(arg.second.tvec->num_samples())" failed: 
ArgumentInput has to have the same batch size as an input.

Is there a way to pad the batch dimension?

@banasraf
Collaborator

@wq9

Unfortunately, this operator does not allow padding of the last batch, and I don't see any workaround that would make your case work properly. The only options I see are hardcoding the width and height in the pipeline or, if you know the number of frames in the sample, predicting when to send partial width and height tensors.

I'll add a task to our backlog to extend the video input operator with the option to pad the last batch.
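
To illustrate the second option, here is a rough sketch of how the per-batch tensor sizes could be predicted on the client when the frame count is known. It assumes sequence_length=1 (one frame per sample) and that the video input fills batches of 32 in order, leaving one shorter final batch. The helpers batch_sizes and size_tensors are hypothetical names, and how the tensors would actually be delivered to the server for each batch is not covered here.

```python
import numpy as np


def batch_sizes(num_frames, batch_size=32):
    """Split num_frames into full batches plus one trailing partial batch."""
    full, rest = divmod(num_frames, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


def size_tensors(num_frames, value, batch_size=32, dtype=np.int16):
    """Build one (n, 1) tensor per batch, where n is that batch's frame count."""
    return [np.full((n, 1), value, dtype=dtype)
            for n in batch_sizes(num_frames, batch_size)]


# Example: a hypothetical 70-frame video gives batches of 32, 32 and 6 frames.
widths = size_tensors(70, 640)
print([w.shape for w in widths])  # [(32, 1), (32, 1), (6, 1)]
```

The point is only that the last tensor has to shrink to the size of the last batch, which is what the "ArgumentInput has to have the same batch size as an input" assertion checks.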
