NVIDIA Interview Coding Question: Reformat Tensor with Given Scale Factor

You are an engineer on the TensorRT team tasked with developing algorithms for inference on neural networks. A customer has provided a list of networks they are using for inference, and most of them involve a tensor reformat operation. Namely, the reformat operation involves taking in a tensor containing floating-point numbers and outputting a similarly shaped tensor with low-precision integers. You start off with a proof of concept for this functionality in a C++ program.

Task

Reformat tensor with given scale factor

Create a function that takes an FP32 tensor and a floating-point scale factor as input. It should output an Int8 tensor, converting each value by dividing it by the given scale factor.

This problem asks you to implement a basic tensor reformat/quantization step: convert an FP32 tensor into an Int8 tensor by dividing each value by a given scale factor while preserving the tensor shape. The core idea is a straightforward element-wise transformation. The points that need care are the floating-point to integer conversion (how values are rounded, and what happens to results outside the Int8 range of -128 to 127) and the role of scaling in low-precision inference.
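One possible proof of concept for the element-wise conversion is sketched below. The problem statement specifies only the division by the scale factor, so several details here are assumptions rather than requirements: the `Tensor` struct and the names `reformatTensor` and `quantizeValue` are hypothetical, results are rounded to the nearest integer (halves away from zero), out-of-range results saturate to [-128, 127], NaN maps to 0, and a zero or non-finite scale is rejected. The sketch uses `std::clamp`, so it assumes C++17.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical tensor container: a shape plus flat, row-major data.
template <typename T>
struct Tensor {
    std::vector<std::size_t> shape;
    std::vector<T> data;
};

// Convert one FP32 value: divide by the scale, round, then saturate to Int8.
inline std::int8_t quantizeValue(float value, float scale) {
    // Assumption: round to nearest, halves away from zero (std::round).
    float scaled = std::round(value / scale);
    // Assumption: NaN has no Int8 equivalent, so map it to 0.
    if (std::isnan(scaled)) {
        return 0;
    }
    // Assumption: saturate instead of wrapping; casting an out-of-range
    // float to an integer type is undefined behavior in C++.
    scaled = std::clamp(scaled, -128.0f, 127.0f);
    return static_cast<std::int8_t>(scaled);
}

// Reformat an FP32 tensor into an Int8 tensor of the same shape.
Tensor<std::int8_t> reformatTensor(const Tensor<float>& input, float scale) {
    if (!std::isfinite(scale) || scale == 0.0f) {
        throw std::invalid_argument("scale must be finite and non-zero");
    }
    Tensor<std::int8_t> output;
    output.shape = input.shape;  // the shape is preserved unchanged
    output.data.reserve(input.data.size());
    for (float value : input.data) {
        output.data.push_back(quantizeValue(value, scale));
    }
    return output;
}
```

For example, with a scale of 0.5 the value 1.0 becomes 2, while 1000.0 would scale to 2000 and therefore saturates to 127. In an interview it is worth asking which rounding mode and overflow behavior are expected before settling on these choices.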
