You are an engineer on the TensorRT team tasked with developing algorithms for neural network inference. A customer has provided a list of networks they use for inference, and most of them involve a tensor reformat operation. Specifically, the reformat operation takes a tensor of floating-point numbers and outputs a tensor of the same shape containing low-precision integers. You start with a proof of concept for this functionality in a C++ program.
Task
Reformat tensor with given scale factor
Create a function that takes an FP32 tensor and a floating-point scale factor as input. Output an Int8 tensor by dividing each value by the given scale factor and converting the result to an integer.
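This problem tests a basic implementation of tensor value conversion and fixed-point quantization: given an FP32 tensor and a scale factor, divide each element by the scale and convert the result into an Int8 tensor, keeping the original shape unchanged. The usual solution simply iterates over the tensor elements and performs the float-to-integer conversion. The key points are understanding the quantization idea of "scale proportionally, then convert to a low-precision integer" and handling the precision loss and truncation that the type conversion introduces.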
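A minimal C++ sketch of the conversion might look like the following. The task statement does not specify the tensor representation, the rounding mode, or the out-of-range behavior, so everything here is an assumption made for illustration: the function name `reformatToInt8` is hypothetical, the tensor is modeled as a flat `std::vector<float>` (the shape is tracked separately and is unchanged because the element count is unchanged), values are rounded to the nearest integer, results are saturated to the Int8 range [-128, 127], and NaN inputs are mapped to 0.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a TensorRT API): quantizes a flat FP32 buffer to
// Int8 by dividing each element by `scale`.
// Assumptions not stated in the task: round-to-nearest, saturation to
// [-128, 127], and NaN mapped to 0.
std::vector<int8_t> reformatToInt8(const std::vector<float>& input, float scale) {
    assert(scale != 0.0f && "scale factor must be nonzero");

    std::vector<int8_t> output;
    output.reserve(input.size());

    for (float value : input) {
        float q = value / scale;

        if (std::isnan(q)) {
            // Converting NaN to an integer is undefined behavior; pick 0.
            output.push_back(0);
            continue;
        }

        // Round to nearest using the current rounding mode
        // (round-half-to-even by default).
        q = std::nearbyint(q);

        // Saturate before the cast: converting an out-of-range float to an
        // integer type is undefined behavior in C++.
        if (q > 127.0f) q = 127.0f;
        if (q < -128.0f) q = -128.0f;

        output.push_back(static_cast<int8_t>(q));
    }
    return output;
}
```

Saturating before the cast is the main design choice here: a bare `static_cast<int8_t>` on a value such as 600.0f is undefined behavior, and truncating instead of rounding would bias every result toward zero. If the customer's networks expect a different rounding or overflow policy, only the two marked steps need to change.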