nn-Meter: Prediction Workflow Analysis

The nn-Meter Prediction Workflow

  1. Entering the prediction command
    nn-meter predict --predictor RedmiK30Pro_cpu_tflite27 --predictor-version 1.0 --onnx /root/workspace/nn-Meter/workspace/models/mobilenetv3small_0.onnx
    This command is received by the function nn_meter/utils/nn_meter_cli/predictor.py#apply_latency_predictor_cli. Inside, the arguments are parsed, mainly --predictor, --predictor-version, and --onnx/--tensorflow.
    def apply_latency_predictor_cli(args):
        # specify model type
        if args.tensorflow:
            input_model, model_type, model_suffix = args.tensorflow, "pb", ".pb"
        elif args.onnx:
            input_model, model_type, model_suffix = args.onnx, "onnx", ".onnx"
        elif args.nn_meter_ir:
            input_model, model_type, model_suffix = args.nn_meter_ir, "nnmeter-ir", ".json"
        elif args.torchvision: # torch model name from torchvision model zoo
            input_model_list, model_type = args.torchvision, "torch"
        ...

        # load predictor
        predictor = load_latency_predictor(args.predictor, args.predictor_version)

        ...
        # predict latency
        result = {}
        for model in input_model_list:
            latency = predictor.predict(model, model_type) # in unit of ms
            result[os.path.basename(model)] = latency
            logging.result(f'[RESULT] predict latency for {os.path.basename(model)}: {latency} ms')

        return result
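
    As a side note, the same flow can be driven from the Python API rather than the CLI. A minimal sketch, reusing the predictor name and model file from the example command above (the path is a placeholder):

    import nn_meter

    # Load the same predictor the CLI resolves from --predictor/--predictor-version.
    predictor = nn_meter.load_latency_predictor("RedmiK30Pro_cpu_tflite27", 1.0)

    # Predict latency for an ONNX model file.
    latency = predictor.predict("mobilenetv3small_0.onnx", model_type="onnx")
    print(f"predicted latency: {latency} ms")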

  2. Loading the predictor: load_latency_predictor
    After step 1 parses the --predictor and --predictor-version arguments, the corresponding predictor files are loaded. Here nn_meter/predictor/nn_meter_predictor.py#load_latency_predictor locates the predictor and fusion rules via the file paths cached under the user's data folder. These files are either the officially provided defaults or ones the user has customized. Once found, an nnMeterPredictor object is returned.
    def load_latency_predictor(predictor_name: str, predictor_version: float = None):
        user_data_folder = get_user_data_folder()
        pred_info = load_predictor_config(predictor_name, predictor_version)
        if "download" in pred_info:
            kernel_predictors, fusionrule = loading_to_local(pred_info, os.path.join(user_data_folder, 'predictor'))
        else:
            kernel_predictors, fusionrule = loading_customized_predictor(pred_info)

        return nnMeterPredictor(kernel_predictors, fusionrule)
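
    The kernel_predictors returned here are deserialized regression models, one per kernel type (reportedly random-forest regressors in the released predictors). As a rough sketch of what loading the cached files could look like, assuming each predictor is stored as a pickled regressor named <kernel_name>.pkl under the predictor folder (this naming and layout are assumptions, not the library's exact scheme):

    import os
    import pickle

    def load_kernel_predictors_sketch(predictor_dir):
        # Hypothetical helper: collect every pickled kernel regressor
        # under predictor_dir, keyed by kernel name.
        predictors = {}
        for fname in sorted(os.listdir(predictor_dir)):
            if fname.endswith(".pkl"):
                kernel_name = fname[: -len(".pkl")]
                with open(os.path.join(predictor_dir, fname), "rb") as f:
                    predictors[kernel_name] = pickle.load(f)
        return predictors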

  3. Predicting model latency with the predictor object: predictor.predict
    With the predictor obtained in step 2, we call nn_meter/predictor/nn_meter_predictor.py#nnMeterPredictor.predict to predict the model.
    1. self.kd.load_graph(graph) first converts the model into a graph and parses it into kernels. This step is implemented in nn_meter/kernel_detector/kernel_detector.py#KernelDetector.load_graph. Fusion rules are taken into account here: if a combination of ops matches a fusion rule, those ops are grouped together and predicted as one kernel (see the sketch after the code below).

    2. nn_predict(self.kernel_predictors, self.kd.get_kernels()) feeds the kernels detected in step 3.1 into the kernel predictors one by one; see nn_meter/predictor/prediction/predict_by_kernel.py#nn_predict. This step mainly extracts each kernel's features, which are really just the parameters of ops such as conv2d: input/output dimensions, kernel size, and so on. Then, kernel by kernel, it selects a predictor by the op/kernel name, feeds in the features, and predicts the latency; the per-kernel latencies are summed to give the model-level latency.

      def predict(
          self, model, model_type, input_shape=(1, 3, 224, 224), apply_nni=False
      ):
          logging.info("Start latency prediction ...")
          if isinstance(model, str):
              graph = model_file_to_graph(model, model_type, input_shape, apply_nni=apply_nni)
          else:
              graph = model_to_graph(model, model_type, input_shape=input_shape, apply_nni=apply_nni)

          # logging.info(graph)
          self.kd.load_graph(graph)

          py = nn_predict(self.kernel_predictors, self.kd.get_kernels()) # in unit of ms
          logging.info(f"Predict latency: {py} ms")
          return py
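
    To make steps 3.1 and 3.2 concrete, here is a toy sketch of the two ideas: greedy matching of fusion rules over an op sequence, and summing per-kernel latency predictions. The rule table, feature layout, and predictor interface below are simplified assumptions for illustration, not KernelDetector's or nn_predict's actual implementation:

    # Toy fusion rules: op sequences that collapse into one fused kernel.
    FUSION_RULES = {
        ("conv", "bn", "relu"): "conv-bn-relu",
        ("conv", "bn"): "conv-bn",
    }

    def detect_kernels(ops):
        # Greedily try the longest fusion pattern at each position;
        # ops that match no rule become single-op kernels.
        kernels, i = [], 0
        while i < len(ops):
            for length in (3, 2):  # longer patterns take priority
                pattern = tuple(ops[i:i + length])
                if len(pattern) == length and pattern in FUSION_RULES:
                    kernels.append(FUSION_RULES[pattern])
                    i += length
                    break
            else:
                kernels.append(ops[i])
                i += 1
        return kernels

    def predict_model_latency(kernel_predictors, kernels_with_features):
        # Model latency = sum of per-kernel predictions; each kernel's
        # feature vector goes to the regressor for that kernel type
        # (assumes sklearn-style objects with a .predict method).
        total = 0.0
        for kernel_name, features in kernels_with_features:
            total += kernel_predictors[kernel_name].predict([features])[0]
        return total

    print(detect_kernels(["conv", "bn", "relu", "pool", "conv", "bn"]))
    # -> ['conv-bn-relu', 'pool', 'conv-bn']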

