
SK-TDA4VM: Issue compiling custom model

Part Number: SK-TDA4VM

Hi,

My team and I have been using the edgeai-benchmark tool to compile a custom object detection model for use with the Edge AI apps on the SK-TDA4VM board. The custom model we are currently attempting this with uses a checkpoint of efficientdet-lite0 as a base that has been fine-tuned on some of our own data. We have successfully compiled and run similar models pulled from modelzoo using the benchmark tool (e.g., efficientdet-lite0_bifpn_maxpool2x2_relu_ti-lite), and, much like the efficientdet models in modelzoo, we changed the activation type of our models to relu before we began fine-tuning.

After exporting our model to tflite format and following the instructions located here github.com/.../custom_models.md, we created the following pipeline config, which uses COCO as the input and calibration dataset:

'od-xxxx': utils.dict_update(common_cfg,
            preprocess=preproc_transforms.get_transform_tflite((320, 320), (320, 320), backend='cv2'),
            session=tflite_session_type(**utils.dict_update(tflite_session_cfg, input_mean=(127.0,  127.0,  127.0), input_scale=(1.0/128.0, 1.0/128.0, 1.0/128.0)),
                runtime_options=utils.dict_update(runtime_options_tflite_np2, {'object_detection:meta_arch_type': 5, 'object_detection:meta_layers_names_list':f'{settings.models_path}/d0-lite_mod.prototxt'}),
                model_path=f'{settings.models_path}/d0-lite_mod.tflite'),
            postprocess=postproc_transforms.get_transform_detection_tflite( normalized_detections=False, ignore_index=0, resize_with_pad=False),
            metric=dict(label_offset_pred=datasets.coco_det_label_offset_90to90(label_offset=0)),
            model_info=dict(metric_reference={'accuracy_ap[.5:.95]%':31.57})
        ),

The prototxt file we are using is mostly a copy of the modelzoo prototxt files, with the only changes being to the top_k, detection threshold, and input dimension values. When attempting to compile the model with this config, we get the error:

DIM Error - For Tensor 0, Dim 1 is 0

This is followed by a segmentation fault. Alternatively, we sometimes get the same dimension error, but with Dim 1 reported as a large negative number instead of 0.

In an effort to avoid the crash, we modified the above pipeline config by removing the meta layer names list option. This did allow the script to complete without crashing; however, the script then auto-generated the following meta pipeline data:

TIDL Meta PipeLine (Proto) File  :   

Number of OD backbone nodes = 0 
Size of odBackboneNodeIds = 0 

 Number of subgraphs:2 , 306 nodes delegated out of 363 nodes 

This is incorrect for a variety of reasons (no OD backbone nodes were counted, 2 subgraphs were identified, and not every node was delegated), and, unsurprisingly, inference produced very poor results:

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

Packaging the compiled model and running it on the board results in a very choppy frame rate with no objects ever being detected. Attempting the same process with a model based on efficientdet-lite1 performed similarly, except that it detected nothing but a large number of false positives during inference. These results occur regardless of the configuration in settings.yaml.

Is there something missing from our prototxt or pipeline config that would account for the dimension error or the poor inference results, or could the issue lie in our model and how the TIDL tools interpret it?

Thank you,

Andrew

  • Hi Andrew,

    The primary issue is this:

    >>When attempting to compile the model with this config, we get the error:

    >>DIM Error - For Tensor 0, Dim 1 is 0

    >>Followed by a segmentation fault.

    Can you add input_optimization=False as we did in the other thread and see if it helps? (https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1099374/sk-tda4vm-efficientdet-lite1_relu-and-efficientdet-lite3_relu-compilation/4083019#4083019)
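
    For example, in the session entry of the pipeline config you posted, it could look roughly like the sketch below (this assumes input_optimization is accepted alongside the other session options in tflite_session_cfg, as in the linked thread):

        session=tflite_session_type(**utils.dict_update(tflite_session_cfg,
                input_optimization=False,  # assumed session option, per the linked thread
                input_mean=(127.0, 127.0, 127.0),
                input_scale=(1.0/128.0, 1.0/128.0, 1.0/128.0)),
            runtime_options=utils.dict_update(runtime_options_tflite_np2,
                {'object_detection:meta_arch_type': 5,
                 'object_detection:meta_layers_names_list': f'{settings.models_path}/d0-lite_mod.prototxt'}),
            model_path=f'{settings.models_path}/d0-lite_mod.tflite'),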

    On a side note, I would like to point you to our newest tool, released here: https://github.com/TexasInstruments/edgeai-modelmaker. It provides an end-to-end model development flow, including model training and compilation (annotation is not yet integrated, but we support external tools). It is easy to get started with, though it currently supports only the command line. You would be among the first to try it, so expect a few issues, but we look forward to your feedback if you do.

  • We get the same error with input_optimization=False.

    Thank you for sharing the edgeai-modelmaker tool.  My team is trying that out; we'll let you know if we have any feedback.

  • Hi Manu,

    Will the TensorFlow models from edgeai-modelzoo eventually be integrated into edgeai-modelmaker?

    Regards,

    Andrew

  • All the packages that we add have to be installed in the same Python environment, which makes it hard to install both PyTorch- and TensorFlow-based repositories together. It might work, but it would be hard to maintain in the long term, so we may not attempt to add TensorFlow-based training to edgeai-modelmaker.

    One need we do see, however, is support for more tasks (e.g., semantic segmentation, keypoint detection, etc.).

  • Hi Andrew,

    Just curious - How is it going so far with edgeai-modelmaker?

  • To help you export the efficientdet-lite model correctly, I have looped in my colleague Debu. He may be able to help here because he has already done it once. 

  • Here is a summary from a member of my team who has worked more closely with edgeai-modelmaker, covering what has worked and the issues encountered:

    Modelmaker has worked well so far at what it aims to do. We have been able to train and compile a variety of object detection models using this tool, and all of them performed efficient inference that ran smoothly on the TDA4. The dataset and base-model checkpoint combinations our team has had success with are as follows:
    
    COCO Dataset : ssd_mobilenetv2_fpn_lite_mmdet
    custom_data : ssd_regnetx_800mf_fpn_bgr_lite_mmdet
    custom_data : yolox_s_lite_mmdet
    TIscapes Driving : ssd_mobilenetv2_fpn_lite_mmdet
    TIscapes Driving : ssd_regnetx_1p6gf_fpn_bgr_lite_mmdet
    TIscapes Driving : yolox_s_lite_mmdet
    
    Training was done with 30 epochs, a batch size of 8, an initial learning rate of 1e-05, and 1 GPU (4 GB-12 GB). Compilation was done with 8 tensor bits. Final inference on all successful data/model pairs reported mAP [.5:.95] > 10%.
    
    Additionally, a member of our team was able to add a different PyTorch checkpoint (beyond the six already available in the tool) to the pipeline and compile a working model from it.
    
    We ran into a few issues while using this tool, but thankfully they have all been fixable so far. The issues we have had, and the fixes that were necessary, are as follows:
    
    Error - ONNXRTSession has no attribute get_run_dir.
    This occurs on line 166 of edgeai-benchmark.py in the modelmaker directory, and it only occurred on one machine. We believe it happened because modelmaker was installed in a directory that already contained edgeai-benchmark. Cloning and installing modelmaker in a clean directory resolved the error.
    
    Bug - The compiled model creates a blank label_offset_pred parameter in param.yaml.
    This occurred on every machine. The fix was simply to create that parameter manually prior to running inference on the board (a small scripted version of this fix is sketched after this list). Since our board testing was done with the Edge AI Apps, creating a label-id-to-name dictionary inside classnames.py (or edgeai_classnames.cpp) was also necessary for the post-processor to function, but we believe this part is expected behavior.

    Bug - Attempting to train a model with the PASCAL dataset always throws a CUDA out-of-memory error.
    This occurred on both the 4 GB and 12 GB machines. We believe this to be a bug because the error is thrown regardless of the batch size setting, and these same machines were able to train and compile with the much larger COCO dataset without issue. The error persists regardless of whether PASCAL was downloaded through the modelmaker tool or imported manually.

    Bug - Cloning the repo corrupts every zip-file dataset in the data/examples directory.
    As a result, animal_classification, animal_detection, and tiscapes2017_driving cannot be loaded. This can be rectified by going back to the repo and individually re-downloading every zip file.

    Bug - On some machines, modelmaker fails to separate images into val and train directories properly, which causes the script to crash.
    This isn't to say the script fails to separate the data at all; instances.json is split into instances_val.json and instances_train.json as expected. The issue is that the script creates train and val directories that cannot be opened--attempting to do so results in an unknown file system error. We believe this is due to a conflict between OS and os library versions. The fix is simply to create the train and val directories oneself (copying all of the images into each isn't space efficient, but it works without the need for sorting).

    Overall, in regards to using custom data to train, export, compile, and run a model on the TDA4, modelmaker has given us the most success. While our intention is to use a version of efficientdet-lite and, by proxy, TensorFlow, we can appreciate how simple this tool is to use and how much it expedites the model development pipeline.
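
    As a reference, here is a minimal sketch of how the param.yaml fix above could be scripted; the path is a placeholder, and the offset value of 1 simply mirrors the param.yaml entry shown later in this thread (use whatever offset is correct for your model):

        import yaml  # PyYAML

        param_file = 'compiled_model/param.yaml'  # placeholder path to the compiled artifacts

        with open(param_file) as f:
            params = yaml.safe_load(f)

        # Fill in the blank metric entry before running inference on the board.
        metric = params.get('metric') or {}
        metric['label_offset_pred'] = 1  # mirrors the param.yaml entry shown later in this thread
        params['metric'] = metric

        with open(param_file, 'w') as f:
            yaml.safe_dump(params, f)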


  • Thank you, Andrew. This is encouraging feedback.

    You already made a suggestion about supporting TensorFlow models; if you have any other suggestions or feature requests, do let us know.

  • The label-id-to-name dictionary could be written into param.yaml or an equivalent file during compilation so that the SDK can simply read it (instead of having to manually modify classnames.py or edgeai_classnames.cpp) - we are thinking about this.

  • In the param.yaml file that is generated for me, I can see the following entry:

    metric:
        label_offset_pred: 1

    I am wondering why this field is not there in your case.

  • Hi, are there any updates regarding this issue?

  • Hi Andrew,

    • Will it be possible to share a sample tflite model that you are trying?
    • Which repository did you use for training and export?

    Regards, Debapriya

  • Hi Debapriya,

    We have a sample tflite file, but attempting to upload it to the forum using the image/video/file option from the insert menu results in this message:

    "The file or URL is not allowed to be inserted."

    We used the Google automl repo for training and export.

    The model was trained and exported using the default automl scripts, and the only changes were to hparams_config.py: the activation type was changed from relu6 to relu, and the base learning rate and warmup rate were reduced from 0.08 and 0.008 to 0.01 and 0.001, respectively.
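
    For reference, those changes amount to something like the following sketch (attribute names as used in automl's hparams_config.py; shown here as a plain dict for illustration only):

        # Illustrative only - the hparams changes described above, expressed as overrides.
        hparams_overrides = dict(
            act_type='relu',        # default for the lite models is 'relu6'
            learning_rate=0.01,     # reduced from the default 0.08
            lr_warmup_init=0.001,   # reduced from the default 0.008
        )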

    Regards,

    Andrew

  • Hi Andrew,

    Thanks for the details.

    Can you please try making a zip file and then uploading it?

    Regards, Debapriya

  • Hi Andrew,

    Thanks for sharing the model. One more request: Can you please share the prototxt file as well.

    Regards, Debapriya

  • Hi Andrew,

    The model that you shared has input preprocessing inside the model, and the same preprocessing is performed again by the edgeai-benchmark script. Therefore, in your setup, you are performing the preprocessing twice. That is probably the reason the accuracy numbers are always zero.

    If you look at the models that we share in the model zoo, you will find that there is no preprocessing inside the model; this was done to optimize inference. I have disabled the preprocessing during export with the attached changes, made on commit id 39c39e5 in the file efficientdet/inference.py.

    I have attached a snippet with the changes as well as the full inference.py file.

    Please try it out and let us know your observations.

    # Copyright 2020 Google Research. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    # ==============================================================================
    r"""Inference related utilities."""
    import copy
    import functools
    import os
    import time
    from typing import Text, Dict, Any, List, Tuple, Union
    from absl import logging
    import numpy as np
    from PIL import Image
    import tensorflow.compat.v1 as tf
    
    import dataloader
    import det_model_fn
    import hparams_config
    import utils
    from tf2 import efficientdet_keras
    from tf2 import label_util
    from tf2 import postprocess
    from visualize import vis_utils
    from tensorflow.python.client import timeline  # pylint: disable=g-direct-tensorflow-import
    
    
    def image_preprocess(image, image_size, mean_rgb, stddev_rgb):
      """Preprocess image for inference.
    
      Args:
        image: input image, can be a tensor or a numpy array.
        image_size: single integer of image size for square image or tuple of two
          integers, in the format of (image_height, image_width).
        mean_rgb: Mean value of RGB, can be a list of float or a float value.
        stddev_rgb: Standard deviation of RGB, can be a list of float or a float
          value.
    
      Returns:
        (image, scale): a tuple of processed image and its scale.
      """
      input_processor = dataloader.DetectionInputProcessor(image, image_size)
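      # Export-time modification (per the note above): the normalization and resize calls
      # below are commented out so that input preprocessing is not embedded in the exported model.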
      #input_processor.normalize_image(mean_rgb, stddev_rgb)
      #input_processor.set_scale_factors_to_output_size()
      #image = input_processor.resize_and_crop_image()
      image_scale = input_processor.image_scale_to_original
      return image, image_scale
    
    
    @tf.autograph.to_graph
    def batch_image_files_decode(image_files):
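      # Export-time modification: decode a single image and pin its input shape
      # (512x512 here) instead of building a dynamic batch with a TensorArray loop.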
      #raw_images = tf.TensorArray(tf.float32, size=0, dynamic_size=True)
      #for i in tf.range(tf.shape(image_files)[0]):
      image = tf.io.decode_image(image_files[0], dtype=tf.float32)
      #image.set_shape([None, None, None])
      image.set_shape([1, 512, 512, 3])
      #raw_images = raw_images.write(i, image)
      #return raw_images.stack()
      return image
    
    
    def batch_image_preprocess(raw_images,
                               image_size: Union[int, Tuple[int, int]],
                               mean_rgb,
                               stddev_rgb,
                               batch_size: int = None):
      """Preprocess batched images for inference.
    
      Args:
        raw_images: a list of images, each image can be a tensor or a numpy array.
        image_size: single integer of image size for square image or tuple of two
          integers, in the format of (image_height, image_width).
        mean_rgb: Mean value of RGB, can be a list of float or a float value.
        stddev_rgb: Standard deviation of RGB, can be a list of float or a float
          value.
        batch_size: if None, use map_fn to deal with dynamic batch size.
    
      Returns:
        (image, scale): a tuple of processed images and scales.
      """
      if not batch_size:
        # map_fn is a little bit slower due to some extra overhead.
        # map_fn -> vectorized_map (fully parallelizes the batch).
        map_fn = functools.partial(
            image_preprocess,
            image_size=image_size,
            mean_rgb=mean_rgb,
            stddev_rgb=stddev_rgb)
        images, scales = tf.vectorized_map(map_fn, raw_images)
        #images = tf.stop_gradient(tf.cast(images, tf.float32))
        scales = tf.stop_gradient(tf.cast(scales, tf.float32))
        return (images, scales)
    
      # If batch size is known, use a simple loop.
      scales, images = [], []
      for i in range(batch_size):
        image, scale = image_preprocess(raw_images[i], image_size, mean_rgb,
                                        stddev_rgb)
        scales.append(scale)
        images.append(image)
      images = tf.stack(images)
      scales = tf.stack(scales)
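      # Export-time modification: the raw images are returned unchanged (rather than the
      # stacked images), consistent with moving preprocessing outside the exported model.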
      return (raw_images, scales)
    
    
    def build_inputs(
        image_path_pattern: Text,
        image_size: Union[int, Tuple[int, int]],
        mean_rgb,
        stddev_rgb,
    ):
      """Read and preprocess input images.
    
      Args:
        image_path_pattern: a path to indicate a single or multiple files.
        image_size: single integer of image size for square image or tuple of two
          integers, in the format of (image_height, image_width).
        mean_rgb: Mean value of RGB, can be a list of float or a float value.
        stddev_rgb: Standard deviation of RGB, can be a list of float or a float
          value.
    
      Returns:
        (raw_images, images, scales): raw images, processed images, and scales.
    
      Raises:
        ValueError if image_path_pattern doesn't match any file.
      """
      raw_images, images, scales = [], [], []
      for f in tf.io.gfile.glob(image_path_pattern):
        image = Image.open(f)
        raw_images.append(image)
        image, scale = image_preprocess(image, image_size, mean_rgb, stddev_rgb)
        images.append(image)
        scales.append(scale)
      if not images:
        raise ValueError(
            'Cannot find any images for pattern {}'.format(image_path_pattern))
      return raw_images, tf.stack(images), tf.stack(scales)
    
    
    def build_model(model_name: Text, inputs: tf.Tensor, **kwargs):
      """Build model for a given model name.
    
      Args:
        model_name: the name of the model.
        inputs: an image tensor or a numpy array.
        **kwargs: extra parameters for model builder.
    
      Returns:
        (cls_outputs, box_outputs): the outputs for class and box predictions.
        Each is a dictionary with key as feature level and value as predictions.
      """
      mixed_precision = kwargs.get('mixed_precision', None)
      precision = utils.get_precision(kwargs.get('strategy', None), mixed_precision)
    
      if kwargs.get('use_keras_model', None):
    
        def model_arch(feats, model_name=None, **kwargs):
          """Construct a model arch for keras models."""
          config = hparams_config.get_efficientdet_config(model_name)
          config.override(kwargs)
          model = efficientdet_keras.EfficientDetNet(config=config)
          cls_out_list, box_out_list = model(feats, training=False)
          # convert the list of model outputs to a dictionary with key=level.
          assert len(cls_out_list) == config.max_level - config.min_level + 1
          assert len(box_out_list) == config.max_level - config.min_level + 1
          cls_outputs, box_outputs = {}, {}
          for i in range(config.min_level, config.max_level + 1):
            cls_outputs[i] = cls_out_list[i - config.min_level]
            box_outputs[i] = box_out_list[i - config.min_level]
          return cls_outputs, box_outputs
    
      else:
        model_arch = det_model_fn.get_model_arch(model_name)
    
      cls_outputs, box_outputs = utils.build_model_with_precision(
          precision, model_arch, inputs, model_name, **kwargs)
    
      if mixed_precision:
        # Post-processing has multiple places with hard-coded float32.
        # TODO(tanmingxing): Remove them once post-process can adapt to dtypes.
        cls_outputs = {k: tf.cast(v, tf.float32) for k, v in cls_outputs.items()}
        box_outputs = {k: tf.cast(v, tf.float32) for k, v in box_outputs.items()}
    
      return cls_outputs, box_outputs
    
    
    def restore_ckpt(sess, ckpt_path, ema_decay=0.9998, export_ckpt=None):
      """Restore variables from a given checkpoint.
    
      Args:
        sess: a tf session for restoring or exporting models.
        ckpt_path: the path of the checkpoint. Can be a file path or a folder path.
        ema_decay: ema decay rate. If None or zero or negative value, disable ema.
        export_ckpt: whether to export the restored model.
      """
      sess.run(tf.global_variables_initializer())
      if tf.io.gfile.isdir(ckpt_path):
        ckpt_path = tf.train.latest_checkpoint(ckpt_path)
      if ema_decay > 0:
        ema = tf.train.ExponentialMovingAverage(decay=0.0)
        ema_vars = utils.get_ema_vars()
        var_dict = ema.variables_to_restore(ema_vars)
        ema_assign_op = ema.apply(ema_vars)
      else:
        var_dict = utils.get_ema_vars()
        ema_assign_op = None
    
      tf.train.get_or_create_global_step()
      sess.run(tf.global_variables_initializer())
      saver = tf.train.Saver(var_dict, max_to_keep=1)
      if ckpt_path == '_':
        logging.info('Running test: do not load any ckpt.')
        return
    
      # Restore all variables from ckpt.
      saver.restore(sess, ckpt_path)
    
      if export_ckpt:
        print('export model to {}'.format(export_ckpt))
        if ema_assign_op is not None:
          sess.run(ema_assign_op)
        saver = tf.train.Saver(max_to_keep=1, save_relative_paths=True)
        saver.save(sess, export_ckpt)
    
    
    def det_post_process(params: Dict[Any, Any], cls_outputs: Dict[int, tf.Tensor],
                         box_outputs: Dict[int, tf.Tensor], scales: List[float]):
      """Post preprocessing the box/class predictions.
    
      Args:
        params: a parameter dictionary that includes `min_level`, `max_level`,
          `batch_size`, and `num_classes`.
        cls_outputs: an OrderDict with keys representing levels and values
          representing logits in [batch_size, height, width, num_anchors].
        box_outputs: an OrderDict with keys representing levels and values
          representing box regression targets in [batch_size, height, width,
          num_anchors * 4].
        scales: a list of float values indicating image scale.
    
      Returns:
        detections_batch: a batch of detection results. Each detection is a tensor
          with each row as [image_id, ymin, xmin, ymax, xmax, score, class].
      """
      if params.get('combined_nms', None):
        # Use combined version for dynamic batch size.
        nms_boxes, nms_scores, nms_classes, _ = postprocess.postprocess_combined(
            params, cls_outputs, box_outputs, scales)
      else:
        nms_boxes, nms_scores, nms_classes, _ = postprocess.postprocess_global(
            params, cls_outputs, box_outputs, scales)
    
      batch_size = tf.shape(cls_outputs[params['min_level']])[0]
      img_ids = tf.expand_dims(
          tf.cast(tf.range(0, batch_size), nms_scores.dtype), -1)
      detections = [
          img_ids * tf.ones_like(nms_scores),
          nms_boxes[:, :, 0],
          nms_boxes[:, :, 1],
          nms_boxes[:, :, 2],
          nms_boxes[:, :, 3],
          nms_scores,
          nms_classes,
      ]
      return tf.stack(detections, axis=-1, name='detections')
    
    
    def visualize_image(image,
                        boxes,
                        classes,
                        scores,
                        label_map=None,
                        min_score_thresh=0.01,
                        max_boxes_to_draw=1000,
                        line_thickness=2,
                        **kwargs):
      """Visualizes a given image.
    
      Args:
        image: a image with shape [H, W, C].
        boxes: a box prediction with shape [N, 4] ordered [ymin, xmin, ymax, xmax].
        classes: a class prediction with shape [N].
        scores: A list of float value with shape [N].
        label_map: a dictionary from class id to name.
        min_score_thresh: minimal score for showing. If class probability is below
          this threshold, then the object will not show up.
        max_boxes_to_draw: maximum bounding box to draw.
        line_thickness: how thick is the bounding box line.
        **kwargs: extra parameters.
    
      Returns:
        output_image: an output image with annotated boxes and classes.
      """
      label_map = label_util.get_label_map(label_map or 'coco')
      category_index = {k: {'id': k, 'name': label_map[k]} for k in label_map}
      img = np.array(image)
      vis_utils.visualize_boxes_and_labels_on_image_array(
          img,
          boxes,
          classes,
          scores,
          category_index,
          min_score_thresh=min_score_thresh,
          max_boxes_to_draw=max_boxes_to_draw,
          line_thickness=line_thickness,
          **kwargs)
      return img
    
    
    def visualize_image_prediction(image,
                                   prediction,
                                   label_map=None,
                                   **kwargs):
      """Viusalize detections on a given image.
    
      Args:
        image: Image content in shape of [height, width, 3].
        prediction: a list of vector, with each vector has the format of [image_id,
          ymin, xmin, ymax, xmax, score, class].
        label_map: a map from label id to name.
        **kwargs: extra parameters for visualization, such as min_score_thresh,
          max_boxes_to_draw, and line_thickness.
    
      Returns:
        a list of annotated images.
      """
      boxes = prediction[:, 1:5]
      classes = prediction[:, 6].astype(int)
      scores = prediction[:, 5]
    
      return visualize_image(image, boxes, classes, scores, label_map, **kwargs)
    
    
    class ServingDriver(object):
      """A driver for serving single or batch images.
    
      This driver supports serving with image files or arrays, with configurable
      batch size.
    
      Example 1. Serving streaming image contents:
    
        driver = inference.ServingDriver(
          'efficientdet-d0', '/tmp/efficientdet-d0', batch_size=1)
        driver.build()
        for m in image_iterator():
          predictions = driver.serve_files([m])
          driver.visualize(m, predictions[0])
          # m is the new image with annotated boxes.
    
      Example 2. Serving batch image contents:
    
        imgs = []
        for f in ['/tmp/1.jpg', '/tmp/2.jpg']:
          imgs.append(np.array(Image.open(f)))
    
        driver = inference.ServingDriver(
          'efficientdet-d0', '/tmp/efficientdet-d0', batch_size=len(imgs))
        driver.build()
        predictions = driver.serve_images(imgs)
        for i in range(len(imgs)):
          driver.visualize(imgs[i], predictions[i])
    
      Example 3: another way is to use SavedModel:
    
        # step1: export a model.
        driver = inference.ServingDriver('efficientdet-d0', '/tmp/efficientdet-d0')
        driver.build()
        driver.export('/tmp/saved_model_path')
    
        # step2: Serve a model.
        with tf.Session() as sess:
          tf.saved_model.load(sess, ['serve'], self.saved_model_dir)
          raw_images = []
          for f in tf.io.gfile.glob('/tmp/images/*.jpg'):
            raw_images.append(np.array(PIL.Image.open(f)))
          detections = sess.run('detections:0', {'image_arrays:0': raw_images})
          driver = inference.ServingDriver(
            'efficientdet-d0', '/tmp/efficientdet-d0')
          driver.visualize(raw_images[0], detections[0])
          PIL.Image.fromarray(raw_images[0]).save(output_image_path)
      """
    
      def __init__(self,
                   model_name: Text,
                   ckpt_path: Text,
                   batch_size: int = 1,
                   use_xla: bool = False,
                   min_score_thresh: float = None,
                   max_boxes_to_draw: float = None,
                   line_thickness: int = None,
                   model_params: Dict[Text, Any] = None):
        """Initialize the inference driver.
    
        Args:
          model_name: target model name, such as efficientdet-d0.
          ckpt_path: checkpoint path, such as /tmp/efficientdet-d0/.
          batch_size: batch size for inference.
          use_xla: Whether run with xla optimization.
          min_score_thresh: minimal score threshold for filtering predictions.
          max_boxes_to_draw: the maximum number of boxes per image.
          line_thickness: the line thickness for drawing boxes.
          model_params: model parameters for overriding the config.
        """
        self.model_name = model_name
        self.ckpt_path = ckpt_path
        self.batch_size = batch_size
    
        self.params = hparams_config.get_detection_config(model_name).as_dict()
    
        if model_params:
          self.params.update(model_params)
        self.params.update(dict(is_training_bn=False))
        self.label_map = self.params.get('label_map', None)
    
        self.signitures = None
        self.sess = None
        self.use_xla = use_xla
    
        self.min_score_thresh = min_score_thresh
        self.max_boxes_to_draw = max_boxes_to_draw
        self.line_thickness = line_thickness
    
      def __del__(self):
        if self.sess:
          self.sess.close()
    
      def _build_session(self):
        sess_config = tf.ConfigProto()
        if self.use_xla:
          sess_config.graph_options.optimizer_options.global_jit_level = (
              tf.OptimizerOptions.ON_2)
        return tf.Session(config=sess_config)
    
      def build(self, params_override=None):
        """Build model and restore checkpoints."""
        params = copy.deepcopy(self.params)
        if params_override:
          params.update(params_override)
    
        if not self.sess:
          self.sess = self._build_session()
        with self.sess.graph.as_default():
          image_files = tf.placeholder(tf.string, name='image_files', shape=[None])
          raw_images = batch_image_files_decode(image_files)
          #raw_images = tf.identity(raw_images, name='image_arrays')
          images, scales = batch_image_preprocess(raw_images, params['image_size'],
                                                  params['mean_rgb'],
                                                  params['stddev_rgb'],
                                                  self.batch_size)
          if params['data_format'] == 'channels_first':
            images = tf.transpose(images, [0, 3, 1, 2])
          class_outputs, box_outputs = build_model(self.model_name, images,
                                                   **params)
          params.update(dict(batch_size=self.batch_size))
          detections = det_post_process(params, class_outputs, box_outputs, scales)
    
          restore_ckpt(
              self.sess,
              self.ckpt_path,
              ema_decay=self.params['moving_average_decay'],
              export_ckpt=None)
    
        self.signitures = {
            'image_files': image_files,
            'image_arrays': raw_images,
            'prediction': detections,
        }
        return self.signitures
    
      def visualize(self, image, prediction, **kwargs):
        """Visualize prediction on image."""
        return visualize_image_prediction(
            image,
            prediction,
            label_map=self.label_map,
            **kwargs)
    
      def serve_files(self, image_files: List[Text]):
        """Serve a list of input image files.
    
        Args:
          image_files: a list of image files with shape [1] and type string.
    
        Returns:
          A list of detections.
        """
        if not self.sess:
          self.build()
        predictions = self.sess.run(
            self.signitures['prediction'],
            feed_dict={self.signitures['image_files']: image_files})
        return predictions
    
      def benchmark(self, image_arrays, trace_filename=None):
        """Benchmark inference latency/throughput.
    
        Args:
          image_arrays: a list of images in numpy array format.
          trace_filename: If None, specify the filename for saving trace.
        """
        if not self.sess:
          self.build()
    
        # init session
        self.sess.run(
            self.signitures['prediction'],
            feed_dict={self.signitures['image_arrays']: image_arrays})
    
        start = time.perf_counter()
        for _ in range(10):
          self.sess.run(
              self.signitures['prediction'],
              feed_dict={self.signitures['image_arrays']: image_arrays})
        end = time.perf_counter()
        inference_time = (end - start) / 10
    
        print('Per batch inference time: ', inference_time)
        print('FPS: ', self.batch_size / inference_time)
    
        if trace_filename:
          run_options = tf.RunOptions()
          run_options.trace_level = tf.RunOptions.FULL_TRACE
          run_metadata = tf.RunMetadata()
          self.sess.run(
              self.signitures['prediction'],
              feed_dict={self.signitures['image_arrays']: image_arrays},
              options=run_options,
              run_metadata=run_metadata)
          with tf.io.gfile.GFile(trace_filename, 'w') as trace_file:
            trace = timeline.Timeline(step_stats=run_metadata.step_stats)
            trace_file.write(trace.generate_chrome_trace_format(show_memory=True))
    
      def serve_images(self, image_arrays):
        """Serve a list of image arrays.
    
        Args:
          image_arrays: A list of image content with each image has shape [height,
            width, 3] and uint8 type.
    
        Returns:
          A list of detections.
        """
        if not self.sess:
          self.build()
        predictions = self.sess.run(
            self.signitures['prediction'],
            feed_dict={self.signitures['image_arrays']: image_arrays})
        return predictions
    
      def load(self, saved_model_dir_or_frozen_graph: Text):
        """Load the model using saved model or a frozen graph."""
        if not self.sess:
          self.sess = self._build_session()
        self.signitures = {
            'image_files': 'image_files:0',
            'image_arrays': 'image_arrays:0',
            'prediction': 'detections:0',
        }
    
        # Load saved model if it is a folder.
        if tf.io.gfile.isdir(saved_model_dir_or_frozen_graph):
          return tf.saved_model.load(self.sess, ['serve'],
                                     saved_model_dir_or_frozen_graph)
    
        # Load a frozen graph.
        graph_def = tf.GraphDef()
        with tf.gfile.GFile(saved_model_dir_or_frozen_graph, 'rb') as f:
          graph_def.ParseFromString(f.read())
        return tf.import_graph_def(graph_def, name='')
    
      def freeze(self):
        """Freeze the graph."""
        output_names = [self.signitures['prediction'].op.name]
        graphdef = tf.graph_util.convert_variables_to_constants(
            self.sess, self.sess.graph_def, output_names)
        return graphdef
    
      def export(self,
                 output_dir: Text,
                 tflite_path: Text = None,
                 tensorrt: Text = None):
        """Export a saved model, frozen graph, and potential tflite/tensorrt model.
    
        Args:
          output_dir: the output folder for saved model.
          tflite_path: the path for saved tflite file.
          tensorrt: If not None, must be {'FP32', 'FP16', 'INT8'}.
        """
        signitures = self.signitures
        signature_def_map = {
            'serving_default':
                tf.saved_model.predict_signature_def(
                    {signitures['image_arrays'].name: signitures['image_arrays']},
                    {signitures['prediction'].name: signitures['prediction']}),
        }
        b = tf.saved_model.Builder(output_dir)
        b.add_meta_graph_and_variables(
            self.sess,
            tags=['serve'],
            signature_def_map=signature_def_map,
            assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS),
            clear_devices=True)
        b.save()
        logging.info('Model saved at %s', output_dir)
    
        # also save freeze pb file.
        graphdef = self.freeze()
        pb_path = os.path.join(output_dir, self.model_name + '_frozen.pb')
        tf.io.gfile.GFile(pb_path, 'wb').write(graphdef.SerializeToString())
        logging.info('Frozen graph saved at %s', pb_path)
    
        if tflite_path:
          height, width = utils.parse_image_size(self.params['image_size'])
          input_name = signitures['image_arrays'].op.name
          input_shapes = {input_name: [None, height, width, 3]}
          converter = tf.lite.TFLiteConverter.from_saved_model(
              output_dir,
              input_arrays=[input_name],
              input_shapes=input_shapes,
              output_arrays=[signitures['prediction'].op.name])
          converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
          tflite_model = converter.convert()
    
          tf.io.gfile.GFile(tflite_path, 'wb').write(tflite_model)
          logging.info('TFLite is saved at %s', tflite_path)
    
        if tensorrt:
          from tensorflow.python.compiler.tensorrt import trt  # pylint: disable=g-direct-tensorflow-import,g-import-not-at-top
          sess_config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
          trt_path = os.path.join(output_dir, 'tensorrt_' + tensorrt.lower())
          trt.create_inference_graph(
              None,
              None,
              precision_mode=tensorrt,
              input_saved_model_dir=output_dir,
              output_saved_model_dir=trt_path,
              session_config=sess_config)
          logging.info('TensorRT model is saved at %s', trt_path)
    
    
    class InferenceDriver(object):
      """A driver for doing batch inference.
    
      Example usage:
    
       driver = inference.InferenceDriver('efficientdet-d0', '/tmp/efficientdet-d0')
       driver.inference('/tmp/*.jpg', '/tmp/outputdir')
    
      """
    
      def __init__(self,
                   model_name: Text,
                   ckpt_path: Text,
                   model_params: Dict[Text, Any] = None):
        """Initialize the inference driver.
    
        Args:
          model_name: target model name, such as efficientdet-d0.
          ckpt_path: checkpoint path, such as /tmp/efficientdet-d0/.
          model_params: model parameters for overriding the config.
        """
        self.model_name = model_name
        self.ckpt_path = ckpt_path
        self.params = hparams_config.get_detection_config(model_name).as_dict()
        if model_params:
          self.params.update(model_params)
        self.params.update(dict(is_training_bn=False))
        self.label_map = self.params.get('label_map', None)
    
      def inference(self, image_path_pattern: Text, output_dir: Text, **kwargs):
        """Read and preprocess input images.
    
        Args:
          image_path_pattern: Image file pattern such as /tmp/img*.jpg
          output_dir: the directory for output images. Output images will be named
            as 0.jpg, 1.jpg, ....
          **kwargs: extra parameters for visualization, such as
            min_score_thresh, max_boxes_to_draw, and line_thickness.
    
        Returns:
          Annotated image.
        """
        params = copy.deepcopy(self.params)
        with tf.Session() as sess:
          # Build inputs and preprocessing.
          raw_images, images, scales = build_inputs(image_path_pattern,
                                                    params['image_size'],
                                                    params['mean_rgb'],
                                                    params['stddev_rgb'])
          if params['data_format'] == 'channels_first':
            images = tf.transpose(images, [0, 3, 1, 2])
          # Build model.
          class_outputs, box_outputs = build_model(self.model_name, images,
                                                   **self.params)
          restore_ckpt(
              sess,
              self.ckpt_path,
              ema_decay=self.params['moving_average_decay'],
              export_ckpt=None)
          # Build postprocessing.
          detections_batch = det_post_process(params, class_outputs, box_outputs,
                                              scales)
          predictions = sess.run(detections_batch)
          # Visualize results.
          for i, prediction in enumerate(predictions):
            img = visualize_image_prediction(
                raw_images[i],
                prediction,
                label_map=self.label_map,
                **kwargs)
            output_image_path = os.path.join(output_dir, str(i) + '.jpg')
            Image.fromarray(img).save(output_image_path)
            print('writing file to %s' % output_image_path)
    
          return predictions
    

    Regards, Debapriya

  • Thank you, Debapriya.

    We're trying the various efficientdet-lite models with preprocessing disabled, and so far the results have been good.

    According to the instructions here, it appears that the only changes needed to the automl code are act_type, h.learning_rate, and h.lr_warmup_init in hparams_config.py, so that's what we had been doing.

    It sounds like you've disabled preprocessing in a TI version of the automl repo? Is that publicly available? Are there any other modifications we should be aware of?

    Regards,

    Andrew

  • Hi Andrew,

    • Glad to know that it worked after disabling the preprocessing.
    • We are not hosting any repository for automl since the changes were quite minimal.
    • This part of the change is missing from the documentation; we will update it based on the feedback you have given. Thank you for that.
    • There are no other changes. Please let us know if you face any other difficulty in enabling efficientdet-lite models.

    Regards, Debapriya