Object Detector

This article describes how to use the HornedSungem to load the SSD-MobileNet convolutional neural network on the Android platform and perform object detection.

Preparation

  1. For details on environment configuration, please refer to the quick start guide; they are not repeated here.
  2. Download the model file graph_object_SSD required for object detection. In Android Studio, create an assets directory under the current module and copy the downloaded model file into it.
  3. Because the project needs to process and display images, the javacv library is used. Developers can download it from GitHub or copy it from the official sample project into their own project (see the dependency sketch below).
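
As an alternative to copying the jars by hand, javacv is also published on Maven Central. The Gradle snippet below is illustrative only (the artifact and version are assumptions on my part; the jars shipped with the official sample project may be the safer choice):

  // build.gradle (module) - illustrative only; match the javacv version used by the official sample
  dependencies {
      implementation 'org.bytedeco:javacv-platform:1.5.9'
  }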

Implementation

The model used in this article detects 20 object categories, as follows:

  String[] labels = {"aeroplane", "bicycle", "bird", "boat",
      "bottle", "bus", "car", "cat", "chair",
      "cow", "diningtable", "dog", "horse",
      "motorbike", "person", "pottedplant",
      "sheep", "sofa", "train", "tvmonitor"};

Implementation steps:

  • Open the HornedSungem device.
  • Create an instance of a Graph that represents the neural network.
  • Process the image data (two modes are available).
  • Get the returned processing result.
  • Display the result on the view.
  • The two modes are implemented as follows:

    1. Use the HornedSungem's built-in camera:

      int status = openDevice();
      if (status != ConnectStatus.HS_OK) {
          return;
      }
      int id = allocateGraphByAssets("graph_object_SSD");
      if (id < 0) {
          return;
      }
      while (true) {
          // Fetch a camera frame and the corresponding SSD output for this graph
          byte[] bytes = getImage(0.007843f, 1.0f, id);
          float[] result = getResult(id);
          if (bytes != null && result != null) {
              opencv_core.IplImage bgrImage = null;
              if (zoom) {
                  // 640x360 preview: the bytes are already interleaved and can be copied directly
                  FRAME_W = 640;
                  FRAME_H = 360;
                  bgrImage = opencv_core.IplImage.create(FRAME_W, FRAME_H, opencv_core.IPL_DEPTH_8U, 3);
                  bgrImage.getByteBuffer().put(bytes);
              } else {
                  // 1920x1080 preview: the bytes are planar (R plane, G plane, B plane),
                  // so interleave them into BGR order before wrapping them in an IplImage
                  FRAME_W = 1920;
                  FRAME_H = 1080;
                  byte[] bytes_rgb = new byte[FRAME_W * FRAME_H * 3];
                  for (int i = 0; i < FRAME_H * FRAME_W; i++) {
                      bytes_rgb[i * 3 + 2] = bytes[i];                     // R
                      bytes_rgb[i * 3 + 1] = bytes[FRAME_W * FRAME_H + i]; // G
                      bytes_rgb[i * 3] = bytes[FRAME_W * FRAME_H * 2 + i]; // B
                  }
                  bgrImage = opencv_core.IplImage.create(FRAME_W, FRAME_H, opencv_core.IPL_DEPTH_8U, 3);
                  bgrImage.getByteBuffer().put(bytes_rgb);
              }
          } else {
              continue;
          }
      }
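
      Inside the loop, once bgrImage and result are available, they can be parsed and handed to the view. A minimal sketch, assuming the getFrameResult() helper shown in the Code section below; the listener name is illustrative, not the sample project's code:

      // Would sit inside the loop, right after bgrImage is filled
      HornedSungemFrame frame = getFrameResult(bgrImage, result);
      if (frame != null) {
          // `mListener.onFrame(...)` is a hypothetical callback; post to your own view or handler
          mListener.onFrame(frame);
      }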
      
    2. Use external image data; the calling interface is loadTensor().

      SoftReference<Bitmap> softRef = new SoftReference<>(Bitmap.createBitmap(1280, 720, Bitmap.Config.ARGB_8888));
      Bitmap bitmap = softRef.get();
      // Copy the current external camera frame into the bitmap
      allocations[0].copyTo(bitmap);
      // Scale the 1280x720 frame down to the 300x300 input size expected by SSD-MobileNet
      Matrix matrix = new Matrix();
      matrix.postScale(300f / 1280, 300f / 720);
      Bitmap newbm = Bitmap.createBitmap(bitmap, 0, 0, 1280, 720, matrix, true);
      int[] ints = new int[300 * 300];
      newbm.getPixels(ints, 0, 300, 0, 0, 300, 300);
      // Convert each pixel channel to a float and normalize it to roughly [-1, 1]
      float[] float_tensor = new float[300 * 300 * 3];
      for (int j = 0; j < 300 * 300; j++) {
          float_tensor[j * 3] = Color.red(ints[j]) * 0.007843f - 1;
          float_tensor[j * 3 + 1] = Color.green(ints[j]) * 0.007843f - 1;
          float_tensor[j * 3 + 2] = Color.blue(ints[j]) * 0.007843f - 1;
      }
      int status_load = mFaceDetectorBySelfThread.loadTensor(float_tensor, float_tensor.length, id);
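
      After loadTensor() has uploaded the tensor, the detection output can be read back with getResult() on the same graph id, just as in the camera mode. A hedged sketch of that follow-up step (not the sample project's exact code):

      // Read back the raw SSD output for the tensor that was just loaded
      float[] result = getResult(id);
      if (result != null) {
          // `result` holds groups of 7 floats; parse it as described in the
          // Annotation section and in getFrameResult() below
      }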
      

Annotation:

  • The output values are grouped in sets of 7 numbers.
  • The first number of the first group indicates how many objects were detected; the remaining numbers of that group are unused.
  • In each subsequent group, the second number is the object category, the third number is the confidence, and the remaining four numbers are the normalized bounding-box coordinates (left, top, right, bottom).
  • All values are of type float32.

Code:

  public HornedSungemFrame getFrameResult(opencv_core.IplImage image, float[] floats) {
      int num = (int) floats[0]; // the first value is the number of detections
      ArrayList<HornedSungemFrame.ObjectInfo> objectInfos = new ArrayList<>();
      if (num > 0) {
          for (int i = 0; i < num; i++) {
              HornedSungemFrame.ObjectInfo objectInfo = new HornedSungemFrame.ObjectInfo();
              // Each detection is a group of 7 floats starting at index 7 * (i + 1)
              int type = (int) (floats[7 * (i + 1) + 1]);
              int x1 = (int) (floats[7 * (i + 1) + 3] * FRAME_W);
              int y1 = (int) (floats[7 * (i + 1) + 4] * FRAME_H);
              int x2 = (int) (floats[7 * (i + 1) + 5] * FRAME_W);
              int y2 = (int) (floats[7 * (i + 1) + 6] * FRAME_H);
              int width = x2 - x1;
              int height = y2 - y1;
              int percentage = (int) (floats[7 * (i + 1) + 2] * 100);
              // Skip background, low-confidence, oversized, or invalid boxes
              if (type == 0) {
                  continue;
              }
              if (percentage <= MIN_SCORE_PERCENT) {
                  continue;
              }
              if (width >= FRAME_W * 0.8 || height >= FRAME_H * 0.8) {
                  continue;
              }
              if (x1 < 0 || x2 < 0 || y1 < 0 || y2 < 0 || width < 0 || height < 0) {
                  continue;
              }
              objectInfo.setType(labels[type - 1]);
              objectInfo.setRect(new Rect(x1, y1, x2, y2));
              objectInfo.setScore(percentage);
              objectInfos.add(objectInfo);
          }
      }
      return new HornedSungemFrame(IplImageToBitmap(image), objectInfos, num);
  }
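
The IplImageToBitmap() helper used above is not shown in this article. A minimal sketch of one possible implementation using javacv's frame converters (package paths and converter classes may differ between javacv versions; the sample project's own helper may differ):

  private Bitmap IplImageToBitmap(opencv_core.IplImage image) {
      // Wrap the IplImage in a javacv Frame, then convert the Frame to an Android Bitmap
      OpenCVFrameConverter.ToIplImage iplConverter = new OpenCVFrameConverter.ToIplImage();
      AndroidFrameConverter bitmapConverter = new AndroidFrameConverter();
      return bitmapConverter.convert(iplConverter.convert(image));
  }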

Display of results:

[Demo recording: objectDetector]

PS: The recording quality is limited, so it is best to download the project and run it yourself.

Friendly note: Because most Android devices only support USB 2.0, image transmission is time consuming and the preview can feel laggy. It is therefore not recommended to use 1080P images; use 360P images instead and scale them up to fill the screen.
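
One simple way to spread the smaller 360P frame across the screen is to let the view scale it. An illustrative snippet (the view and bitmap names are assumptions, not the sample's code):

  // `imageView` and `bitmap` are illustrative names for your preview view and the decoded 640x360 frame
  imageView.setScaleType(ImageView.ScaleType.FIT_CENTER); // scale up to the available screen area
  imageView.setImageBitmap(bitmap);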

The complete code can be downloaded from GitHub: SungemSDK-AndroidExamples