[WIP] Object Detection #15

Open
1 of 3 tasks
anirudhs001 opened this issue May 30, 2021 · 0 comments
Collaborator

anirudhs001 commented May 30, 2021

The drone needs object detection to have any real utility. That means finding the 3D coordinates of as many nearby objects as possible. These detections will then aid path planning: the motion of the detected obstacles can be used to model and predict their future state, which helps plan safer trajectories.
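
As a rough illustration of how detections could feed prediction, here is a minimal sketch that estimates an obstacle's velocity from two consecutive 3D detections and extrapolates its position. The constant-velocity model and the names `TrackedObstacle`, `estimate_velocity` and `predict_position` are assumptions made for this example; the issue does not fix a motion model.

```python
from dataclasses import dataclass


@dataclass
class TrackedObstacle:
    """Hypothetical container for one detected obstacle."""
    position: tuple  # (x, y, z) in metres
    velocity: tuple  # (vx, vy, vz) in metres per second


def estimate_velocity(prev_pos, curr_pos, dt):
    """Finite-difference velocity between two detections dt seconds apart."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return tuple((c - p) / dt for p, c in zip(prev_pos, curr_pos))


def predict_position(obstacle, horizon):
    """Extrapolate the position `horizon` seconds ahead (constant velocity assumed)."""
    return tuple(p + v * horizon for p, v in zip(obstacle.position, obstacle.velocity))


# Two detections of the same obstacle, 0.5 s apart.
velocity = estimate_velocity((0.0, 0.0, 0.0), (1.0, 0.0, 0.5), dt=0.5)
obstacle = TrackedObstacle(position=(1.0, 0.0, 0.5), velocity=velocity)
print(predict_position(obstacle, horizon=2.0))
```

A real tracker would also need to associate detections across frames and filter noise (for example with a Kalman filter); this sketch only shows the extrapolation step.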

Although the bot is equipped with both a lidar and a stereo camera, we decided to go ahead with just the stereo camera for now. This is mainly to keep the pipeline as streamlined and fast as possible. The process of getting depth information from stereo image pairs is called stereopsis. We and all other two-eyed animals do it all the time. Check out OpenCV: Depth Map from Stereo Images.
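
For a rectified stereo pair, depth follows from disparity as Z = f * B / d, where f is the focal length in pixels, B the baseline in metres and d the disparity in pixels. A minimal sketch of that relation is below; the function name and the numbers are illustrative assumptions, not values from our camera.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point seen with the given disparity (rectified pair)."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative is invalid.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Illustrative values only: 700 px focal length, 0.5 m baseline.
near = depth_from_disparity(35.0, focal_px=700.0, baseline_m=0.5)
far = depth_from_disparity(17.5, focal_px=700.0, baseline_m=0.5)
print(near, far)
```

Because depth is inversely proportional to disparity, halving the disparity doubles the depth, so a one-pixel matching error costs far more accuracy on distant objects than on close ones.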

The current plan is to apply the object detector directly to the stereo images, without calculating the disparity map or the point cloud. Again, this is done to speed things up; disparity calculation takes time (check out this link at Computerphile: Stereo 3D Vision to get a feel for it).
Still, existing ROS packages such as stereo_image_proc - ROS Wiki do it, if one wants to try.
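
To get a feel for where the time goes, here is a deliberately naive block-matching sketch for a single scanline, using the sum of absolute differences (SAD) as the matching cost. It is a toy written for this issue, not what stereo_image_proc or OpenCV actually run; the function name and parameters are assumptions.

```python
def sad_disparity_row(left_row, right_row, window=1, max_disp=4):
    """Brute-force SAD block matching along one rectified scanline.

    For each pixel x in the left row, try every shift d in [0, max_disp]
    and keep the one whose neighbourhood in the right row matches best.
    """
    n = len(left_row)
    disparities = [0] * n
    for x in range(window, n - window):
        best_cost, best_d = float("inf"), 0
        for d in range(max_disp + 1):
            if x - d - window < 0:
                break  # the shifted window would leave the image
            cost = sum(
                abs(left_row[x + k] - right_row[x - d + k])
                for k in range(-window, window + 1)
            )
            if cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities


# Synthetic scanline: the right view is the left view shifted by 2 pixels.
left = [0, 0, 0, 0, 10, 50, 90, 20, 0, 0, 0, 0]
right = left[2:] + [0, 0]
row = sad_disparity_row(left, right)
print(row[5], row[6])  # prints "2 2"
```

The work per frame grows with width × height × disparity range × window size, which is why dense disparity is costly without optimised or hardware-accelerated implementations.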

What’s done and what still needs doing:

  • Research and find the approaches that give a good mix of accuracy and speed. Here’s a non-exhaustive list: vision comparison.
    The candidates under consideration are either RCNN-based (e.g. Stereo-RCNN) or YOLO-based (e.g. Complex YOLO).
    RCNN-based algorithms are more accurate but slower, while detectors like YOLO, which output detections directly in a single pass, are faster but less accurate. Currently trying to improve on both aspects.
  • [WIP] Recreate/Modify and implement
    Implement the DL pipeline.
    The current network uses HRNet as the base and is trained on the KITTI stereo dataset.
  • Integrate with our quadcopter
    The current pipeline uses the camera calibration data from KITTI, which will differ from the calibration of our own stereo camera. So make the necessary changes and create a ROS node that reads data from the stereo camera and outputs the object points.
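
One way to keep the KITTI-specific numbers out of the pipeline is to read the intrinsics and baseline from the two 3x4 projection matrices, a form used both by KITTI calibration files and by the `P` field of ROS `sensor_msgs/CameraInfo`. The sketch below assumes a rectified horizontal pair whose right-camera matrix stores `-fx * baseline` in its fourth column; the class `StereoCalibration`, its methods and the matrix values are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StereoCalibration:
    """Hypothetical holder for the parameters the pipeline needs."""
    fx: float
    fy: float
    cx: float
    cy: float
    baseline: float  # metres

    @classmethod
    def from_projection_matrices(cls, p_left, p_right):
        """Build from two 3x4 projection matrices given as nested lists."""
        fx = p_left[0][0]
        # The x-translation terms differ by fx * baseline between the two cameras.
        baseline = (p_left[0][3] - p_right[0][3]) / fx
        return cls(fx=fx, fy=p_left[1][1], cx=p_left[0][2],
                   cy=p_left[1][2], baseline=baseline)

    def pixel_to_point(self, u, v, disparity):
        """Back-project a left-image pixel with known disparity to (x, y, z) in metres."""
        if disparity <= 0:
            raise ValueError("disparity must be positive")
        z = self.fx * self.baseline / disparity
        x = (u - self.cx) * z / self.fx
        y = (v - self.cy) * z / self.fy
        return (x, y, z)


# Made-up matrices, not KITTI's and not our camera's.
P_LEFT = [[500.0, 0.0, 320.0, 0.0], [0.0, 500.0, 240.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
P_RIGHT = [[500.0, 0.0, 320.0, -100.0], [0.0, 500.0, 240.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

calib = StereoCalibration.from_projection_matrices(P_LEFT, P_RIGHT)
print(calib.baseline, calib.pixel_to_point(420.0, 240.0, disparity=10.0))
```

In the ROS node, the same object could be filled from the left and right `CameraInfo` messages once at startup, so switching between KITTI and the quadcopter's camera would only change where the matrices come from.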
@anirudhs001 anirudhs001 added the good first issue Good for newcomers label May 30, 2021
@anirudhs001 anirudhs001 self-assigned this May 30, 2021