Abstract:
In today's era of automation, the integration of computer vision (CV) and robotics
has paved the way for advanced autonomous systems. This research presents an
approach to autonomous object manipulation using a robot equipped with a camera.
The system performs panoptic segmentation (PS), a CV technique, to identify and
characterize objects within the robot's environment. During segmentation, the system
computes pixel-level object masks and shape information. The architecture's results
are improved by task-enhanced attention, which yields an accuracy of 0.99159 on
100 images from the COCO 2017 validation set. The pixel coordinates of the segmented
objects are then converted into real-world coordinates, enabling the robot to
interact with objects. The core contributions of this work are the identification of
objects, the conversion of 2D camera coordinates to 3D world coordinates, and the
robot's decision-making process.
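
As an illustration of the 2D-to-3D conversion mentioned above, the following is a minimal sketch, not the method used in this work. It assumes a calibrated pinhole camera with known intrinsics, a known depth for the pixel of interest, and a known rigid transform from the camera frame to the world frame; none of these details are given in this abstract. The names Intrinsics, mask_centroid, pixel_to_camera, and camera_to_world are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Point3 = Tuple[float, float, float]


@dataclass
class Intrinsics:
    """Assumed pinhole-camera intrinsics, in pixels (hypothetical container)."""
    fx: float  # focal length along the image x axis
    fy: float  # focal length along the image y axis
    cx: float  # principal point, x
    cy: float  # principal point, y


def mask_centroid(mask: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return the (u, v) centroid of the nonzero pixels of a binary mask.

    u is the column index and v is the row index.
    """
    count = 0
    sum_u = 0
    sum_v = 0
    for v, row in enumerate(mask):
        for u, value in enumerate(row):
            if value:
                count += 1
                sum_u += u
                sum_v += v
    if count == 0:
        raise ValueError("mask contains no object pixels")
    return sum_u / count, sum_v / count


def pixel_to_camera(u: float, v: float, depth: float, k: Intrinsics) -> Point3:
    """Back-project pixel (u, v) at the given depth into the camera frame."""
    x = (u - k.cx) * depth / k.fx
    y = (v - k.cy) * depth / k.fy
    return (x, y, depth)


def camera_to_world(
    p: Point3,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> Point3:
    """Apply the rigid transform p_world = R * p_camera + t."""
    return tuple(
        sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the experiments.
    k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    p_cam = pixel_to_camera(420.0, 240.0, 2.0, k)
    print(camera_to_world(p_cam, identity, [1.0, 0.0, 0.5]))
```

In a sketch like this, the mask centroid stands in for the grasp target; the depth value would have to come from a depth sensor, a known working-plane height, or a similar source, which is an assumption here.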
Once an object is identified, the robot autonomously generates a trajectory to reach
the target, adjusts its end-effector accordingly, and performs a successful object
handling task. The robotic system was simulated in RoboDK,
underscoring the software's role in advancing the field of robotics and
automation. Applications extend to a wide range of industries, including logistics,
manufacturing, and healthcare. This work demonstrates the synergy between
computer vision and robotics, bringing us closer to a future in which autonomous
robots integrate seamlessly into daily life, augmenting productivity and
streamlining complex tasks.