Abstract
Estimating the pose of surgical instruments is important in
Robot-assisted Minimally Invasive Surgery (RMIS) to support surgical
navigation and enable autonomous robotic task execution. The performance
of current instrument pose estimation methods deteriorates significantly
in the presence of partial tool visibility, occlusions, and changes in
the surgical scene. In this work, a vision-based framework is proposed
for markerless estimation of the 6DoF pose of surgical instruments. To
deal with partial instrument visibility, a keypoint object
representation is used and stable and accurate instrument poses are
computed using a PnP solver. To boost the learning process of the model
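The abstract only names the solver, but to make the keypoint-to-pose step concrete, the following is a minimal sketch of PnP-based pose recovery using OpenCV's cv2.solvePnP. The keypoint coordinates, camera intrinsics, and solver choice are illustrative assumptions, not the paper's actual pipeline.

```python
import cv2
import numpy as np

# Hypothetical 3D keypoints defined on the instrument model (metres);
# the actual keypoint set used in the paper is not specified here.
object_points = np.array([
    [0.000, 0.000, 0.000],
    [0.012, 0.000, 0.000],
    [0.000, 0.009, 0.000],
    [0.012, 0.009, 0.000],
    [0.006, 0.004, 0.015],
    [0.006, 0.004, -0.015],
], dtype=np.float64)

# Matching 2D keypoints predicted by the network (pixels); illustrative values.
image_points = np.array([
    [312.4, 248.1],
    [355.9, 251.3],
    [314.0, 215.7],
    [357.2, 219.0],
    [334.5, 230.2],
    [336.1, 236.8],
], dtype=np.float64)

# Camera intrinsics from calibration (illustrative values).
camera_matrix = np.array([
    [800.0,   0.0, 320.0],
    [  0.0, 800.0, 240.0],
    [  0.0,   0.0,   1.0],
])
dist_coeffs = np.zeros(5)  # assume images are already undistorted

# Recover the instrument's 6DoF pose in the camera frame from
# the 2D-3D keypoint correspondences.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                              camera_matrix, dist_coeffs)
R, _ = cv2.Rodrigues(rvec)  # axis-angle to 3x3 rotation matrix
```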
To improve the model's learning under occlusion, a new mask-based data
augmentation approach is proposed.
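The augmentation is only named in the abstract. As a rough sketch of the general idea, assuming simple rectangular noise occluders (the paper's masks may well be shaped differently), such an augmentation could look like this:

```python
import numpy as np

def random_occlusion(image, rng=None, min_frac=0.1, max_frac=0.4):
    """Hypothetical mask-based occlusion augmentation: paint a random
    rectangle with noise to simulate an occluding object. The mask
    shapes and fill used in the paper may differ."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    mh = int(rng.uniform(min_frac, max_frac) * h)
    mw = int(rng.uniform(min_frac, max_frac) * w)
    y = int(rng.integers(0, h - mh))
    x = int(rng.integers(0, w - mw))
    out = image.copy()
    noise_shape = (mh, mw) + image.shape[2:]
    out[y:y + mh, x:x + mw] = rng.integers(0, 256, size=noise_shape,
                                           dtype=image.dtype)
    return out

# Example: augment a dummy 480x640 RGB frame before training.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
augmented = random_occlusion(frame)
```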
To validate our model, a dataset for instrument pose estimation with
highly accurate ground-truth data is generated using different surgical
robotic instruments. The proposed network achieves submillimetre
accuracy, and our experimental results verify its generalisability to
occlusions of different shapes.