0 INTRODUCTION

During the lifting operation of a port crane, the grab boom must be positioned at a designated location, and its angle and direction must be controlled accurately so that it can grasp and transport the load. If the position or angle of the grab boom deviates too far, the result can be low efficiency, inaccurate operation, and even major safety accidents caused by operational errors. Using image processing to detect the corner coordinates of the crane’s grab boom makes it possible to monitor the position and direction of the boom in real time and to determine whether it is at the correct angle and position, thereby ensuring efficient and accurate alignment between the grab boom and the lifting holes of the segment beam body. Detecting these corner coordinates also lays the foundation for autonomous control and intelligent application of the crane[1-2]: by detecting and analyzing the corner coordinates, the crane can achieve automated control and intelligent decision-making, further improving its operational efficiency and safety.
Corner detection algorithms aim to find the planar coordinates of all possible corners in an image. When useful information is extracted from an image through feature extraction, edge detection, and similar methods, some useful information is inevitably lost or unnecessary interference is introduced. Corner detection algorithms usually take a global perspective and detect every possible corner in the input image, so they cannot directly provide the corner coordinates at specific positions. Scholars in China and abroad have conducted extensive research on corner detection, including Harris corner detection[3], Shi-Tomasi corner detection[4] (which builds on Harris corner detection), FAST corner detection[5-6], and the SIFT feature detection algorithm[7-9]. These algorithms are suitable for matching in massive feature databases. However, the above literature mainly focuses on detecting all corners in the image, and these methods cannot be used to localize corners at specified positions.
Semantic segmentation [10-14] based on deep learning segments images at the pixel level, distinguishing different objects, object boundaries, and backgrounds in the image. Semantic segmentation can segment the crane’s grab boom automatically, but such methods often struggle to recognize and segment different objects accurately in complex scenes and for small targets. Deep-learning-based semantic segmentation also has to process a large amount of data and requires substantial computational resources, which increases cost. Before these methods can be applied, a large number of high-quality images usually have to be annotated and used for training, and the time cost of collecting images and annotating data is extremely high. If the quality of the annotation or of the image acquisition is poor, the trained model will have low accuracy, and segmentation may fail completely.
Compared with semantic segmentation, the Otsu algorithm is suitable for most images and can quickly and accurately classify an image into foreground and background. It does not require prior information and is robust to noise. Ashish et al. [15] introduced an optimal multi-level 3D Otsu image thresholding technique and proposed a 1-D-Otsu thresholding method based on the cuttlefish algorithm (CFA) to reduce noise and weak-edge effects, optimizing the traditional Otsu algorithm for color image segmentation. Jiqing Chen et al. [16] proposed a navigation path extraction method for greenhouse cucumber harvesting robots based on a predicted-point Hough transform. A new grayscale factor was used for image segmentation, and the predicted-point Hough transform was then used to fit the navigation path. The calculation time of this method was 35.20 ms shorter than that of the traditional Hough transform, but its grayscale factor is prone to oversegmenting the image. Ziwen Chen et al. [17] proposed a vegetable crop row extraction method based on an automatic Hough transform accumulation threshold. The image was segmented using an illumination-independent component of the Lab color space, and the feature points of the crop rows were extracted using a dual-threshold segmentation and vertical projection method. Finally, the points in the accumulator were grouped by k-means clustering into as many classes as there are crop rows. This method provides a basis for addressing the robustness and adaptability of algorithms under multiple environmental variables, but the accuracy of its line fitting needs to be improved.
The methods in the above literature do not directly address the application scenario and problems of precise calibration and alignment between the crane’s grab boom and the lifting holes of the segmental beam body. This article provides a new approach to corner positioning: the intersection coordinates of fitted straight lines are calculated to locate three corner points on the two sides of the crane’s grab boom. These three corner coordinates are then used to fit a plane and determine the position and direction of the crane’s grab boom in space.
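The two geometric steps named above, intersecting fitted lines and fitting a plane through three corner points, can be illustrated with a minimal sketch. It is not the implementation used in this article. It assumes that the fitted edge lines are available in the Hough normal form (rho, theta), and that the three corner points have already been mapped to 3D coordinates by some means not shown here. The function names `line_intersection` and `plane_from_points` are hypothetical.

```python
import math


def line_intersection(line_a, line_b, eps=1e-9):
    """Intersect two lines given in normal form x*cos(theta) + y*sin(theta) = rho.

    Returns (x, y), or None when the lines are (nearly) parallel.
    The (rho, theta) parameterization is an assumption of this sketch.
    """
    rho1, theta1 = line_a
    rho2, theta2 = line_b
    a1, b1 = math.cos(theta1), math.sin(theta1)
    a2, b2 = math.cos(theta2), math.sin(theta2)
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None  # parallel lines have no unique intersection
    # Cramer's rule for the 2x2 linear system
    x = (rho1 * b2 - rho2 * b1) / det
    y = (a1 * rho2 - a2 * rho1) / det
    return x, y


def plane_from_points(p1, p2, p3, eps=1e-12):
    """Plane through three 3D points as (unit normal n, offset d), with n.p + d = 0.

    Assumes the corner points are already expressed in 3D coordinates.
    """
    u = [b - a for a, b in zip(p1, p2)]
    v = [b - a for a, b in zip(p1, p3)]
    # The cross product u x v is perpendicular to both in-plane vectors
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(c * c for c in n))
    if length < eps:
        raise ValueError("the three points are collinear")
    n = tuple(c / length for c in n)
    d = -sum(nc * pc for nc, pc in zip(n, p1))
    return n, d


if __name__ == "__main__":
    # A vertical line x = 2 and a horizontal line y = 3
    print(line_intersection((2.0, 0.0), (3.0, math.pi / 2)))
    # Three points lying on the plane z = 1
    print(plane_from_points((0, 0, 1), (1, 0, 1), (0, 1, 1)))
```

Three non-collinear points determine the plane exactly, so no least-squares step is needed here. The unit normal gives the orientation of the boom face, and the offset fixes its position along that normal.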
Firstly, in the image preprocessing stage, a grayscale difference map is constructed from the R and G channels of the RGB color space. The resulting difference map avoids oversegmentation and undersegmentation of the target object and makes the grayscale histogram of the foreground and background bimodal, which is conducive to Otsu threshold segmentation. Morphological opening and closing operations are then used to remove small impurities from the Canny edge detection results. Secondly, in the detection and fitting of the edge lines of the crane’s grab boom, this paper proposes an optimal adaptive threshold determination method that screens the clustering results by number of votes and eliminates interfering straight lines. The calculation of the clustering centroid is then improved with a weighting formula based on each line’s proportion of votes, and the weighted centroid replaces the original clustering centroid as the basis for line fitting. Finally, in the corner detection and plane fitting stage, the coordinates of the three corner points of the crane’s grab boom are calculated, and the plane is determined from these three coordinates. The research results of this article provide a methodological basis for solving the accuracy and robustness problems of port crane algorithms under multiple environmental variables.
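The preprocessing idea can be made concrete with a minimal sketch: an R and G channel difference map followed by Otsu thresholding. It is not the implementation used in this article. The exact form of the difference map is not given in this section, so the sketch assumes a clipped R − G difference on 8-bit pixels. The function names `rg_difference` and `otsu_threshold` are hypothetical.

```python
def rg_difference(pixels):
    """Map (r, g, b) 8-bit pixels to grey levels using a clipped R - G difference.

    The clipped R - G form is an assumption of this sketch.
    """
    return [max(0, min(255, r - g)) for r, g, _b in pixels]


def otsu_threshold(gray, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class and pixels with value > t the other.
    """
    hist = [0] * levels
    for value in gray:
        hist[value] += 1
    total = len(gray)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of grey levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break  # upper class empty, no further split is possible
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    # Greyish background pixels and reddish foreground pixels (made-up values)
    pixels = [(100, 100, 90), (105, 100, 90), (110, 100, 90),
              (200, 50, 40), (210, 50, 40), (220, 50, 40)]
    gray = rg_difference(pixels)
    t = otsu_threshold(gray)
    mask = [value > t for value in gray]
    print(gray, t, mask)
```

A bimodal histogram matters because Otsu's criterion chooses the split that best separates two class means. When the difference map pushes the background toward zero and the target toward high values, many thresholds between the two modes give the same clean separation.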