Halcon Learning Notes: OCR of Characters Arranged in a Circle or Printed in a Slanted Font

The previous post, Halcon Learning Notes (8): OCR Recognition, Preliminary Template Recognition and Generation of Training Documents, analyzed the main OCR routines that use a template and showed how to create your own training files. This post looks at what to do when the characters are arranged in a circle or the font is slanted.

Part 3: OCR of Characters Arranged in a Circle or Slanted

Circularly arranged characters: the ocr_cd_print_polar_trans routine

The example image (not reproduced here) shows a CD with a number printed along its outer ring.

How do we read digits arranged in a circle like this?

This routine shows how to map printed symbols that do not lie on a straight line, such as symbols arranged in a circle, into a rectangular coordinate system.
The two most valuable parts of the routine are the threshold segmentation and the coordinate transformation.
Step 1: Threshold segmentation. The image is first smoothed with mean_image. The smoothed image ImageMean and the original Image are then passed to dyn_threshold, which performs a local threshold segmentation.
I also tried choosing a global threshold directly with the histogram tool. Because the gray values of the region of interest are very close to those of the background, the result was very poor. So when a region of interest blends strongly into the background, the method used in this routine is worth trying.

mean_image (Image, ImageMean, 211, 211)
dyn_threshold (Image, ImageMean, RegionDynThresh, 15, 'dark')

This yields the region RegionDynThresh.
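The idea behind the local threshold can be sketched in plain Python. This is only an illustration, not HALCON code: the helper names mean_filter and dyn_threshold_dark are hypothetical, the border is handled by clipping the window (an assumption), and the 'dark' rule is taken to be "pixel <= local mean - offset".

```python
def mean_filter(img, half):
    """Box mean over a (2*half+1)-square window, clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            rows = range(max(0, r - half), min(h, r + half + 1))
            cols = range(max(0, c - half), min(w, c + half + 1))
            vals = [img[i][j] for i in rows for j in cols]
            out[r][c] = sum(vals) / len(vals)
    return out


def dyn_threshold_dark(img, mean, offset):
    """Select pixels that are at least `offset` darker than their local mean."""
    return {(r, c)
            for r, row in enumerate(img)
            for c, g in enumerate(row)
            if g <= mean[r][c] - offset}


# Background brightens from left to right; two marks are 40 gray values darker.
img = [[100 + 10 * c for c in range(9)] for _ in range(5)]
img[2][2] -= 40
img[2][6] -= 40
print(sorted(dyn_threshold_dark(img, mean_filter(img, 1), 15)))  # [(2, 2), (2, 6)]
```

In this toy image the right-hand mark has gray value 120, the same as the background in column 2, so no single global threshold separates both marks from the background. Comparing each pixel with its own neighborhood does.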

Next comes the usual step: splitting the region into its connected components.

connection (RegionDynThresh, ConnectedRegions)

After splitting, one region is selected with select_shape_std, which selects regions by a standard shape feature. The possible features include 'max_area', 'rectangle1' and 'rectangle2'. Here the region with the largest area is selected.

select_shape_std (ConnectedRegions, SelectedRegions, 'max_area', 0)

The largest region is the region we are interested in.
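The combination of "split into connected components, then keep the largest" can be sketched in plain Python as follows. The function names are hypothetical and a region is represented simply as a set of (row, col) pixels. The sketch assumes 8-connectivity.

```python
def connected_components(pixels):
    """Split a set of (row, col) pixels into 8-connected components."""
    remaining = set(pixels)
    components = []
    while remaining:
        seed = remaining.pop()
        comp, stack = {seed}, [seed]
        while stack:
            r, c = stack.pop()
            # Visit all eight neighbors that still belong to the region.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    n = (r + dr, c + dc)
                    if n in remaining:
                        remaining.remove(n)
                        comp.add(n)
                        stack.append(n)
        components.append(comp)
    return components


def largest(components):
    """Return the component with the most pixels."""
    return max(components, key=len)


region = {(0, 0), (0, 1), (1, 1), (5, 5), (9, 9), (9, 10)}
parts = connected_components(region)
print(len(parts), sorted(largest(parts)))
```

Selecting by pixel count works here because the object of interest is much larger than anything else the threshold picked up.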

Next, gen_contour_region_xld generates the boundary contour of the selected region. fit_circle_contour_xld then fits a circle to this contour and returns the center coordinates (Row, Column) and the radius Radius of the circle.

gen_contour_region_xld (SelectedRegions, Contours, 'border')
fit_circle_contour_xld (Contours, 'ahuber', -1, 0, 0, 3, 2, Row, Column, Radius, StartPhi, EndPhi, PointOrder)

Then two circles are generated around this center, an outer circle and an inner circle. Their difference is the ring region.

gen_circle (CircleO, Row, Column, Radius - 5)
gen_circle (CircleI, Row, Column, Radius - 30)
difference (CircleO, CircleI, Ring)

Step 2: Coordinate transformation

The operator polar_trans_image_ext converts the image from polar to rectangular coordinates. Its inputs are the center of the polar coordinate system, the start and end angles of the transformation, the start and end radii, and the size of the output image:

polar_trans_image_ext (Image, ImagePolar, Row, Column, 0, rad(360), Radius - 30, Radius - 5, WidthP, HeightP, 'bilinear')

This gives the transformed image.

Rotating the transformed image by 180 degrees gives the part we need.

dev_open_window (0, 0, WidthP, HeightP, 'black', WindowHandle2)
rotate_image (ImagePolar, ImageRotate, 180, 'constant')

Here, WidthP and HeightP are defined in advance. They are the width and height of the region of interest after it has been unrolled.
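To make the geometry concrete, here is a minimal Python sketch of the mapping from a pixel of the unrolled image back to the source image. The function polar_to_source is hypothetical, and the exact sampling of the end points (division by width - 1 and height - 1) is an assumption rather than HALCON's documented behavior. Angles are taken as counterclockwise, with image rows growing downward.

```python
import math


def polar_to_source(u, v, row, col, angle_start, angle_end,
                    radius_start, radius_end, width, height):
    """Map output pixel (column u, row v) to (row, column) in the source image."""
    angle = angle_start + (angle_end - angle_start) * u / (width - 1)
    radius = radius_start + (radius_end - radius_start) * v / (height - 1)
    # Rows grow downward, so a counterclockwise angle subtracts from the row.
    return row - radius * math.sin(angle), col + radius * math.cos(angle)


# Center (100, 100), ring from radius 70 to 95, unrolled to 900 x 20 pixels.
print(polar_to_source(0, 0, 100, 100, 0, 2 * math.pi, 70, 95, 900, 20))
print(polar_to_source(0, 19, 100, 100, 0, 2 * math.pi, 70, 95, 900, 20))
```

With these conventions the top row of the unrolled image corresponds to the inner radius and the columns run counterclockwise. If the text is printed clockwise with the character tops facing outward (an assumption about the CD image), it comes out upside down and in reverse order, and a single 180 degree rotation corrects both.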

The last step is to segment the digits and read them with an OCR classifier.
The first half of the segmentation is similar to Step 1, and the second half is the familiar intersection operation. After that, the classifier is read and applied to the segmented characters. I will not elaborate on this part.

Complete procedure:

* 
* This example demonstrates how to perform OCR
* of symbols printed along a non-linear pattern.
* In particular, this example reads the characters
* printed on a CD
* 
dev_update_off ()
dev_close_window ()
WidthP := 900
HeightP := 20
read_image (Image, 'ocr/cd_print')
get_image_size (Image, Width, Height)
dev_open_window (HeightP + 60, 0, Width * 2 / 3, Height * 2 / 3, 'black', WindowHandle)
set_display_font (WindowHandle, 16, 'mono', 'true', 'false')
* 
* Show original image
dev_display (Image)
disp_message (WindowHandle, 'Read the number on the outer ring', 'window', 12, 12, 'black', 'true')
disp_continue_message (WindowHandle, 'black', 'true')
stop ()
* 
* Segment disc in which the characters have been printed
mean_image (Image, ImageMean, 211, 211)
dyn_threshold (Image, ImageMean, RegionDynThresh, 15, 'dark')
connection (RegionDynThresh, ConnectedRegions)
select_shape_std (ConnectedRegions, SelectedRegions, 'max_area', 0)
gen_contour_region_xld (SelectedRegions, Contours, 'border')
fit_circle_contour_xld (Contours, 'ahuber', -1, 0, 0, 3, 2, Row, Column, Radius, StartPhi, EndPhi, PointOrder)
gen_circle (CircleO, Row, Column, Radius - 5)
gen_circle (CircleI, Row, Column, Radius - 30)
difference (CircleO, CircleI, Ring)
* 
dev_set_draw ('margin')
dev_set_color ('green')
dev_set_line_width (3)
dev_display (Ring)
Message := '1. Segment ring'
disp_message (WindowHandle, Message, 'window', 36, 12, 'black', 'true')
disp_continue_message (WindowHandle, 'black', 'true')
stop ()
* 
* Rectify the region through a polar transformation
* so that the characters now are aligned along an
* horizontal line
polar_trans_image_ext (Image, ImagePolar, Row, Column, 0, rad(360), Radius - 30, Radius - 5, WidthP, HeightP, 'bilinear')
dev_open_window (0, 0, WidthP, HeightP, 'black', WindowHandle2)
rotate_image (ImagePolar, ImageRotate, 180, 'constant')
* 
* Segment the characters
mean_image (ImageRotate, ImageMeanRotate, 51, 9)
dyn_threshold (ImageRotate, ImageMeanRotate, RegionDynThreshChar, 5, 'dark')
connection (RegionDynThreshChar, ConnectedRegions1)
select_shape (ConnectedRegions1, SelectedRegions, ['area','width'], 'and', [30,4], [150,10])
sort_region (SelectedRegions, SortedRegions, 'character', 'false', 'column')
* Remove distractors which happen to have similar dimensions to the characters.
* From all the candidate regions pickup those consisting of dark regions
* on light background
threshold (ImageMeanRotate, Region, 90, 255)
intersection (SelectedRegions, Region, RegionIntersection)
* Filter out resulting empty regions
area_center (RegionIntersection, Area, Row1, Column1)
select_mask_obj (RegionIntersection, Characters, Area [>] 0)
* 
dev_display (ImageRotate)
Message := [Message,'2. Calculate polar transform']
disp_message (WindowHandle, Message, 'window', 36, 12, 'black', 'true')
disp_continue_message (WindowHandle, 'black', 'true')
stop ()
* 
* Read out
read_ocr_class_mlp ('Industrial_0-9A-Z_NoRej', OCRHandle)
sort_region (Characters, SortedRegions, 'character', 'true', 'row')
do_ocr_multi_class_mlp (SortedRegions, ImageRotate, OCRHandle, Class, Confidence)
* Correct zeros that are mistaken as capital O's.
* In a more general situation one may use a more
* complex regular expression or else the operator
* do_ocr_word_mlp()
tuple_regexp_replace (sum(Class), 'O', '0', Result)
* 
dev_set_colored (6)
dev_set_draw ('fill')
dev_display (RegionIntersection)
Message := [Message,'3. Segment and read text']
disp_message (WindowHandle, Message, 'window', 36, 12, 'black', 'true')
disp_message (WindowHandle, Result, 'image', Height / 2 - 20, Width / 2 - 150, 'black', 'true')

Slanted font: the text_line_slant routine

How should OCR read slanted characters like the ones in the example image?

In Halcon Learning Notes (5): Geometric Positioning + Affine + License Plate Recognition, we used a rotation matrix in an affine transformation to correct skewed characters. Slanted characters are handled in the same way. The difference is that we use the text_line_slant operator to obtain the slant angle SlantAngle of the characters in the region. Let's see how it works.

Step 1: Affine transformation
First we get the slant angle:
text_line_slant (Image, Image, 50, rad(-45), rad(45), SlantAngle)
Then hom_mat2d_slant builds the affine matrix from the identity matrix created by hom_mat2d_identity. In other words, a slant (a shear, not a rotation) of -SlantAngle is added to the identity matrix, and applying the resulting matrix to the image corrects the slanted characters.

hom_mat2d_identity (HomMat2DIdentity)
hom_mat2d_slant (HomMat2DIdentity, -SlantAngle, 'x', 0, 0, HomMat2DSlant)
affine_trans_image (Image, ImageRectified, HomMat2DSlant, 'constant', 'true')

This gives the rectified image, in which the characters stand upright.
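The effect of a slant correction can be illustrated with a generic horizontal shear in Python. This is a geometric sketch only: the function deslant is hypothetical, it uses the textbook shear x' = x - tan(angle) * y, and it does not reproduce the matrix parameterization or the axis and sign conventions of hom_mat2d_slant.

```python
import math


def deslant(points, slant_angle):
    """Shear points horizontally so strokes leaning by slant_angle become vertical.

    points: (x, y) pairs with y measured upward from the text baseline.
    slant_angle: lean of the strokes from the vertical, in radians
                 (positive means leaning to the right).
    """
    k = math.tan(slant_angle)
    return [(x - k * y, y) for x, y in points]


# A stroke that starts at x = 5 on the baseline and leans 20 degrees to the right.
angle = math.radians(20)
stroke = [(5 + math.tan(angle) * y, y) for y in (0, 10, 20)]
print(deslant(stroke, angle))
```

A shear moves each point sideways in proportion to its height and leaves the height itself unchanged, which is why it straightens slanted strokes without tilting the text line the way a rotation would.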

Step 2: Threshold segmentation of characters
First the characters are segmented. Because the background differs clearly from the characters, a global threshold chosen with the histogram assistant is enough.
threshold (ImageRectified, Region, 0, 100)
This gives the thresholded character region.

The last three characters turn out to be stuck together, so they are separated by eroding the region with erosion_circle:
erosion_circle (Region, RegionErosion, 3)

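After the separation, a dilation restores the basic shape. The rectangle used by dilation_rectangle1 is 1 pixel wide and 20 pixels high, so the region grows only vertically: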
dilation_rectangle1 (RegionErosion, RegionDilation, 1, 20)


Routine step: split the region into connected components. Nothing more to say here.
connection (RegionDilation, ConnectedRegions)
Routine step: intersect with the thresholded region.
intersection (ConnectedRegions, Region, RegionIntersection)

At this point some characters turn out to be broken into several connected components. What can we do? partition_dynamic divides the region horizontally into pieces of approximately the given width, cutting where the vertical extent of the region is small.
partition_dynamic (RegionDilation, Characters, 100, 20)

Now each character forms a single region.
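The splitting idea can be sketched in plain Python on a column profile. This is not HALCON's implementation. The function partition_columns is hypothetical, and its reading of the two numeric parameters (an approximate piece width in pixels, and the percentage of that width by which a cut may shift) is an assumption.

```python
def partition_columns(heights, distance, percent):
    """Return the column indices at which a region's column profile is cut.

    heights: vertical extent of the region in each column, left to right.
    distance: approximate width of one character, in pixels.
    percent: how far (in percent of distance) a cut may move from its ideal spot.
    """
    width = len(heights)
    parts = max(1, round(width / distance))
    step = width / parts
    slack = int(distance * percent / 100)
    cuts = []
    for k in range(1, parts):
        ideal = round(k * step)
        lo, hi = max(1, ideal - slack), min(width - 1, ideal + slack)
        # Cut where the region is thinnest; prefer the column nearest the ideal one.
        cuts.append(min(range(lo, hi + 1),
                        key=lambda c: (heights[c], abs(c - ideal))))
    return cuts


# A 30 pixel wide region with thin spots at columns 9 and 21.
profile = [8] * 30
profile[9], profile[21] = 1, 2
print(partition_columns(profile, 10, 20))  # [9, 21]
```

Because the cuts are placed by expected width rather than by connectivity, a character that the erosion broke into several fragments still ends up in one piece, provided the fragments lie within one character width.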

What follows is routine: sort the regions, read the OCR classifier, and classify the characters.
I will not elaborate on these steps.
The complete code is attached below.

dev_update_off ()
read_image (Image, 'dot_print_slanted')

get_image_size (Image, Width, Height)
dev_close_window ()
dev_open_window (0, 0, Width , Height , 'black', WindowHandle)
dev_set_draw ('margin')
dev_set_colored (12)
dev_set_line_width (3)

* Correct slant
text_line_slant (Image, Image, 50, rad(-45), rad(45), SlantAngle)
hom_mat2d_identity (HomMat2DIdentity)
hom_mat2d_slant (HomMat2DIdentity, -SlantAngle, 'x', 0, 0, HomMat2DSlant)
affine_trans_image (Image, ImageRectified, HomMat2DSlant, 'constant', 'true')
   
threshold (ImageRectified, Region, 0, 100)
erosion_circle (Region, RegionErosion, 3)
dilation_rectangle1 (RegionErosion, RegionDilation, 1, 20)
connection (RegionDilation, ConnectedRegions)
intersection (ConnectedRegions, Region, RegionIntersection)

partition_dynamic (RegionDilation, Characters, 100, 20)

sort_region (Characters, SortedRegions, 'character', 'true', 'row')

read_ocr_class_mlp ('Industrial_0-9A-Z_NoRej', OCRHandle)
do_ocr_multi_class_mlp (SortedRegions, ImageRectified, OCRHandle, Class, Confidence)

Posted on Tue, 13 Aug 2019 20:46:25 -0700 by sasi