Image recognition with Java in just three steps

Recently, in my spare time, I have been studying how to simulate browser behavior with Java. While experimenting with logging in, I ran into the verification code (CAPTCHA) problem, so I searched the Internet for how to recognize verification code images with Java. The articles I found did not fit my setup, so I am writing this post to record the pitfalls I hit and how I got past them.

For image recognition, Tesseract OCR can be used directly. However, that approach requires downloading the software and setting up its environment on each computer, so portability is poor. With Tess4J, you only need to download the relevant Jar package, import it into the project, and then package the project so it can run elsewhere.

First of all, here are my computer and JDK version:

  • Computer: MacBook
  • JDK version: 1.8

The steps are:

  1. Introducing the Tess4J Jar package
  2. Using brew to install tesseract
  3. Downloading the language pack

These three simple steps are all it takes to recognize verification code images with Java on the local machine. The sections below go through each step in detail.

Introducing the Tess4J Jar package

If you use Maven, add the dependency below:


<dependency> 
 <groupId>net.sourceforge.tess4j</groupId>
 <artifactId>tess4j</artifactId>
 <version>3.2.1</version> 
</dependency>

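If you use Gradle (on Gradle 7 and later, the compile configuration has been removed, so write implementation instead):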

compile 'net.sourceforge.tess4j:tess4j:3.2.1'
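To confirm that the dependency actually ended up on the classpath, a quick check like the one below may help. This is only a sketch: the class name ClasspathCheck and the method isOnClasspath are made up for illustration, and the only Tess4J detail it relies on is the fully qualified class name net.sourceforge.tess4j.Tesseract.

```java
// Hypothetical helper (not part of Tess4J): checks whether a class can be
// located on the classpath without initializing it.
final class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Fully qualified name of the Tess4J class used later in this post
        String tess4j = "net.sourceforge.tess4j.Tesseract";
        System.out.println(tess4j + " on classpath: " + isOnClasspath(tess4j));
    }
}
```

If this prints false, the Maven or Gradle dependency has probably not been resolved yet, and the OCR code will not compile either.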

Using brew to install tesseract

Directly use the command to install

brew install tesseract
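Once the install has finished, it can be useful to confirm from Java that the tesseract command is reachable. The sketch below assumes the binary is on the PATH of the Java process; CommandCheck and runsSuccessfully are illustrative names, not part of any library.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a command and reports whether it exited with status 0.
final class CommandCheck {

    static boolean runsSuccessfully(String... command) {
        try {
            // inheritIO() lets the command print its output to this console
            Process process = new ProcessBuilder(command).inheritIO().start();
            if (!process.waitFor(10, TimeUnit.SECONDS)) {
                process.destroyForcibly(); // give up on a command that hangs
                return false;
            }
            return process.exitValue() == 0;
        } catch (IOException e) {
            // Thrown when the command cannot be started, e.g. it is not installed
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("tesseract available: " + runsSuccessfully("tesseract", "--version"));
    }
}
```

A false result usually means the install did not complete or that the brew bin directory is missing from the PATH.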

However, downloads through brew were very slow for me. The fix I found was to switch Homebrew to a mirror:

# Step 1
cd "$(brew --repo)"
git remote set-url origin https://mirrors.tuna.tsinghua.edu.cn/git/homebrew/brew.git

# Step 2
cd "$(brew --repo)/Library/Taps/homebrew/homebrew-core"
git remote set-url origin https://mirrors.tuna.tsinghua.edu.cn/git/homebrew/homebrew-core.git

# Step 3
brew update

Note that this brew update takes a while, because the repositories have to be fetched from the new mirror.

Once the mirror is switched, brew update and brew install are much faster and no longer hang for half a day.

If you want to switch back to the original remotes:

cd "$(brew --repo)"
git remote set-url origin https://github.com/Homebrew/brew.git
 
cd "$(brew --repo)/Library/Taps/homebrew/homebrew-core"
git remote set-url origin https://github.com/Homebrew/homebrew-core
 
brew update

Download language pack

Download the language pack from GitHub and unzip it somewhere on disk. Then write the following code.

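import java.io.File;

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

// The class name is chosen here only so that the snippet compiles on its own.
public class ImageRecognition {

    // Runs OCR on the image at imageLocation and returns the recognized text.
    public static String getImgText(String imageLocation) {
        ITesseract instance = new Tesseract();
        // Directory where the downloaded language pack was unzipped
        instance.setDatapath("The path of the stored language pack");
        try {
            return instance.doOCR(new File(imageLocation));
        } catch (TesseractException e) {
            // Report the cause instead of silently discarding it
            System.err.println(e.getMessage());
            return "Error while reading image";
        }
    }

    public static void main(String[] args) {
        System.out.println(getImgText("Image address to identify"));
    }
}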
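A wrong path in the setDatapath call above only shows up once OCR is attempted, so it may be worth checking the language pack location first. The sketch below assumes that Tesseract language files are named <language>.traineddata and that "eng" is the language in use. Because the expected directory layout can differ between versions, it looks both in the given folder and in a tessdata subfolder. TessdataCheck and hasTrainedData are illustrative names, not part of Tess4J.

```java
import java.io.File;

// Hypothetical helper: checks that a language file exists where OCR will look for it.
final class TessdataCheck {

    // Looks for <language>.traineddata directly in dataPath or in dataPath/tessdata.
    static boolean hasTrainedData(File dataPath, String language) {
        String fileName = language + ".traineddata";
        return new File(dataPath, fileName).isFile()
                || new File(new File(dataPath, "tessdata"), fileName).isFile();
    }

    public static void main(String[] args) {
        // Placeholder: pass the real language pack location as the first argument
        File dataPath = new File(args.length > 0 ? args[0] : ".");
        System.out.println("eng.traineddata found: " + hasTrainedData(dataPath, "eng"));
    }
}
```

Calling hasTrainedData before getImgText gives a clearer failure than a generic OCR error when the language pack was unzipped to a different folder than expected.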

Now we can use Java for image recognition. Take the following picture as an example:

The output is:

Later, I found that this approach cannot be used to recognize verification codes: most verification codes today use hollow or irregular characters, which this setup cannot recognize, so another method is needed.

Code address involved in the project


Posted on Sun, 10 May 2020 02:52:06 -0700 by livepjam