This quick tutorial explains how to extract text from images using regular expressions. Imgregex is a free online tool where you can upload a photo, extract machine-readable text from it, and then grep that text. Here, grep-ing means using regular expressions to pull out the matching text. With this, you can easily extract phone numbers and email addresses from an image. One of the best use cases for this tool is extracting contact information from business card images. There is an API for this as well, but for now it is a work in progress.
There are many online OCR tools that extract text from images, but they don’t offer an option to filter the extracted text, so you have to do extra work to get the final data. Here you don’t need to do that. The Imgregex website takes an input image along with a regular expression and produces the final result in JSON format. You can use any regex to extract text that follows a pattern. However, don’t expect it to be 100 percent accurate for every image; the text extraction accuracy depends on the condition of the input image.
How to Extract Text from Images using Regular Expressions?
For now, you can use this website without any registration. Just open the homepage of the website and start using it. Upload or drag and drop the input image from which you want to extract text. Next, specify the regular expression that defines the pattern of the text to extract. Regular expressions are easy to learn but require practice; if you are new to them, an online RegEx tutorial is a good place to start.
After you have specified the regular expression, hit the “SEND” button to see the output in JSON. The output is displayed in the right pane. In my case, I uploaded a business card and specified a regular expression to extract a USA phone number, and it did a pretty good job.
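To give an idea of what such a regular expression can look like, here is a minimal Python sketch that runs the same kind of filtering locally on text that has already been extracted. The sample text, the names `US_PHONE`, `EMAIL` and `extract`, and the patterns themselves are assumptions made for illustration; they are not taken from Imgregex, and real business cards may need a looser or stricter pattern.

```python
import re

# Hypothetical OCR output from a business card (made-up sample data).
ocr_text = """Jane Doe
Acme Widgets, Inc.
Phone: (555) 123-4567
Fax: 555-987-6543
jane.doe@example.com"""

# One possible pattern for US-style phone numbers: an optional +1 prefix,
# a 3-digit area code with or without parentheses, then 3 and 4 digits,
# each part optionally separated by a space, dot, or hyphen.
US_PHONE = re.compile(
    r"(?:\+?1[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}"
)

# A deliberately simple email pattern; it does not cover every valid address.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract(pattern, text):
    """Return every non-overlapping match of pattern in text, in order."""
    return [m.group(0) for m in pattern.finditer(text)]


phones = extract(US_PHONE, ocr_text)
emails = extract(EMAIL, ocr_text)
print(phones)
print(emails)
```

The same pattern string can be tried in the regular expression field of the website, although its regex flavor may differ slightly from Python's.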
At this point, you know what this website is all about and how it works. You give it an input image along with a regular expression, then take the result and do what you want with it. The JSON results it creates can also be converted to other formats, such as CSV, using a JSON to CSV converter.
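If you would rather do the JSON to CSV step yourself, a small script is enough. The sketch below uses only the Python standard library. Note that the JSON shape, with a top-level `matches` list, is an assumption made for this example, and the function name `matches_to_csv` is hypothetical; check the actual output of the website and adjust the key accordingly.

```python
import csv
import io
import json

# Hypothetical JSON result; the real Imgregex output may be shaped differently.
raw = '{"matches": ["(555) 123-4567", "555-987-6543"]}'


def matches_to_csv(json_text, key="matches"):
    """Convert a JSON object holding a list of matches into CSV text."""
    data = json.loads(json_text)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["index", "match"])  # header row
    # An absent key is treated as "no matches" rather than an error.
    for i, match in enumerate(data.get(key, []), start=1):
        writer.writerow([i, match])
    return buf.getvalue()


csv_text = matches_to_csv(raw)
print(csv_text)
```

Writing `csv_text` to a file with a `.csv` extension gives you something you can open directly in a spreadsheet.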
Final thoughts
If you are looking for a way to extract text from an image and filter it at the same time, then you have come to the right place. Just use the website I have mentioned here: give it a clean image for higher accuracy and get the output in a couple of seconds. It is as simple as that.