Adding multilanguage selection #418

Open
vivadavid opened this issue Jan 20, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@vivadavid

Hello,

I'd like to suggest adding the possibility of selecting more than one OCR language.

Speaking from personal experience, the text in the vast majority of my images is in English or Spanish, so it'd save me time if I could keep both Spanish and English permanently selected, unless I know I'll be sticking to one language for a good while.

In many cases, switching from one language to another for each individual image might not be worth it, and this multilanguage support would prove even more useful when using the Extract Images from Files in Folder tool, as you may be dealing with many different images, each in a particular language.

I can also think of a scenario where a number of images contain multilingual text, but this might not happen frequently.

As I rarely use a different language from Spanish and English, this feature would be good enough for me, but another (complementary) approach to consider would be to add an automatic mode where the OCR engine detects the language. I'm not sure that I have seen this in Tesseract, but I saw it in Whisper the other day and I thought it was a good idea.

Anyway, these are my suggestions, hoping that you may take them into account for a future release.

Thank you for your time.

@vivadavid vivadavid added the enhancement New feature or request label Jan 20, 2024
@morozover

I need this feature too 👍

@TheJoeFin
Owner

TheJoeFin commented Apr 7, 2024

Just to lay out what is possible versus what would be ideal: Tesseract can do multi-language OCR, but the Windows OCR API cannot. So if this feature were implemented, it would only be available during FullScreen Grab and batch processing through the Edit Text Window.
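
For anyone curious what the Tesseract side could look like: the Tesseract command line accepts several languages joined with `+` in its `-l` option (for example `-l eng+spa`). Below is a minimal sketch, in Python for illustration only, that builds such a command from a list of selected languages. The helper names `build_lang_arg` and `build_tesseract_command` are hypothetical and are not part of this project, and the sketch says nothing about how the app itself invokes Tesseract.

```python
def build_lang_arg(languages):
    """Join Tesseract language codes with '+', skipping blanks and duplicates.

    Hypothetical helper for illustration; the order of the codes is preserved.
    """
    selected = []
    for code in languages:
        code = code.strip()
        if code and code not in selected:
            selected.append(code)
    if not selected:
        raise ValueError("at least one language code is required")
    return "+".join(selected)


def build_tesseract_command(image_path, languages, exe="tesseract"):
    """Return the argument list for a multi-language Tesseract run.

    Using "stdout" as the output base makes Tesseract print the recognized
    text instead of writing it to a file. The `exe` default assumes the
    tesseract binary is on PATH.
    """
    return [exe, image_path, "stdout", "-l", build_lang_arg(languages)]


if __name__ == "__main__":
    # Example: an image that may contain English and/or Spanish text.
    print(build_tesseract_command("scan.png", ["eng", "spa"]))
```

The resulting list can be passed to something like `subprocess.run`, provided the matching traineddata files for each language are installed.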

@vivadavid
Author

Hi! That would be perfect for me, and probably great for many users too.
