This is a tool for the optical character recognition (OCR) of medieval texts,
developed as part of [the AHISTO TACR Eta grant project][7], number TL03000365.

## Installation

To install the tool, perform the following steps:

1. [Install Python][4], [Docker][5], and [NVIDIA Container Toolkit][9].
2. Run `make build` from the command line to build the Docker images required by the tool.
3. [Create and activate a Python virtualenv][6] and run `make install` from the command line to install the tool.
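
Before running the steps above, it can help to confirm that the command-line prerequisites are on your `PATH`. The following is a minimal sketch rather than part of the tool: the helper name `missing_commands` and the default command names (`python3`, `docker`, `make`) are assumptions, and it does not verify the NVIDIA Container Toolkit.

```python
import shutil
from typing import Iterable, List


def missing_commands(commands: Iterable[str] = ("python3", "docker", "make")) -> List[str]:
    """Return the commands that cannot be found on PATH."""
    return [command for command in commands if shutil.which(command) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All checked prerequisites were found.")
```

If the list is empty, `make build` and `make install` should at least be able to start; a GPU setup still needs to be checked separately.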

## Usage

To quickly recognize text in all images in the `input/` directory and save the
results to the `output/` directory:

    ahisto-ocr input/ output/

To achieve the best results, you should also enable image super-resolution
(requires a GPU) and Google Vision AI (requires a paid account):

    ahisto-ocr --super-resolution --google-vision-ai --gpus 10,11,12 --google-api-key key_file input/ output/

Here is an example of the tool's output for two images:

    2022-03-01 08:50:21,494   Copying 2 input images from input/ to Docker volume 2c4abf61c7
    2022-03-01 08:50:21,609   Pre-processing the input images using super-resolution
    2022-03-01 08:50:33,551   Running the first pass of Tesseract with all languages
    2022-03-01 08:51:15,292   Running the second pass of Tesseract with selected languages
    2022-03-01 08:51:56,883   Running Google Vision AI
    2022-03-01 08:51:59,430   Combining Tesseract with Google Vision AI
    2022-03-01 08:52:00,545   Copying 2 OCR texts from Docker volume 9fcf8bcc80 to output/
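
To see which stage dominates the running time, you can measure the gaps between consecutive log lines. The sketch below assumes that the log has been captured as text in the format shown above (a timestamp with comma-separated milliseconds, followed by a message). The helper names `parse_line` and `stage_durations` are illustrative and not part of the tool.

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def parse_line(line: str) -> Tuple[datetime, str]:
    """Split one log line into its timestamp and its message."""
    date, time, message = line.strip().split(maxsplit=2)
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S,%f")
    return stamp, message


def stage_durations(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Pair each message with the seconds elapsed until the next log line.

    The last line has no successor, so it is not included in the result.
    """
    parsed = [parse_line(line) for line in lines if line.strip()]
    return [
        (message, (next_stamp - stamp).total_seconds())
        for (stamp, message), (next_stamp, _) in zip(parsed, parsed[1:])
    ]
```

For the example above, both Tesseract passes take roughly 42 seconds each, super-resolution about 12 seconds, and the remaining steps a few seconds at most.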

Run `ahisto-ocr --help` from the command line for more information.
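
If you want to call the tool from a Python script, for example to process several directories, the two invocations above can be assembled programmatically. This is a minimal sketch that uses only the options shown in this section. The function names are hypothetical, and coupling `--gpus` to `--super-resolution` and `--google-api-key` to `--google-vision-ai` is an assumption based on the example command; check `ahisto-ocr --help` for the actual rules.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(
    input_dir: str,
    output_dir: str,
    gpus: Optional[Sequence[int]] = None,
    google_api_key: Optional[str] = None,
) -> List[str]:
    """Assemble an ahisto-ocr command line, mirroring the examples above."""
    command = ["ahisto-ocr"]
    if gpus:
        command.append("--super-resolution")
    if google_api_key is not None:
        command.append("--google-vision-ai")
    if gpus:
        # GPU identifiers are passed as one comma-separated value.
        command += ["--gpus", ",".join(str(gpu) for gpu in gpus)]
    if google_api_key is not None:
        # Assumption: the value is a path to a key file, as in the example.
        command += ["--google-api-key", google_api_key]
    command += [input_dir, output_dir]
    return command


def run_ocr(input_dir: str, output_dir: str, **options) -> None:
    """Run the tool and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(input_dir, output_dir, **options), check=True)
```

For instance, `run_ocr("input/", "output/", gpus=[10, 11, 12], google_api_key="key_file")` would run the second command shown above.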

## More Information

The development of the tool has been documented in the following two conference articles:

- [When Tesseract Does It Alone: Optical Character Recognition of Medieval Texts][8]
- [When Tesseract Brings Friends: Layout Analysis, Language Identification, and
   Super-Resolution for the Optical Character Recognition of Medieval Texts][3]

## Notes

The file [`when-tesseract-brings-friends.ipynb`][1], which contains the OCR
evaluation experiments from [the RASLAN 2021 article When Tesseract Brings Friends][3],
is available in [the `ahisto-ocr-eval` repository][2].

 [1]: https://gitlab.fi.muni.cz/xnovot32/ahisto-ocr-eval/-/blob/master/docs/when-tesseract-brings-friends.ipynb
 [2]: https://gitlab.fi.muni.cz/xnovot32/ahisto-ocr-eval
 [3]: https://nlp.fi.muni.cz/raslan/2021/paper10.pdf
 [4]: https://www.python.org/downloads/
 [5]: https://docs.docker.com/engine/install/
 [6]: https://docs.python.org/3/library/venv.html
 [7]: https://starfos.tacr.cz/en/project/TL03000365
 [8]: https://nlp.fi.muni.cz/raslan/2020/paper1.pdf
 [9]: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker