Optical Character Recognition Experiments
Graph
bfd9301f545c3a6cfe43a40a0ed46068e89f4cf6
Add ALTO output for PERO OCR
manylang
Set up Docker image deployment
Make `*.ground-truth` files more human-readable
Ensure `*.ground-truth` always exists after combining
Use PERO OCR basenames for output
Do not copy /pero-ocr/*.txt to /output of PERO OCR unless the page is single-column
Add `scripts/combine_tesseract_with_pero_ocr_docker.py`
Add scripts.combine_tesseract_with_google_docker
master
Update Dockerfile
Update Dockerfile
Harden scripts.extract_detected_languages against flat directory structures
Update Dockerfile
Update Dockerfile
Update Dockerfile
Update Dockerfile
Update Dockerfile
Update Dockerfile
Update Dockerfile
Update Dockerfile
Fix Dockerfile
Fix Dockerfile
Fix Dockerfile
Fix Dockerfile
Add Dockerfile
Add best-ocr-texts directory
Subdivide LINDAT dataset into smaller archives
Produce better annotated HOCR files using templates
Update notebook images for the RASLAN paper
Replace CDBT6 with CDB VI in notebook
Emphasize best results in notebook
Report Accuracy@1 in notebook instead of Spearman's rho
Fix type and style errors
Import packages more lazily to guard against SIGSEGVs
Add scripts.produce_lindat_dataset