Commit 2305a53f authored by Vít Starý Novotný

Harden scripts.extract_detected_languages against flat directory structures

parent 12beefbb
...
@@ -27,11 +27,15 @@ THRESHOLD = float(sys.argv[5]) / 100.0
 def get_languages_worker(filename):
-    basename = str(INPUT_OCR_ROOT / filename.parent / filename.stem)
+    basename = INPUT_OCR_ROOT / filename.stem
     try:
-        languages = read_page_languages(basename, DETECTED_LANGUAGES, algorithm='OLDA')
+        languages = read_page_languages(str(basename), DETECTED_LANGUAGES, algorithm='OLDA')
     except IOError:
-        return 'not-exists'
+        basename = INPUT_OCR_ROOT / filename.parent / filename.stem
+        try:
+            languages = read_page_languages(str(basename), DETECTED_LANGUAGES, algorithm='OLDA')
+        except IOError:
+            return 'not-exists'
     return (filename, languages)
...
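The change above tries the flat location first and falls back to the nested one only when the first read raises IOError. A minimal sketch of the same lookup order, written as a loop over candidate paths instead of nested try blocks, might look like the following. The names `candidate_basenames` and `read_first_available` are hypothetical and not part of the commit, and `reader` stands in for a call such as `read_page_languages`, whose real signature is not assumed here.

```python
from pathlib import Path


def candidate_basenames(root, filename):
    """Yield the flat location first, then the nested one (hypothetical helper)."""
    filename = Path(filename)
    yield Path(root) / filename.stem
    yield Path(root) / filename.parent / filename.stem


def read_first_available(root, filename, reader):
    """Return reader(path) for the first candidate that does not raise IOError.

    Falls back to the string 'not-exists' when every candidate fails,
    mirroring the return value used in the diff above.
    """
    for basename in candidate_basenames(root, filename):
        try:
            return reader(str(basename))
        except IOError:
            continue  # try the next candidate location
    return 'not-exists'
```

When `filename` has no parent directory, both candidates resolve to the same path, so the reader is simply tried twice before giving up. Adding a third layout would only require another `yield` in `candidate_basenames`.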