Michal Štefánik / ARQMath-eval
Commit 86b78c09, authored May 12, 2020 by Vít Novotný
Revert "Evaluate math concatenation"
Parent: 97b2d1a0
Pipeline #60695 failed
Changes: 10, Pipelines: 1
task1-votes/xnovot32/LEGEND.md
...
...
@@ -2,7 +2,6 @@ The [SCM system][scm-at-arqmath] recogizes the following parameters:
 - Dataset:
   - arxmliv, 08, 2019, no-problem – the no\_problem subset (150,701 documents) of [the arXMLiv 08.2019 dataset][arxmliv-08-2019]
-- concat-math – whether adjacent math tokens are contatenated into mathematical expressions
 - phrases – how many times [collocation detection][] and bigram merging are iteratively applied to the corpus:
   - 0 – the text and math tokens in the corpus are unchanged,
   - N – [collocation detection][] and bigram merging are iteratively applied to both text and math tokens in the corpus N times
...
...
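The `phrases` parameter in the legend above describes running collocation detection and bigram merging N times over the corpus. The following is a minimal sketch of that loop, not the SCM system's actual code: a plain frequency threshold stands in for whatever collocation scoring the system uses, and the names `merge_bigrams_once`, `apply_phrases`, `min_count`, and the `_` joiner are all assumptions made for illustration.

```python
from collections import Counter
from typing import List


def merge_bigrams_once(corpus: List[List[str]], min_count: int = 2) -> List[List[str]]:
    """Merge adjacent token pairs that occur at least `min_count` times.

    Assumption: a raw frequency threshold replaces real collocation scoring.
    """
    counts = Counter(
        (left, right)
        for sentence in corpus
        for left, right in zip(sentence, sentence[1:])
    )
    merged_corpus = []
    for sentence in corpus:
        merged, i = [], 0
        while i < len(sentence):
            pair = tuple(sentence[i:i + 2])
            if len(pair) == 2 and counts[pair] >= min_count:
                merged.append("_".join(pair))  # hypothetical joiner
                i += 2  # both tokens of the pair are consumed
            else:
                merged.append(sentence[i])
                i += 1
        merged_corpus.append(merged)
    return merged_corpus


def apply_phrases(corpus: List[List[str]], phrases: int, min_count: int = 2) -> List[List[str]]:
    """Apply the merge step `phrases` times; 0 leaves the corpus unchanged."""
    for _ in range(phrases):
        corpus = merge_bigrams_once(corpus, min_count)
    return corpus
```

With `phrases=0` the tokens pass through untouched, and each further iteration can merge already merged tokens, so N passes can produce phrases of up to 2^N original tokens.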
task1-votes/xnovot32/README.md
...
...
@@ -4,13 +4,12 @@ underscores (`_`) replaced with a comma and a space for improved readability.
 | nDCG | Result name |
 |------|:------------|
-| 0.7613 | infix, concat-math=False, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, concat-math=False, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7607 | slt, concat-math=False, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7606 | opt, concat-math=False, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7603 | infix, concat-math=True, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7602 | latex, concat-math=False, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7600 | nomath, concat-math=False, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | infix, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7607 | slt, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7606 | opt, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7602 | latex, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7600 | nomath, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
 | *0.7578* | *random* |
 ## Legend
...
...
@@ -19,7 +18,6 @@ The [SCM system][scm-at-arqmath] recogizes the following parameters:
 - Dataset:
   - arxmliv, 08, 2019, no-problem – the no\_problem subset (150,701 documents) of [the arXMLiv 08.2019 dataset][arxmliv-08-2019]
-- concat-math – whether adjacent math tokens are contatenated into mathematical expressions
 - phrases – how many times [collocation detection][] and bigram merging are iteratively applied to the corpus:
   - 0 – the text and math tokens in the corpus are unchanged,
   - N – [collocation detection][] and bigram merging are iteratively applied to both text and math tokens in the corpus N times
...
...
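The README hunk header above says result names have underscores (`_`) replaced with a comma and a space for readability. A minimal sketch of that mapping follows, assuming the result name is simply the stem of the corresponding `.tsv` file; the function name `result_name` is hypothetical and not taken from the repository.

```python
from pathlib import PurePosixPath


def result_name(path: str) -> str:
    """Turn a result file path into the readable name shown in the README table.

    Assumption: the name is the file stem with every "_" replaced by ", ".
    """
    stem = PurePosixPath(path).stem  # drops the directory and the final ".tsv"
    return stem.replace("_", ", ")


print(result_name("task1-votes/xnovot32/slt_phrases=0_alpha=0.05.tsv"))
```

The path in the usage line is a shortened, illustrative file name; the real file names in this commit carry the full parameter list.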
task1-votes/xnovot32/infix_concat-math=True_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
deleted, 100644 → 0
This diff is collapsed.
task1-votes/xnovot32/infix_concat-math=False_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
→ task1-votes/xnovot32/infix_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
File moved
task1-votes/xnovot32/latex_concat-math=False_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
→ task1-votes/xnovot32/latex_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
File moved
task1-votes/xnovot32/nomath_concat-math=False_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
→ task1-votes/xnovot32/nomath_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
File moved
task1-votes/xnovot32/opt_concat-math=False_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
→ task1-votes/xnovot32/opt_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
File moved
task1-votes/xnovot32/prefix_concat-math=False_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
→ task1-votes/xnovot32/prefix_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
File moved
task1-votes/xnovot32/slt_concat-math=False_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
→ task1-votes/xnovot32/slt_phrases=0_alpha=0.05_bucket=2000000_iter=5_max-n=6_min-alpha=0_min-count=5_min-n=3_negative=5_sample=0.0001_sg=1_size=300_window=5_workers=64_dominant=True_nonzero-limit=100_symmetric=True_exponent=4.0_threshold=-1.0.tsv
File moved
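Each of the moves above drops the `concat-math=False_` component from an otherwise unchanged file name. The sketch below shows one way such a rename could be computed; it is an illustration only, the helper name `drop_parameter` is hypothetical, and it assumes that parameters are separated by `_` and written as `name=value`, as in the paths listed here.

```python
from pathlib import PurePosixPath


def drop_parameter(path: str, parameter: str = "concat-math") -> str:
    """Return `path` with the `parameter=value` component removed from the file name.

    Assumption: components are joined by "_" and no value contains "_".
    """
    p = PurePosixPath(path)
    kept = [part for part in p.name.split("_") if not part.startswith(f"{parameter}=")]
    return str(p.with_name("_".join(kept)))


# Shortened, illustrative path; the real names carry the full parameter list.
print(drop_parameter("task1-votes/xnovot32/slt_concat-math=False_phrases=0.tsv"))
```

Only the file name is split, so directory names that happen to contain underscores are left alone.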
task1-votes/xstefan3/README.md
...
...
@@ -4,8 +4,6 @@ underscores (`_`) replaced with a comma and a space for improved readability.
| nDCG | Result name |
|------|:------------|
| 0.7796 | sbert, validation, html-removal, exid9 |
| 0.7653 | sbert, validation, nopreproc, exid4 |
| 0.7651 | sbert, validation, vit, preproc, prefix, exid26 |
| 0.7603 | sbert, validation, no-token-type, datav1.0, exid25 |
| 0.7602 | sbert, validation, prefix, datav1.0, exid23 |
...
...