task1-votes/README.md +2 −2
@@ -3,6 +3,6 @@ This table contains the best result for every user on the *task1-votes* task.
 | nDCG | Result name | User |
 |:-----|:------------|------|
 | 0.7796 | sbert, validation, html-removal, exid9 | xstefan3 |
-| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=800, symmetric=True, exponent=4.0, threshold=-1.0 | xnovot32 |
-| 0.7604 | prefix, phrases=2, alpha=0.1, dm=0, dm-concat=1, epochs=5, hs=0, min-alpha=0, min-count=5, negative=12, vector-size=300, window=8, workers=64 | ayetiran |
+| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=800, symmetric=True, exponent=4.0, threshold=-1.0 | xnovot32 |
+| 0.7604 | prefix, phrases=2, alpha=0.1, dm=0, dm-concat=1, epochs=5, hs=0, min-alpha=0, min-count=5, negative=12, vector-size=300, window=8 | ayetiran |
 | *0.7578* | *random* | *Mr. Random* |

task1-votes/ayetiran/README.md +2 −4
@@ -4,8 +4,8 @@ underscores (`_`) replaced with a comma and a space for improved readability.
 | nDCG | Result name |
 |------|:------------|
-| 0.7604 | prefix, phrases=2, alpha=0.1, dm=0, dm-concat=1, epochs=5, hs=0, min-alpha=0, min-count=5, negative=12, vector-size=300, window=8, workers=64 |
-| 0.7579 | prefix, phrases=2, alpha=0.05, dm=1, dm-concat=1, epochs=5, hs=1, min-alpha=0, min-count=5, vector-size=400, window=4, workers=64 |
+| 0.7604 | prefix, phrases=2, alpha=0.1, dm=0, dm-concat=1, epochs=5, hs=0, min-alpha=0, min-count=5, negative=12, vector-size=300, window=8 |
+| 0.7579 | prefix, phrases=2, alpha=0.05, dm=1, dm-concat=1, epochs=5, hs=1, min-alpha=0, min-count=5, vector-size=400, window=4 |
 | *0.7578* | *random* |

 ## Legend
@@ -32,10 +32,8 @@ The [Formula2Vec system][scm-at-arqmath] recogizes the following parameters:
 - min-count – the minimum term frequency
 - vector-size – vector dimensions
 - window – window size
-- workers – the number of threads used for [hogwild][]
 - epochs – the number of epochs

 [arxmliv-08-2019]: https://sigmathling.kwarc.info/resources/arxmliv-dataset-082019/
 [collocation detection]: https://radimrehurek.com/gensim/models/phrases.html
-[hogwild]: https://papers.nips.cc/paper/4390-hogwild-a-lock-free-approach-to-parallelizing-stochastic-gradient-descent
 [scm-at-arqmath]: https://gitlab.fi.muni.cz/xnovot32/scm-at-arqmath (Soft Cosine Measure at ARQMath)

task1-votes/ayetiran/prefix_phrases=2_alpha=0.05_dm=1_dm-concat=1_epochs=5_hs=1_min-alpha=0_min-count=5_vector-size=400_window=4_workers=64.tsv → task1-votes/ayetiran/prefix_phrases=2_alpha=0.05_dm=1_dm-concat=1_epochs=5_hs=1_min-alpha=0_min-count=5_vector-size=400_window=4.tsv +0 −0
File moved.

task1-votes/ayetiran/prefix_phrases=2_alpha=0.1_dm=0_dm-concat=1_epochs=5_hs=0_min-alpha=0_min-count=5_negative=12_vector-size=300_window=8_workers=64.tsv → task1-votes/ayetiran/prefix_phrases=2_alpha=0.1_dm=0_dm-concat=1_epochs=5_hs=0_min-alpha=0_min-count=5_negative=12_vector-size=300_window=8.tsv +0 −0
File moved.
task1-votes/xnovot32/README.md +37 −39
@@ -4,43 +4,43 @@ underscores (`_`) replaced with a comma and a space for improved readability.
 | nDCG | Result name |
 |------|:------------|
-| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=800, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=200, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=50, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=6, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=5, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=200, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=1600, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=1, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=800, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=0, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=400, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=False, nonzero-limit=50, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7613 | infix, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=False, nonzero-limit=400, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=50, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=10, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=1600, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=1000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=400, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=3, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=False, nonzero-limit=50, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=8000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7611 | prefix, phrases=2, alpha=0.05, bucket=4000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7611 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=False, nonzero-limit=200, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=False, nonzero-limit=200, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=False, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7610 | prefix, phrases=4, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=False, nonzero-limit=100, symmetric=False, exponent=4.0, threshold=-1.0 |
-| 0.7607 | slt, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7606 | opt, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7602 | latex, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7600 | nomath, phrases=0, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7598 | nomath, phrases=1, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
-| 0.7596 | nomath, phrases=2, alpha=0.05, bucket=2000000, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, workers=64, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=800, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=200, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=50, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=6, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=5, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=200, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=1600, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=1, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=800, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=0, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=400, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=50, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7613 | infix, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=400, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=50, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=10, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=1600, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=1M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=400, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=3, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=50, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7612 | prefix, phrases=2, alpha=0.05, bucket=8M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7611 | prefix, phrases=2, alpha=0.05, bucket=4M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7611 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=200, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=200, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7610 | prefix, phrases=4, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=100, symmetric=False, exponent=4.0, threshold=-1.0 |
+| 0.7607 | slt, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7606 | opt, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7602 | latex, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7600 | nomath, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7598 | nomath, phrases=1, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
+| 0.7596 | nomath, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 |
 | *0.7578* | *random* |

 ## Legend
@@ -70,7 +70,6 @@ The [SCM system][scm-at-arqmath] recogizes the following parameters:
   - sg – the skipgram model
   - size – vector dimensions
   - window – window size
-  - workers – the number of threads used for [hogwild][]
 - Soft Cosine Measure:
   - dominant – whether the term similarity matrix will be strongly diagonally dominant
   - nonzero-limit – the maximum number of non-zero elements outside the diagonal in a single column of the term similarity matrix
@@ -80,6 +79,5 @@ The [SCM system][scm-at-arqmath] recogizes the following parameters:

 [arxmliv-08-2019]: https://sigmathling.kwarc.info/resources/arxmliv-dataset-082019/
 [collocation detection]: https://radimrehurek.com/gensim/models/phrases.html
-[hogwild]: https://papers.nips.cc/paper/4390-hogwild-a-lock-free-approach-to-parallelizing-stochastic-gradient-descent
 [scm-at-arqmath]: https://gitlab.fi.muni.cz/xnovot32/scm-at-arqmath (Soft Cosine Measure at ARQMath)
 [term similarity matrix formula]: https://arxiv.org/pdf/2003.05019.pdf#page=4
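
The renames in this change follow one pattern: the `workers` parameter is dropped and whole-million `bucket` values are abbreviated (`2000000` becomes `2M`). The READMEs derive each result name from the file name by replacing underscores with a comma and a space. A minimal Python sketch of both steps follows. The helper names `shorten_result_filename` and `result_name` are hypothetical and not part of the repository, and the sketch assumes that parameter values never contain underscores.

```python
def shorten_result_filename(filename: str) -> str:
    """Drop `workers=...` and abbreviate whole-million `bucket` values.

    Hypothetical helper: it mirrors the renames visible in the diff,
    not code taken from the repository.
    """
    # Values such as alpha=0.05 contain dots, so split off only the last one.
    stem, ext = filename.rsplit(".", 1)
    kept = []
    for part in stem.split("_"):
        key, _, value = part.partition("=")
        if key == "workers":
            continue  # thread count does not describe the model, so drop it
        if key == "bucket" and value.isdigit():
            number = int(value)
            if number > 0 and number % 1_000_000 == 0:
                part = f"bucket={number // 1_000_000}M"
        kept.append(part)
    return "_".join(kept) + "." + ext


def result_name(filename: str) -> str:
    """Turn a result file name into the name shown in the README tables."""
    stem = filename.rsplit(".", 1)[0]
    return stem.replace("_", ", ")


old = ("prefix_phrases=2_alpha=0.1_dm=0_dm-concat=1_epochs=5_hs=0_"
       "min-alpha=0_min-count=5_negative=12_vector-size=300_window=8_workers=64.tsv")
new = shorten_result_filename(old)
print(new)
print(result_name(new))
```

Applied to the first moved file in `task1-votes/ayetiran/`, this reproduces the new file name and the corresponding row name in the README table.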
alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=200, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=50, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7614 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=6, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=5, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=200, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=1600, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=1, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, 
negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=800, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=0, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=400, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=50, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7613 | infix, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=400, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7613 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=50, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, 
symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=10, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=1600, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=2, alpha=0.05, bucket=1M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=400, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=3, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=50, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7612 | prefix, phrases=2, alpha=0.05, bucket=8M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7611 | prefix, phrases=2, alpha=0.05, bucket=4M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7611 | prefix, phrases=2, 
alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=200, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=200, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7610 | prefix, phrases=4, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7610 | prefix, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=False, nonzero-limit=100, symmetric=False, exponent=4.0, threshold=-1.0 | | 0.7607 | slt, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7606 | opt, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7602 | latex, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7600 | nomath, phrases=0, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, 
sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7598 | nomath, phrases=1, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | 0.7596 | nomath, phrases=2, alpha=0.05, bucket=2M, iter=5, max-n=6, min-alpha=0, min-count=5, min-n=3, negative=5, sample=0.0001, sg=1, size=300, window=5, dominant=True, nonzero-limit=100, symmetric=True, exponent=4.0, threshold=-1.0 | | *0.7578* | *random* | ## Legend Loading Loading @@ -70,7 +70,6 @@ The [SCM system][scm-at-arqmath] recogizes the following parameters: - sg – the skipgram model - size – vector dimensions - window – window size - workers – the number of threads used for [hogwild][] - Soft Cosine Measure: - dominant – whether the term similarity matrix will be strongly diagonally dominant - nonzero-limit – the maximum number of non-zero elements outside the diagonal in a single column of the term similarity matrix Loading @@ -80,6 +79,5 @@ The [SCM system][scm-at-arqmath] recogizes the following parameters: [arxmliv-08-2019]: https://sigmathling.kwarc.info/resources/arxmliv-dataset-082019/ [collocation detection]: https://radimrehurek.com/gensim/models/phrases.html [hogwild]: https://papers.nips.cc/paper/4390-hogwild-a-lock-free-approach-to-parallelizing-stochastic-gradient-descent [scm-at-arqmath]: https://gitlab.fi.muni.cz/xnovot32/scm-at-arqmath (Soft Cosine Measure at ARQMath) [term similarity matrix formula]: https://arxiv.org/pdf/2003.05019.pdf#page=4