Guess the Poster CCC-WorldClouds-Part 3

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by Laskos »

chrisw wrote: Thu Oct 10, 2019 9:59 pm
Laskos wrote: Thu Oct 10, 2019 9:51 pm Is that (12) idiot me?
Damn, I seem completely retard on both CCC and CTF.
CTF 12 is indeed Laskos. Well done for spotting yourself. It’s an ok wordcloud, no?
Very nice! Spotting seems to me to be about some specific words, even less-used (smaller) ones, or stupidity, like mine. On CCC I have the most common ones: Stockfish, Komodo, Houdini, the top engines. On CTF some America, Russia, Germany, million, etc. Indeed, my English is poor: few verbs, few uncommon nouns. I guess you could eventually do some basic "lexical complexity" analysis of some sort using your tools.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by Laskos »

chrisw wrote: Thu Oct 10, 2019 10:21 pm
Errrmm, no longer sure I have the indexing system for the CTF posts, or, rather, I might have mixed everything such that I no longer know whose wordcloud is whose. Will check a bit more tomorrow, and might redo the CTF ones
Ah, ok.
chrisw
Posts: 4315
Joined: Tue Apr 03, 2012 4:28 pm

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by chrisw »

Laskos wrote: Thu Oct 10, 2019 10:26 pm
Very nice! Spotting seems to me to be about some specific words, even less-used (smaller) ones, or stupidity, like mine. On CCC I have the most common ones: Stockfish, Komodo, Houdini, the top engines. On CTF some America, Russia, Germany, million, etc. Indeed, my English is poor: few verbs, few uncommon nouns. I guess you could eventually do some basic "lexical complexity" analysis of some sort using your tools.
It might not be you. I think I screwed up the indexing to the wordclouds when Ovyron suggested putting them all on one site. Didn’t check carefully enough. So I think the CTF ones mismatch with my lookup table as to which one is which. You might be 12 and you might not.
chrisw
Posts: 4315
Joined: Tue Apr 03, 2012 4:28 pm

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by chrisw »

Laskos wrote: Thu Oct 10, 2019 10:29 pm
Ah, ok.
Part of the problem is that wordcloud does some sort of randomisation for initial placement, and it doesn’t appear to regenerate the exact same cloud each time. Will RTFD now, better late than never.
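On the regeneration issue: the Python wordcloud package's `WordCloud` class takes a `random_state` argument, so fixing it should give the same layout on every run. Below is a minimal stdlib-only sketch of the same idea, deriving the seed from the poster's name so that each cloud is reproducible and tied to its owner rather than to a position in a list. The names `stable_seed` and `place_words` and the toy layout are made up for illustration; this is not how wordcloud places words internally.

```python
import hashlib
import random


def stable_seed(poster: str) -> int:
    """Derive a seed from the poster name that is identical across runs.

    The built-in hash() is salted per process for strings, so it is
    avoided here; SHA-256 gives the same bytes every time.
    """
    digest = hashlib.sha256(poster.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def place_words(words, poster, width=400, height=200):
    """Toy layout: give every word a pseudo-random (x, y) position."""
    rng = random.Random(stable_seed(poster))
    return [(word, rng.randrange(width), rng.randrange(height)) for word in words]


if __name__ == "__main__":
    words = ["Stockfish", "Komodo", "Houdini"]
    first = place_words(words, "Laskos")
    second = place_words(words, "Laskos")
    # Same poster means same seed, hence the same layout on every run.
    print(first == second)
```

Keying both the seed and the output file name on the poster name (instead of a running index) would also make it harder for the lookup table to drift out of step with the images.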
chrisw
Posts: 4315
Joined: Tue Apr 03, 2012 4:28 pm

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by chrisw »

OK, sorry people, I finally RTFM and redid everything, now in Part 4, which should be the final version.
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by MikeB »

Ovyron wrote: Thu Oct 10, 2019 5:23 pm Guessing:

13. Miguel A. Ballicora
14. MikeB
hahaha - tough one!

I love this - I think it's great!
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by Laskos »

chrisw wrote: Thu Oct 10, 2019 10:31 pm
It might not be you. I think I screwed up the indexing to the wordclouds when Ovyron suggested putting them all on one site. Didn’t check carefully enough. So I think the CTF ones mismatch with my lookup table as to which one is which. You might be 12 and you might not.

About "lexical complexity", you might find this Python-based analyzer interesting:
http://www.personal.psu.edu/xxl13/downloads/lca.html
I understand that your volumes are high and it might be very slow to analyze the full databases, but you can trim them. To my surprise, I found my "lexical sophistication" (LS1 and LS2) on this forum to be rather above average compared to several other posters here, including native English speakers. But only on some 2-3 longer posts (probably an unreliable statistic). I don't know what these measures (LS1 and LS2, "lexical sophistication") are, though. And it wouldn't be very nice to post these results, as they are related to language skills and some folks might find them offensive.
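On what LS1 and LS2 measure: as far as I understand the analyzer, both are ratios of "sophisticated" words, meaning words that are not in a reference list of the most frequent words. LS1 is sophisticated lexical tokens divided by all lexical tokens (nouns, adjectives, adverbs, verbs), and LS2 is sophisticated word types divided by all word types. The sketch below is my own simplified reading of those definitions, not the analyzer's code: the function name, the tag-prefix test and the tiny `common_words` set are assumptions, and it does not exclude auxiliaries or punctuation as a real implementation would.

```python
# Tags starting with these prefixes are treated as lexical (content) words.
LEXICAL_PREFIXES = ("NN", "JJ", "RB", "VB")


def lexical_sophistication(tagged, common_words):
    """Return (LS1, LS2) for a list of (lemma, pos_tag) pairs.

    common_words is a set of lower-case lemmas regarded as frequent;
    anything outside it counts as "sophisticated".
    """
    lemmas = [lemma.lower() for lemma, _ in tagged]
    lexical = [
        lemma.lower() for lemma, tag in tagged if tag.startswith(LEXICAL_PREFIXES)
    ]
    sophisticated_lexical = [w for w in lexical if w not in common_words]
    types = set(lemmas)
    sophisticated_types = {w for w in types if w not in common_words}
    ls1 = len(sophisticated_lexical) / len(lexical) if lexical else 0.0
    ls2 = len(sophisticated_types) / len(types) if types else 0.0
    return ls1, ls2


if __name__ == "__main__":
    tagged = [
        ("engine", "NN"), ("be", "VBZ"), ("strong", "JJ"),
        ("engine", "NN"), ("outplay", "VBZ"), ("Komodo", "NNP"),
    ]
    common = {"be", "engine", "strong"}
    print(lexical_sophistication(tagged, common))
```

With only 2-3 posts as input, both ratios rest on a small number of words, which fits the caveat above about the statistic being unreliable.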
chrisw
Posts: 4315
Joined: Tue Apr 03, 2012 4:28 pm

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by chrisw »

I'll take a look.
chrisw
Posts: 4315
Joined: Tue Apr 03, 2012 4:28 pm

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by chrisw »

Useful stuff in there.
He's got a huge list of words, sorted by frequency and labelled. A very small random part:

political political JJ 4745
sense sense NN 4735
side side NN 4694
power power NN 4677
looking look VBG 4670
go go VBP 4669
use use VB 4667
law law NN 4664
especially especially RB 4655
fig fig NNP 4626

The columns seem to be 'word', 'some form of generalised stub word', a grammar label (J=adjective, N=noun, V=verb, G=gerund(?), R=adverb, not sure what the other symbols are) and the frequency count.
Not sure whether he acquired the list or generated it. There are some tests like adverb = endswith('ly') in the code, so maybe he generated it from a corpus and worked out what was what grammar-wise. I've not looked enough to say, but probably the latter, given the 'ly' search in the code.
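The labels look like Penn Treebank part-of-speech tags, where the whole two- or three-letter code is the tag rather than each letter meaning something on its own: JJ is an adjective, NN a singular noun, NNP a proper noun, RB an adverb, VB a verb base form, VBG a gerund or present participle, and VBP a non-3rd-person singular present verb. The second column would then be the lemma. A minimal sketch for reading such lines, assuming the four whitespace-separated columns shown above (`Entry`, `parse_entry` and the field names are mine, not from the analyzer):

```python
from typing import NamedTuple

# A few Penn Treebank tags, enough to cover the sample rows above.
PENN_TAGS = {
    "JJ": "adjective",
    "NN": "noun, singular or mass",
    "NNP": "proper noun, singular",
    "RB": "adverb",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBG": "verb, gerund or present participle",
    "VBN": "verb, past participle",
    "VBP": "verb, non-3rd person singular present",
    "VBZ": "verb, 3rd person singular present",
}


class Entry(NamedTuple):
    word: str
    lemma: str
    tag: str
    count: int


def parse_entry(line: str) -> Entry:
    """Split one 'word lemma TAG count' line into its four fields."""
    word, lemma, tag, count = line.split()
    return Entry(word, lemma, tag, int(count))


if __name__ == "__main__":
    sample = """political political JJ 4745
looking look VBG 4670
go go VBP 4669"""
    for line in sample.splitlines():
        entry = parse_entry(line)
        print(entry.word, entry.lemma, PENN_TAGS.get(entry.tag, "unknown tag"), entry.count)
```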

Then there's another list that looks like a list of numbers, symbols and misspellings. Not sure what he does with that; NLP processors want words, they don't like numbers or colloquialisms!

£900 NoC 1
£95 NoC 1
£950 NoC 1
£99 NoC 1
£995 NoC 1
£999 NoC 1
&rehy; Verb 1
× Prep 16
' Err 8
' Gen 479
'50 Num 1
'60 Num 2
'70 Num 3
'80 Num 2
'90 Num 2
'91 Num 1
'92 Num 1
'93 Num 1
'ad Verb 2
'appen Verb 1
'as Verb 1
'ave Verb 3
'cause Conj 1
'cos Conj 4
'er Det 1


And then he filters and pre-processes to this kind of text (before doing the numerical analysis?). There's a "look_VBG" in there, and serendipitously "looking" in the above table has the stub 'look' and VBG category (verb, gerund, I don't know what B stands for). P might stand for passive tense as in "remember_VBP", guessing here. Well, looks fun and interesting, although I have no idea how to interpret the number arrays which form the lexical complexity results output.


His_PRP$ mom_NN have_VBD her_PRP$ shoe_NN off_RP and_CC her_PRP$ foot_NN up_RP ._.
She_PRP be_VBD look_VBG through_IN a_DT catalog_NN of_IN baby_NN furniture_NN ._.
``_`` What_WP ?_. ''_''
``_`` You_PRP remember_VBP Melissa_NNP ?_.
Out_IN in_IN Bixby_NNP ,_, Oklahoma_NNP ?_. ''_''
``_`` Yes_UH ,_, I_PRP remember_VBP Melissa_NNP ._. ''_''
``_`` I_PRP just_RB find_VBD out_RP a_DT terrible_JJ ,_, terrible_JJ thingshe_NN '_PO give_VBG me_PRP something_NN for_IN Christmas_NNP ._. ''_''
``_`` How_WRB 'd_MD you_PRP find_VB that_DT out_RP ?_. ''_''
``_`` She_PRP tell_VBD me_PRP ._.
Here_RB it_PRP be_VBZ in_IN black_JJ and_CC white_JJ ._.
`_`` I_PRP finish_VBD your_PRP$ Christmas_NNP present_NN today_NN and_CC I_PRP KNOW_VBP '_'' know_VBP be_VBZ in_IN capital_NN letter_NN which_WDT means_VBZ ,_, unfortunately_RB ,_, that_IN it_PRP '_VBZ something_NN nice_JJ `_`` I_PRP KNOW_VBP you_PRP be_VBP go_VBG to_TO love_VB it_PRP ._. '_''
I_PRP be_VBP not_RB just_RB go_VBG to_TO like_VB it_PRP ,_, Mom_NN ,_, I_PRP be_VBP go_VBG to_TO love_VB it_PRP ._.
Love_NNP '_PO not_RB underline_VBN but_CC it_PRP might_MD as_RB well_RB be_VB ._. ''_''
``_`` So_RB ?_. ''_''
``_`` Mom_NN ,_, this_DT means_VBZ I_PRP have_VBP to_TO give_VB her_PRP$ something_NN and_CC it_PRP have_VBZ to_TO be_VB something_NN she_PRP will_MD love_VB ._. ''_''
``_`` Only_RB if_IN you_PRP want_VBP to_TO ._. ''_''
``_`` No_UH ,_, Mom_NN ,_, I_PRP have_VBP to_TO !_. ''_''
``_`` Send_VB her_PRP a_DT Christmas_NNP card_NN ._. ''_''
``_`` Mom_NN !_. ''_''
Bingo_NNP say_VBD ,_, genuinely_RB shock_JJ ._.
His_PRP$ mom_NN lean_VBD back_RB thoughtfully_RB ._.
``_`` She_PRP say_VBZ she_PRP just_RB finish_VBD it_PRP ._.
That_DT means_VBZ it_PRP '_VBZ something_NN she_PRP make_VBD herself_PRP ._. ''_''
``_`` Yes_UH ,_, yes_UH ._.
Go_VB on_IN ._. ''_''
His_PRP$ mom_NN sit_VBD up_RP ._.
``_`` Oh_UH ,_, Bingo_NNP ,_, do_VB you_PRP suppose_VB it_PRP could_MD be_VB homemade_NN fudge_VB ?_. ''_''
``_`` Of_IN course_NN not_RB ._. ''_''
``_`` Bingo_NNP ,_, lately_RB I_PRP have_VBP just_RB be_VBN crave_NN homemade_NN fudge_VBP ,_, the_DT kind_NN with_IN real_JJ butter_NN ._.
Have_VBP you_PRP get_VBN my_PRP$ Christmas_NNP present_NN yet_RB ?_. ''_''
``_`` No._NN ''_'' ``_`` Well_UH ,_, make_VB me_PRP some_DT fudge_VBP with_IN real_JJ butter_NN ._. ''_''
``_`` I_PRP will_MD make_VB your_PRP$ fudge_VB as_RB soon_RB as_IN I_PRP have_VBP figure_VBN out_RP what_WP to_TO do_VB about_IN Melissa_NNP ._. ''_''
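A rough sketch of how text in this `lemma_TAG` form could be read back and summarised, assuming tokens are whitespace-separated and the tag follows the last underscore. The helper names are illustrative and not taken from the analyzer's code. For reference, in the Penn Treebank tagset this output appears to use, VB is a single tag for the verb base form, and VBP marks a non-3rd-person singular present verb rather than a passive.

```python
from collections import Counter


def split_token(token: str):
    """Split 'look_VBG' into ('look', 'VBG').

    rpartition splits on the LAST underscore, so punctuation tokens
    such as '._.' come out as ('.', '.'). A token without any
    underscore would come back as ('', token).
    """
    lemma, _, tag = token.rpartition("_")
    return lemma, tag


def tag_counts(text: str) -> Counter:
    """Count how often each tag occurs in a tagged text."""
    return Counter(split_token(token)[1] for token in text.split())


if __name__ == "__main__":
    line = (
        "She_PRP be_VBD look_VBG through_IN a_DT catalog_NN "
        "of_IN baby_NN furniture_NN ._."
    )
    print([split_token(token) for token in line.split()][:3])
    print(tag_counts(line).most_common(2))
```

Tag counts like these are the raw material for ratio-style measures, for example the share of nouns or verbs among all tokens.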


Live and learn. All about suffix and prefix tagging, as in the above. Multi-language too it seems. I am basically a complete beginner in NLP, it's quite a large field.
https://www.cis.uni-muenchen.de/~schmid ... reeTagger/
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Guess the Poster CCC-WorldClouds-Part 3

Post by Laskos »

chrisw wrote: Fri Oct 11, 2019 12:01 pm
Live and learn. All about suffix and prefix tagging, as in the above. Multi-language too it seems. I am basically a complete beginner in NLP, it's quite a large field.
https://www.cis.uni-muenchen.de/~schmid ... reeTagger/

I hardly understand much of NLP.