CCC old archives utility

JuLieN
Posts: 2949
Joined: Mon May 05, 2008 12:16 pm
Location: Bordeaux (France)
Full name: Julien Marcel

Re: SMALL? 145MB - for 30 lines?

Post by JuLieN »

Paloma wrote:
Rebel wrote:Wrote a small Utility that searches all the available CCC posts
....
Source code included (just 20-30 lines of code) in case someone feels the need to add more search flexibility.
SMALL? 145MB - for 30 lines?

Something is Wrong !!
The tool itself is small and simple, but it ships with all the old CCC posts in a text file that is 145 MB. :)
"The only good bug is a dead bug." (Don Dailey)
[Blog: http://tinyurl.com/predateur ] [Facebook: http://tinyurl.com/fbpredateur ] [MacEngines: http://tinyurl.com/macengines ]
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: CCC old archives utility

Post by Rebel »

JuLieN wrote:Excellent!!! Thanks Ed! :D
You are welcome. Is there not an option to store the current CCC database to a textfile?
JuLieN
Posts: 2949
Joined: Mon May 05, 2008 12:16 pm
Location: Bordeaux (France)
Full name: Julien Marcel

Re: CCC old archives utility

Post by JuLieN »

Rebel wrote:
You are welcome. Is there not an option to store the current CCC database to a textfile?
All forums have such an option (if not in text file format, at least a way to back up their databases), but only the admin can do it. Also, the size of Talkchess's databases has been estimated at around 400 GB. You'd want only the technical subforum, I guess? :)
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: CCC old archives utility

Post by Rebel »

JuLieN wrote:
All fora have such an option (if not in textfile format, at least a way to backup its databases). But only the admin can do it. But the size of Talkchess' databases has been estimated to something around 400 GB. You'd want only the technical subforum I guess? :)
400 GB, are you sure? The 1997-2006 period is only 1 GB.
JuLieN
Posts: 2949
Joined: Mon May 05, 2008 12:16 pm
Location: Bordeaux (France)
Full name: Julien Marcel

Re: CCC old archives utility

Post by JuLieN »

Rebel wrote:
400 Gb, are you sure? The 1997-2006 period is only 1 Gb.
That's what I've been told. A forum's database is not only the raw text of the threads: it holds a lot more information. If you extract only the relevant information, it'll probably be much smaller.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: CCC old archives utility

Post by Rebel »

Added a number of new features to the Utility that searches all the available CCC posts (474,745) from 1997-2006.

Features

1. Search by username
2. Search by subject
3. Search by both (combine 1 and 2)
4. Member overview
5. Thread overview
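
For anyone who wants to extend the search, here is a minimal sketch of how features 1 to 3 could work. It is not the utility's actual source: the `Post` record, the `search` function and the sample posts are all made up for illustration, and the real text file layout is not described in this thread.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record layout (assumption): the real database format
    # is not shown in this thread.
    author: str
    subject: str
    body: str


def search(posts, username=None, subject=None):
    """Return posts matching the given username and/or subject.

    Matching is a case-insensitive substring test. Passing both
    arguments combines the two filters (feature 3).
    """
    results = []
    for post in posts:
        if username is not None and username.lower() not in post.author.lower():
            continue
        if subject is not None and subject.lower() not in post.subject.lower():
            continue
        results.append(post)
    return results


# Made-up sample data, only to exercise the function.
posts = [
    Post("alice", "Null move pruning", "..."),
    Post("bob", "Search extensions", "..."),
    Post("alice", "Hash table size", "..."),
]

by_user = search(posts, username="ALICE")
by_subject = search(posts, subject="search")
by_both = search(posts, username="alice", subject="hash")
```

A linear scan like this is simple, and an index would only be worth adding if it turned out to be too slow on the full file.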

Examples of (4)
http://www.top-5000.nl/ccc/members1.htm (sorted on date)
http://www.top-5000.nl/ccc/members2.htm (sorted on popularity)

Examples of (5)
http://www.top-5000.nl/ccc/subject1.htm (sorted on date)
http://www.top-5000.nl/ccc/subject2.htm (sorted on popularity)
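
A rough sketch of how such an overview could be built, sorted either way. It assumes, without confirmation from the utility's source, that a member's "joined" date is the date of their first post; the sample data is invented.

```python
from collections import Counter
from datetime import date

# Invented (author, post_date) pairs standing in for the real database.
posts = [
    ("alice", date(1997, 10, 3)),
    ("bob", date(1997, 9, 22)),
    ("alice", date(1998, 7, 2)),
    ("alice", date(1999, 1, 5)),
    ("bob", date(2001, 4, 1)),
    ("carol", date(1998, 7, 1)),
]

# Number of posts per member.
counts = Counter(author for author, _ in posts)

# Earliest post per member (assumption: this is what "joined" means).
first_post = {}
for author, when in posts:
    if author not in first_post or when < first_post[author]:
        first_post[author] = when

# Overview sorted by date: compare real dates, not month names.
by_date = sorted(counts, key=lambda a: first_post[a])

# Overview sorted by popularity: most posts first.
by_popularity = sorted(counts, key=lambda a: counts[a], reverse=True)
```

The thread overview can be built the same way, keyed on the subject instead of the author.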

Instead of fetching the big (145 MB) download again (because the database has been improved), you can download the utility (just 42 KB) separately.
http://www.top-5000.nl/ccc/ccc.zip
Paloma
Posts: 1167
Joined: Thu Dec 25, 2008 9:07 pm
Full name: Herbert L

Re: CCC old archives utility

Post by Paloma »

Hi Ed,
many thanks for this great utility!

By the way:
CCC utility searches the entire contents of the 474.745 postings
made on CCC (Talkchess) during the start of its existence (Oct 2007)
till March 2006 when the forum moved to PHP.


should be Oct 1997

Thanks again
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: CCC old archives utility

Post by Vinvin »

Hello Ed, thanks for this nice tool!
I looked at this page:
Rebel wrote:...Examples of (5) http://www.top-5000.nl/ccc/subject1.htm (sorted on date)...
It says "Total Subjects 104088", but when I copy-paste the text into my text editor, I get 29408 lines. How is that possible?
Paloma
Posts: 1167
Joined: Thu Dec 25, 2008 9:07 pm
Full name: Herbert L

Re: CCC old archives utility

Post by Paloma »

Vincent, I think this is the reason:

The CCC database contains over 100,000 unique threads. Most threads
contain 1 or 2 postings. To avoid an extremely long HTML output file
the minimum number of postings in a thread is set to 5. Still the list will be
xxxxx threads long.
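
A small sketch of that kind of threshold, to show why the listed lines can be far fewer than the total number of subjects. The function name and the sample thread ids are made up; only the minimum of 5 comes from the text above.

```python
from collections import Counter

MIN_POSTINGS = 5  # threshold named in the quoted text


def threads_to_list(thread_ids, min_postings=MIN_POSTINGS):
    """Count postings per thread and keep only threads at or above the minimum."""
    counts = Counter(thread_ids)
    return {t: n for t, n in counts.items() if n >= min_postings}


# Made-up data: one entry per posting, holding the id of its thread.
thread_ids = ["a"] * 7 + ["b"] * 2 + ["c"] * 1 + ["d"] * 5

total_subjects = len(set(thread_ids))  # every unique thread
listed = threads_to_list(thread_ids)   # only those written to the HTML file
```

Here `total_subjects` counts all four threads while `listed` holds only two of them, which is the same kind of gap as between the reported total and the lines in the page.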
Paloma
Posts: 1167
Joined: Thu Dec 25, 2008 9:07 pm
Full name: Herbert L

Re: CCC old archives utility

Post by Paloma »

Why is my members1.htm different?

My generated output begins with July 1998:

Eran                                   532   joined July 01, 1998
jonathan Baxter                         64   joined July 01, 1998
Danniel Corbit                         243   joined July 02, 1998
Roberto Waldteufel                     346   joined July 02, 1998
Shaun Graham                           183   joined July 02, 1998
Amir Ban                              1343   joined July 02, 1998
blass uri                             4676   joined July 02, 1998
Mark Young                            2476   joined July 02, 1998
Steven Schwartz                       1504   joined July 02, 1998
Tony Hedlund                          2736   joined July 02, 1998
Tim Mann                               138   joined July 02, 1998
Bruce Moreland                        3968   joined July 02, 1998
Ed Schröder                           3292   joined July 02, 1998
Don Prohaska                           128   joined July 02, 1998
Robert Henry Durrett                   382   joined July 02, 1998
Fernando Villegas                     4458   joined July 02, 1998
Robert Hyatt                         24290   joined July 02, 1998
Len Spencer                             25   joined July 02, 1998
Christophe Theron                     6125   joined July 02, 1998
Inmann Werner                          310   joined July 03, 1998
Joe McCarron                            76   joined July 03, 1998
Peter Klausler                          42   joined July 03, 1998
Howard Exner                          1283   joined July 03, 1998
Trefor Deane                            43   joined July 04, 1998
Steven J. Edwards                       72   joined July 04, 1998
James Long                             203   joined July 04, 1998
Leon Stancliff                          86   joined July 04, 1998
Chris Whittington                      405   joined September 22, 1997
Enrique Irazoqui                      2248   joined September 22, 1997
Dirk Frickenschmidt                    178   joined September 23, 1997
Jonathan Schaeffer                      10   joined September 23, 1997
Thorsten Czub                         5125   joined September 23, 1997
Tim Mirabile                           308   joined September 23, 1997
......

Yours begins with Sep. 1997:

Chris Whittington                  432   joined September 22, 1997
Enrique Irazoqui                  2248   joined September 22, 1997
Ed Schröder                       3293   joined September 22, 1997
Dirk Frickenschmidt                178   joined September 23, 1997
Jonathan Schaeffer                  10   joined September 23, 1997
Thorsten Czub                     5125   joined September 23, 1997
Tim Mirabile                       315   joined September 23, 1997
Bruce Moreland                    3968   joined September 23, 1997
Robert Hyatt                     24309   joined September 26, 1997
Ernst A. Heinz                     591   joined October 03, 1997
Bernhard Sadlowski                   3   joined October 03, 1997
Amir Ban                          1343   joined October 04, 1997
Robert Sullivan                     14   joined October 05, 1997
Michael Borgstaedt (GOLIATH CHESS)      54   joined October 05, 1997
Tom King                            57   joined October 05, 1997
Fernando Villegas                 4458   joined October 05, 1997
Christophe Theron                 6101   joined October 05, 1997
Graham Laight                     1008   joined October 06, 1997
Ren Wu                             117   joined October 06, 1997
Walter Ravenek                       8   joined October 07, 1997
Ulrich Tuerke                     1110   joined October 07, 1997
Peter  Fendrich                     15   joined October 07, 1997
......
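
To compare two such listings, one option is to parse the join dates and check whether the lines are really in date order. This is only a sketch that assumes the "name, count, joined Month DD, YYYY" layout shown above. `parse_member_line` and `is_chronological` are hypothetical helper names, not part of the utility, and this does not explain why the two outputs differ.

```python
from datetime import datetime


def parse_member_line(line):
    """Split 'Name   count   joined Month DD, YYYY' into (name, count, date)."""
    left, _, joined = line.rpartition("joined ")
    # The post count is the last whitespace-separated token before "joined".
    name, count = left.rsplit(None, 1)
    # %B expects English month names (the default C locale).
    when = datetime.strptime(joined.strip(), "%B %d, %Y").date()
    return name.strip(), int(count), when


def is_chronological(lines):
    """True if the join dates never go backwards from one line to the next."""
    dates = [parse_member_line(line)[2] for line in lines]
    return all(a <= b for a, b in zip(dates, dates[1:]))


# Two adjacent lines copied from the first listing above.
sample = [
    "Leon Stancliff                         86   joined July 04, 1998",
    "Chris Whittington                  405   joined September 22, 1997",
]

in_order = is_chronological(sample)
resorted = sorted((parse_member_line(s) for s in sample), key=lambda m: m[2])
```

On these two lines `in_order` is false, because 1998 comes before 1997. Sorting on the parsed date rather than on the text puts the September 1997 entry first.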