Cataloguing the guestbook


Postby aradesh » Mon Mar 26, 2012 12:49 am

There's been a fair bit of talk about the guestbook and the forums lately, and I've thought before it would be nice - generally for looking things up and for historical purposes - to put the contents of the guestbook into a nice database format.

I'd be willing to do the work for this - but I have almost no experience with databases and this sort of thing.

Could I have some advice on what would be a nice standardized format to store all the posts from the GB, so that it would be easy to program a website to do things like searching with conditions on the name, the contents of the post, restrictions on time period, etc.? For example, I might wish to see every post Kamil made on the GB, to follow his progress from beginner to pro.
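One possible shape for this, as a minimal sketch only: a single SQLite table with one row per post, plus a helper that combines whichever search conditions are given. The table name, column names, `search_posts` helper and the sample rows are all made up for illustration; nothing here reflects how the guestbook is actually stored.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        author    TEXT NOT NULL,   -- name as typed in the guestbook
        posted_at TEXT NOT NULL,   -- ISO 8601 timestamp, so text order = time order
        body      TEXT NOT NULL
    )
    """
)

# Made-up sample rows, just to exercise the query.
conn.executemany(
    "INSERT INTO posts (author, posted_at, body) VALUES (?, ?, ?)",
    [
        ("Kamil", "2005-03-01T12:00:00", "New expert record today"),
        ("Kamil", "2007-06-15T09:30:00", "Another intermediate record"),
        ("Someone", "2006-01-10T18:45:00", "Congrats on the record"),
    ],
)


def search_posts(conn, author=None, text=None, start=None, end=None):
    """Return posts matching every condition that is not None."""
    clauses, params = [], []
    if author is not None:
        clauses.append("author = ?")
        params.append(author)
    if text is not None:
        clauses.append("body LIKE ?")
        params.append(f"%{text}%")
    if start is not None:
        clauses.append("posted_at >= ?")
        params.append(start)
    if end is not None:
        clauses.append("posted_at < ?")
        params.append(end)
    where = " AND ".join(clauses) if clauses else "1"
    sql = f"SELECT author, posted_at, body FROM posts WHERE {where} ORDER BY posted_at"
    return conn.execute(sql, params).fetchall()


# Every post by one author, oldest first.
kamil_posts = search_posts(conn, author="Kamil")
# Posts mentioning "record" before 2007.
early_records = search_posts(conn, text="record", end="2007-01-01")
```

Storing the date as ISO 8601 text keeps the time-period filter a plain string comparison, and the values are always passed as parameters rather than pasted into the SQL.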

I could probably hack some tables and rudimentary things together in Python, but it would likely be inflexible and not particularly useful for other people.

What do you guys think?
aradesh
 
Posts: 90
Joined: Sat Aug 29, 2009 3:37 pm

Re: Cataloguing the guestbook

Postby Tommy » Mon Mar 26, 2012 11:03 am

Sounds interesting!

The HTML from http://www.minesweeper.info/downloads/Guestbook11.html, for instance, should be easy enough to parse.

Writing the script that parses the old GB HTML files and stores them in some database would already be huge. Damien would have to write the frontend, since that would be hosted on ms.info, but that isn't much work: a bit of SQL would be fast enough for our purposes (i.e., I don't think we need to bother with a full-text index yet; we can just use LIKE).
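A rough sketch of that pipeline, using only the Python standard library: parse entries out of HTML, insert them into SQLite, and search with LIKE. The markup in `SAMPLE_HTML` (a `div class="entry"` holding `name`, `date` and `text` elements) is invented for the example; the real guestbook files will look different, and the parser would need adapting to their actual structure.

```python
import sqlite3
from html.parser import HTMLParser

# Invented markup, NOT the real guestbook format.
SAMPLE_HTML = """
<div class="entry">
  <span class="name">Alice</span>
  <span class="date">2004-05-01</span>
  <p class="text">First post &amp; hello</p>
</div>
<div class="entry">
  <span class="name">Bob</span>
  <span class="date">2004-05-02</span>
  <p class="text">Nice board!</p>
</div>
"""


class GuestbookParser(HTMLParser):
    """Collect one dict per entry; assumes entries contain no nested divs."""

    FIELDS = {"name", "date", "text"}

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.entries = []
        self._current = None  # entry being built, or None outside an entry
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "entry":
            self._current = {field: "" for field in self.FIELDS}
        elif self._current is not None and css_class in self.FIELDS:
            self._field = css_class

    def handle_endtag(self, tag):
        if tag == "div" and self._current is not None:
            cleaned = {k: v.strip() for k, v in self._current.items()}
            self.entries.append(cleaned)
            self._current = None
        elif tag in ("span", "p"):
            self._field = None

    def handle_data(self, data):
        if self._current is not None and self._field is not None:
            self._current[self._field] += data


parser = GuestbookParser()
parser.feed(SAMPLE_HTML)
parser.close()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (author TEXT, posted_at TEXT, body TEXT)")
# Named placeholders pull the values straight out of each entry dict.
conn.executemany(
    "INSERT INTO posts VALUES (:name, :date, :text)", parser.entries
)

# Plain substring search; no full-text index involved.
hits = conn.execute(
    "SELECT author FROM posts WHERE body LIKE ?", ("%hello%",)
).fetchall()
```

LIKE with a leading wildcard scans every row, which should be acceptable for a guestbook-sized table; a full-text index could be added later if searches turn out to be slow.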
Don't anthropomorphize computers - they don't like it.
Tommy
 
Posts: 223
Joined: Mon Dec 01, 2008 9:22 pm
Location: Vienna

Re: Cataloguing the guestbook

Postby thefinerminer » Wed Mar 28, 2012 11:07 pm

What a great idea!

Many years ago, when I knew nothing about programming, I started copying and pasting posts into Excel (and adding dates, name of player, category for easy sorting). I got bored after about 500 entries. Parsing it should be pretty simple, although:

1) The 2009-2011 guestbook (I have it zipped but haven't uploaded yet) includes replies to posts, so the parse code will need to be modified from what is used to parse 2000-2008.
2) I would like a database column for playerid added to the code. After the guestbook is parsed, a specialist (me?) can sort the posts by name and add in the playerid to non-spam posts. This may be useful as I know everyone's nicknames and can identify who did what (to help future people quote them for articles etc or to understand who was arguing with who).
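Both points could be covered by two nullable columns, sketched here under assumptions: `parent_id` links a reply to the post it answers, and `player_id` stays empty until someone identifies the author by hand. The `players` table, the `assign_player` helper, and all nicknames and rows below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables; names are illustrative only.
    CREATE TABLE players (
        player_id INTEGER PRIMARY KEY,
        real_name TEXT NOT NULL
    );
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES posts(id),           -- NULL for top-level posts
        player_id INTEGER REFERENCES players(player_id),  -- NULL until identified
        nickname  TEXT NOT NULL,                          -- name as typed by the poster
        body      TEXT NOT NULL
    );
    """
)

# Made-up data: one player posting under two nicknames, plus a visitor.
conn.execute("INSERT INTO players VALUES (1, 'Example Player')")
conn.executemany(
    "INSERT INTO posts (id, parent_id, nickname, body) VALUES (?, ?, ?, ?)",
    [
        (1, None, "fastclicker", "New record!"),
        (2, 1, "FC", "Replying under another nickname"),
        (3, 1, "visitor", "Well done"),
    ],
)


def assign_player(conn, player_id, nicknames):
    """Tag every post written under any of the given nicknames."""
    placeholders = ", ".join("?" for _ in nicknames)
    conn.execute(
        f"UPDATE posts SET player_id = ? WHERE nickname IN ({placeholders})",
        [player_id, *nicknames],
    )


assign_player(conn, 1, ["fastclicker", "FC"])

# All posts by the player, whatever nickname was used.
player_posts = conn.execute(
    "SELECT id FROM posts WHERE player_id = 1 ORDER BY id"
).fetchall()
# Replies to post 1.
replies = conn.execute(
    "SELECT id FROM posts WHERE parent_id = 1 ORDER BY id"
).fetchall()
# Posts still waiting to be identified (or spam).
unassigned = conn.execute(
    "SELECT nickname FROM posts WHERE player_id IS NULL"
).fetchall()
```

Keeping the original nickname alongside `player_id` means the manual tagging never overwrites what was actually posted, and the 2000-2008 posts simply leave `parent_id` empty.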
thefinerminer
Site Admin
 
Posts: 113
Joined: Tue Jan 08, 2008 3:33 pm
Location: UK, Scotland

Re: Cataloguing the guestbook

Postby aradesh » Thu Mar 29, 2012 1:01 am

thefinerminer: that would be awesome, and a terribly tedious job for you! But it would be cool, in that you'd be able to figure out when the same person has used a different name and connect their posts together with the same playerid.

Also, if I write a script to get all of the current GB posts and comments, you can restart the GB with 0 pages.
aradesh
 
Posts: 90
Joined: Sat Aug 29, 2009 3:37 pm

Re: Cataloguing the guestbook

Postby EWQMinesweeper » Thu Mar 29, 2012 9:02 am

Hehehe, I recall one or two posts made under a different nickname. I guess I might be able to help there too.

Maybe we should make a list of columns that will be helpful in the future when searching for a certain GB post. What do you think?
„Das perlt jetzt aber richtig über, ma sagn. Mach ma' noch'n Bier! Wie heißt das? Biddä! Bidddää! Biddddäää! Reiner Weltladen!“
EWQMinesweeper
 
Posts: 410
Joined: Sun Nov 30, 2008 11:50 pm

