Cataloguing the guestbook

aradesh
Posts: 87
Joined: Sat Aug 29, 2009 3:37 pm

Cataloguing the guestbook

Post by aradesh »

There's been a fair bit of talk about the guestbook and the forums lately, and I've thought before it would be nice - generally for looking things up and for historical purposes - to put the contents of the guestbook into a nice database format.

I'd be willing to do the work for this - but I have almost no experience with databases and this sort of thing.

Could I have some advice on what would be a nice standardized format to store all the posts from the GB, so that it would be easy to program a website to do things like searching with conditions on the name, the contents of the post, restrictions on time period, etc.? For instance, I might wish to see every post Kamil made on the GB to follow his progress from beginner to pro.

I could probably hack some tables and rudimentary things together in Python, but it would likely be inflexible and not particularly useful for other people.

What do you guys think?
Tommy
Posts: 257
Joined: Mon Dec 01, 2008 9:22 pm
Location: Vienna

Re: Cataloguing the guestbook

Post by Tommy »

Sounds interesting!

The HTML from http://www.minesweeper.info/downloads/Guestbook11.html, for instance, should be easy enough to parse.

Writing the script that parses the old GB HTML files and stores them in some database would already be huge. Damien would have to write the frontend, since that would be hosted on ms.info. But that isn't much work: a bit of SQL would be fast enough for our purposes (i.e., I don't think we need to bother with a full-text index yet; we can just use LIKE).
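To make the "just use LIKE" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `posts`, its columns, the helper `search_posts`, and the sample rows are all assumptions made up for illustration; nothing here reflects how ms.info actually stores anything.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,   -- nickname as typed in the guestbook
    posted_at TEXT NOT NULL,   -- ISO 8601 date, so string comparison orders by time
    body      TEXT NOT NULL
)
"""

def search_posts(conn, name=None, text=None, start=None, end=None):
    """Return matching posts, oldest first. Every filter is optional."""
    clauses, params = [], []
    if name is not None:
        clauses.append("name LIKE ?")      # LIKE is case-insensitive for ASCII in SQLite
        params.append(name)
    if text is not None:
        clauses.append("body LIKE ?")      # note: % and _ in `text` act as wildcards
        params.append(f"%{text}%")
    if start is not None:
        clauses.append("posted_at >= ?")
        params.append(start)
    if end is not None:
        clauses.append("posted_at <= ?")
        params.append(end)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    sql = "SELECT name, posted_at, body FROM posts" + where + " ORDER BY posted_at"
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO posts (name, posted_at, body) VALUES (?, ?, ?)",
    [  # made-up sample rows, not real guestbook posts
        ("Kamil", "2005-03-01", "First sub-100 on expert!"),
        ("Kamil", "2008-07-15", "New record on intermediate."),
        ("Tommy", "2008-07-16", "Congrats on the record."),
    ],
)

# Every post by one player, in chronological order.
for row in search_posts(conn, name="kamil"):
    print(row)
```

Values are passed as `?` parameters rather than pasted into the SQL string, which matters once the search box is exposed on a public site. If LIKE ever gets too slow, a full-text index could be added later without changing the table.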
Don't anthropomorphize computers - they don't like it.
thefinerminer
Site Admin
Posts: 136
Joined: Tue Jan 08, 2008 3:33 pm
Location: UK, Scotland

Re: Cataloguing the guestbook

Post by thefinerminer »

What a great idea!

Many years ago, when I knew nothing about programming, I started copying and pasting posts into Excel (and adding dates, name of player, category for easy sorting). I got bored after about 500 entries. Parsing it should be pretty simple, although:

1) The 2009-2011 guestbook (I have it zipped but haven't uploaded yet) includes replies to posts, so the parse code will need to be modified from what is used to parse 2000-2008.
2) I would like a database column for playerid added to the code. After the guestbook is parsed, a specialist (me?) can sort the posts by name and add in the playerid to non-spam posts. This may be useful as I know everyone's nicknames and can identify who did what (to help future people quote them for articles etc or to understand who was arguing with who).
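Both points could be handled in the table layout itself. The sketch below is only one possible arrangement, again using Python's sqlite3: the column names (`parent_id`, `is_spam`, `playerid`), the helper `assign_playerids`, the sample rows, and the playerid value 42 are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: all table and column names here are assumptions.
conn.execute("""
CREATE TABLE posts (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES posts(id),  -- NULL for top-level posts, set for replies
    name      TEXT NOT NULL,
    body      TEXT NOT NULL,
    is_spam   INTEGER NOT NULL DEFAULT 0,
    playerid  INTEGER                        -- NULL until someone identifies the author
)
""")
conn.executemany(
    "INSERT INTO posts (id, parent_id, name, body, is_spam) VALUES (?, ?, ?, ?, ?)",
    [  # made-up rows for illustration only
        (1, None, "aradesh", "made-up top-level post", 0),
        (2, 1, "ara", "made-up reply under another nickname", 0),
        (3, None, "aradesh", "buy cheap watches", 1),
    ],
)

def assign_playerids(conn, nickname_to_playerid):
    """Fill in playerid for non-spam posts whose nickname is in the mapping."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "UPDATE posts SET playerid = ? WHERE name = ? AND is_spam = 0",
            [(pid, nick) for nick, pid in nickname_to_playerid.items()],
        )

# Two nicknames mapped to the same (invented) playerid.
assign_playerids(conn, {"aradesh": 42, "ara": 42})
print(conn.execute("SELECT id, name, playerid FROM posts ORDER BY id").fetchall())
```

Leaving `playerid` empty at parse time means the parser never has to guess who wrote what; the mapping from nicknames to players can be filled in by hand afterwards and re-run as often as needed.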
aradesh
Posts: 87
Joined: Sat Aug 29, 2009 3:37 pm

Re: Cataloguing the guestbook

Post by aradesh »

thefinerminer: that would be awesome. and a terribly tedious job for you! but it would be cool - in that you'd be able to figure out when the same person has used a different name and connect them together with the same playerid.

also if i write a script to get all the current GB and comments, you can restart the GB with 0 pages.
EWQMinesweeper
Posts: 419
Joined: Sun Nov 30, 2008 11:50 pm

Re: Cataloguing the guestbook

Post by EWQMinesweeper »

hehehe - i recall a post or two made under a different nickname. i guess i might be able to help there too.

maybe we should make a list of columns that will be helpful in the future when searching for a certain gb post. what do you think?
"Now that's really fizzing over, let's say. Get me another beer! What's the word? Pleease! Pleeease! Pleeeease! Reiner Weltladen!"