by lpetrich » Thu Feb 25, 2010 12:14 pm
I've saved a few threads that I've posted in and collected the Political Compass scores, but the horrid search engine won't find all of my posts for me.
Since more than one person here is saving threads en masse, I suggest that we coordinate our efforts and create lists of all the threads and posts that we have succeeded in saving. Not by hand, of course, but by writing programs that can walk the directories, and read and parse the files.
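As a rough illustration of the directory-walking part, here is a minimal sketch. It assumes the saved pages are ordinary files ending in .html or .htm somewhere under one top-level folder; the function name find_saved_pages and the extension list are made up for this example, and the parsing step is left out because it depends on how each of us saved the pages.

```python
import os

def find_saved_pages(root, extensions=(".html", ".htm")):
    """Yield the paths of saved pages under root (hypothetical helper).

    Assumption: saved pages can be recognized by file extension alone.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            if name.lower().endswith(extensions):
                yield os.path.join(dirpath, name)

if __name__ == "__main__":
    import sys
    # Usage: python (program) (top-level directory)
    for path in find_saved_pages(sys.argv[1]):
        print(path)
```

Each path it prints would then be handed to whatever parser fits the saved format.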
I've had some experience doing similar sorts of things with saved messageboard posts, though on a much smaller scale.
What one might do is make a giant list of forum IDs, thread IDs, and post IDs, like:
f <id>
t <id>
p <id>
for thread <id> in forum <id> and post <id> in thread <id>. The posts could have other identifying info, like username, date, etc., and the threads and forums their names. Once we get these humongous lists, it's possible to compare them and find out who has what.
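Comparing two such lists could look something like the sketch below. It assumes each list is plain text with one "f <id>", "t <id>", or "p <id>" entry per line, that a thread belongs to the most recent forum line above it, and that a post belongs to the most recent thread line above it. The function names and the sample IDs are invented for illustration, and any extra info on a line (username, date, names) is simply ignored here.

```python
def parse_id_list(lines):
    """Turn 'f/t/p <id>' lines into a set of (forum, thread, post) tuples.

    Assumption: entries are nested by order, so each 't' line sits under
    the last 'f' line and each 'p' line under the last 't' line.
    """
    entries = set()
    forum = thread = None
    for raw in lines:
        parts = raw.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        kind, ident = parts[0], parts[1]
        if kind == "f":
            forum, thread = ident, None
        elif kind == "t":
            thread = ident
        elif kind == "p":
            entries.add((forum, thread, ident))
    return entries

def compare(mine, theirs):
    """Report which posts only one of us has, and which we both have."""
    return {
        "only_mine": mine - theirs,
        "only_theirs": theirs - mine,
        "both": mine & theirs,
    }

if __name__ == "__main__":
    # Made-up sample lists, not real forum data.
    mine = parse_id_list(["f 3", "t 10", "p 100", "p 101"])
    theirs = parse_id_list(["f 3", "t 10", "p 101", "t 11", "p 200"])
    for label, posts in compare(mine, theirs).items():
        print(label, sorted(posts))
```

Note that this only tracks posts, so a thread saved with no posts in it would not show up; tracking threads and forums separately would take another set or two.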
I like to write in Python and run my programs from the command line; I run AppleBSD a.k.a. NextStep a.k.a. MacOS X. I don't know how easy it is in Windows to run something like this on the command line:
python (program) (args)
Finally, I have a lot of experience with vBulletin. It's payware ($195/license), it is written in PHP, and it uses a MySQL database to store its data. I've also seen vBulletin run with huge databases, like in IIDB, now FRDB -- about 6 gigabytes. vBulletin users have also written oodles of add-ons for it, like custom-BBCode tables and plugins that do the work of various template edits.