Thursday, June 17, 2010

Crawl #2 Coming Soon! (and a request)

Crawl #2 is a two-parter: grab all profiles, then grab all portfolios. Mind you, that's not the full process. The pages then still need to be run through the processing code that pulls the data out of them and loads it all into a database. This is NOT a short process, folks: each half of the crawl takes 1-2 hours, and overall you're looking at about 5-6 hours (currently). A retooling of the code will help shorten that timespan, but for now, it is what it is.

I'll be using this second crawl to produce member lists for all personal communities, as well as a list of private community names. I won't be publishing member lists of private communities, though I will provide them to the admins of those communities upon request. Please, though, don't request a member list if you don't have over 50 members; that's just lazy.

And now for the request: what data do you want me to compile here? Different statistics about communities? Perhaps connections between followers/follow-backs/investors? The sky's the limit, so let's hear it! You'll be actively contributing to the game, because I'll be using your suggestions to identify ways the EA developers can change their code to make the necessary data easier to retrieve!

1 comment:

  1. I love that you are doing this, and of course especially that you are willing to share.

    I'm interested in the basics: how many active players there are (activity in the last three days), how many quiet accounts there are (say, activity in the last month but not in the last three days), and how many total accounts there are. And, of course, whether these numbers/percentages are trending up or down.

    I think we're all interested in information like this...

    Thanks!

    (e)Smoodle

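For anyone curious how the activity buckets suggested in the comment above could be counted, here is a minimal sketch. It is not the actual processing code: the function names, the 30-day reading of "last month", and the idea that each account has a single last-activity timestamp are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Windows taken from the comment; treating "last month" as 30 days is an assumption.
ACTIVE_WINDOW = timedelta(days=3)
QUIET_WINDOW = timedelta(days=30)


def classify(last_active, now):
    """Put one account into a bucket based on its last-activity time.

    `last_active` is a datetime, or None if no activity was ever seen
    (a hypothetical representation; the real crawl data may differ).
    """
    if last_active is None:
        return "inactive"
    age = now - last_active
    if age <= ACTIVE_WINDOW:
        return "active"
    if age <= QUIET_WINDOW:
        return "quiet"
    return "inactive"


def summarize(last_active_by_account, now):
    """Count accounts per bucket, plus the overall total."""
    counts = Counter(classify(ts, now) for ts in last_active_by_account.values())
    return {
        "total": len(last_active_by_account),
        "active": counts["active"],
        "quiet": counts["quiet"],
        "inactive": counts["inactive"],
    }
```

Running `summarize` on the results of two different crawls and comparing the counts would give the up/down trend, assuming both crawls cover the same set of accounts.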