Twitter is weird. I have to say, I do not understand Twitter. I was doing some research with it and I discovered something slightly surprising. Check out this ugly command line. (Or safely ignore it.)
for N in {1..1000}; \
do ( \
T=$(sort -R $WORDS | grep -v "'"| head -n1)$(( $RANDOM % 10 ));\
set -o pipefail; \
echo -n ${T}: ; \
if wget -qO- https://twitter.com/${T} \
| sed -n -e "/Account suspended/s/^.*$/suspended/p" \
-e "/>Joined/s/^.*$/active/p"; \
then true; \
else echo "none"; fi ;) \
| tee -a result ; done
It plucks a random word from my Linux spelling dictionary
(WORDS=/usr/share/dict/words) and appends a random digit (0-9). The
result, something like "plateau4" or "spryest9", is treated as a Twitter
account name which I then try to access. I check whether the account is
"active", "suspended", or "none". None means that the random string
turns out not to be an account at all. Of course this implementation is
slow because it shuffles the whole dictionary every time, but I left that
alone so as not to give Twitter a reason to block me. I did this 1000 times.
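The name-generation step on its own can be sketched as a small function. This is only an illustration, not the command used above: the function name `candidate` is made up, and it assumes GNU coreutils `shuf` is installed. It samples one line directly instead of shuffling the whole dictionary, so it is quicker than the `sort -R` approach (which, for this experiment, is not necessarily what you want).

```shell
# Hypothetical helper: print one candidate name such as "plateau4".
# Assumes GNU coreutils (shuf); WORDS defaults to the path used in the post.
WORDS=${WORDS:-/usr/share/dict/words}

candidate() {
    # Drop words containing an apostrophe, then pick one line at random.
    word=$(grep -v "'" "$WORDS" | shuf -n1)
    # Pick a random digit from 0 to 9.
    digit=$(shuf -i 0-9 -n1)
    printf '%s%s\n' "$word" "$digit"
}
```

Each call to `candidate` prints one new name, which could stand in for the `T=...` assignment in the loop.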
Now is a good time to use my Unix pie chart trick with something like this.
cut -d: -f2 result | sort | uniq -c | pie
Which produces the following.
Are there really so many suspended accounts? I’m sure someone went through long ago with a script much like mine and made a bunch of accounts. (Don’t they have captchas or something?) But if Twitter could suspend them, presumably for having some sign that they were created by a bot, couldn’t they then release them back into the namespace pool? Seems strange to me. Although I was working on a different piece of research which I may write more about later, my original hypothesis, that "Twitter is weird", has not been falsified.
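For anyone without the `pie` helper, the tally step can also be sketched as plain text percentages. This is an assumption-laden illustration: the function name `tally` is made up, and it expects a file of "name:status" lines like the `result` file written by the loop.

```shell
# Hypothetical helper: count each status in a "name:status" file and
# print the count and its share of the total, largest count first.
tally() {
    cut -d: -f2 "$1" | sort | uniq -c |
        awk '{ n[$2] = $1; total += $1 }
             END { for (s in n)
                       printf "%-10s %5d %5.1f%%\n", s, n[s], 100 * n[s] / total }' |
        sort -k2,2nr
}
```

Running `tally result` prints one line per status with its count and percentage.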