Quantifying Memory: Detecting bots

This is part 3 of the series about Nashi bots.
If I were doing this today I would approach the problem differently, but to my knowledge NodeXL is still a viable way of accessing the Twitter API.

Part 1 - theory
Part 2 - the leaked email correspondence

Detecting Nashi’s bots

Kambulov was hired to create and populate a number of social network accounts, and he apparently completed his task efficiently enough. His motivation, though, was to deliver a set number of user profiles rather than to ensure that they appeared realistic on close inspection. Consequently, the Nashi bots are easy to identify.

The creation of online accounts leaves a multitude of traces: when the account was created, what email address or phone number was used to activate it, and of course links to other online users through correspondence or lists of ‘friends’ and ‘followers’. To give the semblance of being live accounts, each bot had to acquire followers. One way of doing this is to follow a list of accounts, see which of these users ‘follow back’, and unfollow those that do not. The second way is to set the bot accounts to follow each other.

One way the Nashi Twitter bots functioned was by automatically reposting content originally posted by Nashi commissars. Consequently, bots may be identified by locating accounts that immediately reposted messages from Nashi commissars. Exploring the followers of these users reveals a mixture predominantly made up of follow-back accounts and bots. Bot accounts may be identified by clusters in account details, for instance user-name patterns or date of creation. Consequently, whole networks may be unravelled by identifying one bot, downloading the list of its followers, and identifying clusters or patterns in user details.

The details of Twitter accounts may be accessed in numerous ways through the Twitter API. This is made relatively simple by a library for Python or a package for R, and is also possible in a user-friendly environment such as Excel. Using the NodeXL plugin I was able, in a matter of minutes, to identify a large botnet created in November-December 2011, linked to Kristina Potupchik and other Nashi commissars, and dormant until February 2012. A small part of the data is included below:

Notice how the usernames (Vertex) look randomly generated, how the ‘Location’ looks like it may be from a list of Russian names, while the ‘Joined Twitter’ times are very closely clustered.
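The clustering of join times described above can also be checked programmatically. What follows is only a minimal sketch in Python, not the NodeXL workflow I actually used: the function name `creation_bursts`, the ten-minute gap, the minimum burst size and the sample accounts are all assumptions made up for illustration, and real follower data would first have to be exported from NodeXL or fetched through the Twitter API.

```python
from datetime import datetime, timedelta


def creation_bursts(accounts, max_gap=timedelta(minutes=10), min_size=3):
    """Group accounts whose join times form a tight chain.

    accounts: iterable of (username, joined) pairs, where joined is a datetime.
    Accounts are sorted by join time; consecutive accounts stay in the same
    burst while the gap between them is at most max_gap. Only bursts with at
    least min_size accounts are returned.
    """
    ordered = sorted(accounts, key=lambda account: account[1])
    bursts, current = [], []
    for name, joined in ordered:
        # A gap larger than max_gap closes the burst collected so far.
        if current and joined - current[-1][1] > max_gap:
            if len(current) >= min_size:
                bursts.append(current)
            current = []
        current.append((name, joined))
    if len(current) >= min_size:
        bursts.append(current)
    return bursts


# Made-up sample data: four accounts registered within minutes of each
# other, plus two unrelated accounts created years earlier.
followers = [
    ("example_user_a", datetime(2011, 11, 28, 14, 2)),
    ("example_user_b", datetime(2011, 11, 28, 14, 5)),
    ("example_user_c", datetime(2011, 11, 28, 14, 9)),
    ("example_user_d", datetime(2011, 11, 28, 14, 16)),
    ("longtime_reader", datetime(2009, 3, 14, 8, 30)),
    ("another_reader", datetime(2010, 7, 2, 19, 45)),
]

for burst in creation_bursts(followers):
    names = [name for name, _ in burst]
    print(len(burst), "accounts joined between", burst[0][1], "and", burst[-1][1], names)
```

A burst on its own proves nothing, since genuine users also sign up in waves around major events. It becomes telling when it coincides with the other patterns noted above, such as random-looking usernames and locations drawn from a list.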

Here is a clear example of efficient automation techniques being used. The simplest automation techniques are the most profitable for the person creating them, but they are also the easiest to detect because of the patterns that lazy automation leaves behind. Many of the accounts I found in this network were well maintained, possibly using chatterbots or rewriters, but because they shared many characteristics with dormant, obviously fake accounts, they could easily be traced. Had Nashi themselves created powerful automation techniques, such patterns could have been removed altogether, and researchers would struggle to pinpoint fake accounts. They could also have isolated a set of bots used for spamming from a set of bots they hoped would pass as live users.

Conclusion

As Nashi expanded their online activity in an attempt to dominate an increasingly active online community they began to rely on technically skilled but less politically committed outsiders. The technical possibilities of macros and other forms of automation seem to have aided middlemen in defrauding their superiors, rather than in streamlining Nashi’s online activity. Considering that the higher levels of the pyramid exhibited the least understanding of automated techniques, there is no guarantee that mid-level actors who claimed to hire ‘internet lemmings’ actually did so.

Let’s return briefly to the robot analogy from the introduction: hiring bot services is like buying a robot pre-programmed to fulfil particular tasks. As the robot owner you are hopeful the product will work, but probably you don’t understand how it functions. If it malfunctions, you are unlikely to be able to fix it; if a task is too complicated for the robot, you won’t be able to reprogram it. Consequently, you will either have to buy a new robot, build a new robot, or forget the robot and pay someone to do the task. Nashi overwhelmingly favoured options one and three – they brought in technology from outside, and shaped it to be used through subordinates in campaigns where a few individuals simulated the activity of many.

In conclusion, there is no evidence that Nashi invested in creating sustainable, hard-to-detect bots. The correspondence indicates that the campaign was seen as a one-off, and that it expanded a pre-existing program where activists were paid to promote Putin online. Nashi’s focus was to create and disseminate online content; in so doing they borrowed techniques from the internet shadow economy, but they wanted a high standard maintained in their online footprint, and consequently preferred to pay humans to simulate grassroots support for the organisation online. Bot activity was either unsanctioned or provided by outsiders, which resulted in a diverse and poorly disguised online trace. Consequently, it is easy to trace Nashi’s ‘dead souls’.
