Challenges of Spam Detection in Social Networks

I have been working on detecting spammers on social networks lately. Social networks detect spam and spammers mostly by using users’ activities and spam reports. If you read their spam and abuse policies (e.g. Twitter’s), they clearly say how an account gets flagged as a spammer. But in your own services, you do not have the power they have. Let’s look at Twitter’s. The only conditions there that are even potentially usable for us are:

  • if you have followed and/or unfollowed large amounts of users in a short time period, particularly by automated means (aggressive following or follower churn)
    my comment: we probably can’t measure this because we can’t repeatedly update users’ information
  • if you repeatedly create false or misleading content in an attempt to bring attention to an account, service or link
    my comment: we can’t get old tweets or check the links that a user posted
  • if you post misleading links (e.g. affiliate links, links to malware/click jacking pages, etc.)
    my comment: we probably can’t decide whether the content is misleading without using 3rd party services
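To make the first condition concrete: if you could store two snapshots of the accounts a user follows, follower churn could be estimated roughly as in the sketch below. This is only an illustration, not how Twitter measures it. The function names and the 0.5 threshold are assumptions made up for this example.

```python
def follow_churn(previous, current):
    """Compare two snapshots of the account IDs a user follows.

    Returns (followed, unfollowed, churn_ratio), where churn_ratio is the
    number of changes relative to the size of the earlier snapshot.
    """
    previous, current = set(previous), set(current)
    followed = current - previous      # accounts added since the last snapshot
    unfollowed = previous - current    # accounts dropped since the last snapshot
    baseline = max(len(previous), 1)   # avoid dividing by zero for new accounts
    churn_ratio = (len(followed) + len(unfollowed)) / baseline
    return len(followed), len(unfollowed), churn_ratio


def looks_aggressive(previous, current, threshold=0.5):
    """Hypothetical rule: flag the account when churn reaches the threshold."""
    return follow_churn(previous, current)[2] >= threshold
```

The time between the two snapshots matters as much as the ratio, so in practice the threshold would have to be tuned per network and per snapshot interval.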

As you can see, sadly, they are not very usable for us. And since you can monitor neither user activities nor all of a user’s posts, including old ones, you have to detect spam with what you have. That brings some serious challenges:

  • Social spammer behaviors change too fast.
  • A system that is capable of capturing most of the spam this month may fail to do so next month.
  • Spammers get smarter and can create new, more organic accounts to avoid being detected.
  • Once the spammers see that their fake accounts are caught by the system, they come up with a new strategy to deceive it.
  • The spammers do not form a cluster of their own; they are well integrated into the larger social network.
  • Each social network has its own spam culture. One network’s solution usually does not fit another’s.

As you see, it’s very hard to detect spam and spammers on social networks. These challenges are not things you can solve with maths alone. The solution is a bit complicated and I will write about it later. Lastly, my advice is that if you want to detect spammers on social networks, you should be a spammer first.

P.S. Just kidding, don’t spam. Just do whatever you can to know them very well and to think like them.

Where should we locate the complexity?

Everyone is talking about minimalism. The main idea is that we should design everything to be simple. But how? Many product designers do this by dropping features or by splitting the main product into child products or different versions. If you have not designed hundreds of features, this is probably not the right way. But if you have designed your product with just enough features, how can you make it simple? Is this a UI problem? Maybe, but fixing the UI is not enough. The only way to make things simpler is to make the back-end complex and smart enough.

  • Make your features smart.
  • Design every feature to learn from its users.

Know that every action of the user is very valuable. Make their actions work for them. Use probabilities, use heuristics. And process all this complex information on your back-end to understand your users and to give them simplicity. No one sees your back-end. Do you even know how you can read these words, or how your visual cortex processes what you see right now? Probably not. And you may not care, either. Neither do your users. They don’t care how complex your back-end is. The only thing we all care about is what we see and how easy it is to use. So, hide your complexity in the back-end. And use it smartly.
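As a small illustration of a feature that learns from its users, here is a minimal sketch of a menu that reorders itself by how often each option is used. The class name `LearningMenu` and its methods are hypothetical and exist only for this example. The heuristic (most used first) is just one simple choice among many.

```python
from collections import Counter


class LearningMenu:
    """A menu that puts the options a user picks most often at the top."""

    def __init__(self, options):
        self._options = list(options)  # original, designer-chosen order
        self._uses = Counter()         # how many times each option was picked

    def record_use(self, option):
        """Call this from the back-end every time the user picks an option."""
        if option not in self._options:
            raise ValueError(f"unknown option: {option!r}")
        self._uses[option] += 1

    def ordered(self):
        """Most-used options first; ties keep the original order."""
        # sorted() is stable, so equally used options stay in their
        # designer-chosen positions relative to each other.
        return sorted(self._options, key=lambda option: -self._uses[option])
```

The user never sees the counter. They only notice that the thing they need is usually the first thing on the screen.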

Cloud Service Provider Comparison Chart

Since I have been using the cloud computing engines of various service providers, I often need to compare the specifications and hourly base prices of their services. So I created a Google Sheet covering 117 different servers from 6 different service providers.

Click to view Google Spreadsheet Document

Please note that there may be additional fees beyond the hourly prices, such as data transfer (bandwidth) and software license fees. Please check these fees before making a purchase.

Starting over

It’s been years since I first made my blog live. Like every amateur developer who wanted to develop their own blogging system, I developed mine too. (Don’t judge me, it was 2004.) Years later, I moved everything to WordPress. But because I did not update its core files and database for years, it’s now un-migratable, and I don’t have time to fix that. So I decided to start over.

I will try to make daily updates and I hope I can.