Twitter has called upon the software application developer community to help in the global fight against hacking and spammers. The company has released its AnomalyDetection software tool to open source on the GitHub code repository. Twitter hopes that this open release will a) allow the community to learn from the software […]
Spikes and surges
When Twitter talks about ‘anomalies’ it is referring to spikes and surges of traffic on the network that can be caused by both legitimate and malicious activity.
“Both last year and this year, we saw a spike in the number of photos uploaded to Twitter on Christmas Eve, Christmas and New Year’s Eve (in other words, an anomaly occurred in the corresponding time series),” said Twitter, on the firm’s technical blog.
So while Christmas photo upload spikes are a genuine, discrete event for Twitter, similar unusual traffic surges can also be caused by spam bots and hacking activity. With firms now increasingly operating big data analytics databases and real-time network and cloud-based services, unwelcome (and unplanned) traffic surges can result in denial of service, website downtime and deeper offline problems.
Machine learning & algorithmic logic
Twitter’s AnomalyDetection is an open-source package for the R statistical computing language, designed to detect anomalies automatically. It is built around algorithmic logic designed to accommodate anomaly detection in the presence of seasonality and an underlying trend. Closely related to the discipline of machine learning, anomaly detection in this case employs ‘piecewise approximation’, a mathematical technique that enables the software to extract the trend from a set of traffic data.
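To make the idea of detecting anomalies in the presence of seasonality more concrete, here is a minimal Python sketch. It is not the algorithm used by Twitter's R package, and the function name `seasonal_anomalies`, its parameters and the sample data are all assumptions made for illustration. It subtracts a per-phase seasonal median from each observation and flags points whose residual is extreme relative to the median absolute deviation (MAD). It does not attempt to remove an underlying trend.

```python
from statistics import median


def seasonal_anomalies(values, period, threshold=3.5):
    """Return indices of points far from their seasonal baseline.

    Illustrative only: 'period' is the assumed length of one seasonal cycle
    and 'threshold' is a cut-off on a MAD-based robust z-score.
    """
    # Seasonal baseline: the median of all observations at the same phase.
    baseline = [median(values[phase::period]) for phase in range(period)]

    # Residual: what is left once the seasonal baseline is removed.
    residuals = [v - baseline[i % period] for i, v in enumerate(values)]

    # Robust centre and spread, so a few large spikes cannot hide themselves.
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    if mad == 0:
        # No measurable spread in the residuals; this sketch flags nothing.
        return []

    # 0.6745 rescales the MAD so scores resemble standard z-scores
    # when the residuals are roughly normally distributed.
    return [
        i for i, r in enumerate(residuals)
        if 0.6745 * abs(r - center) / mad > threshold
    ]


# Hypothetical traffic counts: a four-step cycle with small fluctuations.
pattern = [10, 20, 30, 20]
values = [pattern[i % 4] + (i % 3) - 1 for i in range(24)]
values[9] += 50  # inject one artificial surge

print(seasonal_anomalies(values, period=4))  # → [9]
```

Medians are used rather than means and standard deviations because a large spike would otherwise inflate the very baseline it is being compared against.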
According to Twitter, “Early detection of anomalies plays a key role in ensuring high-fidelity data is available to our own product teams and those of our data partners. This package helps us monitor spikes in user engagement on the platform surrounding holidays, major sporting events or during breaking news. The package can be used to find such bots or spam, as well as detect anomalies in system metrics after a new software release. We’re open-sourcing AnomalyDetection because we’d like the public community to evolve the package and learn from it as we have.”
When are you an anomaly?
If you want a down-to-earth example of anomalous behavior in your own data, consider the fact that your bank now asks you to tell it when you are abroad. If a user typically only ever uses his or her credit card within a 100-mile radius in the state of Maryland and only ever spends on food, petrol, entertainment, clothes and other sundries, then a statistical pattern is established.
If that same user suddenly draws out cash in Amsterdam and then pays for an expensive dinner in Dubai, then anomaly alerts register at the bank and guess what happens to your credit card approval?
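The credit card scenario can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any real bank scores transactions: the home location, the 100-mile radius, the list of usual spending categories and the function names are all assumptions taken from, or invented to fit, the example above.

```python
import math

EARTH_RADIUS_MILES = 3958.8

# Assumed profile for the hypothetical Maryland cardholder.
HOME = (39.29, -76.61)  # approximate latitude/longitude of Baltimore
USUAL_RADIUS_MILES = 100
USUAL_CATEGORIES = {"food", "petrol", "entertainment", "clothes"}


def distance_miles(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def is_unusual(location, category):
    """Flag a transaction that falls outside the established pattern."""
    too_far = distance_miles(HOME, location) > USUAL_RADIUS_MILES
    new_category = category not in USUAL_CATEGORIES
    return too_far or new_category


# Approximate coordinates, for illustration only.
print(is_unusual((38.98, -76.49), "food"))  # groceries in Annapolis
print(is_unusual((52.37, 4.90), "cash"))    # cash withdrawal in Amsterdam
print(is_unusual((25.20, 55.27), "food"))   # dinner in Dubai
```

The Annapolis purchase falls inside the usual pattern, while the Amsterdam and Dubai transactions are flagged because they are far outside the 100-mile radius.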