Sinkhole filtering is handled in two ways. The first is a list-based system that draws on predefined sources of fully qualified domain names (FQDNs) that have been “blacklisted”. This blacklist is simply a list of domains that are known threats, ad servers, tracking systems, or otherwise unwanted third-party services. The second is a machine learning technique called classification, which uses a statistical probability derived from data points to determine the class of a request, resulting in the request being accepted or denied.
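The list-based check can be sketched as a simple set lookup. This is a minimal illustration, not the sinkhole's actual implementation: the function names, the sample entries, and the choice to also block subdomains of a listed name are all assumptions made for the example.

```python
def normalize(fqdn: str) -> str:
    """Lowercase the name and drop any trailing root dot."""
    return fqdn.strip().lower().rstrip(".")


def is_blacklisted(fqdn: str, blacklist: set[str]) -> bool:
    """Return True if the name or any of its parent domains is listed.

    Matching parent domains is an assumption of this sketch; a stricter
    filter would only test the exact name.
    """
    labels = normalize(fqdn).split(".")
    # Check "a.b.example.net", then "b.example.net", then "example.net", ...
    return any(".".join(labels[i:]) in blacklist for i in range(len(labels)))


# Hypothetical entries; a real deployment would load these from its list sources.
BLACKLIST = {"ads.example.net", "tracker.example.org"}

print(is_blacklisted("cdn.ads.example.net.", BLACKLIST))
print(is_blacklisted("www.example.com", BLACKLIST))
```

A set is used so that each lookup takes roughly constant time regardless of how many names the blacklist sources contribute.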
In our instance of the classification engine, a training set has already been provided and the engine has been pre-trained. The engine uses a Bayesian method called probabilistic classification, with which it predicts, with an associated probability rather than a bare yes or no, whether an outgoing request is associated with its current data set.
As each request is received, the classification algorithm turns it into a data point to compare against its own existing data points. This is done by taking in a number of variables, such as the FQDN of the request, its frequency, the machine it comes from, the time of day, and others. Within a few milliseconds, the engine produces a statistical probability as to whether the request should be accepted or denied.
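To make the idea concrete, the following is a minimal sketch of one common Bayesian probabilistic classifier, a categorical naive Bayes with add-one smoothing. The text does not say which specific model the engine uses, so this choice is an assumption, as are the class name, the feature names ("fqdn", "machine", "hour"), the labels, and the tiny training set. Numeric variables such as request frequency would need to be bucketed into categories before they could be used here.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        # feature_counts[label][feature_name][value] -> count
        self.feature_counts = defaultdict(lambda: defaultdict(Counter))
        self.values = defaultdict(set)  # distinct values seen per feature

    def train(self, features: dict, label: str) -> None:
        self.class_counts[label] += 1
        for name, value in features.items():
            self.feature_counts[label][name][value] += 1
            self.values[name].add(value)

    def probabilities(self, features: dict) -> dict:
        """Return P(label | features) for every label seen in training."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            for name, value in features.items():
                count = self.feature_counts[label][name][value]
                # One extra slot covers values never seen in training.
                vocab = len(self.values[name]) + 1
                score += math.log((count + 1) / (n + vocab))
            log_scores[label] = score
        # Normalize in log space to avoid numeric underflow.
        peak = max(log_scores.values())
        scaled = {k: math.exp(v - peak) for k, v in log_scores.items()}
        norm = sum(scaled.values())
        return {k: v / norm for k, v in scaled.items()}


# Hypothetical training set standing in for the pre-provided one.
training = [
    ({"fqdn": "ads.example.net", "machine": "kiosk-1", "hour": "night"}, "deny"),
    ({"fqdn": "ads.example.net", "machine": "laptop-7", "hour": "day"}, "deny"),
    ({"fqdn": "tracker.example.org", "machine": "kiosk-1", "hour": "night"}, "deny"),
    ({"fqdn": "www.example.com", "machine": "laptop-7", "hour": "day"}, "accept"),
    ({"fqdn": "mail.example.com", "machine": "laptop-7", "hour": "day"}, "accept"),
    ({"fqdn": "www.example.com", "machine": "kiosk-1", "hour": "day"}, "accept"),
]
model = NaiveBayes()
for features, label in training:
    model.train(features, label)

# Turn an incoming request into a data point and score it.
request = {"fqdn": "ads.example.net", "machine": "kiosk-1", "hour": "night"}
probs = model.probabilities(request)
decision = max(probs, key=probs.get)
print(decision, probs)
```

The classifier returns a probability for each class, and the final accept or deny action is taken from the most probable one. A deployment could instead require the deny probability to exceed a chosen threshold before blocking.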