# Exercise 3

## Analyzing Darkspace Evolution

>>>
Check results from [rep-14] again. Are they correlated? Think for a second about the possible meaning of the analyzed time series being correlated. What could be the reason why the drop in the number of unique IP sources after Jan 16 does not cause a proportional drop in the other signals?
>>>

Most of the signals are either strongly or moderately correlated. One explanation consistent with these correlations is that the drop coincided with someone scanning the network or attacking a large number of hosts. The high correlation of unique destination IPs with the number of packets and the number of bytes sent supports this hypothesis. Since the number of unique source IPs dropped while the other signals did not drop proportionally, it suggests that a single IP address, or very few of them, sent a large volume of traffic to many unique destination IPs.

>>>
Check results from [rep-15] again. Do the results make sense to you? Would you expect a different ratio in a normal network (no darkspace)?
>>>

In a normal network I would expect the ratio to be much closer to one, although still above one. Thinking about my traffic at home, most requests have an associated response, which keeps the ratio near one. The ratio is easily skewed, for example by a horizontal scan of the network.

>>>
You used the median in [rep-15], but you could have used the mean. Does it make any difference? What's better in your opinion? When to use mean and when median? Can you figure out pros and cons for both measures of central tendency?
>>>

The median makes more sense in this case because it is robust to outliers. The traffic data is very diverse and spread out, so a few extreme values would pull the mean far away from the median. The mean is the better choice when the data is roughly symmetric and free of extreme values, since it takes every observation into account.
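
The outlier sensitivity described in the last answer can be illustrated with a minimal sketch. The packet counts below are hypothetical values chosen for illustration, not data from [rep-15], and the helper name `summarize` is an assumption of this example. It compares the mean and the median with and without one very heavy source, such as a scanner.

```python
from statistics import mean, median

# Hypothetical packets-per-source counts (illustrative only, not exercise data):
# most sources send a handful of packets, one scanner sends a huge burst.
packets_per_source = [2, 3, 3, 4, 5, 5, 6, 100_000]


def summarize(values):
    """Return (mean, median) for a sequence of numbers."""
    return mean(values), median(values)


# Statistics including the heavy source.
avg, med = summarize(packets_per_source)

# Statistics after dropping the heavy source (the last element).
avg_clean, med_clean = summarize(packets_per_source[:-1])

print(f"with outlier:    mean={avg}, median={med}")
print(f"without outlier: mean={avg_clean}, median={med_clean}")
```

In this toy data set a single source moves the mean by several orders of magnitude, while the median shifts only slightly. This is the behaviour that makes the median the safer summary for heavy-tailed traffic data.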