diff --git a/ex3/README.md b/ex3/README.md
index 82f5bd0..a4b248b 100644
--- a/ex3/README.md
+++ b/ex3/README.md
@@ -37,3 +37,65 @@ median? Can you figure out pros and cons for both measures of central tendency?
 The median definitely makes more sense in this case since it has a strong
 rejection of outliers. The traffic data is very diverse and spread out, meaning
 that the mean would look very different from the median.
+
+## Analyzing a Short Darkspace Period
+
+>>>
+Do values in Table A and Table B coincide? If not, why?
+>>>
+
+The values mostly coincide, except for the sums, of course. This is to be
+expected since both datasets cover the same timeframe. The standard deviation
+of bytes is higher in the daily table, probably because there are times outside
+of this particular month where a lot of bytes were sent, which increases the
+spread.
+
+>>>
+Histograms, but particularly box plots, corresponding to hourly counts might
+differ from the equivalent histograms and box plots calculated with daily
+averaged data. Do you know why? Can you find an explanation?
+>>>
+
+The hourly data is more fine-grained, which results in more striking
+differences, especially in the box plots. Daily averaging smooths out short
+spikes in traffic, whereas in hourly counts a few spikes are enough to elongate
+the whiskers or to show up as outliers.
+
+>>>
+Make sure that you are familiar with the three main protocols appearing in the
+`team13_protocol.csv` file. You should know their definition and what they are
+used for.
+>>>
+
+The three protocols are:
+
+* ICMP (Internet Control Message Protocol) with IP protocol number 1
+* TCP (Transmission Control Protocol) with IP protocol number 6
+* UDP (User Datagram Protocol) with IP protocol number 17
+
+ICMP is mostly used for error reporting and diagnostics. Devices send ICMP
+packets, for example, to check whether a particular host is reachable or to
+alert the sender that a packet was too large to be forwarded. ICMP can also be
+abused in (D)DoS attacks such as ping floods or the ping of death.
+
+TCP is the backbone of much of the internet: HTTP(S) up to HTTP/2 is sent over
+TCP. It is connection-oriented, as it establishes a session between client and
+server. TCP is well suited for applications that require data to arrive in
+order and cannot tolerate dropped packets.
+
+UDP, in contrast to TCP, is connectionless: no session is established between
+client and server. Due to this property, it lends itself well to applications
+such as VoIP, where data has to be sent quickly and we do not care much about
+out-of-order or dropped packets.
+
+>>>
+Did you get negative values in [rep-19]? Can you figure out why? And why not in
+the case of packets?
+>>>
+
+The negative values come from the fact that some source IPs appear under
+multiple protocols (ICMP, TCP and UDP). The same goes for the destination IPs.
+Adding the per-protocol shares together therefore gives more than 100%, so the
+percentage of IPs _not_ belonging to these protocols comes out below 0%. This
+cannot happen with packets: each packet belongs to exactly one protocol, so the
+packet shares of ICMP, TCP and UDP can never add up to more than 100%.
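
The double counting of IPs can be reproduced with a small toy example. The following is a minimal sketch, not the code used for [rep-19]: the IP addresses, the `packets` list and the helper `other_share` are made up for illustration, and it assumes that the "other" percentage is computed as 100% minus the ICMP, TCP and UDP shares.

```python
# Hypothetical toy data: one (source IP, protocol) pair per observed packet.
# GRE stands in for "any protocol other than ICMP, TCP and UDP".
packets = [
    ("10.0.0.1", "ICMP"),
    ("10.0.0.1", "TCP"),
    ("10.0.0.1", "UDP"),
    ("10.0.0.2", "TCP"),
    ("10.0.0.2", "UDP"),
    ("10.0.0.3", "TCP"),
    ("10.0.0.4", "GRE"),
    ("10.0.0.4", "GRE"),
]

PROTOCOLS = ("ICMP", "TCP", "UDP")


def other_share(counts, total):
    """Percentage left after subtracting the ICMP, TCP and UDP shares."""
    return 100.0 - sum(100.0 * counts[p] / total for p in PROTOCOLS)


# Unique source IPs per protocol: the same IP is counted once per protocol
# it appears under, so the per-protocol counts can add up to more than the
# number of unique IPs overall.
ips_per_protocol = {
    p: len({ip for ip, proto in packets if proto == p}) for p in PROTOCOLS
}
unique_ips = len({ip for ip, _ in packets})

# Packets per protocol: every packet is counted exactly once.
packets_per_protocol = {
    p: sum(1 for _, proto in packets if proto == p) for p in PROTOCOLS
}

other_ips = other_share(ips_per_protocol, unique_ips)
other_packets = other_share(packets_per_protocol, len(packets))

print(f"other IPs:     {other_ips:.1f}%")      # -50.0%
print(f"other packets: {other_packets:.1f}%")  # 25.0%
```

Here `10.0.0.1` is counted under all three protocols and `10.0.0.2` under two, so the IP shares add up to 150% and the remainder is negative. The packet shares add up to 75%, which leaves a non-negative remainder.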