Answer questions for ex3 part 33

Tobias Eidelpes 2021-05-21 14:37:00 +02:00
parent f75ef9c63c
commit 2c4f7827b9


@ -37,3 +37,65 @@ median? Can you figure out pros and cons for both measures of central tendency?
The median definitely makes more sense in this case since it has a strong
rejection of outliers. The traffic data is very diverse and spread out, meaning
that the mean would look very different from the median.
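
The effect of outliers on both measures can be illustrated with a minimal sketch. The packet counts below are made up for illustration and are not taken from the traffic data.

```python
from statistics import mean, median

# Hypothetical packet counts per hour; the last value is a traffic spike.
packets_per_hour = [120, 130, 125, 135, 128, 9000]

avg = mean(packets_per_hour)
mid = median(packets_per_hour)

# The single spike pulls the mean far away from the typical values,
# while the median stays close to them.
print(f"mean:   {avg:.1f}")
print(f"median: {mid:.1f}")
```
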
## Analyzing a Short Darkspace Period
>>>
Do values in Table A and Table B coincide? If not, why?
>>>
The values mostly coincide, except for the sums. This is to be expected, since
both datasets are from the same timeframe. The standard deviation of bytes is
higher in the daily table, probably because it includes times outside of this
particular month in which a lot of bytes were sent.
>>>
Histograms, but particularly box plots, corresponding to hourly counts might
differ from the equivalent histograms and box plots calculated with daily
averaged data. Do you know why? Can you find an explanation?
>>>
The hourly data is more fine-grained, which results in more striking
differences, especially in the box plots. Averaging over a whole day smooths out
short spikes in traffic, whereas in the hourly counts those spikes are more
pronounced, so it usually takes less data to elongate the whiskers of the box
plots.
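
A minimal sketch of this smoothing effect, using made-up hourly counts for two days rather than the actual dataset:

```python
from statistics import mean

# Hypothetical hourly packet counts for two days (24 values each).
# Day 1 is quiet; day 2 contains a single one-hour spike.
day1 = [100] * 24
day2 = [100] * 23 + [5000]

hourly = day1 + day2
daily = [mean(day1), mean(day2)]

# The spike is fully visible in the hourly counts,
# but is diluted once the 24 hours of day 2 are averaged.
print("max hourly count: ", max(hourly))
print("max daily average:", round(max(daily), 1))
```

In the hourly series the spike is 50 times the typical value, while the daily average of the same day is only about 3 times the typical value.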
>>>
Make sure that you are familiar with the three main protocols appearing in the
`team13_protocol.csv` file. You should know their definition and what they are
used for.
>>>
The different protocols are:
* ICMP (Internet Control Message Protocol) with Identifier 1
* TCP (Transmission Control Protocol) with Identifier 6
* UDP (User Datagram Protocol) with Identifier 17
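
As a small sketch, the three identifiers above can be mapped to names with the constants from Python's standard `socket` module. The helper `protocol_name` is a hypothetical name chosen for illustration. The sketch makes no assumption about the column layout of the CSV file.

```python
import socket

# IP protocol numbers as exposed by Python's socket module.
PROTOCOLS = {
    socket.IPPROTO_ICMP: "ICMP",  # 1
    socket.IPPROTO_TCP: "TCP",    # 6
    socket.IPPROTO_UDP: "UDP",    # 17
}

def protocol_name(number: int) -> str:
    """Return the protocol name for a protocol number, or 'other'."""
    return PROTOCOLS.get(number, "other")
```
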
ICMP is mostly used for error reporting and diagnostics. Devices send ICMP
packets, for example, to check whether a particular host is reachable or to
alert the sending device that a packet was too large to be forwarded without
fragmentation. ICMP can also be abused in DDoS attacks where victims are flooded
with packets or pinged to death.
TCP is the backbone of the internet, as most HTTP(S) traffic is sent over TCP
(HTTP/3, which runs over QUIC and thus UDP, is the exception). It is
connection-oriented: it establishes a session between client and server before
data is exchanged. TCP is well suited for applications that require data to
arrive in order and where dropped packets are not acceptable.
UDP is the opposite of TCP in that it is connectionless: no session is
established between client and server. Due to this property, it lends itself
well to applications such as VoIP, where data has to be sent quickly and
out-of-order or dropped packets matter little.
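
The connectionless nature of UDP can be sketched with two sockets on the loopback interface. The payload and the use of `127.0.0.1` are assumptions made for illustration. A TCP socket (`SOCK_STREAM`) would need `listen()`, `connect()` and `accept()` before any data could be exchanged.

```python
import socket

# Receiver: a UDP socket bound to an ephemeral port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
address = receiver.getsockname()

# Sender: no connect() or handshake is needed before sending a datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"voice frame", address)

# Each recvfrom() call returns one whole datagram, if it arrived at all.
payload, _ = receiver.recvfrom(1024)
print(payload)

sender.close()
receiver.close()
```
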
>>>
Did you get negative values in [rep-19]? Can you figure out why? And why not in
the case of packets?
>>>
The negative values arise because some source IPs appear under several
protocols (ICMP, TCP and UDP) and are therefore counted more than once. The same
goes for the destination IPs. Adding the per-protocol shares together gives a
percentage higher than 100%. Thus, the percentage of IPs _not_ belonging to
these protocols comes out below 0%. Packets, in contrast, cannot belong to
multiple protocols at once: each packet is sent over exactly one protocol.
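
The double counting can be reproduced with a small sketch. The IP addresses are made up, and the sketch assumes that the remaining share is computed as 100% minus the sum of the per-protocol shares. How [rep-19] computes it exactly is not shown here.

```python
# Hypothetical source IPs observed per protocol; 10.0.0.1 appears in all three.
ips_by_protocol = {
    "ICMP": {"10.0.0.1", "10.0.0.2"},
    "TCP": {"10.0.0.1", "10.0.0.3", "10.0.0.4"},
    "UDP": {"10.0.0.1", "10.0.0.4"},
}

# Every distinct IP is counted once in the total ...
unique_ips = set().union(*ips_by_protocol.values())
total = len(unique_ips)

# ... but once per protocol in the individual shares.
shares = {name: 100 * len(ips) / total for name, ips in ips_by_protocol.items()}

# Assumed definition of the remainder: 100% minus all per-protocol shares.
other = 100 - sum(shares.values())

print(shares)
print("other:", other)
```

With four distinct IPs, the shares add up to 175%, so the remainder is -75%. Doing the same with packets cannot go negative, because the per-protocol packet counts never overlap.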