Analysis and Insights for Ques. 3
Observations and Explanations:
Numerous objects were downloaded from both
nytimes.com and vox.com. However, vox.com had many more image files to download, so the total size of objects downloaded from
vox.com (around 25 MB) is much larger than that from nytimes.com (2 MB).
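
The per-site totals above can be reproduced from a har file by summing the response sizes per hostname. The following is a minimal sketch, assuming the standard HAR layout (`log.entries[].request.url` and `log.entries[].response.bodySize`); the function name `bytes_per_host` and the sample URLs and sizes are made up for illustration and are not taken from the capture.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def bytes_per_host(har):
    """Return {hostname: total response body bytes} for a parsed HAR dict."""
    totals = defaultdict(int)
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        size = entry["response"].get("bodySize", -1)
        # HAR records -1 when the body size is unknown; count that as 0.
        totals[host] += max(size, 0)
    return dict(totals)


# Illustrative fragment only: paths and sizes are hypothetical.
sample_har = {"log": {"entries": [
    {"request": {"url": "http://cdn0.vox-cdn.com/a.png"},
     "response": {"bodySize": 1200}},
    {"request": {"url": "http://cdn0.vox-cdn.com/b.png"},
     "response": {"bodySize": 800}},
    {"request": {"url": "http://www.vox.com/"},
     "response": {"bodySize": -1}},
]}}

host_totals = bytes_per_host(sample_har)
print(host_totals)
```

With a real capture, `sample_har` would be replaced by `json.load(open(path))` on the exported har file.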
o Almost all the image files were downloaded from the hostnames:
cdn0.vox-cdn.com, cdn1.vox-cdn.com, cdn2.vox-cdn.com &
Out of these four, three of them (0, 2, 3) had a common IP
address. Thus most of the image traffic was sent to a
Analysis of the pcap file shows that all four hostnames are aliases of another domain, namely
“ddrgqsxlcy7wq.cloudfront.net”. These resource records in the DNS are therefore CNAME (canonical name) records.
Although this domain had multiple IP addresses
assigned (possibly to balance the load, since most of the requests
to this domain are for images, which are large
and can slow down the network), the DNS response for
three of the hostnames (cdn0, cdn2, cdn3) returned the same IP (126.96.36.199)
most of the time (it differed for cdn2 and cdn0 in a few cases),
while it was different for cdn1.
Ideally, DNS would have followed a Round
Robin configuration when returning the IP addresses, so that
the load was not skewed between the target
servers. Round Robin can also help with fault tolerance in networked systems.
o For cdn1, two different IPs (188.8.131.52 & 184.108.40.206)
were returned. Both carried an equal load: 3 objects were
downloaded from one and the remaining 3 from the other. This is an
example of load balancing using a Round Robin
configuration in the DNS response.
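
The even 3/3 split seen for cdn1 is what a Round Robin rotation produces. The sketch below only illustrates the idea and is not how the actual DNS server is implemented; the function name `round_robin_assign`, the object names, and the placeholder addresses `ip_a` and `ip_b` are assumptions made for the example.

```python
from collections import Counter
from itertools import cycle


def round_robin_assign(requests, addresses):
    """Pair each request with the next address in a fixed rotation."""
    rotation = cycle(addresses)
    return [(request, next(rotation)) for request in requests]


# Six hypothetical objects spread over two placeholder addresses.
objects = [f"object{i}" for i in range(6)]
addresses = ["ip_a", "ip_b"]

assignment = round_robin_assign(objects, addresses)
load = Counter(ip for _, ip in assignment)
print(dict(load))
```

Each address ends up serving the same number of objects whenever the request count is a multiple of the number of addresses.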
o Note: The table has multiple entries for a domain when
multiple IPs are returned for it (as for cdn1). This is done so that it is possible to see which domains have been assigned multiple IPs, and how the
objects downloaded from these domains are distributed across them.
In the screenshot above, cdn2 has a different IP
(220.127.116.11) in one case, and cdn0 also has a different IP
(18.104.22.168) for three different objects.
o Because we parse the har file by hostname and the pcap
by IP, all three of cdn0, cdn2 and cdn3 show the same TCP
connections in the table, as they share the same IP.
o A better way to read the table is to consider only the connections with non-zero download size for each of these three domains.
This filter has not been implemented, to avoid irregularity in the table. It is effectively applied while building the download tree,
so it causes no problems there.
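
The effect described in the last two points can be sketched as follows: joining har data (keyed by hostname) with pcap data (keyed by IP) repeats every connection for every hostname that shares the IP, and keeping only rows with a non-zero download size removes the repeats. This is a sketch under stated assumptions, not the code used for the report; the function names, the placeholder `SHARED_IP`, the port numbers and the byte counts are all hypothetical.

```python
def build_table(host_ips, connections, downloaded):
    """Join hostnames to TCP connections through their resolved IP.

    host_ips:    {hostname: ip} from the DNS responses
    connections: list of (ip, src_port) seen in the pcap
    downloaded:  {(hostname, src_port): bytes} from matching har to pcap
    """
    rows = []
    for host, ip in host_ips.items():
        for conn_ip, src_port in connections:
            if conn_ip == ip:
                size = downloaded.get((host, src_port), 0)
                rows.append((host, ip, src_port, size))
    return rows


def nonzero_rows(rows):
    """Keep only the connections that actually carried data for the host."""
    return [row for row in rows if row[3] > 0]


SHARED_IP = "SHARED_IP"  # placeholder for the common address
host_ips = {"cdn0.vox-cdn.com": SHARED_IP,
            "cdn2.vox-cdn.com": SHARED_IP,
            "cdn3.vox-cdn.com": SHARED_IP}
connections = [(SHARED_IP, 50001), (SHARED_IP, 50002), (SHARED_IP, 50003)]
downloaded = {("cdn0.vox-cdn.com", 50001): 4000,
              ("cdn2.vox-cdn.com", 50002): 2500,
              ("cdn3.vox-cdn.com", 50003): 1000}

table = build_table(host_ips, connections, downloaded)
filtered = nonzero_rows(table)
print(len(table), len(filtered))
```

In this toy input the unfiltered table lists every connection under all three hostnames, while the filtered one keeps a single row per hostname.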
o As expected, the connections are exhaustive and no two
domains share the same (src. port, dst. port) tuple.
o Any inconsistency in the table is mostly due to mismatches between the har file and the pcap file.
E.g. 1:- The total size of objects downloaded according to the
Wireshark data is smaller for ping.chartbeat.com. The har file
recorded 4 objects while the pcap has data for only one. All four are different objects.
E.g. 2:- The opposite is also observed, i.e. there are cases
where Wireshark captures an HTTP GET request but there
is no separate entry in the har file corresponding to that request. The object only shows up among the objects referred to by another
entity, so Firebug may have missed something there.
This happens for cdn0 (/community_logos/52517/voxv.png),
and thus the total number of objects downloaded according to
the pcap (15, the sum of all non-zero size objects across all IPs; cdn0 has two, as mentioned in the first screenshot) and the har
(13) are different.
Strangely, this same object is requested
twice over two different TCP connections according to the pcap
file, but there is not a single corresponding request in the har file.
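
Mismatches of the kind described in E.g. 1 and E.g. 2 can be located by comparing the requests recorded in the har file against the GET requests seen in the pcap as multisets. A minimal sketch, assuming both sources have already been reduced to lists of request paths; the function name `compare_requests` and the paths under `/img/` are hypothetical, while the voxv.png path is the one mentioned above.

```python
from collections import Counter


def compare_requests(har_paths, pcap_paths):
    """Return (only_in_har, only_in_pcap) as Counters of request paths.

    Counter subtraction keeps positive counts only, so a path requested
    twice in one source and never in the other is reported with count 2.
    """
    har_counts = Counter(har_paths)
    pcap_counts = Counter(pcap_paths)
    return har_counts - pcap_counts, pcap_counts - har_counts


har_paths = ["/img/a.png", "/img/b.png"]
pcap_paths = ["/img/a.png",
              "/community_logos/52517/voxv.png",
              "/community_logos/52517/voxv.png"]

only_in_har, only_in_pcap = compare_requests(har_paths, pcap_paths)
print(dict(only_in_har))
print(dict(only_in_pcap))
```

Using counts rather than plain sets matters here because the same object can be requested more than once over different TCP connections.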
E.g. 3:- There are cases where the har dump has HTTPS packets
but there are no corresponding application-layer packets in
the Wireshark dump. One such instance is that of
youtube.com. The har file has an entry with URL both...