Invited talk
Statistics of Internet traffic volumes
This talk is based around the Transactions on Networking paper. We use 232 traffic traces to establish that for "mid-large" Internet links (backbone links or ingress/egress links from reasonably sized institutions) the traffic is well-modelled by a log-normal distribution.
The associated paper is here:
https://arxiv.org/abs/2007.10150
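As a rough illustration of the kind of fit involved (not the methodology or data of the paper; the byte counts below are synthetic stand-ins), a minimal sketch in Python:

    # Minimal sketch: fit a log-normal to per-interval traffic volumes and test the fit.
    # The "volumes" here are synthetic stand-ins for real per-second byte counts on a link.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    volumes = rng.lognormal(mean=15.0, sigma=0.8, size=3600)  # synthetic bytes-per-second samples

    log_v = np.log(volumes)
    mu, sigma = log_v.mean(), log_v.std(ddof=1)  # fit a normal distribution to the log-volumes

    # Kolmogorov-Smirnov test of the log-volumes against the fitted normal.
    ks_stat, p_value = stats.kstest(log_v, "norm", args=(mu, sigma))
    print(f"fitted mu={mu:.2f} sigma={sigma:.2f}  KS stat={ks_stat:.3f}  p={p_value:.3f}")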
Raphtory: A new tool for large temporal networks applied to the far right social network Gab
Mixed and time varying models for evolving complex networks
This talk is a presentation of the FETA framework and new work with Naomi Arnold on time-varying models.
TCP in the wild
This talk is essentially the same as the one delivered in Cambridge two months earlier (alas, no progress on this research in that period).
The research is based on two papers:
A longitudinal analysis of Internet rate limitations -- http://www.richardclegg.org/tcp_rate_infocom_2014
and
On the relationship between fundamental measurements in TCP flows -- http://www.richardclegg.org/tcp_limitations_icc_2013
The essential findings are that TCP is not working as we expect. The expected correlation between throughput and packet loss is not found. The correlation with delay (RTT) is as expected: throughput is proportional to 1/RTT. A high correlation with flow length is found: longer flows have higher throughput. However, this may be a sampling artefact due to the restricted length of the samples used.
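A minimal sketch of this kind of per-flow correlation check, using invented flow records rather than the traces analysed in the papers:

    # Hypothetical sketch of the per-flow correlation analysis described above.
    # Field names and data are invented; real flow records would come from passive traces.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 5000
    rtt = rng.lognormal(3.0, 0.5, n)       # round-trip time in ms (synthetic)
    loss = rng.uniform(0.0, 0.05, n)       # packet loss probability (synthetic)
    length = rng.lognormal(5.0, 1.5, n)    # flow length in packets (synthetic)
    # Synthetic throughput built to follow the reported relationships: ~1/RTT and ~sqrt(length).
    throughput = np.sqrt(length) / rtt * np.exp(rng.normal(0.0, 0.3, n))

    for name, x in [("RTT", rtt), ("loss", loss), ("flow length", length)]:
        rho, p = stats.spearmanr(x, throughput)
        print(f"throughput vs {name}: Spearman rho={rho:.2f} (p={p:.1e})")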
The TCP flows studied are broken down by assumed cause in the cases where TCP's own mechanisms are thought not to be the primary limit on throughput:
1) Application limited -- an application decides to reduce its own flow by deliberately not sending data.
2) Host Window limited -- one or other host has a low maximum window size that restricts flow.
3) Advertised window limitation -- a middlebox or the receiver manipulates the advertised window size to reduce the flow.
More than half of TCP flows (and more than 80% of long flows) are limited by these mechanisms and not by traditional TCP mechanisms; a rough sketch of such a classification is given below.
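Purely as an illustration of the classification idea, here is a sketch with invented per-flow fields and thresholds; it is not the heuristic actually used in the papers.

    # Hypothetical sketch of classifying flows by their assumed rate limitation.
    # The per-flow fields and thresholds are invented for illustration; the papers
    # derive their classification from packet-level trace analysis.
    from dataclasses import dataclass

    @dataclass
    class Flow:
        idle_fraction: float          # fraction of time the sender had no data queued
        max_window_bytes: int         # largest window ever used by either host
        rwnd_limited_fraction: float  # fraction of time limited by the advertised window

    def classify(flow: Flow) -> str:
        if flow.idle_fraction > 0.5:
            return "application limited"
        if flow.max_window_bytes <= 64 * 1024:
            return "host window limited"
        if flow.rwnd_limited_fraction > 0.5:
            return "advertised window limited"
        return "TCP (loss/congestion) limited"

    print(classify(Flow(idle_fraction=0.7, max_window_bytes=256 * 1024, rwnd_limited_fraction=0.1)))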
TCP in the Wild
This talk is an updated version of the talk given at QMUL. The difference is two slides at the end which provide insight into the sampling issues related to the data.
The key message of this work is that TCP/IP does not behave in the real world as it is generally taught. The textbook picture of a connection in which one side sends data as fast as possible, controlled by loss, in order to fill the pipe is not what happens in the real world.
This work joins the two papers
A longitudinal analysis of Internet rate limitations (INFOCOM 2014)
and
On the relationship between fundamental measurements in TCP flows (ICC 2013)
The talk analyses passive traces with the aim of explaining the root causes of the bandwidth achieved on a connection. Theoretical results show that, in equilibrium, an unconstrained TCP flow has a bandwidth proportional to 1/RTT and 1/sqrt(p), where p is the probability of packet loss. The experimental results here show something different, however. While the relationship with RTT is upheld, the relationship with loss is not found. A strong relationship with the length of the flow is found instead: longer flows have higher throughput, in proportion to sqrt(L), where L is the length of the flow in packets.
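For reference, the equilibrium relation mentioned above is often written as the Mathis formula, throughput ~ (MSS/RTT) * sqrt(3/2)/sqrt(p); a small sketch with purely illustrative values:

    # Sketch of the equilibrium ("Mathis") throughput relation: (MSS / RTT) * sqrt(3/2) / sqrt(p).
    # The example values below are illustrative only, not taken from the papers.
    import math

    def mathis_throughput(mss_bytes: float, rtt_s: float, loss_prob: float) -> float:
        """Expected equilibrium TCP throughput in bytes per second."""
        return (mss_bytes / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_prob)

    # Example: 1460-byte segments, 50 ms RTT, 1% loss.
    print(f"{mathis_throughput(1460, 0.05, 0.01) / 1e6:.2f} MB/s")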
A follow-up analysis looks at the causes of throughput limitation. It is found that less than half of flows are governed by loss. Flow bandwidth is very often governed by the application -- for example, YouTube deliberately throttles traffic so that users do not download too far ahead. Some flows are governed by operating system restrictions which do not scale window sizes. Some flows are governed by middleboxes which manipulate the window size. It is these restrictions which, on the network studied, are the primary mechanism restricting bandwidth on connections.
Talk to Streaming Analytics, Applications and Theory
This invited talk to the SAAT conference at Bournemouth briefly describes how the FETA model for graph evolution might be used in a streaming environment.
OpenFlow in the Access -- pushing OpenFlow to the last mile
Presentation to the International Workshop for Trends in Future Communications, Campinas, Brazil: http://futurenets.cpqd.com.br/
This presentation covers UCL's work on software-defined networking (specifically OpenFlow) and, in particular, adding OpenFlow capabilities to GEPON (Gigabit Ethernet Passive Optical Network) devices.
Studying TCP in the wild
An updated version of this talk was given at Cambridge and can be seen here.
The key message of this work is that TCP/IP does not behave in the real world as it is generally taught. The textbook picture of a connection in which one side sends data as fast as possible, controlled by loss, in order to fill the pipe is not what happens in the real world.
This work joins the two papers
A longitudinal analysis of Internet rate limitations (INFOCOM 2014)
and
On the relationship between fundamental measurements in TCP flows (ICC 2013)
The talk analyses passive traces with the aim of explaining the root causes of the bandwidth achieved on a connection. Theoretical results show that, in equilibrium, an unconstrained TCP flow has a bandwidth proportional to 1/RTT and 1/sqrt(p), where p is the probability of packet loss. The experimental results here show something different, however. While the relationship with RTT is upheld, the relationship with loss is not found. A strong relationship with the length of the flow is found instead: longer flows have higher throughput, in proportion to sqrt(L), where L is the length of the flow in packets.
A follow-up analysis looks at the causes of throughput limitation. It is found that less than half of flows are governed by loss. Flow bandwidth is very often governed by the application -- for example, YouTube deliberately throttles traffic so that users do not download too far ahead. Some flows are governed by operating system restrictions which do not scale window sizes. Some flows are governed by middleboxes which manipulate the window size. It is these restrictions which, on the network studied, are the primary mechanism restricting bandwidth on connections.
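As a rough illustration of checking the throughput-versus-flow-length relationship (synthetic per-flow data, not the traces from the papers), the exponent can be estimated on a log-log scale:

    # Hypothetical sketch: estimate the exponent alpha in throughput ~ L^alpha on synthetic data.
    # The talks report alpha close to 0.5, i.e. throughput ~ sqrt(L); here the data is built that way.
    import numpy as np

    rng = np.random.default_rng(7)
    length = rng.lognormal(5.0, 1.5, 10000)                            # flow length in packets (synthetic)
    throughput = length ** 0.5 * np.exp(rng.normal(0.0, 0.4, 10000))   # synthetic, built with alpha = 0.5

    # Ordinary least squares on the log-log scale: log(throughput) = alpha * log(L) + const.
    alpha, const = np.polyfit(np.log(length), np.log(throughput), 1)
    print(f"estimated exponent alpha = {alpha:.2f}")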
Likelihood based framework for evolving graphs
This talk is the latest of my talks about FETA, the framework for evolving topology analysis, and uses updated notation. The core of the work is a likelihood-based model which can assess how likely it is that observations of the evolution of a graph arise from a particular probabilistic model, for example the Barabási-Albert preferential attachment model. The analysis is applied to data from Facebook and from Enron as well as to artificial models.
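A rough sketch of the likelihood idea (not the FETA implementation; the networkx-based toy graph and the hypothetical attachment sequence are assumptions for illustration):

    # Rough sketch of the likelihood idea behind FETA: score how well a candidate
    # attachment model explains an observed sequence of new edges.  Toy data only.
    import math
    import networkx as nx

    def log_likelihood_pref_attach(graph: nx.Graph, observed_targets) -> float:
        """Log-likelihood of observed attachment choices under preferential attachment.

        Each observed target is scored with probability degree(target) / sum of degrees
        at the time of the choice; the graph is updated after each observation.
        """
        ll = 0.0
        source = max(graph.nodes) + 1  # hypothetical new node for each observation
        for target in observed_targets:
            total_degree = sum(d for _, d in graph.degree())
            ll += math.log(graph.degree(target) / total_degree)
            graph.add_edge(source, target)
            source += 1
        return ll

    g = nx.barabasi_albert_graph(100, 2, seed=0)
    print(log_likelihood_pref_attach(g.copy(), observed_targets=[0, 1, 5]))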
This talk gives some initial thoughts on how to make a synthetic network model from private contact networks.