Sort by keyRSS ICONSort by date

as112

2006-05-24 03:56:54 -0700

One of two known extant global anycasting projects. It's good to keep track of this stuff because anycasting throws off topology measurements.

@misc{as112,
    rating = {1},
    author = {AS112},
    title = {{AS112 Project Homepage}},
    keywords = {anycast, bogon, bgp, area exam},
    howpublished = {}
}

Gong:1997

2006-04-26 21:08:57 -0700

Woo! An acknowledgement of the layered nature of the internet and the fact that this layering may wierd any game theory applied to it. They talk about how telcos and ISPs and individual may compete, and how we need a triumvirate of modeling, measurement, and theory to drive any actual conclusions.

They also mention that the length of time that a contracts commits a company is considered a curcial element when pricing bandwidth. They end by mentioning three promising areas: * Empirical study of how contract length affects bandwidth prices * Game theory of models that take into account contract duration and sunk costs * experimental approaches (Plott, Sugiyama, and Elbaz. 1994. "Economics of scale, natural monopoly, and imperfect competition in an experimental market". Southern Economic Journal (October): 261-287)

When competing at the bottom layer, you get lots of sunk costs, so you want long term contracts (10 years and the like) and you are willing to reduce prices to get them.

When ISPs compete, they want to lock customers in (1-2 years) to ensure a steady revenue stream so that they can continue paying off their long term telco contracts.

Endpoint businesses can aggregate their streams to act as resellers these relationships are marriages of convenience and tend to be ephemeral.

From a perspective of policy, you don't necessarily want competition at every layer. Telcos going out of business and coming into business is a big deal and there are lots of barriers to entry. There is an especially prescient quote from (Noam 1994 - Beyond liberalization II. The impending doom of common carriage; Telecommunications Policy, 435-452): "The long term result might be a gradual disinvestment in networks, the re-establishment of monopoly, or price cartels and oligopolistic pricing". This sounds a lot like our current situation. Must read more by Noam.

@inbook{Gong:1997,
    chapter = {The Economics of Layered Networks},
    publisher = {MIT Press},
    rating = {4},
    author = {Jiong Gong and Padmanabhan Srinagesh},
    title = {Internet Economics},
    year = {1997},
    keywords = {internet, economics, area exam},
    pages = {63-75}
}

Aiello:2000

2006-04-25 18:14:42 -0700

A paper on the evolution of random graphs. It attempts to survey a bunch of naturally occurring power law graphs and develop some kind of model for power law graphs which allows them, with the most minimal number of tweaks, to generate graphs that look like the web, a call graph, or the AS graph.

This looks a lot like Anand's(sp?) reasearch. Not yet done reading. This is a tough one.

Abstract:

We propose a random graph model which is a special case of sparse random graphs with given degree sequences. This model involves only a small number of parameters, called logsize and log-log growth rate. These parameters capture some universal characteristics of massive graphs. Furthermore, from these parameters, various properties of the graph can be derived. For example, for certain ranges of the parameters, we will compute the expected distribution of the sizes of the connected components which almost surely occur with high probability. We will illustrate the consistency of our model with the behavior of some massive graphs derived from data in telecommunications. We will also discuss the threshold function, the giant component, and the evolution of random graphs in this model.

@inproceedings{Aiello:2000,
    author = {William Aiello and Fan Chung and Linyuan Lu},
    title = {A Random Graph Model for Massive Graphs},
    year = {2000},
    keywords = {graph theory, area exam, power law},
    booktitle = {STOC}
}

Achlioptas:2005

2006-04-25 18:04:06 -0700

Traceroute sampling creates a shortest path tree in a process that the authors model as a BFS search. I'm not sure that their model is correct - this paper is dense and requires a few more readings - but if it is, then they do an excellent job of figuring out what the traceroute actually captures.

Their equations are huge and hairy, however, and I feel like I'm missing a bunch of the background.

Abstract:

Understanding the structure of the Internet graph is a crucial step for building accurate network models and designing efficient algorithms for Internet applications. Yet, obtaining its graph structure is a surprisingly difficult task, as edges cannot be explicitly queried. Instead, empirical studies rely on traceroutes to build what are essentially single-source, all-destinations, shortest-path trees. These trees only sample a fraction of the network's edges, and a recent paper by Lakhina et al. found empirically that the resulting sample is intrinsically biased. For instance, the observed degree distribution under traceroute sampling exhibits a power law even when the underlying degree distribution is Poisson. In this paper, we study the bias of traceroute sampling systematically, and, for every general class of underlying degree distributions, calculate the likely observed distributions explicitly. To do this, we use a continuous-time realization of the process of exposing the BFS tree of a random graph with a given degree distribution, calculate the expected degree distribution of the tree, and show that it is sharply concentrated. As example applications of our machinery, we show how traceroute sampling finds power-law degree distributions in both epsilon-regular and Poisson-distributed random graphs. Thus, our work puts the observations of Lakhina et al. on a rigorous footing, and extends them to nearly arbitrary degree distributions.

@inproceedings{Achlioptas:2005,
    rating = {4},
    author = {Dimitris Achlioptas and Aaron Clauset and David Kempe and Cristopher Moore},
    title = {On the Bias of Traceroute Sampling; or, Power-law Degree Distributions in Regular Graphs},
    month = {May},
    url = {http://arxiv.org/PS_cache/cond-mat/pdf/0503/0503087.pdf},
    year = {2005},
    keywords = {networks, graph, traceroute, modeling, measurement, area exam},
    booktitle = {Symposium on Theory of Computing}
}

Aharoni:2000

2006-03-10 10:45:01 -0800

How do students think about data structures, and how can we get to think about them correctly (whatever that means) and effectively?

In math people abstract for increased power to the detriment of understandability. When people get confused about an abstract concept, they are given a concrete example.

In computer science we abstract for increased clarity. When people are confused, they are given metaphors and perhaps examples of areas of usage, but never actual code. Code is too confusing.

This article is very snarky about 'programming language thinking'. They get cranky when people mention the idea of memory or pointers, but it's not at all clear that the mention of these things indicates incorrect thinking or even insufficient abstraction of the appropriate ideas. But maybe it does. At the very least, they should be able to think at a language-free level, and to give them that ability, we have to create situations where they are forced to do so.

Using the examples of students thinking about arrays vs thinking about trees and lists, they conclude that the reason students can think dynamically about trees and lists, but don't see arrays as extendable is because they all use C and Java. Since arrays are built in to the language as nonextendable, they see the concepts of array ans a nonextendable thing (issues about realloc() aside)

To remedy this, they propose teaching students data structures by, in part, getting them to program in an environment with data structures that have an opaque implementation. This will (hopefully) force them to think about data structures at the appropriate level of abstraction. Then, later, they can be asked to program in languages where the implementation is known.

This ends up meaning that data structures courses should have two phases - first, solve problems using data structures without ever knowing their implementatoin, and second learn how these object that you have been using could be implemented and when you would want a given implementation. This jives well with a top-down approach to algorithms and data structures, where you present an algorithm as a set of steps involving smaller subtasks that are themselves unknown, and then you solve the subtasks. By approaching problems and data structures and everything else in this way, we force students to think high level because they don't know enough about the domain to "cheat" and start thinking about some low level implementation.

Abstract:

A research that has just recently been finished, investigated thinking processes that occur in the minds of students dealing with data structures. The research findings are pointed out in this paper, and two of them are elaborated. One is the phenomenon of programming-context thinking. This type of thinking stems from comparatively low level of abstraction gained by students in a d~a structures course. Programming-context thinking is the eause of other phenomena found in the research, and one such phenomenon- perception of a data structure as static or dynamic -- is also elaborated. Implications for data structures instruction are discussed.

Apart from presenting the research results, this paper serves as an example of cognitive research -- a kind of research that is still not broadly enough done in Computer Science Education. It is one purpose of this paper to manifest the need for more such research.

@inproceedings{Aharoni:2000,
    publisher = {ACM Press},
    doi = {http://doi.acm.org/10.1145/330908.331804},
    isbn = {1-58113-213-1},
    author = {Dan Aharoni},
    rating = {4},
    title = {Cogito, Ergo sum! cognitive processes of students dealing with data structures},
    year = {2000},
    pages = {26--30},
    location = {Austin, Texas, United States},
    address = {New York, NY, USA},
    keywords = {education, modeling, data structures},
    booktitle = {SIGCSE '00: Proceedings of the thirty-first SIGCSE technical symposium on Computer science education}
}

MacKie-Mason:1997

2006-02-11 18:36:04 -0800

An explanation of what the (1997) internet is for economists. They talk about the various connection technologies, explain circuit switching vs packet switching, and basically present the whole internet as a relatively done deal. Good overview with enough technical detail and a nice emphasis on the layering approach to services that the internet takes.

They do actually mention the fact that congestion in the internet isn't a problem until it is a huge problem - there is no gradual packet loss (modulo RED, but I don't think that was around then), instead you see a network that is just fine dramatically switch into a network with very high latency and alarming packet loss rates. This congestion can then be almost completely eliminated with just a small amount of increased provisioning.

@inbook{MacKie-Mason:1997,
    chapter = {Economic FAQs About the Internet},
    publisher = {MIT Press},
    rating = {3},
    author = {Jeffrey K. MacKie-Mason and Hal R. Varian},
    title = {Internet Economics},
    year = {1997},
    keywords = {internet, economics, FAQ},
    pages = {27-62}
}

McKnight:1997

2006-02-11 18:29:57 -0800

A somewhat dated introduction to the economics of interconnected networks. They talk a lot about how flat rate pricing is eventually going to give way to some kind of "usage-sensitive" pricing, although perhaps not linear. They seem still slightly trapped in a telco model, and I think they underestimate the cost of tracking all of these transactions.

Still, it remains a nice overview. It's aimed more at economists, but in doing so it manages to give a nice history of who paid for the development of the internet. They coverall of the charge models for internet access, and are pretty thorough. But it seems like technology has moved much faster than they anticipated. They talk about data rates of 56kbps on ISDN lines. And in that model, it makes sense to look forward and see a future of scarce bandwidth even in the core.

Moore's Law seems to have taken care of that concern, however. Basically, they are a little too dismissive of the "just overprovision by a bunch and stop worrying about it" model.

This book is also notably pre-boom in that it contains a lot of wondering how commerce on the internet might occur.

They also taalk a lot about how multicast and RSVP and digital cash have a few issues that need to be worked out, but they anticipate that these hiccups will be overcome and that the technologies will be widespread very soon. :-)

@inbook{McKnight:1997,
    chapter = {An Introduction to internet Economics},
    publisher = {MIT Press},
    rating = {3},
    author = {Lee W. McKnight and JosephP. Bailey},
    title = {Internet Economics},
    year = {1997},
    keywords = {internet, economics, overview},
    pages = {3 - 24}
}

Greenberg:2005

2006-02-11 13:14:28 -0800

This paper generated enough controversy in review that they actually had a public review printed on the page before the paper. It talks about a new design space for debugging and monitoring networks - the "4D Approach". Those D's are: decision, dissemination, discovery, and data.

This paper is very theoretical for how applied its proposals are. And they use buzzwords like enterprize. On the other hand, a new kind of command and control infrastructure for IP networks would be nice - IGP/OSPF/ISIS/eBGP/iBGP/MPLS/whatever is a hugeass mess. It's possible, however, that the very messiness is what is keeping the internet an anarchic freeforall instead of an ITU/PTTlike locked down platform.

A lot of their objections to the current system and the problems they raise seem to be related to the fact that routers and switches use IP to set up their IP routing, and almost every intuitive technique uses some form of sideband communication. They also relate an example all about the enforcement of routing policy at the IP layer.

I'm getting a strong "magic dust" feel from their proposals. They seem to be articulating some form of language that expresses abstract networking goals that will then automagically deploy and properly dynamically maintain all required filters and links. They also talk about things like "a network wide view" which flies in the face of all my intuitions of the lack of simultanaity in distributed systems.

The intentions are good, but I can't bring myself to drink even a little bit of their koolaid. This whole approach smacks of "build a second internet to maintain this one" and "add a second layer of indirection". Also, their talk of a language to express network design goals seems a bit odd. Our network design goals are limited by the available protocols - which means that, depending on protocol issues and language design, you will either be able to express impossible goals or not be able to express possible goals. Or the language will be exactly equivalent to router configs and packet filters, which sort of seems like a drasticly narrowed vision of what they are expressing.

@article{Greenberg:2005,
    volume = {35},
    publisher = {ACM Press},
    doi = {http://doi.acm.org/10.1145/1096536.1096541},
    rating = {2},
    author = {Albert Greenberg and Gisli Hjalmtysson and David A. Maltz and Andy Myers and Jennifer Rexford and Geoffrey Xie and Hong Yan and Jibin Zhan and Hui Zhang},
    journal = {SIGCOMM Comput. Commun. Rev.},
    issn = {0146-4833},
    year = {2005},
    number = {5},
    address = {New York, NY, USA},
    keywords = {networks, management, opinion},
    title = {A clean slate 4D approach to network control and management},
    pages = {41--54}
}

rfc4264

2006-01-17 19:45:00 -0800

It's possible for multiple BGP speakers to get into a situation where, no matter how long you wait, the routing never settles down. This RFC details the theory and practice of how that can happen. It's also a major argument for more programming languages/type safety people to be involved in protocol design.

@misc{rfc4264,
    author = {T. Griffin and G. Huston},
    title = {{BGP Wedgies}},
    month = {November},
    keywords = {rfc, bgp},
    year = {2005},
    howpublished = {Internet Engineering Task Force: RFC 4264}
}

Abarbanel:2004

2006-01-17 19:37:41 -0800

A presentation describing Connexion by Boieing - a nice hack on the BGP infrastructure that allows planes to transition from continent to continent pretty easily, but is almost indistinguishable from a poorly run network that is also being subject to attacks.

Also, interestingly, it allows you to track international flights by watching prefixes migrate from AS to AS.

@conference{Abarbanel:2004,
    rating = {3},
    author = {Ben Abarbanel},
    booktitle = {NANOG 31},
    title = {Global Network Mobility},
    month = {May},
    year = {2004},
    keywords = {presentation, NANOG}
}

Aaronson:2005

2006-01-17 19:31:27 -0800

A physicist had recently claimed that he had evidence that NP was in P because soap bubbles naturally set up a minimal steiner tree between sticks places on two opposing pains of glass dipped in soapy water. This drove Scott Aaronsen to make a physical steiner tree instance that was *actually hard* (that, incidentally didn't get solved quickly) and to muse on the wierder aspects of the intersection of computational complexity and physics.

@article{Aaronson:2005,
    volume = {36},
    publisher = {ACM Press},
    doi = {http://doi.acm.org/10.1145/1052796.1052804},
    rating = {2},
    author = {Scott Aaronson},
    journal = {SIGACT News},
    issn = {0163-5700},
    year = {2005},
    number = {1},
    address = {New York, NY, USA},
    keywords = {complexity, NP-Complete},
    title = {Guest Column: NP-complete problems and physical reality},
    pages = {30--52}
}

Aaronson:2003

2006-01-17 19:12:15 -0800

A crash course in formal logic and the notion of independence for the complexity theorist. A nice read where Scott Aaronsen looks at the history of the idea of independence, what it would mean for NP Completeness to be independent of the ZFC axioms, and how likely that possibility is.

Abstract:

This is a survey about the title question, written for people who (like the author) see logic as forbidding, esoteric, and remote from their usual concerns. Beginning with a crash course on Zermelo-Fraenkel set theory, it discusses oracle independence; natural proofs; independence results of Razborov, Raz, DeMillo-Lipton, Sazanov, and others; and obstacles to proving P vs. NP independent of strong logical theories. It ends with some philosophical musings on when one should expect a mathematical question to have a definite answer.

@article{Aaronson:2003,
    volume = {81},
    rating = {3},
    author = {Scott Aaronson},
    journal = {Bulletin of the EATCS},
    title = {Is P Versus NP Formally Independent?},
    month = {October},
    year = {2003},
    keywords = {complexity, NP-Complete}
}

Gabriel:1996

2005-11-16 16:46:54 -0800

Richard P. Gabriel muses on patterns in software, and it contains a very interesting introduction written by none other than Christopher Alexander.

@book{Gabriel:1996,
    publisher = {Oxford University Press},
    author = {Richard P. Gabriel},
    title = {Patterns of Software},
    year = {1996}
}

Roughgarden:2005

2005-11-16 16:41:23 -0800

A readable adaptation of the author's PhD thesis into a book talking all about how un-uptimal networks can be if we have no central control and allow all agents to make locally optimal (read: envy-free) decisions.

@book{Roughgarden:2005,
    publisher = {MIT Press},
    rating = {5},
    author = {Tim Roughgarden},
    title = {Selfish Routing and the Price of Anarchy},
    year = {2005},
    keywords = {selfish, routing, game, theory, graph}
}

Geerdes:2002

2005-11-16 16:36:18 -0800

This one is a (quite impressive) thesis, which means that it is really long and this is only a cursory overview. It discusses basically all of the issues, costs, and benefits behind making cell phones into call routers as well as simple endpoints.

They discuss using a relaying approach in cellular networks instead having everyone always talk to the base station, and one conclusion is

for medium-sized networks relaying can improve the overall data throughput by 10-15 percent on the average. Under total user bandwidth fairness relaying can improve the data throughput by 20-40 percent on the average. All in all, there is a considerable benefit in using relaying, even in arbitrary networks.
they then go further and discuss when relaying is particularly beneficial. The eventual upshot is that if we could coordinate things centrally, then we could improve everyone by a little, and the worst cases by a lot. This isn't particulalrly suprising or great. What is nice is that he cites evidence that a decentralized approach could also yield great benefits.

This one is interesting to some degree because of its technical content, but probably more as a good model of how to proceed and write about this kind of thing. He has a nice mix of graph theory, statistics, and simulation, and it all comes together pretty well in the end.

@phdthesis{Geerdes:2002,
    school = {Technische Universitat Berlin},
    author = {Hans-Florian Geerdes},
    title = {Capacity Improvements in TDMA-based Cellular Networks by Relaying and Flexible Transmission Scheduling},
    year = {2002},
    keywords = {networks, graph, theory, wireless, relaying}
}

Diestel:2000

2005-11-16 16:35:34 -0800

A thorough intro to graph theory exponded by Andrzej as the thing to read.

@book{Diestel:2000,
    publisher = {Springer},
    rating = {4},
    author = {Reinhard Diestel},
    title = {Graph Theory},
    url = {http://www.math.uni-hamburg.de/home/diestel/books/graph.theory/},
    year = {2000},
    keywords = {graph, theory}
}

Pierce:1991

2005-11-16 16:31:27 -0800

Just what it says - an intro to category theory for computer scientists. It reads well, and I feel like category theory must be somehow connected to my stuff. It makes my brain tingle, but I have no better explanation.

@book{Pierce:1991,
    publisher = {MIT Press},
    isbn = {0-262-66071-7},
    author = {Benjamin C. Pierce},
    url = {http://portal.acm.org/citation.cfm?id=119358},
    title = {Basic category theory for computer scientists},
    year = {1991},
    address = {Cambridge, MA, USA},
    keywords = {math, category, theory}
}

Hain:2005

2005-11-16 16:23:51 -0800

They figure that we have about 5 years before we run out of IPv4 space to allocate. These numbers are derived by looking at the rateof drain from the RIRs instead of Geoff Huston's work looking at the rate at which new stuff is announced on the net.

@article{Hain:2005,
    volume = {8},
    rating = {3},
    author = {Tony Hain},
    journal = {The Internet Protocol Journal},
    title = {A Pragmatic Report on IPv4 Address Space Consumption},
    number = {3},
    month = {September},
    year = {2005},
    keywords = {ip, ipv4, policy, allocation}
}

Labovitz:1999

2005-11-09 18:00:53 -0800

"Death of the Internet Predicted" was the cry in 1999 as BGP started becoming chatty and kooky looking. So they looked at BGP traffic at a few ASs and tried to find out where the chatty instabilities were coming from. It turns out that most ASs and prefixes are very stable, and that it was a very few relatively unpopular and edge-ish ASs that were generating most of the traffic.

In other words, BGP traffic was flaky because there were some edge ASs that were flaky looking.

Abstract:

This paper examines the network routing messages ex- changed between core Internet backbone routers. Internet routing in- stability, or the rapid fluctuation of network reachability information, is an important problem currently facing the Internet engineering commu- nity. High levels of network instability can lead to packet loss, increased network latency and time to convergence. At the extreme, high levels of routing instability have led to the loss of internal connectivity in wide- area, national networks. In an earlier study of inter-domain routing, we described widespread, significant pathological behaviors in the routing information exchanged between backbone service providers at the major U.S. public Internet exchange points. These pathologies included several orders of magnitude more routing updates in the Internet core than an- ticipated, large numbers of duplicate routing messages, and unexpected frequency components between routing instability events. The work de- scribed in this paper extends our earlier analysis by identifying the origins of several of these observed pathological Internet routing behaviors. We show that as a result of specific router vendor software changes suggested by our earlier analysis, the volume of Internet routing updates has de- creased by an order of magnitude. We also describe additional router software changes that can decrease the volume of routing updates ex- changed in the Internet core by an additional 30 percent or more. We conclude with a discussion of trends in the evolution of Internet architec- ture and policy that may lead to a rise in Internet routing instability.

@article{Labovitz:1999,
    rating = {4},
    author = {Craig Labovitz and G. Robert Malan and Farnam Jahanian},
    journal = {IEEE Networks?},
    title = {Origins of Internet Routing Instability},
    year = {1999},
    keywords = {networks, instability, routing, bgp}
}

Zhao:2002

2005-11-09 18:00:26 -0800

A protocol enhancement to BGP to separate out valid and invalid MOAS.

They protect agains invalid MOAS by noting that a router can lie to a bunch of ASs, but there will almost always be a preponderance of correct information out there. If we could somehow get everyone together to vote on the 'correct' way of doing things, then we would, with high probability, eliminate the invalid announcements.

In doing this, they advocate allowing one of the final octets in the community attribute of a BGP message to be an announcement that AS x can also announce the prefix. Then, once a conflict is found, you do a DNS lookup for a special record type.

This is a pretty nicely distributed approach - the DNS lookup being the main point of contention - and they found that even with partial deployment they could significantly reduce the number of add routes being accepted. The main intuition is that a hijacked router can inject false routes, but it can't effectively prevent correct routes from being propagated.

Abstract:

Network measurement has shown that a specific IP address prefix may be announced by more than one autonomous system (AS), a phenomenon commonly referred to as Multiple Origin AS, or MOAS. MOAS can be due to either operational need to support multi-homing, or false route announcements due to configuration or implementation errors, or even by intentional attacks. Packets following such bogus routes will be either dropped or, in the case of an intentional attack, delivered to a machine of the attacker's choosing.

This paper presents a protocol enhancement to BGP which enables BGP to detect bogus route announcements from false origins. Rather than imposing cryptography-based authentication and encryption to secure routing message exchanges, our solution makes use of the rich connectivity among ASes that exists in the Internet. Simulation results show that this simple solution can effectively detect false routing announcements even in the presence of multiple compromised routers, become more robust in larger topologies, and can substantially reduce the impact of false routing announcements even with a partial deployment.

@inproceedings{Zhao:2002,
    author = {Xiaoliang Zhao and Dan Pei and Lan Wang and Dan Massey and Allison Mankin and S. Felix Wu and Lixia Zhang},
    title = {Detection of Invalid Routing Announcement in the Internet},
    year = {2002},
    keywords = {bgp, security, moas, detection, protocol, networks, routing, landmark},
    booktitle = {Proceedings of the Internationl Conference on Dependable Systems and Networks}
}

Zhao:2001

2005-11-09 18:00:26 -0800

So, BGP kind of assumes that a prefix is mapped to only one ASN within a network. But, we know from measuring the internet that that assumption does nto always hold.

It is occassionally true that we can see a prefix being advertised as belonging to both ASN x and ASN y - usually these advertisements are relatively restricted, however. So, why to these Multiple Origin AS (MOAS) announcements occur? Are they always malicious? Why would they ever be useful?

These MOAS seem to be long lived - eliminating one catastrophic routing even on April 7, 1998, the average conflict is expected to last around 30 days, with some being seemingly terminal. Most of these conflicts occur in /24s, but there are more /24s out there, and most of the routing table is /24s, so this is kind of obvious.

We can divide these up into 3 cases, by the relation between the advertised paths:

1) if one path is a truncation of the other, then an AS is announcing itself as a terminal AS out of one interface, and as a transit AS out of another interface.

2) If they start the same but end differently, then an AS in the middle is announcing different routes to different peers.

3) The paths are completely distinct.

The distinct paths case is the most common, and is also the hardest to explain away. The main explanation seems to be the idea that the offending prefix is multi-homed and staticly routed.

Surprisingly, they said that NONE of the conflicts was due to anycasting.

Misconfiguration seems to be the primary cause by sheer quantity, because a single fat fingering can cause thousands of MOAS conflicts. Faulty aggregation is another cause.

But they don't really say much beyond 'this is what happens' and 'this is what we think happens'. They theorize about malicious ASs, but it seems unlikely that people were worrying aabout malicious ASs in 1998.

Abstract:

This paper presents a detailed study of BGP Multiple Ori- gin AS (MOAS) conflicts observed in the Internet. A MOAS conflict occurs when a particular prefix appears to originate from more than one AS. We analyzed data from archived BGP routing tables over 1279 days. Most of the conflicts were short-lived, lasting only a small number of days. The potential causes for the MOAS conflicts and impact on BGP fault-tolerance are discussed in detail.

@inproceedings{Zhao:2001,
    author = {Xiaoliang Zhao and Dan Pei and Lan Wang and Dan Massey and Allison Mankin and S. Felix Wu and Lixia Zhang},
    title = {An Analysis of BGP Multiple Origin AS (MOAS) Conflicts},
    url = {citeseer.ist.psu.edu/zhao01analysis.html},
    year = {2001},
    keywords = {bgp, moas, networks, analysis},
    booktitle = {Proceedings of the ACM SIGCOMM Internet Measurement Workshop}
}

Zhou:2004

2005-11-09 18:00:26 -0800

A really short paper - more of an extended abstract or tiny little fact finding.

They showed that preferential attachment isn't quite the right model for the internet, and in particular, that Barabasi's model overestimates the robustness of thr AS graph. Unfortunately, they make the implicit assumption that the AS graph they got from Routeviews is complete, and they don't say how they post-processed it to get a connected graph - presumably the simply made all links bidirectional, but if so that's an unstated assumption.

Abstract:

A comparison between the topological properties of the measured Internet topology, at the autonomous system level (AS graph), and the equivalent graphs generated by two different power law topology generators is presented. Only one of the synthetic generators reproduces the tier connectivity of the AS graph

@article{Zhou:2004,
    volume = {40},
    author = {Shi Zhou and R. J. Mondragon},
    journal = {IEEE Electronic Letters},
    title = {Redundancy and Robustness of the AS-Level Internet Topology and ts Models},
    pages = {151},
    year = {2004},
    keywords = {bgp, routeviews, as graph, power law, completeness}
}

Tsuchiya:1988

2005-11-09 18:00:26 -0800

In this paper the author proposes an idea called "landmark routing" as a technique for routing in very large networks. In particular, it seems like a good technique for routing in very large ad-hoc networks, or for routing in a zero-configuration environment.

In landmark routing a node aims at the landmark that it knows about that is nearest to the desired destination. This is not surprising, and is also exactly how tree-based routing works. But with landmark routing, every node in the network is also a router, and if the message should happen upon some node that knows more about where the destination is, that node is free to route the packet better.

It's a lot like driving from point a to point b by asking directions. At first you know that you should "aim towards the mountains", but as you get closer, the people know better and better directions to where you want to go, and, should you happen upon a shortcut, they'll tell you to take it.

The "mountains" or "the big tree by the pond" in this case are called landmark nodes. There is a heirarchy of landmarks, and all the landmarks on a particular subnet know how to get to each of their sibling, immediate neighbors, and immediate children. All high-level landmarks are automatically mid-level landmarks and low-level landmarks as well.

I really liked the intuitive appeal of this routing scheme, but the paper has a few flaws - in particular, issues of "who becomes the new landmark" are not addressed, and routing table explosion for the nodes high in the landmark heirarchy seems a too-likely possibility. Which is okay, I suppose. I kind of wish these ideas had become more mainstream, but it turns out that one of the major problems in a scheme like this is resource location. We want a URL, how do we figure out what destination address to go to?

@article{Tsuchiya:1988,
    volume = {18},
    publisher = {ACM Press},
    doi = {http://doi.acm.org/10.1145/52325.52329},
    author = {P. F. Tsuchiya},
    journal = {SIGCOMM Comput. Commun. Rev.},
    issn = {0146-4833},
    number = {4},
    year = {1988},
    keywords = {networks, routing, landmark},
    title = {The landmark hierarchy: a new hierarchy for routing in very large networks},
    pages = {35--42}
}

Prechelt:2000

2005-11-09 18:00:25 -0800

A bunch of programming languages are tested by having a large group of people code up solutions in their own favorite languages. Time until completion, execution time, and correctness of all the programs are then judged.

They find that lines of code per hour is constant across languages, and that the differences in speed are greater on an intra-language basis than on an inter-language basis. That is, some people write slow code everywhere, and some people write fast code everywhere. But the scripting languages were a little slower on average, but used no more than half as many lines to get the job done. In my mind, much like in Paul Graham's, programmer time is the most precious resource of all. This study shows that programmer productivity is directly proportional to the number of lines of code a program takes, and that the scripting languages take fewer lines of code. The implications for your own situation are left as an exercise for the reader.

@article{Prechelt:2000,
    author = {Lutz Prechelt},
    journal = {IEEE Computer},
    title = {An Empirical Comparison of Seven Programming Languages},
    month = {October},
    year = {2000},
    keywords = {programming, languages, comparison, speed, syntax, coding}
}

Stoica:2001

2005-11-09 18:00:25 -0800

Every now and then I get discouraged and think that all of the money invested in computer science is a waste except for those funds that are sent to the MIT LCS. This is one of the papers that causes me to think that. Chord is a distributed hash table that has O(lg n) lookups, and each node has to maintain, along with its share of the hash space, O(lg n) state. Not only that, but the main concept behind it is really simple.

In this paper, the authors develop the chord protocol, and then analyze it really thoroughly via theory, simulation, and actual implementation and experimentation. They find that, under certain assumptions of network requirements, Chord is an extremely simple and efficient scheme that has a low overhead and good long-term stability.

This paper was really well written, and was good and clear. I liked it a lot.

I already knew about the Chord protocol before I read this paper due to conversations with many other people about peer to peer networks and distributed hash tables. It seems to me that the academic community and the world at large are asking for very different things from their peer to peer networks, but that might be because I'm not fully involved in the p2p community on either side.

@inproceedings{Stoica:2001,
    author = {Ion Stoica and Robert Morris and David Karger and Frans Kaashoek and Hari Balakrishnan},
    booktitle = {Proceedings of the 2001 ACM SIGCOMM Conference},
    title = {Chord: A Scalable Peer-To-Peer Lookup Service for Internet Applications},
    url = {citeseer.ist.psu.edu/stoica01chord.html},
    year = {2001},
    keywords = {networks, p2p, chord},
    pages = {149--160}
}

Shenker:1995

2005-11-09 18:00:25 -0800

Multimedia and its unique demands force us to think about newer, better networks that could provide the support we would like at the network level instead of at the application level. This paper looks at what that network might be.

Again, I have to take off both my idealism cap and my deployability cap, but once I do that, I can begin to evaluate this paper on its merits (hopefully).

Interestingly enough, aside from the focus on a new model for the internet, the first thing I noticed about this paper was its chatty style. It reads like a pleasant and refined conversation with or seminar by Scott Shenker. It was a pleasant change from some of the more hardcore Theorem-Proof-Lemma-Proof-Corrollary papers I read concurrently with it.

I liked his proposed model for utility, and I also liked that it explained why best-effort scaled so well and did not need admission control, but why multicast and multimedia QOS streams do need admission control. It was pleasantly akin to how we can solve a lot of problems efficiently if you allow that the solutions be floating point numbers, but once you require integer answers, the problems are all NP-Complete. This happens with flow problems, with linear programming, and it's nice to see that this notion of integral cutoff - a flow is either at 100 percent utility or 0 percent - also makes things difficult in other areas.

I also liked how, by drawing different utility curves for various applications he was able to predict which services would actually benefit from over-provisioning, and which ones would simply still be impossible or, at least, very difficult.

I still don't think that the QOS and admission control and differentiated services ideas are deployable on a cross-domain basis in the real internet, nor do I particularly want them to be, but many ideas and insights in the paper were revealing in a nice way. His writing style made a lot of ideas that were hazy come into much sharper focus.

@article{Shenker:1995,
    volume = {13},
    author = {Shenker, Scott},
    journal = {IEEE Journal on Selected Areas in Communication},
    title = {Fundamental Design Issues for the Future Internet},
    number = {7},
    month = {September},
    url = {citeseer.ist.psu.edu/shenker95fundamental.html},
    year = {1995},
    keywords = {networks, multimedia}
}

Padhye:1998

2005-11-09 18:00:24 -0800

The authors of this paper construct a theoretical model of TCP's performance and then validate it empirically. This paper is/was a particularly hard slog. I get the feeling that most readers probably skimmed the hard bits and then cut to the chase. I had to continually stifle the urge to skip a lot of the equations.

The main idea was to construct a model of TCP throughput and how it would respond to a given set of network conditions. They take into account both fast restransmit and timeout behaviours. As near as I could tell, all of the statistics was really solid - they have a few questionable assumptions about losses, in that they assume loss events are independent, but I can't even begin to imagine the hairy kind of model required to deal with the idea that a loss at time t makes subsequent losses more likely, because congestion doesn't go away instantly.

Their model assumes relatively stable network conditions, that is, it doesn't deal with reaction speed so much as the final steady state of the network. This is a nice first step for showing TCP-friendliness, but we also need to talk about TCP's behaviour in dynamic conditions to ensure a TCP-friendly flow. It also doesn't deal with the slow-start technique that TCP uses.

They then evaluated their model experimentally and discovered that, yes, their assumptions were pretty good for long-lived flows across the internet between a wide selection of sites.

@article{Padhye:1998,
    author = {J. Padhye and V. Firoiu and D. Towsley and J. Krusoe},
    journal = {Proceedings of the ACM SIGCOMM '98 conference on Applications, technologies, architectures, and protocols for computer communication},
    title = {Modeling TCP Throughput: A Simple Model and its Empirical Validation},
    url = {citeseer.csail.mit.edu/padhye98modeling.html},
    year = {1998},
    keywords = {networks, TCP, modeling},
    pages = {303--314}
}

McCanne:1996

2005-11-09 18:00:24 -0800

They bring together a lot of previous research and propose that, instead of sending out one large stream of video, we should have many multicast trees, and a layered video codec, so that you subscribe to the most basic tree if you can't support anything better, and then you subscribe to progressively better trees until you max out your current capabilities. If network congestion increases, then you can unsubscribe from these trees in reverse order.

It seems interesting, but as always, it violates the end to end argument by arguing that ISPs should turn on multicast in their routers. I believe Reza is doing research that is essentially extending this to multicast over unicast and seeing whether or not the benefits are still maintained.

The quality of their service is directly dependent on the quality of each ISPs multicast implementation. If join and leave are not fast enough, then bad juju results. I was also a bit put off by their handwaving about group size and rate convergence, and the fact that people need to cooperate. As we're seeing with bittorrent, there's always a few people willing to get malicious clients to mess with people, so it seems that protocols that require cooperation must be robust to the occasional cheater.

@inproceedings{McCanne:1996,
    rating = {3},
    author = {Steven McCanne and Van Jacobson and Martin Vetterli},
    booktitle = {SICGOMM},
    title = {Receiver-driven Layered Multicast},
    year = {1996},
    keywords = {networks, multicast, multimedia, protocol, adaptivity}
}

Langheinrich:2002

2005-11-09 18:00:24 -0800

A really nice summary of privacy issues in ubiquitous computing.

@inproceedings{Langheinrich:2002,
    rating = {4},
    author = {Marc Langheinrich},
    booktitle = {UBICOMP},
    title = {Privacy Invasions in Ubiquitous Computing},
    year = {2002},
    keywords = {privacy, ubiquitous, ubicomp, surveillance}
}

Hadas:2001

2005-11-09 18:00:23 -0800

This is one I actually helped write with Ran, Jeff, Greg, and Jascha. It's pretty well written, and it details why solving the broadcast problem in a heterogeneous network of workstations is actually NP-Complete. Then we prove that simply telling the fastest node who has not yet received the message will get within an additive factor of optimal, or a multiplicative factor of 2, whichever is smaller.

@article{Hadas:2001,
    volume = {61},
    author = {R. Hadas and J. Hartline and P. Boothe and G. Rae and J. Swisher},
    journal = {Journal of Parallel and Distributed Computing},
    title = {On Multicast Algorithms for Heterogeneous Networks of Workstations},
    year = {2001},
    keywords = {broadcast, networks, workstations, theory, NP-Complete},
    pages = {1665-1679}
}

Karrenberg:2005

2005-11-09 18:00:23 -0800

Daniel Karrenberg makes a convincing case that anycast is still a good idea, and that anycast clusters provide better uptimes than unicasted clusters do.

@conference{Karrenberg:2005,
    author = {Daniel Karrenberg},
    booktitle = {NANOG (presentation slides)},
    title = {DNS Anycast Stability: A Closer Look at DNSMON Data},
    month = {March},
    year = {2005},
    keywords = {nanog, presentation, slides, anycast, DNS}
}

Floyd:1995

2005-11-09 18:00:22 -0800

Multicast is a hard problem to get right - once you start allowing more requirements on the network than best effort, it seems like the dam has burst and every requirement must be accommodated. They propose that, rather than trying to meet every requirement with one protocol, they allow the application to specify what requirements it needs the multicast session to meet. That way, applications that don't need certain requirements are not forcing the network to meet unnecessary demands.

They design the most minimal multicast framework, and then say that everything else is left up to the application. This seems like a really good approach to me - if anything were to convince people to turn on multicast on their routers, this idea of it not requiring a lot of configuration and network load should be a big part of it.

They discuss why you would want to specify the chunks you are missing in terms of what the application needs, rather than TCP sequence numbers, what the absolute minimum requirements for multicast are, and then they discuss an implementation and what its successes have been.

That implementation was for a distributed whiteboard, which sounds like it was a distributed collaborative vector graphics package, which is nice because that means that most of the primitives will fit on a single packet, so you don't have to worry too much about reconstruction and the like. In essence, much like ray-tracing is the easiest thing for parallel computers to do, this seems like the easiest multicast application to write. So if this didn't work, then their framework would truly be useless.

Since it seems to work well, they then experiment with several topologies for their multicast structures and eventually seem to settle on trees of bounded degree as the best approach.

@inproceedings{Floyd:1995,
    rating = {3},
    author = {Sally Floyd and Van Jacobson and Steven McCanne and Ching-Gung Liu and Lixia Zhang},
    booktitle = {SIGCOMM},
    title = {A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing},
    year = {1995},
    keywords = {multicast, networks, lightweight, protocol, scalable}
}

Brakmo:1994

2005-11-09 18:00:21 -0800

This is the paper in which TCP Vegas is proposed. Vegas has an almost entirely novel way of pre-detecting congestion and backing off early. If everyone used it, then the network would be less congested and still be able to maintain an equivalent throughput.

Vegas hgas a bunch of new stuff, but the most interesting to me is their congestion avoidance scheme by continually examining the RTT of every packet and looking at the variance of the packet arrival rate as evidence of queueing. If this level of jitter gets too high, then Vegas backs off. Note that this backing off occurs before any packets are lost. A pretty neat thing - no longer is dropping/marking packets the only method for signaling flows, now you can vary the inter-packet gap too!

Vegas also responds to dup-acks in a novel manner, made possible due to their continual clocking of RTT. If the dup-ACK would have caused a timeout if the timers had been set appropriately, then Vegas begin retransmitting as if a timeout had occurred. Thus they manage to implement an initial timer solely in software, not needing to rely on the coarse timeing available from most hardware.

They then validate Vegas under a number of usage scenarios, and come to the conclusion that it works. Unfortunately, TCP Reno flows tend to grab more than their share of the bandwidth, because of their more aggressive methods of flow control. The question of whether Vegas increased efficiency is made useless because Reno flows grab so much more bandwidth is not addressed.

@inproceedings{Brakmo:1994,
    author = {Lawrence S. Brakmo and Sean W. O'Malley and Larry L. Peterson},
    booktitle = {SIGCOMM},
    title = {TCP Vegas: New Techniques for Congestion Detection and Avoidance},
    url = {citeseer.ist.psu.edu/article/brakmo94tcp.html},
    year = {1994},
    keywords = {networks, TCP},
    pages = {24-35}
}

Chu:2000

2005-11-09 18:00:21 -0800

In this paper the authors make the case that the network is the wrong place to put multicast functionality, and that it would be better to put it at the edges of the network. The argument should be pretty familiar to everyone who knows about the end-to-end argument, but it was novel at the time because everyone had been assuming that IP-level multicast was the way to go.

They develop an application level multicast, show that it can be progressively deployed, and then test it to see whether the performance gains from putting multicast at the IP layer justify the increased complications. They then introduce the concepts of stress and stretch in an effort to actually measure how well or poorly their multicast scheme (Narada) compares to IP multicast in terms of efficiency. Stretch is meant to encapsulate the increase in delay seem by the clients, and stress is meant to encapsulate the extra bandwidth used on each link by the clients.

They found that they increased the RTT by a factor of less than 2, and that the same data traveled to different hosts over the same link no more than 4 or 5 times. This, in my mind and clearly the minds of the authors, is close enough in efficiency that the advantages in debuggability and deployability gained from putting multicast at the application level is a clear win.

@inproceedings{Chu:2000,
    author = {Yang-Hua Chu and Sanjay G. Rao and Hui Zhang},
    booktitle = {Measurement and Modeling of Computer Systems},
    title = {A case for end system multicast},
    url = {citeseer.ist.psu.edu/chu01case.html},
    year = {2000},
    keywords = {networks, multicast, end-to-end},
    pages = {1-12}
}

Clark:1992

2005-11-09 18:00:21 -0800

In this paper, the authors talk about how to design a packet network architecture that could support real-time services. This is one of those papers that I have to remove my idealism hat for a bit to talk about, because I really like the social effects that stem from universal routing and undifferentiated best-effort.

Not only that, I also have to remove my practicality hat, because internet-wide QOS seems pretty undeployable.

So, moving forward...

What would such a network look like? What kinds of guarantees could it accommodate? Can we have hard real-time requirements? Soft real-time? What about delay?

They deal with each of these worries in turn, usually using the ideas of per-flow queues instead of per-interface all-for-one style queues. Hard real time in this network would then be pretty easy, but every router would probably have a lot of wasted bandwidth that was promised to a host, but isn't currently required.

This leads to the idea of over-promising (analogous to soft real-time) by making assumptions about actual traffic use. Kind of like how people design freeways by assuming that everyone at point A won't actually want to go to point B at the same time. The same ideas of per-flow buckets is used again, but this time an over-promising algorithm is also put into the mix.

They then apply these ideas to the delay problem after noting that not all flows need low delay. The delay problem can be dealt with in several ways, but the most interesting one was to set a requested-delay header in a packet and then have each router decrement that header by the amount of time that packet spent waiting in the queue. Then we would have priority-based routing in which packets with a low requested-delay got promoted in the queue in front of packets that don't need the speed. You need to be careful with this scheme to make sure that no packet starves, but that is dealt with in their described Weighted Fair Queueing (WFQ) algorithm.

All-in-all an interesting networks paper, but not one with much relevance to today's internet.

@inproceedings{Clark:1992,
    author = {David D. Clark and Scott Shenker and Lixia Zhang},
    booktitle = {SIGCOMM},
    title = {Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism},
    url = {citeseer.ist.psu.edu/article/clark92supporting.html},
    year = {1992},
    keywords = {networks, real-time, integrated, services},
    pages = {14--26}
}

Clark:2005

2005-11-09 18:00:21 -0800

A bunch of the formeost networking researchers detailing where they think the internet and everything else is going, and where research energies would best be spent. A fascinating read.

@article{Clark:2005,
    rating = {4},
    author = {David D. Clark and Craig Partridge and Robert T. Braden (chair) and Bruce Davie and Sally Floyd and Van Jacobson and Dina Katabi and Greg Minshall and K.K. Ramakrishnan and Timothy Roscoe and Ion Stoica and John Wroclawski and Lixia Zhang},
    journal = {Report of a working session of the End-to-End Research Group},
    title = {Making the world (of communications) a different place},
    month = {January},
    year = {2005},
    keywords = {end-to-end, networks, roadmap}
}

Clarke:2001

2005-11-09 18:00:21 -0800

This paper, by Ian Clarke et al (the creator of Freenet) details how they built Freenet, what its design goals are, and how it works.

Freenet is a peer to peer protocol that, as much as it is possible, allows people to download information from the network without letting an attacker know exactly what it was they downloaded at all, and without disclosing exactly where in the network that particular information was stored. Their design goals include across the board anonymity and deniability, as well as a few stabs at efficiency and robustness in the storage of popular data.

Freenet sounds great on paper, but it seems like it is optimized towards storing files that are relatively cheap to transfer from point to point. Their content clustering mechanism works by semi-randomly distributing files around, and this kind of mechanism works better when the files are relatively small instead of many megabytes. Unfortunately for the freenet project, most of the files that people want to share securely and anonymously are exactly multi-megabyte files.

Their analysis of their own system is spot-on, however, and the security aspects of the protocol seem very well done. I also like that they analyzed their overlay in the context of small-world networks, because that is something that more of the people doing p2p stuff need to do.

@article{Clarke:2001,
    volume = {2009},
    rating = {4},
    author = {Ian Clarke and Oskar Sandberg and Brandon Wiley and Theodore W. Hong},
    journal = {Lecture Notes in Computer Science},
    title = {Freenet: A Distributed Anonymous Information Storage and Retrieval System},
    year = {2001},
    keywords = {p2p, freenet, networks, systems, privacy}
}

Andersen:2001

2005-11-09 18:00:20 -0800

If we were willing to do away with things like scalability, then how reliable could we make a network that is based on a less-reliable infrastructure? That is exactly the question answered by RON. They propose a scheme in which all hosts constantly monitor all other hosts, and provide routing for all other hosts in the network, should they so desire.

This system, which scales with O(n^2) (although in their paper they make a quadratic graph with 3 points - a major no-no), is pretty clearly unfit for large groups, so they want to ask several questions: can RON be more reliable than the underlying internet, and what is the cost of RON in terms of bandwidth for a given group size.

To answer this, they implemented RON and then measured the results! They found that they could be more reliable than the internet - sometimes by a whole lot - and that their bandwidth usage was high enough that their scheme was really only practical for groups of size 30 or less.

While reading this paper, I could practically hear the authors glee in getting rid of many of the restrictions that network researchers have to deal with constantly, particularly the issue of scalability.

This paper had lots of relly interesting implications. In particular, I thought it was neat that RON could be used to allow the user to choose their own routing protocols. I thought it was also neat that, even for end systems onthe internet, for any property you might want to measure - delay, bandwidth, etc - the triangle property does NOT hold, and by participating in one of these resilient subversive overlays you can actually do better by almost any metric you might care about, except overhead efficiency.

@inproceedings{Andersen:2001,
    author = {David G. Andersen and Hari Balakrishnan and M. Frans Kaashoek and Robert Morris},
    booktitle = {Symposium on Operating Systems Principles},
    title = {Resilient Overlay Networks},
    url = {citeseer.ist.psu.edu/andersen01resilient.html},
    year = {2001},
    keywords = {resilient, overlay, networks, RON},
    pages = {131-145}
}

Balakrishnan:1997

2005-11-09 18:00:20 -0800

This paper asks how we might deal with asymmetries in the network when the reverse path of a TCP flow is the bottleneck, due to either transient network conditions, or simply because the reverse link is not as well-provisioned as the forward link. It introduces several ideas - ACK compression, ACK filtering, TCP Sender adaptation, and ACK reconstruction.

Some of these ideas can be piggy-backed on top of each other, and it then validates as many combinations of these as makes sense in an effort to figure out which ideas give the most bang for the buck.

It turns out that (counter to my intuition) ACK filtering combined with sender adaptation is the big winner.

I liked the way this paper really read like the author was figuring things out as he wrote it. He presented a myriad of options and gradually whittled them down. It both made the discovery process clear and made clear what ideas the author had tried and subsequently discarded or turned down, so that later researchers won't get caught following those dead ends.

@inproceedings{Balakrishnan:1997,
    author = {Hari Balakrishnan and Venkata N. Padmanabhan and Randy H. Katz},
    booktitle = {Mobile Computing and Networking},
    title = {The Effects of Asymmetry on {TCP} Performance},
    url = {citeseer.ist.psu.edu/article/balakrishnan97effects.html},
    year = {1997},
    keywords = {networks, TCP},
    pages = {77-89}
}

Agarwal:2004

2005-11-09 18:00:19 -0800

In this paper the authors analyze how BGP updates can affect an ISPs traffic matrix. They found that, despite lots of BGP updates, the inbound traffic from a particular location generally came in from only a particular border router. This is probably because traffic on the internet is not from all points equally - they remark that .05% of BGP updates affect 80% of the traffic, with the implication that 99.95% of BGP updates affect, at most, the other 20%.

Previously, it was shown that BGP didn't dramatically effect outbound traffic in an eBGP sense. This is the first time people have studied iBGP's effect on routing. Despite the fact that their conclusions aren't earthshaking - they conclude that things mostly work, and are more stable than the pessimists among us would have us believe and that the sky is not falling - their methods are extremely solid and interesting. Analyzing iBGP traffic is a mostly untackled problem, so it's nice to have examples to work from.

Abstract:

Recent work in network traffic matrix estimation has focused on generating router-to-router or PoP-to-PoP (Point-of-Presence) traf- fic matrices within an ISP backbone from network link load data. However, these estimation techniques have not considered the im- pact of inter-domain routing changes in BGP (Border Gateway Pro- tocol). BGP routing changes have the potential to introduce signif- icant errors in estimated traffic matrices by causing traffic shifts between egress routers or PoPs within a single backbone network. We present a methodology to correlate BGP routing table changes with packet traces in order to analyze how BGP dynamics affect traffic fan-out within a large ``tier-1'' network. Despite an average of 133 BGP routing updates per minute, we find that BGP routing changes do not cause more than 0.03% of ingress traffic to shift between egress PoPs. This limited impact is mostly due to the rela- tive stability of network prefixes that receive the majority of traffic -- 0.05% of BGP routing table changes affect intra-domain routes for prefixes that carry 80% of the traffic. Thus our work validates an important assumption underlying existing techniques for traffic matrix estimation in large IP networks.

@inproceedings{Agarwal:2004,
    author = {Sharad Agarwal and Chen-Nee Chuah and Supratik Bhattacharyya and Christophe Diot},
    title = {The Impact of BGP Dynamics on Intra-Domain Traffic},
    year = {2004},
    keywords = {bgp, networks, traffic, analysis, route, churn},
    booktitle = {SIGMETRICS/Performance}
}