Internet routing with Quagga/Linux on Free Internet eXchange Oslo (FIXO)

I run a small internet routing AS (autonomous system), mostly to keep my routing skills up to date (somewhat).

AS56809 gets transit from Blix Solutions AS, and is also present on Free Internet eXchange Oslo (FIXO for short). FIXO is comparable to the somewhat bigger NIX peering point, also in Oslo.

If you want to run BGP with Quagga on FIXO, here are a few hints on how to do it. This was done with quagga 0.99.14 from the Debian wheezy packages. Let me add that I’m fairly inexperienced at this stuff, so there might be bad advice in this post :)

Make some access lists with your networks in them; handy for filtering which routes you send to peers:

conf t
access-list as64496-networks permit 192.0.2.0/24
access-list as64496-networks deny any

ipv6 prefix-list ipv6-as64496-networks seq 2 permit 2001:db8::/32
ipv6 prefix-list ipv6-as64496-networks seq 10 deny any

Basic quagga/BGP setup:

conf t
password <pw>
ip forwarding
ipv6 forwarding

router bgp 64496
bgp router-id x.x.x.x
network 192.0.2.0/24
address-family ipv6
network 2001:db8::/32
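A side note: as far as I know, the ip forwarding / ipv6 forwarding statements just flip the kernel sysctls, so you can sanity-check them from a shell:

# both should print 1 once zebra has forwarding enabled
sysctl net.ipv4.ip_forward
sysctl net.ipv6.conf.all.forwarding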

Make some peer groups that contain most of the boilerplate stuff:

router bgp 64496
neighbor fixo-peers peer-group
neighbor fixo-peers maximum-prefix 100
neighbor fixo-peers soft-reconfiguration inbound
neighbor fixo-peers distribute-list as64496-networks out

address-family ipv6
neighbor fixo-peers activate
neighbor fixo-peers soft-reconfiguration inbound
neighbor fixo-peers maximum-prefix 100
neighbor fixo-peers prefix-list ipv6-as64496-networks out

Adding a new IPv4 peering:

conf t
router bgp 64496
neighbor 91.198.176.x remote-as yyyy
neighbor 91.198.176.x peer-group fixo-peers
neighbor 91.198.176.x description foo@bar.com, +47 1234

Adding a new IPv6 peering:

conf t
router bgp 64496
neighbor 2001:7f8:41:0:xxxx:1 remote-as yyyy
neighbor 2001:7f8:41:0:xxxx:1 description foo@bar.com, +47 1234

address-family ipv4 unicast
no neighbor 2001:7f8:41:0:xxxx:1 activate
address-family ipv6
neighbor 2001:7f8:41:0:xxxx:1 peer-group fixo-peers

Making sure it works

After you’ve set things up, you need to verify that the peering comes up (Established state), and that you’re sending (and receiving) the routes you intend. A common mistake is to send the peer your whole routing table, which is why the distribute-list/prefix-list is in there.

Some handy commands are:

# show how this peering is doing. you want Established.
show ip bgp neigh 91.198.176.xx
# show what routes you are sending to this peer, should be the 1-5 routes you have in the as64496-networks access list.
show ip bgp nei 91.198.176.xx advertised-routes
# what you are getting from the peer. Tab-completion in vtysh is a pain on this command.
show ip bgp neigh 91.198.176.xx received-routes

# reset the peering.
clear ip bgp neigh 91.198.176.xx
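
# quick overview of all your peerings and their state:
show ip bgp summary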

If you are running IPv6 peerings, you’ll soon notice that the CLI commands are scattered quite inconsistently.

# general overview is under the ordinary ip bgp listing:
show ip bgp nei 2001:7f8:41:0:xxxx:1

# but, if you want to see advertised (and received) routes, these are here:
show ipv6 bgp nei 2001:7f8:41:0:xxxx:1 advertised-routes

Route server

FIXO operates a set of route servers. These are in essence BGP routers that only collect routes and pass them on, leaving the next-hop in the announcements as it was. It means that you peer with the route server and get all the routes of all the others who also peer with it. Saves you a lot of emails.

Setting up the peering is identical to any ipv4/ipv6 peering, except that you need to tag/mark the routes you want redistributed by it with a community.

route-map fixo-routeserver-out permit 10 
 set community 61300:61300

Quagga doesn’t allow you to set a route-map on a single member of a peer-group, so you will have to duplicate some config for these two peers.

neighbor 91.198.176.253 remote-as 61300
neighbor 91.198.176.253 description fixo-routeserver-1
neighbor 91.198.176.253 soft-reconfiguration inbound
neighbor 91.198.176.253 maximum-prefix 300
neighbor 91.198.176.253 distribute-list as64496-networks out
neighbor 91.198.176.253 route-map fixo-routeserver-out out

All done!


Guidelines for ticketing systems (rant)

If you are writing a ticketing/support request system, make sure to remember the following:

  • It is ok to mangle the subject line if you are the ticketing system.
  • Absolutely no UTF8 support. ø å ought to do it.
  • Make sure to include all of the ticket information when notifying the requestor of ticket state change.
  • Do not indicate which part has actually changed, the requestor can guess this easily.
  • All time references should be in local time without any indication of time zone. The requestor will know where you do business and where the support staff is working from.
  • Only insane users have screens wider than 800 pixels. (Hello, osTicket.)

If you are writing the interface used by the support personnel:

  • Any web page must have at least 60 links with similar names.
  • Make a bad re-implementation of SQL as the query language. Require it for any searches for old tickets. (Yes, I am looking at you, Request Tracker.)
  • Text fields for writing customer replies must not disable Ctrl-W or ESC, so that bash/vim users are rightfully punished when muscle memory kicks in.
  • Page loads should take at the minimum 10 seconds.
  • Always show the full ticket history, especially the notices that the system has sent copies of the email to everyone involved. The most recent information should be the last to load.
  • Use javascript to render the menu bar after everything else is done. This keeps the user from skipping to another ticket too soon.

It is the law that all ticketing systems must adhere to this golden standard.


HTTP Live Streaming (HLS) tools

Delivering video over HTTP is up and coming. The two protocols that seem to be winning (currently) are HLS and HDS. These are my notes from playing around with HLS.

If you want to play HLS on Linux, this Python player works: http://gitorious.org/hls-player . A recent VLC (2.0.3) sometimes works, but not always (and the progress bar behaves very strangely..)

If you want to dump an HLS stream to a single .ts file, use this tool: https://github.com/osklil/hls-fetch

If you want to dump an HLS VOD (video on demand, premade) stream to local disk to play with, try out my quick hls-mirror script: https://gist.github.com/anonymous/4755742

If you want to segment a media file for later video-on-demand delivery, here is how, using bleeding-edge avconv (built from git with just libx264 and libmp3lame):

avconv -i big_buck_bunny_720p_surround.avi -vcodec libx264 -acodec libmp3lame -hls_time 10 -hls_list_size 999999999 foo/output.m3u8

hls_list_size seems to be how many items (of the n segments made) should appear in the m3u8 file. Setting it to 0 actually means an empty m3u8, so useful!
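For reference, the playlist it writes looks roughly like this (segment names and durations made up):

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10,
output0.ts
#EXTINF:10,
output1.ts
#EXTINF:8,
output2.ts
#EXT-X-ENDLIST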

The output files work on my Android 4.1 device and on my desktop.

Why all this, you ask? Because video over HTTP works very well through Varnish, even better with streaming (cut-through forwarding) enabled. Even on mediocre DSL you can make a continuous stream/file set, push it to a Varnish on AWS/whatever, and anyone can broadcast live to the world. Slightly delayed, but standards based, simple to do and simple to scale.
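The streaming bit is a one-line change in Varnish 3 VCL; a minimal sketch (the URL patterns and TTL are just examples):

sub vcl_fetch {
    if (req.url ~ "\.ts$") {
        # pass bytes on to the client as they arrive from the backend
        set beresp.do_stream = true;
    }
    if (req.url ~ "\.m3u8$") {
        # a live playlist changes every few seconds; keep its TTL short
        set beresp.ttl = 3s;
    }
}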

I think we’ve just seen the start of how cool (and disrupting) this will become.


Varnish Cache development news

Short update on activities in Varnish cache development:

  • Work is being done on getting access to the request BODY inside Varnish. This feature has been requested for a while, particularly by the security.vcl guys. See daf0e2c.
  • The 2013Q1 Varnish Developer Day will be held in Copenhagen on February 4th.
  • I had my request approved and got req.request (which was really the HTTP method) renamed to req.method. Hurray. 0e9627d
  • Martin is steadily working on a new libvarnishapi, that makes writing varnish tools (top, stat, ncsa..) simpler. Better filtering language, more stable and better performance. Or something like that, the details are still fuzzy. :)
  • I’ve been running varnish trunk in “production” on hyse.org for a couple of weeks now. No ill effects in serving data to report. It works fine for my (minor) needs. One small note is that varnishlog fd/request grouping doesn’t work currently, so you need “varnishlog -O” to get any useful output at all. The rest of the tools are probably broken as well. :)
  • On the stable side of Varnish, we (Varnish Software) have got some experience with 3.0.3 now. It seems pretty solid. Setting a backend administratively sick from varnishadm is useful.
  • The Varnish Agent 2.0 was released the other day. It is a RESTful interface in front of varnishadm, and lets you query/set stuff over HTTP. We will use it for our management tool (VAC), but it should be useful for other uses as well. Source and release announcement.

Varnish trick: Serve stale content while refetching

Here is a small trick we recently implemented for a customer:

The main premise was:

No clients should have to wait while the backend works. If a request is a miss, give the client a slightly stale/old page instead, and fetch a new page in the background.

Since Varnish and VCL are super configurable, we can do this with a VCL hack and a small helper process.

The flow is that a client requests something that just expired. In vcl_miss we notice this, and change the backend to a sick one. We also log the URL that just failed with std.log(), before restarting the request handling. Back in vcl_recv the usual sick-backend behaviour kicks in and the slightly stale graced object is given to the client.
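In Varnish 3 VCL the flow looks roughly like the sketch below. Note that this is my reconstruction of the idea, not the actual customer config; the always-sick force_grace backend and the X-Refetch header are inventions for the sketch:

import std;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# a backend that is always sick: the probe points at a closed port,
# so switching to it triggers the long-grace path in vcl_recv
backend force_grace {
    .host = "127.0.0.1";
    .port = "9999";
    .probe = {
        .url = "/";
        .interval = 1s;
        .timeout = 0.5s;
        .window = 3;
        .threshold = 3;
    }
}

sub vcl_recv {
    if (req.http.X-Refetch) {
        # the helper script asking for a fresh copy, even if one is cached
        set req.hash_always_miss = true;
    }
    if (req.backend.healthy) {
        set req.grace = 15s;
    } else {
        set req.grace = 6h;
    }
}

sub vcl_miss {
    if (req.restarts == 0 && !req.http.X-Refetch) {
        # tell the helper to refetch this URL, then serve the graced copy
        std.log("refetch:" + req.http.host + req.url);
        set req.backend = force_grace;
        return (restart);
    }
    # no graced copy either (or this is the helper): fetch for real
    set req.backend = default;
}

sub vcl_fetch {
    # keep expired objects around so they can be served during refetches
    set beresp.grace = 6h;
}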

Outside Varnish there is a small python script that tails varnishlog output for a special VCL_Log entry. When it picks it up, it sends a request for the same URL to the local Varnish. In vcl_recv we detect this client, set req.hash_always_miss to force a refetch and let the python script wait while the backend recreates the page.
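The helper itself can be tiny. A sketch in Python 2, assuming Varnish listens on 127.0.0.1:6081 and using the refetch: log tag from the VCL above (again mine, not the actual customer script):

#!/usr/bin/env python
# Tail varnishlog for the "refetch:" tag and warm the cache in the background.
import subprocess
import threading
import urllib2

def refetch(host, path):
    req = urllib2.Request("http://127.0.0.1:6081" + path,
                          headers={"Host": host, "X-Refetch": "1"})
    try:
        urllib2.urlopen(req, timeout=120).read()
    except Exception:
        pass  # a failed refetch just means another try on the next miss

# -i VCL_Log limits the output to the std.log() lines
log = subprocess.Popen(["varnishlog", "-i", "VCL_Log"],
                       stdout=subprocess.PIPE)
for line in iter(log.stdout.readline, ""):
    if "refetch:" not in line:
        continue
    hostpath = line.split("refetch:", 1)[1].strip()
    host, _, path = hostpath.partition("/")
    threading.Thread(target=refetch, args=(host, "/" + path)).start()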

All requests that come in while the refetch is underway will be served graced copies at full speed.

Your 95th percentile response time graphs will love this feature, and maybe some of your users will as well. Cool, huh?

More information about grace in Varnish:

https://www.varnish-software.com/static/book/Saving_a_request.html#core-grace-mechanisms


New Varnish VMOD: Softer purges / invalidations

With the new softpurge vmod you can do cache invalidation in Varnish that only affects objects if your backend is up. This means that you can purge all you want, and in the normal case everything works as expected, but if your backend/origin server is unhappy you can serve stale content instead of a 503.

It introduces softpurge.softpurge(); where you would normally use purge; (in vcl_hit and vcl_miss).
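Wired into Varnish 3 VCL, that might look like the sketch below (the ACL and status codes are my choices, not part of the vmod):

import softpurge;

acl purgers {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ purgers) {
            error 405 "Not allowed";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        softpurge.softpurge();
        error 200 "Softpurged";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        softpurge.softpurge();
        error 200 "Softpurged";
    }
}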

The trick is of course that we only reduce the TTL of the cached objects (and their Vary-ants), but keep the grace period. This gives us the additional win of increased concurrency while your backend recreates the content after a purge: all but the first client requesting the page within the page creation time are served the stale copy, per the usual Varnish grace behaviour.

You can find the vmod on my github account. Please remember that this is an early version which has not seen heavy production use yet.

PS: We’re looking into doing softer bans (the other way of doing cache invalidation in Varnish) as well. This looks to be a bit more involved, so we (Varnish Software) are looking for sponsors for this task. Ping me (or Ruben) if this sounds interesting.


Varnishncsa and std.log()

New in Varnish 3.0.3rc1 is that you can put arbitrary log lines from VCL into the varnishncsa output. This can be used for funky stuff like logging the session cookie along with the request.

Let’s say you have the following VCL and have a session id as the single site cookie:

import std;

sub vcl_deliver {
    std.log("sessionid:" + regsub(req.http.cookie, ".*=", ""));
}

Run varnishncsa as follows:

 # varnishncsa -F '%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" %{VCL_Log:sessionid}x'

And now your access logs contain the session id/cookie string. Great stuff, huh?


Mobile device detection in Varnish

As part of my DAY JOB[tm] I’ve been working on device detection with Varnish. There is some content about this on blogs here and there, but no single place to get rulesets or VCL. We want to fix that by announcing a community-updated VCL set. Check out the Github project:

https://github.com/varnish/varnish-devicedetect

It is a VCL set for Varnish which uses regular expressions to group clients into pools like PC, mobile-iphone, mobile-android, tablet-ipad and more based on their User-Agent.

With this you can serve per-device-type content directly from the Varnish cache, without hitting your backend web servers. It is super easy to install with nothing to compile. Just pull the files and include them in default.vcl.
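Wiring it up is roughly this (going from memory of the README; check the repo for the current interface):

include "devicedetect.vcl";

sub vcl_recv {
    call devicedetect;
    # req.http.X-UA-Device now holds "pc", "mobile-iphone", "tablet-ipad", ...
}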

The backend sees the first request like http://example.com/foo.html?devicetype=mobile-android, picks out the GET argument and produces the content best suited for small Androids. Later requests are served straight from the Varnish cache. The main points are: 1) no need to redirect, saving probably a second in page load time since mobile networks are high latency and low bandwidth; 2) increased cache hit rate, which means you don’t have to pay for as many backends; 3) you don’t have to maintain the regular expressions all by yourself.

In addition to the regular expression set and the supporting VCL code to send headers/GET parameter to the backend, you get a system for overriding the detection. Go to /set_ua_device/mobile-android with your usual browser, and later requests will be served as if you had an android phone. Simple as PIE.

The regular expression set will evolve over time, either by us (Varnish Software, when a customer asks for it) or by community input. For example, if anyone has a good suggestion on how to differentiate Android tablets from Android phones, your input is very welcome :)


Detecting IP-over-DNS, part two.

As mentioned earlier, I did some work on detecting IP-over-DNS traffic as part of my master’s degree from NTNU in Communication Technology with a focus on information security.

My final method was, and this may be cheating, to look at a pcap or live dump of packets and group the DNS answers by the domain the client requested them from. In essence, it exploits the fact that all public IP-over-DNS implementations use a single domain name. The cheating part is that there is no reason for an IP-over-DNS implementation to do this; I figure the less ethical IP-over-DNS users can patch it easily.

After collecting n replies from the recursive DNS server to the client, compute the following metrics:

  • percentage of replies to the same domain (domain_max_percent)
  • bytes per second over the time period it took for n replies to be seen. (bps)
  • average number of queries seen per second in the time period. (qps)
  • mean packet size in the time period (mps)

The values I got the best results with were:

  • mps > 140 bytes
  • qps > 2.27
  • bps > 560 bytes
  • domain_max_percent >= 98%
  • n=70 packets

For each sample consisting of n packets from a single client, compute these. If any of the rules above are false, the client is not an IP-over-DNS client. (by this definition)

Early attempts used n=30 packets and mps > 240 bytes, but detection-avoidance attempts with extremely small fragment sizes showed that n=70 packets and mps > 140 bytes gave the best results.
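In code, the per-sample decision is simple once the counters are in place. A condensed Python sketch (my illustration, not the actual thesis code):

def is_ip_over_dns(replies, n=70, mps_min=140, qps_min=2.27,
                   bps_min=560, domain_min=0.98):
    """Classify one client from its last n DNS replies.

    replies: list of (timestamp, size_in_bytes, domain) tuples."""
    if len(replies) < n:
        return False  # not enough data to judge yet
    sample = replies[-n:]
    duration = sample[-1][0] - sample[0][0]
    if duration <= 0:
        return False
    total_bytes = sum(size for _, size, _ in sample)
    mps = total_bytes / float(n)         # mean packet size
    qps = n / float(duration)            # queries per second
    bps = total_bytes / float(duration)  # bytes per second
    counts = {}
    for _, _, domain in sample:
        counts[domain] = counts.get(domain, 0) + 1
    domain_max_percent = max(counts.values()) / float(n)
    # every rule must hold, or the client is not an IP-over-DNS client
    return (mps > mps_min and qps > qps_min and
            bps > bps_min and domain_max_percent >= domain_min)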

Critics may point out that a client could send a lot of fake requests to a different domain in addition to the IP-over-DNS traffic, and thus skew domain_max_percent below 98%. Yes, this is possible.

I have Python implementations of all of the methods attempted. I guess realtime detection is the most interesting, and I will put it up on my github account sometime in the near future after cleaning up the code a bit.

In the time since I did my prestudies last spring, a paper on IP-over-DNS detection seems to have been published. It’s on arXiv, and they use the character distribution in the query strings to match IP-over-DNS clients. This sounds cool, and way better than the Kolmogorov-complexity attempts that I did.


Detecting IP-over-DNS

I wrote my master’s thesis at the Norwegian University of Science and Technology (NTNU) during the spring of 2010. My assignment was “Covert channels in the Domain Name System”, but for the most part it is about detecting an IP-over-DNS client from observing its queries. It is published in Norwegian, and available over email if you’re interested.

To put it in Wikipedia terms, I believe this blog now contains original research. Crazy.

The setup was to look at packet dumps from a recursive DNS cache, or more precisely, the answers from the local DNS cache to the client. PCAPs were taken at one of the DNS recursors at NTNU, along with packet dumps of ssh, web and keepalive traffic over nstx, iodine and TUNS.

I was able to detect a client running iodine among university background traffic within 10-30 seconds, with no false positives. It was implemented in Python with impacket, and works in real time with negligible resource usage.

First of all, I was not able to get the following detection mechanisms to work very well:

  • client’s bandwidth usage per time unit. May work if you whitelist your email servers and use 30-60 seconds detection time, but not a very promising method.
  • Kolmogorov complexity. Zip the data parts of the DNS responses; the idea was that the complexity (i.e. lower compressibility) would be higher for IP-over-DNS traffic than for ordinary DNS traffic. See the sketch below. Didn’t quite work out, but may be feasible with more work.
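Boiled down, the compressibility metric was along these lines (my illustration, not the thesis code):

import zlib

def compress_ratio(payload):
    # compressed/original size; closer to 1.0 means more random-looking data,
    # which is what an encoded tunnel payload tends to be
    if not payload:
        return 0.0
    return len(zlib.compress(payload)) / float(len(payload))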

What I didn’t try, but seemed cool:

  • autocorrelation of the time between queries. It should be quite different, since the client is polling constantly.
  • time series analysis with wavelets. Complicated math.

I also believe that most people interested in machine learning will find this a very simple task. I did not look into it, but probably should have.

Top tip for people (without training, like me): never, never, never attempt to use uniform sampling for this stuff. I wasted probably a month figuring this out. You already have a perfect event-based sample set; use it for what it is worth.

So, since this blog post is pretty long already, I’m going to save the good stuff for a followup post later this weekend.
