Mobile device detection in Varnish

As part of my DAY JOB[tm] I’ve been working on device detection with Varnish. There is some content about this on blogs here and there, but no single place to get rulesets or VCL. We want to fix that by announcing a community-maintained VCL set for this. Check out the GitHub project:

https://github.com/varnish/varnish-devicedetect

It is a VCL set for Varnish which uses regular expressions to group clients into pools like PC, mobile-iphone, mobile-android, tablet-ipad and more based on their User-Agent.
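To make the grouping idea concrete, here is a minimal Python sketch of the same technique: match the User-Agent against an ordered list of regular expressions and return the first pool that matches. The patterns and the function name are my own illustrative assumptions, not the ruleset or code shipped in varnish-devicedetect (which is written in VCL).

```python
import re

# Illustrative patterns only; these are NOT the rules from varnish-devicedetect.
# Order matters: the first rule that matches decides the pool.
DEVICE_RULES = [
    ("tablet-ipad", re.compile(r"iPad", re.IGNORECASE)),
    ("mobile-iphone", re.compile(r"iPhone|iPod", re.IGNORECASE)),
    ("mobile-android", re.compile(r"Android", re.IGNORECASE)),
]


def classify_user_agent(user_agent: str) -> str:
    """Return the device pool for a User-Agent string, falling back to 'pc'."""
    for device, pattern in DEVICE_RULES:
        if pattern.search(user_agent):
            return device
    return "pc"


if __name__ == "__main__":
    print(classify_user_agent("Mozilla/5.0 (Linux; U; Android 2.3; en-us; Nexus One)"))
```

Keeping the rules in an ordered list means more specific patterns can be placed ahead of broader ones without touching the matching loop.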

With this you can serve per-device-type content directly from the Varnish cache, without hitting your backend web servers. It is super easy to install with nothing to compile. Just pull the files and include them in default.vcl.

The backend sees the first request as something like http://example.com/foo.html?devicetype=mobile-android, picks out the GET argument and produces the content that is best suited for small Android devices. Later requests are served from the Varnish cache. The main points are: 1) no need to redirect, which probably saves a second in page load time since mobile networks are high latency and low bandwidth; 2) increased cache hit rate, which means you don’t have to pay for as many backends; 3) you don’t have to maintain the regular expressions all by yourself.
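On the backend side, picking out the GET argument could look roughly like the sketch below. It only uses the Python standard library; the function name and the "pc" fallback for requests without a devicetype argument are assumptions for illustration, not something the VCL set prescribes.

```python
from urllib.parse import parse_qs, urlsplit


def device_type_from_url(url: str, default: str = "pc") -> str:
    """Extract the devicetype GET argument, or return the (assumed) default."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("devicetype")
    return values[0] if values else default


if __name__ == "__main__":
    url = "http://example.com/foo.html?devicetype=mobile-android"
    print(device_type_from_url(url))
```

The backend can then use the returned value to pick a template or stylesheet for that device pool.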

In addition to the regular expression set and the supporting VCL code to send headers or a GET parameter to the backend, you get a system for overriding the detection. Go to /set_ua_device/mobile-android with your usual browser, and later requests will be served as if you had an Android phone. Simple as PIE.

The regular expression set will evolve over time, either through us (Varnish Software, when a customer asks for it) or through community input. For example, if anyone has good suggestions on how to differentiate Android tablets from Android phones, your input is very welcome :)

Posted in Uncategorized | Leave a comment

Hosted Munin

New project: a hosted version of the Munin (http://munin-monitoring.org/) server, running at http://hostedmunin.com/

Munin is an open source solution for collecting metrics from servers and plotting them on trend graphs. It has a simplicity that beats all other monitoring solutions I’ve seen, including super easy plugin creation for adding new charts for your application.

This is for you if you:

  • have a couple of machines / instances here and there (for v1.0), and just want monitoring to work.
  • want proper notifications and trending, without doing any configuring yourself.

The idea is that you can spend all your precious hours awake on making your application run supersmooth, instead of tuning monitoring systems.

Pricing looks like it will be subscription based, with a basic free package (1-3 servers, a limited number of plugins, but quite usable) and a premium package with more nodes and extra features.

We’re in early beta, but have room for some more trial customers. If you want in, drop me an email.


Friday fun from urbandictionary.no (April 8)

tl;dr: here are the most popular terms on urbandictionary.no this week: fettskattfeie overKjøtlefse, gnøkkasmultbag and of course biffgardiner.

Also, the winning term among those who find us via Google is clearly klatremus.

About a week after launch, we're talking:

  • 6200 hits
  • 500 visits
  • 200% increase in the number of terms

Hugz to everyone who is contributing and has contributed to making the best Norwegian dictionary; this is going to be good. Feedback and similar malicious criticism is of course gratefully received by email or TWITAR (@lkarsten).


urbandictionary.no is live

This winter's weekend project has been to code up a Norwegian version of urbandictionary.com. We (BT + me) have now started to get it reasonably up and running, so http://urbandictionary.no/ is hereby launched.

For those unfamiliar with the concept, it is about building a dictionary based on slang words. Urbandictionary.com says they have around 5 million entries. That is quite something to reach for, given that we have around 100 terms now. :-)

Plans for the future:

  • some API stuff so that the mobile clients have something to talk to.
  • clients for Android/iPhone.
  • a newsletter with the word of the week?
  • maybe sell some coffee mugs to cover the hosting costs.

On the technical side we use Django with PostgreSQL underneath, git for revision control, and Redmine for issue tracking.

We need some graphics, if anyone feels like contributing. :-)


In-flight Internet with Norwegian

I was out flying yesterday, and got one of these new 737-800w's from Norwegian. New and exciting is free WLAN once you pass 10k feet.

The tl;dr version is that it works roughly like ISDN, but the latency via satellite is pretty hefty. You come out on the Internet in Germany, and the round-trip to Norway is 1-2 seconds. They have a Linux box on board that routes IPv4 (no IPv6 to be seen), does DNS/DHCP, and has a transparent web cache (squid by the looks of it).

As a telematics engineer, I naturally have to investigate a bit when the chance presents itself.

traceroute to www.vg.no (195.88.55.16), 64 hops max, 40 byte packets
1  192.168.33.1 (192.168.33.1)  1 ms  1 ms  1 ms
2  * * *
3  * * *
4  * * *
5  192.168.14.102 (192.168.14.102)  1188 ms  836 ms  961 ms
6  2.239.214.82.in-addr.arpa (82.214.239.2)  721 ms  685 ms  978 ms
7  cr-01-ge0-8.dw.direcpceu.com (62.128.191.190)  1101 ms  958 ms  799 ms
8  pr-02-ge0-1.dw.direcpceu.com (62.128.191.251)  740 ms  748 ms  733 ms
9  pos-2-1-a2.f.core.de.ignite.net (62.134.2.93)  774 ms  689 ms  733 ms
10  t2a7-ge7-0-0.de-fra.eu.bt.net (166.49.172.33)  679 ms  707 ms  714 ms
11  so-7-0-3.edge1.Frankfurt1.Level3.net (212.162.47.109)  685 ms  659 ms  671 ms
12  vlan99.csw4.Frankfurt1.Level3.net (4.68.23.254)  679 ms vlan69.csw1.Frankfurt1.Level3.net (4.68.23.62)  809 ms vlan79.csw2.Frankfurt1.Level3.net (4.68.23.126)  1161 ms
13  ae-72-72.ebr2.Frankfurt1.Level3.net (4.69.140.21)  1490 ms * ae-62-62.ebr2.Frankfurt1.Level3.net (4.69.140.17)  1646 ms
14  ae-48-48.ebr1.Dusseldorf1.Level3.net (4.69.143.177)  901 ms ae-47-47.ebr1.Dusseldorf1.Level3.net (4.69.143.173)  1166 ms ae-48-48.ebr1.Dusseldorf1.Level3.net (4.69.143.177)  1610 ms
15  * * *
16  ae-6-6.car1.Oslo1.Level3.net (4.69.143.225)  1278 ms  1403 ms  1440 ms
17  212.162.27.6 (212.162.27.6)  1361 ms  1306 ms  938 ms
18  te4-2-0.cr2.osls.no.catchbone.net (193.75.3.169) [MPLS: Label 301024 Exp 0]  793 ms  817 ms  790 ms
19  te7-1-0.cr2.xa19.no.catchbone.net (193.75.1.202) [MPLS: Label 308032 Exp 0]  777 ms  970 ms  1420 ms
20  te4-1-0.br1.xa19.no.catchbone.net (193.75.1.50)  890 ms  860 ms  958 ms
21  v4094.m323-rs1.net.kq.no (193.69.44.206)  1119 ms  1629 ms  1230 ms
22  www.vg.no (195.88.55.16)  970 ms  1377 ms  1485 ms

Downloading (hm, isn't it really _uploading_ when you are at 30k feet and working over a satellite link?) an uncached 10MB file gives 208KB/s. That is suspiciously high; I wonder if my measurement is off and the file was in fact cached after all.

 


Detecting IP-over-DNS, part two.

As mentioned earlier, I did some work on detecting IP-over-DNS traffic as a part of my master's degree from NTNU in Communication Technology with focus on information security.

My final method was to, and this may be cheating, look at a pcap or live dump of packets and group them by the domain the client requested DNS answers from. In essence, it exploits the fact that all public IP-over-DNS implementations use a single domain name. The cheating part is that there is no reason an IP-over-DNS implementation has to do this. I figure the less ethical IP-over-DNS users can patch this easily.

After collecting n replies from the recursive DNS server to the client, compute the following metrics:

  • percentage of replies to the same domain (domain_max_percent)
  • bytes per second over the time period it took for n replies to be seen. (bps)
  • average number of queries seen per second in the time period. (qps)
  • mean packet size in the time period (mps)

The values I got the best results with were:

  • mps > 140 bytes
  • qps > 2.27
  • bps > 560 bytes per second
  • domain_max_percent >= 98%
  • n=70 packets

For each sample consisting of n packets from a single client, compute these metrics. If any of the rules above is false, the client is not an IP-over-DNS client (by this definition).
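The metrics and rules above can be sketched in Python roughly as follows. This is not the implementation from my thesis work, just a minimal illustration under some assumptions: the Reply record and both function names are hypothetical, each reply is assumed to correspond to one query (so qps is counted from replies), and the time period is taken as the span between the first and last reply in the sample.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Reply:
    timestamp: float  # capture time in seconds
    domain: str       # domain the client asked about
    size: int         # packet size in bytes


def compute_metrics(replies):
    """Compute domain_max_percent, bps, qps and mps for one sample."""
    times = [r.timestamp for r in replies]
    duration = max(times) - min(times)
    if duration <= 0:
        raise ValueError("sample must span a positive time period")
    total_bytes = sum(r.size for r in replies)
    top_count = Counter(r.domain for r in replies).most_common(1)[0][1]
    return {
        "domain_max_percent": 100.0 * top_count / len(replies),
        "bps": total_bytes / duration,
        "qps": len(replies) / duration,  # assumes one reply per query
        "mps": total_bytes / len(replies),
    }


def looks_like_ip_over_dns(replies, n=70):
    """Apply the thresholds from the text to the first n replies."""
    if len(replies) < n:
        raise ValueError("need at least n replies for one sample")
    m = compute_metrics(replies[:n])
    return (
        m["mps"] > 140
        and m["qps"] > 2.27
        and m["bps"] > 560
        and m["domain_max_percent"] >= 98
    )


if __name__ == "__main__":
    sample = [Reply(i * 0.1, "t.example.com", 200) for i in range(70)]
    print(looks_like_ip_over_dns(sample))
```

All four conditions must hold for a sample to be flagged, which matches the rule that a single false condition clears the client.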

Early attempts used n=30 packets and mps > 240 bytes, but detection avoidance attempts with extremely low fragment size showed that n=70 packets and mps > 140 bytes gave the best results.

Critics may point out that a client may send a lot of fake requests to a different domain in addition to the IP-over-DNS traffic, and thereby skew domain_max_percent below 98%. Yes, this is possible.

I have Python implementations of all of the methods attempted. I guess realtime detection is the most interesting, and I will put it up on my GitHub account sometime in the near future after cleaning up the code a bit.

In the time since I did my prestudies last spring, a paper on IP-over-DNS detection seems to have been published. It’s on arXiv, and they use the character distribution in the query strings to match IP-over-DNS clients. This sounds cool, and way better than the Kolmogorov complexity attempts that I did.


the perfect web strategy

Clue (the dictionary people) have definitely figured it out.

If you are a bit unsure about this newfangled web technology, do what you are used to and squeeze it into the same format you are used to. Windows-style windows are, after all, best practice on the web.
