Use GeoIP database within HAProxy

Introduction

Sometimes we need to know the country of the user using the application, for different purposes:

  • Automatically select the most appropriate language
  • Send a 302 to redirect the user to the POP closest to their location
  • Allow only a single country to browse the site, for legal reasons
  • Block some countries we don’t do business with and which are the source of most web attacks

IP based location


To achieve this purpose, the most “reliable” information we have from a user is his IP address.
Well, not as reliable as we could hope:

  • not “reliable” because it’s easy to use a proxy installed in a foreign country to fake your IP address…
  • not “reliable” because GeoIP databases are not accurate.
  • not “reliable” because GeoIP databases rely on information provided by the ISP.
  • not “reliable” because any subnet can be routed from anywhere on Earth.

When an ISP requests a new subnet from its RIR (Regional Internet Registry), it has to declare the country where the subnet will be used.
This country is supplied as a two-letter code, standardized by ISO 3166.

The whois tool can be used to find the country code of an IP address:

$ whois 1.1.1.1
[...]
country:        AU
[...]
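
If you want only the country code out of the whois answer, a small filter is enough. This is a minimal sketch: the function name country_from_whois is made up for this example, and it assumes the registry prints a "country:" field (the capitalization varies between registries, so the match is case-insensitive).

```shell
# country_from_whois: read whois output on stdin and print the first
# country code found (hypothetical helper, not part of any whois client).
country_from_whois() {
    awk 'tolower($1) == "country:" { print toupper($2); exit }'
}

# Usage (needs network access and a whois client):
#   whois 1.1.1.1 | country_from_whois
```

Some registries return several "country:" lines for one query; the sketch simply keeps the first one.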

Geolocation definition

It’s quite easy to understand: geolocation is the process of linking a third party to a geographical location. In simpler words: knowing the country of a client IP address.
On the Internet, such a database is called GeoIP.

Geolocation databases


There are a few GeoIP databases available on the Internet; most of them use IP ranges to link an IP address to its country code.
An IP range is simply a pair of IP addresses representing the first and the last IP address of a range.
NOTE: It might correspond to a real subnet, but in most cases, it doesn’t ;).

For example:

"1.1.2.0","1.1.63.255","16843264","16859135","CN","China"

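The third and fourth columns of that line are the same two addresses written as 32-bit integers. Here is a minimal sketch of the conversion in plain shell arithmetic; the function name ip_to_int is made up for this example.

```shell
# ip_to_int: convert a dotted-quad IPv4 address to its 32-bit integer value.
# Hypothetical helper; assumes octets without leading zeros (a leading zero
# would be read as octal by the shell).
ip_to_int() (
    IFS=.
    set -- $1        # split the address on dots into $1..$4
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
)

ip_to_int 1.1.2.0      # prints 16843264
ip_to_int 1.1.63.255   # prints 16859135
```

The function body runs in a subshell, so changing IFS does not leak into the calling shell.
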
What’s the issue with HAProxy then???


HAProxy can only use CIDR notation, with real subnets.
It means we’ll have to turn the IP ranges into CIDR notation. This is not an easy task, since an IP range must be split into multiple subnets…
Once done, we’ll be able to configure HAProxy to use them in ACLs and do anything an ACL can do.

For example, the range above should be translated to the following subnets:

1.1.2.0/23 "CN"
1.1.4.0/22 "CN"
1.1.8.0/21 "CN"
1.1.16.0/20 "CN"
1.1.32.0/19 "CN"

Now you can understand why GeoIP databases use IP ranges: it takes fewer lines 🙂
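
To make the splitting less magical, here is a minimal sketch of it in plain shell arithmetic. It is only an illustration of the idea, not a description of how any existing tool does it: starting at the first address, repeatedly emit the largest aligned block that still fits in the range. The helper names ip_to_int, int_to_ip and range_to_cidr are made up for this example, and a shell with 64-bit arithmetic is assumed.

```shell
# Dotted quad -> 32-bit integer (assumes octets without leading zeros).
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
)

# 32-bit integer -> dotted quad.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# range_to_cidr FIRST LAST: print the CIDR subnets covering FIRST..LAST.
range_to_cidr() (
    start=$(ip_to_int "$1")
    end=$(ip_to_int "$2")
    while [ "$start" -le "$end" ]; do
        prefix=32
        size=1
        # Double the block while start stays aligned on the doubled size
        # and the doubled block does not run past the end of the range.
        while [ "$prefix" -gt 0 ] \
              && [ $(( start % (size * 2) )) -eq 0 ] \
              && [ $(( start + size * 2 - 1 )) -le "$end" ]; do
            size=$(( size * 2 ))
            prefix=$(( prefix - 1 ))
        done
        echo "$(int_to_ip "$start")/$prefix"
        start=$(( start + size ))
    done
)

range_to_cidr 1.1.2.0 1.1.63.255
```

The last line prints the same five subnets as listed above, without the country code.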

iprange tool

To ease this job, Willy Tarreau released a tool called iprange in the contrib directory of the HAProxy sources.
You can see it here, in HAProxy’s git: http://www.haproxy.org/git/?p=haproxy.git;a=tree;f=contrib/iprange
It can be used to extract CIDR subnets from an IP range.

iprange installation


Just download both Makefile and iprange.c, then run make:

$ make
gcc -s -O3 -o iprange iprange.c

too complicated 🙂

iprange usage


iprange takes a single input format, composed of 3 columns separated by commas:

  1. first IP
  2. last IP
  3. country code

For example:

"1.1.2.0","1.1.63.255","CN"

NOTE: in the examples below, we’ll work with MaxMind’s GeoLite Country database, in CSV format.

The database looks like:

$ head GeoIPCountryWhois.csv
"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"
"1.1.0.0","1.1.0.255","16842752","16843007","CN","China"
"1.1.1.0","1.1.1.255","16843008","16843263","AU","Australia"

To make it compatible with the iprange tool, use cut to keep only columns 1, 2 and 5:

$ cut -d, -f1,2,5 GeoIPCountryWhois.csv | head
"1.0.0.0","1.0.0.255","AU"
"1.0.1.0","1.0.3.255","CN"
"1.0.4.0","1.0.7.255","AU"
"1.0.8.0","1.0.15.255","CN"
"1.0.16.0","1.0.31.255","JP"
"1.0.32.0","1.0.63.255","CN"
"1.0.64.0","1.0.127.255","JP"
"1.0.128.0","1.0.255.255","TH"
"1.1.0.0","1.1.0.255","CN"
"1.1.1.0","1.1.1.255","AU"

Now, you can use it with iprange:

$ cut -d, -f1,2,5 GeoIPCountryWhois.csv | head | ./iprange 
1.0.0.0/24 "AU"
1.0.1.0/24 "CN"
1.0.2.0/23 "CN"
1.0.4.0/22 "AU"
1.0.8.0/21 "CN"
1.0.16.0/20 "JP"
1.0.32.0/19 "CN"
1.0.64.0/18 "JP"
1.0.128.0/17 "TH"
1.1.0.0/24 "CN"
1.1.1.0/24 "AU"

Country codes and HAProxy ACLs


Now we’re ready to turn IP ranges into subnets associated with a country code.
We still need to be able to use them in HAProxy.
The easiest way is to write all the subnets for a country code into a single file:

$ cut -d, -f1,2,5 GeoIPCountryWhois.csv | ./iprange | sed 's/"//g' |
awk '{ print $1 >> ($2 ".subnets") }'

And the result is nice:

$ ls *.subnets
A1.subnets  AX.subnets  BW.subnets  CX.subnets  FJ.subnets  GR.subnets  IR.subnets  LA.subnets  ML.subnets  NF.subnets  PR.subnets  SI.subnets  TK.subnets  VE.subnets
A2.subnets  AZ.subnets  BY.subnets  CY.subnets  FK.subnets  GS.subnets  IS.subnets  LB.subnets  MM.subnets  NG.subnets  PS.subnets  SJ.subnets  TL.subnets  VG.subnets
AD.subnets  BA.subnets  BZ.subnets  CZ.subnets  FM.subnets  GT.subnets  IT.subnets  LC.subnets  MN.subnets  NI.subnets  PT.subnets  SK.subnets  TM.subnets  VI.subnets
AE.subnets  BB.subnets  CA.subnets  DE.subnets  FO.subnets  GU.subnets  JE.subnets  LI.subnets  MO.subnets  NL.subnets  PW.subnets  SL.subnets  TN.subnets  VN.subnets
AF.subnets  BD.subnets  CC.subnets  DJ.subnets  FR.subnets  GW.subnets  JM.subnets  LK.subnets  MP.subnets  NO.subnets  PY.subnets  SM.subnets  TO.subnets  VU.subnets
AG.subnets  BE.subnets  CD.subnets  DK.subnets  GA.subnets  GY.subnets  JO.subnets  LR.subnets  MQ.subnets  NP.subnets  QA.subnets  SN.subnets  TR.subnets  WF.subnets
AI.subnets  BF.subnets  CF.subnets  DM.subnets  GB.subnets  HK.subnets  JP.subnets  LS.subnets  MR.subnets  NR.subnets  RE.subnets  SO.subnets  TT.subnets  WS.subnets
AL.subnets  BG.subnets  CG.subnets  DO.subnets  GD.subnets  HN.subnets  KE.subnets  LT.subnets  MS.subnets  NU.subnets  RO.subnets  SR.subnets  TV.subnets  YE.subnets
AM.subnets  BH.subnets  CH.subnets  DZ.subnets  GE.subnets  HR.subnets  KG.subnets  LU.subnets  MT.subnets  NZ.subnets  RS.subnets  ST.subnets  TW.subnets  YT.subnets
AN.subnets  BI.subnets  CI.subnets  EC.subnets  GF.subnets  HT.subnets  KH.subnets  LV.subnets  MU.subnets  OM.subnets  RU.subnets  SV.subnets  TZ.subnets  ZA.subnets
AO.subnets  BJ.subnets  CK.subnets  EE.subnets  GG.subnets  HU.subnets  KI.subnets  LY.subnets  MV.subnets  PA.subnets  RW.subnets  SY.subnets  UA.subnets  ZM.subnets
AP.subnets  BM.subnets  CL.subnets  EG.subnets  GH.subnets  ID.subnets  KM.subnets  MA.subnets  MW.subnets  PE.subnets  SA.subnets  SZ.subnets  UG.subnets  ZW.subnets
AQ.subnets  BN.subnets  CM.subnets  EH.subnets  GI.subnets  IE.subnets  KN.subnets  MC.subnets  MX.subnets  PF.subnets  SB.subnets  TC.subnets  UM.subnets
AR.subnets  BO.subnets  CN.subnets  ER.subnets  GL.subnets  IL.subnets  KP.subnets  MD.subnets  MY.subnets  PG.subnets  SC.subnets  TD.subnets  US.subnets
AS.subnets  BR.subnets  CO.subnets  ES.subnets  GM.subnets  IM.subnets  KR.subnets  ME.subnets  MZ.subnets  PH.subnets  SD.subnets  TF.subnets  UY.subnets
AT.subnets  BS.subnets  CR.subnets  ET.subnets  GN.subnets  IN.subnets  KW.subnets  MG.subnets  NA.subnets  PK.subnets  SE.subnets  TG.subnets  UZ.subnets
AU.subnets  BT.subnets  CU.subnets  EU.subnets  GP.subnets  IO.subnets  KY.subnets  MH.subnets  NC.subnets  PL.subnets  SG.subnets  TH.subnets  VA.subnets
AW.subnets  BV.subnets  CV.subnets  FI.subnets  GQ.subnets  IQ.subnets  KZ.subnets  MK.subnets  NE.subnets  PM.subnets  SH.subnets  TJ.subnets  VC.subnets

This makes subnets available for 246 country codes!

For example, the subnets associated with Australia (AU) are:

$ cat AU.subnets 
1.0.0.0/24
1.0.4.0/22
1.1.1.0/24
[...]
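
One thing to keep in mind: the pipeline above appends with >>, so running it a second time (after a database update, for instance) duplicates every subnet in the .subnets files. Here is a minimal sketch of a rerunnable variant; the function name split_by_country and the output directory argument are assumptions made for this example.

```shell
# split_by_country DIR: read lines such as  1.0.0.0/24 "AU"  on stdin
# (one subnet followed by a quoted country code) and write each subnet
# to DIR/<code>.subnets. Hypothetical helper.
split_by_country() (
    outdir=$1
    mkdir -p "$outdir"
    rm -f "$outdir"/*.subnets      # start clean so reruns do not duplicate
    sed 's/"//g' |
        awk -v dir="$outdir" '{
            f = dir "/" $2 ".subnets"
            print $1 >> f
            close(f)               # keep the number of open files low
        }'
)

# Usage, with the files from this article:
#   cut -d, -f1,2,5 GeoIPCountryWhois.csv | ./iprange | split_by_country subnets
```

Closing the file after each line is slower, but it avoids hitting the open-file limit of some awk implementations when more than 200 files are written.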

The bash loop below prepares the ACLs in a file called haproxy.cfg:

$ for f in `ls *.subnets` ; do echo $f | 
awk -F'.' '{ print "acl "$1" src -f "$0 >> "haproxy.cfg" }' ; done
$ head haproxy.cfg 
acl A1 src -f A1.subnets
acl A2 src -f A2.subnets
acl AD src -f AD.subnets
acl AE src -f AE.subnets
acl AF src -f AF.subnets
acl AG src -f AG.subnets
acl AI src -f AI.subnets
acl AL src -f AL.subnets
acl AM src -f AM.subnets
acl AN src -f AN.subnets

That makes a lot of countries 🙂
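
The same ACL lines can be produced without parsing the output of ls and without spawning one awk per file. This is a minimal sketch; the function name gen_country_acls is made up for this example, and it prints to standard output so that you decide whether to overwrite or append.

```shell
# gen_country_acls DIR: print one "acl <code> src -f <file>" line for each
# DIR/<code>.subnets file. Hypothetical helper.
gen_country_acls() (
    for f in "$1"/*.subnets; do
        [ -e "$f" ] || continue      # glob did not match: DIR has no .subnets file
        name=${f##*/}                # strip the directory part
        echo "acl ${name%.subnets} src -f $f"
    done
)

# Usage, from the directory holding the .subnets files:
#   gen_country_acls . > haproxy.cfg
```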

Continent codes and HAProxy ACLs


Fortunately, we can summarize them by continent. Copy and paste the country code and continent mapping from the MaxMind website into a file named country_continents.txt: http://www.maxmind.com/app/country_continent.

The script below creates one file per continent, named after the continent code and containing the related country codes:

$ for c in `fgrep -v '-' country_continents.txt | sort -t',' -k 2` ; 
do echo $c | awk -F',' '{ print $1 >> $2".continent" }' ; done

We now have 7 new files:

$ ls *.continent
AF.continent  AN.continent  AS.continent  EU.continent  
NA.continent  OC.continent  SA.continent

Let’s have a look at the countries in South America:

$ cat SA.continent 
AR
BO
BR
CL
CO
EC
FK
GF
GY
PE
PY
SR
UY
VE
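
The loop used to build the .continent files can also be written as a single awk pass. This is a minimal sketch, assuming the mapping file holds one "country,continent" pair per line with "--" as the continent of the special codes, which is what the commands above rely on. The function name split_by_continent and its two arguments are made up for this example.

```shell
# split_by_continent MAPPING DIR: write DIR/<continent>.continent files,
# each listing the country codes of that continent. Hypothetical helper.
split_by_continent() (
    mapping=$1
    outdir=$2
    mkdir -p "$outdir"
    rm -f "$outdir"/*.continent    # start clean so reruns do not duplicate
    # Drop the entries without a continent ("--"), then sort by continent
    # and by country code.
    grep -v -e '-' "$mapping" | sort -t, -k2,2 -k1,1 |
        awk -F, -v dir="$outdir" '{
            f = dir "/" $2 ".continent"
            print $1 >> f
            close(f)
        }'
)

# Usage:
#   split_by_continent country_continents.txt .
```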

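Let’s aggregate the subnets of every country in a continent into a single file. Several continent codes (AF, AN, AS, EU, NA, SA) are also used as country codes, so the aggregated files get a distinct .continent-subnets suffix to avoid appending to the country files:

$ for f in `ls *.continent` ; do for c in $(cat $f) ; 
do cat ${c}.subnets >> ${f%%.*}.continent-subnets ; done ;  done

Now we can generate the HAProxy configuration file to use them. The ACL names are prefixed with continent_ for the same reason:

$ for c in AF AN AS EU NA OC SA ; do 
echo acl continent_$c src -f $c.continent-subnets >> "haproxy.conf" ; done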

Usage in HAProxy


Coming soon: an article giving some examples of how to use the files generated above to improve the performance and security of your platforms.
