In this lab you will learn how applications are scaled horizontally. Specifically, we will run a web application built with the Flask web framework behind a load balancer. We will use the very versatile Nginx as the load balancer: it is a fast, reliable load balancer which comes with a variety of configurable algorithms for distributing load across servers on a network. We will spin up multiple instances of the same Flask web application behind the load balancer so that we can get a clear picture of how traffic is distributed by a load-balancing application like Nginx.
+----LOAD BALANCER---+      +---CLIENT:BROWSER----+      ++++++++++++++
| Docker: nginx      |------| host computer       |------|  Internet  |
| http://nginx:80    |      | http://localhost:80 |      ++++++++++++++
+---------|----------+      +---------------------+
          |
+----SERVER:Flask----+
| Docker: webapp     |
| http://webapp:8080 |
+--------------------+
| 1 or more instances|
+--------------------+
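A compose file for this topology might look roughly like the following. The service names come from the diagram above; the image names and exact layout here are assumptions for illustration, not the contents of the lab's actual one.yml:

```yaml
# Hypothetical sketch of the topology above -- NOT the lab's actual one.yml
version: "3"
services:
  nginx:
    image: nginx          # the load balancer, exposed to the host
    ports:
      - "80:80"           # host port 80 -> nginx port 80
  webapp:
    image: webapp         # Flask app; listens on 8080 internally
    expose:
      - "8080"            # visible to nginx only, not to the host
```

The key design point is that only nginx publishes a port to the host; the webapp containers are reachable only on the internal Docker network.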
From your PowerShell prompt, change into the ist346-labs folder and pull the latest code:
PS > cd ist346-labs
PS ist346-labs> $Env:COMPOSE_CONVERT_WINDOWS_PATHS=1
PS ist346-labs> git pull origin master
Next, change into the lab-G folder and bring the application up:
PS ist346-labs> cd lab-G
PS ist346-labs\lab-G> docker-compose -f one.yml up -d
Verify that the nginx load balancer and your webapp are up and running:
PS ist346-labs\lab-G> docker-compose -f one.yml ps
The nginx load balancer listens on TCP port 80. The webapp runs on TCP port 8080, but only exposes itself to the proxy. Browse to http://localhost:80 and you will see the sample web application. NOTE: The webpage is designed to reload every 3 seconds so that you can see what happens on subsequent HTTP requests. No need to hit the refresh button in your browser!
There’s some really important information on the Sample Web Application page. It’s designed to help you understand the behavior of the load balancer. The page shows the HOSTNAME and IP address of the webapp docker container which served it. Currently, on each page refresh we get the same host name. That’s because there is only one webapp container running on the back end of the load balancer. Let’s scale our app to 3 instances and then observe what happens.
To scale the webapp service so there are 3 instances (instead of 1), we must first bring down the current application:
PS ist346-labs\lab-G> docker-compose -f one.yml down
PS ist346-labs\lab-G> docker-compose -f roundrobin.yml up -d --scale webapp=3
Click the Refresh Now link. You should now see 3 instances of the lab-g-_webapp container. We have scaled the app. Browse to http://localhost:80. Notice now that as the page refreshes, on each request you get one of three different HOSTNAMEs and IP addresses. This is the roundrobin algorithm. You can learn more about it here: https://en.wikipedia.org/wiki/Round-robin_scheduling.

If your app is designed to scale horizontally, then you should be able to scale it almost without limit. Large Hadoop clusters, a system designed to scale horizontally, have thousands of nodes. The problem, of course, is that writing an app to scale horizontally is not a trivial task. The biggest issue is making data available to each node participating on the back end and dealing with updates to that data.
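The rotation you just observed can be sketched in a few lines of Python. This is a simulation of the scheduling idea only, not nginx's implementation, and the instance names are made up:

```python
from itertools import cycle

# Three hypothetical backend instances, like the scaled webapp containers.
instances = ["webapp_1", "webapp_2", "webapp_3"]

# Round robin hands each successive request to the next instance,
# wrapping back to the first after the last.
rotation = cycle(instances)

def route(request_id):
    """Return the instance chosen to serve this request."""
    return next(rotation)

# Nine requests are spread evenly: each instance serves exactly three.
served = [route(i) for i in range(9)]
print(served)
```

Note that round robin pays no attention to how busy an instance is; it only rotates. That limitation motivates the other algorithms below.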
As we explained in the previous section, the default load-balancing algorithm is round robin. Let’s explore two other load-balancing algorithms: leastconn and uri.
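In nginx, the balancing algorithm is chosen in the upstream block of the proxy configuration. The yml files used in this lab presumably swap in a config along these lines; this is a hedged sketch, and note that nginx's own directive names are least_conn and hash:

```nginx
upstream webapp {
    # (no directive)     -> round robin, the default
    # least_conn;        -> fewest active connections wins
    # hash $request_uri; -> a given URI always maps to the same instance
    server webapp:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://webapp;
    }
}
```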
The leastconn algorithm selects the instance with the fewest active connections. If a node is busy serving a client, the next request will not use that node, but will instead select another available node. Let’s demonstrate this.
PS ist346-labs\lab-G> docker-compose -f roundrobin.yml down
PS ist346-labs\lab-G> docker-compose -f leastconn.yml up -d --scale webapp=3
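Before you browse, here is the selection rule in miniature. This is a toy simulation with hypothetical instance names, not nginx's actual code:

```python
# Track active connection counts for three hypothetical instances.
active = {"webapp_1": 0, "webapp_2": 0, "webapp_3": 0}

def pick_least_connected():
    """Choose the instance with the fewest in-flight connections."""
    return min(active, key=active.get)

# A long-running request (like /slow/10) ties up one instance...
slow = pick_least_connected()
active[slow] += 1

# ...so while it is busy, new requests go to the other instances.
next_picks = set()
for _ in range(2):
    choice = pick_least_connected()
    active[choice] += 1
    next_picks.add(choice)

print(slow, next_picks)  # the busy instance is not among the next picks
```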
Browse to http://localhost:80. At first glance it seems to work the same as before, rotating evenly through each instance. Now open a second browser window (CTRL + n does this) and arrange the windows so you can see both at the same time. In the new window, request the Sample Web Application again, but use this URL instead: http://localhost/slow/10

Notice that the first window at http://localhost now only uses ONE or TWO of the three instances. The other instance is busy fulfilling the http://localhost/slow/10 request! When that request finishes, the first window will once again use all three instances to fulfill requests.

The uri algorithm selects an instance based on a hash of the URI (Uniform Resource Identifier). It differs from leastconn or roundrobin in that a given URI will always map to the same instance. Let’s see a demo.
PS ist346-labs\lab-G> docker-compose -f leastconn.yml down
PS ist346-labs\lab-G> docker-compose -f hash.yml up -d --scale webapp=3
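The mapping idea behind the uri algorithm can be simulated with any stable hash. Here hashlib is used for determinism and the instance names are hypothetical; nginx's real hash function differs, but the principle is the same:

```python
import hashlib

# Toy simulation of URI-hash load balancing.
instances = ["webapp_1", "webapp_2", "webapp_3"]

def route(uri):
    """Deterministically map a URI to one of the instances."""
    digest = hashlib.md5(uri.encode()).hexdigest()
    return instances[int(digest, 16) % len(instances)]

# The same URI always lands on the same instance, request after request;
# different URIs may land on different instances.
print(route("/"), route("/a"), route("/b"))
```

Because the choice depends only on the URI, no shared state is needed between requests; that is why the mapping survives browser reloads.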
Browse to http://localhost and notice how with every page load we get the same instance. This is because that URI maps to the instance you see.

Browse to http://localhost/a and you will get a different instance. If you re-load the page (you must do this manually) you will still get the same instance for this URI.

Browse to http://localhost/b and you will get yet another different instance. Once again, if you re-load the page you will still get the same instance for this URI.

Future requests to http://localhost, http://localhost/a, or http://localhost/b will always yield a response from the same instance. That’s how URI mapping works! There are other algorithms which work in this manner. Consider their applications: imagine distributing load based on geographical location, browser type, operating system, user attributes, etc. This offers a greater degree of flexibility in how we balance load.
This concludes our lab. Time for a tear down!
PS ist346-labs\lab-G> docker-compose -f hash.yml down
What command would you use to scale a service named myservice to 7 instances?
How does the leastconn algorithm differ from the roundrobin algorithm? How are they similar?
When would you use the uri load balancer algorithm?