Proxy Servers

Protection Proxies

Can we think of any other aspects of systems that need to be transparently hidden?

Or where exactly would we have heard the term "Proxy" in a software enterprise?

What about "Proxy Servers"?

Proxy servers are common in any IT company.

Most of the time, when I ask my participants about "Proxy Servers", the usual answer I get is that proxy servers are used to connect to the internet. I believe this is not the core reason why proxy servers are needed in an IT company.

Proxy servers are basically a "Protection Proxy".

As we know, a proxy can enable transparent hiding.

Let us consider the following example to understand the protection proxy.

Let's say I am a very busy person and I don't want anyone and everyone to come and disturb me, i.e. I want access to me to be controlled. So how do I achieve this? One way is to have a secretary, so that anyone who calls me has to go through my secretary, and the secretary forwards only selected calls to me. This is called explicit access control, because the caller is aware that a different person (the secretary) is explicitly controlling the access to Hemant.

But explicit access control is not usually preferred, for two reasons. First, all the callers have to know about the existence of my secretary and the secretary's reference (telephone number). Second, many of the callers might get offended, saying "What is this? If we have to speak to Hemant, we have to speak to his secretary first."

The second way is to control access to me in an implicit manner, without the caller being aware that another object is controlling the access to me.

So how is it possible to ensure that we can implement implicit access control?

Isn't this a problem of transparent hiding, wherein we want to transparently hide the fact that someone is controlling the access to the target object? Can't a proxy help us out in this situation? I can simply introduce a level of indirection which has the same interface as me, so that whenever a caller makes a call, the call is intercepted by the proxy. In its preprocessing cycle the proxy looks into the details of the caller and then forwards the call to me, or not, depending on the caller's credentials. In this case we have implicitly controlled access to me without the caller being aware that some other entity is controlling access to me.

So protection proxies are used to enable implicit access control or to transparently hide the fact that someone else is controlling the access to the target object.
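To make the secretary example concrete, here is a minimal sketch in Python. It is only an illustration under assumed names: the classes Person, BusyPerson and ProtectionProxy, the method take_call, the allowed_callers set and the caller "Asha" are all hypothetical and not part of any library.

```python
from abc import ABC, abstractmethod


class Person(ABC):
    """Common interface shared by the real target and its proxy."""

    @abstractmethod
    def take_call(self, caller: str, message: str) -> str:
        ...


class BusyPerson(Person):
    """The real target: answers every call that reaches it."""

    def take_call(self, caller: str, message: str) -> str:
        return f"Hemant heard {caller}: {message}"


class ProtectionProxy(Person):
    """Same interface as the target, but screens the caller first."""

    def __init__(self, target: Person, allowed_callers: set) -> None:
        self._target = target
        self._allowed = set(allowed_callers)  # hypothetical credential list

    def take_call(self, caller: str, message: str) -> str:
        # Preprocessing: look at the caller's credentials.
        if caller not in self._allowed:
            return "Sorry, not available right now."
        # Credentials are fine, so forward the call to the real target.
        return self._target.take_call(caller, message)


def ring(person: Person, caller: str) -> str:
    """The caller's code: it only knows the Person interface."""
    return person.take_call(caller, "Can we talk?")


# The caller is handed the proxy but cannot tell it apart from the target.
hemant: Person = ProtectionProxy(BusyPerson(), allowed_callers={"Asha"})
print(ring(hemant, "Asha"))
print(ring(hemant, "Unknown"))
```

Notice that ring() is written only against the Person interface. The access control is implicit because nothing in the caller's code mentions the proxy.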

The following figure shows the basic internet architecture.

Figure: Basic internet architecture

As we know, any company that wants an internet presence needs a WebServer or HTTPServer which is registered with the DNS Server (Domain Name System server), using the domain name from the URL as the key and the IP Address of the HTTPServer as the value.

An HTTPServer is a basic server process which can respond to an HTTPRequest.

On the client side, whenever a person wants to access a website, he types the URL of the site into the browser, e.g. www.rediff.com. The browser is connected to the local DNS server, which in turn is connected to a chain of other DNS servers. The browser queries this chain through the local DNS server to find the IP Address of the HTTPServer registered against the name in the URL. Once the browser knows the IPAddress of the HTTPServer, it opens an HTTPConnection to the HTTPServer and sends an HTTPRequest. The HTTPServer then processes the HTTPRequest and sends back the result in the form of an HTTPResponse. This is basically how any internet-enabled application works.
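The lookup-then-request flow can be sketched as a toy in-memory model. This is not real networking: the DNS_RECORDS and NETWORK dictionaries, the resolve and browse functions, and the address 203.0.113.10 (taken from a range reserved for documentation) are assumptions made up for illustration.

```python
# Toy model of the flow: DNS lookup first, then an HTTP request to the
# address that the lookup returned. Nothing here touches a real network.

# Hypothetical DNS record: domain name -> IP address of the HTTPServer.
DNS_RECORDS = {"www.rediff.com": "203.0.113.10"}


class HTTPServer:
    """A basic server process that answers a request for a path."""

    def handle(self, path: str) -> str:
        return f"200 OK: contents of {path}"


# Stand-in for the network: which server process listens at which address.
NETWORK = {"203.0.113.10": HTTPServer()}


def resolve(host: str) -> str:
    """Stand-in for the browser querying the chain of DNS servers."""
    return DNS_RECORDS[host]


def browse(host: str, path: str = "/") -> str:
    ip_address = resolve(host)      # step 1: name -> IP address
    server = NETWORK[ip_address]    # step 2: open a connection to that address
    return server.handle(path)      # step 3: send the request, get the response


print(browse("www.rediff.com"))
```

The point to notice is that the browser talks to whatever address the DNS lookup returns. It has no other way of knowing which machine is on the other end.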

But let us understand that publishing the IPAddress of the actual HTTPServer in the DNSServer is extremely risky. Once it is published, the IPAddress of your HTTPServer is available to everyone in the world. In today's world of hacking and unethical practices like information theft, exposing the IPAddress of the HTTPServer can be suicidal (a malicious crawler needs only your IPAddress to start attacking your system). So there are two problems here. First, we don't want to expose the IPAddress of the actual HTTPServer. Second, we don't want everyone to interact with our systems directly, which means we need to control access to our actual HTTPServer so that ill-intentioned users cannot reach it directly.

Again there are two ways to do this. We can implement it explicitly by introducing an AdditionalHTTPServer with a different URL, publishing that in the DNSServer, and letting everyone know that anyone who wants to interact with our system must first know the URL of the AdditionalHTTPServer. Callers first interact with the AdditionalHTTPServer, which checks their credentials in its preprocessing cycle and then forwards the call to the actual HTTPServer. But I hope we realize this would be complete chaos, because every caller now has to be aware of the existence of the AdditionalHTTPServer and its corresponding URL.

So explicit access control seems to be a bad idea. That means we should be able to control access to the actual HTTPServer in an implicit manner, without the caller being aware of it.

How do we enable this?

As we know, we need a level of indirection which has the same interface as the actual HTTPServer.

Figure: ProxyHTTPServer placed in front of the actual HTTPServer

We know the world doesn't know Rediff's systems by their IPAddress; it knows Rediff by its URL, i.e. www.rediff.com. So what Rediff can do is introduce a ProxyHTTPServer in front of the actual HTTPServer and give the ProxyHTTPServer the reference of the actual HTTPServer. Now, instead of publishing the IPAddress of the actual HTTPServer in the DNSServer, it publishes the IPAddress of the ProxyHTTPServer. Whenever the user types in the URL www.rediff.com, the browser queries the DNSServer and gets the IPAddress of the ProxyHTTPServer instead of the actual HTTPServer. The client browser then opens an HTTPConnection to the ProxyHTTPServer and sends its HTTPRequest. As the client knows Rediff only through its URL and not its IPAddress, the client feels he is speaking to the actual HTTPServer, but he is not; he is speaking to the ProxyHTTPServer.

Now, in the preprocessing stage of the ProxyHTTPServer, the request is subjected to a lot of security checks set up by your IS team, such as the IP Address of the client the request is coming from, the locale/country the request is coming from, whether the IPAddress is blacklisted as a malicious client, and how many times requests have been received from the same IPAddress in the past. If anything looks suspicious, the request is not forwarded. Only when these checks are satisfied about the credentials of the caller is the call forwarded to the actual HTTPServer, which in turn may talk to the backend systems and return the results.

This is the very reason proxy servers are needed in the IT infrastructure: as a security-enhancing mechanism, acting as a protection proxy and transparently hiding the fact that some external object is controlling the access to the target object.
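The preprocessing checks described above can be sketched as follows. This is a simplified illustration, not how any particular proxy product works: the handle method, the constructor parameters, the _country_of lookup, the country code "XX" and all the IP addresses (taken from documentation-only ranges) are hypothetical.

```python
from collections import Counter


class HTTPServer:
    """The actual server; its address is never published in DNS."""

    def handle(self, client_ip: str, path: str) -> str:
        return f"200 OK: contents of {path}"


class ProxyHTTPServer:
    """Offers the same handle() call, but screens each request first."""

    def __init__(self, target, blacklist, blocked_countries, max_requests):
        self._target = target                     # reference to the actual server
        self._blacklist = set(blacklist)
        self._blocked_countries = set(blocked_countries)
        self._max_requests = max_requests
        self._seen = Counter()                    # requests seen so far per client IP

    def _country_of(self, client_ip: str) -> str:
        # Hypothetical lookup table; a real setup would consult a GeoIP database.
        return {"198.51.100.7": "XX"}.get(client_ip, "IN")

    def handle(self, client_ip: str, path: str) -> str:
        self._seen[client_ip] += 1
        # Check 1: is this client blacklisted as malicious?
        if client_ip in self._blacklist:
            return "403 Forbidden"
        # Check 2: is the request coming from a blocked country?
        if self._country_of(client_ip) in self._blocked_countries:
            return "403 Forbidden"
        # Check 3: has this client already sent too many requests?
        if self._seen[client_ip] > self._max_requests:
            return "429 Too Many Requests"
        # All checks passed: forward the request to the actual server.
        return self._target.handle(client_ip, path)


proxy = ProxyHTTPServer(
    HTTPServer(),
    blacklist={"192.0.2.66"},
    blocked_countries={"XX"},
    max_requests=2,
)
print(proxy.handle("203.0.113.5", "/index.html"))
print(proxy.handle("192.0.2.66", "/index.html"))
```

The client calls handle() on the proxy exactly as it would on the actual HTTPServer, so a rejected client never learns that a separate object made the decision.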

Apart from acting as a protection proxy, proxy servers can also be customized to act as a load balancer, balancing the load across clustered HTTPServers.

Figure: Proxy server acting as a load balancer across clustered HTTPServers
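As a rough sketch of the load-balancing role, the proxy below hands requests to the clustered servers in round-robin order. Round-robin is just one simple policy chosen for illustration; the text does not prescribe one. The names LoadBalancingProxy, node-1 and node-2 are hypothetical.

```python
from itertools import cycle


class HTTPServer:
    """One node of the cluster."""

    def __init__(self, name: str) -> None:
        self.name = name

    def handle(self, path: str) -> str:
        return f"{self.name} served {path}"


class LoadBalancingProxy:
    """Same handle() call as a server, but spreads requests over a cluster."""

    def __init__(self, servers) -> None:
        # cycle() repeats the servers endlessly: 1, 2, 1, 2, ...
        self._next_server = cycle(list(servers))

    def handle(self, path: str) -> str:
        return next(self._next_server).handle(path)


balancer = LoadBalancingProxy([HTTPServer("node-1"), HTTPServer("node-2")])
for _ in range(3):
    print(balancer.handle("/index.html"))
```

The client still sees a single server; which node actually does the work is hidden behind the proxy.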

If some amount of staleness is acceptable for a certain set of data, the ProxyServer can also host a temporary cache, so that requests need not go back to the enterprise tier every time. Such requests can be served at the level of the proxy server, thus increasing the performance of the system.

Figure: Proxy server hosting a temporary cache
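The caching role can be sketched with a small time-limited cache in front of the server. The max_age_seconds parameter stands for the amount of staleness we are willing to accept. The class CachingProxy, its parameters and the hits counter are assumptions for illustration only.

```python
import time


class HTTPServer:
    """Stand-in for the enterprise tier; counts how often it is reached."""

    def __init__(self) -> None:
        self.hits = 0

    def handle(self, path: str) -> str:
        self.hits += 1
        return f"200 OK: contents of {path}"


class CachingProxy:
    """Serves repeated requests from a temporary cache while entries are fresh."""

    def __init__(self, target, max_age_seconds: float, clock=time.monotonic) -> None:
        self._target = target
        self._max_age = max_age_seconds   # acceptable staleness
        self._clock = clock               # injectable so the sketch can be tested
        self._cache = {}                  # path -> (time stored, response)

    def handle(self, path: str) -> str:
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]               # fresh enough: answer from the cache
        # Missing or too old: go back to the server and remember the answer.
        response = self._target.handle(path)
        self._cache[path] = (now, response)
        return response


server = HTTPServer()
proxy = CachingProxy(server, max_age_seconds=60)
proxy.handle("/news")
proxy.handle("/news")
print(server.hits)  # the second request was answered by the proxy
```

Once an entry is older than max_age_seconds, the next request for it goes back to the server, so the data is never staler than what we agreed to accept.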

Although proxy servers can be customized to act as load balancers and as a temporary cache to increase the performance of the application, this should not be the only reason for having them. Proxy servers are primarily meant to act as a protection proxy.

There are a lot of other aspects of systems that need to be transparently hidden; we will discuss them along with the corresponding topics.

Disadvantages of using a proxy: As a proxy is a level of indirection meant to increase the flexibility of a system, the obvious disadvantage is that every call takes an extra hop through the proxy, so performance will be somewhat lower.

 
Hemant Jha
Founder - VPlanSolutions
Researcher, Trainer

www.VPlanSolutions.co.in