RBleug


Regilero's blog; Mostly tech things about web stuff.

Understanding absolute and relative URL problems when serving the same web application from several domains.

If you develop a web application, you will soon reach the point where you have to build your URLs.
The best thing to do, in any case, is to use relative URLs.

If your app works with relative URLs and handles a base_url prefix for sites installed in sub-directories (like '/myapp/is/here/'), sysadmins will be your friends. (If you use Zend Framework, have a look at the setBaseUrl() router function.)
But most apps do not handle URLs this way and use absolute URLs instead. Sad world.
You will then find these URLs in the HTML content, in the CSS, in the JavaScript, and in every object generated by your web application.
That's hell for admins. So let's look deeper.

  • First, moving your app to a different domain may be harder (especially if you're a bad coder and the absolute URL is not written just once in a config file).
  • Then, serving your app from several domains at the same time won't be possible.
  • And worse, you'll break proxy rewriting.
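
To make the base_url idea concrete, here is a minimal sketch in Python (the post talks about PHP, but the logic is the same). The BASE_URL setting and the app_url() helper are hypothetical names made up for this illustration; the point is only that the prefix lives in one place and every generated URL stays free of any host name.

```python
from urllib.parse import quote

# Hypothetical setting, defined once (for example in a config file).
# Use "" when the app is installed at the root of the site.
BASE_URL = "/myapp/is/here"


def app_url(path, base_url=BASE_URL):
    """Return a host-relative URL for an application path such as 'css/site.css'."""
    prefix = base_url.rstrip("/")
    # quote() keeps '/' as-is and escapes characters such as spaces.
    return prefix + "/" + quote(path.lstrip("/"))


print(app_url("css/site.css"))              # /myapp/is/here/css/site.css
print(app_url("/user/login", base_url=""))  # /user/login
```

Since no domain appears in these URLs, the same pages can be served under any host name without touching the application.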

Example and workarounds

The job of the proxy is to serve your web application under a different name. Maybe you have a myapp.localnetwork.net application that everyone in localnetwork.net can see.
The proxy will serve it as myapp.publicdomain.com or even publicdomain.com/myapp/is/here.
And the proxy only handles HTTP headers.
That means it will only rewrite things like redirections (a status code and a URL in the headers of the response, not in the HTTP content).

browser
-> myapp.publicdomain.com (proxy)
-> myapp.anotherproxy.domain.com (proxy)
-> myapp.localnetwork.net
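
As a rough illustration of "headers only", the sketch below rewrites the Location header of a redirect and leaves the body alone. It is a simplified Python model written for this post, not what any real proxy runs (in Apache this kind of header fix is the job of ProxyPassReverse); the rewrite_location() function and the sample response are assumptions, and only the host names come from the example above.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names from the example; the one-to-one mapping is an illustrative assumption.
BACKEND_HOST = "myapp.localnetwork.net"
PUBLIC_HOST = "myapp.publicdomain.com"


def rewrite_location(headers, backend=BACKEND_HOST, public=PUBLIC_HOST):
    """Return a copy of the response headers with Location pointing at the public name."""
    rewritten = []
    for name, value in headers:
        if name.lower() == "location":
            parts = urlsplit(value)
            if parts.netloc == backend:
                value = urlunsplit(parts._replace(netloc=public))
        rewritten.append((name, value))
    return rewritten


response_headers = [
    ("Content-Type", "text/html"),
    ("Location", "http://myapp.localnetwork.net/login?next=/home"),
]
body = '<a href="http://myapp.localnetwork.net/home">home</a>'

print(rewrite_location(response_headers))
# The body is passed through untouched, so its absolute URL still names the backend.
print(body)
```

The redirect now points to the public name, but the link inside the HTML still sends the browser to a host it cannot reach.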

So you may ask the proxy maintainer to rewrite URLs in the content as well.
OK, this can be done in Apache with mod_proxy_html, for example. But this has 2 drawbacks:

  • This module parses all the content and rewrites it, which is really slower.
  • And this module cannot handle all content: it may miss a localnetwork.net URL you did not know about in a JavaScript file... big bugs.

Real app solutions

So... if you use absolute URLs anyway, you should test whether any proxy sits between you and the guy requesting the page.
Apache gives this to PHP in the HTTP_X_FORWARDED_HOST parameter (or something similar if you're not using PHP; it is something your web server knows from the HTTP headers of the request).
There you'll find the list of proxy names used (if any) between you and the browser. Be careful, it's a comma-separated list, with spaces.
The name you should use as the base name of your site is the first name in this list.

In our example we'll have "myapp.publicdomain.com, myapp.anotherproxy.domain.com", and the absolute URL to use is http://myapp.publicdomain.com
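
Picking the first name could look like the sketch below. It is written in Python against a CGI/WSGI-style environ dict, which uses the same HTTP_X_FORWARDED_HOST key that PHP exposes in $_SERVER. The public_host() helper and its default_host argument are hypothetical, and the code assumes the header was set by a proxy you trust, since a browser can send any value it likes.

```python
def public_host(environ, default_host="myapp.localnetwork.net"):
    """Pick the host name the browser used, from a CGI/WSGI-style environ dict."""
    forwarded = environ.get("HTTP_X_FORWARDED_HOST", "")
    # Comma-separated list, possibly with spaces: keep the first non-empty name.
    names = [name.strip() for name in forwarded.split(",") if name.strip()]
    if names:
        return names[0]
    # No proxy header: fall back to the Host header, then to the configured default.
    return environ.get("HTTP_HOST", default_host)


environ = {
    "HTTP_HOST": "myapp.localnetwork.net",
    "HTTP_X_FORWARDED_HOST": "myapp.publicdomain.com, myapp.anotherproxy.domain.com",
}
print("http://" + public_host(environ))  # http://myapp.publicdomain.com
```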

With that name overriding your default base name, you'll have a better app. But the problem is not completely fixed, in fact. What you have is the host name, but you have no information about the protocol.
Maybe you'll prefix the name with 'http://' and it will be OK, but maybe the proxy is using SSL (a public filtering proxy using SSL in front of an intranet application is a very common setup). Then the protocol should be 'https://'.

If you decide to handle this in a config file of your application, you'll get 'https://' even on your local network, where you shouldn't.
Your only solution would be to tell your app which protocol to use for which name. Quite dirty.
Well, in fact there is a better solution; the proxy can set:

RequestHeader set X-Forwarded-Proto "https"

in its SSL configuration, and you'll have HTTP_X_FORWARDED_PROTO available in your code.
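
Putting the host and the protocol together, a base URL could be computed along these lines. Again this is only a Python sketch over a CGI/WSGI-style environ dict: public_base_url() is a hypothetical helper, the "localhost" fallback is arbitrary, and the code assumes both headers come from a proxy you control.

```python
def public_base_url(environ):
    """Build 'scheme://host' as seen by the browser, from a CGI/WSGI-style environ dict."""
    forwarded_host = environ.get("HTTP_X_FORWARDED_HOST", "")
    hosts = [h.strip() for h in forwarded_host.split(",") if h.strip()]
    # First forwarded name if any, else the Host header ("localhost" is an arbitrary fallback).
    host = hosts[0] if hosts else environ.get("HTTP_HOST", "localhost")

    # Header set by the proxy with: RequestHeader set X-Forwarded-Proto "https"
    proto = environ.get("HTTP_X_FORWARDED_PROTO", "").split(",")[0].strip().lower()
    if proto not in ("http", "https"):
        # Missing or unexpected value: use the scheme of the direct connection.
        proto = environ.get("wsgi.url_scheme", "http")
    return proto + "://" + host


behind_ssl_proxy = {
    "HTTP_HOST": "myapp.localnetwork.net",
    "HTTP_X_FORWARDED_HOST": "myapp.publicdomain.com, myapp.anotherproxy.domain.com",
    "HTTP_X_FORWARDED_PROTO": "https",
}
direct_access = {"HTTP_HOST": "myapp.localnetwork.net", "wsgi.url_scheme": "http"}

print(public_base_url(behind_ssl_proxy))  # https://myapp.publicdomain.com
print(public_base_url(direct_access))     # http://myapp.localnetwork.net
```

The same code gives 'https://' through the SSL proxy and 'http://' on the local network, without a per-name table in the config file.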

All these things work because of 'de facto' HTTP headers added by mod_proxy on Apache. You may not have them, and your application will then rely on the proxy configuration. Remember that relative URLs are still the best solution.

Tags: Apache, HTTP, PHP, Proxy, Web
