To follow ALL redirects using CURL brings up a lot of special cases

Wednesday, January 20, 2010 22:24
Posted in category CURL PHP Examples

Following ALL redirects with cURL brings up a lot of special cases. Here’s a function that handles HTTP redirects, cookies set along the way, and even simple JavaScript redirects:

<?php

function get_final_url( $url, $timeout = 5 )
{
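    // Only undo HTML-escaped ampersands; urldecode() on a whole URL would corrupt encoded query values such as %26
    $url = str_replace( "&amp;", "&", trim($url) );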

    $ch = curl_init();
    curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
    curl_setopt( $ch, CURLOPT_URL, $url );
    // An empty COOKIEFILE turns on in-memory cookie handling without leaving a temp file behind on every call
    curl_setopt( $ch, CURLOPT_COOKIEFILE, "" );
    curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
    curl_setopt( $ch, CURLOPT_ENCODING, "" );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
    curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
    curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
    curl_setopt( $ch, CURLOPT_TIMEOUT, $timeout );
    curl_setopt( $ch, CURLOPT_MAXREDIRS, 10 );
    $content = curl_exec( $ch );
    $response = curl_getinfo( $ch );
    curl_close ( $ch );

    // cURL can still end on a redirect status (for example when the MAXREDIRS limit is reached), so look up the Location header by hand
    if ( in_array( $response['http_code'], array(301, 302, 303, 307, 308) ) )
    {
        ini_set("user_agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
        $headers = get_headers($response['url']);

        // get_headers() returns false on failure
        if ( $headers !== false )
        {
            foreach( $headers as $value )
            {
                if ( substr( strtolower($value), 0, 9 ) == "location:" )
                    return get_final_url( trim( substr( $value, 9 ) ), $timeout );
            }
        }
    }

    // Simple JavaScript redirects; the non-greedy (.*?) stops at the first closing quote
    if (    preg_match("/window\.location\.replace\('(.*?)'\)/i", $content, $value) ||
            preg_match("/window\.location\=\"(.*?)\"/i", $content, $value)
    )
    {
        return get_final_url ( $value[1], $timeout );
    }
    else
    {
        return $response['url'];
    }
}

?>
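
One special case the function above does not cover is a relative redirect target, such as a Location header or a window.location value of "/login". Passing that straight back into cURL fails because it is not a full URL. Here is a minimal sketch of a helper that could be applied to the target before recursing. The name resolve_redirect_target is made up for this example, and it only handles the common cases: it does not normalize "../" segments or query-only targets.

```php
<?php
// Hypothetical helper, not part of the original function: turn a redirect
// target that may be relative into an absolute URL, based on the page it came from.
function resolve_redirect_target( $base, $target )
{
    $target = trim( $target );

    // Already absolute (starts with a scheme such as http:// or https://)
    if ( preg_match( '#^[a-z][a-z0-9+.-]*://#i', $target ) )
        return $target;

    $parts = parse_url( $base );
    if ( $parts === false || !isset( $parts['scheme'], $parts['host'] ) )
        return $target; // no usable base, give the target back unchanged

    // Protocol-relative target: //host/path
    if ( substr( $target, 0, 2 ) == '//' )
        return $parts['scheme'] . ':' . $target;

    $origin = $parts['scheme'] . '://' . $parts['host'];
    if ( isset( $parts['port'] ) )
        $origin .= ':' . $parts['port'];

    // Root-relative target: /path
    if ( substr( $target, 0, 1 ) == '/' )
        return $origin . $target;

    // Path-relative target: replace everything after the last slash of the base path
    $path = isset( $parts['path'] ) ? $parts['path'] : '/';
    $dir  = substr( $path, 0, strrpos( $path, '/' ) + 1 );
    return $origin . $dir . $target;
}
```

Inside get_final_url() it would be called as resolve_redirect_target( $response['url'], $value[1] ) before the recursive call.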

2 Responses to “To follow ALL redirects using CURL brings up a lot of special cases”

  1. James A. says:

    March 11th, 2010 at 4:02 pm

    I really appreciate this good read! I loved every bit of it. I’ve bookmarked your blog to read the new stuff you post.

  2. Per Nielsen says:

    April 14th, 2010 at 7:30 pm

    Hi…
    First of all – thanks for the script. Much appreciated !

    A little fix, at least it worked for me.
    The URL in the JavaScript location assignment can be quoted with either " or ', so I added both to the match (plus a check for location.href).

        if (    preg_match("/window\.location\.replace\('(.*?)'\)/i", $content, $value) ||
                preg_match("/window\.location\=[\"'](.*?)[\"']/i", $content, $value) ||
                preg_match("/location\.href\=[\"'](.*?)[\"']/i", $content, $value)
        )
    

    Per :-)
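
The quote handling from the comment above can be tried out on its own, without making any HTTP request. This is only a sketch: the function name extract_js_redirect is made up for the example, and the patterns are a slightly loosened variant of the ones above (non-greedy captures and optional whitespace around the "="). It will still miss redirects that build the URL from variables.

```php
<?php
// Hypothetical helper: return the target of a simple JavaScript redirect
// found in $html, or null if none of the patterns match.
function extract_js_redirect( $html )
{
    $patterns = array(
        "/window\.location\.replace\(\s*[\"'](.*?)[\"']\s*\)/i",
        "/window\.location\s*=\s*[\"'](.*?)[\"']/i",
        "/location\.href\s*=\s*[\"'](.*?)[\"']/i",
    );

    foreach ( $patterns as $pattern )
    {
        // (.*?) is non-greedy, so the capture ends at the first closing quote
        if ( preg_match( $pattern, $html, $match ) )
            return $match[1];
    }

    return null;
}
```

With a helper like this, the last part of get_final_url() could recurse whenever the result is not null and return $response['url'] otherwise.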
