Manipulate the web source code. [Basics]
#1
Hi,
The goal of this tut is to learn how to retrieve, modify, and manipulate the source code of a web page, easily.
"What the hell are you going to do with that?" you're going to ask me.

In fact, there are many different ways to exploit a web page's source code: retrieving data from another site, automating some boring tasks, etc.
E.g., a Link Checker.

Keep in mind that there are a lot of better solutions than the one I'm going to explain to you, but this method works perfectly well too, and it's the easiest way.

Let's begin with some basic string functions in PHP.

First of all, let's see...


I ) String functions


function str_replace ( mixed $search , mixed $replace , mixed $subject [, int &$count ] )
Documentation
  • As you may have guessed, this function replaces one term with another in a string.

    Example :
    PHP Code:
    <?php
    $str = "Sheeps are purples";
    echo "Before = " . $str . "<br>";
    // Replace every occurrence of "purples" with "yellows"
    $str = str_replace("purples", "yellows", $str);
    echo "After = " . $str . "<br>";
    ?>
    This outputs:
    Quote:Before = Sheeps are purples
    After = Sheeps are yellows

    As you will see, this function is useful for filtering out some words without using regular expressions.
    This tutorial does not cover regular expressions, but if you know how to use them, don't hesitate!
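
To go one step further with the filtering idea above, here is a minimal sketch of str_replace working on several words at once. The sentence and the word lists are made up for the illustration; the optional fourth parameter ($count in the signature above) receives the number of replacements that were made.

```php
<?php
// Hypothetical sentence and word lists, chosen only for the illustration.
$text    = "Sheeps are purples and purples are pretty";
$search  = array("purples", "pretty");
$replace = array("yellows", "funny");

// With arrays, $search[0] is replaced by $replace[0], $search[1] by $replace[1], and so on.
// $count receives the total number of replacements performed.
$result = str_replace($search, $replace, $text, $count);

echo $result . "<br>";                      // Sheeps are yellows and yellows are funny
echo "Replacements = " . $count . "<br>";   // Replacements = 3
```

Passing arrays avoids chaining several str_replace calls when you have a whole list of words to filter.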



function explode ( string $delimiter , string $string [, int $limit ] )
Documentation
  • Returns an array of strings, each of which is a substring of $string formed by splitting it on the boundaries formed by $delimiter.

    Example :
    PHP Code:
    <?php
    $str = "Hello, I'm Mr Spoon !"; // $str is a string
    $str = explode(" ", $str); // We cut $str on " " (the space character) >> $str is now an array.
    print_r($str);
    echo "<br> str[3] = " . $str[3];
    ?>
    This outputs:
    Quote:Array ( [0] => Hello, [1] => I'm [2] => Mr [3] => Spoon [4] => ! )
    str[3] = Spoon
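
The optional $limit parameter from the signature above deserves a quick look too. This is only a small sketch with a made-up input line: with a limit of 2, explode cuts on the first delimiter only, and the last element keeps the rest of the string. It also shows what you get back when the delimiter is not in the string at all.

```php
<?php
// Hypothetical input line, chosen only for the illustration.
$line = "title: Manipulate: the web source code";

// Limit 2: at most two elements, the second one holds everything after the first ": ".
$parts = explode(": ", $line, 2);
echo $parts[0] . "<br>"; // title
echo $parts[1] . "<br>"; // Manipulate: the web source code

// When the delimiter is absent, explode returns an array with one element: the whole string.
$none = explode("|", $line);
echo count($none); // 1
```

Checking count() on the result is an easy way to know whether the delimiter was found before reading index 1.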


Okay... Now we know just these two string functions.
We will try to do something using only them.

Let's see which function allows us to get a web page's source code...



II ) How to get the source code


function file_get_contents ( string $filename [, int $flags = 0 [, resource $context [, int $offset = -1 [, int $maxlen = -1 ]]]] )
Documentation
  • This retrieves the source code of the page whose URL you pass in the $filename parameter.

    Example :
    PHP Code:
    <?php
    $url = "http://supportforum.alwaysdata.net/Web-source/test.html"; // We are going to get the source of this page...
    $page = file_get_contents($url); // We save the source code in the $page variable, as a string
    echo $page; // We display $page
    ?>
    This outputs:
    Quote:This is a test !

    Hey, stop! Where are the HTML tags in my output?!
    Haha, you are smart if you thought that.
    Indeed, they are not displayed, because the browser interprets them... but they exist!
    Let's take a look with the htmlentities function:

    PHP Code:
    <?php
    $url = "http://supportforum.alwaysdata.net/Web-source/test.html"; // We are going to get the source of this page...
    $page = file_get_contents($url); // We save the source code in the $page variable, as a string
    echo htmlentities($page); // We display $page with its tags escaped, so the browser shows them as text
    ?>
    This outputs:
    Quote:<html> <head><title>Test !</title></head> <body>This is a test !</body> </html>
    Haha, now they are shown!

    Never forget: what you see is not always what you get.
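
One thing the examples above do not handle: file_get_contents returns false when the page cannot be read (and reading URLs needs allow_url_fopen enabled in your PHP configuration). Here is a rough sketch of a safer wrapper. The name fetch_source and the 5 second timeout are my own choices for the illustration, not something built into PHP, and the demo reads a temporary local file in place of a real page so that it runs without a network connection.

```php
<?php
// fetch_source() is a hypothetical helper name, not a PHP built-in.
// It returns the source as a string, or null when the read fails.
function fetch_source($location, $timeoutSeconds = 5)
{
    // The 'http' options apply to http:// and https:// locations only;
    // they are ignored for local files.
    $context = stream_context_create(array(
        'http' => array('timeout' => $timeoutSeconds),
    ));
    // file_get_contents returns false on failure; @ hides the warning
    // so that we can report the problem ourselves.
    $source = @file_get_contents($location, false, $context);
    return ($source === false) ? null : $source;
}

// Stand-in for a remote page: a temporary local file (assumption made
// so the sketch runs offline). Pass a URL instead in real use.
$tmp = tempnam(sys_get_temp_dir(), 'src');
file_put_contents($tmp, "<html><body>This is a test !</body></html>");

$page = fetch_source($tmp);
echo ($page === null) ? "Could not retrieve the page." : htmlentities($page);
unlink($tmp);

// A location that cannot be read yields null instead of a warning.
$missing = fetch_source($tmp . '.does-not-exist');
echo ($missing === null) ? "<br>Could not retrieve the page." : htmlentities($missing);
```

Testing the result against null before going further keeps the rest of the script from working on an empty page.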





III ) Code your own supportforums last connected members manager



Ok now, we know almost everything... let's get to work!
What are we going to do for a final example?
We are going... to retrieve the "Who's online" list of supportforums for the past 24 hours!
It's simple: it's displayed at the bottom of the index page.

Let's start :P

For a job like this, I advise you to look carefully at the source code of the index...
Our first goal is to get only the list out of the source, so we have to filter out the rest.
Schematically, here is what we are going to do: keep only the minimum by using explode.
Have you noticed the "<!-- start: online24_index -->" and the "<!-- end: online24_index -->" before and after the list? We are going to use them to delimit the list!
[Image: acas.png]

Now ... let's code that !
PHP Code:
<?php
echo "<hr> // LIST OF MEMBERS THE PAST 24 HOURS<br>";
$url = "http://www.supportforums.net/";
$page = file_get_contents($url); // We put the source code of the index in $page
$page = explode("<!-- start: online24_index -->", $page); // This marker sits just before the list of connected members... We are lucky!
$filter1 = $page[1]; // $page[0] = all the source before the "start: online24_index" marker, $page[1] = all the source after it
$filter2 = explode("<!-- end: online24_index -->", $filter1); // We are lucky again! This marker sits at the end of the list!
$connected = $filter2[0]; // $filter2[0] = everything before the end marker, $filter2[1] = everything after it
echo $connected;
?>

This outputs:
Quote:// LIST OF MEMBERS THE PAST 24 HOURS

Past 24 Hours Stats [Complete List]


Members: 202, Guests: 204, Bots: 2, Invisible: 10
TheAdept, S0rath 0f the Black Sun, Jake, dante1217, Red X, MreGSX, g4143, ProspectDotNet, n4q, SF Phantom, Lunation, Elektrisk, Rofl, IllusionSlayer, SuperFly, Combo, soulscore, DAMINK™, Ram, Agent, Shinkirou, mon3yexploit, chinaman7x7, J4P4NM4N, nullsession, Cellydy, rated, Glas, Fraggounnet, Conan, iBruteforce, RedTube, Pins222, h4ck3d-, lil PopTart kid, Jamza, kab012345, JTse, Gentleman, Dave1005, MrD., Kotel, Tim, Conspiracy, Psycho, MarkW7, Stevensan, Annuit Coeptis, Dismal, Acekidd01, Namic, PaNiK, Bartdevil, GizSho, d2ax5n, Codine, Scorpion, krzyflipx, R3c0nn1ssanc3, Skill, Nyx-, Goku's Nightmare, Wynd, ThaDunky, -NitroX, YoungHacker, Mephisto, Vorfin, tartou2, Moudi, Sparky, Camgaertner, juan9087, JonP, SMITHY558, fade, Raptor Jesus, GhostRaider, Omniscient, Code King, Pirata Nervo, brett7, Lex, Lord_Scorch, Madafaka, imrans110, joey23art, safin_hawlery, Sp33D, corinaw, Nighthawk, kennwel, sani, kojic, Linux, ...w3, Guerriero420, JDBar, TheDoctor, Grizzly, TechWizard86, miketh2005, headbustaz, Scorpion, Jolty, zenzul, darkvenom, Etheryte, Alowishus, bobslayer1, MyNameIs940, Josh G, Sam, Wirgles, nullsession, Templya, elobire, waseembhtt, InsideSin, n1tr0b, enter, Manutdfans7, ElephantShoe, PaNiK, Soldier of Fortune, BrainDeadFreak, Fernandez, Guerreiro, X-net, iHk, Xeno, L0iz, B22stard, Soulstoler, loge00, hey101, Yeas, SupportMaster, ReVamped, Steve Torres, steelprime, Cyberknife, Nightelf21, rafailos, LOL, Eustaquio, sani, cohen, 4rtl1n3r, WaTy, Team AXVIS, netvirus, Warbiest, proxy, ArmyyStrongg, Extasey, -JD-, Š Λ☨∀И, player_1, HaruhiSuzumiya, xion434, Pandemic, cstactics, El Zorro, Butters, VantisH, nevets04, AL3X, 127.0.0.1, Michael, CstmFXeD, Fallen, tusku, Cyberknife, Cppsean, XpecTz, deicider, SpydR, caspur, Kraut, Xch4ng3, Dimitri, Tm0, DasHaxer, Spunch, Lazydude2000, goku12205, highfuzz, killagamea1341, Socrates, Jakee, SuperBorn


Hey... I guess we have reached our goal :) !
Congratulations!
But remember: it won't always be that easy!
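
For instance, if the forum changes its template and one of the two markers disappears, reading index 1 of the exploded array will fail. Here is a minimal sketch of the same two-explode technique wrapped in a function that checks for the markers first. The name extract_between and the sample string are my own inventions for the illustration.

```php
<?php
// extract_between() is a hypothetical helper name, not a PHP built-in.
// It returns the text found between the two markers, or null if a marker is missing.
function extract_between($source, $startMarker, $endMarker)
{
    // Cut once on the start marker: [0] = before it, [1] = after it.
    $afterStart = explode($startMarker, $source, 2);
    if (count($afterStart) < 2) {
        return null; // start marker not found
    }
    // Cut once on the end marker: [0] = before it, [1] = after it.
    $beforeEnd = explode($endMarker, $afterStart[1], 2);
    if (count($beforeEnd) < 2) {
        return null; // end marker not found
    }
    return $beforeEnd[0];
}

// Made-up sample standing in for the real index page.
$sample = "header<!-- start: online24_index -->Jake, Tim<!-- end: online24_index -->footer";

$list = extract_between($sample, "<!-- start: online24_index -->", "<!-- end: online24_index -->");
echo ($list === null) ? "Markers not found" : $list; // Jake, Tim
```

The limit of 2 passed to explode makes sure the string is cut only on the first occurrence of each marker.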

If you want, you could try to filter out certain names, using str_replace for example!
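
Here is a small sketch of that idea, assuming the list has already been reduced to plain names separated by ", " (the real page source may wrap each name in HTML tags, so adapt accordingly). The names and the "[hidden]" label are picked only for the illustration. Watch out: str_replace also matches inside longer names, so the sketch shows an exact-match variant built on explode and implode as well.

```php
<?php
// Made-up list standing in for $connected, assumed to be plain text.
$connected = "Jake, Tim, Conan, Timothy";

// Simple case: the name does not appear inside any other name.
$masked = str_replace("Conan", "[hidden]", $connected);
echo $masked . "<br>"; // Jake, Tim, [hidden], Timothy

// Pitfall: "Tim" is also the beginning of "Timothy".
$naive = str_replace("Tim", "[hidden]", $connected);
echo $naive . "<br>"; // Jake, [hidden], Conan, [hidden]othy

// Exact match: split the list, compare whole names, then glue it back together.
$names = explode(", ", $connected);
foreach ($names as $i => $name) {
    if ($name === "Tim") {
        $names[$i] = "[hidden]";
    }
}
$exact = implode(", ", $names);
echo $exact; // Jake, [hidden], Conan, Timothy
```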

Now, let your imagination work, and think of all the possibilities this opens up... keeping all the data in a database, computing some statistics, etc.!
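
As a starting point for the statistics idea, here is a tiny sketch, again assuming a plain list of names separated by ", " (the sample list is made up). It counts the entries, removes duplicates, and counts how many times each name appears.

```php
<?php
// Made-up list standing in for $connected, assumed to be plain text.
$connected = "Jake, Tim, Jake, Conan";

$names  = explode(", ", $connected);      // one array element per name
$total  = count($names);                  // number of entries, duplicates included
$unique = array_unique($names);           // each name kept once
$counts = array_count_values($names);     // name => number of appearances

echo "Entries = " . $total . "<br>";               // Entries = 4
echo "Unique members = " . count($unique) . "<br>"; // Unique members = 3
echo "Jake appears " . $counts["Jake"] . " times";  // Jake appears 2 times
```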
It's your turn now!



> Want to see what it looks like for real?

You can see it here:
http://supportforum.alwaysdata.net/Web-s...source.php



> Want to download all the sources of this tutorial?

You can download them here:
http://supportforum.alwaysdata.net/Web-s...source.zip


Hope it helps ! :)
#2
Nice tutorial. Gotta bookmark this for future reference.
#3
This is a great tutorial and also well designed.
Thank you!
#4
Thanks for your replies!
Glad to help :)

Don't hesitate to ask for help if you use this tut.
As you surely know, there are many other very useful string functions for this kind of task.