Friday, December 19, 2014

Parse Website with PHP

Parsing web pages is a recurring challenge for PHP developers. It is used for retrieving images, scraping and crawling websites, detecting errors on pages, and many other useful purposes. PHP Simple HTML DOM Parser is an excellent library for developers who need to work with DOM elements from PHP.


Download the Simple HTML DOM Parser (simple_html_dom.php) and save the file to any directory of your choice. In this first basic example, we include the library and print every hyperlink found on http://4evertutorials.blogspot.com .

<?php
/*
Online PHP Examples with Source Code
website: http://4evertutorials.blogspot.com/
*/

// Include the parser library
include('simple_html_dom.php');

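// Retrieve the DOM from a given URL
$html_dom = file_get_html('http://4evertutorials.blogspot.com/');

// Depending on the library version, file_get_html() may return false
// when the page cannot be fetched, so stop here instead of calling find() on it
if (!$html_dom) {
    die('Could not load the page.');
}

// Scan for all hyperlinks and print one per line
foreach ($html_dom->find('a') as $element) {
    print $element->href . "\n";
}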

?>
Below are a few basic uses of PHP Simple HTML DOM Parser.
<?php
/*
Online PHP Examples with Source Code
website: http://4evertutorials.blogspot.com/
*/

// Include the parser library
include('simple_html_dom.php');

// Retrieve the DOM from a given URL
$html = file_get_html('http://4evertutorials.blogspot.com/');

// Stop if the page could not be loaded
if (!$html) {
    die('Could not load the page.');
}

// Find all anchor tags and print their HREFs, one per line
foreach ($html->find('a') as $e) {
    echo $e->href . "\n";
}

// Retrieve all images and print their SRCs
foreach ($html->find('img') as $e) {
    echo $e->src . "\n";
}

// Find all images and print their full markup, including the "<" and ">"
foreach ($html->find('img') as $e) {
    echo $e->outertext . "\n";
}

// Find all SPAN tags that have a class of "SpanClass"
foreach ($html->find('span.SpanClass') as $e) {
    echo $e->outertext . "\n";
}

// Find all TD tags with "align=center" and print their inner HTML
foreach ($html->find('td[align=center]') as $e) {
    echo $e->innertext . "\n";
}

// Extract the plain text of the second matching cell (the index is zero-based).
// find() returns null when there is no such cell, so check before using it.
$cell = $html->find('td[align="center"]', 1);
if ($cell) {
    echo $cell->plaintext . "\n";
}

// Find the DIV tag with an id of "DivId"
foreach ($html->find('div#DivId') as $e) {
    echo $e->innertext . "\n";
}


?>
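
To see how outertext, innertext, and plaintext differ, it can help to run the parser on a small inline string instead of a live page. The sketch below assumes simple_html_dom.php sits in the same directory as the script and uses the library's str_get_html() function. The sample markup and the variable names ($markup, $span, $link, $results) are made up for illustration only.

```php
<?php
// Include the parser library (assumed to be in the same directory)
include('simple_html_dom.php');

// Hypothetical markup, used only to illustrate the element properties
$markup = '<div id="DivId"><span class="SpanClass">Hello <b>World</b></span>'
        . '<a href="/about.html">About</a></div>';

// Build the DOM from a string rather than from a URL
$html = str_get_html($markup);
if (!$html) {
    die('Could not parse the markup.');
}

// Passing an index as the second argument returns a single element
$span = $html->find('span.SpanClass', 0);
$link = $html->find('a', 0);

// Copy the values out before the DOM is cleared
$results = array(
    'outertext' => $span->outertext,  // the tag itself plus its contents
    'innertext' => $span->innertext,  // the contents only, inner tags kept
    'plaintext' => $span->plaintext,  // the contents with tags stripped
    'href'      => $link->href,       // the value of the href attribute
);

// Release the memory held by the DOM
$html->clear();

foreach ($results as $name => $value) {
    echo $name . ': ' . $value . "\n";
}
?>
```

In the output, the outertext line shows the whole SPAN tag, the innertext line keeps the inner B tag but drops the SPAN wrapper, and the plaintext line shows the text alone.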
 

© 2014 4everTutorials. All rights reserved.
