Scraping a website with Laravel & Elvedia\Goutte: how to extract JSON

I managed to successfully access a remote JSON resource using Goutte with Laravel 4:

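$client = Goutte::getNewClient();

// Log in through the admin form (remove the first slash of the next line to comment this block out)
//*
$crawler = $client->request('GET', 'http://domain.mg/admin');

$form = $crawler->selectButton('Login')->form();
$crawler = $client->submit($form, array('username' => 'username', 'password' => 'password'));

//*/

// Request the protected resource using the authenticated session
$crawler = $client->request('GET', 'http://domain.mg/usergroup/list'); // Yields JSON Response

// Dump the crawler to inspect what came back
return dd($crawler);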

It yields an output like so:

object(Symfony\Component\DomCrawler\Crawler)#285 (4) { ["uri":protected]=> string(36) "http://domain.mg/usergroup/list" ["defaultNamespacePrefix":"Symfony\Component\DomCrawler\Crawler":private]=> string(7) "default" ["namespaces":"Symfony\Component\DomCrawler\Crawler":private]=> array(0) { } ["storage":"SplObjectStorage":private]=> array(1) { ["0000000075faaa10000000001af55ef8"]=> array(2) { ["obj"]=> object(DOMElement)#241 (17) { ["tagName"]=> string(4) "html" ["schemaTypeInfo"]=> NULL ["nodeName"]=> string(4) "html" ["nodeValue"]=> string(438) "[{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},{"id":2,"group_name":"Profil pour les comptables","group_desc":"Comptables","group_level":2},{"id":3,"group_name":"Validateur d'op\u00e9ration","group_desc":"Superviseur","group_level":9},{"id":18,"group_name":"No Comment","group_desc":"Autres employ\u00e9s","group_level":6},{"id":41,"group_name":"Invit\u00e9","group_desc":"Guest","group_level":2}]" ["nodeType"]=> int(1) ["parentNode"]=> string(22) "(object value omitted)" ["childNodes"]=> string(22) "(object value omitted)" ["firstChild"]=> string(22) "(object value omitted)" ["lastChild"]=> string(22) "(object value omitted)" ["previousSibling"]=> string(22) "(object value omitted)" ["attributes"]=> string(22) "(object value omitted)" ["ownerDocument"]=> string(22) "(object value omitted)" ["namespaceURI"]=> NULL ["prefix"]=> string(0) "" ["localName"]=> string(4) "html" ["baseURI"]=> NULL ["textContent"]=> string(438) "[{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},{"id":2,"group_name":"Profil pour les comptables","group_desc":"Comptables","group_level":2},{"id":3,"group_name":"Validateur d'op\u00e9ration","group_desc":"Superviseur","group_level":9},{"id":18,"group_name":"No Comment","group_desc":"Autres employ\u00e9s","group_level":6},{"id":41,"group_name":"Invit\u00e9","group_desc":"Guest","group_level":2}]" } ["inf"]=> NULL } } }

I got stuck extracting/converting the JSON held inside the $crawler object. How can that be done?

Delving into the documentation of the Symfony\Component\DomCrawler\Crawler class, I found:

public string html()

    Returns the first node of the list as HTML.

    Return Value

    string  The node html

which works as I expected.

Turning return dd($crawler) into return ($crawler->html()) yields:

[{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},{"id":2,"group_name":"Profil pour les comptables","group_desc":"Comptables","group_level":2},{"id":3,"group_name":"Validateur d'op\u00e9ration","group_desc":"Superviseur","group_level":9},{"id":18,"group_name":"No Comment","group_desc":"Autres employ\u00e9s","group_level":6},{"id":41,"group_name":"Invit\u00e9","group_desc":"Guest","group_level":2}]

Conclusion

Goutte handled the complex login process (Laravel with its CSRF mechanism) very well, but I dislike extracting the JSON string using html().

Using return ($crawler->text()), which gives the same outcome, is more "neutral" in my opinion.
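Both html() and text() read the body back out of a DOM that the crawler built by parsing the response as HTML, so JSON containing characters such as < or & could come back altered. A possible alternative, sketched here under the assumption that the Goutte client exposes the usual BrowserKit getResponse() method, is to read the raw response body instead. The helper name fetchRawBody is hypothetical and not part of Goutte:

```php
<?php
// Hypothetical helper: $client is assumed to be a Goutte/BrowserKit client,
// i.e. any object exposing request() and getResponse().
function fetchRawBody($client, $url)
{
    // Issue the request; the returned crawler is not needed here.
    $client->request('GET', $url);

    // getResponse() is assumed to return the BrowserKit response object,
    // whose getContent() gives the body exactly as the server sent it.
    return $client->getResponse()->getContent();
}
```

With the logged-in $client from the question, this could be called as fetchRawBody($client, 'http://domain.mg/usergroup/list') and the result passed straight to json_decode().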

I'm not entirely clear on what you want to do with the JSON, but it's fairly simple to convert a JSON string to arrays:

$data = json_decode($jsonString, true); // true returns associative arrays; without it you get stdClass objects
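As a minimal sketch of what happens after decoding, the example below uses the first two entries of the response shown in the question as a hard-coded sample (in practice the string would come from the crawler or the response). The variable names are illustrative only:

```php
<?php
// Sample copied from the response in the question (first two groups only).
$jsonString = '[{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},'
            . '{"id":2,"group_name":"Profil pour les comptables","group_desc":"Comptables","group_level":2}]';

// Passing true asks json_decode for associative arrays instead of stdClass objects.
$groups = json_decode($jsonString, true);

// json_decode returns null on malformed input, so check before using the result.
if ($groups === null && json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON, error code ' . json_last_error());
}

// Each element is now a plain array keyed by the JSON field names.
foreach ($groups as $group) {
    echo $group['id'], ': ', $group['group_name'], ' (level ', $group['group_level'], ")\n";
}
```

Checking json_last_error() matters here because a login failure would typically return an HTML page rather than JSON, and json_decode would then silently yield null.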