My goal here is not to save any items to a database, but to just display a live stream.
I am pulling an RSS feed from Huffington Post
http://www.huffingtonpost.com/section/front-page/feed
I have a WordPress array (PHP) of the most recent 50 articles from the Huff.
$rss = fetch_feed($feed_url);
I want my RSS feed to display ONLY X total unique posts per day. To keep it simple, I was just going to display the post that is closest to each interval of 24 / X hours.
For demonstration, let's go with X = 3. The feed would show the posts published closest to hours 8, 16 (4 PM), and 24 (midnight), or equivalently 0, 8, and 16.
In PHP, how do I sort an object array by a published-time variable and then find the post closest to a given time? Right now I'm doing it in a very roundabout way that isn't even working.
Here's my current logic:
if(function_exists('fetch_feed')) {
$rss = fetch_feed(get_field('feed_url'));
if(!is_wp_error($rss)) : // error check
$maxitems = $rss->get_item_quantity(50); // number of items at 50
$rss_items = $rss->get_items(0, $maxitems);
endif;
// display feed items ?>
<h1><?php echo $rss->get_title(); ?></h1>
<?php
$coutner = 0;
$daily_max = 3; //how many unique feeds to display per day
$display_interval = floor(24 / $daily_max); //simple way to make even intervals
$posting_time = array(); //to store the times to post
foreach(range(0, $daily_max-1) as $i) {
$posting_time[$i] = $display_interval * $i;
}
$post_interval = 0;
$date = new DateTime();
$today = date("G"); //getting the current day's hour
$time_adjust = $today / $display_interval;
//adjust the posting times order so that its circular
while($today > $posting_time[0]){
$hold = array_pop($posting_time);
echo '<p>hold: ' . $hold;
array_unshift($posting_time,$hold);
}
$accessing = array_pop($posting_time);
?>
<dl>
<?php if($maxitems == 0){ echo '<dt>Feed not available.</dt>';}
else{
foreach ($rss_items as $item) : ?>
<?php
//as soon as the first item is newer than post time, output it & count that time slot as being filled
$rss_item_hour = $item->get_date('G');
if($rss_item_hour > $accessing){ ?>
<dt>
<a href="<?php echo $item->get_permalink(); ?>"
title="<?php echo $item->get_date('j F Y @ G'); ?>">
<?php echo $item->get_title(); ?>
</a>
</dt>
<dd>
<?php echo $item->get_description(); ?>
</dd>
<p>
<?php echo $item->get_date('j F Y | G');
?>
</p>
<?php $coutner = $coutner + 1;
$accessing = array_pop($posting_time);
}
else{echo '<p>else';} ?>
<?php endforeach; ?>
</dl>
<?php }} ?>
The main error right now is that the circular shifting in while($today > $posting_time[0]){ sometimes goes on infinitely. And the loop never seems to go as planned.
I built a solution based on your approach and made it much simpler. I made some assumptions, and there are a number of edge cases to consider (explained below), but I think it still achieves the basic goal of your app as is.
...
<?php
$counter = 0;
$daily_max = 3; //how many unique feeds to display per day
$display_interval = floor(24 / $daily_max); //simple way to make even intervals
$posting_time = array(); //to store the times to post
// Create a list of time intervals largest to smallest ex. [16, 8, 0]
foreach(range($daily_max-1, 0) as $i) {
$posting_time[] = $display_interval * $i;
}
?>
<dl>
<?php
if($maxitems == 0){
echo '<dt>Feed not available.</dt>';
}
else{
foreach ($rss_items as $item){
if(count ($posting_time) == 0){
break;
}
//as soon as the first item is older than the most current post time, output it & count that time slot as being filled
$rss_item_hour = $item->get_date('G');
if($rss_item_hour < $posting_time[0]){
?>
<dt>
<a href="<?php echo $item->get_permalink(); ?>"
title="<?php echo $item->get_date('j F Y @ G'); ?>">
<?php echo $item->get_title(); ?>
</a>
</dt>
<dd>
<?php echo $item->get_description(); ?>
</dd>
<p>
<?php echo $item->get_date('j F Y | G'); ?>
</p>
<?php
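$counter++;
// Remove the slot that was just filled (the first, largest element) so the next comparison uses the next boundary
array_shift($posting_time);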
}
else{
// Debug message
}
}
}
?>
</dl>
...
Ok, so since I don't have access to your fetch_feed data, this is untested, but I am happy to update if there are bugs.
This picks posts that are roughly separated by the interval you specify, but it does not check how close they are to those boundaries. For example, if the latest post is at 16:01, it will be skipped in favor of the first post before 16:00, which may be at, say, 9:00. Then it looks for the first post before 8:00, which may be at 7:59. Or, if there are no posts between 8:00 and 16:00, the first post displayed may be at 7:30, and the very next post, maybe at 7:28, will also be displayed (since it is now the first post available before 8:00), so you end up with two posts that are very close in time.
My assumption was that you are less concerned about the exact spacing and are more interested in 'thinning' out the volume of posts a little, which this should achieve and is hopefully suitable for your application.
As I said, I am happy to help you refine it if you have something specific in mind.
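To make the edge case described above concrete, here is a minimal sketch that applies the same "first post before each boundary" rule to plain hour values instead of feed items. The function name thinByHour and the sample hours are made up for illustration, and it assumes the hours arrive newest first, as the feed does.

```php
<?php
// Hypothetical helper: keeps the first hour that falls before each slot boundary.
// $hours must be ordered newest first; $slots must be ordered largest first.
function thinByHour(array $hours, array $slots): array
{
    $kept = [];
    foreach ($hours as $hour) {
        if (count($slots) === 0) {
            break; // every slot has been filled
        }
        if ($hour < $slots[0]) {
            $kept[] = $hour;
            array_shift($slots); // move on to the next (earlier) boundary
        }
    }
    return $kept;
}

// Sample data: decimal hours, with nothing published between 8:00 and 16:00.
// 7.5 is 7:30 and 7.47 is roughly 7:28.
$picked = thinByHour([16.02, 7.5, 7.47, 3.0], [16, 8]);
print_r($picked);
```

Both picks land just before 8:00, which is the "two posts really close in time" situation. If that matters, a closest-to-boundary comparison is needed rather than a simple "before" check.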
The code is a bit monolithic, because I had to declare a dummy class (called RSS) to try it out, and I also created some dummy data to help me test it.
The main part consists of two loops. The first builds up the time slots for the articles. It ensures that if the last slot is later than the current time, that slot is set to the current hour instead (so as of writing, the slots are hours 0, 8, and 15).
The second loop goes through the articles (assuming they are in date order) and finds the best slot for each one, using the difference between the article time and the slot time. Using abs() means that if article 1 is at 7:30 and your time slot is 8:00, that's 30 minutes, but if article 2 is at 8:26, that's 26 minutes, which is closer to the slot and so it will be selected. With the current logic, an article only overwrites a previous entry if it is strictly closer. So for 8:00, the article would be 7:30 and not 8:30 (for example).
This code doesn't deal with the output, purely the processing. The end result is in $posting_time[]['article']...
<?php
error_reporting ( E_ALL );
ini_set ( 'display_errors', 1 );
class RSS {
public $date;
public $desc;
public function __construct( $date, $desc ) {
$this->date = $date;
$this->desc = $desc;
}
public function get_date() {
return $this->date;
}
public function get_description() {
return $this->desc;
}
}
$rss_items = [
new RSS( new DateTime("2017-10-13 13:00:00"), 'y13'),
new RSS( new DateTime("2017-10-14 00:00:00"), 'c0'),
new RSS( new DateTime("2017-10-14 01:00:00"), 'c1'),
new RSS( new DateTime("2017-10-14 02:00:00"), 'c2'),
new RSS( new DateTime("2017-10-14 03:00:00"), 'c3'),
new RSS( new DateTime("2017-10-14 04:00:00"), 'c4'),
new RSS( new DateTime("2017-10-14 05:00:00"), 'c5'),
// new RSS( new DateTime("2017-10-14 06:00:00"), 'c6'),
// new RSS( new DateTime("2017-10-14 07:00:00"), 'c7'),
// new RSS( new DateTime("2017-10-14 08:00:00"), 'c8'),
// new RSS( new DateTime("2017-10-14 09:00:00"), 'c9'),
// new RSS( new DateTime("2017-10-14 10:00:00"), 'c10'),
new RSS( new DateTime("2017-10-14 11:00:00"), 'c11'),
// new RSS( new DateTime("2017-10-14 12:00:00"), 'c12'),
// new RSS( new DateTime("2017-10-14 13:00:00"), 'c13'),
];
$maxitems = count($rss_items);
$date = new DateTime();
$today = $date->format("y-m-d");
$currentHour = $date->format("G");
$daily_max = 3; //how many unique feeds to display per day
$display_interval = floor(24 / $daily_max); //simple way to make even intervals
$posting_time = array(); //to store the times to post
foreach(range(0, $daily_max-1) as $i) {
$hour = $display_interval * $i;
if ( $currentHour < $hour ) {
// Set end element to current time
$hour = $currentHour;
}
// Create a small structure which allows to keep the item matched against
// this time slot, and for convenience - keep the time difference as well
$articleTime = $today." $hour:00:00";
$posting_time[] = array('time'=> $articleTime,
'itime' => strtotime($articleTime),
'article' => null,
'article_diff' => 0
);
if ( $currentHour == $hour ) {
break;
}
}
foreach ($rss_items as $item) {
// Fetch the timestamp of the article we're working with
$articleTime = $item->get_date()->getTimeStamp();
// Look for right place to put this article
foreach ( $posting_time as $key=>$interval ){
// If this posting time hasn't got an article, or the
// time difference is smaller for this article compared to
// the one already stored there
if ( $interval['article'] == null ||
(abs($interval['itime'] - $articleTime) <
$interval['article_diff']) ) {
// Set this article as the nearest match and finish this
// inner loop
$posting_time[$key]['article'] = $item;
$posting_time[$key]['article_diff'] = abs($interval['itime']-
$articleTime);
break;
}
}
}
print_r($posting_time);
This assumes that get_date() on the article returns a DateTime. If it does not, the code will need to be changed to obtain the value as a Unix timestamp, so that the dates can be compared as numbers.
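As a minimal sketch of the abs() comparison described above, the snippet below checks the 7:30 and 8:26 articles against an 8:00 slot using Unix timestamps. The dates are sample values. For real fetch_feed() items (SimplePie), I believe $item->get_date('U') returns the timestamp directly, but treat that as an assumption to verify against your WordPress version.

```php
<?php
// Sample times are made up; they mirror the 7:30 / 8:26 example above.
$slot     = new DateTime('2017-10-14 08:00:00');
$articles = [
    'article 1' => new DateTime('2017-10-14 07:30:00'),
    'article 2' => new DateTime('2017-10-14 08:26:00'),
];

$closest     = null;
$closestDiff = null;
foreach ($articles as $name => $published) {
    // Distance in seconds, whether the article is before or after the slot
    $diff = abs($slot->getTimestamp() - $published->getTimestamp());
    // Strictly smaller, so an equally distant later article does not replace an earlier one
    if ($closestDiff === null || $diff < $closestDiff) {
        $closest     = $name;
        $closestDiff = $diff;
    }
}

echo $closest . ' is ' . ($closestDiff / 60) . " minutes from the slot\n";
```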
Considering the time as "seconds of the day" (0 - 86400) the following lines would serve your needs (simplified example):
<?php
$postTimes = array(1,600,953,1900,23500,27600,56000,72000);
echo "Closest match is: " . findMatch(24000, $postTimes); // 23500
function findMatch($needle, $haystack) {
$closest = null;
foreach ($haystack as $element) {
if ($closest === null || abs($needle - $closest) > abs($element - $needle)) {
$closest = $element;
}
}
return $closest;
}
?>
Finally you just need to implement:
getPostTimesAsSeconds($postArray); //foreach converting dates to seconds-array
and
pickPostBySecondsOfTheDay(23500); //foreach, finding the post matching the seconds of the day.
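One possible shape for those two helpers is sketched below. This is only an illustration: it assumes each post is an array with a 'date' key holding a DateTime (the real feed items will differ), and it passes the post list as a second argument to pickPostBySecondsOfTheDay, which the one-argument call above does not show.

```php
<?php
// Hypothetical implementations of the two helpers named above.
// Assumption: each post is an array with a 'date' key holding a DateTime.

function getPostTimesAsSeconds(array $posts): array
{
    $seconds = [];
    foreach ($posts as $key => $post) {
        $d = $post['date'];
        // Hours, minutes and seconds of the publication time as seconds since midnight
        $seconds[$key] = (int) $d->format('G') * 3600
                       + (int) $d->format('i') * 60
                       + (int) $d->format('s');
    }
    return $seconds;
}

function pickPostBySecondsOfTheDay(int $target, array $posts)
{
    $bestKey  = null;
    $bestDiff = null;
    foreach (getPostTimesAsSeconds($posts) as $key => $sec) {
        $diff = abs($target - $sec);
        if ($bestDiff === null || $diff < $bestDiff) {
            $bestKey  = $key;
            $bestDiff = $diff;
        }
    }
    // null when the post list is empty
    return $bestKey === null ? null : $posts[$bestKey];
}

// Sample posts (made up): 06:31:40 is 23500 seconds, 15:33:20 is 56000 seconds
$posts = [
    ['title' => 'early',  'date' => new DateTime('2017-10-14 06:31:40')],
    ['title' => 'midday', 'date' => new DateTime('2017-10-14 15:33:20')],
];
$post = pickPostBySecondsOfTheDay(24000, $posts);
echo $post['title'];
```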
Try the example below. It uses file_get_contents to fetch the XML, keeps all feed items from the past 8 hours, and prints a random selection of them. Use DOMDocument to handle the XML feed and DateTime to manage the time comparisons.
$hour_interval = 8;
$feeds = file_get_contents("http://www.huffingtonpost.com/section/front-page/feed");
$doc = new DOMDocument();
$doc->loadXML($feeds);
$items = $doc->getElementsByTagName('item');
$today = new DateTime("now",new DateTimeZone("-04:00")); // do mind the timezone it is the one set in the xml feeds so it is needed for correct time comparison
$nowTimestamp = $today->getTimestamp();
$today->modify('-'.$hour_interval.' hour');
$eightHoursBeforeTimestamp = $today->getTimestamp();
$lastEightHoursItems = [];
foreach ($items as $item) {
$pubDate = $item->getElementsByTagName('pubDate')[0]->nodeValue;
$feedDate = new DateTime($pubDate);
$feedTimestamp = $feedDate->getTimestamp();
if($feedTimestamp<=$nowTimestamp and $feedTimestamp>=$eightHoursBeforeTimestamp) {
array_push($lastEightHoursItems,$item);
}
}
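// Pick up to 3 random items; array_rand() fails if asked for more keys than the array holds
$pick = min(3, count($lastEightHoursItems));
$random_keys = $pick > 0 ? (array) array_rand($lastEightHoursItems, $pick) : [];
$c = count($random_keys);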
for($i=0;$i<$c;$i++) {
echo $lastEightHoursItems[$random_keys[$i]]->getElementsByTagName('title')[0]->nodeValue;
echo $lastEightHoursItems[$random_keys[$i]]->getElementsByTagName('link')[0]->nodeValue;
echo $lastEightHoursItems[$random_keys[$i]]->getElementsByTagName('description')[0]->nodeValue;
echo $lastEightHoursItems[$random_keys[$i]]->getElementsByTagName('pubDate')[0]->nodeValue;
}
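A note on the timezone used above, as a small sketch with sample values: Unix timestamps identify an instant independently of timezone, so the timestamp comparison gives the same result whichever zone the DateTime was created in. The zone starts to matter once you work with wall-clock values such as the hour of the day.

```php
<?php
// The same instant expressed in two zones (sample values chosen for illustration)
$eastern = new DateTime('2017-10-14 08:00:00', new DateTimeZone('-04:00'));
$utc     = new DateTime('2017-10-14 12:00:00', new DateTimeZone('UTC'));

// Timestamps count seconds since the Unix epoch, so they match
$sameInstant = $eastern->getTimestamp() === $utc->getTimestamp();

// Wall-clock hours differ, which matters when bucketing posts by hour of day
$easternHour = (int) $eastern->format('G');
$utcHour     = (int) $utc->format('G');

var_dump($sameInstant);
echo "Eastern hour: $easternHour, UTC hour: $utcHour\n";
```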