Turn Aggregator items into true Drupal Nodes

Drupal 8 should address the longstanding issue with items from the core RSS Aggregator module existing as simple database objects rather than nodes, which limits what you can do with them in many ways.

The core Aggregator is really a big deal for me. The database created from these items is one of the biggest draws to my site. So I have tried many things to work around the limitations of database objects.

I flirted with the idea of dumping the table to a flat file then importing it. That would work, can be CRONed and all that. But I really want to keep my solution specific to Drupal and not engage in a one-off like that.

The core mod has a categorization interface that I really like. It allows me to go through 100s of items every day pretty quickly and assign relevant topics to one of about 20 categories.

So, this is what I did to get the aggregator db items into nodes.

I created a View of the Aggregator items. I created an RSS feed based on this View. Then I used Feeds Importers to import that RSS feed into a content type that I created just for that. This will work and you will be able to get quite a bit of data from the importer. But it isn’t as flexible as I wanted. I simply could not get all the fields that I wanted to come across, even when forcing fields. The import will work, but I always got undesirable formatting for the links. Or the links would come across malformed, with the name of my site prepended to the URL, for example.

I really can’t use the Feeds Importers module to import all of the items natively either. It will work, and you can build an interface of sorts in Views using VBO to assign taxonomy terms to the items, but it is kludgy and doesn’t scale well. It can be a viable method if you don’t have a lot of volume.
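To make the malformed-link problem concrete, here is a minimal Python sketch of a cleanup step. It assumes the broken links look like the site's base URL followed directly by the original absolute URL. That exact shape is an assumption, and the function name and the example.com addresses are made up for illustration.

```python
def strip_prepended_site(link: str, site_base: str) -> str:
    """Remove a wrongly prepended site base URL from a link.

    The base is only stripped when what follows it is itself an
    absolute URL, so normal internal links are left alone.
    """
    base = site_base.rstrip("/") + "/"
    if link.startswith(base):
        rest = link[len(base):]
        if rest.startswith(("http://", "https://")):
            return rest
    return link


if __name__ == "__main__":
    # Hypothetical malformed link, as an importer might produce it.
    broken = "http://example.com/http://news.example.org/story/1"
    print(strip_prepended_site(broken, "http://example.com"))
```

A step like this would have to run after the import, outside of Feeds, so it is a workaround rather than a fix for the importer mapping.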

So, I want to use the core module for its stability and ease of categorization. But I need to be able to do things like allow users to see the items and make comments on them. I say comments but I really mean take notes. So, I used a Views Content Pane (with override URL and AJAX enabled) with a Node Add/Edit Variant in a Panel Page to allow users to see the Aggregated items, drag URLs from the View, take notes and alter the View (that’s where AJAX is magic) from the same place. Here is a screenshot of what it looks like.

Image

Then, I created another View of the Content Type “Research”, which is used to hold the notes, links and such. I created a simple Content Pane View that shows the title of the Nodes (I use the Private Module to keep these posts completely private) and enabled AJAX so that the View Content Pane refreshes without having to refresh the whole page. Now I have a very nice research interface for my members with an easy way to reference my data.

Getting Aggregated RSS Items into Nodes

Well, I am throwing in the towel on this. I have made it work, but it just doesn’t work very well. Too bad, because as nodes the stories could be indexed and searched the same way that the rest of the site can. But all I can get from the current setup is a title and a description. I am going to leave the RSS items as database items and use Views to search through them.

Drupal to Twitter – Complete (and to Facebook too!)

Wow, I am finally done with this. I now have my categorized RSS feed going straight to a Twitter account and a Facebook account. Both of these accounts are specific to my web site but that doesn’t really matter. The real key to getting this to work is Twitterfeed.com. This is a free service that will take RSS feeds and consolidate them into Twitter, FB and LinkedIn; there are a couple of others too, I believe. But Twitter and FB are enough for me.

The Twitterfeed part wasn’t too bad to connect but it did require some tweaking to get it to work with what I have properly. If you decide to use Twitterfeed.com, be aware that the advanced options are more likely to need tweaking than advanced options usually are.

And what do I have?

  • 50+ RSS Sources from all over the state of TN in the USA.
  • 15 Categories configured in the Drupal Core Aggregator module. I am not using Feeds at all; it isn’t flexible enough for this case.
  • 1000 individual news stories that come in every day from these RSS Sources.
  • A Drupal View that displays the News items that I categorize as they come in.
  • An RSS “Feed” of that View that is available via http://docresource.org/news/24-twitter. This is a part of Views. Click Add, and click Feed.
  • A Twitter feed (@docresource) and a FB account (DOCResource) that need to get the same info that the View displays.

The Twitter module was not needed to do this. Just Views and the Core Aggregator module. There are good reasons to use the Twitter mod; they just don’t apply here.

Integration with Twitter – RSS to Twitterfeed.com

I have spent three days working on this. I have gone down many a fruitless avenue. You can post to Twitter from Drupal just like it says. But all you can do is post a specific piece of content, one at a time. That may be fine for many different circumstances. But what I have is a “News Ticker” of sorts on my main page: a scaled-down version without links for anonymous viewers and a more involved one for authenticated users. And I want that type of feed to be used in Twitter. I want Twitter to show the news items as they come in and are categorized by me.

So, I have worked with this. And in theory it should work:

Core Aggregator module --> Views -->

Sub View that has an RSS feed that goes to a page URL -->

Feeds Importer from the Feeds module that pulls from the URL of the RSS feed created by Views -->

Content Type that I created and configured in the Importer -->

That Content Type automatically posted to Twitter.

All of this works. The Aggregated items go into the View with the RSS feed, which has a URL. My email client gets all of the RSS items, no problem. The importer uses that same RSS URL to pull the aggregated items in as Nodes attached to the content type that is specified in the Twitter config to be posted automatically. But they don’t post.

They are there, in the content, with the correct user name for publishing, all that. And all I have to do is edit that particular node and then save it, and it will then post. BUT, not until then. I believe that the Twitter function is not called until you edit the node and then save it. So I tried to use the Views Bulk Operations module to do all sorts of things to those nodes, including bulk saving the nodes, just like I do when I edit and save. But it is the clicking of edit on the node that calls the function. So, they would just not post automatically. And I am not going to build a site with a bunch of manual admin tasks.

So, I kept looking. And I found this article. And then I found Twitterfeed.com. This is precisely what I wanted. And it is free. I am very happy. This had driven me nuts for three days. Don’t get me wrong; the Twitter module works like it is supposed to. It just wasn’t what I needed. And this is.

Turn a Drupal View into an RSS Feed

In my quest to get Aggregator DB items into Twitter automatically, I have finished one part that in itself is pretty good. I now have an RSS feed available for the View in its display on the front page of the site via the little RSS icon. You’ll need to sign in to see the View in the center and the RSS icon at the bottom, but you can sign in with Facebook. Here is what I did to make this part work.

Under Web services, go to RSS publishing and make sure that you have these set:

  • Feed description: add something for the hover on the RSS icon. This is the description of your site, included in each feed.
  • Number of items in each feed: 10. This is the default number of items to include in each feed.
  • Feed content: Titles only, Titles plus teaser, or Full text. This is the global setting for the default display of content items in each feed.

Then, go to your View. Here is a screenshot of what I have:
view feed

This is really it. Now I have an RSS icon for the View. But the main thing that it is going to help me with is getting the RSS feed into an importer from Feeds, then getting it into a Content Type that can be published automatically to Twitter. When I manage to make this work seamlessly, I’ll post it.

Posting Aggregator Content to Twitter II

This is going to work. I actually made it work but missed the series of steps last night. But I will reproduce it today.

Here is what I need to do:

  • Have Twitter display my aggregated news items (from RSS feeds) as they are classified by me using the Core Aggregator Module. I already have the Twitter integration mod installed and working.

Here is the issue:

  • Aggregator Core doesn’t create nodes from these items, just simple db objects
  • The Feeds Module will create nodes from RSS items, but the classification isn’t nearly as easy as it is with Core, and I have over 100 items per day to classify, so I have to use Core

Here is what I am doing to work with existing code/mods.

  • Create a feed version (in Views) of the View for the News Feeds that shows News items from the last 24 hours. This View is the end result of the categorization that I do.
  • Now I have an RSS Feed for the View. I can go to my email client, put that URL in and get the News items that I categorize as RSS items. And I have a nice little RSS Feed icon at the bottom of the Panel that holds the View
  • Import that created RSS feed into the Feeds module using a Feeds importer
  • The importer puts the items into a Content Type created for the Twitter integration. This part works as a manual process: when I create content with that Content Type, it goes into Twitter
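The second step above, reading the View's RSS feed the way an email client or an importer would, can be sketched in a few lines of Python using only the standard library. This is an illustration under assumptions, not what Feeds does internally: the sample feed, its URLs and the function name are all made up, and the real feed would be fetched from the View's feed URL instead of a string.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for the RSS 2.0 output of the View's feed display.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example news View</title>
    <link>http://example.com/news</link>
    <description>Made-up sample feed</description>
    <item>
      <title>First story</title>
      <link>http://news.example.org/story/1</link>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://news.example.org/story/2</link>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>
"""


def parse_rss_items(xml_text: str) -> list:
    """Return one dict per <item> with its title, link and description."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["title"], "->", entry["link"])
```

Title, link and description are the standard RSS item fields, so whatever consumes this feed has only those to work with unless the View adds more.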

This all works like it should except that I am missing one thing somewhere. I have made it work once but then I changed something (it was late last night and I was half asleep) and missed what it was that I needed. But I will make this work again.

Aggregators, CRON Jobs and Drupal cleanup.

This was a really involved project. If you use the Aggregator Core module a lot, take a look. I depend on Aggregators more than anything right now and have really had to do some involved work with it. Read on:

I have aggregator needs that the core doesn’t quite give me. But it does work pretty well. Here is what I collect:

  • 50+ feeds from various newspapers culled hourly resulting in several hundred articles per day.
  • Each RSS source is categorized (automatically, by default in Drupal) as z-Uncategorized, which corresponds to a CID (in the Drupal DB) of 22.
  • As the articles come in, I review and categorize them. I have a shortcut to the z-uncategorized category of items. That gives me all the new items, regardless of the source in one place where I can categorize them quickly by clicking on the categorize tab provided by Core. I keep about 10% of the stories that come in.
  • Because the newspapers maintain articles in their RSS feeds for a period of time beyond my control, the articles are re-added to Drupal’s DB whenever the feed is pulled, but now listed with two categories. There are now two entries for each of these stories with the same IID but a different CID. It looks like this below. There is the default z-uncat… category and the Juvenile category that I chose before the feed was queried again.
  • Even though this looks like one record, it is really two different records in the tables. So, if I look at the aggregator_category_item table, I can see two records for the one IID: one with a CID of 22 (the default, z-uncategorized) and the other with whatever I assigned it to. So, I can run a query and delete all with category 22. But until the newspaper removes the story from THEIR feed, it continues to come through.
  • I perform a nightly clean up where I delete all the 22s. This occurs when the papers are slow and new items have all been categorized by me.
  • Eventually (after a few days for most news sources) the stories are removed from the papers’ RSS feeds and do not get repopulated in Drupal with the default of CID 22. So then I am left with a nice single record in the category that I have assigned it to. By cleaning up every night, I get rid of stale 22s as the newspaper removes them from their RSS feed and I don’t have to think about whether they still have it or not.

Image
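To make the two-records-per-story situation concrete, here is a small Python sketch that uses an in-memory SQLite table as a stand-in for aggregator_category_item. The table is reduced to the iid and cid columns, and the item IDs and category 7 are made-up values. It only illustrates the idea of the nightly delete; it is not the job I actually run.

```python
import sqlite3

UNCATEGORIZED_CID = 22  # z-Uncategorized, as described above


def cleanup_uncategorized(conn, cid=UNCATEGORIZED_CID):
    """Delete every category assignment for the given cid.

    Returns the number of rows removed.
    """
    cur = conn.execute(
        "DELETE FROM aggregator_category_item WHERE cid = ?", (cid,)
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny sample table, run the cleanup, return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aggregator_category_item ("
        "iid INTEGER NOT NULL, cid INTEGER NOT NULL, "
        "PRIMARY KEY (iid, cid))"
    )
    conn.executemany(
        "INSERT INTO aggregator_category_item (iid, cid) VALUES (?, ?)",
        [
            (101, 22),  # story 101 re-added with the default category
            (101, 7),   # story 101 in the category assigned by hand
            (102, 22),  # story 102, reviewed and not kept
        ],
    )
    deleted = cleanup_uncategorized(conn)
    remaining = conn.execute(
        "SELECT iid, cid FROM aggregator_category_item ORDER BY iid, cid"
    ).fetchall()
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

After the delete, story 101 is left with a single row in the category that was assigned by hand, which is the end state described in the list above.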

This is the cron job that I have to do the clean up.

0 22 * * * /usr/bin/mysql --defaults-file="/home/xxxx/.my.cnf_cron" -e "DELETE FROM drupal.aggregator_category_item WHERE aggregator_category_item.cid = 22" >>/dev/null 2>&1

The .my.cnf_cron file contains the authentication information:

[client]
host=localhost
user=crondel
password=*****

The user and password belong to a MySQL user that I created specifically for this job.

The 0 22 * * * means that it will run at 10 PM EST every night. EST because that is the time zone for the server.
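If the five schedule fields are unfamiliar, this short Python sketch splits a crontab line into its named parts. The field order (minute, hour, day of month, month, day of week) is the standard crontab layout; the function name is made up for illustration, and the sketch does not validate ranges or special strings such as @daily.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_schedule(entry: str) -> dict:
    """Map the first five whitespace-separated fields of a crontab line
    to their names. Anything after them is the command and is ignored."""
    parts = entry.split(None, 5)
    if len(parts) < 5:
        raise ValueError("a crontab entry needs five schedule fields")
    return dict(zip(CRON_FIELDS, parts[:5]))


if __name__ == "__main__":
    print(split_cron_schedule("0 22 * * * /usr/bin/mysql"))
```

For the job above that gives minute 0 and hour 22, with every day of the month, every month and every day of the week.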

Here are the specific rights for the crondel account on the Drupal DB, which is named drupal.

GRANT USAGE ON *.* TO 'crondel'@'localhost' IDENTIFIED BY PASSWORD '*6E52D2AA6010C379DE1AE3BC559E2416A9A5C513';
GRANT SELECT, DELETE ON `drupal`.`aggregator_category_item` TO 'crondel'@'localhost';

The account needs SELECT rights to execute the WHERE condition of the SQL statement in addition to the DELETE FROM on the specific table in the DB.

You might ask, why not do all this with Feeds? Well, I did try to do it with Feeds. I spent quite a bit of time with it. But Feeds grabs each RSS item as a node. And I could not figure out an easy way to categorize the hundreds of stories per day when they all come in as nodes. And since this DB will eventually be huge, with 100k+ stories in a searchable archive, I think that it may be easier to keep it this way. I just had to figure out what to do with the extra 22s. And this solution seems to work.

Ug. This was a pain. And if you want to know more about the subject or I have been unclear, let me know and I’ll try to clarify.

Aggregator categories, Views, Feeds, Panels and a solution

I am very happy today. I finally found a solution to the issue that I was having with Views and Aggregator items from the core mod. Aggregator is core and Views is contrib, so this patch doesn’t involve hacking the core. But be aware that an updated version of Views might break this again. I’m on Views 7.x-3.5.

Here is the issue.

In Views, you can create a view for Aggregator items. But even though Categories is listed as an available field to add or filter against, it is not available unless the category is the default category for that feed. When you go through the items and categorize them manually, that assigned category won’t show up in the View. This patch fixes that.

http://drupal.org/node/498438#comment-7063554

This is what I have: 20 RSS feeds from news sources all over the state. This is a catch-all and I am only interested in certain stories. So, I have categories created to group the stories that I want together, regardless of the source. Every item that comes in has a default category of New. That way, I can see all new items together for categorization. This all works. It’s a pain; I have to look at about 40-50 items per day and categorize each one. But it is a big deal because I am dealing with a focused target audience. A niche. And stories that deal with their situation are important and need to be easy to get to.

So, I also need the power of Views. Views gives me much better presentation options when used by Panels than the Blocks interface does. Blocks is clunky and Panels is much better. And Views integrates well with Panels.

Now, you might ask, “why not use the Feeds Module?” Well, I did. For hours and hours. And while it will do what I want, it is much more difficult for me (AT MY PRESENT LEVEL OF UNDERSTANDING*) to edit, categorize and tag individual nodes created by Feeds for each new news item. That takes a lot longer. And until I can write a module that does autotagging in a way that will work for my type of content, I need to be able to use the Aggregator categories the way they are intended.

So, I implemented the patch. And it does work. I had to do it manually because code has been added to the Views mod since the patch was written and the line numbers are different. But it is really short. I’m on Views 7.x-3.5, Core 7.19. The patch goes into the modules/aggregator.views.inc file within the Views folder.

If you want to see why Views doesn’t work properly with Aggregator items, read the post. The explanation is in there. This was a really frustrating experience but, like all of those types of experiences, I am better off for it.