Getting Aggregated RSS Items into Nodes

Well, I am throwing in the towel on this. I have made it work, but it just doesn’t work very well. That is too bad because, as nodes, the stories could be indexed and searched the same way that the rest of the site can. But all I can get from the current setup is a title and a description. I am going to leave the RSS items as db items and use Views to search through them.

Mailhandler Module

Well, I have made more progress on this and understand it pretty well now.

  1. Mailhandler Mailbox – [Name is tncourts@domain.com] Connects you to the POP3/IMAP4 Mailbox.
  2. Feeds Importer [TNCourts] – This is one of the big ones. You tie the IMAP Fetcher and the IMAP Parser to the Content type that you should already have created. In this case, it is TNCourts.
  3. Content Type [TNCourts] – I think this is where I have some work to do because the body of my email is there but isn’t showing up in the View that I have created.
  4. Content [TNCourts] – I unpublish this so it won’t show in my view but you have to have it. Even though it appears as a piece of content the same way that the nodes created by this process are, it is the master for those and the import will break if you delete it. To keep it from showing in Views, I just unpublish it.
  5. Nodes [titles from the subject] – One node is created for each imported email, titled with the email’s subject if you made that choice in the Feeds importer mapping in step 2 (Subject source, Title target).
  6. Views – Create one to display the content type TNCourts where the content is published.

This is a tricky area. But what you can get from it is really worth the effort.

Mailhandler Module

Well, I’m just about done with this. And it was a bit of a chore. But it does work, and well at that. Here are some of the points to consider:

  • Have a mailbox that is accessible over the proper TCP port. When I tested secure IMAP earlier, it didn’t work, and I am sure that my provider’s firewall is blocking the secure ports (993 for IMAP4, 995 for POP3). Other than that, the POP3/IMAP4 stuff is pretty straightforward.
  • Make sure that the email address that is sending the email to the POP3/IMAP4 account is registered to one of your user accounts. That is what the “from” authentication is all about. It has to see that the from field is known so it can be seen as an Authenticated User.**
  • Make sure that the “authenticated user” role is permitted to create content for the content type that the feed importer is attached to. This is set in the Drupal Permissions part of the user accounts.
  • Make sure that the Feeds Importer that you set up looks like the next bullet.
  • Basic settings

    • Attached to: Static Text – this is the content type that I created for email import. You can and should create your own.
    • Periodic import: as often as possible – this way it will pull new email from the POP/IMAP account whenever you run cron manually for testing. When you have it right, send a test email to the account, run cron, and the new content will be visible when you click Content.
    • Import on submission

    Fetcher

    Mailhandler fetcher

    Connects to an IMAP/POP mailbox. – these are all that you need for the Fetcher

    Parser

    Mailhandler IMAP stream parser

    Parses an IMAP stream. – these are all that you need for the Parser

    Processor

    Node processor

    Create and update nodes.

    Mapping for Node processor – this is what I have for Processor, Settings and Mapping. I believe that the Mapping is the important part and I have it below. The real key for me was the Subject source being matched to the Title target. That way the content has the subject listed in a logical place when you see the node in Content.

    Define which elements of a single item of a feed (= Sources) map to which content pieces in Drupal (= Targets). Make sure that at least one definition has a Unique target. A unique target means that a value for a target can only occur once. E. g. only one item with the URL http://example.com/content/1 can exist.

    SOURCE         TARGET            TARGET CONFIGURATION
    Message ID     GUID              Used as unique.
    Subject        Title             Not used as unique.
    Date (date)    Published date
    Body (HTML)    Body

    This makes sense once you work your way through it and see the logic. It is built this way because it gives you the most flexibility later on if you want to get really complicated. It isn’t the easiest thing in Drupal, but it is far from the hardest.
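
To make the mapping concrete, here is a minimal Python sketch of the same idea outside Drupal. It is not how Feeds or Mailhandler are implemented: `map_message`, `import_messages` and the sample email are made up for illustration, and the dictionary keys simply mirror the targets in the mapping above. It shows why the Message ID makes a good unique target, because fetching the same email twice still leaves only one node.

```python
from email import message_from_string, policy
from email.utils import parsedate_to_datetime


def map_message(raw):
    """Map the sources of one raw email to node-like targets (hypothetical)."""
    msg = message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("html", "plain"))
    return {
        "guid": str(msg["Message-ID"]),                         # Message ID -> GUID
        "title": str(msg["Subject"]),                           # Subject -> Title
        "published": parsedate_to_datetime(str(msg["Date"])),   # Date -> Published date
        "body": body.get_content().strip() if body else "",     # Body -> Body
    }


def import_messages(raw_messages, nodes=None):
    """Keep one node per GUID; a repeated GUID overwrites instead of duplicating."""
    nodes = {} if nodes is None else nodes
    for raw in raw_messages:
        node = map_message(raw)
        nodes[node["guid"]] = node
    return nodes


# Made-up sample email, only for this sketch.
SAMPLE = (
    "Message-ID: <abc123@example.com>\n"
    "Subject: Opinion filed today\n"
    "Date: Mon, 04 Feb 2013 10:00:00 -0500\n"
    "Content-Type: text/plain\n"
    "\n"
    "Body of the email.\n"
)

nodes = import_messages([SAMPLE, SAMPLE])  # the same email fetched twice
print(len(nodes), nodes["<abc123@example.com>"]["title"])
```

If Subject were the unique target instead, two different emails with the same subject would collapse into one node, which is probably not what you want for a mailing list.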

** Since I am receiving email from an outside party from a mailing list, I created a dummy account with that email address.
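
The “from” check can be pictured with a small Python sketch. This is only an illustration of the idea, not Mailhandler’s code: `REGISTERED_USERS`, `authenticate_sender` and the addresses are all hypothetical. The sender’s address is pulled out of the From header and looked up against the known accounts, which is why a dummy account for the mailing list address does the trick.

```python
from email.utils import parseaddr

# Hypothetical stand-in for the site's user accounts (address -> username).
REGISTERED_USERS = {"list@example.org": "tncourts_list"}


def authenticate_sender(from_header, users=REGISTERED_USERS):
    """Return the matching username, or None when the sender is unknown."""
    _name, address = parseaddr(from_header)
    return users.get(address.lower())


print(authenticate_sender("TN Courts <List@Example.org>"))
print(authenticate_sender("Stranger <someone@else.example>"))
```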

Mail Handler Update

I believe that I have the Mailhandler module working properly. And, as usual with Drupal, there are a few quirks.

I tried it with my Gmail account, which requires TCP port 995 for POP3 or 993 for IMAP4. These are the secure versions of these protocols. But they don’t appear to work properly. At least, I couldn’t get them to work. But my DOCResource.org email provider, Blackmesh.com, was able to give me some fast assistance.

I used IMAP over port 143. I created an address on the qmail admin page, set the password and then went through the very simple process of connecting. Here is a note: when you are connected successfully, you will get the notice seen in the attached image. But when it fails, it fails quickly and does not tell you anything. And that slowed me down because I thought that it was working. So look for the message below.
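
Since a failed connection is silent, it can help to check the ports from outside Drupal first. Below is a minimal Python sketch, assuming you can run a script from the same server. `port_is_open` and `report` are names made up here, and the port numbers are the standard ones for these protocols. A successful TCP connection only shows that the port is reachable, not that the login will work.

```python
import socket

# Standard ports for the protocols mentioned above.
MAIL_PORTS = {
    "POP3": 110,
    "IMAP4": 143,
    "IMAP4 over SSL": 993,
    "POP3 over SSL": 995,
}


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host, timeout=5.0):
    """Return one line per protocol so that a blocked port is never silent."""
    lines = []
    for name, port in MAIL_PORTS.items():
        state = "open" if port_is_open(host, port, timeout) else "unreachable"
        lines.append(f"{name} (port {port}): {state}")
    return lines
```

Printing the lines returned by `report("mail.example.com")` (a made-up hostname) gives a quick picture of which ports a firewall is letting through.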

The next step is to configure a Feed importer that uses the Mailhandler mod as a “Fetcher”. I have mine set up but I am not sure if it is working yet. Will return soon.

[Image: the notice shown after a successful mailbox connection]

Aggregator categories, Views, Feeds, Panels and a solution

I am very happy today. I finally found a solution to the issue that I was having with Views and Aggregator items from the core mod. Aggregator is core and Views is contrib, so this patch doesn’t involve hacking the core, but be aware that an updated version of Views might break this again. I’m on Views 7.x-3.5.

Here is the issue.

In Views, you can create a view for Aggregator items. But even though Categories is listed as an available field to add/filter against, it is not available unless the category is used as the default category for that feed. When you go through the items and categorize them manually, that assigned category won’t show up in the View. This patch fixes that.

http://drupal.org/node/498438#comment-7063554
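
A rough way to picture the symptom, using Python’s built-in sqlite3. This is only an illustration with made-up tables and function names. It is not Drupal’s schema, it is not how Views builds its query, and it does not reproduce the patch; the linked comment has the real explanation. The point is just that a filter which only looks at the feed’s default category will never see a category assigned to a single item by hand.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Simplified, hypothetical tables; not Drupal's real schema.
    CREATE TABLE item (iid INTEGER PRIMARY KEY, fid INTEGER, title TEXT);
    CREATE TABLE feed_category (fid INTEGER, category TEXT);  -- feed default
    CREATE TABLE item_category (iid INTEGER, category TEXT);  -- set per item
    INSERT INTO item VALUES (1, 10, 'Court ruling'), (2, 10, 'Weather');
    INSERT INTO feed_category VALUES (10, 'New');
    INSERT INTO item_category VALUES (1, 'Courts');  -- categorized by hand
""")


def titles_by_feed_default(category):
    """Filter that only looks at the default category of the item's feed."""
    rows = db.execute(
        "SELECT i.title FROM item i "
        "JOIN feed_category c ON c.fid = i.fid "
        "WHERE c.category = ? ORDER BY i.iid",
        (category,),
    )
    return [row[0] for row in rows]


def titles_by_item_category(category):
    """Filter that looks at the category assigned to the item itself."""
    rows = db.execute(
        "SELECT i.title FROM item i "
        "JOIN item_category c ON c.iid = i.iid "
        "WHERE c.category = ? ORDER BY i.iid",
        (category,),
    )
    return [row[0] for row in rows]


print(titles_by_feed_default("Courts"))   # misses the hand-categorized item
print(titles_by_item_category("Courts"))  # finds it
```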

This is what I have: 20 RSS feeds from news sources all over the state. This is a catch-all and I am only interested in certain stories. So, I have categories created to group the stories that I want together, regardless of the source. Every item that comes in has a default category of New. That way, I can see all new items together for categorization. This all works. It’s a pain, since I have to look at about 40-50 items per day and categorize each one. But it is a big deal because I am dealing with a focused target audience. A niche. And stories that deal with their situation are important and need to be easy to get to.

So, I also need the power of Views. Views gives me much better presentation options when used by Panels than the Blocks interface does. Blocks is clunky and Panels is much better. And Views integrates well with Panels.

Now, you might ask, “why not use the Feeds Module?” Well, I did. For hours and hours. And while it will do what I want, it is much more difficult for me (AT MY PRESENT LEVEL OF UNDERSTANDING*) to edit, categorize and tag individual nodes created by Feeds for each new news item. That takes a lot longer. And until I can write a module that does autotagging in a way that will work for my type of content, I need to be able to use the Aggregator categories the way they are intended.

So, I implemented the patch. And it does work. I had to do it manually because code has been added to the Views mod since the patch was written, so the line numbers are different. But it is really short. I’m on Views 7.x-3.5, Core 7.19. The patch goes into the modules/aggregator.views.inc file within the Views folder.

If you want to see why Views doesn’t work properly with Aggregator items, read the post. The explanation is in there. This was a really frustrating experience but, like all of those types of experiences, I am better off for it.

Feeds

well, i ran into a major issue with the core aggregator module. i can’t use the categories item as a means of building a view for aggregator items. it is listed there and you would think it should work, but it doesn’t. and it really pissed me off yesterday.

so, i’m dumping the core aggregator to use the feeds module. hopefully, that will be better.

Drupal News Aggregator Services

well, i’m trying out different news aggregators. there aren’t too many options but i think that the big problem here is that i am trying to make this easy on myself and not wanting to do the (often) hard work of learning something about drupal. i like drupal and it is as powerful as it gets but it can also be a complete and total pain in the ass when it comes to making it do what you want. that is why i have decided as a long term project to learn css and then php. that’s the best way to really become proficient at drupal.

ok, activity stream seems to be a good way to aggregate content on an individual user basis. not helpful for me. that was easy. i believe that feeds is going to be the way to go. but i have to spend some time with it and learn how to make it work properly.

New Aggregators and Drush’s ‘archive-backup’

i am working with the different aggregators that are available for drupal. it is really kinda slow going though because the documentation just isn’t that good. and i suppose that what i want to do is rather involved.

here are my requirements

  1. aggregate news items (RSS Feeds) from about 30 sources
  2. Place them in one big list with category choices in each post for categorization
  3. or have them categorized automatically based on a defined keyword algorithm

i think that is it. number three is the real key, i believe. so that is what i am working on in regards to drupal these days
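
for what it’s worth, requirement three can be sketched in a few lines of python. this is only a rough illustration of keyword-based categorization, not a drupal module: the keyword lists, the `categorize` function and the fallback category are all made up, and a real version would need a much smarter matching rule.

```python
import re

# Hypothetical keyword lists; real ones would come from the site's categories.
CATEGORY_KEYWORDS = {
    "Courts": {"court", "judge", "ruling"},
    "Legislation": {"bill", "senate", "legislature"},
}
DEFAULT_CATEGORY = "New"  # assumed fallback for items that match nothing


def categorize(title, description=""):
    """Return every category whose keywords appear in the item text."""
    words = set(re.findall(r"[a-z']+", f"{title} {description}".lower()))
    matches = sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if words & keywords
    )
    return matches or [DEFAULT_CATEGORY]


print(categorize("Judge blocks new bill"))
print(categorize("Local fair opens this weekend"))
```

matching on whole words keeps "bill" from firing on "billing", but it also misses plurals like "courts", so stemming or a longer keyword list would be the next thing to try.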

all in all it is going pretty well though. but what i suspect is going to happen is that I am going to have to write the module for no. 3 myself. and that is cool except that i don’t know how. but, that is where i see my career going at this point. i need to learn programming. so, i am going to try to study php for an hour a day.