Tutorial for FeedAPI Image Grabber

I get a lot of requests for this, so here is a tutorial explaining the 3 W’s (What, Why and How) of FeedAPI ImageGrabber.

Motivation

a feed item from google reader

This is a screenshot of a news item from my Google Reader. Anyone moderately familiar with feeds will know that the image on the left of the news item above is not actually published with the feed.

In other words, if you refresh a feed on your Drupal website using FeedAPI, you won’t get images in the posts unless they are published along with the feed, which is quite rare.

The purpose of FeedAPI ImageGrabber is to make the feed more informative and more interesting for the user. As we all know, “comics are much better than novels”, so this module attaches to each feed-item an appropriate image taken from the page at its original URL. The goal of FeedAPI ImageGrabber is to mimic the thumbnail display of Google Reader for the feeds on your Drupal website.

As soon as you refresh the feed, ImageGrabber automatically attaches an appropriate image from the original article to the feed-item node on your website. You don’t even have to press an extra button!!


How it works
The easiest way to understand what FeedAPI ImageGrabber does is to walk through the same job done by hand:

  1. Refresh the feed.
  2. For each feed-item, go to its original URL and save the image you want to display.
  3. For each node, crop the image, convert it into a thumbnail, and upload it to a CCK imagefield.

FeedAPI ImageGrabber automates the last two steps of this three-step process. You just need to refresh the feed and ImageGrabber does the rest for you. The hardest part, which only a human can do reliably, is selecting an appropriate image, and I am constantly improving the heuristics for it.
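
The automated part of the procedure above can be sketched roughly as follows. This is a Python illustration of the flow only, not the module’s actual PHP code: `FeedItem`, `grab_images`, and the three callables passed in are hypothetical names standing in for the download, selection, and thumbnail steps.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class FeedItem:
    """Hypothetical stand-in for a feed-item node."""
    title: str
    original_url: str
    thumbnail: Optional[str] = None  # filled in by grab_images()


def grab_images(
    items: Iterable[FeedItem],
    fetch_page: Callable[[str], str],
    choose_image: Callable[[str], Optional[str]],
    make_thumbnail: Callable[[str], str],
) -> List[FeedItem]:
    """Run steps 2 and 3 for every item; step 1 (the refresh) is assumed done."""
    processed = []
    for item in items:
        html = fetch_page(item.original_url)   # step 2: visit the original article
        image_url = choose_image(html)         # step 2: decide which image to use
        if image_url is not None:
            # step 3: crop/scale the image and store the result on the item
            item.thumbnail = make_thumbnail(image_url)
        processed.append(item)
    return processed


# Toy usage with canned callables, so no network access is needed.
pages = {"http://example.com/a": "<img src='a.png'>", "http://example.com/b": ""}
items = [FeedItem("A", "http://example.com/a"), FeedItem("B", "http://example.com/b")]
result = grab_images(
    items,
    fetch_page=lambda url: pages[url],
    choose_image=lambda html: "a.png" if "a.png" in html else None,
    make_thumbnail=lambda src: "thumb-" + src,
)
print([(i.title, i.thumbnail) for i in result])
# prints [('A', 'thumb-a.png'), ('B', None)]
```

Passing the three steps in as functions keeps the sketch independent of any particular HTTP or image library; an item whose page yields no usable image is simply left without a thumbnail.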

Download and Install
Download the latest release of FeedAPI ImageGrabber compatible with your Drupal release from the project page. Uncompress the archive into your site’s modules directory and enable the module from the admin/module page.

ImageGrabber stores the images it downloads in CCK imagefields, so before you create feeds with ImageGrabber enabled you must also install the following modules:

  • Imagefield
  • Filefield
  • CCK

Then decide which content type the feed-items of a particular feed will use, and add a CCK imagefield to that content type before proceeding.

Form User Interface
You can enable ImageGrabber for just the feeds you want to download images for. When you add or edit a feed (i.e. a FeedAPI-enabled node), you will see the following additional settings for ImageGrabber:

Form Interface for ImageGrabber

Here is a detailed explanation of the input fields:

  1. Enable ImageGrabber: Check this box if you want to enable ImageGrabber for this particular feed.
  2. Select image field: ImageGrabber stores the downloaded image in a CCK imagefield. You must select one of the imagefields from the drop-down menu.
  3. If you receive the message “There are no imagefields associated with the ‘abcd’ content type.”, you must either add an imagefield to the content type mentioned or change the content type of the feed-items to one that has imagefields associated with it.

  4. Search for an image between the tag identified by:
    To understand this, suppose that the page of every feed-item associated with a particular feed has the following structure (HTML source code):

    HTML source code of a feed-item webpage

    For the above example, we know that the images associated with this particular post will always be within the <div class="post">...</div> tag.

    Therefore, for this feed you must select the ‘class’ option and enter ‘post’ in the text-field.
    Similarly, for the feed you want to create, find one such tag in its feed-item pages and check whether it is identified by an id or a class. Then select the appropriate option and enter the class or id in the text-field below. If you select None, the default tag is ‘body’.

    A simple way to find the id or class: find the title of your article in the source code and look at its parent and grandparent tags. Select the nearest parent that has an id or class.

    ImageGrabber will now look for images only inside this tag, which avoids downloading (and selecting) images from the sidebar or the header advertisements.

  5. Enter the id or class (whatever you selected) in the text-field. Leave it empty if you selected the None option above.
  6. Feeling Lucky: This has definitely been borrowed from the big devil Google. If the ‘feeling lucky’ option is selected, ImageGrabber simply picks the first image it encounters inside the specified tag. Otherwise, it picks the largest image inside the tag. I personally recommend trying the ‘feeling lucky’ option first, as it saves a lot of bandwidth!! If it doesn’t work for you, deselect it.
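
To make settings 4 to 6 more concrete, here is a small Python sketch of the idea: collect only the images inside the chosen tag, then take either the first one (‘feeling lucky’) or the largest. It is an illustration under assumptions, not the module’s PHP code. `ScopedImageFinder` and `pick_image` are hypothetical names, the sample page is invented, and image size is read from the `width`/`height` attributes purely to keep the example offline. The simple depth counter also assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def _to_int(text):
    """Return text as an int, or 0 when it is missing or not numeric."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return 0


class ScopedImageFinder(HTMLParser):
    """Collect <img> tags found inside one container tag."""

    def __init__(self, attr=None, value=None):
        super().__init__()
        self.attr = attr      # "id", "class", or None for the <body> default
        self.value = value
        self.depth = 0        # > 0 while the parser is inside the container
        self.images = []      # (src, width, height) in document order

    def _is_container(self, tag, attrs):
        if self.attr is None:
            return tag == "body"
        found = dict(attrs).get(self.attr) or ""
        if self.attr == "class":
            return self.value in found.split()
        return found == self.value

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            if tag == "img":
                a = dict(attrs)
                if a.get("src"):
                    self.images.append(
                        (a["src"], _to_int(a.get("width")), _to_int(a.get("height"))))
            if tag not in VOID_TAGS:
                self.depth += 1
        elif tag not in VOID_TAGS and self._is_container(tag, attrs):
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


def pick_image(html, attr=None, value=None, feeling_lucky=True):
    """Return the src of the chosen image, or None if the container has none."""
    finder = ScopedImageFinder(attr, value)
    finder.feed(html)
    if not finder.images:
        return None
    if feeling_lucky:
        return finder.images[0][0]                      # first image wins
    return max(finder.images, key=lambda img: img[1] * img[2])[0]  # largest area


page = """
<html><body>
  <div id="header"><img src="ad.png" width="728" height="90"></div>
  <div class="post">
    <h2>Title</h2>
    <p><img src="small.jpg" width="100" height="80"></p>
    <p><img src="large.jpg" width="400" height="300"></p>
  </div>
  <div id="sidebar"><img src="badge.gif" width="500" height="500"></div>
</body></html>
"""
print(pick_image(page, "class", "post"))                       # small.jpg
print(pick_image(page, "class", "post", feeling_lucky=False))  # large.jpg
print(pick_image(page))                                        # ad.png
```

The last call shows why choosing a tag matters: with the None option the whole ‘body’ is searched, so the header advertisement is the first image found.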

When you are done, just refresh the feed and you will see the images attached to the feed-item nodes.

Note: You can also modify your theme a little to get the Google Reader look in which the image floats on the left of the feed-item.

After refresh, you might get the following warning:

ImageGrabber: PHP execution time limit for system is 30 seconds, due to which images for some feed items couldn’t be downloaded. Please click on ‘Grab Images’ to refresh those feed-items.

If you get such a warning, there is no need to worry: just click on ‘Grab Images’ and the remaining feed-items will get their images as well. You may get the warning several times, depending on your system’s execution time limit.
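
One plausible way to stay under such a limit is to stop starting new downloads once a time budget is used up and leave the rest for the next run. The Python sketch below illustrates that idea only. I am not claiming this is how the module is written, and `grab_with_time_limit`, `FakeClock`, and `slow_grab` are hypothetical names. The 30-second figure is taken from the warning above.

```python
import time


def grab_with_time_limit(items, grab_one, limit_seconds, clock=time.monotonic):
    """Process items until the time budget is spent.

    Returns (done, pending); pending is what a later 'Grab Images'
    click would still have to handle.
    """
    items = list(items)
    start = clock()
    done, pending = [], []
    for index, item in enumerate(items):
        if clock() - start >= limit_seconds:
            pending = items[index:]   # out of time: defer everything left
            break
        grab_one(item)
        done.append(item)
    return done, pending


class FakeClock:
    """Deterministic clock so the example does not really wait."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()


def slow_grab(item):
    clock.now += 12.0   # pretend every download takes 12 seconds


done, pending = grab_with_time_limit(["a", "b", "c", "d"], slow_grab, 30, clock)
print(done, pending)    # ['a', 'b', 'c'] ['d']

# A second pass, like clicking 'Grab Images', picks up the remainder.
done2, pending2 = grab_with_time_limit(pending, slow_grab, 30, clock)
print(done2, pending2)  # ['d'] []
```

Note that the check happens before each download, so a single slow download can still overrun the budget; a real implementation would want a safety margin below the actual PHP limit.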

Enjoy grabbing images!!

Do leave a comment about this tutorial. Please make support and feature requests using support forums only.

17 thoughts on “Tutorial for FeedAPI Image Grabber”

  1. Pingback: Drupal: FeedAPI Imagegrabber | Public Mind

  2. lid

    Great, thanks for this tutorial, but I have this error:

    Fatal error: require_once() [function.require]: Failed opening required ‘modules/feedapi/parser_simplepie/simplepie.inc’ (include_path=’.;C:\php5\pear’) in C:\wamp\www\drupal6\modules\feedapi\parser_simplepie\parser_simplepie.module on line 222

    Thanks for your help 😉

    1. Nitin Post author

      Lid,

      FeedAPI (not FeedAPI ImageGrabber) requires you to download the SimplePie package. You can always find more details on your status report page (go to admin/reports/status).

  3. Brandon Trew

    Hi there

    Have installed Curl, but for some reason I keep getting this error:

    warning: curl_setopt_array() [function.curl-setopt-array]: CURLOPT_FOLLOWLOCATION cannot be activated when in safe_mode or an open_basedir is set in /var/www/vhosts/grpstr.com/httpdocs/sites/all/modules/feedapi_imagegrabber/feedapi_imagegrabber.module on line 666

    Checked phpinfo
    The site is not in safe mode
    And the open_basedir = /var/www/vhosts/grpstr.com/httpdocs:/tmp

    Any ideas? Please help!

    1. Nitin Post author

      Brandon,

      I think it would be better to take this issue to your web hosting provider. They will be in a better position to tell you the cause. If that’s not possible, you should look into the cURL forums; I know little about these configurations. Please leave a comment if you manage to figure this out, as it might help others.

  4. Willem

    I’ve set up my default crontab using wget. Would using curl mean setting up a separate crontab for ImageGrabber, or must I change the default crontab to use curl instead of wget?

    1. Nitin Post author

      Willem,

      You don’t need to change your default crontab to use cURL. ImageGrabber needs cURL to download the images, so it must be installed on your system, but you can set up your crontab using any method.

      1. Willem

        Thanks! It’s reassuring for a Linux newbie like me to know that these services are not interdependent. I wasn’t clear on that. And thanks very much for making this module. I look forward to trying it out in my current project.

  5. Willem

    Basically, I’m trying to get the user images from an advanced search.twitter.com feed (to accompany selected data from a feed that I map to a content type with feedapi_mapper).

    It seems to me that ImageGrabber’s behaviour is independent of feedapi_mapper. ImageGrabber looks at the description node of the feed’s XML for the link to follow to retrieve an image.
    What I want to do is get it to look at another node of the feed’s XML to acquire the link to be followed.
    I’ve jumped into your code to find out where this “source-node” is set, but I’m a PHP newbie as well: I can’t figure it out.

    1. Nitin Post author

      FeedAPI ImageGrabber is completely independent of FeedAPI Mapper. However, it doesn’t look into the description to scrape links; it follows the “original_url” provided by FeedAPI for a feed-item, i.e. the link you would normally see under the ‘original article’ title associated with the feed-item node.

      What you want to do isn’t possible with ImageGrabber as it stands; you may be able to achieve it by customizing the module.

  6. Pingback: Drupal: Feeds Image Grabber | Public Mind

  7. Pingback: Drupal: Tutorial for Feeds Image Grabber | Public Mind

  8. wheel spin roulette

    The tutorial about the feedAPI image grapper is fantastic.The post gives the brief instructions of the working method and the downloading/installation process of the API grapper. Nice post!

  9. tony

    hello… i added an imagefield to the feeditems content type but the drop down menu to store the image in … does not appear when adding a new feed. any idea how i can fix this? thanks for the tutorial.
