
Input Formats and Filters

For any given post, we need to stipulate the type of content being posted. This is done through the Input format setting that is displayed when posting content to the site, assuming the user in question has sufficient permissions to post different types of content.

In order to control what is and is not allowed, head on over to the Input formats link under Site configuration. This will bring up a list of the currently defined input formats, like this:

At the moment, you might be wondering why we need to go to all this trouble to decide whether people can add certain HTML tags to their content. The answer is that both HTML and PHP are so powerful that it is not hard to subvert even fairly simple abilities for malicious purposes.

For example, you might decide to allow users the ability to link to their homepages from their blogs. Using the ability to add a hyperlink to their postings, a malicious user could create a Trojan, virus or some other harmful content, and link to it from an innocuous and friendly looking piece of HTML like this:

<p>Hi Friends! My <a href="link_to_trojan.exe">homepage</a> is a great place to meet and 
learn about my interests and hobbies. </p>

This snippet writes out a short paragraph with a link, supposedly to the author’s homepage. In reality, the hyperlink reference attribute points to a Trojan, link_to_trojan.exe. And that’s just HTML! PHP can do a lot more damage: if you don’t have proper security or disaster-recovery policies in place, your site could be rendered useless or destroyed entirely.

Security is the main reason why, as you may have noticed from the previous screenshot, anything other than Filtered HTML is unavailable for use by anyone except the administrator. By default, the PHP option is not even present, let alone enabled.

When thinking about what permissions to allow, it is important to reiterate the tenet:

Never allow users more permissions than they require to complete their intended tasks!

As they stand, you might not find the input formats to your liking, and so Drupal provides some functionality to modify them. Click on the configure link adjacent to the Filtered HTML option, and this will bring up the following page:

The Edit tab provides the option to alter the Name property of the input format. The Roles section cannot be changed in this case, but as you will see when we come around to creating our own input format, roles can be assigned however you wish, to control which users may make use of an input format.

The final section provides a checklist of the types of Filters to apply when using this input format. In the previous screenshot, all have been selected, which causes the input format to apply the following filters:

  • HTML corrector – corrects any broken HTML within postings to prevent undesirable results in the rest of your page.
  • HTML filter – determines whether to strip or escape unwanted HTML.
  • Line break converter – turns standard typed line breaks (i.e. whenever a poster presses Enter) into standard HTML.
  • URL filter – allows recognized links and email addresses to be clickable without having to write the HTML tags manually.
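To make the idea of a line break converter concrete, here is a minimal sketch in Python. It is not Drupal's own code (Drupal's filters are written in PHP); the function name convert_line_breaks is hypothetical, and the rule it assumes is simply that blank lines separate paragraphs and single line breaks become <br /> tags.

```python
import re


def convert_line_breaks(text: str) -> str:
    """Wrap blank-line-separated blocks in <p> tags and turn
    single newlines inside a block into <br /> tags.

    Illustrative only: a real converter must also avoid touching
    text inside block-level tags such as <pre>.
    """
    # Normalize Windows line endings and trim surrounding whitespace.
    text = text.replace("\r\n", "\n").strip()
    # Two or more consecutive newlines mark a paragraph boundary.
    paragraphs = [p for p in re.split(r"\n{2,}", text) if p.strip()]
    return "\n".join(
        "<p>" + p.strip().replace("\n", "<br />\n") + "</p>" for p in paragraphs
    )


if __name__ == "__main__":
    print(convert_line_breaks("Hello\nworld\n\nSecond paragraph"))
```

A poster who simply presses Enter between lines gets sensible markup without typing any tags, which is the whole point of this filter.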

The line break converter is particularly useful for users because it means that they do not have to explicitly enter <br> or <p> HTML tags in order to display new lines or paragraph breaks, which can get tedious by the time you are writing your 400th blog entry. If this is disabled, unless the user has the ability to add the relevant HTML tags, the content may end up looking like this:

Click on the Configure tab, at the top of the page, in order to begin working with the HTML filter. You should be presented with something like this:

The URL filter option is really there to help protect the formatting and layout of your site. It is possible to have quite long URLs these days, and because URLs do not contain spaces, there is nowhere to naturally split them up. As a result, a browser might do some strange things to cater for the long string, and whatever it does will make your site look odd.

Decide how many characters the longest string should be and enter that number in the space provided. Remember that some content may appear in the sidebars, so you can’t let it get too long if they are supposed to be a fixed width.
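The effect of a maximum length on displayed links can be sketched as follows. This is an illustration in Python rather than Drupal's implementation; the function name link_urls, the max_length parameter, its default of 72, and the simplified URL pattern are all assumptions made for the example.

```python
import html
import re

# Simplified pattern: an http(s) URL runs until whitespace, a quote or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def link_urls(text: str, max_length: int = 72) -> str:
    """Turn bare URLs into links, shortening only the visible label."""

    def make_link(match: re.Match) -> str:
        url = match.group(0)
        # The href keeps the full address; only the label is truncated.
        label = url if len(url) <= max_length else url[:max_length] + "..."
        return '<a href="{}">{}</a>'.format(
            html.escape(url, quote=True), html.escape(label)
        )

    return URL_PATTERN.sub(make_link, text)


if __name__ == "__main__":
    print(link_urls("see http://example.com/abcdefghij now", max_length=10))
    # prints: see <a href="http://example.com/abcdefghij">http://exa...</a> now
```

The link still works because the full address stays in the href; only the text the visitor sees is cut down to fit the layout. The sketch ignores details such as trailing punctuation and email addresses.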

The HTML filter section lets you specify whether to Strip disallowed tags, or escape them (Escape all tags causes any tags that are present in the post to be displayed as written). Remember that if all the tags are stripped from the content, you should enable the Line break converter so that users can at least paragraph their content properly. Which tags are to be stripped is decided in the Allowed HTML tags section, where a list of all the tags that are to be allowed can be entered; anything else is stripped or escaped, depending on the option chosen.
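The difference between stripping and escaping is easy to see in a small sketch. The Python below is not how Drupal implements its HTML filter, and a regular expression like this is not safe enough for real use (it keeps every attribute of an allowed tag, for instance). The function name filter_html and its mode parameter are hypothetical; the allowed list is written the way it is entered on the configuration page, as tags separated by spaces.

```python
import html
import re

# Matches an opening or closing tag and captures the tag name.
TAG_PATTERN = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def filter_html(text: str, allowed_tags: str, mode: str = "strip") -> str:
    """Strip tags that are not in allowed_tags, or escape every tag."""
    if mode == "escape":
        # Every tag is shown to the reader exactly as it was typed.
        return html.escape(text, quote=False)

    # "<a> <em> <strong>" becomes {"a", "em", "strong"}.
    allowed = {tag.strip("<>").lower() for tag in allowed_tags.split()}

    def keep_or_drop(match: re.Match) -> str:
        return match.group(0) if match.group(1).lower() in allowed else ""

    return TAG_PATTERN.sub(keep_or_drop, text)


if __name__ == "__main__":
    post = '<p>Hi <img src="a.png"> <em>there</em></p>'
    print(filter_html(post, "<p> <em>"))            # the <img> tag disappears
    print(filter_html(post, "<p> <em>", "escape"))  # all tags shown as text
```

Note that stripping removes only the tag itself, not the text around it, which is why an enabled Line break converter matters once paragraph tags are gone.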

Selecting Display HTML help forces Drupal to provide HTML help for users posting content. Try enabling and disabling this option and browsing to the relative URL filter/tips in each case to see the difference. There is quite a bit of helpful information on HTML in the long filter tips, so take a moment to read over those.

The filter tips can be reached whenever a user expands the Input format section of the content post and clicks on More information about formatting options at the bottom of that section.

Finally, the Spam link deterrent is a useful tool if the site is being used to bombard members with links to unsanctioned (and often unsavory) products. Spammers will use anonymous accounts to post junk (assuming anonymous users are allowed to post content), and enabling this for anonymous posts is an effective way of thwarting them.
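In Drupal 6, the deterrent works by adding rel="nofollow" to links in the filtered content, which tells search engines not to give the link target any credit. A minimal sketch of that idea in Python (not Drupal's code; the function name add_nofollow is hypothetical, and the regular expressions are simplified for illustration) looks like this:

```python
import re

# Matches an opening <a ...> tag and captures its attributes.
ANCHOR_PATTERN = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches an existing rel attribute so that it can be replaced.
REL_PATTERN = re.compile(
    r"\s+rel\s*=\s*(\"[^\"]*\"|'[^']*'|[^\s>]+)", re.IGNORECASE
)


def add_nofollow(markup: str) -> str:
    """Force rel="nofollow" onto every opening <a> tag."""

    def rewrite(match: re.Match) -> str:
        # Drop any rel value the poster supplied, then add our own.
        attributes = REL_PATTERN.sub("", match.group(1))
        return "<a" + attributes + ' rel="nofollow">'

    return ANCHOR_PATTERN.sub(rewrite, markup)


if __name__ == "__main__":
    print(add_nofollow('<a href="http://x.example">buy now</a>'))
    # prints: <a href="http://x.example" rel="nofollow">buy now</a>
```

The link still works for visitors; it simply stops being worth anything to a spammer hoping to improve a search ranking.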

This is not the end of the story, because we also need to be able to create input formats in the event we require something that the default options can’t cater for. For our example, there are several ways in which this can be done, but there are three main criteria that need to be satisfied before we can consider creating the page. We need to be able to:

  1. Upload image files and attach them to the post.
  2. Insert and display the image files within the body of the post.
  3. Use PHP in order to dynamically generate some of the content (this option is really only necessary to demonstrate how to embed PHP in a posting for future reference).

There are several methods for displaying image files within posts. The one we will discuss here does not require us to download and install any contributed modules, such as Img_assist. Instead, we will use HTML directly to achieve this; specifically, we will use the <img> tag.

Take a look at the previous screenshot that shows the configure page of the Filtered HTML input format. Notice that the <img> tag is not available for use. Let’s create our own input format to cater for this, instead of modifying this default format.

Before we do, first enable the PHP Filter module under Modules in Site building so that it can easily be used when the time comes. With that change saved, you will find that there is now an extra option in the Filters section of each input format configuration page:

It’s not a good idea to enable the PHP evaluator for either of the default options, but adding it to one of our own input formats will be ok to play with. Head on back to the main input formats page under Site configuration (notice that there is an additional input format available, called PHP code) and click on Add input format. This will bring up the same configuration type page we looked at earlier. It is easy to implement whatever new settings you want, based on how the input format is to be used.

For our example, we need the ability to post images and make use of PHP scripts, so make the new input format as follows:

As we will need to make use of some PHP code a bit later on, we have enabled the PHP evaluator option and prevented anyone but ourselves from using this format. Normally, you would create a format for a group of users who require the modified posting abilities, but in this case we are simply demonstrating how to create a new input format, so this is fine for now.

PHP should not be enabled for anyone other than yourself, or a highly trusted administrator who needs it to complete his or her work.

Click Save configuration to add this new format to the list, and then click on the Configure tab to work on the HTML filter. The only change required between this input format and the default Filtered HTML, in terms of HTML, is the addition of the <img> and <div> tags, separated by a space in the Allowed HTML tags list, as follows:

As things stand at the moment, you may run into problems with adding PHP code to any content postings. This is because some filters affect the function of others. To be on the safe side, click on the Rearrange tab and set the PHP evaluator to execute first:

Since the PHP evaluator’s weight is the lowest, it is processed first, with all the others following in order of weight. It’s a safe bet that if you are getting unexpected results when using a certain type of filter, you need to come to this page and change the settings. We’ll see a bit more about this in a moment.
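Why the order matters can be shown with a small simulation. The Python sketch below is not Drupal code and does not run any PHP: fake_php_evaluator is a stand-in that merely replaces each PHP block with a marker, and the weights and filter names are made up for the example. The only rule it borrows from the text is that the filter with the lowest weight runs first.

```python
import html
import re


def fake_php_evaluator(text: str) -> str:
    """Stand-in only: swap each <?php ... ?> block for a marker."""
    return re.sub(r"<\?php.*?\?>", "[evaluated]", text, flags=re.DOTALL)


def escape_tags(text: str) -> str:
    """Stand-in for an HTML filter set to escape everything."""
    return html.escape(text, quote=False)


def run_filters(text, weighted_filters):
    """Apply (weight, name, function) entries, lowest weight first."""
    for _weight, _name, func in sorted(weighted_filters, key=lambda item: item[0]):
        text = func(text)
    return text


post = "Total: <?php print 2 + 2; ?>"

php_first = [(-10, "PHP evaluator", fake_php_evaluator), (0, "HTML filter", escape_tags)]
html_first = [(-10, "HTML filter", escape_tags), (0, "PHP evaluator", fake_php_evaluator)]

if __name__ == "__main__":
    print(run_filters(post, php_first))   # Total: [evaluated]
    print(run_filters(post, html_first))  # Total: &lt;?php print 2 + 2; ?&gt;
```

When the HTML filter runs first, the opening <?php has already been turned into harmless text by the time the evaluator sees it, so nothing is evaluated. Giving the evaluator the lowest weight avoids that.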

Now, the PHP evaluator gets dibs on the content and can properly process any PHP. For the purposes of adding images and PHP to posts (as the primary user), this is all that is needed for now. Once satisfied with the settings, save the changes.

Before building the new page, it is probably most useful to have a short discourse on HTML, because it is a requirement if you are to attempt more complex postings.

 
