
The Tertiary Education Portal is a catalogue of information relevant to learners engaged in or considering tertiary education. Phase 2 of Save as PDF adds the ability to convert selected information resources referenced on the portal into a PDF that is downloaded to the student’s desktop.

1.  Benefits

Using pdf2you, the HTML on a web site becomes the one authoritative source of content. Instead of having to put up a separate PDF version for visitors to download, sites can let visitors generate a PDF dynamically. This ensures that the web content and the PDF always contain the same information. It saves time and makes it easier to update the site’s content. Site managers no longer need to maintain 2 or more versions of the same information, which eliminates the risk of inadvertently publishing different versions of the same content.

For sites that conform to the E-government web guidelines, pdf2you “just works”. Where sites have a lot of PDF-based information, using pdf2you simplifies the task of migrating to comply with these guidelines. As the PDF-based content is migrated to HTML, the PDF versions can simply be discarded. By using the metalogue to describe resources that are PDF-able, other portals can potentially take advantage of the service in future. It also means all metadata about the resource is held in one place.

2.  Working examples

Pdf2you is a web service that takes a collection of web pages and reformats them into PDF, laid out in a style suitable for printing. This frees individual site managers from the need to maintain separate HTML and PDF versions of content, with the risk that the 2 sources contain different information. Resources tagged as PDF-able will look like the following examples.

The PDF icons below link to the pdf2you production server. Clicking on an icon will generate a PDF of the target page and all linked pages. Some of these are quite long documents.

(2 levels of nested links)

(2 levels of nested links)

(don’t follow links)

By default, pdf2you follows links to pages referred to on the selected page, within the same site. The last example overrides this behaviour and returns a PDF of the selected page only. The default behaviour expected on the portal is to follow links.
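The link-following rule can be sketched as follows. This is a minimal illustration, not the pdf2you implementation: the names `LinkCollector`, `same_site_links` and the `follow_links` flag are hypothetical, and "same site" is assumed to mean "same host name".

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def same_site_links(page_url, html, follow_links=True):
    """Return absolute URLs linked from the page that stay on the same host.

    follow_links=False models the "don't follow links" override:
    only the selected page itself would be converted.
    """
    if not follow_links:
        return []
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if absolute == page_url or absolute in links:
            continue  # skip self-links and duplicates
        if urlparse(absolute).netloc == site:
            links.append(absolute)
    return links
```

Calling the function again on each returned page would give the second level of nested links.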

3.  What pdf2you does

The following diagram shows what happens behind the scenes when a visitor asks for a PDF. The pdf2you service generates the document dynamically, using the web page’s HTML code as the source.

In essence, pdf2you separates HTML content from presentation, converts the content’s HTML markup into typesetting markup, then typesets the result.
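As a rough illustration of that conversion step, the sketch below maps a handful of HTML tags to typesetting commands and discards presentation attributes. The text does not name the typesetting language pdf2you uses; the LaTeX-style commands here, and the names `TypesetConverter` and `html_to_typesetting`, are assumptions made for the example.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HTML tags to LaTeX-style markup.
OPEN = {"h1": "\\section{", "h2": "\\subsection{",
        "em": "\\emph{", "strong": "\\textbf{", "p": ""}
CLOSE = {"h1": "}\n\n", "h2": "}\n\n",
         "em": "}", "strong": "}", "p": "\n\n"}


class TypesetConverter(HTMLParser):
    """Rewrites structural HTML tags as typesetting commands."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Presentation attributes (style, class, ...) are ignored;
        # unknown tags contribute no markup.
        self.parts.append(OPEN.get(tag, ""))

    def handle_endtag(self, tag):
        self.parts.append(CLOSE.get(tag, ""))

    def handle_data(self, data):
        # Note: special characters are not escaped in this sketch.
        self.parts.append(data)


def html_to_typesetting(html):
    converter = TypesetConverter()
    converter.feed(html)
    converter.close()  # flush any buffered text
    return "".join(converter.parts).strip()
```

The output would then be passed to a typesetter to produce the PDF; that stage is outside this sketch.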

4.  Implications for the tertiary portal

The following diagram shows how each actor with a role in the portal, and in sites to which the portal refers, will be affected. The main requirement is that the pdf2you service must be easy to use for everyone involved.

It shows a distributed process where each web site manager takes responsibility for deciding how best to make use of the pdf2you service.

5.  Process in more detail

Let’s describe what this means for each actor in the process.

Student visitor to the portal
Some resource links on the portal will have PDF icons against them. Click the icon and the visitor gets a PDF of the link and its related pages. The presence of a PDF icon against a resource is an indicator that there is rich content at the end of the link. It will only be worth assigning a PDF-able tag to content-rich pages.

Portal resource administrator
This is the person responsible for maintaining the resource metalogue entries for a site. The administrator tags a resource as being PDF-able, using a suitable NZGLS element. The Ministry needs to make a business decision about which element to use. The most suitable one appears to be format, with the value ‘application/x-pdf2you’. Resource administrators will need guidelines to help them decide which resources to tag as PDF-able. For example, a page with a list of links to pages with Printable View buttons is a good candidate. It will be better not to tag a suitable resource than to tag an unsuitable resource.

Web manager
This is the person responsible for a web site that can be reached through the portal. The web manager sets standards for how web page authors can designate what content is PDF-able. There are 3 options for doing this:

  1. ask the pdf2you service manager to create a handler for the site; as long as authors follow their site standards, the handler will be able to convert the site

  2. if the site uses <div id='somename'> … </div> tags to indicate where body text begins and ends, tell pdf2you where to find body copy by adding <meta name='pdf…' content='somename' /> tags to the header:
    • name='pdfbody' defines the location of body text
    • name='pdftoc' optionally defines the location of a list of links to follow (if not specified, pdf2you uses pdfbody)
    • name='pdfignore' optionally defines text within pdfbody that pdf2you should ignore

  3. or on simple pages, place PDF-able content between <div id='pdf2you'> … </div> tags; the pdf2you service looks for these by default

Web managers can also use <span … > tags instead of <div … > and class names instead of id. The examples at the top of the page use site-specific handlers; they required no changes to the original sites.
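To make options 2 and 3 concrete, here is a small sketch of how a converter might locate body copy from these tags. It is illustrative only and not the pdf2you code: the class and function names are hypothetical, it matches on id only (not class names), it does not handle pdftoc, and it assumes well-formed HTML in which every non-void tag is closed.

```python
from html.parser import HTMLParser


class MetaReader(HTMLParser):
    """First pass: read <meta name='pdf...' content='...'> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        if tag == "meta" and name.startswith("pdf"):
            self.meta[name] = attrs.get("content") or ""


class BodyExtractor(HTMLParser):
    """Second pass: collect text inside the element whose id is body_id,
    skipping any nested element whose id is ignore_id."""

    VOID = {"br", "img", "hr", "meta", "link", "input"}

    def __init__(self, body_id, ignore_id=None):
        super().__init__()
        self.body_id = body_id
        self.ignore_id = ignore_id
        self.body_depth = 0    # > 0 while inside the body element
        self.ignore_depth = 0  # > 0 while inside the ignored element
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return  # void tags have no closing tag to balance
        element_id = dict(attrs).get("id")
        starts_ignore = (self.body_depth and self.ignore_id
                         and element_id == self.ignore_id)
        if self.ignore_depth or starts_ignore:
            self.ignore_depth += 1
        if self.body_depth or element_id == self.body_id:
            self.body_depth += 1

    def handle_endtag(self, tag):
        if tag in self.VOID:
            return
        if self.ignore_depth:
            self.ignore_depth -= 1
        if self.body_depth:
            self.body_depth -= 1

    def handle_data(self, data):
        if self.body_depth and not self.ignore_depth:
            self.text.append(data)


def extract_pdf_body(html):
    reader = MetaReader()
    reader.feed(html)
    # Fall back to the default id from option 3 when no pdfbody is given.
    body_id = reader.meta.get("pdfbody", "pdf2you")
    extractor = BodyExtractor(body_id, reader.meta.get("pdfignore"))
    extractor.feed(html)
    extractor.close()
    return " ".join(part.strip() for part in extractor.text if part.strip())
```

A page that declares pdfbody as 'main' and pdfignore as 'nav' would yield only the text inside the 'main' element, minus anything inside 'nav'.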

Portal software manager
This is the person who controls changes to the software that runs the portal. The software manager enhances the portal software to generate an appropriate link to the pdf2you service for any resources tagged with format=‘application/x-pdf2you’. Resources will carry the PDF icon in both the hierarchical navigation and search views.
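The portal-side change might look something like the sketch below. The endpoint URL, the `url` query parameter and the shape of the resource record are all assumptions made for illustration; only the format value 'application/x-pdf2you' comes from the text above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter name; the real service URL
# is not given in this document.
PDF2YOU_ENDPOINT = "http://pdf2you.example.org/convert"
PDF_FORMAT = "application/x-pdf2you"


def pdf_link(resource):
    """Return a pdf2you link for a resource record,
    or None if the resource is not tagged as PDF-able."""
    if resource.get("format") != PDF_FORMAT:
        return None
    return PDF2YOU_ENDPOINT + "?" + urlencode({"url": resource["url"]})
```

The portal would render the PDF icon only when `pdf_link` returns a link, in both the hierarchical navigation and the search views.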

Pdf2you service manager
This is the person responsible for running the pdf2you service. From time to time, the service manager develops handlers to support converting a new site’s content into PDF, or typesetting less common HTML tags (although PDF and HTML are not, in fact, acronyms).

6.  Advanced features

The pdf2you service sets defaults for how the resulting PDF is laid out. A web manager can define settings appropriate for a particular site using <meta name='pdfkeyword' content='text' /> tags. The metadata section of the Save as PDF page lists the available keywords. For example,

 <meta name='pdfwatermark' content='draft' />

will print “draft” as a watermark behind every page of the PDF.
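One way to picture how site settings override the service defaults is the sketch below. The defaults table and the names `SettingsReader` and `layout_settings` are hypothetical; pdfwatermark is the only layout keyword taken from the text.

```python
from html.parser import HTMLParser

# Hypothetical service defaults; only pdfwatermark is a keyword
# named in this document.
DEFAULTS = {"pdfwatermark": ""}


class SettingsReader(HTMLParser):
    """Starts from the defaults and overrides them with pdf* meta tags."""

    def __init__(self):
        super().__init__()
        self.settings = dict(DEFAULTS)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        if tag == "meta" and name.startswith("pdf"):
            self.settings[name] = attrs.get("content") or ""


def layout_settings(html):
    reader = SettingsReader()
    reader.feed(html)
    return reader.settings
```

A page without any pdf* meta tags simply gets the defaults back.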

7.  How to decide PDF-ability

The following are things to look for when assessing whether a resource available through the portal is PDF-able:

  • lots of text, not much graphics
  • many links to other pages (such as a “contents” page)
  • long pages (more than 2 screenfuls)
  • pages with “Printable View” links

Some sites may prefer to use pdf2you in place of “Printable View” links; others, including this one, use the printable view as the source for the PDF.

ShareAlike Licence

Page last modified on 15 December 2006, at 11:27 AM