
9 Maintaining the GAP website
 9.1 Overview
 9.2 Getting started
 9.3 The Mixer
 9.4 XHTML
 9.5 Git usage
 9.6 The web server in St Andrews
 9.7 Installation on the web server
 9.8 The GAP manuals on the web pages
 9.9 The GAP packages on the web pages
 9.10 The search engine on the web pages
 9.11 The GAP bibliography
 9.12 The sitemap
 9.13 The GAP forum archive

9 Maintaining the GAP website

This chapter describes how the information accessible on www.gap-system.org is stored and collected, and how it is transformed into web pages.

9.1 Overview

The GAP website (called just "website" in the following) has a tree structure for easier navigation and overview. Each node and each leaf of the tree is a web page, so every page resides somewhere in this tree. Its position is shown in the navigation bar on the left-hand side, and the user can navigate through the tree using this navigation bar. However, pages can still link to pages that reside in other branches of the tree.

With very few exceptions, all pages are static HTML pages conforming to the XHTML 1.0 standard (see Section 9.4). However, these pages are not edited directly by the maintainer. They are produced by a tool called "the Mixer" (see Section 9.3), which takes so-called ".mixer-files" as source and produces the final HTML files. During this process, the navigation bar and some other parts of the page are created automatically, so that the maintainer does not have to worry about technicalities. A .mixer file essentially contains the content of the page in the form of a well-formed XML document (see again Section 9.4 for an explanation), and the Mixer handles the technical details.

All the sources for the web pages are kept in the git repository https://github.com/gap-system/GapWWW. So you can clone this repository using


git clone https://github.com/gap-system/GapWWW

The web server in St Andrews uses its own clone of this repository: it updates the clone to the latest revision of the master branch, runs the Mixer and then serves the pages. A second branch, called testing, is served on the password-protected version of the GAP website at http://devel.gap-system.org/testsite, where work in progress can be published for internal review.

The GAP website has some pages that are treated specially such as the GAP manuals, the pages for the GAP packages, the pages providing search facilities, the pages for the GAP bibliography, the sitemap, and the (old) GAP forum archive. The setup for these special pages is described in Sections 9.8 to 9.13 in this chapter.

In the following sections we first cover the Mixer, the web standard XHTML 1.0, the usage of git for the web pages, and the installation of the web site on the web server.

9.2 Getting started

There are several possible workflows, depending on how much effort you would like to put into website maintenance.

A minimal scenario for small improvements (e.g. correcting details and fixing typos) only requires installing git and then:

  1. Clone the Website repository: git clone https://github.com/gap-system/GapWWW

  2. Make changes in the master branch

  3. Commit and push the changes. This triggers a notification to the website administrator(s), who check and approve the update.

A more robust scenario, especially for changes that are more likely to break the Mixer syntax, is to clone also the Mixer repository with


git clone https://github.com/gap-system/Mixer

and build the Mixer as described in the mixer.README file (see Section 9.3 for further details). For this step, you will need a C compiler (for compiling parts of the Mixer) and a Python interpreter (for running the Mixer).

With the Mixer installed, you can run the mixer.py script (probably with the -f option to rebuild everything regardless of timestamps) inside the GapWWW working directory and check how the generated HTML pages look in your browser before committing and pushing the changes.

Finally, while changes in the master branch trigger a notification to the website administrator(s) to check and approve them, changes that you want to have reviewed internally prior to publication can go into the testing branch, which is served on the password-protected version of the GAP website at http://devel.gap-system.org/testsite. Changes in the testing branch appear on the testing site immediately after they are pushed to the master repository. This workflow is useful if you want to show your suggestions to a wider group of people who may not have the opportunity to install the Mixer and keep a local version of the GAP website to review your changes.

If you are one of the website administrators, you will also need access to the web server in St Andrews via ssh to run certain update scripts and copy the necessary data.

9.3 The Mixer

The Mixer is a Python script that uses a C-library to parse XML documents (see Section 9.4). Therefore this library (which comes with the Mixer) has to be compiled first.

The Mixer is kept in the git repository https://github.com/gap-system/Mixer. To clone this repository, use


git clone https://github.com/gap-system/Mixer

The above command creates a clone of this repository in the directory Mixer of the current directory. In that directory you can create the manual of the Mixer by calling make mixer.pdf, provided you have an installation of LaTeX on your machine. The manual describes the Mixer and its installation in detail.

Alternatively you can download a copy of the Mixer and its documentation from this page.

A small comment on the rationale behind the Mixer might be in order. The fact that the input of the Mixer, that is the .mixer-files, has to consist of well-formed XML documents (see Section 9.4) might at first sight seem inconvenient and a bit awkward. However, this requirement greatly improves the chances that the resulting HTML files conform to the XHTML 1.0 standard. At the same time it allows the Mixer to give very concise and usable error messages during parsing in case something is not well-formed. Together with the automatically generated navigation bar, this makes the Mixer a valuable tool for the creation of web pages.

Note in particular that the tree structure of the whole web site is controlled by the tree files in each subdirectory, exactly as described in the manual of the Mixer.

9.4 XHTML

The HTML language has undergone a series of revisions and standardizations. One major step was to make an HTML standard that conforms to the XML standard, which happened with the revision "XHTML 1.0" of the HTML standard. This step was important because the XML framework makes it much easier to parse such documents automatically and check for "well-formedness". Here, the term "well-formed" means that the document fulfils a set of syntactic rules. That is, a document might be well-formed and at the same time not make any sense. See this page for details. A short introduction to the XML standard can be found in Section GAPDoc: XML in the GAPDoc manual.

The GAP web pages should conform to the standard XHTML 1.0. To cut a long story short, this means a few restrictions on the markup to use. Here we quickly cover the most important things, which should enable anybody who has ever seen an HTML document of any version to get started.

  1. All tags must be written with lower case letters in the element names.

  2. All non-empty elements must have a start- and end-tag, in particular enclose paragraphs in <p> and </p> or list entries in <li> and </li>.

  3. Elements must be properly nested like brackets, that is things like <a><b></a></b> are not allowed.

  4. Attributes must always have an assigned value, and the value must be enclosed in either double or single quotes; for example <a href="https://www.gap-system.org">GAP site</a>.

  5. Write empty elements like <br />. The space before the / is not necessary according to the specification, but it helps some old browsers to interpret the element correctly.

  6. Do not put information on colors or fonts in the XHTML file. Instead use the .css style sheet file. (For complicated cases use the class attribute to mark elements for which you want to give special formatting rules in the style sheet.)

  7. The XML markup characters "<", "&", and ">" must be entered as "&lt;", "&amp;", and "&gt;" respectively. There are quite a few such "entities" which are defined to enter special characters. See this page for details.
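Several of the rules above can be checked mechanically. The following is a minimal sketch, assuming only the Python standard library; it is not part of the Mixer, and the helper name is_well_formed is chosen here for illustration. It tests well-formedness only, so it does not detect upper case element names (rule 1) and does not validate against the XHTML 1.0 standard.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a well-formed XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Rules 2 and 5: the paragraph is closed, the empty element is written <br />.
print(is_well_formed("<p>first line<br />second line</p>"))          # True
# Rule 3: improperly nested elements are rejected.
print(is_well_formed("<a><b></a></b>"))                               # False
# Rule 4: an attribute value without quotes is rejected.
print(is_well_formed("<a href=https://www.gap-system.org>GAP</a>"))   # False
# Rule 7: escape() replaces &, < and > by the corresponding entities.
print(escape("a < b & c > d"))   # a &lt; b &amp; c &gt; d
```

Running such a check on a page body before calling the Mixer can catch nesting and quoting mistakes early, although the Mixer's own parser remains the authoritative check.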

Using the W3C specification HTML 4.01 - this includes a nice elements overview - together with the above rules and the general rule to avoid complicated looking constructs when possible, we found it not too difficult to produce sets of valid web pages.

9.5 Git usage

We assume here that you are familiar with the standard git commands git clone, git pull, git push, git add, git commit, etc.

The source files for the web site are kept in the git repository https://github.com/gap-system/GapWWW. You may clone it by doing


git clone https://github.com/gap-system/GapWWW 

This command creates in your current directory a directory GapWWW with the complete source tree of the web site.

Source files are treated like any other source file in the git repository, that is you can update, modify, commit, add, remove them as usual.

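The only thing one has to understand with respect to git is how the branch in which a change is made affects its publication: changes in the master branch are published once a website administrator has checked them and updated the web server, whereas changes in the testing branch appear on the password-protected test site at http://devel.gap-system.org/testsite immediately after they are pushed (see Section 9.2).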

A little comment on the rationale behind this setup might be in order. It allows more than one person to work independently on the website and to exchange versions via git without publishing them immediately. The actual guidelines on who does what in this process should be agreed on separately.

9.6 The web server in St Andrews

Currently, the actually published version of the web site is contained in the directory /gap/GapWWW on the following machine in St Andrews:


    yin.mcs.st-andrews.ac.uk

This machine is not really a web server, but the real web server mounts and serves the directory /gap/GapWWW from yin via NFS.

The Mixer is checked out (still in the old CVS version, since it has remained unchanged for several years) and installed in the directory /gap/Mixer. It can be called with the command


    /gap/Mixer/mixer.py

The files are checked out with ownership gapchron, which is a user on yin with the same numerical user ID as the gap user. In other words, one has to be the user gap to manipulate the data. Note that the home directory of the user gap is in fact /gap.

To get access to this data the easiest and most secure way is probably to create an RSA key pair, append the public key to /gap/.ssh/authorized_keys and to keep the private key in the .ssh subdirectory of the user's home directory.

There is one shell script which is run by a website administrator to update the website. This script is in bin/updateGapWWW.sh. It basically pulls the latest version from the master repository and runs the Mixer. You can trigger the update manually by doing


   ssh gap@yin.mcs.st-andrews.ac.uk bin/updateGapWWW.sh

once you have ssh access to yin.

Before performing an update on yin, it is wise to check first whether the Mixer runs without an error message in your own checked out version of the website.

9.7 Installation on the web server

This section describes the procedure to install the GAP web site on a machine from scratch. It is usually not needed, because all this is already done on the machine yin.mcs.st-andrews.ac.uk. However, it is needed if one wants to have an exact copy of the web site or has to install it somewhere anew. This section was derived a long time ago from the ASCII document GapWWW/INSTALL when it was under CVS control (so GapWWW/INSTALL is likely heavily outdated).

9.7-1 Needed ingredients

9.7-2 Installation procedure
  1. Clone the git repository GapWWW:

    
    git clone https://github.com/gap-system/GapWWW 
    
    

    This creates a subdirectory GapWWW in the current directory.

  2. Clone the git repository Mixer:

    
    git clone https://github.com/gap-system/Mixer
    
    

    This creates a subdirectory Mixer in the current directory.

  3. Unpack some (frozen) subtrees, which are in archives:

    
        cd GapWWW  
        gzip -dc ForumArchive.tar.gz | tar xvf -
        cd Gap3  
        gzip -dc Manual3.tar.gz | tar xvf -
        cd ..    
    
    
  4. Edit GapWWW/lib/config, see that file for instructions:

    
        vi lib/config
    
    

    In this file a few variables have to be defined to adapt the web pages to the local conditions.

  5. Copy a whole doc directory of a GAP distribution to the place mentioned in GapWWW/lib/config (see step 4.) in the variable GAPManualLink (this is GapWWW/Manuals in the current setup).

  6. The files for the GAP bibliography have been included into this directory tree in the repository.

    Create the html and PDF versions by:

    
        cd Doc/Bib
        gap4 convbib.g
        cd ../.. 
    
    

    Some more information about this is in GapWWW/Doc/Bib/INFO, which has been unchanged since 2010 and may be somewhat outdated.

  7. Install search facility:

    Things are in GapWWW/Search. You need the swish utility installed to create the index files for searching. Create a link in the Search directory to the swish executable. Then create index files by:

    
        cd Search
        ln -s PATHTOSWISH swish
        make
        cd ..
    
    

    (PATHTOSWISH has to be replaced by the path to the swish executable.)

    The CGI script GapWWW/Search/search.cgi will take care of the rest.

  8. Install package manuals:

    Copy the result of Frank's scripts to the place mentioned in GapWWW/lib/config in the variable pkgmixerpath (currently this is GapWWW/Manuals; copy the whole pkg directory).

    To update the package pages, copy all .mixer files and pkgconf.py to GapWWW/Packages and rerun the Mixer.

  9. Make sure that the file GapWWW/lib/AllLinksOfAllHelpSections.data is always up-to-date (this has to be adjusted whenever the released manuals change).

    In the development version of GAP there is a file dev/LinksOfAllHelpSections.g. Read this with a current GAP version with all currently released packages installed and call WriteAllLinksOfAllHelpSections(), this writes the file AllLinksOfAllHelpSections.data. It has then to be checked in to its place under the GapWWW tree. Do not forget to publish the latest revision.

  10. Run the mixer:

    
        ../Mixer/mixer.py -f
    
    

    (the -f forces creation regardless of timestamps)

9.7-3 Installing updated versions

If things are changed in the repository, all that has to be done to update the pages locally is:


git pull

in the GapWWW directory, followed by a


  ../Mixer/mixer.py

The Mixer has an option -f to force recreation of all pages. This is necessary if general files such as the address database lib/addresses or the templates change.

To change the sitemap, use the yEd graph editor to modify the file sitemap.graphml, then use the yEd export menu to create the file sitemap.html with the associated .png image.
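The update procedure of this subsection can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function names mixer_command and update_site are invented here, and the set GENERAL_FILES contains only lib/addresses because the text does not say where the templates are stored. Pass an absolute path for the Mixer script, because the handling of relative program paths together with cwd is platform dependent.

```python
import subprocess

# Files whose change affects every page. Only "lib/addresses" is named in
# the text; template files would have to be added by the caller (assumption).
GENERAL_FILES = {"lib/addresses"}


def mixer_command(mixer_script, changed_files, general_files=GENERAL_FILES):
    """Build the Mixer call, adding -f if a general file has changed."""
    command = [str(mixer_script)]
    if any(name in general_files for name in changed_files):
        command.append("-f")
    return command


def update_site(gapwww_dir, mixer_script):
    """Run 'git pull' and then the Mixer inside the GapWWW directory."""

    def git(*args):
        result = subprocess.run(["git", *args], cwd=gapwww_dir, check=True,
                                capture_output=True, text=True)
        return result.stdout

    old_head = git("rev-parse", "HEAD").strip()
    git("pull")
    # List the files that differ between the old and the new revision.
    changed = git("diff", "--name-only", old_head, "HEAD").splitlines()
    subprocess.run(mixer_command(mixer_script, changed),
                   cwd=gapwww_dir, check=True)
```

When in doubt whether a changed file influences other pages, calling the Mixer with -f by hand is the safer choice, at the cost of a longer run.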

9.8 The GAP manuals on the web pages

All GAP manuals are available in HTML format via the web pages. This works by simply copying the doc directory of a complete GAP installation to the place specified by the variable GAPManualLink in GapWWW/lib/config (which is GapWWW/Manuals in the current setup). Note that those files are not under version control there, they are only copied to checked out working copies, like for example on the web server in St Andrews.

The single remaining point to explain is how one can specify links to manual sections on the web pages. This is done with a special Mixer tag like the following:


    <mixer manual="Reference: Lists">Chapter about lists</mixer>

This element creates a link to the manual section which would appear in the GAP help system when called with "?Reference: Lists", which happens to be the chapter in the reference manual about lists. The text of the link would be "Chapter about lists".

This works, because the Mixer has access to a file containing the links to all manual sections. This file resides in GapWWW/lib/AllLinksOfAllHelpSections.data, which is created using dev/LinksOfAllHelpSections.g in the development version of GAP as described in Section 9.7.

The value of the attribute "manual" in the "mixer" tag must be the complete text of the section heading the link should point to.
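To illustrate the lookup, here is a minimal sketch of how such an element could be turned into an ordinary link. It does not reproduce the Mixer's actual code: the function name resolve_manual_link, the dictionary LINKS and the target path in it are hypothetical, and the real format of AllLinksOfAllHelpSections.data is not described in this chapter.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table from complete section headings to link targets.
# The real data is kept in GapWWW/lib/AllLinksOfAllHelpSections.data.
LINKS = {"Reference: Lists": "Manuals/placeholder-for-lists-chapter.html"}


def resolve_manual_link(element_source, links):
    """Turn <mixer manual="...">text</mixer> into <a href="...">text</a>."""
    element = ET.fromstring(element_source)
    heading = element.get("manual")
    if heading not in links:
        # The heading must match completely; partial headings are rejected.
        raise KeyError(f"no manual section with heading {heading!r}")
    anchor = ET.Element("a", href=links[heading])
    anchor.text = element.text
    return ET.tostring(anchor, encoding="unicode")


print(resolve_manual_link(
    '<mixer manual="Reference: Lists">Chapter about lists</mixer>', LINKS))
```

The exact-match lookup mirrors the requirement that the attribute value be the complete text of the section heading.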

9.9 The GAP packages on the web pages

The archives and web pages for the GAP packages are generated by yet another set of tools described in Chapter 8. These generate for every package a .mixer-file and for all packages together a file pkgconf.py. All these files have to be put under version control in the directory GapWWW/Packages. These nodes then only have to be put into the tree by mentioning them in the tree file there.

9.10 The search engine on the web pages

The search engine on the web pages internally uses the swish tool. It is used to create an index of all pages which allows very fast searches when a user submits a query. All files for this setup are in the directory GapWWW/Search.

The indices are regenerated by doing


    touch everything.conf
    make

in that directory. This is done automatically every night, such that usually nothing has to be done after installation.

To make this work, one needs a swish executable and has to create a link GapWWW/Search/swish to that executable.

9.11 The GAP bibliography

The GAP bibliography resides in the directory GapWWW/Doc/Bib.

The source files are:

GapCite.MR

This file contains just the MR numbers of papers that cite or refer to (one of the versions of) GAP; here and below, "MR" stands for "Mathematical Reviews". The format alternates between one line of the form 1stAuthorSurname Paper (not starting with a blank) and one line MR-Number (starting with a blank). The MR numbers are used to get the full bibliographic information from MathSciNet; the textual description only helps when adding papers to the file (in particular, to keep entries sorted by first author).

GapCite.notyet

BibTeX entries for papers that are not yet in MR but likely will be there in a few months

GapNonMR.bib

BibTeX entries for papers that will not be in MR (e.g. theses)

NonVerif.MR

Things not yet verified, same format as GapCite.MR

NonVerif.NonMR

Things not yet verified, same format as GapCite.notyet

GapIgnore.MR

This file contains a list of GAP strings corresponding to MR numbers of papers that may be falsely reported by MathSciNet as citing GAP (for example, if they refer to the History of Mathematics Archive website wrongly stating its address in the GAP domain as may be returned by some search systems). If necessary, add new items there in an obvious way.
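The alternating line format of GapCite.MR (and of NonVerif.MR) described above can be read with a few lines of Python. The following is a sketch, not one of the scripts in GapWWW/Doc/Bib: the function name parse_mr_file is invented here, the sample entries are made up, and skipping empty lines is an assumption.

```python
def parse_mr_file(text):
    """Return a list of (description, mr_number) pairs.

    Description lines do not start with a blank, MR number lines do.
    """
    entries = []
    description = None
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: empty lines carry no meaning
        if line[0].isspace():
            if description is None:
                raise ValueError(f"MR number without description: {line!r}")
            entries.append((description, line.strip()))
            description = None
        else:
            if description is not None:
                raise ValueError(f"no MR number after: {description!r}")
            description = line.rstrip()
    if description is not None:
        raise ValueError(f"no MR number after: {description!r}")
    return entries


# Made-up sample data, for illustration only.
sample = "Author Some paper\n 0000001\nOther Another paper\n 0000002\n"
print(parse_mr_file(sample))
```

A check of this kind can be useful before running the bibliography scripts, since a missing or misplaced line shifts all following entries.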

It is possible to check MathSciNet for new references to GAP by reading the file updatebib.g into GAP. It will produce two files:

tobeadded.txt

This file has the same format as GapCite.MR and lists publications citing GAP which should be examined and after that added either to GapCite.MR or to GapIgnore.MR.

suggested.txt

This file contains suggestions to "move" certain entries from GapCite.notyet and GapNonMR.bib to GapCite.MR. All suggestions, including those which do not match the publication listed in the GAP bibliography, should be carefully examined before any changes.

Note that updatebib.g is not a complete solution for updating the GAP bibliography. It searches for occurrences of the substring www.gap in citations (this covers both old and current addresses of the GAP website), but it does not find publications that cite GAP without its website or refer to it only in the text. Moreover, it covers only MathSciNet and does not look into other bibliography databases. Therefore, manual search should still be used to discover more GAP citations. The function SearchMathSciNetForUpdates from updatebib.g may be helpful here, since it performs a broader search in MathSciNet by dropping some of the stricter limitations.

After the source files of the GAP bibliography are updated, the script newmakegapbib uses GapCite.MR, GapCite.notyet and GapNonMR.bib (and also HEADER and MRBIB) to produce gap-published.bib (this requires a subscription to MathSciNet, which St Andrews has). The advantage of this approach is that MathSciNet gives us good BibTeX entries (no need to look up journal names or diacritic characters) and their updates, and MR numbers we can link to. It also makes it easier to add entries, as only the MR number is needed.

At the end of its work newmakegapbib will also display error messages reporting MR numbers whose BibTeX record it failed to fetch from MathSciNet -- these should be investigated since they may point to inconsistencies in our data.
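The limitation of the substring search is easy to see in a toy example. The sketch below is written in Python rather than GAP and is not taken from updatebib.g; the function name and both citation strings are made up for illustration.

```python
def cites_gap_website(citation: str) -> bool:
    """Mimic the substring test: look for 'www.gap' in the citation text."""
    return "www.gap" in citation


# Made-up citation strings, for illustration only.
citations = [
    "The GAP Group, GAP, Version 4, https://www.gap-system.org",
    "The GAP Group, GAP, Version 4",
]
print([cites_gap_website(c) for c in citations])   # [True, False]
```

The second citation refers to GAP without giving a web address and is therefore missed, which is why a manual search remains necessary.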

(There is also a script GETMR that will return MR numbers for papers -- convenient to look up a large number of papers one found in the citation index.)

Finally GAP itself called with convbib.g produces the web page and a nice PDF bibliography from gap-published.bib (using further helper files gapbib.tex and gap-head.bib). The resulting files are gap-published.html and gap-published.pdf which are linked from the main web page bib.html. NOTE that gap-published.html and gap-published.pdf are not under version control because they can be generated automatically by convbib.g rather quickly. In addition, convbib.g creates statistics.generated and statistics.mscreport - two pages with tables which are used in statistics.mixer to create statistics.html.

The output of convbib.g should also be checked for errors and warnings that report repeated entries, incomplete BibTeX records (these can mostly be ignored), etc.

NOTE: The current setup does not run GAP on convbib.g every night. This means that everybody who changes the GAP bibliography has to do this manually on yin after every change.

9.12 The sitemap

The sitemap picture is generated and edited in the following way: The original source is the file sitemap.graphml, which is generated and edited with the yEd program. The export functionality of yEd allows exporting the sitemap as a clickable HTML image map, which produces the two files sitemap.html and sitemap1_1.png. Because the sitemap usually does not change very much, these two files are also kept under version control.

9.13 The GAP forum archive

Until December 2003 the GAP forum archive was handled by a tool written especially for this task. At that point it was switched to mailman, a generic tool for mailing lists, which also does the archiving. Therefore the old forum archives are frozen in the form of a large number of HTML pages. These are not kept under version control as single files but as one big binary archive under GapWWW/ForumArchive.tar.gz.

To install those pages in a checked out working copy one just has to extract this archive by doing


    gzip -dc ForumArchive.tar.gz | tar xf -

in the GapWWW directory as explained in Section 9.7.
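On a machine without gzip and tar on the command line, the same extraction can be done with the Python standard library. This is a minimal sketch; the function name extract_archive is chosen here for illustration.

```python
import tarfile


def extract_archive(archive_path, target_dir):
    """Unpack a gzip-compressed tar archive into target_dir."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(path=target_dir)
```

Called as extract_archive("ForumArchive.tar.gz", ".") inside the GapWWW directory, this unpacks the archive in place, in the same way as the shell command above.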


generated by GAPDoc2HTML