Design document 1-18-05; dcw

When I first suggested writing a little program to generate the TPJ web site, I had something simple in mind. For each issue, there would be a directory, e.g., /pracjourn/2005-1. In that directory would be the PDFs of each paper in the issue and a driver file (e.g., pracjourn/2005-1/2005-1.txt) containing enough information to generate the table of contents for the issue. It soon became clear that title and author indexes and BibTeX entries could also be built from the information in such driver files. To see what an issue driver file currently looks like (after I make the change to allow multiple authors for a paper), see
http://dw2.tug.org/pracjourn/2005-1/2005-1.txt

My further view and hope was that the web site generation and maintenance side of TPJ could be relatively independent of the editorial side of TPJ -- that the editorial side could prepare the PDFs of papers and the issue driver file (e.g., not on the TUG server) and then send them to me (e.g., zipped in a directory such as 2005-1), and I would run the web site generation program to add the issue to the web site.

Well, as they say, that was then and this is now. Since my original suggestion of a web site generation program, I have had a slow dawning of what the rest of you probably already understood about what was desirable for this journal web site, and the result has been creeping featurism in the program.
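The original single-driver-file idea described above can be sketched as follows. This is an illustration in Python, not the actual Perl program. The pipe-separated driver format, the field order, the placeholder paper entries, and the function names parse_driver, toc_lines, and author_index are all assumptions made for the example; the real driver format is the one at the URL given above.

```python
from collections import defaultdict

# Hypothetical driver-file format (an assumption, not the real TPJ format):
# one paper per line, fields separated by "|":
#   paper-directory | title | author 1; author 2; ...
SAMPLE_DRIVER = """\
# issue 2005-1 (placeholder entries)
example-one | A First Example Paper | A. Author
example-two | A Second Example Paper | B. Writer; A. Author
"""


def parse_driver(text):
    """Return a list of paper records from hypothetical driver-file text."""
    papers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        directory, title, authors = [field.strip() for field in line.split("|")]
        papers.append({
            "dir": directory,
            "title": title,
            # Multiple authors per paper are separated by ";".
            "authors": [a.strip() for a in authors.split(";")],
        })
    return papers


def toc_lines(issue, papers):
    """One table-of-contents line per paper, in driver-file order."""
    return [
        f"{', '.join(p['authors'])}: {p['title']} ({issue}/{p['dir']})"
        for p in papers
    ]


def author_index(issue, papers):
    """Map each author (sorted by name) to the papers they appear on."""
    index = defaultdict(list)
    for p in papers:
        for author in p["authors"]:
            index[author].append(f"{issue}/{p['dir']}")
    return dict(sorted(index.items()))


papers = parse_driver(SAMPLE_DRIVER)
print("\n".join(toc_lines("2005-1", papers)))
print(author_index("2005-1", papers))
```

The point of the sketch is only that the table of contents and the author index are both derived from the same parsed records, so the driver file is the one thing the editorial side has to prepare.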
You can understand the current function of the program and the structure of its databases by looking at www.tug.org/pracjourn and at
http://dw2.tug.org/pracjourn/readme.txt

Today, within the directory for each issue (e.g., pracjourn/2005-1) there is a directory for each paper, with names like 2005-1/waud, 2005-1/walden, 2005-1/carnes-welcome, and 2005-1/carnes-conference. See, for instance, any one of those directories, e.g.,
http://dw.tug.org/pracjourn/2005-1/waud

In each such directory, optionally (from the point of view of the site generation program), are a .tex file for the paper, a .pdf file, a file _p.html with the html text of the paper, and several other little files (containing the abstract text, indications of links to example files, a graphic for the author's email address, etc.). The site generation program turns various of these files into an html file named index.html, which is what (in effect) is linked to from the table of contents and the title and author indexes. See, for instance, www.tug.org/pracjourn/2005-1/walden/index.html or (the same thing) www.tug.org/pracjourn/2005-1/walden

Last night I was remembering the rule of thumb I used when I was managing programming projects: it takes at least the same level of staffing to maintain a computer program over its lifetime as it did to develop the program (and the lifetime is a lot longer than the development time). And then I remembered that I was the developer and now I am the only maintainer -- oh no!

I'm thinking of two revisions to the program right now: one to add some things we have been talking about, and one to do a clean-up so I won't be embarrassed to show the code to someone else.

These thoughts, and our recent experience getting out the first TPJ issue, bring three issues to my mind:

1. I am about to expand the number of little files in each paper directory, so the program can recognize and uniformly format more things, e.g., bios and email addresses. But having so many little files to create, update, and possibly deal with in the face of later program changes could become unwieldy.

2. I see a need to separate things so an issue can be prepared by the editorial side of TPJ, perhaps somewhere other than the TUG server, and the nearly final issue directory, driver file, and paper directories are then passed to the web site generation side of TPJ for posting.

3. I see a potential need for non-backward compatibility, i.e., if the format of the html page for a paper changes, do we want to have to go back and reformat all papers in all past issues?

Therefore, I have several thoughts I would like you to consider:

a) For future issues, perhaps the editorial side of TPJ should create the issue and paper directories and the issue driver file (using instructions I will provide) in their own area on the TUG server or somewhere other than the TUG server. When things are ready, they could be moved (e.g., FTPed) to the web site part of the TUG server. This idea addresses point 2 above.

b) Rather than adding more little files to each paper directory and hand coding into the program what to do with all those little files, perhaps there should be one driver file for the html page for each paper; for an idea of what this might look like, see:
http://dw2.tug.org/pracjourn/_page.txt
Of course, interpretation of this driver file would be coded into the program. This idea addresses point 1 above.

c) Perhaps the program that converts all the little files in a paper directory, or converts the one driver file (see point b), into index.html for the paper should be separated from the main Perl program and put in the issue directory. Thus, the editorial side could use it to view the html pages they are creating for papers, and potentially different such programs could be in the directories for various issues, thus retaining a capability to recreate the html pages for an old issue. This idea addresses points 2 and 3 above.

Subnote i. The editorial side might also want to modify this paper-page generation program to do more of what they wish it would do.

Subnote ii. The main web site generation program could just assume the presence of an index.html page in each paper directory and not worry about generating them at all, or it could know how to find the paper-page generation routine in the issue directory and use it to regenerate the index.html pages for each paper in the issue.

Subnote iii. I could also provide the editorial side with a program to generate the table of contents for an issue, e.g., as it looks in the archive of back issues -- e.g., see
http://www.tug.org/pracjourn/2005-1/index.html

d) Perhaps we should forget about having a program which generates index.html for papers, and the editorial side of TPJ should just develop an html page for each paper containing whatever information the editors want. In this case, the main web site generation program would just assume the presence of an index.html file in each paper directory. In a manner of speaking, this idea addresses points 1, 2 and 3 above.

Several of the above lettered points help partition TPJ work into some things that can happen on the editorial side and other things that can happen on the web site generation side. However, nothing I have said here cleanly partitions how we maintain the look of the whole web site (as opposed to the look of the individual papers, which I discussed above). Of course, some of the look of the whole web site is built into the main web site generation Perl code (e.g., creating the indexes); other parts of the look are built into html templates in a separate file that the main program accesses. I will be happy to help someone who wants to modify the templates to make something look different, and in some cases I may have to modify the program.
The main thing to note here is that I think we should keep relatively separate in our minds the look of the index.html page for individual papers and the look of the rest of the web site in which the individual issues are embedded.

Just in case it is needed, let me provide a bit more context for my thinking. Mainly I am trying to think about the trade-offs between development and long-term operations (it's the kind of thing I think about), i.e., how to partition things so they are manageable in the long run. In particular, I am trying to highlight some possibilities now, since I am about to make some more changes to the program and it would be good to be moving in the right direction:

a) so that web-site-maintenance only has to be involved at the end of creating an issue, and the editorial side or a production editor can create a new issue without lots of interaction with web-site-maintenance (this was my goal from the beginning)

b) so that paper look is separated from web site look (they can both be guided by editorial, but they involve different sorts of changes)

c) so that we have the right mix of backward compatibility and freedom from an excessive burden of backward compatibility

and, somewhat orthogonally,

d) not forgetting the possibility of automating less in the issue generation domain, although personally I would not choose that option

Writing this note did convince me that I should separate whatever work my program does building issues from the work it does building the greater web site structure in which the issues reside, for two reasons:

a) to allow editorial and the production editor to build an issue without touching the rest of the web site, and

b) to allow the possibility of old issues not having to be regenerated if we later change the format of issues.

Once this separation is made, it will be easy to bind issue regeneration to web site generation as much as we desire (but no more).
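One way the separation might look is sketched below, again in Python purely for illustration rather than as the actual Perl program. The file name _page.txt comes from the per-paper driver file proposed earlier, but its contents here (a title on the first line) are an assumption, as are the function names build_issue and build_site and the placeholder paper names.

```python
import html
import tempfile
from pathlib import Path


def build_issue(issue_dir):
    """Editorial-side step: write index.html in each paper directory.

    Assumes (hypothetically) that each paper directory holds a _page.txt
    whose first line is the paper title. Pages that already exist are left
    alone, so an old issue keeps the format it was generated with.
    """
    built = []
    for paper_dir in sorted(p for p in Path(issue_dir).iterdir() if p.is_dir()):
        page = paper_dir / "index.html"
        driver = paper_dir / "_page.txt"
        if page.exists() or not driver.exists():
            continue
        title = driver.read_text().splitlines()[0].strip()
        page.write_text(f"<html><body><h1>{html.escape(title)}</h1></body></html>\n")
        built.append(paper_dir.name)
    return built


def build_site(root):
    """Web-site-side step: link to paper pages without regenerating them."""
    links = []
    for issue_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for paper_dir in sorted(p for p in issue_dir.iterdir() if p.is_dir()):
            if (paper_dir / "index.html").exists():
                links.append(f"{issue_dir.name}/{paper_dir.name}/index.html")
    return links


# Demonstration in a throwaway directory with made-up paper names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("paper-a", "paper-b"):
        paper = root / "2005-1" / name
        paper.mkdir(parents=True)
        (paper / "_page.txt").write_text(f"Title of {name}\n")
    first = build_issue(root / "2005-1")
    second = build_issue(root / "2005-1")  # pages exist now, nothing rebuilt
    site_links = build_site(root)

print(first, second, site_links)
```

In this arrangement the site-level step never touches a paper page; binding issue regeneration more tightly to site generation would mean having build_site call build_issue, which is a choice that can be made later.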