NAME
sisu - documents: markup, structuring, publishing in multiple standard
formats, and search
SYNOPSIS
sisu [-abcDdFehIiMmNnopqRrSsTtUuVvwXxYyZz0-9] [filename/wildcard]
sisu [-Ddcv] [instruction] [filename/wildcard]
sisu [-CcFLSVvW]
sisu --v2 [operations]
sisu --v1 [operations]
SISU - MANUAL,
RALPH AMISSAH
WHAT IS SISU?
1. INTRODUCTION - WHAT IS SISU?
SiSU is a framework for document structuring, publishing (in multiple
open standard formats) and search, comprising: (a) a lightweight
document structure and presentation markup syntax; and (b) an
accompanying engine for generating standard document format outputs
from documents prepared in sisu markup syntax, which is able to produce
multiple standard outputs (including the population of sql databases)
that (can) share a common numbering system for the citation of text
within a document.
SiSU is developed under an open source, software libre license (GPL3).
It is developed for working with medium to large document sets and for
coping with evolving document formats and representation technologies.
Documents are prepared once, and generated as need be to update the
technical presentation or add additional output formats. Various output
formats (including search related output) share a common mechanism for
cross-output-format citation.
SiSU both defines a markup syntax and provides an engine that produces
open standards format outputs from documents prepared with SiSU markup.
From a single lightly prepared document sisu custom builds several
standard output formats which share a common (text object) numbering
system for citation of content within a document (that also has
implications for search). The sisu engine works with an abstraction of
the document’s structure and content from which it is possible to
generate different forms of representation of the document.
Significantly, SiSU markup is sparser than html, and its outputs include
html, EPUB, LaTeX, landscape and portrait pdfs, and Open Document Format
(ODF), all of which can be added to and updated. SiSU is also
able to populate SQL type databases at an object level, which means
that searches can be made with that degree of granularity.
Source document preparation and output generation are a two-step
process: (i) the document source is prepared, that is, marked up in sisu
markup syntax, and (ii) the desired output is subsequently generated by
running the sisu engine against the document source. If output
representations are updated (in the sisu engine), they can be regenerated
by re-running the engine against the prepared source. Using SiSU markup
applied to a document, SiSU custom builds (to take advantage of the
strengths of different ways of representing documents) various standard
open output formats including plain text, HTML, XHTML, XML, EPUB,
OpenDocument, LaTeX or PDF files, and populates an SQL database with
objects[^1] (equating generally to paragraph-sized chunks) so searches
may be performed and matches returned with that degree of granularity
(e.g. your search criteria are met by these documents and at these
locations within each document). Document output formats share a common
object numbering system for locating content. This is particularly
suitable for "published" works (finalized texts as opposed to works that
are frequently changed or updated), for which it provides a fixed means
of reference to content.
In preparing a SiSU document you optionally provide semantic
information related to the document in a document header, and in
marking up the substantive text provide information on the structure of
the document, primarily indicating heading levels and footnotes. You
also provide information on basic text attributes where used. The rest
is automatic: from this information sisu custom builds[^2] the
different forms of output requested.
SiSU works with an abstraction of the document based on its structure
which is comprised of its headings[^3] and objects[^4], which enables
SiSU to represent the document in many different ways, and to take
advantage of the strengths of different ways of presenting documents.
The objects are numbered, and these numbers can be used to provide a
common basis for citing material within a document across the different
output format types. This is significant as page numbers are not well
suited to the digital age: in web publishing, changing a browser’s
default font or using a different browser can mean that text will
appear on a different page; and when publishing in different formats
(html, landscape and portrait pdf etc.) page numbers are again not
useful for citing text. Dealing with documents at an object level
together with
object numbering also has implications for search that SiSU is able to
take advantage of.
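The idea of object numbering can be pictured with a minimal sketch. It is not SiSU's implementation: the function names and the two toy renderers are hypothetical, and it assumes only that objects are paragraphs separated by empty lines, numbered in order. The point is that the same number identifies the same text in every representation.

```python
import re


def number_objects(source):
    """Split text into paragraph-sized objects and number them from 1."""
    # Assumption: objects are paragraphs separated by one or more empty lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", source) if p.strip()]
    return list(enumerate(paragraphs, start=1))


def render_text(objects):
    """Toy plain-text rendering with the object number appended."""
    return "\n\n".join(f"{text} [{n}]" for n, text in objects)


def render_html(objects):
    """Toy html rendering with the object number used as an anchor id."""
    return "\n".join(f'<p id="o{n}">{text}</p>' for n, text in objects)


doc = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
objects = number_objects(doc)
print(render_text(objects))
print(render_html(objects))
```

A citation of object 2 resolves to "Second paragraph." in both renderings, regardless of how either one is paginated or displayed.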
One of the challenges of maintaining documents is to keep them in a
format that allows use of them independently of proprietary platforms.
Consider issues related to dealing with legacy proprietary formats
today and what guarantee you have that old proprietary formats will
remain (or can be read without proprietary software/equipment) in 15
years’ time, or the way in which html has evolved over its
relatively short span of existence. SiSU provides the flexibility of
producing documents in multiple non-proprietary open formats including
html, pdf,[^5] ODF,[^6] and EPUB.[^7] Whilst SiSU relies on software,
the markup is uncomplicated and minimalistic, which guarantees that
future engines can be written to run against it. It is also easily
converted to other formats, which means documents prepared in SiSU can
be migrated to other document formats. Further security is provided by
the fact that the software itself, SiSU, is available under GPL3, a
licence that guarantees that the source code will always be open, and
free as in libre, which means that the code base can be used, updated
and further developed as required under the terms of its license.
Another challenge is to keep up with a moving target. SiSU permits new
forms of output to be added as they become important (Open Document
Format text was added in 2006 when it became an ISO standard for office
applications and the archival of documents; EPUB was introduced in
2009), and allows the technical representation of existing output to be
updated (html has evolved and the related module has been updated
repeatedly over the years; presumably when the World Wide Web
Consortium (w3c) finalises html 5, which is currently under development,
the html module will again be updated, allowing all existing documents
to be regenerated as html 5).
The document formats are written to the file-system and available for
indexing by independent indexing tools, whether off the web like Google
and Yahoo or on the site like Lucene and Hyperestraier.
SiSU also provides other features such as concordance files and
document content certificates. Working against an abstraction of
document structure opens further possibilities for the research and
development of other document representations; the availability of
objects is useful, for example, for topic maps and thesauri, and
together with the flexibility of SiSU offers great possibilities.
SiSU is primarily for published works, which can take advantage of the
citation system to reliably reference its documents. SiSU works well
in a complementary manner with such collaborative technologies as
Wikis, which can take advantage of and be used to discuss the substance
of content prepared in SiSU.
<http://www.jus.uio.no/sisu>
2. COMMANDS SUMMARY
2.1 DESCRIPTION
SiSU is a document publishing system that, from a simple single
marked-up document, produces multiple output formats including:
plaintext, html, xhtml, XML, epub, odt (odf text), LaTeX, pdf, info,
and SQL (PostgreSQL and SQLite), which share numbered text objects
("object citation numbering") and the same document structure
information. For more see: <http://www.jus.uio.no/sisu>
2.2 DOCUMENT PROCESSING COMMAND FLAGS
-a [filename/wildcard]
produces plaintext with Unix linefeeds and without markup,
(object numbers are omitted), has footnotes at end of each
paragraph that contains them [ -A for equivalent dos
(linefeed) output file] [see -e for endnotes]. (Options
include: --endnotes for endnotes --footnotes for footnotes at
the end of each paragraph --unix for unix linefeed (default)
--msdos for msdos linefeed)
-b [filename/wildcard]
produces xhtml/XML output for browser viewing (sax parsing).
-C [--init-site]
configure/initialise shared output directory files (config files
such as css and dtd files are not updated if they already exist
unless the modifier is used). -C --init-site configures/initialises
the site more extensively than -C on its own: shared output
directory files are force updated, and existing shared output
config files such as css and dtd files are updated if this
modifier is used.
-CC the equivalent of -C --init-site: configure/initialise site,
more extensive than -C on its own; shared output directory files
are force updated, and existing shared output config files such
as css and dtd files are updated if -CC is used.
-c [filename/wildcard]
toggle ansi screen colour on or off depending on the default set
(unless the -c flag is used: if the sisurc colour default is set
to ’true’, output to screen will be with colour; if the sisurc
colour default is set to ’false’ or is undefined, screen output
will be without colour).
-D [instruction] [filename]
database postgresql (--pgsql may be used instead). Possible
instructions include: --createdb; --create; --dropall; --import
[filename]; --update [filename]; --remove [filename]; see
database section below.
-d [--db-[database type (sqlite|pg)]] --[instruction] [filename]
database type default set to sqlite (for which --sqlite may be
used instead), or to specify another database --db-[pgsql,
sqlite] (however see -D). Possible instructions include:
--createdb; --create; --dropall; --import [filename]; --update
[filename]; --remove [filename]; see database section below.
-e [filename/wildcard]
produces an epub document, [sisu version 2 only]
(filename.epub)
-F [--webserv=webrick]
generate examples of a (naive) cgi search form for sqlite and
pgsql. This depends on your already having used sisu to populate
an sqlite and/or pgsql database (the sqlite version scans the
output directories for existing sisu_sqlite databases, so it is
first necessary to create them before generating the search
form); see -d, -D and the database section below. If the optional
parameter --webserv=webrick is passed, the cgi examples created
will be set up to use the default port set for use by the
webrick server (otherwise the port is left blank and the system
setting used, usually 80). The samples are dumped in the present
work directory, which must be writable (with screen instructions
given that they be copied to the cgi-bin directory). -Fv (in
addition to the above) provides some information on setting up
hyperestraier for sisu.
-h [filename/wildcard]
produces html output, segmented text with table of contents
(toc.html and index.html) and the document in a single file
(scroll.html)
-I [filename/wildcard]
produces texinfo and info file, (view with pinfo).
-i [filename/wildcard]
produces man page of file, not suitable for all outputs.
-L prints license information.
-M [filename/wildcard/url]
maintenance mode: files created for processing are preserved and
their locations indicated (also see -V).
-m [filename/wildcard/url]
assumed for most other flags, creates new intermediate files for
processing (document abstraction) that is used in all subsequent
processing of other output. This step is assumed for most
processing flags. To skip it see -n
-N [filename/wildcard/url]
document digest or document content certificate ( DCC ) as md5
digest tree of the document: the digest for the document, and
digests for each object contained within the document (together
with information on software versions that produced it)
(digest.txt). -NV for verbose digest output to screen.
-n [filename/wildcard/url]
skip the creation of intermediate processing files (document
abstraction) if they already exist, this skips the equivalent of
-m which is otherwise assumed by most processing flags.
-o [filename/wildcard/url]
output basic document in opendocument file format
(opendocument.odt).
-p [filename/wildcard]
produces LaTeX pdf (portrait.pdf & landscape.pdf). Default paper
size is set in config file, or document header, or provided with
additional command line parameter, e.g. --papersize-a4 preset
sizes include: ’A4’, U.S. ’letter’ and
-q [filename/wildcard]
quiet, less output to screen.
-R [filename/wildcard]
copies sisu output files to remote host using rsync. This
requires that sisurc.yml has been provided with information on
hostname and username, and that you have your "keys" and ssh
agent in place. Note that the behavior of rsync differs depending
on whether -R is used with other flags or alone. Alone, the rsync
--delete parameter is sent, useful for cleaning the remote
directory (when -R is used together with other flags, it is
not). Also see -r
-r [filename/wildcard]
copies sisu output files to remote host using scp. This requires
that sisurc.yml has been provided with information on hostname
and username, and that you have your "keys" and ssh agent in
place. Also see -R
-S produces a sisupod, a zipped sisu directory of markup files
including sisu markup source files and the directory’s local
configuration file, images and skins. Note: this only includes
the configuration files or skins contained in ./_sisu, not those
in ~/.sisu. See also the -S [filename/wildcard] option. Note:
this option is tested only with zsh.
-S [filename/wildcard]
produces a zipped file of the prepared document specified along
with associated images, by default named sisupod.zip; they may
alternatively be named with the filename extension .ssp. This
provides a quick way of gathering the relevant parts of a sisu
document which can then for example be emailed. A sisupod
includes the sisu markup source file (along with associated
documents if a master file, or available in multilingual
versions), together with related images and skin. SiSU commands
can be run directly against a sisupod contained in a local
directory, or provided as a url on a remote site. As there is a
security issue with skins provided by other users, they are not
applied unless the flag --trust or --trusted is added to the
command instruction; it is recommended that files that are not
your own are treated as untrusted. The directory structure of
the unzipped file is understood by sisu, and sisu commands can
be run within it. Note: if you wish to send multiple files, it
quickly becomes more space efficient to zip the sisu markup
directory, rather than the individual files for sending. See
the -S option without [filename/wildcard].
-s [filename/wildcard]
copies sisu markup file to output directory.
-t [filename/wildcard (*.termsheet.rb)]
standard form document builder, preprocessing feature
-U [filename/wildcard]
prints url output list/map for the available processing flags
options and resulting files that could be requested, (can be
used to get a list of processing options in relation to a file,
together with information on the output that would be produced),
-u provides url output mapping for those flags requested for
processing. The default assumes sisu_webrick is running and
provides webrick url mappings where appropriate, but these can
be switched to file system paths in sisurc.yml
-u [filename/wildcard]
provides url mapping of output files for the flags requested for
processing, also see -U
-V on its own, provides SiSU version and environment information
(sisu --help env)
-V [filename/wildcard]
even more verbose than the -v flag. (also see -M)
-v on its own, provides SiSU version information
-v [filename/wildcard]
provides verbose output of what is being generated, where output
is placed (and error messages if any), as with -u flag provides
a url mapping of files created for each of the processing flag
requests. See also -V
-W starts ruby’s webrick webserver, pointed at the sisu output
directories; the default port is set to 8081 and can be changed
in the resource configuration files. [tip: the webrick server
requires link suffixes, so html output should be created using
the -h option rather than -H; also, note -F webrick].
-w [filename/wildcard]
produces concordance (wordmap), a rudimentary index of all the
words in a document. (Concordance files are not generated for
documents of over 260,000 words unless this limit is increased
in the file sisurc.yml)
-X [filename/wildcard]
produces XML output with deep document structure, in the nature
of dom.
-x [filename/wildcard]
produces XML output shallow structure (sax parsing).
-Y [filename/wildcard]
produces a short sitemap entry for the document, based on html
output and the sisu_manifest. --sitemaps generates/updates the
sitemap index of existing sitemaps. (Experimental, [g,y,m
announcement this week])
-y [filename/wildcard]
produces an html summary of output generated (hyperlinked to
content) and document specific metadata (sisu_manifest.html).
This step is assumed for most processing flags.
-Z [filename/wildcard]
Zap, if used with other processing flags deletes output files of
the type about to be processed, prior to processing. If -Z is
used as the lone processing related flag (or in conjunction with
a combination of -[mMvVq]), will remove the related document
output directory.
-z [filename/wildcard]
produces php (zend) [this feature is disabled for the time
being]
--harvest *.ss[tm]
makes two lists of sisu output based on the sisu markup
documents in a directory: a list of authors and their works (year
and titles), and a list by topic with titles and author. Makes
use of header metadata fields (author, title, date,
topic_register). Can be used with maintenance (-M) and remote
placement (-R) flags.
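The digest tree described for the -N flag above (one md5 digest for the document plus one for each object) can be pictured with a short sketch. This is an illustration only: it does not reproduce the layout of digest.txt, and how SiSU serialises the document before hashing is an assumption here (objects joined by empty lines). The function names are hypothetical.

```python
import hashlib


def md5_hex(text):
    """Return the md5 hex digest of a UTF-8 encoded string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_tree(objects):
    """Build a digest for the whole document and one per numbered object."""
    return {
        # Assumption: the document digest covers the objects joined by
        # empty lines; the real serialisation may differ.
        "document": md5_hex("\n\n".join(objects)),
        "objects": {n: md5_hex(obj) for n, obj in enumerate(objects, start=1)},
    }


tree = digest_tree(["First object.", "Second object."])
changed = digest_tree(["First object.", "Second object, edited."])
```

Comparing the two trees shows which object changed: the digest of object 1 is identical in both, while the digests of object 2 and of the document differ.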
3. COMMAND LINE MODIFIERS
--no-ocn
[with -h -H or -p] switches off object citation numbering.
Produce output without identifying numbers in margins of html or
LaTeX/pdf output.
--no-annotate
strips output text of editor endnotes[^*1] denoted by asterisk
or dagger/plus sign
--no-asterisk
strips output text of editor endnotes[^*2] denoted by asterisk
sign
--no-dagger
strips output text of editor endnotes[^+1] denoted by
dagger/plus sign
4. DATABASE COMMANDS
dbi - database interface
-D or --pgsql set for postgresql
-d or --sqlite default set for sqlite
-d is modifiable with --db=[database type (pgsql or sqlite)]
-Dv --createall
initial step, creates required relations (tables, indexes) in
an existing postgresql database (a database should be created
manually and given the same name as the working directory, as
requested) (rb.dbi) [ -dv --createall sqlite equivalent]. It
may be necessary to run sisu -Dv --createdb initially. NOTE: at
the present time for postgresql it may be necessary to manually
create the database. The command would be ’createdb [database
name]’ where database name would be SiSU_[present working
directory name (without path)]. Please use only alphanumerics
and underscores.
-Dv --import
[filename/wildcard] imports data specified to postgresql db
(rb.dbi) [ -dv --import sqlite equivalent]
-Dv --update
[filename/wildcard] updates/imports specified data to postgresql
db (rb.dbi) [ -dv --update sqlite equivalent]
-D --remove
[filename/wildcard] removes specified data from postgresql db
(rb.dbi) [ -d --remove sqlite equivalent]
-D --dropall
kills data and drops (postgresql or sqlite) db, tables &
indexes [ -d --dropall sqlite equivalent]
The v in e.g. -Dv is for verbose output.
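The database naming rule given under --createall (SiSU_ followed by the name of the present working directory, using only alphanumerics and underscores) can be checked before running createdb. The helper below is a hypothetical sketch, not part of SiSU; rejecting other characters outright, rather than rewriting them, is an assumption made for the example.

```python
import os
import re


def sisu_db_name(working_dir):
    """Derive the expected database name from a working directory path."""
    # Directory name without its path, as described for createdb.
    base = os.path.basename(os.path.normpath(working_dir))
    # Assumption: names with other characters are rejected, not rewritten.
    if not re.fullmatch(r"[A-Za-z0-9_]+", base):
        raise ValueError(f"use only alphanumerics and underscores: {base!r}")
    return f"SiSU_{base}"


print(sisu_db_name("/home/user/sisu_docs"))
```

The printed name is the one that would be passed to createdb when the database has to be created manually.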
5. SHORTCUTS, SHORTHAND FOR MULTIPLE FLAGS
--update [filename/wildcard]
Checks existing file output and runs the flags required to
update this output. This means that if only html and pdf output
was requested on previous runs, only the -hp files will be
applied, and only these will be generated this time, together
with the summary. This can be very convenient, if you offer
different outputs of different files, and just want to do the
same again.
-0 to -5 [filename or wildcard]
Default shorthand mappings (note that the defaults can be
changed/configured in the sisurc.yml file):
-0 -mNhwpAobxXyYv [this is the default action run when no
options are given, i.e. on ’sisu [filename]’]
-1 -mhewpy
-2 -mhewpaoy
-3 -mhewpAobxXyY
-4 -mhewpAobxXDyY --import
-5 -mhewpAobxXDyY --update
add -v for verbose mode and -c for color, e.g. sisu -2vc
[filename or wildcard]
consider -u for appended url info or -v for verbose output
5.1 COMMAND LINE WITH FLAGS - BATCH PROCESSING
In the data directory run sisu -mh filename or wildcard, e.g. "sisu -h
cisg.sst" or "sisu -h *.{sst,ssm}" to produce html versions of all
documents.
Running sisu (alone without any flags, filenames or wildcards) brings
up the interactive help, as does any sisu command that is not
recognised. Enter to escape.
6. HELP
6.1 SISU MANUAL
The most up to date information on sisu should be contained in the
sisu_manual, available at:
<http://sisudoc.org/sisu/sisu_manual/>
The manual can be generated from source, found respectively, either
within the SiSU tarball or installed locally at:
./data/doc/sisu/v2/sisu_markup_samples/sisu_manual/
/usr/share/doc/sisu/v2/sisu_markup_samples/sisu_manual/
move to the respective directory and type e.g.:
sisu sisu_manual.ssm
6.2 SISU MAN PAGES
If SiSU is installed on your system usual man commands should be
available, try:
man sisu
man sisu_markup
man sisu_commands
Most SiSU man pages are generated directly from sisu documents that
are used to prepare the sisu manual, the sources files for which are
located within the SiSU tarball at:
./data/doc/sisu/v2/sisu_markup_samples/sisu_manual/
Once installed, directory equivalent to:
/usr/share/doc/sisu/sisu_manual/
Available man pages are converted back to html using man2html:
/usr/share/doc/sisu/.html/
./data/doc/sisu/.html/
An online version of the sisu man page is available here:
* various sisu man pages <http://www.jus.uio.no/sisu/man/> [^8]
* sisu.1 <http://www.jus.uio.no/sisu/man/sisu.html> [^9]
6.3 SISU BUILT-IN INTERACTIVE HELP
This is particularly useful for getting the current sisu
setup/environment information:
sisu --help
sisu --help [subject]
sisu --help commands
sisu --help markup
sisu --help env [for feedback on the way your system is
setup with regard to sisu]
sisu -V [environment information, same as above command]
sisu (on its own provides version and some help information)
Apart from real-time information on your current configuration the
SiSU manual and man pages are likely to contain more up-to-date
information than the sisu interactive help (for example on commands and
markup).
NOTE: Running the command sisu (alone without any flags, filenames or
wildcards) brings up the interactive help, as does any sisu command
that is not recognised. Enter to escape.
6.4 HELP SOURCES
For lists of alternative help sources, see:
man page
man sisu_help_sources
man2html
/usr/share/doc/sisu/.html/sisu.html
<http://sisudoc.org/sisu/sisu_help_sources/index.html>
7. INTRODUCTION TO SISU MARKUP[^10]
7.1 SUMMARY
SiSU source documents are plaintext (UTF-8)[^11] files.
All paragraphs are separated by an empty line.
Markup is comprised of:
* at the top of a document, the document header made up of semantic
meta-data about the document and if desired additional processing
instructions (such as an instruction to automatically number headings from
a particular level down)
* followed by the prepared substantive text of which the most
important single characteristic is the markup of different heading
levels, which define the primary outline of the document structure.
Markup of substantive text includes:
* heading levels, which define document structure
* basic text attributes, italics, bold etc.
* grouped text (objects), which are to be treated differently, such
as code blocks or poems.
* footnotes/endnotes
* linked text and images
* paragraph actions, such as indent, bulleted, numbered-lists, etc.
Some interactive help on markup is available, by typing sisu and
selecting markup or sisu --help markup
To check the markup in a file:
sisu --identify [filename].sst
For a brief descriptive summary of markup history:
sisu --query-history
or for a particular version:
sisu --query-0.38
7.2 MARKUP EXAMPLES
7.2.1 ONLINE
Online markup examples are available together with the respective
outputs produced from <http://www.jus.uio.no/sisu/SiSU/examples.html>
or from <http://www.jus.uio.no/sisu/sisu_examples/>
There is of course this document, which provides a cursory overview of
sisu markup and the respective output produced:
<http://www.jus.uio.no/sisu/sisu_markup/>
Some example marked up files are available as html with syntax
highlighting for viewing: <http://www.jus.uio.no/sisu/sample/syntax>
an alternative presentation of markup syntax:
<http://www.jus.uio.no/sisu/sample/on_markup.txt>
7.2.2 INSTALLED
With SiSU installed sample skins may be found in:
/usr/share/doc/sisu/sisu_markup_samples/dfsg (or equivalent directory)
and if sisu-markup-samples is installed also under:
/usr/share/doc/sisu/sisu_markup_samples/non-free
8. MARKUP OF HEADERS
Headers contain either: semantic meta-data about a document, which can
be used by any output module of the program, or; processing
instructions.
Note: the first line of a document may include information on the
markup version used in the form of a comment. Comments are a percentage
mark at the start of a paragraph (and as the first character in a line
of text) followed by a space and the comment:
% this would be a comment
8.1 SAMPLE HEADER
This current document is loaded by a master document that has a header
similar to this one:
% SiSU master 2.0
@title: SiSU
:subtitle: Manual
@creator: :author: Amissah, Ralph
@rights: Copyright (C) Ralph Amissah 2007, License GPL 3
@classify:
:type: information
:topic_register: SiSU:manual;electronic documents:SiSU:manual
:subject: ebook, epublishing, electronic book, electronic publishing,
electronic document, electronic citation, data structure,
citation systems, search
% used_by: manual
@date: :published: 2008-05-22
:created: 2002-08-28
:issued: 2002-08-28
:available: 2002-08-28
:modified: 2010-03-03
@make: :num_top: 1
:breaks: new=C; break=1
:skin: skin_sisu_manual
:bold: /Gnu|Debian|Ruby|SiSU/
:manpage: name=sisu - documents: markup, structuring, publishing
in multiple standard formats, and search;
synopsis=sisu [-abcDdeFhIiMmNnopqRrSsTtUuVvwXxYyZz0-9] [filename/wildcard ]
. sisu [-Ddcv] [instruction]
. sisu [-CcFLSVvW]
. sisu --v2 [operations]
. sisu --v1 [operations]
@links: { SiSU Manual }http://www.jus.uio.no/sisu/sisu_manual/
{ Book Samples and Markup Examples }http://www.jus.uio.no/sisu/SiSU/examples.html
{ SiSU @ Wikipedia }http://en.wikipedia.org/wiki/SiSU
{ SiSU @ Freshmeat }http://freshmeat.net/projects/sisu/
{ SiSU @ Ruby Application Archive }http://raa.ruby-lang.org/project/sisu/
{ SiSU @ Debian }http://packages.qa.debian.org/s/sisu.html
{ SiSU Download }http://www.jus.uio.no/sisu/SiSU/download.html
{ SiSU Changelog }http://www.jus.uio.no/sisu/SiSU/changelog.html
{ SiSU help }http://www.jus.uio.no/sisu/sisu_manual/sisu_help/
{ SiSU help sources }http://www.jus.uio.no/sisu/sisu_manual/sisu_help_sources/
8.2 AVAILABLE HEADERS
Header tags appear at the beginning of a document and provide meta
information on the document (such as the Dublin Core), or information
as to how the document as a whole is to be processed. All header
instructions take either the form @headername: or 0~headername. All
Dublin Core meta tags are available
@identifier: information or instructions
where the "identifier" is a tag recognised by the program, and the
"information" or "instructions" belong to the tag/identifier specified.
Note: a header where used should only be used once; all headers apart
from @title: are optional; the @structure: header is used to describe
document structure, and can be useful to know.
This is a sample header
% SiSU 2.0 [declared file-type identifier with markup version]
@title: [title text] [this header is the only one that is mandatory]
:subtitle: [subtitle if any]
:language: English
@creator: :author: [Lastname, First names]
:illustrator: [Lastname, First names]
:translator: [Lastname, First names]
:prepared_by: [Lastname, First names]
@date: :published: [year or yyyy-mm-dd]
:created: [year or yyyy-mm-dd]
:issued: [year or yyyy-mm-dd]
:available: [year or yyyy-mm-dd]
:modified: [year or yyyy-mm-dd]
:valid: [year or yyyy-mm-dd]
:added_to_site: [year or yyyy-mm-dd]
:translated: [year or yyyy-mm-dd]
@rights: :copyright: Copyright (C) [Year and Holder]
:license: [Use License granted]
:text: [Year and Holder]
:translation: [Name, Year]
:illustrations: [Name, Year]
@classify:
:topic_register: SiSU:markup sample:book;book:novel:fantasy
:type:
:subject:
:description:
:keywords:
:abstract:
:isbn: [ISBN]
:loc: [Library of Congress classification]
:dewey: [Dewey classification]
:pg: [Project Gutenberg text number]
@links: { SiSU }http://www.jus.uio.no/sisu/
{ FSF }http://www.fsf.org
@make:
:skin: skin_name
[skins change default settings related to the appearance of documents generated]
:num_top: 1
:headings: [text to match for each level
(e.g. PART; Chapter; Section; Article;
or another: none; BOOK|FIRST|SECOND; none; CHAPTER;)]
:breaks: new=:C; break=1
:promo: sisu, ruby, sisu_search_libre, open_society
:bold: [regular expression of words/phrases to be made bold]
:italics: [regular expression of words/phrases to italicise]
@original: :language: [language]
@notes: :comment:
:prefix: [prefix is placed just after table of contents]
9. MARKUP OF SUBSTANTIVE TEXT
9.1 HEADING LEVELS
Heading levels are :A~, :B~, :C~, 1~, 2~, 3~ ... :A - :C being part /
section headings, followed by other heading levels, and 1 - 6 being
headings followed by substantive text or sub-headings. :A~ is usually
the title; :A~? is a conditional level 1 heading (used where a
stand-alone document may be imported into another).
:A~ [heading text] Top level heading [this usually has similar
content to the title @title: ] NOTE: the heading levels described
here are in 0.38 notation, see heading
:B~ [heading text] Second level heading [this is a heading level
divider]
:C~ [heading text] Third level heading [this is a heading level
divider]
1~ [heading text] Top level heading preceding substantive text of
document or sub-heading 2, the heading level that would normally be
marked 1. or 2. or 3. etc. in a document, and the level on which sisu
by default would break html output into named segments, names are
provided automatically if none are given (a number), otherwise takes
the form 1~my_filename_for_this_segment
2~ [heading text] Second level heading preceding substantive text of
document or sub-heading 3, the heading level that would normally be
marked 1.1 or 1.2 or 1.3 or 2.1 etc. in a document.
3~ [heading text] Third level heading preceding substantive text of
document, that would normally be marked 1.1.1 or 1.1.2 or 1.2.1 or
2.1.1 etc. in a document
1~filename level 1 heading,
% the primary division such as Chapter that is followed by substantive text,
% and may be further subdivided (this is the level on which by default html
% segments are made)
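The heading markers described above can be recognised with a small pattern. The sketch below is not SiSU's parser: the function name and the returned field names are hypothetical, and it assumes that a segment name attached to the marker (as in 1~my_filename_for_this_segment) consists of word characters only.

```python
import re

# Marker (:A to :C, or 1 to 6), then "~", then either "?" (conditional)
# or an optional segment name, then the heading text.
HEADING = re.compile(r"^(:[ABC]|[1-6])~(\?|\w*)\s+(.+)$")


def parse_heading(line):
    """Return heading details for a heading line, or None for other text."""
    match = HEADING.match(line)
    if match is None:
        return None
    level, tag, text = match.groups()
    return {
        "level": level.lstrip(":"),
        "conditional": tag == "?",
        "segment": tag if tag not in ("", "?") else None,
        "text": text.strip(),
    }


print(parse_heading(":A~ SiSU Manual"))
print(parse_heading("1~intro Introduction"))
print(parse_heading("an ordinary paragraph"))
```

Lines that do not start with a heading marker return None, so the same function can be used to separate headings from substantive text.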
9.2 FONT ATTRIBUTES
markup example:
normal text !{emphasis}! *{bold text}* _{underscore}_ /{italics}/
"{citation}" ^{superscript}^ ,{subscript}, +{inserted text}+
-{strikethrough}-
normal text
!{emphasis}!
*{bold text}*
_{underscore}_
/{italics}/
"{citation}"
^{superscript}^
,{subscript},
+{inserted text}+
-{strikethrough}-
resulting output:
normal text emphasis bold text underscore italics "citation"
^superscript^ [subscript] ++inserted text++ --strikethrough--
normal text
emphasis
bold text
underscore
italics
"citation"
^superscript^
[subscript]
++inserted text++
--strikethrough--
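The font attribute markers share one shape: a marker character, the text in braces, and the same marker again. The sketch below converts them with a single pattern. It is an illustration rather than SiSU's html module: the choice of html tags is an assumption, nested attributes are not handled, and only the eight markers listed in the table are covered.

```python
import re

# Assumption: this marker-to-tag mapping is chosen for the example only.
TAGS = {
    "!": "em", "*": "b", "_": "u", "/": "i",
    "^": "sup", ",": "sub", "+": "ins", "-": "del",
}

# A marker, "{", the shortest possible text, "}", then the same marker.
ATTRIBUTE = re.compile(r"([!*_/^,+\-])\{(.+?)\}\1")


def to_html(text):
    """Replace font attribute markup with html tags (non-nested only)."""
    def replace(match):
        tag = TAGS[match.group(1)]
        return f"<{tag}>{match.group(2)}</{tag}>"
    return ATTRIBUTE.sub(replace, text)


print(to_html("normal text *{bold text}* /{italics}/ normal text"))
```

The backreference in the pattern is what requires the closing marker to match the opening one, so `*{text}/` is left untouched.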
9.3 INDENTATION AND BULLETS
markup example:
ordinary paragraph
_1 indent paragraph one step
_2 indent paragraph two steps
_9 indent paragraph nine steps
resulting output:
ordinary paragraph
indent paragraph one step
indent paragraph two steps
indent paragraph nine steps
markup example:
_* bullet text
_1* bullet text, first indent
_2* bullet text, two step indent
resulting output:
* bullet text
* bullet text, first indent
* bullet text, two step indent
Numbered List (not to be confused with headings/titles, (document
structure))
markup example:
# numbered list                 numbered list 1., 2., 3., etc.
_# numbered list                numbered list indented a., b., c., d., etc.
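The paragraph prefixes in this section can be told apart by looking only at the start of a line. The sketch below, in Python, classifies a line by its prefix; the function name classify and the returned tuple layout are hypothetical and chosen for illustration, not taken from sisu.

```python
import re

# "_" followed by an optional indent digit and an optional bullet asterisk,
# then a space and the paragraph text, e.g. "_2* bullet text".
INDENT_OR_BULLET = re.compile(r"_([1-9])?(\*)? (.*)")


def classify(line):
    """Return (kind, indent, text) for the first line of a paragraph."""
    if line.startswith("# "):
        return ("numbered", 0, line[2:])
    if line.startswith("_# "):
        # The manual describes this form as the indented numbered list.
        return ("numbered", 1, line[3:])
    m = INDENT_OR_BULLET.match(line)
    if m and (m.group(1) or m.group(2)):
        kind = "bullet" if m.group(2) else "indent"
        return (kind, int(m.group(1) or 0), m.group(3))
    return ("paragraph", 0, line)
```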
9.4 FOOTNOTES / ENDNOTES
Footnotes and endnotes are not distinguished in markup. They are
automatically numbered. Depending on the output file format (html,
EPUB, odf, pdf etc.), the document output selected will have either
footnotes or endnotes.
markup example:
~{ a footnote or endnote }~
resulting output:
[^12]
markup example:
normal text~{ self contained endnote marker & endnote in one }~ continues
resulting output:
normal text[^13] continues
markup example:
normal text ~{* unnumbered asterisk footnote/endnote, insert multiple asterisks if required }~ continues
normal text ~{** another unnumbered asterisk footnote/endnote }~ continues
resulting output:
normal text [^*] continues
normal text [^**] continues
markup example:
normal text ~[* editors notes, numbered asterisk footnote/endnote series ]~ continues
normal text ~[+ editors notes, numbered asterisk footnote/endnote series ]~ continues
resulting output:
normal text [^*3] continues
normal text [^+2] continues
Alternative endnote pair notation for footnotes/endnotes:
% note the endnote marker
normal text~^ continues
^~ endnote text following the paragraph in which the marker occurs
the standard and pair notation cannot be mixed in the same document
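To illustrate the automatic numbering of the standard ~{ ... }~ notation, here is a small Python sketch that replaces each note with a marker and collects the note texts. It handles only the curly brace forms shown above (numbered notes and unnumbered asterisk notes); the name extract_notes and the start parameter are hypothetical, and this is not how sisu itself is implemented.

```python
import re

# "~{" optionally followed by asterisks, then the note text, then "}~".
NOTE = re.compile(r"~\{(\*+)?\s*(.*?)\s*\}~")


def extract_notes(text, start=1):
    """Replace notes in text with [^label] markers.

    Returns (new_text, notes) where notes is a list of (label, body).
    Plain notes are numbered from start; asterisk notes keep their asterisks.
    """
    notes = []
    counter = start - 1

    def replace(match):
        nonlocal counter
        stars, body = match.group(1), match.group(2)
        if stars:
            label = stars
        else:
            counter += 1
            label = str(counter)
        notes.append((label, body))
        return "[^%s]" % label

    return NOTE.sub(replace, text), notes
```

Keeping the note bodies in a separate list leaves the choice between footnotes and endnotes to the output stage.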
9.5 LINKS
9.5.1 NAKED URLS WITHIN TEXT, DEALING WITH URLS
Urls found within text are marked up automatically. A url within
text is automatically hyperlinked to itself and by default decorated
with angled braces, unless it is contained within a code block (in
which case it is passed as normal text), or escaped by a preceding
underscore (in which case the decoration is omitted).
markup example:
normal text http://www.jus.uio.no/sisu continues
resulting output:
normal text <http://www.jus.uio.no/sisu> continues
An escaped url without decoration
markup example:
normal text _http://www.jus.uio.no/sisu continues
deb _http://www.jus.uio.no/sisu/archive unstable main non-free
resulting output:
normal text http://www.jus.uio.no/sisu continues
deb http://www.jus.uio.no/sisu/archive unstable main non-free
where a code block is used there is neither decoration nor
hyperlinking, code blocks are discussed later in this document
resulting output:
deb http://www.jus.uio.no/sisu/archive unstable main non-free
deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
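The decoration rule described in the prose of this section (angled braces around a naked url, none when the url is escaped by a preceding underscore) can be sketched in Python as follows. This is an illustration only: the name decorate_urls is hypothetical, the url pattern is a deliberate simplification, and hyperlinking and code blocks are not modelled.

```python
import re

# An optional escaping underscore followed by an http(s) url that runs
# up to the next whitespace or angle bracket (a simplifying assumption).
URL = re.compile(r"(_?)(https?://[^\s<>]+)")


def decorate_urls(line):
    """Wrap naked urls in angled braces; drop the underscore of escaped ones."""
    def replace(match):
        escaped, url = match.group(1), match.group(2)
        return url if escaped else "<%s>" % url

    return URL.sub(replace, line)
```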
9.5.2 LINKING TEXT
To link text or an image to a url the markup is as follows
markup example:
about { SiSU }http://url.org markup
resulting output:
about SiSU <http://www.jus.uio.no/sisu/> markup
A shortcut notation is available so the url link may also be provided
automatically as a footnote
markup example:
about {~^ SiSU }http://url.org markup
resulting output:
about SiSU <http://www.jus.uio.no/sisu/> [^14] markup
9.5.3 LINKING IMAGES
markup example:
{ tux.png 64x80 }image
% various url linked images
{tux.png 64x80
{GnuDebianLinuxRubyBetterWay.png 100x101
{~^ ruby_logo.png
resulting output:
[ tux.png ]
tux.png 64x80
[ ruby_logo (png missing) ] [^15]
GnuDebianLinuxRubyBetterWay.png 100x101 and Ruby
linked url footnote shortcut
{~^ [text to link] }http://url.org
% maps to: { [text to link] }http://url.org ~{ http://url.org }~
% which produces hyper-linked text within a document/paragraph,
with an endnote providing the url for the text location used in the hyperlink
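The mapping described in the comment lines above is a plain textual expansion, which the following Python sketch reproduces. The name expand_link_footnote is hypothetical, and the pattern assumes the url runs to the next whitespace.

```python
import re

# "{~^ text }url" : link text in braces with the "~^" shortcut, then the url.
SHORTCUT = re.compile(r"\{~\^\s*(.+?)\s*\}(\S+)")


def expand_link_footnote(text):
    """Expand {~^ text }url into { text }url ~{ url }~ ."""
    return SHORTCUT.sub(
        lambda m: "{ %s }%s ~{ %s }~" % (m.group(1), m.group(2), m.group(2)),
        text,
    )
```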
text marker *~name
note at a heading level the same is automatically achieved by
providing names to headings 1, 2 and 3 i.e. 2~[name] and 3~[name] or in
the case of auto-heading numbering, without further intervention.
9.6 GROUPED TEXT
9.6.1 TABLES
Tables may be prepared in either of two forms
markup example:
table{ c3; 40; 30; 30;
This is a table
this would become column two of row one
column three of row one is here
And here begins another row
column two of row two
column three of row two, and so on
}table
resulting output:
[table omitted, see other document formats]
a second form may be easier to work with in cases where there is not
much information in each column
markup example: [^16]
!_ Table 3.1: Contributors to Wikipedia, January 2001 - June 2005
{table~h 24; 12; 12; 12; 12; 12; 12;}
|Jan. 2001|Jan. 2002|Jan. 2003|Jan. 2004|July 2004|June 2006
Contributors* | 10| 472| 2,188| 9,653| 25,011| 48,721
Active contributors** | 9| 212| 846| 3,228| 8,442| 16,945
Very active contributors*** | 0| 31| 190| 692| 1,639| 3,016
No. of English language articles| 25| 16,000| 101,000| 190,000| 320,000| 630,000
No. of articles, all languages | 25| 19,000| 138,000| 490,000| 862,000|1,600,000
\* Contributed at least ten times; \** at least 5 times in last month; \*** more than 100 times in last month.
resulting output:
Table 3.1: Contributors to Wikipedia, January 2001 - June 2005
[table omitted, see other document formats]
* Contributed at least ten times; ** at least 5 times in last month;
*** more than 100 times in last month.
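As an illustration of the second table form, the Python sketch below splits the opening {table~h ...} line and the pipe separated rows into lists. It rests on two assumptions that this section does not state: that the numbers are column widths, and that ~h marks the first row as a heading row. The name parse_table is hypothetical.

```python
def parse_table(lines):
    """Parse the second table form.

    lines[0] is the "{table~h 24; 12; ...}" line, the rest are rows.
    Returns (has_heading_row, widths, rows).
    """
    # "{table~h 24; 12;}" -> ["table~h", "24; 12;"]
    spec = lines[0].strip().strip("{}").split(None, 1)
    has_heading_row = spec[0].endswith("~h")  # assumed meaning of "~h"
    widths = [int(w) for w in spec[1].split(";") if w.strip()]
    rows = [[cell.strip() for cell in row.split("|")] for row in lines[1:]]
    return has_heading_row, widths, rows
```

Note that a heading row starting with a pipe, as in the example, yields an empty first cell, which lines up with the row label column of the rows that follow.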
9.6.2 POEM
basic markup:
poem{
Your poem here
}poem
Each verse in a poem is given a separate object number.
markup example:
poem{
´Fury said to a
mouse, That he
met in the
house,
both go to
law: I will
prosecute
YOU. --Come,
I´ll take no
denial; We
must have a
trial: For
really this
morning I´ve
nothing
to do.
Said the
mouse to the
cur,
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath.
judge, I´ll
be jury,
Said
cunning
old Fury:
try the
whole
cause,
and
condemn
you
to
death.
}poem
resulting output:
´Fury said to a
mouse, That he
met in the
house,
both go to
law: I will
prosecute
YOU. --Come,
I´ll take no
denial; We
must have a
trial: For
really this
morning I´ve
nothing
to do.
Said the
mouse to the
cur,
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath.
judge, I´ll
be jury,
Said
cunning
old Fury:
try the
whole
cause,
and
condemn
you
to
death.
9.6.3 GROUP
basic markup:
group{
Your grouped text here
}group
A group is treated as an object and given a single object number.
markup example:
group{
´Fury said to a
mouse, That he
met in the
house,
both go to
law: I will
prosecute
YOU. --Come,
I´ll take no
denial; We
must have a
trial: For
really this
morning I´ve
nothing
to do.
Said the
mouse to the
cur,
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath.
judge, I´ll
be jury,
Said
cunning
old Fury:
try the
whole
cause,
and
condemn
you
to
death.
}group
resulting output:
´Fury said to a
mouse, That he
met in the
house,
both go to
law: I will
prosecute
YOU. --Come,
I´ll take no
denial; We
must have a
trial: For
really this
morning I´ve
nothing
to do.
Said the
mouse to the
cur,
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath.
judge, I´ll
be jury,
Said
cunning
old Fury:
try the
whole
cause,
and
condemn
you
to
death.
9.6.4 CODE
Code tags are used to escape regular sisu markup, and have been used
extensively within this document to provide examples of SiSU markup.
You cannot however use code tags to escape code tags. They are however
used in the same way as group or poem tags.
A code-block is treated as an object and given a single object number.
[an option to number each line of code may be considered at
some later time]
for comparison, the same text using code tags instead of poem tags, resulting output:
´Fury said to a
mouse, That he
met in the
house,
both go to
law: I will
prosecute
YOU. --Come,
I´ll take no
denial; We
must have a
trial: For
really this
morning I´ve
nothing
to do.
Said the
mouse to the
cur,
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath.
judge, I´ll
be jury,
Said
cunning
old Fury:
try the
whole
cause,
and
condemn
you
to
death.
9.7 BOOK INDEX
To make an index, append to the paragraph the book index term that
relates to it, using an equal sign and curly braces.
Currently two levels are provided, a main term and if needed a
sub-term. Sub-terms are separated from the main term by a colon.
Paragraph containing main term and sub-term.
={Main term:sub-term}
The index syntax starts on a new line, but there should not be an
empty line between paragraph and index markup.
The structure of the resulting index would be:
Main term, 1
sub-term, 1
Several terms may relate to a paragraph, they are separated by a
semicolon. If the term refers to more than one paragraph, indicate the
number of paragraphs.
Paragraph containing main term, second term and sub-term.
={first term; second term: sub-term}
The structure of the resulting index would be:
First term, 1,
Second term, 1,
sub-term, 1
If multiple sub-terms appear under one paragraph, they are separated
under the main term heading from each other by a pipe symbol.
Paragraph containing main term, second term and sub-term.
={Main term:sub-term+1|second sub-term}
A paragraph that continues discussion of the first sub-term
The plus one in the example provided indicates the first sub-term
spans one additional paragraph. The logical structure of the resulting
index would be:
Main term, 1,
sub-term, 1-2,
second sub-term, 1,
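The separators described in this section (semicolon between terms, colon between main term and sub-terms, pipe between sub-terms, and a trailing +N for additional paragraphs) can be illustrated with a short Python parser. The names parse_index and split_span and the tuple layout returned are hypothetical, and the sketch does not attempt to build the final index.

```python
import re


def split_span(term):
    """Split "sub-term+1" into ("sub-term", 1); the count defaults to 0."""
    m = re.match(r"^(.*?)(?:\+(\d+))?$", term.strip())
    return m.group(1).strip(), int(m.group(2) or 0)


def parse_index(markup):
    """Parse ={...} book index markup.

    Returns a list of (main_term, extra_paragraphs, sub_terms) where
    sub_terms is a list of (sub_term, extra_paragraphs).
    """
    inner = markup.strip()
    if not (inner.startswith("={") and inner.endswith("}")):
        raise ValueError("not book index markup: %r" % markup)
    entries = []
    for term in inner[2:-1].split(";"):
        main, _, subs = term.partition(":")
        name, extra = split_span(main)
        sub_terms = [split_span(s) for s in subs.split("|")] if subs else []
        entries.append((name, extra, sub_terms))
    return entries
```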
10. COMPOSITE DOCUMENTS MARKUP
It is possible to build a document by creating a master document that
requires other documents. The documents required may be complete
documents that could be generated independently, or they could be
markup snippets, prepared so as to be easily available to be placed
within another text. If the calling document is a master document
(built from other documents), it should be named with the suffix .ssm
Within this document you would provide information on the other
documents that should be included within the text. These may be other
documents that would be processed in a regular way (.sst, regular
markup files), or markup bits prepared only for inclusion within a
master document (.ssi, insert/information files). A secondary file of
the composite document is built prior to processing with the same
prefix and the suffix ._sst
basic markup for importing a document into a master document
<< filename1.sst
<< filename2.ssi
The form described above should be relied on. Within the Vim editor it
results in the text thus linked becoming hyperlinked to the document it
is calling in, which is convenient for editing. Alternative markup for
the importation of documents has been under consideration, and has
occasionally been supported:
<< filename.ssi
<<{filename.ssi}
% using textlink alternatives
<< |filename.ssi|@|^|
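To illustrate the basic import form ("<< filename" on a line of its own) and the naming of the secondary composite file, here is a Python sketch. The names included_files and composite_filename are hypothetical; the sketch only lists what a master document asks for and does not assemble the composite document.

```python
import os


def included_files(master_text):
    """Return the filenames named on "<< filename" lines, in order."""
    names = []
    for line in master_text.splitlines():
        if line.startswith("<< "):
            names.append(line[3:].strip())
    return names


def composite_filename(master_filename):
    """Return the secondary file name: same prefix, suffix ._sst"""
    prefix, extension = os.path.splitext(master_filename)
    if extension != ".ssm":
        raise ValueError("expected a .ssm master document: %r" % master_filename)
    return prefix + "._sst"
```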
MARKUP SYNTAX HISTORY
11. NOTES RELATED TO FILES-TYPES AND MARKUP SYNTAX
0.38 is substantially current; the deprecated 0.16 is supported, though
file names were changed at 0.37
* sisu --query=[sisu version [0.38] or ´history]
provides a short history of changes to SiSU markup
0.57 (2007w34/4) SiSU 0.57 is the same as 0.42 with the introduction
of a shortcut to use the headers @title and @creator in the first
heading [expanded using the contents of the headers @title: and
@author:]
:A~ @title by @author
0.52 (2007w14/6) declared document type identifier at start of
text/document:
SiSU 0.52
or, backward compatible using the comment marker:
% SiSU 0.38
variations include ´ SiSU (text|master|insert) [version]´ and
´sisu-[version]´
0.51 (2007w13/6) skins changed (simplified), markup unchanged
0.42 (2006w27/4) * (asterisk) type endnotes, used e.g. in relation to
author
SiSU 0.42 is the same as 0.38 with the introduction of some additional
endnote types,
Introduces some variations on endnotes, in particular the use of the
asterisk
~{* for example for describing an author }~ and ~{** for describing a second author }~
* for example for describing an author
** for describing a second author
and
~[* my note ]~ or ~[+ another note ]~
which numerically increments an asterisk and plus respectively
*1 my note +1 another note
0.38 (2006w15/7) introduced new/alternative notation for headers, e.g.
@title: (instead of 0~title), and accompanying document structure
markup, :A,:B,:C,1,2,3 (maps to previous 1,2,3,4,5,6)
SiSU 0.38 introduced alternative experimental header and
heading/structure markers,
@headername: and headers :A~ :B~ :C~ 1~ 2~ 3~
as the equivalent of:
0~headername and headers 1~ 2~ 3~ 4~ 5~ 6~
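The equivalence between the two notations is a one to one mapping of markers, which the following Python sketch spells out by rewriting a 0.38 style line in 0.16 style. The names LEVELS_038_TO_016 and to_016 are hypothetical, and the sketch looks only at the start of a line.

```python
import re

# 0.38 heading markers and their 0.16 equivalents, as listed above.
LEVELS_038_TO_016 = {":A": "1", ":B": "2", ":C": "3", "1": "4", "2": "5", "3": "6"}

HEADER_038 = re.compile(r"@(\w+):\s*(.*)")


def to_016(line):
    """Rewrite a 0.38 header or heading line in 0.16 notation."""
    header = HEADER_038.match(line)
    if header:
        # "@title: text" becomes "0~title text"
        return "0~%s %s" % (header.group(1), header.group(2))
    marker, tilde, rest = line.partition("~")
    if tilde and marker in LEVELS_038_TO_016:
        return LEVELS_038_TO_016[marker] + "~" + rest
    return line
```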
The internal document markup of SiSU 0.16 remains valid and standard
Though note that SiSU 0.37 introduced a new file naming convention
SiSU has in effect two sets of levels to be considered. Using 0.38
notation these are: A-C headings/levels, which precede ordinary
paragraphs (pre-substantive text); and 1-3 headings/levels, which are
followed by ordinary text. This may be conceptualised as levels A,B,C,
1,2,3, and using such letter number notation, in effect: A must exist,
optional B and C may follow in sequence (not strict); 1 must exist,
optional 2 and 3 may follow in sequence. That is, there are two
independent heading level sequences, A,B,C and 1,2,3 (using the 0.16
standard notation, 1,2,3 and 4,5,6). On the positive side, the 0.38
A,B,C,1,2,3 alternative makes explicit an aspect of structuring
documents in SiSU that is not otherwise obvious to the newcomer (though
it appears more complicated, it is more in your face and likely to be
understood fairly quickly); the substantive text follows levels 1,2,3
and it is ´nice´ to do most work in those levels
0.37 (2006w09/7) introduced new file naming convention, .sst (text),
.ssm (master), .ssi (insert), markup syntax unchanged
SiSU 0.37 introduced a new file naming convention, using the file
extensions .sst, .ssm and .ssi to replace .s1 .s2 .s3, .r1 .r2 .r3 and
.si respectively;
this is captured by the following file ´rename´ instruction:
rename 's/\.s[123]$/.sst/' *.s{1,2,3}
rename 's/\.r[123]$/.ssm/' *.r{1,2,3}
rename 's/\.si$/.ssi/' *.si
The internal document markup remains unchanged, from SiSU 0.16
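Where the rename utility is not available, the same three substitutions can be expressed in Python. This is a sketch under that assumption; the names RENAMES and new_name are hypothetical, and the function only computes the new name without touching any file.

```python
import re

# The three substitutions from the rename instruction above.
RENAMES = [
    (re.compile(r"\.s[123]$"), ".sst"),
    (re.compile(r"\.r[123]$"), ".ssm"),
    (re.compile(r"\.si$"), ".ssi"),
]


def new_name(filename):
    """Return the post-0.37 name for filename, or filename unchanged."""
    for pattern, replacement in RENAMES:
        if pattern.search(filename):
            return pattern.sub(replacement, filename)
    return filename
```

To rename files in place, the result could be passed to os.rename(old, new_name(old)) for each file in a directory, after checking that the target name does not already exist.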
0.35 (2005w52/3) sisupod, zipped content file introduced
0.23 (2005w36/2) utf-8 for markup file
0.22 (2005w35/3) image dimensions may be omitted if rmagick is
available to be relied upon
0.20.4 (2005w33/4) header 0~links
0.16 (2005w25/2) substantial changes introduced to make markup
cleaner, header 0~title type, and headings [1-6]~ introduced, also
percentage sign (%) at start of a text line as comment marker
SiSU 0.16 (0.15 development branch) introduced the use of
the header 0~ and headings/structure 1~ 2~ 3~ 4~ 5~ 6~
in place of the 0.1 header, heading/structure notation
SiSU 0.1 headers and headings structure represented by header 0{~ and
headings/structure 1{ 2{ 3{ 4{~ 5{ 6{
12. SISU FILETYPES
SiSU has plaintext and binary filetypes, and can process either type
of document.
12.1 .SST .SSM .SSI MARKED UP PLAIN TEXT
SiSU documents are prepared as plain-text (utf-8) files with SiSU
markup. They may make reference to and contain images (for example),
which are stored in the directory beneath them _sisu/image. SiSU
plaintext markup files are of three types that may be distinguished by
the file extension used: regular text .sst; master documents .ssm,
composite documents that incorporate other text, which can be any
regular text or text insert; and inserts .ssi, the contents of which
are like regular text except that they are not processed on their own.
SiSU processing can be done directly against sisu documents, which
may be located locally or on a remote server for which a url is
provided.
SiSU source markup can be shared with the command:
sisu -s [filename]
12.1.1 SISU TEXT - REGULAR FILES (.SST)
The most common form of document in SiSU, see the section on SiSU
markup.
<http://www.jus.uio.no/sisu/sisu_markup>
<http://www.jus.uio.no/sisu/sisu_manual>
12.1.2 SISU MASTER FILES (.SSM)
Composite documents which incorporate other SiSU documents which may
be either regular SiSU text .sst which may be generated independently,
or inserts prepared solely for the purpose of being incorporated into
one or more master documents.
The mechanism by which master files incorporate other documents is
described as one of the headings under SiSU markup in the SiSU
manual.
Note: Master documents may be prepared in a similar way to regular
documents, and processing will occur normally if a .sst file is renamed
.ssm without requiring any other documents; the .ssm marker flags that
the document may contain other documents.
Note: a secondary file of the composite document is built prior to
processing with the same prefix and the suffix ._sst [^17]
<http://www.jus.uio.no/sisu/sisu_markup>
<http://www.jus.uio.no/sisu/sisu_manual>
12.1.3 SISU INSERT FILES (.SSI)
Inserts are documents prepared solely for the purpose of being
incorporated into one or more master documents. They resemble regular
SiSU text files except they are ignored by the SiSU processor. Making a
file a .ssi file is a quick and convenient way of flagging that it is
not intended that the file should be processed on its own.
12.2 SISUPOD, ZIPPED BINARY CONTAINER (SISUPOD.ZIP, .SSP)
A sisupod is a zipped SiSU text file or set of SiSU text files and any
associated images that they contain (this will be extended to include
sound and multimedia-files)
SiSU plaintext files rely on a recognised directory structure to find
contents such as images associated with documents; for example, all
images for all documents contained in a directory are located in the
sub-directory _sisu/image. Without the ability to create a sisupod it
can be inconvenient to manually identify all other files associated
with a document. A sisupod automatically bundles all associated files
with the document that is turned into a pod.
The structure of the sisupod is such that it may for example contain a
single document and its associated images; a master document and its
associated documents and anything else; or the zipped contents of a
whole directory of prepared SiSU documents.
The command to create a sisupod is:
sisu -S [filename]
Alternatively, make a pod of the contents of a whole directory:
sisu -S
SiSU processing can be done directly against a sisupod; which may be
located locally or on a remote server for which a url is provided.
<http://www.jus.uio.no/sisu/sisu_commands>
<http://www.jus.uio.no/sisu/sisu_manual>
13. EXPERIMENTAL ALTERNATIVE INPUT REPRESENTATIONS
13.1 ALTERNATIVE XML
SiSU offers alternative XML input representations of documents as a
proof of concept, experimental feature. They are, however, incomplete
and not strictly maintained, and should be handled with care.
convert from sst to simple xml representations (sax, dom and node):
sisu --to-sax [filename/wildcard] or sisu --to-sxs
[filename/wildcard]
sisu --to-dom [filename/wildcard] or sisu --to-sxd
[filename/wildcard]
sisu --to-node [filename/wildcard] or sisu --to-sxn
[filename/wildcard]
convert to sst from any sisu xml representation (sax, dom and node):
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
13.1.1 XML SAX REPRESENTATION
To convert from sst to simple xml (sax) representation:
sisu --to-sax [filename/wildcard] or sisu --to-sxs
[filename/wildcard]
To convert from any sisu xml representation back to sst
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
13.1.2 XML DOM REPRESENTATION
To convert from sst to simple xml (dom) representation:
sisu --to-dom [filename/wildcard] or sisu --to-sxd
[filename/wildcard]
To convert from any sisu xml representation back to sst
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
13.1.3 XML NODE REPRESENTATION
To convert from sst to simple xml (node) representation:
sisu --to-node [filename/wildcard] or sisu --to-sxn
[filename/wildcard]
To convert from any sisu xml representation back to sst
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,sxn.xml]]
14. CONFIGURATION
14.1 DETERMINING THE CURRENT CONFIGURATION
Information on the current configuration of SiSU should be available
with the help command:
sisu -v
which is an alias for:
sisu --help env
Either of these should be executed from within a directory that
contains sisu markup source documents.
14.2 CONFIGURATION FILES (CONFIG.YML)
SiSU configuration parameters are adjusted in the configuration file,
which can be used to override the defaults set. This includes such
things as which directory interim processing should be done in and
where the generated output should be placed.
The SiSU configuration file is a yaml file, which means indentation is
significant.
SiSU resource configuration is determined by looking at the following
files if they exist:
./_sisu/sisurc.yml
~/.sisu/sisurc.yml
/etc/sisu/sisurc.yml
The search is in the order listed, and the first one found is used.
In the absence of instructions in any of these it falls back to the
internal program defaults.
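The search order just described (first file found wins, otherwise fall back to defaults) can be sketched in Python as below. The names CANDIDATES and find_sisurc are hypothetical, and the exists parameter is there only so the lookup can be exercised without touching the filesystem.

```python
import os

# The three locations listed above, in search order.
CANDIDATES = [
    "./_sisu/sisurc.yml",
    "~/.sisu/sisurc.yml",
    "/etc/sisu/sisurc.yml",
]


def find_sisurc(candidates=CANDIDATES, exists=os.path.isfile):
    """Return the first existing configuration file, or None.

    None stands for "use the internal program defaults".
    """
    for path in candidates:
        expanded = os.path.expanduser(path)  # resolves the leading "~"
        if exists(expanded):
            return expanded
    return None
```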
Configuration determines the output and processing directories and the
database access details.
If SiSU is installed a sample sisurc.yml may be found in
/etc/sisu/sisurc.yml
15. SKINS
Skins modify the default appearance of document output on a document,
directory, or site wide basis. Skins are looked for in the following
locations:
./_sisu/skin
~/.sisu/skin
/etc/sisu/skin
Within the skin directory are the following default sub-directories
for document skins:
./skin/doc
./skin/dir
./skin/site
A skin is placed in the appropriate directory and the file named
skin_[name].rb
The skin itself is a ruby file which modifies the default appearances
set in the program.
15.1 DOCUMENT SKIN
Documents take on a document skin, if the header of the document
specifies a skin to be used.
@skin: skin_united_nations
15.2 DIRECTORY SKIN
A directory may be mapped on to a particular skin, so all documents
within that directory take on a particular appearance. If a skin exists
in the skin/dir with the same name as the document directory, it will
automatically be used for each of the documents in that directory,
(except where a document specifies the use of another skin, in the
skin/doc directory).
A personal habit is to place all skins within the doc directory, and
to make symbolic links to them from the site or dir directories as
required.
15.3 SITE SKIN
A site skin modifies the program default skin.
15.4 SAMPLE SKINS
With SiSU installed sample skins may be found in:
/etc/sisu/skin/doc and
/usr/share/doc/sisu/v2/sisu_markup_samples/samples/_sisu/skin/doc
(or equivalent directory) and if sisu-markup-samples is installed also
under:
/usr/share/doc/sisu-markup-samples/v2/samples/_sisu/skin/doc
Samples of list.yml and promo.yml (which are used to create the right
column list) may be found in:
/usr/share/doc/sisu/sisu_markup_samples/dfsg/_sisu/skin/yml (or
equivalent directory)
16. CSS - CASCADING STYLE SHEETS (FOR HTML, XHTML AND XML)
CSS files to modify the appearance of SiSU html, XHTML or XML may be
placed in the configuration directory: ./_sisu/css ; ~/.sisu/css or;
/etc/sisu/css and these will be copied to the output directories with
the command sisu -CC.
The basic CSS file for html output is html.css, placing a file of that
name in directory _sisu/css or equivalent will result in the default
file of that name being overwritten.
HTML: html.css
XML DOM: dom.css
XML SAX: sax.css
XHTML: xhtml.css
The default homepage may use homepage.css or html.css
Under consideration is to permit the placement of a CSS file with a
different name in directory _sisu/css directory or equivalent, and
change the default CSS file that is looked for in a skin.[^18]
17. ORGANISING CONTENT
17.1 DIRECTORY STRUCTURE AND MAPPING
The output directory root can be set in the sisurc.yml file. Under the
root, subdirectories are made for each directory in which a document
set resides. If you have a directory named poems or conventions, that
directory will be created under the output directory root and the
output for all documents contained in the directory of a particular
name will be generated to subdirectories beneath that directory (poems
or conventions). A document will be placed in a subdirectory of the
same name as the document with the filetype identifier stripped (.sst
.ssm)
The last part of a directory path, representing the sub-directory in
which a document set resides, is the directory name that will be used
for the output directory. This has implications for the organisation of
document collections as it could make sense to place documents of a
particular subject, or type within a directory identifying them. This
grouping as suggested could be by subject (sales_law,
english_literature); or just as conveniently by some other
classification (X University). The mapping means it is also possible to
place in the same output directory documents that are for
organisational purposes kept separately, for example documents on a
given subject of two different institutions may be kept in two
different directories of the same name, under a directory named after
each institution, and these would be output to the same output
directory. Skins could be associated with each institution on a
directory basis and resulting documents will take on the appropriate
different appearance.
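The mapping from source location to output location described in this section can be illustrated with a few lines of Python. The name output_dir and the example paths are hypothetical; the sketch assumes POSIX style paths and uses only the last directory component and the file name.

```python
import posixpath


def output_dir(output_root, source_path):
    """Return the output directory for one source document.

    Uses the last directory of the source path and the document name
    with its .sst or .ssm filetype identifier stripped.
    """
    directory, filename = posixpath.split(source_path)
    stem, extension = posixpath.splitext(filename)
    if extension not in (".sst", ".ssm"):
        raise ValueError("expected a .sst or .ssm file: %r" % source_path)
    return posixpath.join(output_root, posixpath.basename(directory), stem)
```

Because only the last directory component is used, documents kept under two institutions in directories of the same name end up under the same output directory, as the text above notes.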
18. HOMEPAGES
SiSU is about the ability to auto-generate documents. Home pages are
regarded as custom built items, and are not created by SiSU. SiSU has a
default home page, which will not be appropriate for use with other
sites, and the means to provide your own home page instead in one of
two ways as part of a site´s configuration, these being:
1. through placing your home page and other custom built documents in
the subdirectory _sisu/home/ (this probably being the easier and more
convenient option)
2. through providing what you want as the home page in a skin.
Document sets are contained in directories, usually organised by site
or subject. Each directory can/should have its own homepage. See the
section on directory structure and organisation of content.
18.1 HOME PAGE AND OTHER CUSTOM BUILT PAGES IN A SUB-DIRECTORY
Custom built pages, including the home page index.html, may be placed
within the configuration directory _sisu/home/ in any of the locations
that are searched for the configuration directory, namely ./_sisu ;
~/.sisu ; /etc/sisu. From there they are copied to the root of the
output directory with the command:
sisu -CC
18.2 HOME PAGE WITHIN A SKIN
Skins are described in a separate section, but basically are files
written in the programming language Ruby that may be provided to change
the defaults that are provided with sisu with respect to individual
documents, a directory´s contents or a site.
If you wish to provide a homepage within a skin the skin should be in
the directory _sisu/skin/dir and have the name of the directory for
which it is to become the home page. Documents in the directory
commercial_law would have the homepage modified in
skin_commercial_law.rb; or the directory poems in skin_poems.rb
class Home
  def homepage
    # place the html content of your homepage here, this will become index.html
    <<HOME
<html>
<head></head>
<body>
<p>this is my new homepage.</p>
</body>
</html>
HOME
  end
end
19. MARKUP AND OUTPUT EXAMPLES
19.1 MARKUP EXAMPLES
Current markup examples and document output samples are provided at
<http://www.jus.uio.no/sisu/SiSU/examples.html>
Some markup with syntax highlighting may be found under
<http://www.jus.uio.no/sisu/sample/syntax> but is not as up to date.
For some documents hardly any markup is required at all, other than a
header and an indication of the levels to be taken into account by the
program in generating its output.
20. SISU SEARCH - INTRODUCTION
SiSU output can easily and conveniently be indexed by a number of
standalone indexing tools, such as Lucene or Hyperestraier.
Because the document structure of sites created is clearly defined,
and the text object citation system is available hypothetically at
least, for all forms of output, it is possible to search the sql
database, and either read results from that database, or just as simply
map the results to the html output, which has richer text markup.
In addition to this SiSU has the ability to populate a relational sql
type database with documents at an object level, with object numbers
that are shared across different output types, which makes them
searchable with that degree of granularity. Basically, your match
criteria are met by these documents and at these locations within each
document, which can be viewed within the database directly or in
various output formats.
21. SQL
21.1 POPULATING SQL TYPE DATABASES
SiSU feeds sisu marked up documents into sql type databases,
PostgreSQL[^19] and/or SQLite[^20], together with information related
to document structure.
This is one of the more interesting output forms, as all the
structural data of the documents are retained (though can be ignored by
the user of the database should they so choose). All site
texts/documents are (currently) streamed to four tables:
* one containing semantic (and other) headers, including, title,
author, subject, (the Dublin Core...);
* another the substantive texts by individual "paragraph" (or object) -
along with structural information, each paragraph being identifiable
by its paragraph number (if it has one which almost all of them do),
and the substantive text of each paragraph quite naturally being
searchable (both in formatted and clean text versions for searching);
and
* a third containing endnotes cross-referenced back to the paragraph
from which they are referenced (both in formatted and clean text
versions for searching).
* a fourth table with a one to one relation with the headers table
contains full text versions of output, eg. pdf, html, xml, and ascii.
There is of course the possibility to add further structures.
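To make the four table layout more concrete, the Python sketch below builds a toy version of it in an in-memory SQLite database and runs an object level search. All table names, column names and the sample rows are hypothetical and invented for illustration; they are not the schema that sisu actually creates.

```python
import sqlite3

# Hypothetical schema loosely following the four tables described above.
SCHEMA = """
CREATE TABLE metadata (
    doc_id INTEGER PRIMARY KEY, title TEXT, author TEXT, subject TEXT);
CREATE TABLE objects (
    doc_id INTEGER REFERENCES metadata(doc_id),
    ocn INTEGER,      -- object (paragraph) number
    body TEXT,        -- text with markup
    clean TEXT);      -- stripped text, for searching
CREATE TABLE endnotes (
    doc_id INTEGER REFERENCES metadata(doc_id),
    ocn INTEGER,      -- paragraph the note is referenced from
    body TEXT, clean TEXT);
CREATE TABLE fulltext_outputs (
    doc_id INTEGER PRIMARY KEY REFERENCES metadata(doc_id),
    html TEXT, plaintext TEXT);
"""


def search(conn, word):
    """Return (title, object number) for each object containing word."""
    rows = conn.execute(
        "SELECT m.title, o.ocn FROM objects o JOIN metadata m USING (doc_id) "
        "WHERE o.clean LIKE ? ORDER BY m.title, o.ocn",
        ("%" + word + "%",),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO metadata VALUES (1, 'Sample document', 'A. N. Author', 'sample')")
conn.executemany(
    "INSERT INTO objects VALUES (1, ?, ?, ?)",
    [
        (1, "*{Fury}* said to a mouse", "Fury said to a mouse"),
        (2, "I will prosecute /{you}/", "I will prosecute you"),
    ],
)
```

A search returns document and object number pairs, which is the information needed to point at the same location in any output format that shares the numbering.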
At this level SiSU loads a relational database with documents chunked
into objects, their smallest logical structurally constituent parts, as
text objects, with their object citation number and all other
structural information needed to construct the document. Text is stored
(at this text object level) with and without elementary markup tagging,
the stripped version being so as to facilitate ease of searching.
Being able to search a relational database at an object level with the
SiSU citation system is an effective way of locating content generated
by SiSU. Because text objects are stored with their object numbers, and
all versions of the document have the same numbering, complex searches
can be tailored to return just the locations of the search results
relevant for all available output formats, with live links to the
precise locations in the database or in html/xml documents; or, the
structural information provided makes it possible to search the full
contents of the database and have the headings in which search content
appears returned, or to search only headings etc. (as the Dublin Core
is incorporated it is easy to make use of that as well).
22. POSTGRESQL
22.1 NAME
SiSU - Structured information, Serialized Units - a document
publishing system, postgresql dependency package
22.2 DESCRIPTION
Information related to using postgresql with sisu (and related to the
sisu_postgresql dependency package, which is a dummy package to install
dependencies needed for SiSU to populate a postgresql database, this
being part of SiSU - man sisu).
22.3 SYNOPSIS
sisu -D [instruction] [filename/wildcard if required]
sisu -D --pg --[instruction] [filename/wildcard if required]
22.4 COMMANDS
Mappings to two databases are provided by default, postgresql and
sqlite. The same commands are used within sisu to construct and
populate databases; however, -d (lowercase) denotes sqlite and -D
(uppercase) denotes postgresql. Alternatively --sqlite or --pgsql may
be used.
-D or --pgsql may be used interchangeably.
22.4.1 CREATE AND DESTROY DATABASE
--pgsql --createall
initial step, creates required relations (tables, indexes) in
existing (postgresql) database (a database should be created
manually and given the same name as working directory, as
requested) (rb.dbi)
sisu -D --createdb
creates database where no database existed before
sisu -D --create
creates database tables where no database tables existed before
sisu -D --dropall
destroys database (including all its content)! kills data and
drops tables, indexes and database associated with a given
directory (and directories of the same name).
sisu -D --recreate
destroys existing database and builds a new empty database
structure
22.4.2 IMPORT AND REMOVE DOCUMENTS
sisu -D --import -v [filename/wildcard]
populates database with the contents of the file. Imports
document(s) specified to a postgresql database (at an object
level).
sisu -D --update -v [filename/wildcard]
updates file contents in database
sisu -D --remove -v [filename/wildcard]
removes specified document from postgresql database.
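The commands in this section can be assembled by a small wrapper when they are used repeatedly. What follows is a minimal sketch only: the function name sisu_db_cmd is hypothetical and not part of sisu, it prints the command line instead of running it, and it uses only the flags listed above (the destructive dropall is deliberately left out).

```shell
#!/bin/sh
# Hypothetical helper, not part of sisu: print a sisu database command.
# $1 is the backend flag (-D for postgresql, -d for sqlite),
# $2 is the instruction, any remaining arguments are filenames.
sisu_db_cmd() {
    [ $# -ge 2 ] || { echo "usage: sisu_db_cmd -D|-d instruction [file]" >&2; return 1; }
    backend=$1
    action=$2
    shift 2
    case $backend in
        -D|-d) ;;
        *) echo "unknown backend flag: $backend" >&2; return 1 ;;
    esac
    case $action in
        createall|createdb|create|recreate)
            echo "sisu $backend --$action" ;;
        import|update|remove)
            echo "sisu $backend --$action -v $*" ;;
        *)
            echo "unsupported instruction: $action" >&2; return 1 ;;
    esac
}

# Dry run: review the printed lines before running any of them.
sisu_db_cmd -D createall
sisu_db_cmd -D import sisu_remote.sst
```

Printing the command first is a cautious choice here, since several of the instructions in this section change or destroy database content.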
23. SQLITE
23.1 NAME
SiSU - Structured information, Serialized Units - a document
publishing system.
23.2 DESCRIPTION
Information related to using sqlite with sisu (and related to the
sisu_sqlite dependency package, which is a dummy package to install
dependencies needed for SiSU to populate an sqlite database, this being
part of SiSU - man sisu).
23.3 SYNOPSIS
sisu -d [instruction] [filename/wildcard if required]
sisu -d --(sqlite|pg) --[instruction] [filename/wildcard if
required]
23.4 COMMANDS
Mappings to two databases are provided by default, postgresql and
sqlite. The same commands are used within sisu to construct and
populate databases; however, -d (lowercase) denotes sqlite and -D
(uppercase) denotes postgresql. Alternatively --sqlite or --pgsql may
be used.
-d or --sqlite may be used interchangeably.
23.4.1 CREATE AND DESTROY DATABASE
--sqlite --createall
initial step, creates required relations (tables, indexes) in
existing (sqlite) database (a database should be created
manually and given the same name as working directory, as
requested) (rb.dbi)
sisu -d --createdb
creates database where no database existed before
sisu -d --create
creates database tables where no database tables existed before
sisu -d --dropall
destroys database (including all its content)! kills data and
drops tables, indexes and database associated with a given
directory (and directories of the same name).
sisu -d --recreate
destroys existing database and builds a new empty database
structure
23.4.2 IMPORT AND REMOVE DOCUMENTS
sisu -d --import -v [filename/wildcard]
populates database with the contents of the file. Imports
document(s) specified to an sqlite database (at an object
level).
sisu -d --update -v [filename/wildcard]
updates file contents in database
sisu -d --remove -v [filename/wildcard]
removes specified document from sqlite database.
24. INTRODUCTION
24.1 SEARCH - DATABASE FRONTEND SAMPLE, UTILISING DATABASE AND SISU FEATURES,
INCLUDING OBJECT CITATION NUMBERING (BACKEND CURRENTLY POSTGRESQL)
Sample search frontend <http://search.sisudoc.org> [^21] A small
database and sample query front-end (search form) that makes use of the
citation system, object citation numbering, to demonstrate
functionality.[^22]
SiSU can provide information on which documents are matched and at
what locations within each document the matches are found. These
results are relevant across all outputs using object citation
numbering, which includes html, XML, EPUB, LaTeX, PDF and indeed the
SQL database. You can then refer to one of the other outputs or in the
SQL database expand the text within the matched objects (paragraphs) in
the documents matched.
Note you may set results to show either the documents matched and the
object number locations within each matched document that meet the
search criteria; or the names of the documents matched along with the
objects (paragraphs) that meet the search criteria.[^23]
sisu -F --webserv-webrick
builds a cgi web search frontend for the database created
The following is feedback on the setup on a machine provided by
the help command:
sisu --help sql
Postgresql
user: ralph
current db set: SiSU_sisu
port: 5432
dbi connect: DBI:Pg:database=SiSU_sisu;port=5432
sqlite
current db set: /home/ralph/sisu_www/sisu/sisu_sqlite.db
dbi connect DBI:SQLite:/home/ralph/sisu_www/sisu/sisu_sqlite.db
Note on databases built
By default, [unless otherwise specified] databases are built
on a directory basis, from collections of documents within that
directory. The name of the directory you choose to work from is
used as the database name, i.e. if you are working in a
directory called /home/ralph/ebook the database SiSU_ebook is
used. [otherwise a manual mapping for the collection is
necessary]
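The default naming rule in the note above can be sketched in a few lines of shell. This is an illustration of the rule as stated, not code taken from sisu; the function name sisu_db_name is hypothetical.

```shell
#!/bin/sh
# Sketch of the default naming rule described above: the database name is
# "SiSU_" followed by the name of the working directory.
# sisu_db_name is a hypothetical helper, not a sisu command.
sisu_db_name() {
    dir=${1:-$PWD}
    echo "SiSU_$(basename "$dir")"
}

sisu_db_name /home/ralph/ebook    # prints: SiSU_ebook
```

Called without an argument the helper uses the current directory, which mirrors the directory-based default; a manual mapping, where one has been configured, is outside the scope of this sketch.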
24.2 SEARCH FORM
sisu -F
generates a sample search form, which must be copied to the
web-server cgi directory
sisu -F --webserv-webrick
generates a sample search form for use with the webrick server,
which must be copied to the web-server cgi directory
sisu -Fv
as above, and provides some information on setting up
hyperestraier
sisu -W
starts the webrick server which should be available wherever
sisu is properly installed
The generated search form must be copied manually to the
webserver directory as instructed
25. HYPERESTRAIER
See the documentation for hyperestraier:
<http://hyperestraier.sourceforge.net/>
/usr/share/doc/hyperestraier/index.html
man estcmd
NOTE: the examples that follow assume that sisu output is placed in
the directory /home/ralph/sisu_www
(A) to generate the index within the webserver directory to be
indexed:
estcmd gather -sd [index name] [directory path to index]
the following are examples that will need to be tailored according to
your needs:
cd /home/ralph/sisu_www
estcmd gather -sd casket /home/ralph/sisu_www
you may use the 'find' command together with 'egrep' to limit indexing
to particular document collection directories within the web server
directory:
find /home/ralph/sisu_www -type f |
egrep '/home/ralph/sisu_www/sisu/.+?.html$' | estcmd gather -sd casket -
Check which directories in the webserver/output directory (~/sisu_www
or elsewhere depending on configuration) you wish to include in the
search index.
As sisu duplicates output in multiple file formats, it is probably
preferable to limit the estraier index to html output. It may also be
desirable to exclude the files 'plain.txt', 'toc.html' and
'concordance.html', as these duplicate information held in other html
output, e.g.
find /home/ralph/sisu_www -type f |
egrep '/sisu_www/(sisu|bookmarks)/.+?.html$' |
egrep -v '(doc|concordance).html$' | estcmd gather -sd casket -
from your current document preparation/markup directory, you would
construct a rune along the following lines:
find /home/ralph/sisu_www -type f | egrep
'/home/ralph/sisu_www/([specify first directory for
inclusion]|[specify second directory for inclusion]|[another
directory for inclusion? ...])/.+?.html$' |
egrep -v '(doc|concordance).html$' | estcmd gather -sd
/home/ralph/sisu_www/casket -
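Before handing a file list to estcmd it can help to check the egrep filters on their own. The following is a sketch under stated assumptions: select_for_index is a hypothetical name, the sample paths are made up for illustration, grep -E is used as the equivalent of egrep, and the dot before "html" is escaped, which makes the pattern slightly stricter than the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: keep only html files under the sisu and bookmarks
# collections, dropping doc.html and concordance.html, as in the example
# above. Reads one path per line on stdin.
select_for_index() {
    grep -E '/sisu_www/(sisu|bookmarks)/.+\.html$' |
        grep -E -v '(doc|concordance)\.html$'
}

# Made-up sample paths, for illustration only.
printf '%s\n' \
    /home/ralph/sisu_www/sisu/sample/index.html \
    /home/ralph/sisu_www/sisu/sample/doc.html \
    /home/ralph/sisu_www/sisu/sample/concordance.html \
    /home/ralph/sisu_www/sisu/sample/plain.txt \
    /home/ralph/sisu_www/other/sample/index.html |
    select_for_index
```

Once the selection looks right, the same filter could sit between find and "estcmd gather -sd casket -" in place of the two egrep stages.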
(B) to set up the search form
(i) copy estseek.cgi to your cgi directory and set file permissions to
755:
sudo cp -vi /usr/lib/estraier/estseek.cgi /usr/lib/cgi-bin
sudo chmod -v 755 /usr/lib/cgi-bin/estseek.cgi
sudo cp -v /usr/share/hyperestraier/estseek.* /usr/lib/cgi-bin
[see estraier documentation for paths]
(ii) edit estseek.conf, with attention to the lines starting
'indexname:' and 'replace:':
indexname: /home/ralph/sisu_www/casket
replace: ^file:///home/ralph/sisu_www{{!}}http://localhost
replace: /index.html?${{!}}/
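The effect of the two replace: lines can be previewed outside estseek.cgi. This is an illustration only, assuming that each rule is a pattern and a replacement separated by {{!}}; it mimics the rewriting with sed, and the function name to_web_url and the sample path are hypothetical.

```shell
#!/bin/sh
# Illustration only: mimic with sed what the two replace: rules above ask
# estseek.cgi to do to each indexed file URI. to_web_url is hypothetical.
to_web_url() {
    printf '%s\n' "$1" | sed -E \
        -e 's|^file:///home/ralph/sisu_www|http://localhost|' \
        -e 's|/index\.html?$|/|'
}

to_web_url file:///home/ralph/sisu_www/sisu/sample/index.html
# prints: http://localhost/sisu/sample/
```

The first rule maps the local output directory onto the web server address, the second trims a trailing index.html so that links point at the directory.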
(C) to test using webrick, start webrick:
sisu -W
and try opening the URL: <http://localhost:8081/cgi-bin/estseek.cgi>
26. SISU_WEBRICK
26.1 NAME
SiSU - Structured information, Serialized Units - a document
publishing system
26.2 SYNOPSIS
sisu_webrick [port]
or
sisu -W [port]
26.3 DESCRIPTION
sisu_webrick is part of SiSU (man sisu). sisu_webrick starts Ruby's
Webrick web-server and points it to the directories to which SiSU
output is written, providing a list of these directories (assuming SiSU
is in use and they exist).
The default port for sisu_webrick is set to 8081. This may be modified
in the yaml file ~/.sisu/sisurc.yml, a sample of which is provided as
/etc/sisu/sisurc.yml (or in the equivalent directory on your system).
26.4 SUMMARY OF MAN PAGE
sisu_webrick may be started on its own with the command:
sisu_webrick [port] or using the sisu command with the -W flag: sisu -W
[port]
Where no port is given and settings are unchanged the default port is
8081.
26.5 DOCUMENT PROCESSING COMMAND FLAGS
sisu -W [port] starts Ruby Webrick web-server, serving SiSU output
directories, on the port provided, or if no port is provided and the
defaults have not been changed in ~/.sisu/sisurc.yml then on port 8081
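The port fallback described above can be expressed as a short wrapper. This is a sketch with a hypothetical function name, webrick_cmd; it only prints the command, and it does not read ~/.sisu/sisurc.yml, so it reflects the stated default of 8081 and not any locally changed setting.

```shell
#!/bin/sh
# Hypothetical wrapper: build the "sisu -W" command for the given port,
# falling back to 8081, the default stated above. It only prints the
# command and does not consult sisurc.yml.
webrick_cmd() {
    port=${1:-8081}
    case $port in
        ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
    esac
    echo "sisu -W $port"
}

webrick_cmd         # prints: sisu -W 8081
webrick_cmd 8082    # prints: sisu -W 8082
```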
26.6 FURTHER INFORMATION
For more information on SiSU see: <http://www.jus.uio.no/sisu>
or man sisu
26.7 AUTHOR
Ralph Amissah ralph@amissah.com or ralph.amissah@gmail.com
26.8 SEE ALSO
sisu(1)
sisu_vim(7)
sisu(8)
27. REMOTE SOURCE DOCUMENTS
SiSU processing instructions can be run against remote source
documents by providing the url of the documents against which the
processing instructions are to be carried out. The remote SiSU
documents can be either sisu marked up files in plaintext (.sst or
.ssm), or zipped sisu files (sisupod.zip or filename.ssp).
.sst / .ssm - sisu text files
SiSU can be run against source text files on a remote machine: provide
the processing instruction and the url. The source file and any
associated parts (such as images) will be downloaded and generated
locally.
sisu -3 http://[provide url to valid .sst or .ssm file]
Any of the source documents in the sisu examples page can be used in
this way, see <http://www.jus.uio.no/sisu/SiSU/examples.html> and use
the url for the desired document.
NOTE: to set up a remote machine to serve SiSU documents in this way,
images should be in the directory relative to the document source
../_sisu/image
sisupod - zipped sisu files
A sisupod is the zipped content of a sisu marked up text or texts and
any other associated parts to the document such as images.
SiSU can be run against a sisupod on a (local or) remote machine:
provide the processing instruction and the url, and the sisupod will be
downloaded and the documents it contains generated locally.
sisu -3 http://[provide url to valid sisupod.zip or .ssp file]
Any of the source documents in the sisu examples page can be used in
this way, see <http://www.jus.uio.no/sisu/SiSU/examples.html> and use
the url for the desired document.
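The two kinds of remote source described in this section differ only in their file suffix, which a script can check before calling sisu. The following is a sketch: remote_source_kind is a hypothetical helper, it looks at the suffix alone, and it accepts only http:// urls because that is the form shown above.

```shell
#!/bin/sh
# Hypothetical helper: report which kind of remote source a url points
# to, judged by its suffix only, following the file types named above.
remote_source_kind() {
    case ${1:-} in
        http://*.sst|http://*.ssm) echo "sisu text file" ;;
        http://*.zip|http://*.ssp) echo "sisupod" ;;
        *) echo "unrecognised"; return 1 ;;
    esac
}

# example.org is a placeholder host, not a real document location.
remote_source_kind http://example.org/sample.sst    # prints: sisu text file
remote_source_kind http://example.org/sisupod.zip   # prints: sisupod
```

In both cases the processing command has the same shape, sisu -3 followed by the url.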
REMOTE DOCUMENT OUTPUT
28. REMOTE OUTPUT
Once properly configured SiSU output can be automatically posted once
generated to a designated remote machine using either rsync, or scp.
In order to do this some ssh authentication agent and keychain or
similar tool will need to be configured. Once that is done the
placement on a remote host can be done seamlessly with the -r (for scp)
or -R (for rsync) flag, which may be used in conjunction with other
processing flags, e.g.
sisu -3R sisu_remote.sst
28.1 COMMANDS
-R [filename/wildcard]
copies sisu output files to remote host using rsync. This
requires that sisurc.yml has been provided with information on
hostname and username, and that you have your ssh keys and
authentication agent in place. Note that the behaviour of rsync is
different if -R is used with other flags from if used alone. Alone
the rsync --delete parameter is sent, useful for cleaning the
remote directory (when -R is used together with other flags, it is
not). Also see -r
-r [filename/wildcard]
copies sisu output files to remote host using scp. This requires
that sisurc.yml has been provided with information on hostname
and username, and that you have your ssh keys and authentication
agent in place. Also see -R
28.2 CONFIGURATION
[expand on the setting up of an ssh-agent / keychain]
29. REMOTE SERVERS
As SiSU is generally operated using the command line, and works within
a Unix type environment, SiSU the program and all documents can just as
easily be on a remote server, to which you are logged on using a
terminal, and commands and operations would be pretty much the same as
they would be on your local machine.
30. QUICKSTART - GETTING STARTED HOWTO
30.1 INSTALLATION
Installation is currently most straightforward and tested on the
Debian platform, as there are packages for the installation of sisu and
all requirements for what it does.
30.1.1 DEBIAN INSTALLATION
SiSU is available directly from the Debian Sid and testing archives
(and possibly Ubuntu), assuming your /etc/apt/sources.list is set
accordingly:
aptitude update
aptitude install sisu-complete
The following /etc/apt/sources.list setting permits the download of
additional markup samples:
#/etc/apt/sources.list
deb http://ftp.fi.debian.org/debian/ unstable main non-free contrib
deb-src http://ftp.fi.debian.org/debian/ unstable main non-free contrib
The aptitude commands become:
aptitude update
aptitude install sisu-complete sisu-markup-samples
If there are newer versions of SiSU upstream of the Debian archives,
they will be available by adding the following to your
/etc/apt/sources.list
#/etc/apt/sources.list
deb http://www.jus.uio.no/sisu/archive unstable main non-free
deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
repeat the aptitude commands
aptitude update
aptitude install sisu-complete sisu-markup-samples
Note however that it is not necessary to install sisu-complete if not
all components of sisu are to be used. Installing just the package sisu
will provide basic functionality.
30.1.2 RPM INSTALLATION
RPMs are provided though untested; they are prepared by running alien
against the source package, and against the debs.
They may be downloaded from:
<http://www.jus.uio.no/sisu/SiSU/download.html#rpm>
as root type:
rpm -i [rpm package name]
30.1.3 INSTALLATION FROM SOURCE
To install SiSU from source check information at:
<http://www.jus.uio.no/sisu/SiSU/download.html#current>
* download the source package
* Unpack the source
Two alternative modes of installation from source are provided,
setup.rb (by Minero Aoki) and a rant (by Stefan Lang) built install
file. In either case the first steps are the same: download and unpack
the source file.
For basic use SiSU is only dependent on the programming language in
which it is written, Ruby, and SiSU will be able to generate html,
EPUB, various XMLs, including ODF (and will also produce LaTeX).
Further actions, such as using a database (postgresql or sqlite)[^24]
or converting LaTeX to pdf, rely on the installation of additional
dependencies which the source tarball does not take care of.
setup.rb
This is a standard ruby installer; using setup.rb is a three step
process. In the root directory of the unpacked SiSU type:
ruby setup.rb config
ruby setup.rb setup
#[and as root:]
ruby setup.rb install
further information on setup.rb is available from:
<http://i.loveruby.net/en/projects/setup/>
<http://i.loveruby.net/en/projects/setup/doc/usage.html>
In the root directory of the unpacked SiSU, as root type:
ruby install base
or for a more complete installation:
ruby install
This makes use of Rant (by Stefan Lang) and the provided Rantfile. It
has been configured to do post installation setup, configuration
and generation of a first test file. Note however, that additional
external package dependencies, such as tetex-extra, are not taken care
of for you.
Further information on Rant is available from:
<http://make.rubyforge.org/>
<http://rubyforge.org/frs/?group_id=615>
For a list of alternative actions you may type:
ruby install help
ruby install -T
30.2 TESTING SISU, GENERATING OUTPUT
To check which version of sisu is installed:
sisu -v
Depending on your mode of installation one or a number of markup
sample files may be found either in the directory:
or
change directory to the appropriate one:
cd /usr/share/doc/sisu/sisu_markup_samples/dfsg
30.2.1 BASIC TEXT, PLAINTEXT, HTML, XML, ODF, EPUB
Having moved to the directory that contains the markup samples (see
instructions above if necessary), choose a file and run sisu against it:
sisu -NhwoabxXyv free_as_in_freedom.rms_and_free_software.sam_williams.sst
this will generate html including a concordance file, OpenDocument
text format, plaintext, XHTML and various forms of XML
30.2.2 LATEX / PDF
Assuming a LaTeX engine such as tetex or texlive is installed with the
required modules (done automatically on selection of sisu-pdf in
Debian).
Having moved to the directory that contains the markup samples (see
instructions above if necessary), choose a file and run sisu against it:
sisu -pv free_as_in_freedom.rms_and_free_software.sam_williams.sst
sisu -3 free_as_in_freedom.rms_and_free_software.sam_williams.sst
should generate most available output formats: html including a
concordance file, OpenDocument text format, plaintext, XHTML,
various forms of XML, and pdf
30.2.3 RELATIONAL DATABASE - POSTGRESQL, SQLITE
Relational databases need some setting up - you must have permission
to create the database and write to it when you run sisu.
Assuming you have the database installed and the requisite permissions:
sisu --sqlite --recreate
sisu --sqlite -v --import free_as_in_freedom.rms_and_free_software.sam_williams.sst
sisu --pgsql --recreate
sisu --pgsql -v --import free_as_in_freedom.rms_and_free_software.sam_williams.sst
30.3 GETTING HELP
30.3.1 THE MAN PAGES
Type:
man sisu
The man pages are also available online, though not always kept as up
to date as within the package itself:
* sisu.1 <http://www.jus.uio.no/sisu/man/sisu.1> [^25]
* sisu.8 <http://www.jus.uio.no/sisu/man/sisu.8> [^26]
* man directory <http://www.jus.uio.no/sisu/man> [^27]
30.3.2 BUILT IN HELP
sisu --help
sisu --help --env
sisu --help --commands
sisu --help --markup
30.3.3 THE HOME PAGE
<http://www.jus.uio.no/sisu>
<http://www.jus.uio.no/sisu/SiSU>
30.4 MARKUP SAMPLES
A number of markup samples (along with output) are available from:
<http://www.jus.uio.no/sisu/SiSU/examples.html>
Additional markup samples are packaged separately in the file:
*
On Debian they are available in non-free;[^28] to include them it is
necessary to include non-free in your /etc/apt/sources.list or obtain
them from the sisu home site.
31. EDITOR FILES, SYNTAX HIGHLIGHTING
The directory:
./data/sisu/v2/conf/editor-syntax-etc/
/usr/share/sisu/v2/conf/editor-syntax-etc
contains rudimentary sisu syntax highlighting files for:
* (g)vim <http://www.vim.org>
package: sisu-vim
status: largely done
there is a vim syntax highlighting and folds component
* gedit <http://www.gnome.org/projects/gedit>
* gobby <http://gobby.0x539.de/>
file: sisu.lang
place in:
/usr/share/gtksourceview-1.0/language-specs
or
~/.gnome2/gtksourceview-1.0/language-specs
status: very basic syntax highlighting
comments: this editor features display line wrap and is used by Gobby!
* nano <http://www.nano-editor.org>
file: nanorc
save as:
~/.nanorc
status: basic syntax highlighting
comments: assumes dark background; no display line-wrap; does line
breaks
* diakonos (an editor written in ruby)
<http://purepistos.net/diakonos>
file: diakonos.conf
save as:
~/.diakonos/diakonos.conf
includes:
status: basic syntax highlighting
comments: assumes dark background; no display line-wrap
* kate & kwrite <http://kate.kde.org>
file: sisu.xml
place in:
/usr/share/apps/katepart/syntax
or
~/.kde/share/apps/katepart/syntax
[settings::configure kate::{highlighting,filetypes}]
[tools::highlighting::{markup,scripts}:: SiSU ]
* nedit <http://www.nedit.org>
file: sisu_nedit.pats
nedit -import sisu_nedit.pats
status: a very clumsy first attempt [not really done]
comments: this editor features display line wrap
* emacs <http://www.gnu.org/software/emacs/emacs.html>
files: sisu-mode.el
to file ~/.emacs add the following 2 lines:
(add-to-list 'load-path
(require 'sisu-mode.el)
[not done / not yet included]
* vim & gvim <http://www.vim.org>
files:
package is the most comprehensive sisu syntax highlighting and editor
environment provided to date (is for vim/ gvim, and is separate from
the contents of this directory)
status: this includes: syntax highlighting; vim folds; some error
checking
comments: this editor features display line wrap
NOTE:
[ SiSU parses files with long lines or line breaks, but,
display linewrap (without line-breaks) is a convenient editor
feature to have for sisu markup]
32. HOW DOES SISU WORK?
SiSU markup is fairly minimalistic. It consists of: a (largely
optional) document header, made up of information about the document
(such as when it was published, who authored it, and granting what
rights) and any processing instructions; and markup within the
substantive text of the document, which is related to document
structure and typeface. SiSU must be able to discern the structure of
a document (text headings and their levels in relation to each other),
either from information provided in the document header or from markup
within the text (or from a combination of both). Processing is done
against an abstraction of the document comprising information on the
document's structure and its objects,[2] which the program serializes
(providing the object numbers) and which are assigned hash sum values
based on their content. This abstraction of information about document
structure, objects (and hash sums) provides considerable flexibility
in representing documents in different ways and for different purposes
(e.g. search, document layout, publishing, content certification,
concordance etc.), and makes it possible to take advantage of some of
the strengths of established ways of representing documents (or indeed
to create new ones).
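The idea of serializing objects and assigning them hash sums can be pictured with a few lines of shell. This is an illustration only and not how sisu itself is implemented: it treats every blank-line-separated block as an object, ignores headers, footnotes and markup, and assumes the sha256sum utility (GNU coreutils) is available. The function names are hypothetical.

```shell
#!/bin/sh
# Illustration only, not how sisu itself is implemented: treat each
# blank-line-separated block of text as one object, number the objects in
# order, and attach a sha256 digest of each object's content.
# Assumes the sha256sum utility (GNU coreutils) is available.

# Print "number<TAB>text" for each object read from stdin.
number_objects() {
    awk 'BEGIN { RS = "" } { gsub(/\n/, " "); print NR "\t" $0 }'
}

# Read "number<TAB>text" lines and print "number digest text".
hash_objects() {
    tab=$(printf '\t')
    while IFS=$tab read -r ocn text; do
        digest=$(printf '%s' "$text" | sha256sum | cut -d ' ' -f 1)
        printf '%s %s %s\n' "$ocn" "$digest" "$text"
    done
}

printf 'A heading\n\nFirst paragraph,\nwrapped over two lines.\n' |
    number_objects | hash_objects
```

Because the digest depends only on the text of an object, two objects with identical content receive the same digest, while the number records where each sits in the document.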
33. SUMMARY OF FEATURES
* sparse/minimal markup (clean utf-8 source texts). Documents are
prepared in a single UTF-8 file using a minimalistic mnemonic syntax.
For typical literature and documents, headers are largely optional.
* markup is easily readable/parsable by the human eye, (basic markup
is simpler and more sparse than the most basic HTML), [this may also
be converted to XML representations of the same input/source
document].
* markup defines document structure (this may be done once in a header
pattern-match description, or for heading levels individually); basic
text attributes (bold, italics, underscore, strike-through etc.) as
required; and semantic information related to the document (header
information, extended beyond the Dublin core and easily further
extended as required); the headers may also contain processing
instructions. SiSU markup is primarily an abstraction of document
structure and document metadata to permit taking advantage of the basic
strengths of existing alternative practical standard ways of
representing documents [be that browser viewing, paper publication,
sql search etc.] (html, epub, xml, odf, latex, pdf, sql)
* for output, produces reasonably elegant output in established
industry and institutionally accepted open standard formats,[3] taking
advantage of the different strengths of various standard formats for
representing documents. Amongst the output formats currently supported
are:
* html - both as a single scrollable text and a segmented document
* xhtml
* epub
* XML - both in sax and dom style xml structures for further
development as required
* ODF - open document format, the iso standard for document storage
* LaTeX - used to generate pdf
* pdf (via LaTeX)
* sql - population of an sql database, (at the same object level that
is used to cite text within a document)
Also produces: concordance files; document content certificates (md5
or sha256 digests of headings, paragraphs, images etc.) and html
manifests (and sitemaps of content). It takes advantage of the
strengths implicit in these very different output types (e.g. PDFs
produced using the typesetting of LaTeX, and databases populated with
documents at an individual object/paragraph level, making possible
granular search (and related possibilities)).
* ensuring content can be cited in a meaningful way regardless of
selected output format. Online publishing (and publishing in multiple
document formats) lacks a useful way of citing text internally within
documents (important to academics generally and to lawyers) as page
numbers are meaningless across browsers and formats. sisu seeks to
provide a common way of pinpointing the text within a document (which
can be utilized for citation and by search engines). The outputs share a
common numbering system that is meaningful (to man and machine) across
all digital outputs whether paper, screen, or database oriented (pdf,
HTML, EPUB, xml, sqlite, postgresql); this numbering system can be used
to reference content.
* Granular search within documents. SQL databases are populated at an
object level (roughly headings, paragraphs, verse, tables) and become
searchable with that degree of granularity, the output information
provides the object/paragraph numbers which are relevant across all
generated outputs; it is also possible to look at just the matching
paragraphs of the documents in the database; [output indexing also
works well with search indexing tools like hyperestraier].
* long term maintainability of document collections in a world of
changing formats, having a very sparsely marked-up source document
base. There is a considerable degree of future-proofing: output
representations can be updated and new output formats added, e.g. the
(open document text) module in 2006, epub in 2009, and html5 output
sometime in future, without modification of existing prepared texts
* SQL search aside, documents are generated as required and static
once generated.
* documents produced are static files, and may be batch processed,
this needs to be done only once but may be repeated for various reasons
as desired (updated content, addition of new output formats, updated
technology document presentations/representations)
* document source (plaintext utf-8) if shared on the net may be used
as input and processed locally to produce the different document
outputs
* document source may be bundled together (automatically) with
associated documents (multiple language versions or master document
with inclusions) and images and sent as a zip file called a sisupod, if
shared on the net these too may be processed locally to produce the
desired document outputs
* generated document outputs may automatically be posted to remote
sites.
* for basic document generation, the only software dependency is Ruby,
and a few standard Unix tools (this covers plaintext, HTML, EPUB,
XML, ODF, LaTeX). To use a database you of course need that, and to
convert the LaTeX generated to pdf, a latex processor like tetex or
texlive.
* as a developer's tool it is flexible and extensible
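The granular search mentioned in the feature list above can be pictured with plain text tools. This is a toy illustration, not the sisu SQL schema or its search form: objects are held one per line as a number and a text separated by a tab, the sample texts are made up, and search_objects is a hypothetical name.

```shell
#!/bin/sh
# Illustration only: objects are held one per line as "number<TAB>text";
# a search reports the object numbers whose text contains the term
# (case-insensitive), which is the kind of location that stays valid
# across output formats.
search_objects() {
    term=$1
    awk -F '\t' -v term="$term" 'index(tolower($2), tolower(term)) { print $1 }'
}

# Made-up sample objects.
printf '1\tOn Liberty\n2\tA paragraph on another subject.\n3\tThe liberty of thought.\n' |
    search_objects liberty
# prints 1 and 3, one per line
```

The numbers returned are what a reader would then look up in any of the generated outputs of the same document.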
Syntax highlighting for SiSU markup is available for a number of text
editors.
SiSU is less about document layout than about finding a way with
little markup to be able to construct an abstract representation of a
document that makes it possible to produce multiple representations of
it which may be rather different from each other and used for different
purposes, whether layout and publishing, or search of content
i.e. to be able to take advantage from this minimal preparation
starting point of some of the strengths of rather different established
ways of representing documents for different purposes, whether for
search (relational database, or indexed flat files generated for that
purpose whether of complete documents, or say of files made up of
objects), online viewing (e.g. html, xml, pdf), or paper publication
(e.g. pdf)...
the solution arrived at is by extracting structural information about
the document (about headings within the document) and by tracking
objects (which are serialized and also given hash values) in the manner
described. It makes possible representations that are quite different
from those offered at present. For example objects could be saved
individually and identified by their hashes, with an index of how the
objects relate to each other to form a document.
34. HELP SOURCES
For a summary of alternative ways to get help on SiSU try one of the
following:
man page
man sisu_help
man2html
<http://www.jus.uio.no/sisu/man/sisu_help.html>
sisu generated output - links to html
<http://sisudoc.org/sisu/sisu_help/index.html>
help sources lists
Alternative sources for this help sources page listed here:
man sisu_help_sources
<http://sisudoc.org/sisu/sisu_help_sources/index.html>
34.1 MAN PAGES
34.1.1 MAN
man sisu
man 7 sisu_complete
man 7 sisu_pdf
man 7 sisu_postgresql
man 7 sisu_sqlite
man sisu_termsheet
man sisu_webrick
34.2 SISU GENERATED OUTPUT - LINKS TO HTML
Note SiSU documentation is prepared in SiSU and output is available in
multiple formats including amongst others html, pdf, odf and epub which
may also be accessed via the html pages[^29]
34.2.1 WWW.SISUDOC.ORG
<http://sisudoc.org/sisu/sisu_manual/index.html>
<http://sisudoc.org/sisu/sisu_commands/index.html>
<http://sisudoc.org/sisu/sisu_complete/index.html>
<http://sisudoc.org/sisu/sisu_configuration/index.html>
<http://sisudoc.org/sisu/sisu_description/index.html>
<http://sisudoc.org/sisu/sisu_examples/index.html>
<http://sisudoc.org/sisu/sisu_faq/index.html>
<http://sisudoc.org/sisu/sisu_filetypes/index.html>
<http://sisudoc.org/sisu/sisu_help/index.html>
<http://sisudoc.org/sisu/sisu_help_sources/index.html>
<http://sisudoc.org/sisu/sisu_howto/index.html>
<http://sisudoc.org/sisu/sisu_introduction/index.html>
<http://sisudoc.org/sisu/sisu_manual/index.html>
<http://sisudoc.org/sisu/sisu_markup/index.html>
<http://sisudoc.org/sisu/sisu_output_overview/index.html>
<http://sisudoc.org/sisu/sisu_pdf/index.html>
<http://sisudoc.org/sisu/sisu_postgresql/index.html>
<http://sisudoc.org/sisu/sisu_quickstart/index.html>
<http://sisudoc.org/sisu/sisu_remote/index.html>
<http://sisudoc.org/sisu/sisu_search/index.html>
<http://sisudoc.org/sisu/sisu_skin/index.html>
<http://sisudoc.org/sisu/sisu_sqlite/index.html>
<http://sisudoc.org/sisu/sisu_syntax_highlighting/index.html>
<http://sisudoc.org/sisu/sisu_vim/index.html>
<http://sisudoc.org/sisu/sisu_webrick/index.html>
34.3 MAN2HTML
34.3.1 LOCALLY INSTALLED
<file:///usr/share/doc/sisu/.html/sisu.html>
<file:///usr/share/doc/sisu/.html/sisu_help.html>
<file:///usr/share/doc/sisu/.html/sisu_help_sources.html>
/usr/share/doc/sisu/.html/sisu.html
/usr/share/doc/sisu/.html/sisu_pdf.html
/usr/share/doc/sisu/.html/sisu_postgresql.html
/usr/share/doc/sisu/.html/sisu_sqlite.html
/usr/share/doc/sisu/.html/sisu_webrick.html
34.3.2 WWW.JUS.UIO.NO/SISU
<http://www.jus.uio.no/sisu/man/sisu.html>
<http://www.jus.uio.no/sisu/man/sisu_complete.html>
<http://www.jus.uio.no/sisu/man/sisu_pdf.html>
<http://www.jus.uio.no/sisu/man/sisu_postgresql.html>
<http://www.jus.uio.no/sisu/man/sisu_sqlite.html>
<http://www.jus.uio.no/sisu/man/sisu_webrick.html>
1. objects include: headings, paragraphs, verse, tables, images,
but not footnotes/endnotes which are numbered separately and
tied to the object from which they are referenced.
2. i.e. the html, pdf, epub, odf outputs are each built
individually and optimised for that form of presentation, rather
than for example the html being a saved version of the odf, or
the pdf being a saved version of the html.
3. the different heading levels
4. units of text, primarily paragraphs and headings, also any
tables, poems, code-blocks
5. Specification submitted by Adobe to ISO to become a full open
ISO specification
<http://www.linux-watch.com/news/NS75427226.html>
6. ISO standard ISO/IEC 26300:2006
7. An open standard format for e-books
*1. square brackets
*2. square brackets
+1. square brackets
8. <http://www.jus.uio.no/sisu/man/>
9. <http://www.jus.uio.no/sisu/man/sisu.html>
10. From sometime after SiSU 0.58 it should be possible to describe
SiSU markup using SiSU, which though not an original design goal
is useful.
11. files should be prepared using UTF-8 character encoding
12. a footnote or endnote
13. self contained endnote marker & endnote in one
*. unnumbered asterisk footnote/endnote, insert multiple asterisks
if required
**. another unnumbered asterisk footnote/endnote
*3. editors notes, numbered asterisk footnote/endnote series
+2. editors notes, numbered asterisk footnote/endnote series
14. <http://www.jus.uio.no/sisu/>
15. <http://www.ruby-lang.org/en/>
16. Table from the Wealth of Networks by Yochai Benkler
<http://www.jus.uio.no/sisu/the_wealth_of_networks.yochai_benkler>
17. is not a regular file to be worked on, and thus less likely that
people will have processing. It may be, however, that when the
resulting file is shared, .ssc is an appropriate suffix to use.
19. <http://www.postgresql.org/> <http://advocacy.postgresql.org/>
<http://en.wikipedia.org/wiki/Postgresql>
20. <http://www.hwaci.com/sw/sqlite/>
<http://en.wikipedia.org/wiki/Sqlite>
21. <http://search.sisudoc.org>
22. (which could be extended further with the current back-end). As
regards scaling of the database, it is as scalable as the
database (here PostgreSQL) and hardware allow.
23. of this feature: when it was demonstrated to an IBM software
innovations evaluator in 2004, he said, to paraphrase: this could
be of interest to us. We have large document management systems;
you can search hundreds of thousands of documents and we can tell
you which documents meet your search criteria, but there is no
way we can tell you, without opening each document, where within
each your matches are found.
24. There is nothing to stop MySQL support being added in future.
25. <http://www.jus.uio.no/sisu/man/sisu.1>
26. <http://www.jus.uio.no/sisu/man/sisu.8>
27. <http://www.jus.uio.no/sisu/man>
28. the Debian Free Software Guidelines require that everything
distributed within Debian can be changed - and the documents are
authors’ works that, while freely distributable, are not freely
changeable.
29. named index.html or more extensively through sisu_manifest.html
Title: SiSU - Manual
Creator:
Ralph Amissah
Rights:
Copyright (C) Ralph Amissah 2007, part of SiSU documentation,
License GPL 3;
Publisher:
SiSU http://www.jus.uio.no/sisu (this copy)
Date created:
2002-08-28
Date issued:
2002-08-28
Date available:
2002-08-28
Date modified:
2010-03-03
Date: 2008-05-22
Sourcefile:
sisu.ssm.sst
Filetype:
SiSU text insert 2.0
Source digest:
MD5(sisu.ssm.sst)= fd741a3ccf160aa55b942d76bd4e3f2a
Generated by:
SiSU 2.0.0 of 2010w09/6 (2010-03-06)
Ruby version:
ruby 1.8.7 (2010-01-10 patchlevel 249) [i486-linux]
Document (dal) last generated:
Wed Mar 17 13:34:15 -0400 2010
Other versions of this document:
manifest: <http://www.jus.uio.no/sisu/sisu/sisu_manifest.html>
html: <http://www.jus.uio.no/sisu/sisu/toc.html>
epub: <http://www.jus.uio.no/sisu/epub/sisu.epub>
pdf: <http://www.jus.uio.no/sisu/sisu/portrait.pdf>
pdf: <http://www.jus.uio.no/sisu/sisu/landscape.pdf>
at: <http://www.jus.uio.no/sisu>
* Last Generated on: Wed Mar 17 13:34:17 -0400 2010
* SiSU http://www.jus.uio.no/sisu