NAME
cant - CAnonicalize N-Triples
DESCRIPTION
CAnonicalize N-Triples
OPTIONS
--verbose   -v      Print what you are doing as you go
--help      -h      Print this message and exit
--from=uri  -f uri  Specify an input file (or web resource)
--diff=uri  -d uri  Specify a difference file
Any number of --from <file> parameters may be given, in which case the
files are merged. If none are given, /dev/stdin is used.
If any diff files are given, they are read and merged separately, then
compared with the input files. The result is a list of differences
instead of the canonicalized graph. This is NOT a minimal diff. Exits
with nonzero system status if the graphs do not match.
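The merge-and-compare behaviour described above can be sketched roughly as
follows. This is an illustrative sketch only, not the code in cant.py: the
names read_triples and compare, and the "- " / "+ " report markers, are
assumptions made for the example. It also assumes graphs without bnodes, so
that triples can be compared as plain lines.

```python
def read_triples(text):
    """Return the set of triple lines in one N-Triples document.

    Blank lines and '#' comment lines are skipped. Assumes one triple
    per line, as in N-Triples.
    """
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            triples.add(line)
    return triples


def compare(from_texts, diff_texts):
    """Merge each group of documents, then list lines found on one side only.

    The report is a plain set difference, not a minimal diff.
    """
    left = set().union(*(read_triples(t) for t in from_texts))
    right = set().union(*(read_triples(t) for t in diff_texts))
    report = ["- " + t for t in sorted(left - right)]
    report += ["+ " + t for t in sorted(right - left)]
    return report


def exit_status(report):
    """Nonzero when the two merged graphs do not match."""
    return 1 if report else 0
```

A caller would pass the contents of all --from files as the first list and
the contents of all --diff files as the second, then hand exit_status(report)
to sys.exit.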
This is an independent N-Triples canonicalizer. It uses heuristics,
and will not terminate on all graphs. It is designed for testing: the
output and the reference output are both canonicalized and compared.
It uses the very simple NTriples format. It is designed to be
independent of the SWAP code so that it can be used to test the SWAP
code. It doesn’t boast any fancy algorithms - just tries to get the job
done for the small files in the test datasets.
The algorithm is to generate a "signature" for each bnode. The signature
is found by looking in the bnode's immediate vicinity, treating any local
bnode as a blank. Bnodes whose signatures are unique within the graph can
be allocated canonical identifiers as a function of the ordering of
the signatures. These are then treated as fixed nodes. If another pass
is done over the new graph, the signatures are more distinct.
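A minimal sketch of this signature approach is given here. It is not the
code in cant.py. It assumes triples are 3-tuples of strings, that bnodes are
the terms starting with "_:", and that the node being described is marked
"*self*" while other unnamed bnodes are marked "*blank*". The names
signature and canonicalize and the "_:c0", "_:c1", ... labels are
hypothetical. Where no bnode has a unique signature, this sketch raises an
error rather than looping.

```python
def is_bnode(term):
    return term.startswith("_:")


def signature(node, triples, fixed):
    """Sorted tuple of the triples touching node, with unnamed bnodes blanked."""
    def mask(term):
        if term == node:
            return "*self*"
        if is_bnode(term) and term not in fixed:
            return "*blank*"
        # Already-named bnodes count as fixed nodes; other terms pass through.
        return fixed.get(term, term)

    return tuple(sorted(tuple(mask(t) for t in triple)
                        for triple in triples if node in triple))


def canonicalize(triples):
    """Return the sorted triples with bnodes renamed by signature order."""
    bnodes = {t for triple in triples for t in triple if is_bnode(t)}
    fixed = {}  # original bnode label -> canonical label
    while len(fixed) < len(bnodes):
        by_sig = {}
        for node in bnodes - set(fixed):
            by_sig.setdefault(signature(node, triples, fixed), []).append(node)
        # Only bnodes whose signature is unique in this pass can be named.
        unique = sorted((sig, nodes[0])
                        for sig, nodes in by_sig.items() if len(nodes) == 1)
        if not unique:
            raise ValueError("remaining bnodes cannot be told apart")
        for _sig, node in unique:
            fixed[node] = "_:c%d" % len(fixed)
    return sorted(tuple(fixed.get(t, t) for t in triple) for triple in triples)
```

Two graphs that differ only in bnode labels and triple order should then give
equal results from canonicalize, which is the property the comparison relies
on.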
This works for well-labelled graphs, and graphs which don’t have large
areas of interconnected bnodes or large duplicate areas. A particular
failing is complete lack of treatment of symmetry between bnodes.
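The symmetry problem can be seen in a two-node example. The standalone
sketch below uses a hypothetical helper, local_signature, and assumes
triples are 3-tuples of strings with bnodes written as "_:" labels. Two
bnodes that refer to each other get identical signatures, so neither can be
given a canonical identifier on this basis.

```python
def local_signature(node, triples):
    """Triples touching node, with the node itself and other bnodes masked."""
    def mask(term):
        if term == node:
            return "*self*"
        return "*blank*" if term.startswith("_:") else term

    return tuple(sorted(tuple(mask(t) for t in triple)
                        for triple in triples if node in triple))


# Two bnodes that each "know" the other: the graph is symmetric in them.
ring = [
    ("_:a", "<http://example.org/knows>", "_:b"),
    ("_:b", "<http://example.org/knows>", "_:a"),
]

signatures = {node: local_signature(node, ring) for node in ("_:a", "_:b")}

# Identical signatures: no ordering of signatures separates the two nodes.
stuck = signatures["_:a"] == signatures["_:b"]
```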
References:
.google graph isomorphism
See also e.g.
http://www.w3.org/2000/10/rdf-tests/rdfcore/utils/ntc/compare.cc
NTriples: see http://www.w3.org/TR/rdf-testcases/#ntriples
Not to mention, published this month by coincidence:
Kelly, Brian, [Whitehead Institute],
"Graph canonicalization", Dr Dobb’s Journal, May 2003.
$Id: cant.py,v 1.15 2007/06/26 02:36:15 syosi Exp $
This is or was http://www.w3.org/2000/10/swap/cant.py
W3C open source licence
<http://www.w3.org/Consortium/Legal/copyright-software.html>.
2004-02-31 Serious bug fixed. This is a test program, and should itself
be tested.
Quis custodiet ipsos custodes?