lgf.dox
changeset 31 02083323ff2c
child 32 ef12f83752f6
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/lgf.dox	Mon Feb 15 01:47:33 2010 +0100
     1.3 @@ -0,0 +1,179 @@
     1.4 +/* -*- mode: C++; indent-tabs-mode: nil; -*-
     1.5 + *
     1.6 + * This file is a part of LEMON, a generic C++ optimization library.
     1.7 + *
     1.8 + * Copyright (C) 2003-2008
     1.9 + * Egervary Jeno Kombinatorikus Optimalizalasi Kutatocsoport
    1.10 + * (Egervary Research Group on Combinatorial Optimization, EGRES).
    1.11 + *
    1.12 + * Permission to use, modify and distribute this software is granted
    1.13 + * provided that this copyright notice appears in all copies. For
    1.14 + * precise terms see the accompanying LICENSE file.
    1.15 + *
    1.16 + * This software is provided "AS IS" with no warranty of any kind,
    1.17 + * express or implied, and with no claim as to its suitability for any
    1.18 + * purpose.
    1.19 + *
    1.20 + */
    1.21 +
    1.22 +namespace lemon {
    1.23 +/**
    1.24 +[PAGE]sec_lgf[PAGE] Input-Output for Graphs
    1.25 +
    1.26 +\todo Clarify this section.
    1.27 +
    1.28 +LEMON provides a versatile file format for storing graphs
    1.29 +and related node and arc maps.
    1.30 +Such a format should be really flexible, it should be able to store arbitrary
    1.31 +number of maps of arbitrary value types.
    1.32 +On the other hand, the file size and the ease of processing are also critical
    1.33 +to support storing huge graphs, which is a major goal of LEMON.
    1.34 +These requirements forbid using complicated and deeply structured formats
    1.35 +like XML.
    1.36 +That is why a compact file format is designed for LEMON instead of using
    1.37 +hierarchical formats, such as GraphML, GXL or GML.
    1.38 +
    1.39 +The LEMON Graph Format (LGF) comprises different sections, for
    1.40 +example a digraph is stored in a \c @nodes and an \c @arcs
    1.41 +section. These parts use column oriented formats, each
    1.42 +column belongs to a map in the graph. The first line of the section associate
    1.43 +names to these maps, which can be used to refer them.
    1.44 +Note that this simple idea makes it possible to extend the files with
    1.45 +new maps (columns) at any
    1.46 +position without having to modify the codes using these files.
    1.47 +
    1.48 +The \c label map has special role, it must store unique values, which in turn
    1.49 +can be used to refer
    1.50 +to the nodes and arcs in the file. Finally, the first two column of the
    1.51 +\c @arcs section is anonymous, they indicate the source and target nodes,
    1.52 +respectively.
    1.53 +
    1.54 +\code
    1.55 +  @nodes
    1.56 +  label coordinate
    1.57 +  0     (20,100)
    1.58 +  1     (40,120)
    1.59 +  ...
    1.60 +  41    (600,100)
    1.61 +  @arcs
    1.62 +            label length
    1.63 +  0    1    0     16
    1.64 +  0    2    1     12
    1.65 +  2    12   2     20
    1.66 +  ...
    1.67 +  36   41   123   21
    1.68 +  @attributes
    1.69 +  source 0
    1.70 +  target 41
    1.71 +  caption "A shortest path problem"
    1.72 +\endcode
    1.73 +
    1.74 +The \ref DigraphReader and \ref DigraphWriter classes are used
    1.75 +to read and write a digraph and corresponding maps. By default, a map
    1.76 +can be used with these classes if its value type has standard I/O
    1.77 +operators (\c operator<<(ostream&, T) and \c operator>>(istream&, T&)).
    1.78 +Otherwise, a function object
    1.79 +can be specified which converts the value type to \c std::string.
    1.80 +The above LGF file can be scanned as follows.
    1.81 +
    1.82 +\code
    1.83 +  ListDigraph g;
    1.84 +  ListDigraph::NodeMap<dim2::Point<int> > coord(g);
    1.85 +  ListDigraph::ArcMap<int> length(g);
    1.86 +  ListDigraph::Node src;
    1.87 +  std::string title;
    1.88 +
    1.89 +  digraphReader(g, std::cin)
    1.90 +    .nodeMap("coord", coord)
    1.91 +    .arcMap("length", length)
    1.92 +    .attribute("caption", title)
    1.93 +    .node("source", src)
    1.94 +    .run();
    1.95 +\endcode
    1.96 +
    1.97 +Apart from LGF, the library can also interpret other graph
    1.98 +formats, such as the well-known DIMACS format or the NAUTY graph6
    1.99 +format.
   1.100 +
   1.101 +<hr>
   1.102 +
   1.103 +The \e LGF is a <em>column oriented</em>
   1.104 +file format for storing graphs and associated data like
   1.105 +node and edge maps.
   1.106 +
   1.107 +Each line with \c '#' first non-whitespace
   1.108 +character is considered as a comment line.
   1.109 +
   1.110 +Otherwise the file consists of sections starting with
   1.111 +a header line. The header lines starts with an \c '@' character followed by the
   1.112 +type of section. The standard section types are \c \@nodes, \c
   1.113 +\@arcs and \c \@edges
   1.114 +and \@attributes. Each header line may also have an optional
   1.115 +\e name, which can be use to distinguish the sections of the same
   1.116 +type.
   1.117 +
   1.118 +The standard sections are column oriented, each line consists of
   1.119 +<em>token</em>s separated by whitespaces. A token can be \e plain or
   1.120 +\e quoted. A plain token is just a sequence of non-whitespace characters,
   1.121 +while a quoted token is a
   1.122 +character sequence surrounded by double quotes, and it can also
   1.123 +contain whitespaces and escape sequences.
   1.124 +
   1.125 +The \c \@nodes section describes a set of nodes and associated
   1.126 +maps. The first is a header line, its columns are the names of the
   1.127 +maps appearing in the following lines.
   1.128 +One of the maps must be called \c
   1.129 +"label", which plays special role in the file.
   1.130 +The following
   1.131 +non-empty lines until the next section describes nodes of the
   1.132 +graph. Each line contains the values of the node maps
   1.133 +associated to the current node.
   1.134 +
   1.135 +\code
   1.136 + @nodes
   1.137 + label  coordinates  size    title
   1.138 + 1      (10,20)      10      "First node"
   1.139 + 2      (80,80)      8       "Second node"
   1.140 + 3      (40,10)      10      "Third node"
   1.141 +\endcode
   1.142 +
   1.143 +The \c \@arcs section is very similar to the \c \@nodes section,
   1.144 +it again starts with a header line describing the names of the maps,
   1.145 +but the \c "label" map is not obligatory here. The following lines
   1.146 +describe the arcs. The first two tokens of each line are
   1.147 +the source and the target node of the arc, respectively, then come the map
   1.148 +values. The source and target tokens must be node labels.
   1.149 +
   1.150 +\code
   1.151 + @arcs
   1.152 +         capacity
   1.153 + 1   2   16
   1.154 + 1   3   12
   1.155 + 2   3   18
   1.156 +\endcode
   1.157 +
   1.158 +The \c \@edges is just a synonym of \c \@arcs. The \@arcs section can
   1.159 +also store the edge set of an undirected graph. In such case there is
   1.160 +a conventional method for store arc maps in the file, if two columns
   1.161 +has the same caption with \c '+' and \c '-' prefix, then these columns
   1.162 +can be regarded as the values of an arc map.
   1.163 +
   1.164 +The \c \@attributes section contains key-value pairs, each line
   1.165 +consists of two tokens, an attribute name, and then an attribute
   1.166 +value. The value of the attribute could be also a label value of a
   1.167 +node or an edge, or even an edge label prefixed with \c '+' or \c '-',
   1.168 +which regards to the forward or backward directed arc of the
   1.169 +corresponding edge.
   1.170 +
   1.171 +\code
   1.172 + @attributes
   1.173 + source 1
   1.174 + target 3
   1.175 + caption "LEMON test digraph"
   1.176 +\endcode
   1.177 +
   1.178 +The \e LGF can contain extra sections, but there is no restriction on
   1.179 +the format of such sections.
   1.180 +
   1.181 +*/
   1.182 +}