Rework the LGF section
authorPeter Kovacs <kpeter@inf.elte.hu>
Mon, 22 Feb 2010 00:42:19 +0100
changeset 44a9f8282eb6b7
parent 43 7b789434b1af
child 45 725c60c7492d
Rework the LGF section
lgf.dox
     1.1 --- a/lgf.dox	Mon Feb 22 00:41:19 2010 +0100
     1.2 +++ b/lgf.dox	Mon Feb 22 00:42:19 2010 +0100
     1.3 @@ -18,35 +18,31 @@
     1.4  
     1.5  namespace lemon {
     1.6  /**
     1.7 -[PAGE]sec_lgf[PAGE] Input-Output for Graphs
     1.8 +[PAGE]sec_lgf[PAGE] LEMON Graph Format
     1.9  
    1.10 -\todo Clarify this section.
    1.11 +LEMON provides a versatile file format for storing graphs and related
    1.12 +node and arc maps. Such a format should be really flexible, it should be
    1.13 +able to store arbitrary number of maps of arbitrary value types.
    1.14 +On the other hand, the file size and the ease of processing are also critical
    1.15 +to support huge graphs, which is a major goal of LEMON.
    1.16 +These requirements forbid using complicated and deeply structured formats
    1.17 +like XML. That is why a compact text file format is designed for LEMON
    1.18 +instead of using hierarchical formats (such as GraphML, GXL or GML).
    1.19  
    1.20 -LEMON provides a versatile file format for storing graphs
    1.21 -and related node and arc maps.
    1.22 -Such a format should be really flexible, it should be able to store arbitrary
    1.23 -number of maps of arbitrary value types.
    1.24 -On the other hand, the file size and the ease of processing are also critical
    1.25 -to support storing huge graphs, which is a major goal of LEMON.
    1.26 -These requirements forbid using complicated and deeply structured formats
    1.27 -like XML.
    1.28 -That is why a compact file format is designed for LEMON instead of using
    1.29 -hierarchical formats, such as GraphML, GXL or GML.
    1.30 +The LEMON Graph Format (LGF) consists of several sections, for example a
    1.31 +digraph is stored in a <tt>\@nodes</tt> and an <tt>\@arcs</tt> section.
    1.32 +These parts use column oriented formats, each column belongs to a map in
    1.33 +the graph. The first line of the section associates names to these maps,
    1.34 +which can be used to refer them. Note that this simple idea makes it
    1.35 +possible to extend the files with new maps (columns) at any position
    1.36 +without having to modify the codes using these files.
    1.37 +Other data can also be added to an LGF file as individual properties
    1.38 +in an <tt>\@attributes</tt> section. This part can be used to specify
    1.39 +certain nodes or arcs, or store meta data for the file, such as copyright
    1.40 +notice.
    1.41  
    1.42 -The LEMON Graph Format (LGF) comprises different sections, for
    1.43 -example a digraph is stored in a \c @nodes and an \c @arcs
    1.44 -section. These parts use column oriented formats, each
    1.45 -column belongs to a map in the graph. The first line of the section associate
    1.46 -names to these maps, which can be used to refer them.
    1.47 -Note that this simple idea makes it possible to extend the files with
    1.48 -new maps (columns) at any
    1.49 -position without having to modify the codes using these files.
    1.50 -
    1.51 -The \c label map has special role, it must store unique values, which in turn
    1.52 -can be used to refer
    1.53 -to the nodes and arcs in the file. Finally, the first two column of the
    1.54 -\c @arcs section is anonymous, they indicate the source and target nodes,
    1.55 -respectively.
    1.56 +For example, a shortest path problem, which is represented as a directed
    1.57 +graph with some node and arc maps, can be stored in LGF format as follows.
    1.58  
    1.59  \code
    1.60    @nodes
    1.61 @@ -55,6 +51,7 @@
    1.62    1     (40,120)
    1.63    ...
    1.64    41    (600,100)
    1.65 +
    1.66    @arcs
    1.67              label length
    1.68    0    1    0     16
    1.69 @@ -62,19 +59,21 @@
    1.70    2    12   2     20
    1.71    ...
    1.72    36   41   123   21
    1.73 +
    1.74    @attributes
    1.75    source 0
    1.76 -  target 41
    1.77    caption "A shortest path problem"
    1.78  \endcode
    1.79  
    1.80 -The \ref DigraphReader and \ref DigraphWriter classes are used
    1.81 -to read and write a digraph and corresponding maps. By default, a map
    1.82 -can be used with these classes if its value type has standard I/O
    1.83 -operators (\c operator<<(ostream&, T) and \c operator>>(istream&, T&)).
    1.84 -Otherwise, a function object
    1.85 -can be specified which converts the value type to \c std::string.
    1.86 -The above LGF file can be scanned as follows.
    1.87 +In the <tt>\@nodes</tt> section, the \c label map has special role,
    1.88 +it must store unique values, which in turn can be used to refer to the nodes
    1.89 +in the file.
    1.90 +The first two columns of the <tt>\@arcs</tt> section are anonymous, they
    1.91 +store the source and target nodes, respectively.
    1.92 +
    1.93 +The \ref DigraphReader and \ref DigraphWriter classes can used
    1.94 +to read and write data in LGF format.
    1.95 +For example, the above file can be read as follows.
    1.96  
    1.97  \code
    1.98    ListDigraph g;
    1.99 @@ -83,97 +82,40 @@
   1.100    ListDigraph::Node src;
   1.101    std::string title;
   1.102  
   1.103 -  digraphReader(g, std::cin)
   1.104 -    .nodeMap("coord", coord)
   1.105 +  digraphReader(g, "digraph.lgf")
   1.106 +    .nodeMap("coordinate", coord)
   1.107      .arcMap("length", length)
   1.108 +    .node("source", src)
   1.109      .attribute("caption", title)
   1.110 -    .node("source", src)
   1.111      .run();
   1.112  \endcode
   1.113  
   1.114 -Apart from LGF, the library can also interpret other graph
   1.115 -formats, such as the well-known DIMACS format or the NAUTY graph6
   1.116 -format.
   1.117 +Note that the \ref DigraphReader class is used through the \ref digraphReader()
   1.118 +function with several named parameters.
   1.119  
   1.120 -<hr>
   1.121 +\note By default, a map can be used with these classes if its value type
   1.122 +has standard I/O operators (\c operator<<(ostream&, T) and \c
   1.123 +operator>>(istream&, T&)). Otherwise, a function object can be specified
   1.124 +which converts the value type to \c std::string.
   1.125  
   1.126 -The \e LGF is a <em>column oriented</em>
   1.127 -file format for storing graphs and associated data like
   1.128 -node and edge maps.
   1.129 -
   1.130 -Each line with \c '#' first non-whitespace
   1.131 -character is considered as a comment line.
   1.132 -
   1.133 -Otherwise the file consists of sections starting with
   1.134 -a header line. The header lines starts with an \c '@' character followed by the
   1.135 -type of section. The standard section types are \c \@nodes, \c
   1.136 -\@arcs and \c \@edges
   1.137 -and \@attributes. Each header line may also have an optional
   1.138 -\e name, which can be use to distinguish the sections of the same
   1.139 -type.
   1.140 -
   1.141 -The standard sections are column oriented, each line consists of
   1.142 -<em>token</em>s separated by whitespaces. A token can be \e plain or
   1.143 -\e quoted. A plain token is just a sequence of non-whitespace characters,
   1.144 -while a quoted token is a
   1.145 -character sequence surrounded by double quotes, and it can also
   1.146 -contain whitespaces and escape sequences.
   1.147 -
   1.148 -The \c \@nodes section describes a set of nodes and associated
   1.149 -maps. The first is a header line, its columns are the names of the
   1.150 -maps appearing in the following lines.
   1.151 -One of the maps must be called \c
   1.152 -"label", which plays special role in the file.
   1.153 -The following
   1.154 -non-empty lines until the next section describes nodes of the
   1.155 -graph. Each line contains the values of the node maps
   1.156 -associated to the current node.
   1.157 +The following code demonstrates how the above digraph with the maps and
   1.158 +attributes can be written into the standard output in LGF Format.
   1.159  
   1.160  \code
   1.161 - @nodes
   1.162 - label  coordinates  size    title
   1.163 - 1      (10,20)      10      "First node"
   1.164 - 2      (80,80)      8       "Second node"
   1.165 - 3      (40,10)      10      "Third node"
   1.166 +  digraphWriter(g, std::cout)
   1.167 +    .nodeMap("coordinate", coord)
   1.168 +    .arcMap("length", length)
   1.169 +    .node("source", src)
   1.170 +    .attribute("caption", title)
   1.171 +    .run();
   1.172  \endcode
   1.173 + 
   1.174 +Apart from LGF, the library can also handle other graph
   1.175 +formats, such as the well-known DIMACS format.
   1.176  
   1.177 -The \c \@arcs section is very similar to the \c \@nodes section,
   1.178 -it again starts with a header line describing the names of the maps,
   1.179 -but the \c "label" map is not obligatory here. The following lines
   1.180 -describe the arcs. The first two tokens of each line are
   1.181 -the source and the target node of the arc, respectively, then come the map
   1.182 -values. The source and target tokens must be node labels.
   1.183 -
   1.184 -\code
   1.185 - @arcs
   1.186 -         capacity
   1.187 - 1   2   16
   1.188 - 1   3   12
   1.189 - 2   3   18
   1.190 -\endcode
   1.191 -
   1.192 -The \c \@edges is just a synonym of \c \@arcs. The \@arcs section can
   1.193 -also store the edge set of an undirected graph. In such case there is
   1.194 -a conventional method for store arc maps in the file, if two columns
   1.195 -has the same caption with \c '+' and \c '-' prefix, then these columns
   1.196 -can be regarded as the values of an arc map.
   1.197 -
   1.198 -The \c \@attributes section contains key-value pairs, each line
   1.199 -consists of two tokens, an attribute name, and then an attribute
   1.200 -value. The value of the attribute could be also a label value of a
   1.201 -node or an edge, or even an edge label prefixed with \c '+' or \c '-',
   1.202 -which regards to the forward or backward directed arc of the
   1.203 -corresponding edge.
   1.204 -
   1.205 -\code
   1.206 - @attributes
   1.207 - source 1
   1.208 - target 3
   1.209 - caption "LEMON test digraph"
   1.210 -\endcode
   1.211 -
   1.212 -The \e LGF can contain extra sections, but there is no restriction on
   1.213 -the format of such sections.
   1.214 +For more information, see a more detailed \ref lgf-format
   1.215 +"description of the LGF format" and the \ref io_group module
   1.216 +in the reference manual.
   1.217  
   1.218  [TRAILER]
   1.219  */