# HG changeset patch # User Peter Kovacs # Date 1266795739 -3600 # Node ID a9f8282eb6b7fa6133813df31afd951661ca8fb2 # Parent 7b789434b1af5e7d713356203a80b85847a81b0d Rework the LGF section diff -r 7b789434b1af -r a9f8282eb6b7 lgf.dox --- a/lgf.dox Mon Feb 22 00:41:19 2010 +0100 +++ b/lgf.dox Mon Feb 22 00:42:19 2010 +0100 @@ -18,35 +18,31 @@ namespace lemon { /** -[PAGE]sec_lgf[PAGE] Input-Output for Graphs +[PAGE]sec_lgf[PAGE] LEMON Graph Format -\todo Clarify this section. +LEMON provides a versatile file format for storing graphs and related +node and arc maps. Such a format should be really flexible, it should be +able to store arbitrary number of maps of arbitrary value types. +On the other hand, the file size and the ease of processing are also critical +to support huge graphs, which is a major goal of LEMON. +These requirements forbid using complicated and deeply structured formats +like XML. That is why a compact text file format is designed for LEMON +instead of using hierarchical formats (such as GraphML, GXL or GML). -LEMON provides a versatile file format for storing graphs -and related node and arc maps. -Such a format should be really flexible, it should be able to store arbitrary -number of maps of arbitrary value types. -On the other hand, the file size and the ease of processing are also critical -to support storing huge graphs, which is a major goal of LEMON. -These requirements forbid using complicated and deeply structured formats -like XML. -That is why a compact file format is designed for LEMON instead of using -hierarchical formats, such as GraphML, GXL or GML. +The LEMON Graph Format (LGF) consists of several sections, for example a +digraph is stored in a \@nodes and an \@arcs section. +These parts use column oriented formats, each column belongs to a map in +the graph. The first line of the section associates names to these maps, +which can be used to refer them. Note that this simple idea makes it +possible to extend the files with new maps (columns) at any position +without having to modify the codes using these files. +Other data can also be added to an LGF file as individual properties +in an \@attributes section. This part can be used to specify +certain nodes or arcs, or store meta data for the file, such as copyright +notice. -The LEMON Graph Format (LGF) comprises different sections, for -example a digraph is stored in a \c @nodes and an \c @arcs -section. These parts use column oriented formats, each -column belongs to a map in the graph. The first line of the section associate -names to these maps, which can be used to refer them. -Note that this simple idea makes it possible to extend the files with -new maps (columns) at any -position without having to modify the codes using these files. - -The \c label map has special role, it must store unique values, which in turn -can be used to refer -to the nodes and arcs in the file. Finally, the first two column of the -\c @arcs section is anonymous, they indicate the source and target nodes, -respectively. +For example, a shortest path problem, which is represented as a directed +graph with some node and arc maps, can be stored in LGF format as follows. \code @nodes @@ -55,6 +51,7 @@ 1 (40,120) ... 41 (600,100) + @arcs label length 0 1 0 16 @@ -62,19 +59,21 @@ 2 12 2 20 ... 36 41 123 21 + @attributes source 0 - target 41 caption "A shortest path problem" \endcode -The \ref DigraphReader and \ref DigraphWriter classes are used -to read and write a digraph and corresponding maps. By default, a map -can be used with these classes if its value type has standard I/O -operators (\c operator<<(ostream&, T) and \c operator>>(istream&, T&)). -Otherwise, a function object -can be specified which converts the value type to \c std::string. -The above LGF file can be scanned as follows. +In the \@nodes section, the \c label map has special role, +it must store unique values, which in turn can be used to refer to the nodes +in the file. +The first two columns of the \@arcs section are anonymous, they +store the source and target nodes, respectively. + +The \ref DigraphReader and \ref DigraphWriter classes can used +to read and write data in LGF format. +For example, the above file can be read as follows. \code ListDigraph g; @@ -83,97 +82,40 @@ ListDigraph::Node src; std::string title; - digraphReader(g, std::cin) - .nodeMap("coord", coord) + digraphReader(g, "digraph.lgf") + .nodeMap("coordinate", coord) .arcMap("length", length) + .node("source", src) .attribute("caption", title) - .node("source", src) .run(); \endcode -Apart from LGF, the library can also interpret other graph -formats, such as the well-known DIMACS format or the NAUTY graph6 -format. +Note that the \ref DigraphReader class is used through the \ref digraphReader() +function with several named parameters. -
+\note By default, a map can be used with these classes if its value type +has standard I/O operators (\c operator<<(ostream&, T) and \c +operator>>(istream&, T&)). Otherwise, a function object can be specified +which converts the value type to \c std::string. -The \e LGF is a column oriented -file format for storing graphs and associated data like -node and edge maps. - -Each line with \c '#' first non-whitespace -character is considered as a comment line. - -Otherwise the file consists of sections starting with -a header line. The header lines starts with an \c '@' character followed by the -type of section. The standard section types are \c \@nodes, \c -\@arcs and \c \@edges -and \@attributes. Each header line may also have an optional -\e name, which can be use to distinguish the sections of the same -type. - -The standard sections are column oriented, each line consists of -tokens separated by whitespaces. A token can be \e plain or -\e quoted. A plain token is just a sequence of non-whitespace characters, -while a quoted token is a -character sequence surrounded by double quotes, and it can also -contain whitespaces and escape sequences. - -The \c \@nodes section describes a set of nodes and associated -maps. The first is a header line, its columns are the names of the -maps appearing in the following lines. -One of the maps must be called \c -"label", which plays special role in the file. -The following -non-empty lines until the next section describes nodes of the -graph. Each line contains the values of the node maps -associated to the current node. +The following code demonstrates how the above digraph with the maps and +attributes can be written into the standard output in LGF Format. \code - @nodes - label coordinates size title - 1 (10,20) 10 "First node" - 2 (80,80) 8 "Second node" - 3 (40,10) 10 "Third node" + digraphWriter(g, std::cout) + .nodeMap("coordinate", coord) + .arcMap("length", length) + .node("source", src) + .attribute("caption", title) + .run(); \endcode + +Apart from LGF, the library can also handle other graph +formats, such as the well-known DIMACS format. -The \c \@arcs section is very similar to the \c \@nodes section, -it again starts with a header line describing the names of the maps, -but the \c "label" map is not obligatory here. The following lines -describe the arcs. The first two tokens of each line are -the source and the target node of the arc, respectively, then come the map -values. The source and target tokens must be node labels. - -\code - @arcs - capacity - 1 2 16 - 1 3 12 - 2 3 18 -\endcode - -The \c \@edges is just a synonym of \c \@arcs. The \@arcs section can -also store the edge set of an undirected graph. In such case there is -a conventional method for store arc maps in the file, if two columns -has the same caption with \c '+' and \c '-' prefix, then these columns -can be regarded as the values of an arc map. - -The \c \@attributes section contains key-value pairs, each line -consists of two tokens, an attribute name, and then an attribute -value. The value of the attribute could be also a label value of a -node or an edge, or even an edge label prefixed with \c '+' or \c '-', -which regards to the forward or backward directed arc of the -corresponding edge. - -\code - @attributes - source 1 - target 3 - caption "LEMON test digraph" -\endcode - -The \e LGF can contain extra sections, but there is no restriction on -the format of such sections. +For more information, see a more detailed \ref lgf-format +"description of the LGF format" and the \ref io_group module +in the reference manual. [TRAILER] */