kpeter@31: /* -*- mode: C++; indent-tabs-mode: nil; -*-
kpeter@31:  *
kpeter@31:  * This file is a part of LEMON, a generic C++ optimization library.
kpeter@31:  *
kpeter@31:  * Copyright (C) 2003-2008
kpeter@31:  * Egervary Jeno Kombinatorikus Optimalizalasi Kutatocsoport
kpeter@31:  * (Egervary Research Group on Combinatorial Optimization, EGRES).
kpeter@31:  *
kpeter@31:  * Permission to use, modify and distribute this software is granted
kpeter@31:  * provided that this copyright notice appears in all copies. For
kpeter@31:  * precise terms see the accompanying LICENSE file.
kpeter@31:  *
kpeter@31:  * This software is provided "AS IS" with no warranty of any kind,
kpeter@31:  * express or implied, and with no claim as to its suitability for any
kpeter@31:  * purpose.
kpeter@31:  *
kpeter@31:  */
kpeter@31: 
kpeter@31: namespace lemon {
kpeter@31: /**
kpeter@31: [PAGE]sec_lgf[PAGE] Input-Output for Graphs
kpeter@31: 
kpeter@31: \todo Clarify this section.
kpeter@31: 
kpeter@31: LEMON provides a versatile file format for storing graphs
kpeter@31: and related node and arc maps.
kpeter@31: Such a format should be really flexible, it should be able to store arbitrary
kpeter@31: number of maps of arbitrary value types.
kpeter@31: On the other hand, the file size and the ease of processing are also critical
kpeter@31: to support storing huge graphs, which is a major goal of LEMON.
kpeter@31: These requirements forbid using complicated and deeply structured formats
kpeter@31: like XML.
kpeter@31: That is why a compact file format is designed for LEMON instead of using
kpeter@31: hierarchical formats, such as GraphML, GXL or GML.
kpeter@31: 
kpeter@31: The LEMON Graph Format (LGF) comprises different sections, for
kpeter@31: example a digraph is stored in a \c @nodes and an \c @arcs
kpeter@31: section. These parts use column oriented formats, each
kpeter@31: column belongs to a map in the graph. The first line of the section associate
kpeter@31: names to these maps, which can be used to refer them.
kpeter@31: Note that this simple idea makes it possible to extend the files with
kpeter@31: new maps (columns) at any
kpeter@31: position without having to modify the codes using these files.
kpeter@31: 
kpeter@31: The \c label map has special role, it must store unique values, which in turn
kpeter@31: can be used to refer
kpeter@31: to the nodes and arcs in the file. Finally, the first two column of the
kpeter@31: \c @arcs section is anonymous, they indicate the source and target nodes,
kpeter@31: respectively.
kpeter@31: 
kpeter@31: \code
kpeter@31:   @nodes
kpeter@31:   label coordinate
kpeter@31:   0     (20,100)
kpeter@31:   1     (40,120)
kpeter@31:   ...
kpeter@31:   41    (600,100)
kpeter@31:   @arcs
kpeter@31:             label length
kpeter@31:   0    1    0     16
kpeter@31:   0    2    1     12
kpeter@31:   2    12   2     20
kpeter@31:   ...
kpeter@31:   36   41   123   21
kpeter@31:   @attributes
kpeter@31:   source 0
kpeter@31:   target 41
kpeter@31:   caption "A shortest path problem"
kpeter@31: \endcode
kpeter@31: 
kpeter@31: The \ref DigraphReader and \ref DigraphWriter classes are used
kpeter@31: to read and write a digraph and corresponding maps. By default, a map
kpeter@31: can be used with these classes if its value type has standard I/O
kpeter@31: operators (\c operator<<(ostream&, T) and \c operator>>(istream&, T&)).
kpeter@31: Otherwise, a function object
kpeter@31: can be specified which converts the value type to \c std::string.
kpeter@31: The above LGF file can be scanned as follows.
kpeter@31: 
kpeter@31: \code
kpeter@31:   ListDigraph g;
kpeter@31:   ListDigraph::NodeMap<dim2::Point<int> > coord(g);
kpeter@31:   ListDigraph::ArcMap<int> length(g);
kpeter@31:   ListDigraph::Node src;
kpeter@31:   std::string title;
kpeter@31: 
kpeter@31:   digraphReader(g, std::cin)
kpeter@31:     .nodeMap("coord", coord)
kpeter@31:     .arcMap("length", length)
kpeter@31:     .attribute("caption", title)
kpeter@31:     .node("source", src)
kpeter@31:     .run();
kpeter@31: \endcode
kpeter@31: 
kpeter@31: Apart from LGF, the library can also interpret other graph
kpeter@31: formats, such as the well-known DIMACS format or the NAUTY graph6
kpeter@31: format.
kpeter@31: 
kpeter@31: <hr>
kpeter@31: 
kpeter@31: The \e LGF is a <em>column oriented</em>
kpeter@31: file format for storing graphs and associated data like
kpeter@31: node and edge maps.
kpeter@31: 
kpeter@31: Each line with \c '#' first non-whitespace
kpeter@31: character is considered as a comment line.
kpeter@31: 
kpeter@31: Otherwise the file consists of sections starting with
kpeter@31: a header line. The header lines starts with an \c '@' character followed by the
kpeter@31: type of section. The standard section types are \c \@nodes, \c
kpeter@31: \@arcs and \c \@edges
kpeter@31: and \@attributes. Each header line may also have an optional
kpeter@31: \e name, which can be use to distinguish the sections of the same
kpeter@31: type.
kpeter@31: 
kpeter@31: The standard sections are column oriented, each line consists of
kpeter@31: <em>token</em>s separated by whitespaces. A token can be \e plain or
kpeter@31: \e quoted. A plain token is just a sequence of non-whitespace characters,
kpeter@31: while a quoted token is a
kpeter@31: character sequence surrounded by double quotes, and it can also
kpeter@31: contain whitespaces and escape sequences.
kpeter@31: 
kpeter@31: The \c \@nodes section describes a set of nodes and associated
kpeter@31: maps. The first is a header line, its columns are the names of the
kpeter@31: maps appearing in the following lines.
kpeter@31: One of the maps must be called \c
kpeter@31: "label", which plays special role in the file.
kpeter@31: The following
kpeter@31: non-empty lines until the next section describes nodes of the
kpeter@31: graph. Each line contains the values of the node maps
kpeter@31: associated to the current node.
kpeter@31: 
kpeter@31: \code
kpeter@31:  @nodes
kpeter@31:  label  coordinates  size    title
kpeter@31:  1      (10,20)      10      "First node"
kpeter@31:  2      (80,80)      8       "Second node"
kpeter@31:  3      (40,10)      10      "Third node"
kpeter@31: \endcode
kpeter@31: 
kpeter@31: The \c \@arcs section is very similar to the \c \@nodes section,
kpeter@31: it again starts with a header line describing the names of the maps,
kpeter@31: but the \c "label" map is not obligatory here. The following lines
kpeter@31: describe the arcs. The first two tokens of each line are
kpeter@31: the source and the target node of the arc, respectively, then come the map
kpeter@31: values. The source and target tokens must be node labels.
kpeter@31: 
kpeter@31: \code
kpeter@31:  @arcs
kpeter@31:          capacity
kpeter@31:  1   2   16
kpeter@31:  1   3   12
kpeter@31:  2   3   18
kpeter@31: \endcode
kpeter@31: 
kpeter@31: The \c \@edges is just a synonym of \c \@arcs. The \@arcs section can
kpeter@31: also store the edge set of an undirected graph. In such case there is
kpeter@31: a conventional method for store arc maps in the file, if two columns
kpeter@31: has the same caption with \c '+' and \c '-' prefix, then these columns
kpeter@31: can be regarded as the values of an arc map.
kpeter@31: 
kpeter@31: The \c \@attributes section contains key-value pairs, each line
kpeter@31: consists of two tokens, an attribute name, and then an attribute
kpeter@31: value. The value of the attribute could be also a label value of a
kpeter@31: node or an edge, or even an edge label prefixed with \c '+' or \c '-',
kpeter@31: which regards to the forward or backward directed arc of the
kpeter@31: corresponding edge.
kpeter@31: 
kpeter@31: \code
kpeter@31:  @attributes
kpeter@31:  source 1
kpeter@31:  target 3
kpeter@31:  caption "LEMON test digraph"
kpeter@31: \endcode
kpeter@31: 
kpeter@31: The \e LGF can contain extra sections, but there is no restriction on
kpeter@31: the format of such sections.
kpeter@31: 
kpeter@31: */
kpeter@31: }