lgf.dox
author Peter Kovacs <kpeter@inf.elte.hu>
Mon, 15 Feb 2010 01:47:33 +0100
changeset 31 02083323ff2c
child 32 ef12f83752f6
permissions -rw-r--r--
Add preliminary pages about LGF and tools
kpeter@31
     1
/* -*- mode: C++; indent-tabs-mode: nil; -*-
kpeter@31
     2
 *
kpeter@31
     3
 * This file is a part of LEMON, a generic C++ optimization library.
kpeter@31
     4
 *
kpeter@31
     5
 * Copyright (C) 2003-2008
kpeter@31
     6
 * Egervary Jeno Kombinatorikus Optimalizalasi Kutatocsoport
kpeter@31
     7
 * (Egervary Research Group on Combinatorial Optimization, EGRES).
kpeter@31
     8
 *
kpeter@31
     9
 * Permission to use, modify and distribute this software is granted
kpeter@31
    10
 * provided that this copyright notice appears in all copies. For
kpeter@31
    11
 * precise terms see the accompanying LICENSE file.
kpeter@31
    12
 *
kpeter@31
    13
 * This software is provided "AS IS" with no warranty of any kind,
kpeter@31
    14
 * express or implied, and with no claim as to its suitability for any
kpeter@31
    15
 * purpose.
kpeter@31
    16
 *
kpeter@31
    17
 */
kpeter@31
    18
kpeter@31
    19
namespace lemon {
kpeter@31
    20
/**
kpeter@31
    21
[PAGE]sec_lgf[PAGE] Input-Output for Graphs
kpeter@31
    22
kpeter@31
    23
\todo Clarify this section.
kpeter@31
    24
kpeter@31
    25
LEMON provides a versatile file format for storing graphs
kpeter@31
    26
and related node and arc maps.
kpeter@31
    27
Such a format should be really flexible, it should be able to store arbitrary
kpeter@31
    28
number of maps of arbitrary value types.
kpeter@31
    29
On the other hand, the file size and the ease of processing are also critical
kpeter@31
    30
to support storing huge graphs, which is a major goal of LEMON.
kpeter@31
    31
These requirements forbid using complicated and deeply structured formats
kpeter@31
    32
like XML.
kpeter@31
    33
That is why a compact file format is designed for LEMON instead of using
kpeter@31
    34
hierarchical formats, such as GraphML, GXL or GML.
kpeter@31
    35
kpeter@31
    36
The LEMON Graph Format (LGF) comprises different sections, for
kpeter@31
    37
example a digraph is stored in a \c @nodes and an \c @arcs
kpeter@31
    38
section. These parts use column oriented formats, each
kpeter@31
    39
column belongs to a map in the graph. The first line of the section associate
kpeter@31
    40
names to these maps, which can be used to refer them.
kpeter@31
    41
Note that this simple idea makes it possible to extend the files with
kpeter@31
    42
new maps (columns) at any
kpeter@31
    43
position without having to modify the codes using these files.
kpeter@31
    44
kpeter@31
    45
The \c label map has special role, it must store unique values, which in turn
kpeter@31
    46
can be used to refer
kpeter@31
    47
to the nodes and arcs in the file. Finally, the first two column of the
kpeter@31
    48
\c @arcs section is anonymous, they indicate the source and target nodes,
kpeter@31
    49
respectively.
kpeter@31
    50
kpeter@31
    51
\code
kpeter@31
    52
  @nodes
kpeter@31
    53
  label coordinate
kpeter@31
    54
  0     (20,100)
kpeter@31
    55
  1     (40,120)
kpeter@31
    56
  ...
kpeter@31
    57
  41    (600,100)
kpeter@31
    58
  @arcs
kpeter@31
    59
            label length
kpeter@31
    60
  0    1    0     16
kpeter@31
    61
  0    2    1     12
kpeter@31
    62
  2    12   2     20
kpeter@31
    63
  ...
kpeter@31
    64
  36   41   123   21
kpeter@31
    65
  @attributes
kpeter@31
    66
  source 0
kpeter@31
    67
  target 41
kpeter@31
    68
  caption "A shortest path problem"
kpeter@31
    69
\endcode
kpeter@31
    70
kpeter@31
    71
The \ref DigraphReader and \ref DigraphWriter classes are used
kpeter@31
    72
to read and write a digraph and corresponding maps. By default, a map
kpeter@31
    73
can be used with these classes if its value type has standard I/O
kpeter@31
    74
operators (\c operator<<(ostream&, T) and \c operator>>(istream&, T&)).
kpeter@31
    75
Otherwise, a function object
kpeter@31
    76
can be specified which converts the value type to \c std::string.
kpeter@31
    77
The above LGF file can be scanned as follows.
kpeter@31
    78
kpeter@31
    79
\code
kpeter@31
    80
  ListDigraph g;
kpeter@31
    81
  ListDigraph::NodeMap<dim2::Point<int> > coord(g);
kpeter@31
    82
  ListDigraph::ArcMap<int> length(g);
kpeter@31
    83
  ListDigraph::Node src;
kpeter@31
    84
  std::string title;
kpeter@31
    85
kpeter@31
    86
  digraphReader(g, std::cin)
kpeter@31
    87
    .nodeMap("coord", coord)
kpeter@31
    88
    .arcMap("length", length)
kpeter@31
    89
    .attribute("caption", title)
kpeter@31
    90
    .node("source", src)
kpeter@31
    91
    .run();
kpeter@31
    92
\endcode
kpeter@31
    93
kpeter@31
    94
Apart from LGF, the library can also interpret other graph
kpeter@31
    95
formats, such as the well-known DIMACS format or the NAUTY graph6
kpeter@31
    96
format.
kpeter@31
    97
kpeter@31
    98
<hr>
kpeter@31
    99
kpeter@31
   100
The \e LGF is a <em>column oriented</em>
kpeter@31
   101
file format for storing graphs and associated data like
kpeter@31
   102
node and edge maps.
kpeter@31
   103
kpeter@31
   104
Each line with \c '#' first non-whitespace
kpeter@31
   105
character is considered as a comment line.
kpeter@31
   106
kpeter@31
   107
Otherwise the file consists of sections starting with
kpeter@31
   108
a header line. The header lines starts with an \c '@' character followed by the
kpeter@31
   109
type of section. The standard section types are \c \@nodes, \c
kpeter@31
   110
\@arcs and \c \@edges
kpeter@31
   111
and \@attributes. Each header line may also have an optional
kpeter@31
   112
\e name, which can be use to distinguish the sections of the same
kpeter@31
   113
type.
kpeter@31
   114
kpeter@31
   115
The standard sections are column oriented, each line consists of
kpeter@31
   116
<em>token</em>s separated by whitespaces. A token can be \e plain or
kpeter@31
   117
\e quoted. A plain token is just a sequence of non-whitespace characters,
kpeter@31
   118
while a quoted token is a
kpeter@31
   119
character sequence surrounded by double quotes, and it can also
kpeter@31
   120
contain whitespaces and escape sequences.
kpeter@31
   121
kpeter@31
   122
The \c \@nodes section describes a set of nodes and associated
kpeter@31
   123
maps. The first is a header line, its columns are the names of the
kpeter@31
   124
maps appearing in the following lines.
kpeter@31
   125
One of the maps must be called \c
kpeter@31
   126
"label", which plays special role in the file.
kpeter@31
   127
The following
kpeter@31
   128
non-empty lines until the next section describes nodes of the
kpeter@31
   129
graph. Each line contains the values of the node maps
kpeter@31
   130
associated to the current node.
kpeter@31
   131
kpeter@31
   132
\code
kpeter@31
   133
 @nodes
kpeter@31
   134
 label  coordinates  size    title
kpeter@31
   135
 1      (10,20)      10      "First node"
kpeter@31
   136
 2      (80,80)      8       "Second node"
kpeter@31
   137
 3      (40,10)      10      "Third node"
kpeter@31
   138
\endcode
kpeter@31
   139
kpeter@31
   140
The \c \@arcs section is very similar to the \c \@nodes section,
kpeter@31
   141
it again starts with a header line describing the names of the maps,
kpeter@31
   142
but the \c "label" map is not obligatory here. The following lines
kpeter@31
   143
describe the arcs. The first two tokens of each line are
kpeter@31
   144
the source and the target node of the arc, respectively, then come the map
kpeter@31
   145
values. The source and target tokens must be node labels.
kpeter@31
   146
kpeter@31
   147
\code
kpeter@31
   148
 @arcs
kpeter@31
   149
         capacity
kpeter@31
   150
 1   2   16
kpeter@31
   151
 1   3   12
kpeter@31
   152
 2   3   18
kpeter@31
   153
\endcode
kpeter@31
   154
kpeter@31
   155
The \c \@edges is just a synonym of \c \@arcs. The \@arcs section can
kpeter@31
   156
also store the edge set of an undirected graph. In such case there is
kpeter@31
   157
a conventional method for store arc maps in the file, if two columns
kpeter@31
   158
has the same caption with \c '+' and \c '-' prefix, then these columns
kpeter@31
   159
can be regarded as the values of an arc map.
kpeter@31
   160
kpeter@31
   161
The \c \@attributes section contains key-value pairs, each line
kpeter@31
   162
consists of two tokens, an attribute name, and then an attribute
kpeter@31
   163
value. The value of the attribute could be also a label value of a
kpeter@31
   164
node or an edge, or even an edge label prefixed with \c '+' or \c '-',
kpeter@31
   165
which regards to the forward or backward directed arc of the
kpeter@31
   166
corresponding edge.
kpeter@31
   167
kpeter@31
   168
\code
kpeter@31
   169
 @attributes
kpeter@31
   170
 source 1
kpeter@31
   171
 target 3
kpeter@31
   172
 caption "LEMON test digraph"
kpeter@31
   173
\endcode
kpeter@31
   174
kpeter@31
   175
The \e LGF can contain extra sections, but there is no restriction on
kpeter@31
   176
the format of such sections.
kpeter@31
   177
kpeter@31
   178
*/
kpeter@31
   179
}