Introduction
Grammar for generic path strings
Canonical form
Header synopsis
Class path
Member functions
Non-member functions
Validity checking functions
Rationale
Path decomposition examples
Filesystem Library functions traffic in objects of class path, provided by this header. The header also supplies non-member functions for error checking.
For actual operations on files and directories, see boost/filesystem/operations.hpp documentation.
For file I/O stream operations, see boost/filesystem/fstream.hpp documentation.
The Filesystem Library's Common Specifications apply to all member and non-member functions supplied by this header.
Class path provides a portable mechanism for representing paths in C++ programs, using a portable generic path string grammar. Class path is concerned with the lexical and syntactic aspects of a path. The path does not have to exist in the operating system's filesystem, and may contain names which are not even valid for the current operating system.
Rationale: If Filesystem functions trafficked in std::strings or C-style strings, the functions would provide only an illusion of portability since the function calls would be portable but the strings they operate on would not be portable.
An object of class path can be conceptualized as containing a sequence of strings, where each string contains the name of a directory, or, in the case of the string representing the element farthest from the root in the directory hierarchy, the name of a directory or file. Such a path representation is independent of any particular representation of the path as a single string.
There is no requirement that an implementation of class path actually contain a sequence of strings, but conceptualizing the contents as a sequence of strings provides a completely portable way to reason about paths.
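The sequence-of-strings view can be made concrete with a small sketch. It does not use the library: split_generic is a hypothetical helper that models the contents as a std::vector<std::string>, treats a leading "/" as its own element, and assumes the input has no root-name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of boost/filesystem/path.hpp.
// Splits a generic path string into its elements. A leading "/" becomes an
// element of its own; empty elements (from a trailing "/") are skipped.
// Assumption: the input contains no root-name.
std::vector<std::string> split_generic(const std::string& src) {
    // Mirrors the documented precondition: no embedded '\0' characters.
    assert(src.find('\0') == std::string::npos);

    std::vector<std::string> elements;
    std::string::size_type pos = 0;
    if (!src.empty() && src[0] == '/') {
        elements.push_back("/");
        pos = 1;
    }
    while (pos < src.size()) {
        std::string::size_type next = src.find('/', pos);
        if (next == std::string::npos) next = src.size();
        if (next > pos) elements.push_back(src.substr(pos, next - pos));
        pos = next + 1;
    }
    return elements;
}
```

Under these assumptions, split_generic("/foo/bar") yields the three elements "/", "foo", and "bar", which is the kind of sequence the rest of this page reasons about.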
So that programs can portably express paths as a single string, class path defines a grammar for a portable generic path string format, and supplies constructor and append operations taking such strings as arguments. Because user input or third-party library functions may supply path strings formatted according to operating system specific rules, an additional constructor is provided which takes a system-specific format as an argument.
Access functions are provided to retrieve the contents of an object of class path formatted as a portable path string, a directory path string using the operating system's format, and a file path string using the operating system's format. Additional access functions retrieve specific portions of the contained path.
The grammar is specified in extended BNF, with terminal symbols in quotes:
    path ::= [root] [relative-path]  // an empty path is valid
    root ::= [root-name] [root-directory]
    root-directory ::= "/"
    relative-path ::= path-element { "/" path-element } ["/"]
    path-element ::= name | parent-directory
    parent-directory ::= ".."
    name ::= char { char }
The following are not valid name chars: x01-x1F, <, >, :, ", /, \, |, *, ?. Although these characters are supported by some operating systems, they are disallowed by so many operating systems that they are banned altogether.
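The character rule for name can be checked mechanically. The following is a minimal sketch, not the library's validity checking code: is_valid_name_char and is_valid_name are hypothetical names, and the sketch also rejects '\0', on the assumption that the constructors' precondition against embedded '\0' characters applies.

```cpp
#include <string>

// Hypothetical helper: true if c may appear in a name under the grammar above.
bool is_valid_name_char(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    // x01-x1F are banned by the grammar; '\0' is rejected too (assumption).
    if (u <= 0x1F) return false;
    // The remaining banned characters: < > : " / \ | * ?
    return std::string("<>:\"/\\|*?").find(c) == std::string::npos;
}

// name ::= char { char }, so a name needs at least one character and every
// character must be valid. ".." is the parent-directory element rather than
// a name; callers are expected to test for it separately.
bool is_valid_name(const std::string& s) {
    if (s.empty()) return false;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (!is_valid_name_char(s[i])) return false;
    }
    return true;
}
```

A check like this only covers the generic grammar. Whether a name is portable to a given operating system is a separate question.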
root-name grammar is implementation-defined. root-name must not be present in generic input (the undecorated path constructors); it may be part of the strings returned by path member functions, and may be present in the argument to path constructors with the native decorator.
Although implementation-defined, it is desirable that root-name have a grammar which is distinguishable from other grammar elements, and follow the conventions of the operating system.
The optional trailing "/" in a relative-path is allowed as a notational convenience. It has no semantic meaning and is discarded in conversions to canonical form.
Whether or not a generic path string is actually portable to a particular operating system will depend on the names used. See the Portability Guide.
Adjacent name, parent-directory elements in m_name are recursively removed.
relative-path does not have a trailing "/".
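The first canonical-form rule can be sketched on a vector of elements. This is an illustration rather than the library's implementation: remove_name_parent_pairs is a hypothetical helper, and it assumes the input holds only names, "..", and an optional leading "/" (root-names are not modelled).

```cpp
#include <string>
#include <vector>

// Hypothetical helper modelling the rule on a vector of elements.
// A ".." that directly follows a name cancels that name. Because each
// cancellation exposes the previous element, using the output as a stack
// handles the recursive case: a, b, .., .. reduces to nothing.
// A ".." with no preceding name (at the start, after another "..", or
// after the root-directory "/") has nothing to cancel and is kept.
std::vector<std::string> remove_name_parent_pairs(
        const std::vector<std::string>& in) {
    std::vector<std::string> out;
    for (std::vector<std::string>::size_type i = 0; i < in.size(); ++i) {
        const std::string& e = in[i];
        const bool cancels = e == ".." && !out.empty()
                             && out.back() != ".." && out.back() != "/";
        if (cancels) {
            out.pop_back();
        } else {
            out.push_back(e);
        }
    }
    return out;
}
```

For example, the elements foo, .., bar reduce to the single element bar.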
    namespace boost
    {
      namespace filesystem
      {
        enum path_format { native };

        class path
        {
        public:
          // compiler generates copy constructor,
          // copy assignment, and destructor

          // constructors:
          path();
          path( const std::string & src );
          path( const char * src );
          path( const std::string & src, path_format );
          path( const char * src, path_format );

          // append operations:
          path & operator /= ( const path & rhs );
          path operator / ( const path & rhs ) const;

          // conversion functions:
          const std::string & string() const;
          std::string native_file_string() const;
          std::string native_directory_string() const;

          // decomposition functions:
          path root_path() const;
          std::string root_name() const;
          std::string root_directory() const;
          path relative_path() const;
          std::string leaf() const;
          path branch_path() const;

          // query functions:
          bool empty() const;
          bool is_complete() const;
          bool has_root_path() const;
          bool has_root_name() const;
          bool has_root_directory() const;
          bool has_relative_path() const;
          bool has_leaf() const;
          bool has_branch_path() const;

          // iteration:
          typedef implementation-defined iterator;
          iterator begin() const;
          iterator end() const;

        private:
          std::vector<std::string> m_name;  // for exposition only
        };

        path operator / ( const char * lhs, const path & rhs );
        path operator / ( const std::string & lhs, const path & rhs );

        // Also see Undocumented non-member functions below
      }
    }
For the sake of exposition, class path member functions are described as if the class contains a private member std::vector<std::string> m_name. Actual implementations may differ.
Class path member, or non-member operator/, functions may throw a filesystem_error exception if the path is not in the syntax specified for the grammar.
Note: There is no guarantee that a path object represents a path which is considered valid by the current operating system. A path might be invalid to the operating system because it contains invalid names (too long, invalid characters, and so on), or because it is a partial path still as yet unfinished by the program. An invalid path will normally be detected at time of use, such as by one of the Filesystem Library's operations or fstream functions.
Portability Warning: There is no guarantee that a path object represents a path which would be portable to another operating system. A path might be non-portable because it contains names which the other operating system considers too long, or which contain characters invalid on that system. Validity checking functions are supplied to ensure names in paths are as portable as desired, but they must be explicitly called by the user.
Several path member functions return representations of m_name in formats specific to the operating system. These formats are implementation defined. If an m_name element contains characters which are invalid under the operating system's rules, and there is an unambiguous translation between the invalid character and a valid character, the implementation is required to perform that translation. For example, if an operating system does not permit lowercase letters in file or directory names, these letters will be translated to uppercase if unambiguous. Such translation does not apply to generic path string format representations.
The rule-of-thumb is to use string() when a generic string representation of the path is required, and use either native_directory_string() or native_file_string() when a string representation formatted for the particular operating system is required.
The differences between the representations returned by string(), native_directory_string(), and native_file_string() are illustrated by the following code:
    path my_path( "foo/bar/data.txt" );
    std::cout
      << "string------------------: " << my_path.string() << '\n'
      << "native_directory_string-: " << my_path.native_directory_string() << '\n'
      << "native_file_string------: " << my_path.native_file_string() << '\n';
On POSIX systems, the output would be:
    string------------------: foo/bar/data.txt
    native_directory_string-: foo/bar/data.txt
    native_file_string------: foo/bar/data.txt
On Windows, the output would be:
    string------------------: foo/bar/data.txt
    native_directory_string-: foo\bar\data.txt
    native_file_string------: foo\bar\data.txt
On classic Mac OS, the output would be:
    string------------------: foo/bar/data.txt
    native_directory_string-: foo:bar:data.txt
    native_file_string------: foo:bar:data.txt
On a hypothetical operating system using OpenVMS format representations, it would be:
    string------------------: foo/bar/data.txt
    native_directory_string-: [foo.bar.data.txt]
    native_file_string------: [foo.bar]data.txt
Note that because OpenVMS uses a period both as a directory separator character and as a separator between filename and extension, native_directory_string() in the example produces a useless result. On this operating system, the programmer should only use this path as a file path. (There is a portability recommendation to not use periods in directory names.)
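The outputs above can be reproduced with a sketch that formats a vector of relative-path elements. None of this is library code: join_elements, vms_directory_string, and vms_file_string are hypothetical helpers, and the OpenVMS-style formatting is only meant to match the example output, not the full OpenVMS syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: join relative-path elements with one separator.
// '/' gives the generic form, '\\' the Windows form, ':' the classic Mac OS form.
std::string join_elements(const std::vector<std::string>& names, char sep) {
    std::string out;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i != 0) out += sep;
        out += names[i];
    }
    return out;
}

// OpenVMS-style directory path: every element is a directory inside brackets.
std::string vms_directory_string(const std::vector<std::string>& names) {
    return "[" + join_elements(names, '.') + "]";
}

// OpenVMS-style file path: all but the last element are directories, and the
// last element is the file name, written after the closing bracket.
std::string vms_file_string(const std::vector<std::string>& names) {
    assert(!names.empty());
    const std::vector<std::string> dirs(names.begin(), names.end() - 1);
    if (dirs.empty()) return names.back();
    return "[" + join_elements(dirs, '.') + "]" + names.back();
}
```

The sketch makes the asymmetry visible: the directory and file forms differ only where the operating system's syntax distinguishes them, which is why the two native functions give identical results on POSIX and Windows.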
POSIX and other UNIX-like operating systems have a single root, while most other operating systems have multiple roots. Multi-root operating systems require a root-name such as a drive, device, disk, volume, or share name for a path to be resolved to an actual specific file or directory. Because of this, the root_path() and root_directory() functions return identical results on UNIX and other single-root operating systems, but different results on multi-root operating systems. Thus use of the wrong function will not be apparent on UNIX-like systems, but will result in non-portable code which will fail when used on multi-root systems. UNIX programmers are cautioned to use particular care in choosing between root_path() and root_directory(). If undecided, use root_path().
The same warning applies to has_root_path() and has_root_directory().
path();
Effects: Default constructs an object of class path.
path( const std::string & src );
path( const char * src );
Precondition: src conforms to the generic path string grammar relative-path syntax with optional root-directory prefix, and contains no embedded '\0' characters.
Effects: For each src element, m_name.push_back( element ).
Postcondition: m_name is in canonical form.
Rationale: These constructors are not explicit because an intended use is automatic conversion of strings to paths.
path( const std::string & src, path_format );
path( const char * src, path_format );
Precondition: src conforms to the operating system's grammar for path strings, and contains no embedded '\0' characters.
Effects: For each src element, m_name.push_back( element ).
Postcondition: m_name is in canonical form.
path & operator/=( const path & rhs );
Effects: If any of the following conditions are met, then m_name.push_back("/").
- has_relative_path().
- !is_complete() && has_root_name(), and the operating system requires the system-specific root be absolute.
Then append rhs.m_name to m_name.
(Footnote: Thus on Windows, (path("//share") /= "foo").string() is "//share/foo".)
Returns: *this
Postcondition: m_name is in canonical form.
Rationale: It is not considered an error for rhs to include a root-name because it might be relative, and thus valid. For example, on Windows, the following must succeed:
    path p( "c:", native );
    p /= "/foo";
    assert( p.string() == "c:/foo" );
path operator/ ( const path & rhs ) const;
Returns: path( *this ) /= rhs
Rationale: Operator / is supplied because together with operator /=, it provides a convenient way for users to supply paths with a variable number of elements. For example, initial_directory() / "src" / test_name. Operator+ and operator+= were considered as alternatives, but deemed too easy to confuse with those operators for std::string. Operator<< and operator<<= were used until it was pointed out during public review that / and /= match the generic path syntax.
Note: Also see non-member operator/ functions.
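How operator/= and operator/ combine to build a path from a variable number of elements can be shown with a simplified stand-in. mini_path below is hypothetical and much narrower than class path: each constructor argument is taken as a single name (it is not parsed), and there is no root-name, root-directory, or canonical-form handling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical stand-in for class path (relative paths only).
class mini_path {
public:
    mini_path() {}
    // Non-explicit on purpose, so that strings convert to paths automatically.
    mini_path(const char* name) {
        assert(name != 0);
        if (*name != '\0') names_.push_back(name);
    }
    mini_path(const std::string& name) {
        if (!name.empty()) names_.push_back(name);
    }

    mini_path& operator/=(const mini_path& rhs) {
        // Copy first so that p /= p is safe while names_ grows.
        const std::vector<std::string> tail(rhs.names_);
        names_.insert(names_.end(), tail.begin(), tail.end());
        return *this;
    }

    // Same shape as the documented Returns clause: path( *this ) /= rhs.
    mini_path operator/(const mini_path& rhs) const {
        return mini_path(*this) /= rhs;
    }

    // Generic representation: elements joined by "/".
    std::string string() const {
        std::string out;
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (i != 0) out += '/';
            out += names_[i];
        }
        return out;
    }

private:
    std::vector<std::string> names_;
};
```

With this stand-in, an expression such as base / "src" / test_name reads left to right like the generic path string it produces, and the left operand is not modified.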
const std::string & string() const;
Returns: The contents of m_name, formatted according to the rules of the generic path string grammar.
Note: If any m_name elements originated from the system specific constructors, there is no guarantee that the returned string is unambiguous according to the grammar. A root-name indistinguishable from a relative-path name, a name containing "/", a name "..", and a root-name beyond the first element all could cause ambiguities. Such an ambiguous representation might still be useful for some purposes, such as display. If no m_name elements originated from the system specific constructors, the returned string is always unambiguous.
See: Representation example above.
std::string native_file_string() const;
Returns: The contents of m_name, formatted in the system-specific representation of a file path.
See: Representation example above.
Naming rationale: The name is deliberately ugly to warn users that this function yields non-portable results.
std::string native_directory_string() const;
Returns: The contents of m_name, formatted in the system-specific representation of a directory path.
See: Representation example above.
Naming rationale: The name is deliberately ugly to warn users that this function yields non-portable results.
path root_path() const;
Returns: root_name() / root_directory()
Portably provides a copy of a path's full root path, if any. See Path decomposition examples.
std::string root_name() const;
Returns: If !m_name.empty() && m_name[0] is a root-name, returns m_name[0], else returns a null string.
Portably provides a copy of a path's root-name, if any. See Path decomposition examples.
std::string root_directory() const;
Returns: If the path contains root-directory, then string("/"), else string().
Portably provides a copy of a path's root-directory, if any. The only possible results are "/" or "". See Path decomposition examples.
path relative_path() const;
Returns: A new path containing only the relative-path portion of the source path.
Portably provides a copy of a path's relative portion, if any. See Path decomposition examples.
std::string leaf() const;
Returns: empty() ? std::string() : m_name.back()
A typical use is to obtain the undecorated name of a directory entry from the path returned by a directory_iterator. See Path decomposition examples.
path branch_path() const;
Returns: m_name.size() <= 1 ? path("") : x, where x is a path constructed from all the elements of m_name except the last.
A typical use is to obtain the parent path for a path supplied by the user. See Path decomposition examples.
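The Returns clauses for leaf() and branch_path() translate almost directly into code on the exposition-only m_name vector. The free functions below are a hypothetical model of that vector, not the member functions themselves, and they return element vectors rather than path objects.

```cpp
#include <string>
#include <vector>

// Models: empty() ? std::string() : m_name.back()
std::string leaf(const std::vector<std::string>& m_name) {
    return m_name.empty() ? std::string() : m_name.back();
}

// Models: m_name.size() <= 1 ? path("") : x, where x holds every element
// of m_name except the last.
std::vector<std::string> branch_path(const std::vector<std::string>& m_name) {
    if (m_name.size() <= 1) return std::vector<std::string>();
    return std::vector<std::string>(m_name.begin(), m_name.end() - 1);
}
```

For the elements "/", "foo", "bar" this gives a leaf of bar and a branch of "/", "foo", in line with the /foo/bar row of the decomposition examples.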
bool empty() const;
Returns: m_name.empty().
Naming rationale: Because the equivalent function for standard library containers is named empty(), prior versions with other names caused numerous typos. The problem was exacerbated because tests for path emptiness are often used near, or in the same expression as, tests for string emptiness.
bool is_complete() const;
Returns: For single-root operating systems, has_root_directory(). For multi-root operating systems, has_root_directory() && has_root_name().
Naming rationale: The alternate name, is_absolute(), causes confusion and controversy because on multi-root operating systems some people believe root_name() should participate in is_absolute(), and some don't.
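The is_complete() rule is small enough to state as code. The sketch below is a model under stated assumptions: root_parts is a hypothetical struct holding the two root components as strings, and multi_root is a hypothetical flag standing in for what the implementation knows about the operating system.

```cpp
#include <string>

// Hypothetical model of the root portion of a path.
struct root_parts {
    std::string root_name;       // e.g. "c:" or "//shr"; empty if absent
    std::string root_directory;  // "/" or empty
};

// Single-root systems: a root-directory is enough.
// Multi-root systems: both root-name and root-directory are required.
bool is_complete(const root_parts& p, bool multi_root) {
    const bool has_root_directory = !p.root_directory.empty();
    const bool has_root_name = !p.root_name.empty();
    return multi_root ? (has_root_directory && has_root_name)
                      : has_root_directory;
}
```

Under this model a path such as /foo is complete on a single-root system but not on a multi-root one, and c:foo is not complete on a multi-root system because it has a root-name but no root-directory.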
bool has_root_path() const;
Returns: has_root_name() || has_root_directory()
bool has_root_name() const;
Returns: !root_name().empty()
bool has_root_directory() const;
Returns: !root_directory().empty()
bool has_relative_path() const;
Returns: !relative_path().empty()
bool has_leaf() const;
Returns: !leaf().empty()
bool has_branch_path() const;
Returns: !branch_path().empty()
typedef implementation-defined iterator;
A const iterator meeting the C++ Standard Library requirements for bidirectional iterators (24.1). The iterator is a class type (so that operator++ and -- will work on temporaries). The value, reference, and pointer types are std::string, const std::string &, and const std::string *, respectively.
iterator begin() const;
Returns: m_name.begin()
iterator end() const;
Returns: m_name.end()
path operator / ( const char * lhs, const path & rhs );
path operator / ( const std::string & lhs, const path & rhs );
Returns: path( lhs ) /= rhs
The header boost/filesystem/path.hpp also supplies several non-member functions which can be used to verify that a path meets certain requirements. These subsidiary functions are undocumented pending more research and discussion, and should not be relied upon as they are likely to change.
Function naming: Class path member function names and operations.hpp non-member function names were chosen to be somewhat distinct from one another. The objective was to avoid cases like foo.empty() and empty( foo ) both being valid, but with completely different semantics. At one point path::empty() was renamed path::is_null(), but that caused many coding typos because std::string::empty() is often used nearby.
Decomposition functions: Decomposition functions are provided because without them it is impossible to write portable path manipulations. Convenience is also a factor.
Const vs non-const returns: In some earlier versions of the library, member functions returned values as const rather than non-const. See Scott Meyers, Effective C++, Item 21. The const qualifiers were eliminated (1) to conform with C++ Standard Library practice, (2) because non-const returns allow occasionally useful expressions, and (3) because the coding errors the const qualifiers prevented were deemed rare. A requirement that path::iterator be a class type was added to eliminate non-const iterator errors.
It is often useful to extract specific elements from a path object. While any decomposition can be achieved by iterating over the elements of a path, convenience functions are provided which are easier to use, more efficient, and less error prone.
The first column of the table gives the example path, formatted by the string() function. The second column shows the values which would be returned by dereferencing each element iterator. The remaining columns show the results of various expressions.
p.string() | Elements | p.root_path() | p.root_name() | p.root_directory() | p.relative_path() | p.root_directory() / p.relative_path() | p.root_name() / p.relative_path() | p.branch_path() | p.leaf()
All systems
/ | / | / | "" | / | "" | / | "" | "" | /
foo | foo | "" | "" | "" | foo | foo | foo | "" | foo
/foo | /,foo | / | "" | / | foo | /foo | foo | / | foo
foo/bar | foo,bar | "" | "" | "" | foo/bar | foo/bar | foo/bar | foo | bar
/foo/bar | /,foo,bar | / | "" | / | foo/bar | /foo/bar | foo/bar | /foo | bar
Windows
c: | c: | c: | c: | "" | "" | "" | c: | "" | c:
c:foo | c:,foo | c: | c: | "" | foo | foo | c:foo | c: | foo
c:/ | c:,/ | c:/ | c: | / | "" | / | c: | c: | /
c:/foo | c:,/,foo | c:/ | c: | / | foo | /foo | c:foo | c:/ | foo
//shr | //shr | //shr | //shr | "" | "" | "" | //shr | "" | //shr
//shr/ | //shr,/ | //shr/ | //shr | / | "" | / | //shr | //shr | /
//shr/foo | //shr, | //shr/ | //shr | / | foo | /foo | //shr/foo | //shr/ | foo
prn: | prn: | prn: | prn: | "" | "" | "" | prn: | "" | prn:
© Copyright Beman Dawes, 2002
Revised 11 March, 2003