‘filesysobjects.__init__’ - Module¶
The package ‘filesysobjects’ provides utilities for handling file-system-like resource paths as class trees that contain files as objects.
The displayed numeric values of the enums are for debugging support only and may change; use the symbolic names only.
Python Version¶
- V3K: Python3.5+, else Python2.7
- ISSTR: string and unicode for type comparison
- unicode: for Python3, remaps unicode to str
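The three names can be pictured with the following minimal sketch. It assumes that they are derived from sys.version_info; it is not copied from the package source, and is_string() is a hypothetical helper.

```python
import sys

# Assumption: the flag is derived from the interpreter version.
V3K = sys.version_info[:2] >= (3, 5)

if V3K:
    unicode = str           # Python 3 has no separate unicode type
    ISSTR = (str,)
else:
    ISSTR = (str, unicode)  # 'unicode' is a builtin on Python 2 only


def is_string(value):
    """Return True for any text type of the running interpreter."""
    return isinstance(value, ISSTR)
```

A type comparison then reads the same on both interpreter lines, e.g. is_string(path).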
Platform Definitions¶
The internal representation of the platform parameter is an int used as a bit array for binary logic operations. Most interfaces support the bit-array representation as well as the alternative string name macros as defined by the sys.platform interface. The bit-mask representation should be preferred, but it requires the import of filesysobjects, while the string macros only require the import of the interfaces. These may also be advantageous when used for custom types or resource paths, e.g. SNMP.
Structure of Bit-Masks¶
The predefined bit-masks are provided as labels of the form RTE_<name>, which cover the grouping of bit-mask blocks and the increments within these groups. The committed interface is the NAMED ENUMERATION only, NOT the numeric value. Therefore the values must never be accessed explicitly.
The bit-mask is calculated by combining category bit-masks for bit-blocks with added sub-blocks of context bit-masks for the families that group their members. This combines the performance of logical bit operations with a reduced number of required bits. The reduction is required because of the vast number of combinations, which would otherwise lead to bit arrays of astronomical dimensions.
The current implementation covers all major, practically almost all, filesystems, UNC, and URI resources with 15 bits; the 16th bit is reserved for future use. It may thus also fit 16-bit platforms such as embedded systems. This provides more than 30,000 platform variants for resource paths, which seems future-proof.
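The idea of flag bits for categories combined with numbered sub-blocks for families and members can be sketched as follows. The field layout, the CAT_* names, and the helper functions are assumptions made for illustration only; the package's actual constants may be laid out differently and must be used by name.

```python
# Hypothetical 15-bit layout, for illustration only:
#   bits 12-14: one flag bit per category
#   bits  7-11: family number inside a category
#   bits  0-6 : member (distribution, release) number inside a family
CATEGORY_SHIFT = 12
FAMILY_SHIFT = 7
FAMILY_MASK = 0b11111 << FAMILY_SHIFT
MEMBER_MASK = (1 << FAMILY_SHIFT) - 1

CAT_WIN32 = 1 << CATEGORY_SHIFT
CAT_POSIX = 1 << (CATEGORY_SHIFT + 1)
CAT_URI = 1 << (CATEGORY_SHIFT + 2)


def make_rte(category, family=0, member=0):
    """Combine a category flag with numbered family and member fields."""
    return category | (family << FAMILY_SHIFT) | member


def has_category(rte, category):
    """Category test: a single flag bit, so a plain AND is enough."""
    return bool(rte & category)


def family_of(rte):
    """Extract the family number from the family field."""
    return (rte & FAMILY_MASK) >> FAMILY_SHIFT


def member_of(rte):
    """Extract the member number from the member field."""
    return rte & MEMBER_MASK


LINUX = make_rte(CAT_POSIX, family=1)
CENTOS7 = make_rte(CAT_POSIX, family=1, member=5)
BSD = make_rte(CAT_POSIX, family=2)
```

Numbering families and members inside a flagged category needs far fewer bits than reserving one bit per platform, which is why such a scheme can stay within 15 bits.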
Bit-Mask Definitions¶
The following additional definitions are introduced.
The following aliases are defined in addition:
sys.platform  alias
win32         win
darwin        osx
The bit-mask provides the bit for the OS as well as the bits for the base category and the family. The masks contain the actual runtime environment of the execution platform as well as the virtual runtime environment of the resource path. The values representing the platform can be used as source and as target platform.
Syntax Domain  sys.platform   Category     Family
Windows        win            RTE_WIN32    RTE_WIN32
Cygwin         cygwin         RTE_POSIX    RTE_CYGWIN
Linux          linux, linux2  RTE_POSIX    RTE_LINUX
Solaris        sunos          RTE_POSIX    RTE_SOLARIS
BSD            bsd            RTE_POSIX    RTE_BSD
OS-X           darwin         RTE_POSIX    RTE_OSX
UNC            n.a.           RTE_UNC      RTE_OSX
URI            n.a.           RTE_URI      all URIs with schemes
GENERIC        n.a.           RTE_GENERIC  abstract representation, network, local, and drive
For example, RTE_FILEURI defines the URI in accordance with [RFC8089] for the data environment, while the actual runtime execution environment could be e.g. RTE_WIN32 or RTE_LINUX. When only the execution environment bit is provided, the major filesystem type is set for the resource path type.
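The relation between sys.platform strings and the family labels in the table above can be sketched as a simple lookup. The helper family_name() is hypothetical, the prefix matching is an assumption, and the fallback to RTE_GENERIC is a choice made for this example; the sketch returns the label names as strings instead of the numeric values.

```python
import sys

# Prefix of sys.platform -> family label, following the table above.
_PLATFORM_PREFIXES = (
    ("win", "RTE_WIN32"),
    ("cygwin", "RTE_CYGWIN"),
    ("linux", "RTE_LINUX"),
    ("sunos", "RTE_SOLARIS"),
    ("darwin", "RTE_OSX"),
)


def family_name(platform=None):
    """Return the family label for a sys.platform style string."""
    platform = sys.platform if platform is None else platform
    for prefix, name in _PLATFORM_PREFIXES:
        if platform.startswith(prefix):
            return name
    if "bsd" in platform:       # e.g. freebsd13, openbsd7, netbsd9
        return "RTE_BSD"
    return "RTE_GENERIC"        # assumption: fallback for unknown platforms
```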
Enum Values:
Base type blocks:
RTE_CYGWIN - cygwin: Cygwin [CYGWIN]
RTE_POSIX - posix: Posix systems using fcntl [POSIX].
RTE_WIN32 - win: All Windows systems [MS-DTYP]
With back-slash ‘\\’:
RTE_URI: URI - [RFC3986]
RTE_GENERIC: Undefined platform for special cases
POSIX base system platforms with implied filesystem resource path types:
- RTE_BSD - bsd: BSD, - OpenBSD, FreeBSD, NetBSD - as Posix system [POSIX].
- RTE_DARWIN - darwin: Darwin/OS-X, as Posix system [POSIX], no macpath-legacy.
- RTE_LINUX - linux: Linux with specific add-ons - OS, DIST, GNU - as Posix system [POSIX].
- RTE_OSX - osx: Darwin/OS-X, as Posix system [POSIX], no macpath-legacy.
- RTE_SOLARIS - solaris: UNIX/Solaris, as Posix system [POSIX].
Virtual runtime domains of specific resource path syntax:
RTE_FILEURI - fileuri: file-URI [RFC8089]
Traditional representation
RTE_FILEURI0 - fileuri0: file-URI [RFC8089] Appendix E
Minimal representation:
file:/path/to/file
file:c:/path/to/file
file://host.example.com/path/to/file
RTE_FILEURI4 - fileuri4: file-URI [RFC8089] Appendix E
Traditional representation:
file:///path/to/file
file:///c:/path/to/file
file:////host.example.com/path/to/file
RTE_FILEURI5 - fileuri5: file-URI [RFC8089] Appendix E
With extra slash ‘/’:
file:///path/to/file
file:///c:/path/to/file
file://///host.example.com/path/to/file
RTE_HTTP - http: virtual add-on bit for URI specific handling, see [RFC3986]
RTE_HTTPS - https: virtual add-on bit for URI specific handling, see [RFC3986]
RTE_GENERIC: Undefined platform for special cases.
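The three file-URI representations listed above differ only in the number of slashes in front of the path or the host. A minimal sketch that reproduces the listed examples follows; format_file_uri() is a hypothetical helper and not part of the package, the style names reuse the lowercase labels from the list, and percent-encoding is deliberately left out.

```python
def format_file_uri(path, host=None, style="fileuri4"):
    """Format a forward-slash path as a file URI in one of three styles.

    `path` is either absolute POSIX ("/path/to/file") or starts with a
    drive ("c:/path/to/file"). `host` selects the non-local form.
    """
    if style not in ("fileuri0", "fileuri4", "fileuri5"):
        raise ValueError("unknown style: %r" % (style,))

    if host:
        # Non-local file: the styles differ in the slashes before the host.
        prefix = {
            "fileuri0": "file://",
            "fileuri4": "file:////",
            "fileuri5": "file://///",
        }[style]
        return prefix + host + path

    if style == "fileuri0":
        return "file:" + path       # minimal representation
    if path.startswith("/"):
        return "file://" + path     # path supplies the third slash
    return "file:///" + path        # drive letter path
```

For local paths the styles fileuri4 and fileuri5 produce the same result; they differ for paths with a host only.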
Control Variables:
- RTE: Current runtime-environment variable.
Calculation of Bit-Masks¶
A typical example for the base of the mapping and algorithms is:
# category: posix
RTE_POSIX = 8192  #: Posix systems using fcntl [POSIX].

# family: OS-X
# bit-block: Apple - OS-X
RTE_DARWIN = RTE_POSIX + 1  #: Darwin/OS-X, as Posix system [POSIX], no macpath-legacy.
RTE_OSX = RTE_POSIX + 2  #: Darwin/OS-X, as Posix system [POSIX], no macpath-legacy.

# family: Sun - Solaris
RTE_SOLARIS = RTE_POSIX + 16  #: UNIX/Solaris, as Posix system [POSIX].

# family: BSD
RTE_BSD = RTE_POSIX + 32  #: BSD, - OpenBSD, FreeBSD, NetBSD - as Posix system [POSIX].

# family: Linux
RTE_LINUX = RTE_POSIX + 64  #: Linux with specific add-ons - OS, DIST, GNU - as Posix system [POSIX].

# members: Linux
RTE_CENTOS = RTE_LINUX + 1  #: CentOS
RTE_CENTOS4 = RTE_LINUX + 2  #: CentOS-4
RTE_CENTOS5 = RTE_LINUX + 3  #: CentOS-5
RTE_CENTOS6 = RTE_LINUX + 4  #: CentOS-6
RTE_CENTOS7 = RTE_LINUX + 5  #: CentOS-7

RTE_FEDORA = RTE_LINUX + 32  #: Fedora
RTE_FEDORA19 = RTE_LINUX + 33  #: Fedora-19
RTE_FEDORA27 = RTE_LINUX + 34  #: Fedora-27

RTE_DEBIAN = RTE_LINUX + 64  #: Debian
RTE_DEBIAN6 = RTE_LINUX + 65  #: Debian - squeeze
RTE_DEBIAN7 = RTE_LINUX + 66  #: Debian - wheezy
RTE_DEBIAN8 = RTE_LINUX + 67  #: Debian - jessie
RTE_DEBIAN9 = RTE_LINUX + 68  #: Debian - stretch
The checks for OS and distribution are:
# explicit
if RTE & RTE_POSIX:  # use category
    pass
if RTE & RTE_LINUX:  # use family
    pass
if RTE & RTE_CENTOS:  # use distro
    pass
if RTE & RTE_CENTOS7:  # use release
    pass

# hierarchical
if RTE & RTE_POSIX:  # use category
    if RTE & RTE_LINUX:  # use family
        pass  # do s.th. ...
        if RTE & RTE_CENTOS7:  # use release
            pass  # do s.th. ...
    elif RTE & RTE_BSD:  # use family
        pass  # do s.th. else...
        if RTE & RTE_OPENBSD:  # use release
            pass  # do s.th. else...
The checks for URI and scheme are:
if RTE & RTE_URI:  # use category
    if RTE & RTE_HTTP:  # use scheme
        pass
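When the sample values from the listing above are reused, the family constants include the category bit, so a plain truth test such as RTE & RTE_LINUX also succeeds for other POSIX families. The following sketch copies only those sample values and compares the masked result with the constant instead. has_all_bits() is a hypothetical helper, and the package's real constants and checks may differ.

```python
# Sample values copied from the listing above; real code should import the
# symbolic names from the package instead of redefining them.
RTE_POSIX = 8192
RTE_BSD = RTE_POSIX + 32
RTE_LINUX = RTE_POSIX + 64
RTE_CENTOS7 = RTE_LINUX + 5


def has_all_bits(rte, mask):
    """Return True only if every bit of `mask` is set in `rte`."""
    return (rte & mask) == mask


# Category test: a single bit, so the plain truth test is sufficient.
assert RTE_CENTOS7 & RTE_POSIX
assert RTE_BSD & RTE_POSIX

# Family test: the plain truth test also fires for BSD, because BSD and
# Linux share the RTE_POSIX bit.
assert RTE_BSD & RTE_LINUX

# Comparing the masked value with the mask tells the two families apart.
assert has_all_bits(RTE_CENTOS7, RTE_LINUX)
assert not has_all_bits(RTE_BSD, RTE_LINUX)
```

This comparison only holds while a member offset leaves the family bit untouched. With the sample value RTE_DEBIAN = RTE_LINUX + 64 the addition carries out of the Linux bit, so has_all_bits(RTE_DEBIAN, RTE_LINUX) would be false for these illustrative numbers.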
Node Types¶
File system node type enums as bit values [code].
- T_ALL(1): All files and empty directories.
- T_DEV(64): Device nodes.
- T_DIR(4): Directories.
- T_EXP(256): Exported filesystems.
- T_FILE(2): Files.
- T_HARDL(32): Hard links.
- T_LOCAL(512): Local file system entries only.
- T_MNT(128): Mount points.
- T_NODES(8): Files and empty directories. Is superposed by T_FILE and T_DIR.
- T_SYML(16): Symbolic links.
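Because the node types are bit values, several types can be combined into one mask and tested with a single AND. The sketch below classifies a path with os.path and tests it against such a mask. node_flags() and wanted() are hypothetical helpers, and the numeric values only mirror the list above to keep the example self-contained; real code should import the symbolic names.

```python
import os

# Values mirror the list above; use the package's symbolic names in real code.
T_FILE, T_DIR, T_SYML, T_MNT = 2, 4, 16, 128


def node_flags(path):
    """Return the OR-combination of the node types that apply to `path`."""
    flags = 0
    if os.path.islink(path):
        flags |= T_SYML
    if os.path.isfile(path):
        flags |= T_FILE
    if os.path.isdir(path):
        flags |= T_DIR
    if os.path.ismount(path):
        flags |= T_MNT
    return flags


def wanted(path, mask):
    """True if `path` has at least one of the node types in `mask`."""
    return bool(node_flags(path) & mask)
```

A call such as wanted(path, T_FILE | T_SYML) then accepts files and symbolic links in one test.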
Search Parameters¶
Search direction control parameters for findpattern(). [code]
- L_MEM_CACHE_ONE(4): Caches one node-level only.
- L_TDOWN_WALK(0): See os.walk(topdown=True)
- L_UP_EXT(3): Caches L_UP_WALK, and provides the same behavior as L_TDOWN_WALK.
- L_UP_WALK(1): See os.walk(topdown=False)
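The two walk directions refer to the topdown parameter of os.walk(). The following sketch shows the difference in visiting order on a small temporary tree; walk_order() and demo() are hypothetical helpers written for this example.

```python
import os
import tempfile


def walk_order(root, topdown):
    """Return directory paths relative to `root` in os.walk visiting order."""
    return [os.path.relpath(dirpath, root)
            for dirpath, _dirs, _files in os.walk(root, topdown=topdown)]


def demo():
    """Build the chain root/a/b and return both visiting orders."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "a", "b"))
        return walk_order(root, True), walk_order(root, False)
```

With topdown=True a directory is reported before its subdirectories, with topdown=False after them.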
Search Flow Control¶
Bit values that control how findpattern() proceeds with the search [code].
- M_ACCURATE(4): ffs.
- M_ALL(32): Returns all matched results.
- M_FILTERS(1): Applies to all filters.
- M_FILTPAR(3): Applies all filters and parameters.
- M_FIRST(8): Breaks after first match.
- M_IGNORE(0): Filters and parameters are ignored.
- M_LAST(16): Matches all, but returns the last only.
- M_NEST(64): Clears nested directory path matches.
- M_NOCANON(128): Do not convert input paths to canonical absolute.
- M_PARAMS(2): Applies to all parameters.
- M_REL(256): Keep relative names.
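The difference between M_FIRST, M_LAST, and M_ALL can be sketched on a plain iterable of matches. select_matches() is a hypothetical helper and not the implementation of findpattern(); the numeric values only mirror the list above to keep the example self-contained.

```python
# Values mirror the list above; use the package's symbolic names in real code.
M_FIRST, M_LAST, M_ALL = 8, 16, 32


def select_matches(matches, mode=M_ALL):
    """Reduce an iterable of matches according to the flow-control bits."""
    if mode & M_FIRST:
        for match in matches:
            return [match]      # break after the first match
        return []               # nothing matched at all
    found = list(matches)       # M_LAST and M_ALL must see every match
    if mode & M_LAST:
        return found[-1:]
    return found
```

Only M_FIRST can stop early; M_LAST still has to evaluate every candidate before it can return the final one.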
Control of Pattern Application¶
A main design target of filesysobjects is efficient path resolution by arbitrary match patterns, which is very close to searching resource trees with match filters of arbitrary patterns. The performance of both depends heavily on the parameter sets.
Another aspect is the partial ambiguity of globs and regular expressions, which is additionally superposed by the different valid character sets of the various OSs. This is even worse for mounted shares between different OSs. Thus the final resolution of globs and regular expressions requires actual file system access in order to resolve the ambiguity. For wildcard expansion this results in a final iteration on the filesystem, which can yield large sets of matched nodes and thus cost a significant amount of performance.
The provided bitmask constants control the applied algorithms.
- W_GLOB(1) - considers the provided expressions as globs or literals
- W_LITERAL(0) - considers the provided expressions as literal match-patterns
- W_RE(2) - considers the provided expressions as re, glob, or literal
- W_RE_FULL(16) - disables performance optimization, enables full scale re syntax including groups and ‘|‘ (or)
The following expression demonstrates the issue of name resolution in case of ambiguity.
filepathname = '/path/to/.*[d].*'
The expression is a valid glob and a valid re on all platforms, and in addition seemingly a literal on all platforms too.
static name resolution
For a generic path expansion function it is not possible to decide statically, offline, what actually matches.
dynamic name resolution
In case of dynamic name binding resolution it is still not possible to determine accurately which type(s) the caller intended.
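The ambiguity can be reproduced with the standard library alone. The sketch below applies the last path component of the expression above to a few made-up file names, once as a literal, once as a glob, and once as a regular expression; the candidate names are assumptions chosen for the example.

```python
import fnmatch
import re

pattern = '.*[d].*'                       # last component of the expression
names = ['.*[d].*', '.xd.y', 'abd']       # made-up candidate file names

# Literal: only the identical name matches.
as_literal = [n for n in names if n == pattern]

# Glob: '.' is literal, '*' is a wildcard, '[d]' is a character class.
as_glob = [n for n in names if fnmatch.fnmatchcase(n, pattern)]

# Regular expression: '.*' is "anything", so any name containing 'd' matches.
as_regex = [n for n in names if re.fullmatch(pattern, n)]
```

Each interpretation selects a different set of names, which is why the caller has to state the intended type with one of the W_* constants.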
(pf, **kw)[source]¶ Evaluates the resulting parameters for the source platform.
See also tpf and spf.
- Args:
- spf:
Source platform, defines the input syntax domain. For the syntax refer to API in the manual at spf.
For additional details refer to tpf and spf, paths.getspf(), normapppathx(), normpathx().
- Returns:
The path syntax parameters for the source platform:
return (sep, pathsep, spf, rte)
sep := [/\\]
pathsep := [:;]
spf := string-representation
string-representation := predefined | custom
predefined := ( <rte_map.keys()> )
rte := bitmask
bitmask := ( <RTE_*> )
- Raises:
- pass-through
(pf, **kw)[source]¶ Evaluates the resulting parameters for the target platform.
See also tpf and spf.
- Args:
- pf:
Target platform for the file pathname. Supports multiple entries, where the first has precedence.
For details refer to normpathx.
- kw:
- apppre:
Application prefix.
apppre = (True|False)
default := False
- pathsep:
Prepends to the path separator set by the spf, thus has precedence over the standard separator of the source platform. The resulting first character decides the type of the scanner APPPATHSCANNER.
pathsep := (';' | ':')
default := ‘’
- Returns:
The path syntax parameters for the target platform:
return (sep, pathsep, tpf, rte, apppre)
sep := [/\\]
pathsep := [:;]
tpf := string-representation
rte := bitmask
apppre := (True | False)
string-representation := predefined | custom
predefined := ( <rte_map.keys()> )
bitmask := ( <RTE_*> )
- Raises:
- pass-through
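How a platform name can map to a separator pair, and how a caller-supplied pathsep takes precedence by being prepended, can be sketched as follows. path_syntax() is a hypothetical helper and not one of the functions documented above; the two platform keys and the reduced three-element return value are assumptions for this example.

```python
# Conventional separators: (path component separator, search path separator).
_SYNTAX = {
    "posix": ("/", ":"),
    "win": ("\\", ";"),
}


def path_syntax(pf, pathsep=""):
    """Return (sep, pathsep, pf) for the given platform name.

    A caller-supplied `pathsep` is prepended to the platform default, so
    its first character has precedence.
    """
    sep, default_pathsep = _SYNTAX[pf]
    return sep, pathsep + default_pathsep, pf
```

For example, path_syntax("win", pathsep=":") yields a separator set whose first character is ':' although the platform default is ';'.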