Over time, the Windows API has moved closer to the old Unix paradigm "Everything is a file". Therefore, this whole section dedicated to file management will cover files first, but also some other objects like directories, and even devices, which are manipulated in Windows in a rather coherent way. We'll see later on some other objects fitting (more or less) in this picture (pipes or consoles to name a few).
First of all, Wine, while implementing the file interface from Windows, needs to map a file name (expressed in the Windows world) onto a file name in the Unix world. This encompasses several aspects: how to map the file names, how to map access rights (both on files and directories), how to map physical devices (hard disks, but also other devices - like serial or parallel interfaces - and even VxDs).
Let's first review a bit the various forms Windows uses when it comes to file names.
In the beginning was DOS, where each file has to sit on a drive, designated by a single letter. To separate device names from directory or file names, a ':' was appended to this single letter, hence giving the (in)famous C: drive designations. Another great invention was to use some fixed names for accessing devices: not only were these names fixed, in that you couldn't change them even if you wished to, but they were also insensitive to the location where you were using them. For example, it's well known that COM1 designates the first serial port, but it's also true that c:\foo\bar\com1 designates the first serial port. It's still true today: on XP, you still cannot name a file COM1, whatever the directory!
Much later on (with Windows 95), Microsoft decided to overcome some little details in file names: this included being able to get out of the 8+3 format (8 letters for the name, 3 letters for the extension), and so being able to use "long names" (that's the "official" naming; as you can guess, the 8+3 format is a short name), and also to use very strange characters in a file name (like a space, or even a '.'). You could then name a file My File V0.1.txt, instead of myfile01.txt. Just to keep on the fun side of things, for many years the format used on the disk itself stored the short name as the real one and used some tricky aliasing techniques to store the long name. When newer disk file systems were introduced (NTFS with NT), replacing the old FAT system (which had evolved little since the first days of DOS), the long name became the real name while the short name took the alias role.
Windows also started to support mounting network shares, and seeing them as if they were a local disk (through a specific drive letter). The way this is done has changed over the years, so we won't go into all the details (especially on the DOS and Win9x side).
The introduction of NT allowed a deep change in the ways DOS had been handling devices:
There's no longer a forest of DOS drive letters (even if the assign was a way to create symbolic links in the forest), but a single hierarchical space.
This hierarchy includes several distinct elements. For example, \Device\Harddisk0\Partition1 refers to the first partition on the first physical hard disk of the system.
This hierarchy covers far more than just the file and drive related objects: it includes most of the objects in the system. We'll only cover here the file related part.
This hierarchy is not directly accessible from the Win32 API, but only from the NTDLL API. The Win32 API only allows manipulating part of this hierarchy (the rest being hidden from the Win32 API). Of course, the part you see from the Win32 API looks very similar to the one that DOS provided.
Mounting a disk is performed by creating a symbolic link in this hierarchy from \Global??\C: (the name seen from the Win32 API) to \Device\Harddiskvolume1 which determines the partition on a physical disk where C: is going to be seen.
Network shares are also accessible through a symbolic link. However in this case, a symbolic link is created from \Global??\UNC\host\share\ (for the share share on the machine host) to what's called a network redirector, which will take care of 1/ the connection to the remote share, 2/ handling with that remote share the rest of the path (after the name of the server, and the name of the share on that server).
Note: In NT naming convention, \Global?? can also be called \?? to shorten the access.
All of these things make the NT system much more flexible (you can add new types of filesystems if you want), provide a unique name space for all objects, and let most operations boil down to creating relationships between different objects.
Let's end this chapter about files in Windows with a review of the different formats used for file names:
c:\foo\bar is a full path.
\foo\bar is an absolute path; the full path is created by prepending the default drive (ie. the drive of the current directory).
bar is a relative path; the full path is created by prepending the current directory.
c:bar is a drive relative path. Note that the case where c: is the drive of the current directory is rather easy; it's implemented the same way as the case just above (relative path). In the rest of this chapter, drive relative path will only cover the case where the drive in the path isn't the drive of the default directory. The resolution of this to a full pathname differs according to the version of Windows, and some parameters. Let's take some time browsing through these issues.
On Windows 9x (as well as on DOS), the system maintains a process wide set of default directories per drive. Hence, in this case, it will resolve c:bar to the default directory on drive c: plus file bar. Of course, the default per drive directory is updated each time a new current directory is set (only the current directory of the drive specified is modified).
On Windows NT, things differ a bit. Since NT implements a namespace for files closer to a single tree (instead of 26 drives), having a current directory per drive is a bit awkward. Hence, Windows NT's default behavior is to have only one current directory across all drives (in fact, a current directory expressed in the global tree); this directory is of course related to a given process. c:bar is resolved this way:
If c: is the drive of the default directory, the final path is the current directory plus bar.
Otherwise it's resolved into c:\bar.
In order to bridge the gap between the two implementations (Windows 9x and NT), NT adds a bit of complexity on the second case. If the =C: environment variable is defined, then its value is used as a default directory for drive C:. This is handy, for example, when writing a DOS shell, where having a current directory per drive is still implemented, even on NT. This mechanism (through environment variables) is implemented in CMD.EXE, where those variables are set when you change directories with the cd command. Since environment variables are inherited at process creation, the current directory settings are inherited by child processes, hence mimicking the behavior of the old DOS shell. There's no mechanism (in NTDLL or KERNEL32) to set up the relevant environment variables when the current directory changes. This behavior is clearly a band-aid, not a full featured extension of current directory behavior.
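The NT rules above can be sketched in a few lines. This is only an illustration of the described behavior, not Wine's or Windows' code: the function name resolve_drive_relative and the plain dictionary standing in for the process environment are assumptions made for the example.

```python
def resolve_drive_relative(path, current_dir, env):
    """Resolve a drive relative path such as 'c:bar' following the NT rules.

    current_dir is the single per-process current directory (e.g. 'D:\\work'),
    env is a dict standing in for the process environment (hypothetical).
    """
    drive, rest = path[:2].upper(), path[2:]
    if drive == current_dir[:2].upper():
        # Same drive as the current directory: plain relative path.
        base = current_dir
    else:
        # Other drive: use the '=C:' style variable if set, else the root.
        base = env.get("=" + drive, drive + "\\")
    return base.rstrip("\\") + "\\" + rest
```

With a current directory of D:\work, c:bar resolves to C:\bar, unless the environment holds =C: set to, say, C:\tmp, in which case the result is C:\tmp\bar.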
\\host\share is a UNC (Universal Naming Convention) path, ie. it represents a file on a remote share.
\\.\device denotes a physical device installed in the system (as seen from the Win32 subsystem). A standard NT system will map it to the \??\device NT path. Then, in a standard configuration, \??\device is likely to be a link to a physical device described and hooked into the \Device\ tree. For example, COM1 is a link to \Device\Serial0.
On some versions of Windows, paths were limited to MAX_PATH characters. To circumvent this, Microsoft allowed paths to be 32,767 characters long, under the conditions that the path is expressed in Unicode (no ANSI version), and that the path is prefixed with \\?\. This convention is applicable to any of the cases described above.
To summarize what we've discussed so far, let's put everything into a single table...
Table 8-1. DOS, Win32 and NT paths equivalences
Type of path | Win32 example | NT equivalent | Rule to construct |
---|---|---|---|
Full path | c:\foo\bar.txt | \Global??\C:\foo\bar.txt | Simple concatenation |
Absolute path | \foo\bar.txt | \Global??\J:\foo\bar.txt | Simple concatenation using the drive of the default directory (here J:) |
Relative path | gee\bar.txt | \Global??\J:\mydir\mysubdir\gee\bar.txt | Simple concatenation using the default directory (here J:\mydir\mysubdir) |
Drive relative path | j:gee\bar.txt | Depends on the Windows version and on the =J: environment variable | See the drive relative path rules described above |
UNC (Universal Naming Convention) path | \\host\share\foo\bar.txt | \Global??\UNC\host\share\foo\bar.txt | Simple concatenation. |
Device path | \\.\device | \Global??\device | Simple concatenation |
Long paths | \\?\... | | With this prefix, paths can take up to 32,767 characters (instead of MAX_PATH for all the others). Once the prefix is stripped, the path is handled like one of the previous ones, just providing internal buffers large enough. |
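As an illustration of the table, here is a minimal sketch that turns a Win32 path into its NT equivalent. The function name win32_to_nt and the default directory used as a parameter default are assumptions taken from the table's examples; drive relative paths are deliberately rejected, since they need the extra rules described earlier, and \\?\UNC\ forms are not handled.

```python
def win32_to_nt(path, cwd="J:\\mydir\\mysubdir"):
    """Build the NT name for a Win32 path, following the table's rules."""
    prefix = "\\Global??\\"
    if path.startswith("\\\\?\\"):
        path = path[4:]                      # long path prefix: strip it first
    if path.startswith("\\\\.\\"):
        return prefix + path[4:]             # device path
    if path.startswith("\\\\"):
        return prefix + "UNC\\" + path[2:]   # UNC path
    if len(path) >= 3 and path[1] == ":" and path[2] == "\\":
        return prefix + path[0].upper() + path[1:]   # full path
    if path.startswith("\\"):
        return prefix + cwd[:2] + path       # absolute path: add default drive
    if len(path) >= 2 and path[1] == ":":
        raise ValueError("drive relative path: resolve it to a full path first")
    return prefix + cwd + "\\" + path        # relative path: add default dir
```

For instance, win32_to_nt("gee\\bar.txt") yields \Global??\J:\mydir\mysubdir\gee\bar.txt with the default directory assumed above.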
We'll mainly cover in this section the way Wine opens a file (in the Unix sense) when given a Windows file name. This will include mapping the Windows path onto a Unix path (including the devices case), handling the access rights, the sharing attribute if any...
First of all, we described in the previous section how to convert any path into an absolute path. Wine implements all the previous algorithms in order to achieve this. Note also that this transformation is done with information local to the process (default directory, environment variables...). We'll assume in the rest of this section that all paths have now been transformed into absolute form.
When Wine is requested to map a path name (in DOS form, with a drive letter, e.g. c:\foo\bar\myfile.txt), Wine converts this into the following Unix path $(WINEPREFIX)/dosdevices/c:/foo/bar/myfile.txt. The Wine configuration process is responsible for setting $(WINEPREFIX)/dosdevices/c: to be a symbolic link pointing to the directory in Unix hierarchy the user wants to expose as the C: drive in the DOS forest of drives.
This scheme allows:
a very simple algorithm to map a DOS path name into a Unix one (no need of Wine server calls)
a very configurable implementation: it's very easy to change a drive mapping
a rather readable configuration: no need of sophisticated tools to read a drive mapping, a ls -l $(WINEPREFIX)/dosdevices says it all.
This scheme is also used to implement UNC path names. For example, Wine maps \\host\share\foo\bar\MyRemoteFile.txt into $(WINEPREFIX)/dosdevices/unc/host/share/foo/bar/MyRemoteFile.txt. It's then up to the user to decide where $(WINEPREFIX)/dosdevices/unc/host/share shall point to (or be). For example, it can either be a symbolic link to a directory inside the local machine (just for emulation purpose), or a symbolic link to the mount point of a remote disk (done through Samba or NFS), or even the real mount point. Wine will not do any checking here, nor will help in actually mounting the remote drive.
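The mapping for drive and UNC paths is simple enough to sketch. The helper below is a hypothetical illustration (the name dos_to_unix is not a Wine function); it only rewrites the string and leaves symbolic link resolution to the Unix kernel, as described above.

```python
import posixpath

def dos_to_unix(path, wineprefix):
    """Map an absolute DOS or UNC path under <wineprefix>/dosdevices."""
    dosdevices = posixpath.join(wineprefix, "dosdevices")
    if path.startswith("\\\\"):
        # \\host\share\... goes under dosdevices/unc/host/share/...
        parts = ["unc"] + path[2:].split("\\")
    else:
        # c:\foo\... goes under dosdevices/c:/foo/...
        parts = [path[:2].lower()] + path[3:].split("\\")
    return posixpath.join(dosdevices, *[p for p in parts if p])
```

Called with c:\foo\bar\myfile.txt and a prefix of /home/user/.wine (an example value), it returns /home/user/.wine/dosdevices/c:/foo/bar/myfile.txt.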
We've seen how Wine maps a drive letter or a UNC path onto the Unix hierarchy; we now have to look at how the filename is searched within this hierarchy. The main issue is about case sensitivity. Here's a reminder of the various properties of the file systems in the field.
Table 8-2. File systems' properties
FS Name | Length of elements | Case sensitivity (on disk) | Case sensitivity for lookup |
---|---|---|---|
FAT, FAT16 or FAT32 | Short name (8+3) | Names are always stored in upper-case | Case insensitive |
VFAT | Short name (8+3) + alias on long name | Short names are always stored in upper-case. Long names are stored with case preservation. | Case insensitive |
NTFS | Long name + alias on short name (8+3). | Long names are stored with case preservation. Short names are always stored in upper-case. | Case insensitive |
Linux FS (ext2fs, ext3fs, reiserfs...) | Long name | Case preserving | Case sensitive |
Case sensitivity vs. preservation: When we say that most file systems in NT are case insensitive, this has to be understood as applying to file lookup, where matches are made in a case insensitive mode. This is different from the VFAT or NTFS "case preservation" mechanism, which stores file names as they are given when creating the file, while still doing case insensitive matches.
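To make the lookup issue concrete, here is a minimal sketch of a case insensitive search in a directory that lives on a case sensitive Unix file system. It is an assumption about how such a lookup can be done, not Wine's actual code, and it uses Python's lower() where Windows would use its own case mapping tables.

```python
import os
import tempfile

def find_case_insensitive(directory, name):
    """Return the entry of `directory` matching `name` ignoring case, or None."""
    entries = os.listdir(directory)
    if name in entries:
        return name                      # an exact match always wins
    wanted = name.lower()
    for entry in sorted(entries):
        if entry.lower() == wanted:
            return entry                 # name as stored on disk (case preserved)
    return None

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "MyFile.TXT"), "w").close()
    found = find_case_insensitive(d, "myfile.txt")
    missing = find_case_insensitive(d, "other.txt")
```

The price of this approach is a directory scan whenever the exact name is not present, which is why the exact match is tried first.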
As Windows, in the early days, didn't support the notion of symbolic links on directories, lots of applications (and some old native DLLs) are not ready for this feature. Mainly, they assume that the directory structure is a tree, which has lots of consequences on navigating in the forest of directories (ie: there cannot be two ways of going from one directory to another, there cannot be cycles...). In order to prevent bad behavior in such applications, Wine provides an option. By default, symbolic links on directories are not followed by Wine. There's an option to follow them (see the Wine User Guide), but this could be harmful.
Wine considers that Unix file names are long filenames. This seems a reasonable approach; this is also the approach followed by most of the Unix OSes while mounting Windows partitions (with filesystems like FAT, FAT32 or NTFS). Therefore, Wine tries to support short names the best it can. Basically, there are two options:
The filesystem on which the inspected directory lies is a real Windows FS (like FAT, or FAT32, or NTFS) and the OS has support to access the short filename (for example, Linux does this on FAT, FAT32 or VFAT). In this case, Wine makes full use of this information and really mimics the Windows behavior: the short filename used for any file is the same as on Windows.
If the conditions listed above are not met (either the FS has no physical short name support, or the OS doesn't provide access to the short name), Wine computes on its own the short filename for a given long filename. We cannot ensure that the generated short name is the same as on Windows (because the algorithm on Windows takes into account the order of creation of files, which cannot be implemented in Wine: Wine would have to cache the short names of every directory it uses!). The short name is made up of part of the long name (its first characters), with the rest being a hashed value. This has several advantages and drawbacks:
The algorithm is rather simple and low cost.
The algorithm is stateless (doesn't depend on the other files in the directory).
The algorithm isn't the same as on Windows, which means a program cannot use short names generated on Windows. This could happen when copying an existing installed program from Windows (for example, on a dual boot machine).
Two long file names can end up with the same short name (Windows handles the collision in this case, while Wine doesn't). We rely on our hash algorithm to make this as unlikely as possible (even if the possibility remains).
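The idea of a stateless, hash based short name can be illustrated as follows. This is a hypothetical scheme written for this example (four leading characters, a '~', then three hexadecimal digits taken from a CRC32); Wine's real hash function and character choices are different and are not reproduced here.

```python
import zlib

def short_name(long_name):
    """Derive an 8+3 name from the long name alone (no directory state)."""
    base, dot, ext = long_name.rpartition(".")
    if not dot or not base:
        base, ext = long_name, ""        # no extension, or only a leading dot
    # First characters of the long name, restricted to letters and digits.
    head = "".join(c for c in base.upper() if c.isalnum())[:4] or "_"
    # Hash of the lower-cased name, so lookups stay case insensitive.
    digest = zlib.crc32(long_name.lower().encode("utf-8")) & 0xFFF
    ext = "".join(c for c in ext.upper() if c.isalnum())[:3]
    return "%s~%03X" % (head, digest) + ("." + ext if ext else "")
```

Because the result depends only on the long name, the same file always gets the same short name, but two different long names can still collide, as noted above.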
Wine also allows, in most file APIs, giving a full Unix path name as a parameter. This is handy when running a Wine (or Winelib) program from the command line, as one doesn't need to convert the path into the Windows form. However, Wine checks that the Unix path given can be accessed from one of the defined drives, ensuring that only part of the Unix / hierarchy can be accessed.
As a side note, as Unix doesn't widely provide a Unicode interface to filenames, and as Windows implements filenames as Unicode strings (even on the physical layer with NTFS; the FAT variants are ANSI), we need to properly map between the two. At startup, Wine defines what's called the Unix Code Page, that is the code page the Unix kernel uses as a reference for the strings. Then Wine uses this code page for all the mappings it has to do between a Unicode path (on the Windows side) and an ANSI path to be used in a Unix path API. Note that this will work as long as a disk isn't mounted with a different code page than the one the kernel uses as a default.
We describe below how Windows devices are mapped to Unix devices. Before that, let's finish the pure file round-up with some basic operations.
Now that we have looked at how Wine converts a Windows pathname into a Unix one, we need to cover the various meta-data attached to a file or a directory.
In Windows, access rights are simplistic: a file can be read-only or read-write. Wine sets the read-only flag if the file doesn't have the Unix user-write flag set. As a matter of fact, there's no way Wine can report that a file cannot be read (that notion doesn't exist under Windows). The file will be seen, but trying to open it will return an error. The Unix exec-flag is never reported. Wine doesn't use this information to allow/forbid running a new process (as Unix does with the exec-flag).
Then come hidden files. This exists on Windows but not really on Unix! To be exact, in Windows, the hidden flag is a piece of metadata associated with any file or directory; in Unix, it's a convention based on the syntax of the file name (whether it starts with a '.' or not). Wine implements two behaviors (chosen by configuration). This impacts file names and directory names starting with a '.'. In the first mode (ShowDotFile is FALSE), every file or directory starting with '.' is returned with the hidden flag turned on. This is the natural behavior on Unix (for ls or even a file explorer). In the second mode (ShowDotFile is TRUE), Wine never sets the hidden flag, hence every file will be seen.
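The attribute mapping just described fits in a few lines. In this sketch the two constants carry the real Win32 values, while the function name and the show_dot_files parameter (standing in for the ShowDotFile setting) are assumptions made for the example.

```python
import stat

FILE_ATTRIBUTE_READONLY = 0x01   # Win32 value
FILE_ATTRIBUTE_HIDDEN = 0x02     # Win32 value

def windows_attributes(name, mode, show_dot_files=False):
    """Compute the Windows attribute bits for a Unix name and st_mode."""
    attrs = 0
    if not mode & stat.S_IWUSR:
        attrs |= FILE_ATTRIBUTE_READONLY     # no user-write bit: read-only
    if name.startswith(".") and not show_dot_files:
        attrs |= FILE_ATTRIBUTE_HIDDEN       # Unix dot-file convention
    return attrs                             # the exec bit is never reported
```

In real use, mode would come from os.stat(path).st_mode.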
Last but not least, before opening a file, Windows makes use of sharing attributes in order to check whether the file can be opened; for example, a process, being the first in the system to open a given file, could forbid, while it maintains the file opened, that another process opens it for write access, whereas open for read access would be granted. This is fully supported in Wine by moving all those checks in the Wine server for a global view on the system. Note also that what's moved in the Wine server is the check, when the file is opened, to implement the Windows sharing semantics. Further operation on the file (like reading and writing) will not require heavy support from the server.
The other good reason for putting the code for actually opening a file in the server is that an opened file in Windows is managed through a handle, and handles can only be created in the Wine server!
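The sharing check performed at open time can be modeled with a small table kept in one place, which is what the server provides. The class below is a simplified sketch: the FILE_SHARE_* values are the Win32 ones, but the READ and WRITE access bits and the SharingTable class are hypothetical simplifications, not the server's data structures.

```python
FILE_SHARE_READ, FILE_SHARE_WRITE = 0x1, 0x2   # Win32 values
READ, WRITE = 0x1, 0x2                         # simplified access bits (assumed)

class SharingTable:
    """Global view of every open, as the server would keep it."""

    def __init__(self):
        self.opens = {}                        # path -> list of (access, sharing)

    def try_open(self, path, access, sharing):
        for other_access, other_sharing in self.opens.get(path, []):
            # The new access must be allowed by every earlier opener, and
            # every earlier access must be allowed by the new opener.
            if access & ~other_sharing or other_access & ~sharing:
                return False                   # sharing violation
        self.opens.setdefault(path, []).append((access, sharing))
        return True
```

With a first open for read and write that only grants FILE_SHARE_READ, a later open for write is refused while a later open for read succeeds, provided the second opener itself shares both read and write.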
Just a note about attributes on directories: while we can easily map the meaning of Windows' FILE_ATTRIBUTE_READONLY on a file, we cannot do it for a directory. Windows' semantics (when this flag is set) mean do not delete the directory, while the w attribute in Unix means don't write nor delete it. Therefore, Wine uses an asymmetric mapping here: if the directory (in Unix) isn't writable, then Wine reports the FILE_ATTRIBUTE_READONLY attribute; conversely, when asked to set the FILE_ATTRIBUTE_READONLY attribute on a directory, Wine simply does nothing.
Reading and writing are the basic operations on files. Wine of course implements this, and bases the implementation on client side calls to Unix equivalents (like read() or write()). Note that the Wine server is involved in any read or write operation, as Wine needs to transform the Windows handle to the file into a Unix file descriptor it can pass to any Unix file function.
This is a major step in any file related operation. Basically, each file opened (at the Windows level) is first opened in the Wine server, where the fd is stored. Then, Wine (on the client side) uses recvmsg() to pass the fd from the Wine server process to the client process. Since this operation could be lengthy, Wine implements some kind of cache mechanism to send it only once, but getting an fd from a handle on a file (or any other Unix object which can be manipulated through a file descriptor) still requires a round trip to the Wine server.
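Passing a file descriptor over a Unix socket relies on SCM_RIGHTS ancillary data. The sketch below shows the mechanism in isolation, with both ends of a socket pair living in one process; the function names are assumptions, and Wine's actual request protocol and fd cache are not modeled.

```python
import array
import os
import socket

def send_fd(sock, fd):
    """Server side: attach fd to a one byte message as SCM_RIGHTS data."""
    rights = array.array("i", [fd])
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])

def recv_fd(sock):
    """Client side: recvmsg() returns a new descriptor for the same file."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return fds[0]

def demo():
    server, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    os.write(w, b"hello")
    send_fd(server, r)              # the "server" owns the descriptor
    fd = recv_fd(client)            # the "client" gets its own copy
    data = os.read(fd, 5)           # and can now read() directly
    for f in (r, w, fd):
        os.close(f)
    server.close()
    client.close()
    return data
```

Once the client holds its own descriptor, the read() itself needs no help from the server; only the handle to fd translation does.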
Windows provides file locking capabilities. When a lock is set (and a lock can be set on any contiguous range in a file), it controls how other processes in the system will have access to that range in the file. Since locked ranges on a file are defined in a system wide manner, the implementation resides in wineserver. It tries to make use of Unix file locking (if the underlying OS and the mounted disk where the file sits support this feature) with fcntl() and the F_SETLK command. If this isn't supported, then wineserver just pretends it works.
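The "try the Unix lock, otherwise pretend" policy can be sketched with Python's fcntl.lockf(), which is a wrapper around fcntl() byte range locks. The function name lock_range and the choice of which errors count as "unsupported" are assumptions made for this example, not wineserver's logic.

```python
import errno
import fcntl
import os
import tempfile

def lock_range(fd, offset, length, exclusive=True):
    """Try a non-blocking lock on [offset, offset + length) of fd."""
    kind = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.lockf(fd, kind | fcntl.LOCK_NB, length, offset, os.SEEK_SET)
        return True
    except OSError as err:
        if err.errno in (errno.EACCES, errno.EAGAIN):
            return False        # another process already holds the range
        return True             # locking unsupported here: pretend it worked

# Small demonstration on a throwaway file.
with tempfile.TemporaryFile() as f:
    f.write(b"x" * 100)
    f.flush()
    locked = lock_range(f.fileno(), 10, 20)
```

Note that such Unix locks only arbitrate between processes, which is one more reason to keep the bookkeeping in a single server process.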
There's no need (so far) to implement support (for files and directories) for DeviceIoControl(), even if this is supported by Windows, but only for very specific needs (like compression management, or file system related information). This isn't the case for devices (including disks), but we'll cover this in the section below related to devices.
Wine doesn't do any buffering on file accesses but relies on the underlying Unix kernel for that (when possible). This scheme is needed because it's easier to implement multiple accesses on the same file at the kernel level, rather than at the Wine level. Doing lots of small reads on the same file can turn into a performance hog, because each read operation needs a round trip to the server in order to get a file descriptor (see above).
Windows introduced the notion of overlapped I/O. Basically, it just means that an I/O operation (think read / write to start with) will not wait until it's completed, but rather return to the caller as soon as possible, and let the caller handle the wait operation and determine when the data is ready (for a read operation) or has been sent (for a write operation). Note that the overlapped operation is linked to a specific thread.
There are several benefits to this: a server can handle several clients without requiring multi-threading techniques; you can handle an event driven model more easily (ie how to properly kill a server while it is waiting in a lengthy read() operation).
Note that Microsoft's support for this feature evolved along the various versions of Windows. For example, Windows 95 or 98 only supports overlapped I/O for serial and parallel ports, while NT supports also files, disks, sockets, pipes, or mailslots.
Wine implements overlapped I/O operations. This is mainly done by queueing in the server a request that will be triggered when the current state changes (like data becoming available for a read operation). This readiness is signaled to the calling process by queueing a specific APC, which will be called within the next wait operation the thread performs. This specific APC will then do the hard work of the I/O operation. This scheme makes it possible to put in place a wait mechanism, to attach a routine to be called (in the thread context) when the state changes, and to do so in a rather transparent manner (embedded in any generic wait operation). However, it isn't 100% perfect. As the heavy operations are done in the context of the calling thread, if those operations are lengthy, there will be an impact on the calling thread, especially its latency. In order to provide effective support for overlapped I/O operations, we would need to rely on Unix kernel features (AIO is a good example).
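The APC based scheme can be mimicked in a single process to show the flow of control. Everything here is a toy model: the Server and ClientThread classes, their methods and the use of select() are assumptions for illustration, not Wine's implementation.

```python
import os
import select
from collections import deque

class ClientThread:
    def __init__(self):
        self.apcs = deque()

    def wait(self):
        """Generic wait: run every APC queued since the previous wait."""
        while self.apcs:
            self.apcs.popleft()()

class Server:
    def __init__(self):
        self.pending = []                    # queued overlapped read requests

    def queue_read(self, thread, fd, size, on_done):
        self.pending.append((thread, fd, size, on_done))

    def poll(self):
        """Queue an APC for each request whose fd became readable."""
        if not self.pending:
            return
        readable, _, _ = select.select([p[1] for p in self.pending], [], [], 0)
        remaining = []
        for thread, fd, size, on_done in self.pending:
            if fd in readable:
                # The APC does the real read(), in the client thread's context.
                thread.apcs.append(
                    lambda fd=fd, size=size, cb=on_done: cb(os.read(fd, size)))
            else:
                remaining.append((thread, fd, size, on_done))
        self.pending = remaining

server, thread, results = Server(), ClientThread(), []
r, w = os.pipe()
server.queue_read(thread, r, 5, results.append)   # returns immediately
server.poll(); thread.wait()
before = list(results)                            # nothing to read yet
os.write(w, b"hello")
server.poll(); thread.wait()                      # APC runs during the wait
os.close(r); os.close(w)
```

The model also shows the latency issue mentioned above: the read() runs inside wait(), so a slow read delays the thread that is waiting.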
We've covered so far the ways file names are mapped into Unix paths. We still need to cover this for devices. Like regular files, devices are manipulated in Windows with read / write operations, but also with control mechanisms (speed or parity of a serial line; volume name of a hard disk...). Since this is also supported in Linux, there's also a need to open (in the Unix sense) a device when given a Windows device name. This section applies to DOS device names, which are seen in NT as nicknames for other devices.
Firstly, Wine implements the Win32 to NT mapping as described above, hence every device path (in the NT sense) is of the following form: \??\devicename (or \DosDevices\devicename). As Windows device names are case insensitive, Wine also converts them to lower case before any operation. Then, the first operation Wine tries is to check whether $(WINEPREFIX)/dosdevices/devicename exists. If so, it's used as the final Unix path for the device. The configuration process is in charge of creating, for example, a symbolic link between $(WINEPREFIX)/dosdevices/PhysicalDrive0 and /dev/hda0.
If such a link cannot be found, and the device name looks like a DOS disk name (like C:), Wine first tries to get the Unix device from the path $(WINEPREFIX)/dosdevices/c: (i.e. the device which is mounted on the target of the symbolic link); if this doesn't give a Unix device, Wine checks whether $(WINEPREFIX)/dosdevices/c:: exists. If so, it's assumed to be a link to the actual Unix device. For example, for a CD Rom, $(WINEPREFIX)/dosdevices/e:: would be a symbolic link to /dev/cdrom. If this doesn't exist (we're still handling a device name of the C: form), Wine tries to get the Unix device from the system information (/etc/mtab and /etc/fstab on Linux). We cannot apply this method in all cases, because we have no assurance that the directory can actually be found. One could have, for example, a CD Rom used only as an audio CD player (ie never mounted), thus not having any information about the device itself.
If all of this doesn't work either, some basic operations are checked: if the device name is NUL, then /dev/null is returned. If the device name is a default serial name (COM1 up to COM9) (resp. printer name LPT1 up to LPT9), then Wine tries to open the Nth serial line (resp. printer) in the system. Otherwise, some basic old DOS name support is provided (AUX is transformed into COM1 and PRN into LPT1), and the whole process is retried with those new names.
To sum up:
Table 8-3. Mapping of Windows device names into Unix device names
Windows device name | NT device name | Mapping to Unix device name |
---|---|---|
<any_path>AUX | \Global??\AUX | Treated as an alias to COM1 |
<any_path>PRN | \Global??\PRN | Treated as an alias to LPT1 |
<any_path>COM1 | \Global??\COM1 | $(WINEPREFIX)/dosdevices/com1 (if the symbolic link exists) or the Nth serial line in the system (on Linux, /dev/ttyS0). |
<any_path>LPT1 | \Global??\LPT1 | $(WINEPREFIX)/dosdevices/lpt1 (if the symbolic link exists) or the Nth printer in the system (on Linux, /dev/lp0). |
<any_path>NUL | \Global??\NUL | /dev/null |
\\.\E: | \Global??\E: | $(WINEPREFIX)/dosdevices/e:: (if the symbolic link exists) or guessing the device from /etc/mtab or /etc/fstab. |
\\.\<device_name> | \Global??\<device_name> | $(WINEPREFIX)/dosdevices/<device_name> (if the symbolic link exists). |
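The lookup order summarized in the table can be sketched as a small function. This is a simplified, hypothetical model: link_exists is a caller supplied predicate standing in for a file system check, the Linux device names are the ones quoted in the table, and the steps that inspect the mounted device or /etc/mtab and /etc/fstab are left out.

```python
import re

def map_device(name, link_exists, prefix="$(WINEPREFIX)/dosdevices"):
    """Return the Unix path for a DOS device name, or None if unknown."""
    name = name.lower()                       # device names are case insensitive
    for _ in range(2):                        # second pass handles AUX and PRN
        if re.fullmatch(r"[a-z]:", name):
            raw = "%s/%s:" % (prefix, name)   # e.g. dosdevices/e::
            return raw if link_exists(raw) else None   # mtab/fstab not modeled
        candidate = "%s/%s" % (prefix, name)
        if link_exists(candidate):
            return candidate                  # user configured link wins
        if name == "nul":
            return "/dev/null"
        m = re.fullmatch(r"(com|lpt)([1-9])", name)
        if m:
            base = "/dev/ttyS" if m.group(1) == "com" else "/dev/lp"
            return base + str(int(m.group(2)) - 1)     # Nth device, Linux naming
        alias = {"aux": "com1", "prn": "lpt1"}.get(name)
        if alias is None:
            return None
        name = alias                          # retry with the aliased name
    return None
```

For example, with no links configured, AUX ends up as /dev/ttyS0, while a com1 link in the dosdevices directory would take precedence.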
Now that we know which Unix device to open for a given Windows device, let's cover the operations on it. Those operations can be read / write, I/O control (and even others).
Read and write operations are supported on real disks & CDROM devices, under several conditions:
Foremost, as the ReadFile() and WriteFile() calls are mapped onto the Unix read() and write() calls, the user (from the Unix perspective, the one running the Wine executable) must have read (resp. write) access to the device. It wouldn't be wise to let a user write directly to a hard disk!
The block size for reads and writes must be the size of a physical block (generally 512 bytes for a hard disk; it depends on the type of CD used), and offsets must also be a multiple of the block size.
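The alignment condition is easy to check before issuing a raw read or write. The helpers below are a sketch with hypothetical names; they assume, as is usual for raw device access, that a length which is a whole number of blocks is acceptable, and aligned_span shows how a caller could widen an arbitrary request to block boundaries.

```python
def is_block_aligned(offset, length, block_size=512):
    """True if a raw device access starts and ends on block boundaries."""
    return (offset % block_size == 0
            and length > 0
            and length % block_size == 0)

def aligned_span(offset, length, block_size=512):
    """Smallest block aligned (start, size) covering [offset, offset + length)."""
    start = offset - offset % block_size
    end = -(-(offset + length) // block_size) * block_size   # round up
    return start, end - start
```

A caller wanting 50 bytes at offset 100 would thus read the whole first block and copy the wanted bytes out of it.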
Wine also reads (if the first condition above about access rights is met) the volume information from a hard disk or a CD ROM to be displayed to a user.
Wine also recognizes VxDs as devices. But those VxDs must be the Wine builtin ones (Wine will never allow loading native VxDs). Those are configured with symbolic links in the $(WINEPREFIX)/dosdevices/ directory, which point to the actual builtin DLL. This DLL exports a single entry point, which Wine will use when a call to DeviceIoControl is made with a handle opened to this VxD. This provides some kind of compatibility for old Win9x apps that still talk directly to VxDs. This is no longer supported on Windows NT, and newer programs are less likely to make use of this feature, so we don't expect lots of development in this area, even though the framework is there and working. Note also that Wine doesn't provide support for native VxDs (as a game, report how many times this information is written in the documentation; as an advanced exercise, find how many more occurrences we need in order to stop questions whether it's possible or not).