
Filesystem compatibility among Linux, Windows and OS X


Filesystem Tutorial. Contents: Introduction; Preliminaries; Reporting the size of a file (tut1.cpp); Using status queries to determine file existence and type (tut2.cpp); Directory iteration plus catching exceptions (tut3.cpp); Using path decomposition, plus sorting results (tut4.cpp); Class path: Constructors, including Unicode (tut5.cpp); Class path: Generic format vs. Native format; Class path: Iterators, observers, composition, decomposition, and query (path_info.cpp); Error reporting.

Introduction. This tutorial develops a little command line program to list information about files and directories, essentially a much simplified version of the POSIX ls or Windows dir commands. We'll start with the simplest possible version and progress to more complex functionality.

Along the way we'll digress to cover topics you'll need to know about to understand Boost.Filesystem. Source code for each of the tutorial programs is available, and you are encouraged to compile, test, and experiment with it.

Things That Shouldn't Be in File Names for $1,000 Alex. > Cue Jeopardy music and image of Alex Trebek < Space, mixed case, slash, backslash, question mark, colon, asterisk, quotation mark and control codes.

Dan Pouliot.

Linux/Windows/Unix/... file names: Which characters are allowed? Which are unescaped.

NTFS. NTFS (New Technology File System[1]) is a proprietary file system developed by Microsoft.[1] Starting with Windows NT 3.1, it is the default file system of the Windows NT family.[7] History. In the mid-1980s, Microsoft and IBM formed a joint project to create the next generation of graphical operating system.


The result of the project was OS/2, but Microsoft and IBM disagreed on many important issues and eventually separated: OS/2 remained an IBM project and Microsoft worked on Windows NT. The OS/2 file system HPFS contained several important new features. When Microsoft created their new operating system, they borrowed many of these concepts for NTFS.[8] Probably as a result of this common ancestry, HPFS and NTFS share the same disk partition identification type code (07).

How to make filenames NTFS compatible. Let's assume you have a bunch of files (in a directory tree) on a Linux/Unix system and you'd like to copy them over to a Windows NTFS filesystem.


The latter allows far fewer characters in filenames (and directory names) than Linux/Unix. The following code goes through the entire tree (starting with the current working directory) and removes all invalid characters from directory entries. Note that it relies on a few non-standard extensions (e.g., not all find implementations have a -print0 option). P.S.: I used David's writeup on how to process directory entries correctly and the Wikipedia article on NTFS for the list of valid characters. P.S.2: Beware that simply removing invalid characters might result in data loss, since several filenames can be converted to the same string this way.
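The script from the original post is not part of this excerpt. As a stand-in, here is a minimal Python sketch of the same idea, not the author's code: walk the tree bottom-up and strip the characters that the Win32 namespace on NTFS rejects. The names INVALID, ntfs_safe and clean_tree are hypothetical, and the character set is this sketch's assumption rather than the list the author used. To address the data-loss warning, it skips any rename whose target already exists instead of overwriting it.

```python
import os

# Assumption: characters rejected by the Win32 namespace on NTFS,
# plus the ASCII control codes 0-31. Not the original author's list.
INVALID = set('<>:"/\\|?*') | {chr(c) for c in range(32)}


def ntfs_safe(name):
    """Return name with the characters in INVALID removed."""
    return "".join(ch for ch in name if ch not in INVALID)


def clean_tree(root="."):
    """Rename entries under root; return the paths that were skipped."""
    skipped = []
    # Bottom-up, so a directory is renamed only after its contents
    # have already been visited.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            new = ntfs_safe(name)
            if new == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new)
            # Skip empty results and collisions rather than lose data.
            if not new or os.path.lexists(dst):
                skipped.append(src)
                continue
            os.rename(src, dst)
    return skipped
```

Calling clean_tree(".") from the top of the tree returns the entries that still need manual attention, for example two files whose cleaned names would be identical.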

Recommendations for Limitations on Image Filenaming for managing image collections and image databases. To reiterate from the filenaming page, the most important thing that a filename can do for your image collection is to provide a form of unique identification (or UID) for each digital "asset." However, if you wish to be able to exchange your image files with clients or colleagues (often using different computer operating systems), then you need to observe some standards for cross-platform compatibility to ensure maximum portability. Here are some recommendations to avoid potential problems.

File System Functionality Comparison. The following tables list functionality and feature support comparisons for the four main Windows file systems, NTFS, exFAT, UDF, and FAT32: Functionality. Windows Server 2003 and Windows XP: The NTFS last access time stamp field is updated.


Limits Journaling and Change Log Block Allocation Features Security.

Naming Files, Paths, and Namespaces. All file systems supported by Windows use the concept of files and directories to access data stored on a disk or device.


Windows developers working with the Windows APIs for file and device I/O should understand the various rules, conventions, and limitations of names for files and directories. Data can be accessed from disks, devices, and network shares using file I/O APIs. Files and directories, along with namespaces, are part of the concept of a path, which is a string representation of where to get the data for a specific operation, regardless of whether it comes from a disk, a device, or a network connection. For additional information, see the following subsections:

Linux Box Admin. Exclusive content published on July 10, 2006. In 1995, Microsoft added long file name support to Windows, allowing more descriptive names than the limited 8.3 DOS format.


C# - How check if given string is legal (allowed) file name under Windows.

Information about the characters that you cannot use in site names, folder names, and file names in SharePoint.

XL2000: Workbook Name Contains Invalid Characters. In Microsoft Excel, you may see any of the following behaviors:
- The name of the workbook may contain either of the bracket characters: [ ]
- The name on the sheet tab for the workbook contains the file name extension and a ] character preceding the sheet names, for example, .xls]Sheet1
- When you rename a sheet that contains the file name extension and ] character, you receive the following error message:

While renaming a sheet or chart, you entered an invalid name. Try one of the following:
- Make sure the name you entered does not exceed 31 characters.
- Make sure the name does not contain any of the following characters: : \ / ? * [ or ]
- Make sure you did not leave the name blank.

Mac/Linux/Windows file name friction. In 1993, Microsoft added long file name support to Windows NT 3.1, allowing more descriptive names than the limited 8.3 DOS format.


Mac users scoffed, having had long file names for nearly a decade, and because Windows still stored a DOS file name in the background. Linux was born with long file names a couple of years before they showed up in Windows. Today, long file names are well supported by all three operating systems, though key differences remain. Linux is the most sensitive. One of the first culture shocks for people moving from Windows to Linux is the case sensitivity of file names. The Mac OS X HFS+ and Windows NTFS file systems are case preserving, but not case sensitive.
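Case sensitivity matters in practice when a directory created on Linux is copied to a case-insensitive file system: names that differ only by case end up competing for one slot. The following Python sketch finds such groups before copying. The function name case_collisions is hypothetical, and str.casefold() is only an approximation of the folding rules that HFS+ and NTFS actually apply.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that become identical when case is ignored."""
    groups = defaultdict(list)
    for name in names:
        # Assumption: casefold() stands in for the file system's own
        # case-folding table, which may differ for some characters.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


print(case_collisions(["README", "Readme", "notes.txt", "readme"]))
# prints [['README', 'Readme', 'readme']]
```

Passing the result of os.listdir() for each directory gives a quick pre-flight check. An empty list means no two names in that directory differ only by case.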

Linux filename guidelines. A file name, also called a filename, is a string (i.e., a sequence of characters) that is used to identify a file.


A file is a collection of related information that appears to the user as a single, contiguous block of data and that is retained in storage, e.g., a hard disk drive (HDD), floppy disk, optical disk or magnetic tape. Names are given to files on Unix-like operating systems to enable users to easily identify them and to facilitate finding them again in the future. However, file names are only a convenience for users, and such operating systems identify files by their inodes, which are numbers that are stored on the HDD in inode tables and which exist for all types of files, rather than by their names or locations in directories. This is somewhat analogous to the domain names that are used on the Internet to identify web sites.
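The point that names are a convenience layered on top of inodes can be observed directly. The sketch below is an illustration, not part of the quoted guidelines. It assumes a Unix-like system whose temporary directory supports hard links, and the file names are made up. It gives one file a second name and compares what os.stat reports for each.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "report.txt")
    alias = os.path.join(d, "same-report.txt")
    with open(original, "w") as f:
        f.write("hello\n")

    # A hard link adds a second directory entry for the same inode.
    os.link(original, alias)

    a, b = os.stat(original), os.stat(alias)
    # Same device and same inode number means the same underlying file.
    same_inode = (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
    link_count = a.st_nlink  # number of names pointing at the inode
    print(same_inode, link_count)
```

Under those assumptions, both names should report the same inode and a link count of 2. Deleting either name leaves the data reachable through the other.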

Non-decodable Bytes in System Character Interfaces (PEP 383). File names, environment variables, and command line arguments are defined as being character data in POSIX; the C APIs, however, allow passing arbitrary bytes, whether these conform to a certain encoding or not. This PEP proposes a means of dealing with such irregularities by embedding the bytes in character strings in such a way that allows recreation of the original byte string. The C char type is a data type that is commonly used to represent both character data and bytes. Certain POSIX interfaces are specified and widely understood as operating on character data; however, the system call interfaces make no assumption on the encoding of these data, and pass them on as-is. With Python 3, character strings use a Unicode-based internal representation, making it difficult to ignore the encoding of byte strings in the same way that the C interfaces can ignore the encoding.
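The mechanism this PEP introduced is available in Python 3 as the surrogateescape error handler. A small round-trip sketch follows, using a made-up file name whose bytes are Latin-1 rather than valid UTF-8. On POSIX systems, os.fsdecode and os.fsencode are generally expected to apply the same handler with the file system encoding. The explicit codec is used here so the example does not depend on the locale.

```python
# A file name as raw bytes: 0xE9 is "é" in Latin-1 but invalid in UTF-8.
raw = b"caf\xe9.txt"

# The undecodable byte 0xE9 is embedded as the lone surrogate U+DCE9.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler recreates the original bytes exactly.
restored = name.encode("utf-8", errors="surrogateescape")

# ascii() is used because printing a lone surrogate directly can fail
# on a UTF-8 terminal.
print(ascii(name))
print(restored == raw)
```

The string can be stored, compared and passed around like any other str. Only encoding it without the handler raises an error.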

Glindra: Command Line File Handling and ASCII Tools. Filename Cleanup Options. These options can be specified with the cop and rena commands, and are used to clean up filenames. They would typically be used with a wildcard input file specification. None of these options are on by default, so these transformations will not be carried out unless you explicitly ask for them. Example:
> rena '*.*' -lower -portable
Changes all filenames in the directory to lower case, and converts national letters and characters that are likely to cause problems on some platforms.
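For readers without that tool, the general idea of a lower-case-plus-portable cleanup can be sketched in Python. This is not Glindra's implementation, and its exact conversion rules are not given in the excerpt. The function name portable_lower and the choice of safe characters (ASCII letters, digits, dot, underscore, hyphen) are assumptions of this sketch.

```python
import re
import unicodedata


def portable_lower(name):
    """Lower-case a name and reduce it to a conservative ASCII subset."""
    # Split accented letters into base letter + combining mark,
    # then drop everything that is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of other characters with a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_name)
    return cleaned.lower()


print(portable_lower("Résumé (Final).TXT"))
# prints resume_final_.txt
```

Letters with no decomposition, such as ß or ø, are simply dropped by this approach, so a real tool would need a transliteration table. As with any lossy cleanup, two different names can map to the same result.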

Detox(1): clean up filenames.

Fixing Unix/Linux/POSIX Filenames: Control Characters (such as Newline), Leading Dashes, and Other Problems. David A. Wheeler.