Basics ====== The **fluentfs** library provides a functional and fluent interface for filesystem interactions. If you want to follow along, you will need to import fluentfs first:: import fluentfs as fs File-like objects ----------------- There are currently four classes for representing "file-like objects" - ``FileLike``, ``Dir``, ``File`` and ``TextFile``. Files and directories ~~~~~~~~~~~~~~~~~~~~~ A ``Dir`` represents a directory:: >>> fs.Dir("/home/username/somedir") Dir("/home/username/somedir") A ``File`` represents a regular file (either a binary or a text file):: >>> fs.File("/home/username/somedir/myfile.txt") File("/home/username/somedir/myfile.txt") Both ``Dir`` and ``File`` are ``FileLike`` objects (i.e. ``Dir`` and ``File`` inherit from ``FileLike``):: >>> issubclass(fs.Dir, fs.FileLike) True >>> issubclass(fs.File, fs.FileLike) True The FileLike class ~~~~~~~~~~~~~~~~~~ Generally speaking, a ``FileLike`` object is anything that is "like a file", i.e. has the ``path`` and ``name`` properties. The ``path`` property is the *maximally expanded* path of the respective ``FileLike`` object. This means that ``path`` is absolute and all environment variables, special characters (like `~`) are expanded. The ``name`` property is the base name of the ``FileLike`` object, i.e. the file name or the directory base name. For example, let's say that `~` points to `/home/username` and you have an environment variable `DIRNAME` with the value `somedir`. Then you can do the following:: >>> dir = fs.Dir("~/$DIRNAME") Dir("/home/username/somedir") >>> dir.path '/home/username/somedir' >>> dir.name 'somedir' Both ``Dir`` and ``File`` implement additional methods and properties which are useful when dealing with files or directories. For example:: >>> file = dir.File("myfile.txt") File("/home/username/somedir/myfile.txt") >>> file.extension 'txt' >>> file.mod_time datetime.datetime(2022, 10, 26, 16, 37, 37, 533307) You can have a look at the guide or at the documentation of ``Dir`` and ``File`` for more. Text files ~~~~~~~~~~ Text files are represented by ``TextFile`` (which inherits from ``File``). You can create a ``TextFile`` from a ``File`` object using the ``text_file`` function:: >>> text_file = file.text_file() TextFile("/home/username/somedir/myfile.txt") By default the encoding is assumed to be UTF-8, but you can pass an ``encoding`` argument to ``text_file`` to change that. Note that the ``text_file`` function doesn't attempt to check if the underlying file is actually a text file in the given encoding (since that would require trying to decode the entire file, which could get very expensive). Therefore a call to ``text_file`` will always succeed, but if the underlying file isn't actually a text file, you will get errors when calling functions that require decoding the content of the file. For a simple example, let us say that this is the content of the UTF-8 encoded file ``myfile.txt`` on a system where line endings are represented by ``\n``: | Roses are red. | Violets are blue. | fluentfs is a great library, | and I am too lazy to rhyme something at you. Then this would be the output of some of the text file functions and properties:: >>> text_file.content 'Roses are red.\nViolets are blue.\nfluentfs is a great library,\nand I am too lazy to rhyme something at you.\n' >>> text_file.encoding 'utf-8' >>> text_file.char_count 104 >>> text_file.line_count 4 >>> text_file.word_count 21 >>> text_file.max_line_len 44 You can have a look at the documentation of ``TextFile`` for more. Functional iterators -------------------- The FunctionalIterator class ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The functions and properties we had a look at so far returned a single object. For example the ``word_count`` property returned the number of words in a file. Let's try to get the actual words of a text file:: >>> text_file.words If a function in ``fluentfs`` returns multiple objects, it usually doesn't return a ``list`` or ``tuple`` or ``set`` - instead it returns an instance of ``FunctionalIterator``. You can convert such an object to a list using the ``list`` function:: >>> text_file.words.list() ['Roses', 'are', 'red.', 'Violets', 'are', 'blue.', 'fluentfs', 'is', 'a', 'great', 'library,', 'and', 'I', 'am', 'to', 'lazy', 'to', 'rhyme', 'something', 'at', 'you.'] The FileIterator class ~~~~~~~~~~~~~~~~~~~~~~ The most important instances of ``FunctionalIterator`` are iterators of file-like objects. A ``FileIterator`` object represents an iterator which iterates over files and implements various useful methods for working with files. The most common way to get a ``FileIterator`` is via the ``files`` property of a ``Dir`` object. Do note that ``files`` will contain the files of the given directory along with all its **subdirectories**. Let's assume that the ``/home/username/somedir`` has the text files ``a.txt``, ``b.txt``, ``c.py`` and the binary files ``bin1`` and ``bin2``:: files = fs.Dir("/home/username/somedir").files >>> files.list() >>> [File("/home/username/somedir/a.txt"), File("/home/username/somedir/b.txt"), File("/home/username/somedir/bin1"), File("/home/username/somedir/bin2"), File("/home/username/somedir/c.py")] You can then call various useful methods on the ``FileIterator`` object:: >>> files.len() 5 There are two important things to realize about ``FunctionalIterator``. First, it has a lot of higher-order functions available which makes it very flexible and powerful - thus the **functional** part of fluentfs. Seconds, a lot of methods implemented by it return instances of ``FunctionalIterator`` (or its subclasses like ``FileIterator``) which enables you to build function chains - thus the **fluent** part of fluentfs. Combined together, this means that you can build filter-map-reduce chains of functions:: >>> fs.Dir("/home/username/somedir").files.filter(lambda f: f.extension == "txt") >>> fs.Dir("/home/username/somedir").files.filter(lambda f: f.extension == "txt").map(lambda f: f.byte_count) >>> fs.Dir("/home/username/somedir").files.filter(lambda f: f.extension == "txt").map(lambda f: f.byte_count).reduce(lambda x, y: x + y, 0) 70 Instead of writing your own functions to pass to filter, map and reduce you can also often use functions already provided for you by fluentfs to accomplish the same task:: >>> fs.Dir("/home/username/somedir").files.filter_extension("txt").map_byte_count().sum() 70 The TextFileIterator class ~~~~~~~~~~~~~~~~~~~~~~~~~~ A ``TextFileIterator`` object represents an iterator which iterates over text files and implements various useful methods for working with text files. For example it has the ``map_line_count`` method, which a regular ``FileIterator`` does not have (since binary files usually do not have the concept of lines). You can get a ``TextFileIterator`` from a ``FileIterator`` by calling ``text_file_iterator`` on it:: text_file_it = some_iterator_with_text_files.text_file_iterator() Just like with ``text_file()`` this will always succeed, but subsequent method calls may fail if the underlying files are not actually text files. This means that you are responsible to remove all binary files from the iterator before calling ``text_file_iterator`` (see below for how to do that). You can use the filter function to ensure that you file iterator contains text files only. For example if you know that all files with the ``txt`` extension are text files (which they hopefully are), you can do the following:: >>> fs.Dir(".").files.filter_extension("txt").text_file_iterator() As soon as you have a ``TextFileIterator``, you can call the respective methods. This is how you would get the total number of lines in all your ``txt`` files:: >>> fs.Dir(".").files.filter_extension("txt").text_file_iterator().map_line_count().sum() 6 Note that the ``text_file_iterator`` function takes two arguments - the encoding and whether to raise an error if a file cannot be decoded. By default ``encoding`` is assumed to be UTF-8 and ``raise_on_decode_error`` is assumed to be ``True`` (i.e. if a file cannot be decoded, an error will be raised). Further reading --------------- All of this allows you to accomplish common tasks in a single line of clear and readable code. That's the power of a functional and fluent interface for the filesystem! To go beyond the basics, check out the guide. To view some recipes for common tasks, check out the recipes. You can also have a look at the API documentation.