The glob Call in C Programming

The glob() function returns, through a glob_t structure, a sorted list of the pathnames that match a pattern. The structure records the results of the scan: pglob->gl_pathv points to the vector of matched pathnames and pglob->gl_pathc holds the number of entries in it.

The glob call can fail for several reasons. It returns GLOB_NOSPACE when it runs out of memory, GLOB_ABORTED when a read error stops the scan, and GLOB_NOMATCH when no pathname matches the pattern.
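A minimal sketch of a call that distinguishes these outcomes is shown below. The helper name print_matches is hypothetical, and treating GLOB_NOMATCH as "zero results" rather than as a failure is a design choice of this example, not something glob() requires.

```c
#include <glob.h>
#include <stdio.h>

/* Expand `pattern` and print each match on its own line.
 * Returns the number of matches, or -1 if glob() reported a hard error.
 * print_matches is a hypothetical helper, not part of any library. */
int print_matches(const char *pattern)
{
    glob_t results = {0};
    int rc = glob(pattern, 0, NULL, &results);

    if (rc == GLOB_NOMATCH) {
        /* Not a hard error: the pattern simply matched nothing. */
        globfree(&results);
        return 0;
    }
    if (rc != 0) {
        fprintf(stderr, "glob(\"%s\") failed: %s\n", pattern,
                rc == GLOB_NOSPACE ? "out of memory" :
                rc == GLOB_ABORTED ? "read error" : "unknown error");
        globfree(&results);
        return -1;
    }

    for (size_t i = 0; i < results.gl_pathc; i++)
        puts(results.gl_pathv[i]);

    int count = (int)results.gl_pathc;
    globfree(&results);   /* release the vector allocated by glob() */
    return count;
}
```

Calling print_matches("*.c") from a program's main() would list the C sources in the current directory, if there are any.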

Pathnames

In the earliest versions of Unix, glob was a separate program that the shell ran to expand wildcard characters in unquoted arguments before passing them to a command. Its functionality was later provided as a C library function, glob(), so that other programs can perform the same expansion the shell does.

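The function glob() takes a pointer to a pathname pattern, a set of flags, an optional error callback, and a pointer to a glob_t structure that receives the results. It matches all accessible pathnames against the pattern and builds a list of the ones that match. Matching follows the shell's rules for pathname expansion, which are described in glob(7). glob() does not call wordexp(3); wordexp(3) is a separate interface that performs full shell-style word expansion, of which pathname expansion is only one step.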

The C glob() does not normalize the pathnames it returns: it does not collapse repeated separators or otherwise rewrite the text, and each result keeps the directory prefix written in the pattern. Helpers named Abs, Base and Clean belong to Go's path/filepath package, not to the C library. Each pathname component is matched under the same rules that fnmatch() applies with the FNM_PATHNAME flag, so wildcards never match a slash.
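The matching rules can be tried in isolation with fnmatch(). The sketch below assumes the flags FNM_PATHNAME and FNM_PERIOD, which approximate what glob() applies to each candidate name; path_matches is a hypothetical helper name.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Returns true when `name` matches `pattern` under pathname rules:
 * '*' and '?' never match '/', and a leading '.' must be matched
 * explicitly. path_matches is a hypothetical helper. */
bool path_matches(const char *pattern, const char *name)
{
    return fnmatch(pattern, name, FNM_PATHNAME | FNM_PERIOD) == 0;
}
```

With these flags, "*.c" matches "main.c" but not "src/main.c", and "*" does not match ".hidden".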

Finally, a list of pointers to the pathnames that match the pattern is built and stored in dynamically allocated storage, terminated by a null pointer. Unless GLOB_NOSORT is given, it is sorted in the order specified by the current setting of the LC_COLLATE category (see the XBD specification, LC_COLLATE). The number of matched pathnames is stored in the field pglob->gl_pathc and the pointer to the list in pglob->gl_pathv.
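The layout of the result vector can be checked with a short sketch like the one below. It assumes the default sort (no GLOB_NOSORT) and compares neighbours with strcoll(), which follows LC_COLLATE; check_result_vector is a hypothetical helper name.

```c
#include <glob.h>
#include <string.h>

/* Returns 1 if the matches for `pattern` are in collation order and
 * gl_pathv[gl_pathc] is the terminating null pointer, 0 if not, and
 * -1 if glob() fails or matches nothing. Hypothetical helper. */
int check_result_vector(const char *pattern)
{
    glob_t g = {0};
    if (glob(pattern, 0, NULL, &g) != 0) {
        globfree(&g);
        return -1;
    }

    /* The vector holds gl_pathc strings followed by a null pointer. */
    int ok = (g.gl_pathv[g.gl_pathc] == NULL);

    for (size_t i = 1; ok && i < g.gl_pathc; i++) {
        /* strcoll() compares according to the current LC_COLLATE. */
        if (strcoll(g.gl_pathv[i - 1], g.gl_pathv[i]) > 0)
            ok = 0;
    }

    globfree(&g);
    return ok;
}
```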

Options

The glob() function has several options that allow the caller to customize how matches are found and reported. They are passed as bit flags, combined with bitwise OR, in the flags argument. glob() matches on names only: it cannot select files by modification time or permission mode, so an application that needs such a filter must examine each result itself, for example with stat(2).

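globOptions The C interface has no globOptions object. An options object exposed as a TypeScript interface, together with an array of parsed immutable Pattern objects, belongs to the glob package for Node.js. In C, behavior is configured through the flags argument and results are returned in the glob_t structure.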

options The following options are available in the Node.js glob package rather than in C:

dotRelative: Prepends all relative path strings with ./ (or .\ on Windows systems). Without it, relative results are returned bare, as foo/bar rather than ./foo/bar. C glob() has no such flag: each result keeps the directory prefix used in the pattern, so a caller that wants a leading ./ writes it into the pattern.

nobrace: Do not expand {a,b} brace sets. In C the default is the reverse: glob() treats braces as ordinary characters unless the GLOB_BRACE flag, a GNU and BSD extension, is set. Brace expansion is unrelated to tilde expansion and parameter expansion.

nomagic: The similarly named C flag GLOB_NOMAGIC, a GNU extension, makes glob() return the pattern itself as the only result when the pattern contains no metacharacters, even if no such file exists. It does not make matching follow Bash semantics more closely and it has no notable performance cost.

Node.js programs do not call the C function. They use JavaScript glob packages distributed through npm, and the options of those packages, including the ones named above, are independent of the POSIX glob() interface.

Errors

glob() searches the directories named by the pattern and reports the number of pathnames it finds that match the pattern. It stores the results in dynamically allocated storage, which must be freed by calling globfree(). By default glob() does not do any tilde expansion or parameter substitution; use wordexp(3) for those.

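glob() does not set errfunc. The caller supplies errfunc as the third argument, and glob() calls it whenever it cannot open or read a directory (see glob(3) for details). A null pointer may be passed if no callback is wanted. When glob() fails because an underlying call such as malloc(3) or opendir(3) failed, that call leaves its error code in errno.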

If errfunc returns 0, glob() skips the unreadable directory and the scan continues. If errfunc returns non-zero, or if the caller set GLOB_ERR in flags, glob() stops scanning and returns GLOB_ABORTED. Matches found before the error remain available in pglob->gl_pathc and pglob->gl_pathv.

Globbing is susceptible to race conditions, since it involves walking directories whose contents may change between the moment glob() reads them and the moment the application uses the results. glob() is an ordinary C library function and takes no locks against file system changes, so the returned list is only a snapshot. An application should still check for errors when it later opens or stats a returned pathname.

POSIX glob() gives the double-star ** no special meaning: it behaves like a single * and never descends into subdirectories. The recursive form of **, with rules such as requiring it to be the only element in a path part and not traversing symlinked directories (to prevent infinite loops and duplicate file names), is a feature of some shells and of the Node.js glob package. In C glob() patterns the backslash is an escape character unless GLOB_NOESCAPE is set.
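Where the C library's glob() does not recurse for **, a recursive search can be written by hand. The sketch below walks a directory tree with opendir(3) and readdir(3), matches each file name with fnmatch(3), and uses lstat(2) so that symbolic links are not followed. The name find_recursive and the fixed 4096-byte path buffer are assumptions of this example.

```c
#include <dirent.h>
#include <fnmatch.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Recursively prints and counts regular files under `dir` whose name
 * matches `name_pattern`. Symbolic links are not followed. Returns -1
 * if `dir` itself cannot be opened. Hypothetical helper. */
int find_recursive(const char *dir, const char *name_pattern)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int found = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        char path[4096];
        int len = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (len < 0 || (size_t)len >= sizeof path)
            continue;             /* path too long for this sketch */

        struct stat st;
        if (lstat(path, &st) != 0)
            continue;             /* entry vanished or is unreadable */

        if (S_ISDIR(st.st_mode)) {
            int sub = find_recursive(path, name_pattern);
            if (sub > 0)
                found += sub;
        } else if (S_ISREG(st.st_mode) &&
                   fnmatch(name_pattern, entry->d_name, FNM_PERIOD) == 0) {
            puts(path);
            found++;
        }
    }
    closedir(d);
    return found;
}
```

Because lstat() reports a symbolic link as a link rather than as its target, linked directories are skipped, which avoids loops in the same way the recursive ** implementations do.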

Returns

The function glob() returns 0 on success. Relative patterns are resolved against the current directory. It puts the result in a vector and stores the vector's address in pglob->gl_pathv and its size (the number of elements, not counting the terminating null pointer) in pglob->gl_pathc.

Normally, glob sorts the path names alphabetically before returning them. You can turn this off with the flag GLOB_NOSORT to save processing time. Generally, it's a good idea to let glob sort them: users of your application will get a better feel for the rate of progress as it works through the list of matches.

If a directory cannot be opened or read during the scan, glob calls the callback function errfunc, when it is not a null pointer, with the arguments epath, a pointer to the path which failed, and eerrno, the value of errno set by one of the functions opendir(3), readdir(3), or stat(2). GLOB_ERR is not something errfunc sets: it is a flag the caller passes in flags to make glob() stop at the first such error. Invalid patterns are not reported through errfunc.

Each pathname in the list that is a directory has a slash appended if GLOB_MARK is set. A different flag, GLOB_APPEND, appends the results of a call to those of a previous call on the same glob_t, which lets an application collect the matches of several patterns in one list; it must not be set on the first call. The new pathnames from a subsequent call are not sorted together with the previous ones, which mimics how the shell expands several patterns on a single command line.
