
Kernel file operations: data structures and APIs
Author: Yuan Jianpeng  Email: yuanjp89@163.com
Published: 2025-2-20  Site: Inside Linux Development

This article looks at how to perform file operations, such as reading and writing, from inside the kernel.


Kernel structures

This article introduces the following path- and file-related structures:

struct filename

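This structure is one of the kernel's representations of a path. The kernel provides interfaces that convert a path string into this structure.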



struct filename *
getname(const char __user * filename)
{
        return getname_flags(filename, 0);
}

struct filename *
getname_kernel(const char * filename);


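getname() converts a user-space string pointer into a struct filename.

getname_kernel() converts a kernel-space string pointer into a struct filename.

Both interfaces are defined in fs/namei.c. The returned struct filename is released with putname() once it is no longer needed.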


struct path

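This structure represents a path in the kernel and was covered in an earlier article. It has only two members: a struct vfsmount pointer (mnt) and a struct dentry pointer (dentry). Together they identify the mount and the position inside that mount's directory tree.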


The kernel exports kern_path(), which converts a path string into a struct path:

int kern_path(const char *name, unsigned int flags, struct path *path)
{
        struct filename *filename = getname_kernel(name);
        int ret = filename_lookup(AT_FDCWD, filename, flags, path, NULL);

        putname(filename);
        return ret;

}
EXPORT_SYMBOL(kern_path);
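
As an illustration of how a module might call kern_path(), here is a minimal sketch. The function name demo_lookup() is hypothetical, and the sketch assumes the caller only wants to confirm that the path resolves. The reference taken by a successful lookup is dropped with path_put().

```c
#include <linux/fs.h>
#include <linux/namei.h>
#include <linux/path.h>
#include <linux/printk.h>

/* Hypothetical helper: resolve a kernel-space path string and log it. */
static int demo_lookup(const char *name)
{
	struct path path;
	int err;

	/* LOOKUP_FOLLOW: follow a trailing symlink, like stat(2) does. */
	err = kern_path(name, LOOKUP_FOLLOW, &path);
	if (err)
		return err;

	/* %pd prints the name of the dentry. */
	pr_info("%s resolved, last component: %pd\n", name, path.dentry);

	/* kern_path() took references on mnt and dentry; release them. */
	path_put(&path);
	return 0;
}
```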

If the path is a pointer that comes from user space, use user_path_at() instead:

int user_path_at(int dfd, const char __user *name, unsigned flags,
                 struct path *path)
{
        struct filename *filename = getname_flags(name, flags);
        int ret = filename_lookup(dfd, filename, flags, path, NULL);

        putname(filename);
        return ret;
}
EXPORT_SYMBOL(user_path_at);

struct file

This structure represents an open file. It is defined in linux/fs.h and is very large.


open/read/write

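This section describes how the kernel opens, reads and writes files. The open interfaces are defined in fs/open.c, and the read/write interfaces are defined in fs/read_write.c.

Reading these source files shows which interfaces the kernel exports.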

1 The sys_open/sys_read system call interfaces

Some material calls sys_open/sys_read directly:

int fd = sys_open(filename, O_RDONLY, 0);


This is the system call function itself. On an arm64 kernel built with CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y, however, this interface no longer exists. See reference 【4】.


In the header include/linux/syscalls.h:

#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx);

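The sys_* declarations are included only when CONFIG_ARCH_HAS_SYSCALL_WRAPPER is not defined. When the macro is defined, the architecture's own header is used instead: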

#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
#include <asm/syscall_wrapper.h>
#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */


See: arch/arm64/include/asm/syscall_wrapper.h


So this approach does not work on arm64. Reading /proc/kallsyms shows the following symbols:

# cat /proc/kallsyms  | grep sys_open
ffff80008003b754 W compat_sys_open_by_handle_at
ffff80008003bbdc W __arm64_sys_open_by_handle_at
ffff8000800c7374 t do_sys_openat2
ffff8000800c74dc T do_sys_open
ffff8000800c7514 T __arm64_sys_open
ffff8000800c7538 T __arm64_sys_openat
ffff8000800c755c T __arm64_sys_openat2
ffff8000800e74e0 T __arm64_sys_open_tree
ffff80008010b6b8 t proc_sys_open


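According to the references, these interfaces take user-space pointers, so a kernel caller first has to widen the address limit as shown below. Note that get_fs()/set_fs() have been removed from recent kernels, so this only applies to older ones: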

mm_segment_t old_fs = get_fs();

set_fs(KERNEL_DS);


2 filp_open()

This is the proper interface for opening a file from the kernel. It is defined in fs/open.c:

/**
 * filp_open - open file and return file pointer
 *
 * @filename:   path to open
 * @flags:      open flags as per the open(2) second argument
 * @mode:       mode for the new file if O_CREAT is set, else ignored
 *
 * This is the helper to open a file from kernelspace if you really
 * have to.  But in generally you should not do this, so please move
 * along, nothing to see here..
 */
struct file *filp_open(const char *filename, int flags, umode_t mode)
{
        struct filename *name = getname_kernel(filename);
        struct file *file = ERR_CAST(name);

        if (!IS_ERR(name)) {
                file = file_open_name(name, flags, mode);
                putname(name);
        }
        return file;
}
EXPORT_SYMBOL(filp_open);
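
filp_open() never returns NULL on failure. It returns a negative errno encoded in the pointer value, which is why the code above uses IS_ERR() and ERR_CAST(). The following is a user-space model of that convention, written only so it can be compiled outside the kernel. The helper bodies are simplified stand-ins, not the kernel's definitions from linux/err.h, and fake_open() and struct fake_file are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Model: encode a negative errno in a pointer value. */
static inline void *ERR_PTR(long error)
{
	/* Only -1..-MAX_ERRNO can be told apart from real pointers. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Model: recover the errno from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Model: error pointers occupy the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for an opened file. */
struct fake_file {
	const char *name;
};

/* Hypothetical stand-in for filp_open(): fails on a NULL or empty name. */
static struct fake_file *fake_open(struct fake_file *slot, const char *name)
{
	if (!name || !*name)
		return ERR_PTR(-ENOENT);
	slot->name = name;
	return slot;
}
```

A caller therefore checks the result with IS_ERR() and extracts the error code with PTR_ERR(), never with a NULL test.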

For reading and writing from the kernel, the following are provided:

ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)

ssize_t kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)


vfs_read/vfs_write serve the user-space interfaces: their buffer argument is a user-space pointer.


To close the file, use filp_close().
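
Put together, the open, read and close steps might look like the sketch below. The function name demo_read_head() and the 64-byte buffer are assumptions made for illustration, and the sketch assumes a kernel where kernel_read() has the four-argument signature shown above.

```c
#include <linux/fs.h>
#include <linux/err.h>
#include <linux/printk.h>

/* Hypothetical helper: log the first bytes of a file as text. */
static int demo_read_head(const char *name)
{
	struct file *file;
	char buf[64];
	loff_t pos = 0;		/* file offset, advanced by kernel_read() */
	ssize_t n;

	file = filp_open(name, O_RDONLY, 0);
	if (IS_ERR(file))
		return PTR_ERR(file);

	/* Leave one byte free so the data can be NUL-terminated. */
	n = kernel_read(file, buf, sizeof(buf) - 1, &pos);
	if (n >= 0) {
		buf[n] = '\0';
		pr_info("read %zd bytes from %s: %s\n", n, name, buf);
	}

	/* The second argument is the POSIX lock owner id; NULL here. */
	filp_close(file, NULL);
	return n < 0 ? (int)n : 0;
}
```

Writing works the same way with kernel_write(), provided the file was opened with a writable flag such as O_WRONLY.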


3 kernel_read_file_from_path()/kernel_read_file_from_path_initns()

These are defined in fs/kernel_read_file.c.


For example, the kernel's firmware loader calls this interface:

drivers/base/firmware_loader/main.c
fw_get_filesystem_firmware()
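
A minimal sketch of a caller follows. It assumes the six-argument form of kernel_read_file_from_path() found in recent kernels (the argument list and return type have changed across versions, so check fs/kernel_read_file.c in your tree). The helper name demo_read_whole(), the 1 MiB cap and the READING_UNKNOWN id are choices made for this example.

```c
#include <linux/kernel_read_file.h>
#include <linux/sizes.h>
#include <linux/vmalloc.h>
#include <linux/printk.h>

/* Hypothetical helper: read up to 1 MiB of a file into a new buffer. */
static int demo_read_whole(const char *name)
{
	void *buf = NULL;	/* NULL asks the helper to allocate */
	size_t file_size;
	ssize_t n;

	n = kernel_read_file_from_path(name, 0, &buf, SZ_1M, &file_size,
				       READING_UNKNOWN);
	if (n < 0)
		return (int)n;

	pr_info("%s: read %zd of %zu bytes\n", name, n, file_size);

	/* The buffer allocated by the helper is released with vfree(). */
	vfree(buf);
	return 0;
}
```

The buffer holds raw file contents and is not NUL-terminated, so it should not be passed to string functions as is.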


stat

Running:

$ grep EXPORT_SYMBOL fs/*.c

lists the interfaces that the fs directory exports.


For example, to read a file's attributes, fs/stat.c exports:

int vfs_getattr(const struct path *path, struct kstat *stat,
                u32 request_mask, unsigned int query_flags)
{
        int retval;

        if (WARN_ON_ONCE(query_flags & AT_GETATTR_NOSEC))
                return -EPERM;

        retval = security_inode_getattr(path);
        if (retval)
                return retval;
        return vfs_getattr_nosec(path, stat, request_mask, query_flags);
}
EXPORT_SYMBOL(vfs_getattr);
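
Because vfs_getattr() takes a struct path, a caller that starts from a path string can combine it with kern_path(). The sketch below assumes the four-argument signature shown above; demo_stat() is a hypothetical name.

```c
#include <linux/fs.h>
#include <linux/namei.h>
#include <linux/path.h>
#include <linux/stat.h>
#include <linux/printk.h>

/* Hypothetical helper: log the size and mode of the file at a path. */
static int demo_stat(const char *name)
{
	struct path path;
	struct kstat stat;
	int err;

	err = kern_path(name, LOOKUP_FOLLOW, &path);
	if (err)
		return err;

	/* Request the basic stat(2) fields with default sync behaviour. */
	err = vfs_getattr(&path, &stat, STATX_BASIC_STATS,
			  AT_STATX_SYNC_AS_STAT);
	if (!err)
		pr_info("%s: size=%lld mode=%o\n", name, stat.size, stat.mode);

	path_put(&path);
	return err;
}
```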


create

fs/namei.c exports an interface for creating regular files:

/**
 * vfs_create - create new file
 * @idmap:      idmap of the mount the inode was found from
 * @dir:        inode of the parent directory
 * @dentry:     dentry of the child file
 * @mode:       mode of the child file
 * @want_excl:  whether the file must not yet exist
 *
 * Create a new file.
 *
 * If the inode has been found through an idmapped mount the idmap of
 * the vfsmount must be passed through @idmap. This function will then take
 * care to map the inode according to @idmap before checking permissions.
 * On non-idmapped mounts or if permission checking is to be performed on the
 * raw inode simply pass @nop_mnt_idmap.
 */     
int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
               struct dentry *dentry, umode_t mode, bool want_excl)
{
        int error;

        error = may_create(idmap, dir, dentry);
        if (error)
                return error;

        if (!dir->i_op->create)
                return -EACCES; /* shouldn't it be ENOSYS? */

        mode = vfs_prepare_mode(idmap, dir, mode, S_IALLUGO, S_IFREG);
        error = security_inode_create(dir, dentry, mode);
        if (error)
                return error;
        error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
        if (!error)
                fsnotify_create(dir, dentry);
        return error;
}
EXPORT_SYMBOL(vfs_create);

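The first argument of vfs_create() is a mnt_idmap. This mechanism covers cases where the user IDs seen through a mount differ from the IDs the kernel actually stores, for example with docker. If the mechanism is not in use, pass &nop_mnt_idmap.


The second argument is the inode of the parent directory.

The third argument is the dentry of the child file. Because the child does not exist yet, this is a negative dentry.


The kernel provides an interface that creates such a dentry from a path. See how do_mknodat() calls filename_create():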

 dentry = filename_create(dfd, name, &path, lookup_flags);
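
filename_create() appears to be internal to fs/namei.c, so the sketch below assumes a module would use the exported wrapper kern_path_create() instead, paired with done_path_create(). It also assumes the vfs_create() signature shown above, which differs in older and possibly newer kernels. demo_create() is a hypothetical name.

```c
#include <linux/fs.h>
#include <linux/namei.h>
#include <linux/mount.h>
#include <linux/dcache.h>
#include <linux/err.h>
#include <linux/fcntl.h>

/* Hypothetical helper: create an empty regular file at a path. */
static int demo_create(const char *name, umode_t mode)
{
	struct path path;	/* filled with the parent directory */
	struct dentry *dentry;
	int err;

	/* Returns a negative dentry with the parent directory locked. */
	dentry = kern_path_create(AT_FDCWD, name, &path, 0);
	if (IS_ERR(dentry))
		return PTR_ERR(dentry);	/* e.g. -EEXIST if it exists */

	/*
	 * do_mknodat() also runs security_path_mknod() before this
	 * call; that step is omitted in this sketch.
	 */
	err = vfs_create(mnt_idmap(path.mnt), d_inode(path.dentry),
			 dentry, mode, true);

	/* Unlocks the parent and drops the dentry and path references. */
	done_path_create(&path, dentry);
	return err;
}
```

Taking the idmap from the parent's mount with mnt_idmap() keeps the sketch correct on idmapped mounts, and on ordinary mounts it yields the same no-op mapping as nop_mnt_idmap.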

References

【1】Linux Journal. Driving Me Nuts - Things You Never Should Do in the Kernel.

https://www.linuxjournal.com/article/8110


【2】Chris. Writing to a file from the Kernel.

https://benninger.ca/posts/writing-to-a-file-from-the-kernel/


【3】Slavaim.

https://github.com/slavaim/Linux-kernel-modules/blob/master/readfile/readfile.c


【4】Dominik Brodowski. syscalls: introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER

https://lkml.org/lkml/2018/4/5/143


【5】idmappings

https://www.kernel.org/doc/html/latest/filesystems/idmappings.html


【6】 Jake Edge. ID-mapped mounts

https://lwn.net/Articles/896255/



Copyright © linuxdev.cc 2017-2024. Some Rights Reserved.