The Linux fdatasync command is one of those Linux topics that sounds more obscure than it really is.
At first glance, it looks like another storage command sitting somewhere beside sync and fsync, but there is an important distinction to make:
fdatasync is most commonly known as fdatasync(), a Linux system call used by applications, not a day-to-day terminal command that administrators type directly.
That distinction matters.
fdatasync() is about making file data durable. It tells Linux to push modified file data from memory to the underlying storage device, but without necessarily flushing every piece of file metadata unless that metadata is required to retrieve the data correctly.
In simple terms:
fdatasync() is like saying, “Make sure the actual file data is safely written to storage, but do not do unnecessary extra work unless it is needed.”
Why fdatasync Exists
Linux uses memory aggressively to improve performance.
When an application writes data to a file, that data may not immediately land on physical storage. It can sit in memory buffers first. This is normal and usually beneficial because storage devices are slower than RAM, and batching writes improves performance.
The problem is that there is a gap between:
“the application wrote the data”
and
“the data is safely on persistent storage.”
If the system crashes, loses power, or the storage device fails during that window, data may be lost or left incomplete.
This is where synchronization mechanisms like sync, fsync(), and fdatasync() come in.
fdatasync() is a more targeted version of that idea. It focuses on one open file descriptor and is designed for applications that care about data durability but do not always need every metadata change written immediately.
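To make that gap concrete, here is a minimal sketch in Python, whose os.fdatasync() is a thin wrapper around the system call and is available on Linux. The function name write_durably and the file name are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and return only after fdatasync() has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Here the application "wrote the data", but it may still sit
        # only in the kernel's page cache.
        os.fdatasync(fd)
        # Here the kernel has reported the data as flushed to storage.
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")  # hypothetical file name
        write_durably(target, b"important payload\n")
        print("flushed", os.path.getsize(target), "bytes")
```

The two comments mark the two moments the article distinguishes: after write() the data exists from the application's point of view, and only after fdatasync() returns has the kernel been asked to push it to the device.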
fdatasync vs fsync
The easiest way to understand fdatasync() is to compare it with fsync().
fsync() flushes both modified file data and associated metadata.
fdatasync() is similar, but more selective.
It does not flush modified metadata unless that metadata is required so the file data can be retrieved correctly later. For example, timestamp updates such as access time or modification time do not normally need to be flushed just to read the file data later. However, file size changes may require metadata flushing because the system needs that information to retrieve the file properly.
That gives us the practical difference:
fsync() is stronger and broader.
fdatasync() is narrower and can be more efficient.
Why Metadata Matters
Metadata is information about the file rather than the file’s actual contents.
Examples include:
- File size.
- Timestamps.
- Ownership.
- Permissions.
- Directory entries.
- Inode-related information.
If you update a file’s contents, the data itself matters most. But sometimes metadata matters too.
If a file grows from 1 MB to 2 MB, the file size metadata must be durable as well. Otherwise, after a crash, the system may not know how to retrieve the newly written data correctly.
That is why fdatasync() does not ignore metadata completely. It simply avoids flushing metadata that is not required for data retrieval.
This is the key point:
fdatasync() is not “unsafe fsync.”
It is a more focused sync operation.
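One way applications take advantage of this is to allocate a file at its full size once and then overwrite blocks in place, so later writes never change the file size. The sketch below assumes a hypothetical fixed block size and helper names; it illustrates the idea and is not taken from any particular program.

```python
import os
import tempfile

BLOCK = 4096  # hypothetical block size, chosen for illustration


def preallocate(path: str, blocks: int) -> None:
    """Create the file at its final size once, using the broader fsync()."""
    with open(path, "wb") as f:
        f.write(b"\0" * (BLOCK * blocks))
        f.flush()
        # The file size just changed, so flush data and metadata together.
        os.fsync(f.fileno())


def overwrite_block(path: str, index: int, payload: bytes) -> None:
    """Overwrite one block in place; the file size stays the same."""
    if len(payload) != BLOCK:
        raise ValueError("payload must be exactly one block")
    with open(path, "r+b") as f:
        f.seek(index * BLOCK)
        f.write(payload)
        f.flush()
        # Only the contents and timestamps changed, so fdatasync() is
        # allowed to skip the metadata flush.
        os.fdatasync(f.fileno())
```

Whether this actually saves I/O depends on the filesystem and storage device, so treat it as a pattern to measure rather than a guaranteed win.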
Is fdatasync A Normal Linux Command?
Not usually. For most Linux users, fdatasync is not a standalone command like ls, cp, sync, or rsync.
It is primarily a system call or programming interface used by software that needs durability guarantees.
That said, some user-facing Linux tools expose fdatasync-style behavior through options.
For example, GNU sync supports the -d or --data option, which uses fdatasync(2) to sync only file data and any metadata required for filesystem consistency.
Example:
sync -d important-file.txt
This is probably the most practical shell-level way for administrators to encounter fdatasync behavior.
Another example is dd, which supports conv=fdatasync.
Example:
dd if=image.iso of=/dev/sdX bs=4M status=progress conv=fdatasync
That tells dd to make sure the output file data is physically written before the command exits.
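The same idea can be sketched in Python: copy in blocks, then issue a single fdatasync() on the output before returning. This is not how dd is implemented, only an illustration of the behavior described above; the function name and default block size are assumptions.

```python
import os
import tempfile


def copy_then_fdatasync(src: str, dst: str, block_size: int = 4 * 1024 * 1024) -> int:
    """Copy src to dst in blocks and flush the output data once at the end."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
        # Push Python's userspace buffer into the kernel first...
        fout.flush()
        # ...then ask the kernel to push the file data to storage.
        os.fdatasync(fout.fileno())
    return copied
```

Syncing once at the end keeps the copy fast while still giving a clear point at which the output has been flushed.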
fdatasync vs sync
This is where the naming can get confusing.
sync is the user-facing command many Linux users already know. It asks the kernel to flush cached writes to persistent storage.
When files are specified, GNU Coreutils says sync synchronizes those files using fsync(2) by default, and the --data option switches to fdatasync(2).
So the simple breakdown is:
sync is the shell command.
fsync() is a system call that flushes file data and metadata.
fdatasync() is a system call that flushes file data and only the necessary metadata.
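For readers who think in code, the three levels map onto three functions in Python's os module. The wrapper below, with its hypothetical name flush_scope and made-up scope labels, exists only to show them side by side.

```python
import os
import tempfile


def flush_scope(scope: str, fd: int = -1) -> None:
    """Dispatch to one of the three flush levels (labels are illustrative)."""
    if scope == "system":
        os.sync()          # like the sync command with no arguments
    elif scope == "file":
        os.fsync(fd)       # this file's data and metadata
    elif scope == "data":
        os.fdatasync(fd)   # this file's data plus only the required metadata
    else:
        raise ValueError(f"unknown scope: {scope!r}")


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        f.write(b"hello")
        f.flush()  # move Python's buffer into the kernel before syncing
        flush_scope("data", f.fileno())
        flush_scope("file", f.fileno())
```

Note the f.flush() call: Python file objects keep their own userspace buffer, and neither fsync() nor fdatasync() can flush data the kernel has not received yet.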
My article What Is The Linux Sync Command & What Does It Do? is a natural companion to this topic because it explains how Linux cached writes work from the administrator’s perspective.
When fdatasync Makes Sense
fdatasync() makes sense when an application needs stronger data safety but also wants to avoid unnecessary metadata write overhead.
Databases
Databases care deeply about durability.
When a transaction is committed, the database needs confidence that the important data or journal entry has reached storage.
Using fdatasync() can help reduce unnecessary metadata syncing while still pushing critical data out of volatile memory.
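A heavily simplified sketch of that commit step might look like the following. The TinyJournal class, its record layout (length plus CRC32 header), and read_journal are all hypothetical and far smaller than what a real database does; the point is only that commit() does not return until fdatasync() has.

```python
import os
import struct
import tempfile
import zlib


class TinyJournal:
    """Hypothetical append-only journal; commit() returns after fdatasync()."""

    def __init__(self, path: str) -> None:
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)

    def commit(self, payload: bytes) -> None:
        # Record layout (an assumption): 4-byte length, 4-byte CRC32, payload.
        header = struct.pack("<II", len(payload), zlib.crc32(payload))
        os.write(self._fd, header + payload)
        # Appending grows the file, so fdatasync() also flushes the new
        # size, because that metadata is needed to read the record back.
        os.fdatasync(self._fd)

    def close(self) -> None:
        os.close(self._fd)


def read_journal(path: str) -> list:
    """Return every intact record, stopping at a torn or corrupt tail."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, crc = struct.unpack("<II", header)
            payload = f.read(length)
            if len(payload) < length or zlib.crc32(payload) != crc:
                break
            records.append(payload)
    return records
```

The checksum is there because a crash can still leave a partially written final record; the reader treats anything that fails the check as never committed.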
Log Files
For certain logging workloads, the actual log contents may matter more than every timestamp update.
In those cases, fdatasync() can be a reasonable balance between safety and performance.
Backup and Imaging Workflows
Tools that write large files, disk images, or backup archives may expose fdatasync behavior to ensure output data is committed before the process finishes.
This is why dd conv=fdatasync is useful when writing images or large output files.
Application Level File Updates
Applications that perform careful file writes may use fdatasync() as part of a safer write process.
For example, an application may write data, flush it, and then perform controlled rename or metadata operations depending on the required durability model.
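One common shape for that process is write to a temporary file, flush it, then rename it over the original. The sketch below assumes a hypothetical helper name and temporary-file prefix. It makes the new contents durable before the rename, but it deliberately leaves out making the rename itself durable, which would need an fsync() on the containing directory.

```python
import contextlib
import os
import tempfile


def replace_file_contents(path: str, data: bytes) -> None:
    """Replace path with data via a flushed temporary file and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Make the new contents durable before they become visible
            # under the final name.
            os.fdatasync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic swap on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

Readers of the file see either the old contents or the new ones, never a half-written mix. Note that mkstemp() creates the file with owner-only permissions, so a real implementation may need to restore the original mode and ownership.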
When fsync Is Still Better
fdatasync() is useful, but it is not always the right tool.
Use fsync() when metadata durability matters too.
For example, if your process updates file permissions or ownership and needs those changes to survive a crash, use fsync() on the file. If it creates or renames a file, or otherwise depends on directory entries being durable, fsync() on the file alone may not be enough: an additional fsync() on the containing directory may be required.
That detail matters in crash-safe application design.
It is one of those Linux reliability details that separates:
“it worked in testing”
from
“it survived a real power loss.”
For a deeper look at the broader fsync() side of this topic, this related article fits perfectly: What Is The Linux fsync Command?
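The file-plus-directory case described in this section can be sketched as follows. Both helper names are hypothetical, and os.O_DIRECTORY is assumed to be available, as it is on Linux.

```python
import os
import tempfile


def fsync_directory(directory: str) -> None:
    """fsync() a directory so its entries (names) are flushed to storage."""
    fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def create_file_durably(path: str, data: bytes) -> None:
    """Create a new file, then flush the file and its directory entry."""
    # "x" mode fails if the file already exists, so this is a true create.
    with open(path, "xb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # file data and file metadata
    # The new name lives in the parent directory, which is synced separately.
    fsync_directory(os.path.dirname(os.path.abspath(path)))
```

The second step is the one that is easy to forget: the file's contents can be safely on disk while the directory entry pointing to it is not.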
Performance vs Durability
The reason fdatasync() exists is performance.
Flushing data to persistent storage is expensive compared to writing into memory. If an application calls fsync() or fdatasync() too often, performance can drop sharply, especially on slower disks or busy filesystems.
But avoiding sync operations completely can risk data loss during crashes.
That is the tradeoff:
Performance wants fewer forced flushes.
Durability wants more intentional flushes.
Good system design sits in the middle.
You do not want to blindly sync everything all the time, but you also do not want important data sitting only in memory when the system goes down.
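One way to sit in that middle is to batch: append several records, then flush them with a single fdatasync(). The BatchedLog class and its sync_every parameter below are hypothetical, and the right batch size depends entirely on how much recent data an application can afford to lose.

```python
import os
import tempfile


class BatchedLog:
    """Hypothetical log writer that calls fdatasync() once per batch."""

    def __init__(self, path: str, sync_every: int = 100) -> None:
        self._file = open(path, "ab")
        self._sync_every = sync_every
        self._pending = 0      # records written since the last sync
        self.sync_calls = 0    # counter kept only for illustration

    def append(self, line: bytes) -> None:
        self._file.write(line + b"\n")
        self._pending += 1
        if self._pending >= self._sync_every:
            self.sync()

    def sync(self) -> None:
        if self._pending == 0:
            return
        self._file.flush()                  # userspace buffer to kernel
        os.fdatasync(self._file.fileno())   # kernel cache to storage
        self.sync_calls += 1
        self._pending = 0

    def close(self) -> None:
        self.sync()  # flush whatever is left in the final partial batch
        self._file.close()
```

With sync_every=100, a crash can lose up to 99 of the most recent records, in exchange for roughly one forced flush per hundred appends instead of one per append.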
A Simple Way To Think About fdatasync
Think of it like this:
sync says: “Flush pending writes broadly.”
fsync() says: “Flush this file’s data and metadata.”
fdatasync() says: “Flush this file’s data, plus only the metadata needed to read it correctly.”
That is the entire idea.
It is not flashy. It is not something most users will type every day. But it is a very important part of how Linux handles reliability, storage safety, and performance-sensitive workloads.
Why Linux Admins Should Understand fdatasync
Even if you never call fdatasync() directly in C code, understanding it helps you make better decisions as a Linux administrator.
It helps you understand why tools like dd, databases, backup utilities, and filesystems sometimes offer different sync modes.
It also helps you avoid assuming that “write completed” always means “data is safely on disk.”
That assumption can be dangerous.
If you are working with production systems, backup storage, removable media, VM images, database files, or critical configuration updates, these details matter.
This topic also pairs naturally with filesystem design. If you want to go deeper into how different filesystems behave, this article is a strong supporting read because filesystem behavior affects how data integrity, metadata, journaling, and reliability are handled in practice:
Ext4 vs Btrfs vs XFS vs ZFS: A Linux File System Comparison for Beginners
Final Thoughts
The Linux fdatasync command is really the Linux fdatasync() system call.
It exists for a simple but important reason: to make file data durable without always forcing every metadata change to storage.
Compared with fsync(), it can reduce disk activity.
Compared with sync, it is more targeted.
Compared with doing nothing, it gives applications a stronger durability point.
For most users, fdatasync will appear indirectly through tools like:
sync -d FILE
or:
dd conv=fdatasync
But the concept underneath is worth understanding properly.
Linux is fast because it caches aggressively. fdatasync() is one of the mechanisms that helps bring order, safety, and predictability back into that performance-driven world.
Small detail.
Big reliability impact.
Call To Action
Have you ever copied a file, written an image, or updated a critical config and assumed the data was already safely committed to disk?
Understanding sync, fsync(), and fdatasync() can save you from painful surprises later.
Share your thoughts in the comments below, and follow Eagle Eye Technology for more practical Linux, cybersecurity, and infrastructure content built for real-world operators.
And remember:
The Singularity is always watching.