Managing Large Files
A guide to managing large files in Outpost Hub repositories.
Git is optimized for tracking changes in text files, which makes it inefficient at handling large binary files. Changes to binary files rarely reduce to small deltas, so each modification typically adds close to a complete new copy of the file to the repository history, and repository size grows quickly. This slows down operations like cloning, fetching, and pulling, because a full clone downloads the entire history of the repository, including all previous versions of large files.
To address these limitations and maintain performance and reliability, Outpost restricts the size of files you can track in regular Git repositories. Files larger than 10 MiB are blocked. If you try to add or update a file exceeding this size, you will receive a warning or error from Git.
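To catch oversized files before Git rejects them, you can scan the working tree locally. The following is a minimal sketch, not an Outpost tool: the function name `find_oversized_files` and the constant `MAX_FILE_SIZE_BYTES` are illustrative, and the only value taken from this guide is the 10 MiB limit.

```python
import os

# Limit stated in this guide: files larger than 10 MiB are blocked.
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024


def find_oversized_files(root, limit=MAX_FILE_SIZE_BYTES):
    """Return sorted (path, size_in_bytes) pairs for files under `root` larger than `limit`."""
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's internal directory; it is not part of the working tree.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Ignore symbolic links; only regular files are measured.
                continue
            size = os.path.getsize(path)
            if size > limit:
                oversized.append((path, size))
    return sorted(oversized)
```

Calling `find_oversized_files(".")` from the repository root returns the files that would exceed the limit, which you can then move to DVC before committing.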
To manage files larger than 10 MiB, use Data Version Control (DVC). For more details, refer to "DVC."
We recommend keeping repositories small, ideally under 1 GB. Smaller repositories are faster to clone and easier to maintain.
To prevent repositories from becoming too large due to external dependencies, use a package manager. Popular options include Bundler, npm (the Node.js package manager), and Maven. These tools download dependencies when they are needed, so you do not have to commit them directly to your Git repositories.
To handle large files efficiently in a Git environment, use Data Version Control (DVC). DVC is designed for versioning large datasets and machine learning models. It replaces large files in the repository with small reference files that point to the actual files stored on a separate server. This keeps the repository size manageable and speeds up operations, because only the versions of large files you actually need are downloaded.
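The reference-file idea can be illustrated with a short conceptual sketch. This is not DVC's implementation or file format: the functions `track_large_file` and `restore_large_file`, the `.ref.json` suffix, and the use of a local directory as the "separate server" are all assumptions made for illustration.

```python
import hashlib
import json
import os
import shutil


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track_large_file(path, store_dir):
    """Copy `path` into `store_dir` under its hash and write a small reference file."""
    checksum = file_sha256(path)
    os.makedirs(store_dir, exist_ok=True)
    shutil.copyfile(path, os.path.join(store_dir, checksum))
    pointer = {
        "sha256": checksum,
        "size": os.path.getsize(path),
        "path": os.path.basename(path),
    }
    pointer_path = path + ".ref.json"  # hypothetical naming, not DVC's
    with open(pointer_path, "w") as f:
        json.dump(pointer, f, indent=2)
    return pointer_path


def restore_large_file(pointer_path, store_dir):
    """Recreate the large file next to its reference file from the store."""
    with open(pointer_path) as f:
        pointer = json.load(f)
    target = os.path.join(os.path.dirname(pointer_path), pointer["path"])
    shutil.copyfile(os.path.join(store_dir, pointer["sha256"]), target)
    return target
```

In this sketch, only the small `.ref.json` file would be committed to Git, while the large file lives in the store and is fetched on demand. Because the stored copy is named by its content hash, each version of the large file gets its own entry, and a given commit's reference file identifies exactly which version to download.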
How to Use Data Version Control (DVC) with Outpost?