Data compliance means adhering to the laws, regulations, standards, and internal policies that govern data use. Organizations must comply with regulations such as the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), and the California Consumer Privacy Act (CCPA), as well as SOC 2 standards, to protect sensitive information and maintain trust. Data compliance plays […] The post How lakeFS Helps Ensure Data Compliance appeared first on Git for Data -...| Git for Data – lakeFS
lakeFS Enterprise offers a fully standards-compliant implementation of the Apache Iceberg REST Catalog, enabling Git-style version control for structured data at scale. This integration allows teams to use Iceberg-compatible tools like Spark, Trino, and PyIceberg without any vendor lock-in or proprietary formats. By treating Iceberg tables as versioned entities within lakeFS repositories and branches, users […] The post Versioned Data with Apache Iceberg Using lakeFS Iceberg REST Catalog ap...| Git for Data – lakeFS
Yesterday, OpenAI launched gpt-oss-120b and gpt-oss-20b, marking the company’s first open-weight language models since GPT-2 in 2019. This strategic shift represents far more than a product release: it signals a fundamental transformation in how large organizations, particularly in regulated industries, approach AI infrastructure and data management. OpenAI’s Strategic Return to Open Source: The gpt-oss models (gpt-oss-120b and gpt-oss-20b) are […] The post OpenAI’s Open Source Revolution: W...| Git for Data – lakeFS
A behind-the-scenes look at the design decisions, architecture, and lessons learned while bringing the Apache Iceberg REST Catalog to lakeFS. When we first announced our native lakeFS Iceberg REST Catalog, we focused on what it means for data teams: seamless, Git-like version control for structured and unstructured data, at any scale. But how did we […] The post How We Built Our lakeFS Iceberg Catalog appeared first on Git for Data - lakeFS.| Git for Data – lakeFS
Learn about our vision for closing the AI data infrastructure gap, and how we are using our funding round to promote enterprise data version control best practices. Read on to learn more.| Git for Data - lakeFS
Open source software has fundamentally reshaped technology—delivering unmatched flexibility, low friction, and rapid innovation. For some teams, it’s a philosophical commitment. For others, it’s the fastest path to building. lakeFS supports both models. For most data teams, the journey starts with open source and evolves over time. lakeFS open source offers a robust foundation for […] The post The Evolving Equation: When Do You Move From Open Source to Enterprise with Data Version Con...| Git for Data – lakeFS
An AI Factory with data versioning doesn't just run more smoothly. It fundamentally changes how teams interact with their data. Read more.| Git for Data - lakeFS
Learn how to build a solid AI infrastructure for efficiently developing and deploying AI and machine learning (ML) applications. Read more.| Git for Data - lakeFS
ML reproducibility rests on a disciplined approach to managing input data, code, and execution environments. Read more.| Git for Data - lakeFS