Databricks launches open-source project to drain all your data swamps into info lakes

https://www.theregister.co.uk/2019/04/24/databricks_open_source_project/

American startup Databricks, established by the original authors of the Apache Spark framework, has launched an open source project designed to solve the reliability issues plaguing data swamps – those huge (cess)pools of raw corporate data that are supposed to deliver value from analytics.

The Delta Lake project sits on top of an existing data lake and requires no change to the underlying architecture. It is compatible with both batch and streaming data, can check data quality and schema, and doesn't allow broken datasets to mess with downstream algorithms.
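
To make the "no broken datasets" idea concrete, here is a minimal sketch in plain Python of schema checking on write. It is not Delta Lake's API: the `Table` class, its `append` method and the `SchemaMismatch` exception are hypothetical names invented for illustration, and the only assumption is that a batch is validated in full before any of it is committed.

```python
from dataclasses import dataclass, field


class SchemaMismatch(Exception):
    """Raised when a batch does not match the table schema."""


@dataclass
class Table:
    # Hypothetical stand-in for a lake table: column name -> expected Python type.
    schema: dict
    rows: list = field(default_factory=list)

    def append(self, batch):
        """Validate the whole batch first, then commit all of it or none of it."""
        for i, row in enumerate(batch):
            if set(row) != set(self.schema):
                raise SchemaMismatch(
                    f"row {i}: columns {sorted(row)} do not match schema"
                )
            for col, expected in self.schema.items():
                if not isinstance(row[col], expected):
                    raise SchemaMismatch(
                        f"row {i}: {col!r} is not {expected.__name__}"
                    )
        self.rows.extend(batch)  # only reached when every row passed


events = Table(schema={"user_id": int, "action": str})
events.append([{"user_id": 1, "action": "login"}])

try:
    # The second row is malformed, so the whole batch is rejected.
    events.append([
        {"user_id": 2, "action": "click"},
        {"user_id": "oops", "action": "click"},
    ])
except SchemaMismatch as err:
    rejected = str(err)
```

Because validation happens before the write, the bad batch leaves the table exactly as it was, and anything reading from it downstream never sees the malformed row.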