Data Engineering
News
- 2022-10-24 : The Evolution of The Data Engineer: A Look at The Past, Present & Future | Airbyte; via HN
- 2020-12-14 : What Is Data Engineering and Is It Right for You? – Real Python
- 2024-02-07 : A tale of three data platforms – The convergent evolution of BigQuery, Snowflake and Databricks demonstrates a broader tendency in the development of cloud platforms
Software and Libraries
- Weld is a runtime for improving the performance of data-intensive applications. It optimizes across libraries and functions by expressing the core computations in libraries using a small common intermediate representation, similar to CUDA and OpenCL.
- Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations.
- frankmcsherry/timely-dataflow: A modular implementation of timely dataflow in Rust
- DataFusion: Big Data Platform for Rust
Articles
Books
- 📖 Data Engineering Design Patterns (DEDP); WIP; Dec 2023. The HN discussion of how the definition of a data engineer has changed is an interesting one.
- Data Engineering With Rust
- Data Engineering Cookbook
Courses
- Data Engineering Concentration - Data Science, MS | University of San Francisco