The Community repo stores all information about the openEuler Community, including governance, SIGs (project teams), communications, etc.
Packaging and testing of the Apache Hadoop ecosystem
Velox is a composable execution engine distributed as an open source C++ library. It provides reusable, extensible, and high-performance data processing components that can be (re-)used to build data management systems focused on different analytical workloads, including batch, interactive, stream processing, and AI/ML.
A Spark SQL execution engine with vectorization optimization, which is used to replace the original execution engine of Spark SQL and provides higher performance.
This repository contains common information and tools for bigdata.
Bigtop-manager provides a modern, low-threshold web application to simplify the deployment and management of components for Bigtop, similar to Apache Ambari and Cloudera Manager.
Apache Iceberg is a new table format for storing large, slow-moving tabular data.
Delta Lake is an open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive, and with APIs for Scala, Java, Rust, Ruby, and Python.
Apache Druid is a real-time database to power modern analytics applications.
A unified analytics engine for large-scale data processing
Alluxio (formerly known as Tachyon) is a virtual distributed storage system.
The Apache Hive (TM) data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.
A software platform for processing vast amounts of data
Information summary and discussion platform for CloudNative SIG
A high-performance service for building distributed applications
Apache Calcite is a dynamic data management framework.
Apache Ambari is a tool for provisioning, managing, and monitoring Apache Hadoop clusters.
A free and open source distributed realtime computation system