The role of TDP is to provide a trusted and reliable platform to store and process a large amount of data. The feature selection, the choice of components, and the recommended usages are driven by the use-cases where the Hadoop ecosystem excels. The platform provides a robust core that is meant to be extended with complementary open-source and proprietary solutions for additional value.

Common usages of TDP


Secure and reliable storage expandable to multiple petabytes.

Data ingestion

Scalable platform to handle large volumes of ingested data, in all possible forms, coming in batch and streaming

Data refinery

Data engineering with distributed processing engines including Spark and MapReduce, data cleaning, and data publication.

Data lake

Management for structured, unstructured, and semi-structured data with enterprise-level governance.

Data mining and data science

Ad-hoc exploration, web UI for data access and visualization.


Storage of medium to cold data while preserving processing possibilities.