Next-Generation Petabyte-Scale Backup and Archive Platform

About the Client

The client is a leading US software and high-tech company that provides scalable, high-performance storage solutions for big data applications and cloud environments.


Company’s Request

The client required a solution capable of delivering high-performance block and object storage services with limitless scalability to support the next generation of OpenStack clouds, petabyte-scale active archives, and big data applications. The solution needed to be built from the ground up, leveraging industry-standard Linux servers and incorporating innovative technologies such as IP and Cloud Copy On Write (CCOW) to achieve reliability, functionality, and cost efficiencies.

Technology Set

CentOS, Ubuntu 14, RHEL 7.0
Chosen for stability and enterprise support, these Linux distributions provide a reliable base for high-performance applications and suit the data-intensive operations of backup and archive systems.
C and C++
Provide the performance essential for system operations and data processing.
Selected for its ability to handle asynchronous tasks and network requests efficiently, enhancing backend interaction capabilities.
.deb and .rpm
These packages allow for straightforward software distribution and installation across different Linux environments, providing easy deployment and consistent operation across platforms.
Cloud Copy On Write (CCOW)
Integrated for its innovative data management that improves data integrity and efficiency. CCOW enables immediate data duplication and versioning for enhanced recovery and high availability.
Jenkins CI
Jenkins CI automates builds and test runs, so feedback and updates are integrated continuously and the deployment cycle shortens. The automated tests confirm that the system behaves consistently and reliably.

We engineered the core functionality for Cloud Copy On Write, a forward-thinking technology currently pending patent approval, designed to ensure safe and efficient data management.

CCOW writes new data to new storage locations instead of overwriting existing data, and updates the metadata pointers only afterwards. Because the previous version stays intact until the pointers change, the risk of data corruption drops and data recovery becomes quicker and more reliable.
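
The write-then-repoint idea can be illustrated with a minimal Python sketch. It is not the CCOW implementation: the `CowStore` class, its in-memory list of blocks, and the per-object version list are all assumptions made for illustration.

```python
class CowStore:
    """Toy copy-on-write store (hypothetical): payloads are never overwritten."""

    def __init__(self):
        self.blocks = []    # append-only storage; an index is a "location"
        self.pointers = {}  # object name -> list of locations, oldest first

    def put(self, name, data):
        # Step 1: write the new payload to a new location.
        self.blocks.append(bytes(data))
        location = len(self.blocks) - 1
        # Step 2: only after the write, update the metadata pointer.
        self.pointers.setdefault(name, []).append(location)
        return location

    def get(self, name, version=-1):
        # version=-1 reads the latest pointer; 0 reads the oldest one.
        return self.blocks[self.pointers[name][version]]


store = CowStore()
store.put("report", b"first draft")
store.put("report", b"second draft")
latest = store.get("report")       # newest version
original = store.get("report", 0)  # earlier version is still readable
```

Because an update never touches the old block, a failure between the two steps leaves the previous version fully readable, which is the property the paragraph above relies on.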

CCOW was a prerequisite for a solution that handles and scales to petabyte-sized data volumes for clients managing large-scale data operations.

A significant challenge was developing a system that could efficiently manage petabyte-scale data volumes without performance degradation. To overcome this, we optimized our data handling algorithms for high efficiency and implemented a distributed architecture. This architecture distributes the data across multiple server nodes, balancing the load and enhancing data retrieval and backup speeds by leveraging parallel processing techniques.
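
As a rough illustration of spreading data across nodes and reading it back in parallel, consider the sketch below. The node names, the hash-modulo placement rule, and the `fetch` stand-in are assumptions; the case study does not describe the actual placement algorithm.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def node_for(chunk_id, nodes=NODES):
    """Pick a node deterministically from a hash of the chunk id."""
    digest = hashlib.sha256(chunk_id.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def fetch(chunk_id):
    # Stand-in for a network read: reports which node would serve the chunk.
    return chunk_id, node_for(chunk_id)


def fetch_all(chunk_ids):
    """Issue the reads concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        return list(pool.map(fetch, chunk_ids))


placements = fetch_all([f"chunk-{i}" for i in range(12)])
```

Hash-modulo placement is the simplest possible rule and remaps most chunks when the node count changes; systems that add and remove nodes frequently commonly use consistent hashing instead.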

To provide reliability, we developed an extensive suite of automated tests covering a range of functionalities from basic data operations to complex disaster recovery scenarios. Automating these tests enabled consistent evaluation of the system under diverse conditions. We integrated these tests with Jenkins Continuous Integration (CI) to automate their execution at various stages of the development lifecycle. This strategy improved development productivity and enhanced the quality of the code by allowing early detection and resolution of issues. The Jenkins CI pipeline was configured to provide real-time feedback on the health of the codebase, enabling quick adjustments.
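
A test of the kind described above might look like the following sketch, written with Python's standard `unittest` module. The `backup` and `restore` functions are toy stand-ins invented for this example, not the platform's API; a CI job could run such a file with `python -m unittest`.

```python
import unittest


def backup(data, replicas=2):
    """Toy backup (hypothetical): keep `replicas` copies of the payload."""
    return [bytes(data) for _ in range(replicas)]


def restore(copies):
    """Return the first surviving copy; None marks a lost replica."""
    for copy in copies:
        if copy is not None:
            return copy
    raise RuntimeError("no surviving replica")


class BackupRestoreTest(unittest.TestCase):
    def test_round_trip(self):
        # Basic data operation: what goes in comes back out.
        self.assertEqual(restore(backup(b"payload")), b"payload")

    def test_restore_after_losing_one_replica(self):
        # Disaster recovery scenario: one node fails, data survives.
        copies = backup(b"payload")
        copies[0] = None
        self.assertEqual(restore(copies), b"payload")

    def test_total_loss_is_reported(self):
        with self.assertRaises(RuntimeError):
            restore([None, None])


def run_suite():
    """Run the suite programmatically and report overall success."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BackupRestoreTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```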

During the project, our team integrated the solution with OpenStack, Linux, VMware, and Windows so that it works in diverse IT environments. This integration involved developing custom adapters and APIs for compatibility and optimized performance across these operating systems and platforms.

We also fortified our solution with advanced encryption standards and comprehensive access controls in response to increasing data security concerns and stringent compliance mandates. We used AES-256 to encrypt all data at rest and TLS 1.2 with mutual authentication to secure data in transit.
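
For the in-transit side, a server that requires TLS 1.2 or newer and a client certificate can be configured with Python's standard `ssl` module roughly as follows. This is a sketch under assumptions: the function name and its parameters are hypothetical, and the file paths are left to the caller because the case study does not describe the certificate setup.

```python
import ssl


def make_server_context(certfile=None, keyfile=None, client_ca=None):
    """Build a server-side TLS context that demands client certificates."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Mutual authentication: the client must present a valid certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        # The server's own certificate and private key.
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if client_ca:
        # The CA bundle used to verify client certificates.
        ctx.load_verify_locations(cafile=client_ca)
    return ctx


context = make_server_context()  # paths omitted in this sketch
```

Setting `ctx.maximum_version = ssl.TLSVersion.TLSv1_2` as well would pin connections to TLS 1.2 exactly rather than allowing newer versions.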

Value Delivered

Data Security and Efficiency
We implemented data protection technologies that safeguard large data sets. They secure valuable business data against breaches and loss, and they optimize storage resources by reducing unnecessary data duplication and improving retrieval times.
Enhanced System Reliability
The integration of Cloud Copy On Write (CCOW) technology alongside stringent testing protocols has improved the reliability of our client's storage systems, making them well-suited for important data backup and archival operations.
Reduced TCO
Our solution minimizes manual intervention by automating key aspects of the deployment and testing phases across various platforms. This simplifies operations and lowers the total cost of ownership by cutting labor costs and downtime.
Improved Regulatory Compliance
The solution is built to comply with the latest data protection regulations, which is essential for businesses operating in regulated industries. This helped our client avoid the hefty fines and reputational damage associated with non-compliance.