
Innovative Database Column Expansion and Automation for Scalable Systems

Database architecture is a sophisticated discipline that demands deep expertise, strategic insight, and flexibility. Crafting an optimal solution requires careful assessment of the system’s current state, alignment with evolving business needs, and a balance among security, performance, and budget constraints. In my role as a Database Architect at a leading brewery in the United States (US), I undertook a highly complex project focused on expanding database columns across all relational databases and integrated systems. This initiative required meticulous planning and adaptation to ensure seamless integration and performance across the entire database infrastructure.

The need for expansion

The brewery system, spanning nearly 25 enterprise-wide production relational databases (Oracle), supports 128 mission-critical applications developed in a variety of technologies. Originally designed almost two decades ago, the system includes a pivotal database column, “Product_Id”, defined with a 2-byte character length. However, as business demands and product lines expanded, this limit became inadequate to accommodate new product launches. The Database Column Expansion Project was therefore initiated to expand the Product_Id field from 2 bytes to 4 bytes, a change essential for supporting future product introductions. While this may seem straightforward, the expansion presented intricate challenges across the enterprise.
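
In isolation, the core schema change is a simple piece of DDL; the sketch below is purely illustrative (the table and column definitions are assumptions, not the brewery’s actual schema), and the real challenge lay in everything surrounding it:

    -- Illustrative only: widen a legacy 2-byte product code column to 4 bytes
    ALTER TABLE order_line MODIFY (product_id VARCHAR2(4 BYTE));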

With inconsistent naming conventions across tables, identifying every instance of the impacted Product_Id column was a complex and intensive task. 

Planning for a challenging staged remediation with minimal downtime

The associated 128 applications, developed in various programming languages, included extensive custom logic and hardcoded dependencies on the Product_Id length. To ensure stability and future-proofing, each application had to be carefully refactored to adopt reusable coding practices, removing dependencies on fixed column lengths. This was particularly challenging, as not all applications could be remediated simultaneously, necessitating a phased and carefully coordinated approach.

Given that the brewery system is integral to enterprise operations, downtime had to be limited to under 30 minutes to prevent significant revenue loss. Additionally, any modifications made had to ensure that the system’s performance was not compromised, even under the high transaction volume that the brewery system handles.

The brewery system processes a substantial volume of transactions across its databases, demanding rigorous planning and efficient execution to avoid disruptions during the column expansion. Therefore, this project required not only technical proficiency, but also an intricate understanding of cross-system dependencies, proactive impact assessment, and phased implementation planning to maintain operational stability and performance while supporting future growth.

Impact analysis and discovery

Facilitating this work required an in-depth impact analysis, conducted in two structured steps:

  1. Identification of impacted columns and associated objects across all databases
  2. Identification of impacted columns within the application source code.

To streamline this analysis, a list of probable variations of the column name (Product_Id) was developed through discussions with senior architects and business users. Given the vast scope, manually inspecting each table and related database object (such as views, materialized views, indexes, triggers, stored procedures, and packages) was impractical. Therefore, a custom utility was created using Oracle PL/SQL, leveraging stored procedures and packages. This utility performed a comprehensive scan across the data dictionary views and the source code of other database objects to produce a preliminary list of potentially impacted columns, narrowing down the scope for more detailed analysis.
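
A minimal sketch of the kind of dictionary scan the utility performed is shown below; ALL_TAB_COLUMNS and ALL_SOURCE are standard Oracle data dictionary views, while the name patterns are assumed examples of the variations gathered from the architects:

    -- Candidate columns whose names match the agreed Product_Id variations
    SELECT owner, table_name, column_name, data_type, data_length
      FROM all_tab_columns
     WHERE UPPER(column_name) IN ('PRODUCT_ID', 'PROD_ID', 'PRODUCTID')   -- assumed variations
       AND data_length = 2;

    -- Stored code (procedures, packages, triggers) referencing those names
    SELECT owner, name, type, line, text
      FROM all_source
     WHERE REGEXP_LIKE(text, '(PRODUCT|PROD)_?ID', 'i');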

Additionally, the utility was enhanced to gather unique data values from the identified columns and validate them against standard values expected for the impacted column (Product_Id). This required the dynamic execution of SQL statements and complex logic, all implemented in Oracle PL/SQL stored procedures and packages, to accurately identify and verify the impacted columns.
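
The validation logic can be sketched roughly as follows; the candidate_columns staging table and product_id_reference list are hypothetical names standing in for the utility’s internal structures:

    -- Simplified sketch: sample distinct values from each candidate column and
    -- flag columns whose values match the known Product_Id domain.
    DECLARE
      l_cnt NUMBER;
    BEGIN
      FOR c IN (SELECT owner, table_name, column_name
                  FROM candidate_columns) LOOP          -- hypothetical staging table
        EXECUTE IMMEDIATE
          'SELECT COUNT(DISTINCT ' || c.column_name || ')'
          || ' FROM ' || c.owner || '.' || c.table_name
          || ' WHERE ' || c.column_name
          || ' IN (SELECT product_code FROM product_id_reference)'   -- hypothetical reference list
          INTO l_cnt;
        IF l_cnt > 0 THEN
          UPDATE candidate_columns
             SET confirmed_flag = 'Y'
           WHERE owner = c.owner
             AND table_name = c.table_name
             AND column_name = c.column_name;
        END IF;
      END LOOP;
      COMMIT;
    END;
    /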

This pluggable impact analysis utility generated detailed reports, listing affected tables and other database objects, as well as frequently executed SQL statements that used the impacted columns. Ultimately, the analysis identified over 3,000 impacted database objects. A similar impact assessment was carried out within the application layer to ensure comprehensive coverage.

Designing the novel database solution

The solution became increasingly challenging due to the need for remediated applications to read and write the expanded Product_Id column while non-remediated applications continued to interact with the original 2-byte Product_Id column. To address this, I proposed and successfully implemented a novel database solution, which was reviewed and approved by the architecture board at the brewery. The solution followed a three-phase structure: a Foundation Phase, a Conversion Phase, and an End Stage. It relied on the power of database views, “Instead of” triggers, synonyms, user IDs, and object access permissions. Each phase encompassed the following steps:

Phase 1 – Foundation Phase

  • Created an individual database view object for each impacted table across all databases
  • Developed “Instead of” triggers for each newly created view to handle DML operations
  • Switched the existing synonyms (or created new ones) to point to the newly created views, replacing references to the impacted tables
  • Replicated the permissions granted to the impacted tables, applying them to the corresponding database view objects (a sketch of these Phase 1 objects follows this list).
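
For a single impacted table, the Phase 1 objects look roughly like the sketch below; all schema, object, and column names here are assumptions used for illustration. At this stage the view is a straight pass-through, since the underlying column has not yet been widened:

    -- Pass-through view over the impacted table
    CREATE OR REPLACE VIEW order_line_v AS
      SELECT order_no, product_id, qty
        FROM order_line;

    -- "Instead of" trigger so DML against the view lands in the base table
    CREATE OR REPLACE TRIGGER order_line_v_ioiud
      INSTEAD OF INSERT OR UPDATE OR DELETE ON order_line_v
      FOR EACH ROW
    BEGIN
      IF INSERTING THEN
        INSERT INTO order_line (order_no, product_id, qty)
        VALUES (:NEW.order_no, :NEW.product_id, :NEW.qty);
      ELSIF UPDATING THEN
        UPDATE order_line
           SET product_id = :NEW.product_id, qty = :NEW.qty
         WHERE order_no = :OLD.order_no;
      ELSE
        DELETE FROM order_line
         WHERE order_no = :OLD.order_no;
      END IF;
    END;
    /

    -- Repoint the application synonym at the view and replicate the table grants
    CREATE OR REPLACE SYNONYM app_user.order_line FOR owner_schema.order_line_v;
    GRANT SELECT, INSERT, UPDATE, DELETE ON order_line_v TO app_user;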

Phase 2 – Conversion Phase

  • Remediated the impacted tables by expanding the database columns to 4 bytes
  • Created a lookup table to map between the 2-byte and 4-byte Product_Id column values, used when launching new products via the remediated applications
  • The views created for each impacted table were designed to return the 2-byte value to non-remediated applications, referencing the lookup table, while writing the expanded values to the impacted tables via the “Instead of” trigger during updates or inserts (a sketch follows this list).
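
The Phase 2 shape for the same illustrative table might look like the following sketch; again, every object name and column definition here is an assumption, and delete handling is omitted for brevity:

    -- Widen the base column and create the mapping between old and new codes
    ALTER TABLE order_line MODIFY (product_id VARCHAR2(4));

    CREATE TABLE product_id_map (
      legacy_id VARCHAR2(2) NOT NULL,   -- value still seen by non-remediated applications
      new_id    VARCHAR2(4) NOT NULL,   -- value stored in the expanded column
      CONSTRAINT product_id_map_pk PRIMARY KEY (new_id)
    );

    -- The view now presents the legacy 2-byte value to non-remediated applications
    CREATE OR REPLACE VIEW order_line_v AS
      SELECT ol.order_no,
             COALESCE(m.legacy_id, ol.product_id) AS product_id,
             ol.qty
        FROM order_line ol
        LEFT JOIN product_id_map m ON m.new_id = ol.product_id;

    -- The "Instead of" trigger translates legacy values to the expanded form on write
    CREATE OR REPLACE TRIGGER order_line_v_ioiud
      INSTEAD OF INSERT OR UPDATE ON order_line_v
      FOR EACH ROW
    DECLARE
      l_new_id order_line.product_id%TYPE;
    BEGIN
      SELECT NVL(MAX(new_id), :NEW.product_id)
        INTO l_new_id
        FROM product_id_map
       WHERE legacy_id = :NEW.product_id;
      IF INSERTING THEN
        INSERT INTO order_line (order_no, product_id, qty)
        VALUES (:NEW.order_no, l_new_id, :NEW.qty);
      ELSE
        UPDATE order_line
           SET product_id = l_new_id, qty = :NEW.qty
         WHERE order_no = :OLD.order_no;
      END IF;
    END;
    /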

Phase 3 – End Stage

  • Remediated applications now directly reference the expanded 4-byte column in the original impacted tables.

Core Concept:

  • Remediated applications use the original impacted tables directly, accessing the 4-byte Product_Id column.
  • Non-remediated applications use synonyms pointing to the views, which present the data as if it were still 2 bytes. The synonyms act as the bridge between the two approaches, enabling both types of applications to function simultaneously without any awareness of the underlying changes (an illustrative synonym setup follows).
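
A minimal illustration of that bridging, with hypothetical schema and account names:

    -- A non-remediated application account resolves the logical name to the view,
    -- while a remediated application account resolves it to the base table.
    CREATE OR REPLACE SYNONYM legacy_app.order_line FOR owner_schema.order_line_v;
    CREATE OR REPLACE SYNONYM new_app.order_line    FOR owner_schema.order_line;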

This solution ensured zero downtime and enabled both remediated and non-remediated applications to seamlessly benefit from the Product_Id expansion—an outstanding achievement in both design and execution.

In developing this solution, it was necessary to resolve an additional design consideration: should we create a separate table and copy the existing data while maintaining synchronization for incremental updates? In this case, the answer was no. Given the project’s scale of more than 3,000 tables and numerous dependent database objects, this approach was not viable: duplicating data would unnecessarily increase storage costs, and the solution’s complexity would grow significantly. Additionally, procuring and maintaining replication software and continuous data synchronization would add substantial overhead. The approved design avoids these pitfalls by using a more streamlined and efficient approach with views and synonyms to ensure the smooth operation of both remediated and non-remediated applications.

Implementation and automation

Developing the source code for each database view object, “Instead of” trigger, and the other associated components would have required several hours per day, and with more than 3,000 impacted tables, it was not feasible to rely solely on manual effort. To address this, I developed a robust automation framework using Oracle PL/SQL and UNIX shell scripting, leveraging comprehensive stored procedures and packages to automate the creation and compilation of more than 3,000 database components: views, triggers, synonyms, and object permissions. This automation accelerated the development process and ensured the accurate creation and testing of database objects.
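
The generator can be sketched roughly as below; the impacted_tables driver table and the _V naming convention are assumptions, and the real framework also generated the triggers, synonyms, and grants and logged every statement it executed:

    DECLARE
      l_ddl VARCHAR2(32767);
    BEGIN
      FOR t IN (SELECT owner, table_name FROM impacted_tables) LOOP   -- hypothetical driver table
        -- Build a pass-through view over each impacted table from the data dictionary
        SELECT 'CREATE OR REPLACE VIEW ' || t.owner || '.' || t.table_name || '_V AS SELECT '
               || LISTAGG(column_name, ', ') WITHIN GROUP (ORDER BY column_id)
               || ' FROM ' || t.owner || '.' || t.table_name
          INTO l_ddl
          FROM all_tab_columns
         WHERE owner = t.owner
           AND table_name = t.table_name;
        EXECUTE IMMEDIATE l_ddl;
      END LOOP;
    END;
    /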

The framework was further enhanced to automate database testing, verifying the existence of newly created components and confirming that both remediated and non-remediated data was correctly retrieved based on application user IDs. Automating the creation and compilation of more than 3,000 components saved the company approximately $475,000 in manual labor costs. This reusable, scalable framework not only improved the efficiency of the current project, but also provided a long-term solution for future database architecture challenges.
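
The existence checks ran as straightforward dictionary queries, along the lines of the sketch below (the driver table and _V suffix are the same assumptions as above):

    -- Impacted tables that are still missing their companion view
    SELECT i.owner, i.table_name
      FROM impacted_tables i
     WHERE NOT EXISTS (SELECT 1
                         FROM all_objects o
                        WHERE o.owner = i.owner
                          AND o.object_name = i.table_name || '_V'
                          AND o.object_type = 'VIEW');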

Performance Tuning

Post-rollout, the Oracle optimizer could choose different execution plans, potentially affecting performance, so it was critical to ensure that performance remained optimal. A comprehensive performance tuning initiative was launched, combining proactive and incremental tuning with continuous monitoring. We collected frequently used SQL queries from all applications across the production databases and generated AWR (Automatic Workload Repository) and ASH (Active Session History) reports to identify and analyze performance bottlenecks.
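
These reports can be pulled through Oracle's standard DBMS_WORKLOAD_REPOSITORY package; the database ID, snapshot IDs, and time window below are placeholders:

    -- Text AWR report between two snapshots on instance 1
    SELECT output
      FROM TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(:dbid, 1, :begin_snap, :end_snap));

    -- ASH report covering the last hour on instance 1
    SELECT output
      FROM TABLE(DBMS_WORKLOAD_REPOSITORY.ASH_REPORT_TEXT(:dbid, 1, SYSDATE - 1/24, SYSDATE));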

Several tuning practices were applied, including SQL query optimization; index creation and rebuilding for more effective index use; partition management for improved data handling; and performance hints to boost efficiency. Additionally, disk I/O and storage were optimized with careful partitioning and tablespace management, while wait events and locks were monitored to avoid delays. Background processes were fine-tuned for optimal log management and memory flushing. Furthermore, query rewriting and application code changes were implemented to enhance overall performance.
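
Representative examples of these actions, with illustrative object names only:

    -- Supporting index for the new legacy-to-expanded lookup pattern
    CREATE INDEX product_id_map_leg_idx ON product_id_map (legacy_id, new_id);

    -- A hint trialled on a hot query to steer the optimizer back to an index access path
    SELECT /*+ INDEX(ol order_line_prod_idx) */ ol.order_no, ol.qty
      FROM order_line ol
     WHERE ol.product_id = :product_id;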

The outcome of these efforts was remarkable, with a significant improvement in daily operations and system performance that far exceeded initial expectations.

Key results and impact

This project successfully delivered a groundbreaking solution for database column expansion across multiple systems, enabling seamless scalability for future product launches. Through the automation of database component creation and compilation, we achieved significant cost savings and enhanced operational efficiency. Extensive performance tuning was executed to dramatically optimize data processing speeds, ensuring smooth and rapid system performance.

Furthermore, a sustainable data migration framework was developed that benefitted the current system while also providing a model for future scalability, both within and outside the organization. This innovative approach allowed the client to expand their product range significantly, without disrupting their existing infrastructure.

With minimal downtime and zero defects, the new architecture has provided unparalleled flexibility and scalability, positioning the organization for continued growth and success. This project is a testament to the power of creative, high-impact solutions that drive long-term business value.


Written by Sanjay Puthenpariyarath: Sanjay Puthenpariyarath is recognized as an expert data engineer for his original contributions and success in designing and implementing scalable data architecture solutions for Fortune 500 companies in the banking, telecom, and e-commerce industries. For nearly two decades, he has specialized in big data processing, data pipeline development, cloud data engineering, data migration, and database performance tuning, using cutting-edge technologies that have enabled him to optimize data workflows and achieve significant improvements in financial and operational outcomes. Sanjay received a Bachelor of Engineering degree in Electronics and Communication Engineering from Anna University, India, and earned a Master of Science degree in Information Technology from the University of Massachusetts Lowell (US). As a senior leader, he enjoys mentoring data engineers, promoting data-driven organizational cultures, and delivering complex projects on time and within budget.
