With the introduction of Teradata Vantage, the platform for pervasive data intelligence, the analytic opportunity has become more powerful and more integrated. Vantage will expand the community of business users and data scientists who can analyze heterogeneous data to get answers that drive business outcomes.
A key to tapping into all this opportunity is to have an environment where constant change is not only encouraged, but accelerated. This means that companies can no longer afford to have systems and databases managed “after the fact” or only managed when someone complains that it is not working correctly.
This is the dawn of incorporating Artificial Intelligence and Machine Learning to optimize the platform running the analytics. The future of Vantage -- which Teradata is working on now -- will make it a true autonomous platform that will take your business to new heights.
Automation vs. Autonomous
It is first helpful to understand the goal state. We are targeting an autonomous platform, not an automated platform, and that is a big difference. An autonomous platform is aware of all the operations needed not only to maintain a system but also to make that system predictive and self-regulating against those predictions. Let’s take the self-driving car as a quick example.
One can easily automate speed (i.e. cruise control), lane control, braking, and comfort features like lighting or seat preferences. These are tasks that drivers do on a routine basis and car manufacturers automated those tasks.
But an autonomous car is one that is self-driving. The driver tells it where to go and the car manages the whole trip. Better yet, the car learns that every morning at 8 AM the driver (well, now passenger!) needs to be driven to work so perhaps at 7:55 AM the car is warmed up and the cabin is set to a comfortable temperature all ready to go.
It is in this subtle but powerful difference that AI and Machine Learning come into play. The autonomous platform will take all relevant data into account, understand the environment that must be ready for coming workloads, and proactively tune itself to meet those needs.
A better way to work…
Here is a quick example of the future autonomous platform at work:
Let’s take a package delivery company that is experiencing delays due to extreme weather conditions. It needs a quick turnaround on route optimization analytics, and those analytics must include weather and traffic data that are not currently in its systems.
Using the ETL.ai tool, the raw CSV data is quickly interrogated, and tables are created, ready for loading. As this data is being accessed, the index.ai and stats.ai tools evaluate the queries and create new indexing and collect relevant statistics to improve run times.
The SLG.ai tool sees that these new workloads are creating demand that causes issues between 4 PM and 5 PM, and it automatically increases or realigns resources ahead of time to ensure business users can meet their goals.
In short, what used to take weeks to build and then continually manage has been reduced to hours, and the ongoing operationalization and management has been taken over by the platform directly.
You have to start with the basics….
As I read all the articles about autonomous work for databases and how all the work of the DBA is going away, I have to smile a bit, nod my head, and think “welcome to the club”. The Teradata database has led the way in self-managed systems since our beginnings. The key was understanding not just what needed to be automated but, more importantly, what could be eliminated. The real first step in getting to automation, and more importantly, to becoming autonomous, is to get rid of steps that should not need to happen in the first place.
Teradata took a completely different approach to data and file system management and removed the need to worry about data block placement, data order, free space management, etc. The point was not to automate those tasks; it was to eliminate them.
Then you add in intelligence….
With the foundational infrastructure in place, and with the advances in CPU power and machine learning techniques, now is the time to add intelligence and make the platform autonomous. An autonomous platform is not just about automating simplistic DBA operations; it is about making the environment self-aware and adaptive, so that it anticipates the next need and adjusts itself before the need occurs. The beauty of this under the Teradata database is that once the system determines what needs to be done, much of the work is a matter of logical settings, not physical reorganization of data and objects.
Let’s take a closer look at the types of intelligent automation Teradata has coming out in the initial release. These items span the process from data ingest all the way to ensuring critical service level goals are met.
While not an automated function like the following items, AutoDBA is the infrastructure, or user interface, into the Autonomous Platform. AutoDBA will be a table attribute that tells the system that this object falls under the control of the platform. Within the functions, there will be degrees of automation, from a simple “understand and advise” mode to a full “understand and act” mode of operation. The end state is to have the following functions, and many others, fall under this control, though the initial release will focus on index and statistics management.
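To make the two degrees of automation concrete, here is a minimal sketch of how an advise mode and an act mode might differ. This is an illustration only: the names `AutomationMode` and `handle_recommendation` are hypothetical and are not part of any Teradata interface.

```python
from enum import Enum


class AutomationMode(Enum):
    """Hypothetical degrees of automation for an object under platform control."""
    ADVISE = "advise"  # understand and advise: report only
    ACT = "act"        # understand and act: apply the change


def handle_recommendation(recommendation, mode, apply_fn):
    """Apply the recommendation only in ACT mode; otherwise just report it."""
    if mode is AutomationMode.ACT:
        apply_fn(recommendation)
        return ("applied", recommendation)
    return ("advised", recommendation)


# Stand-in for "enacting" a change: we simply record it in a list.
applied = []
rec = "collect statistics on shipments.route_id"

advise_result = handle_recommendation(rec, AutomationMode.ADVISE, applied.append)
act_result = handle_recommendation(rec, AutomationMode.ACT, applied.append)
print(advise_result, act_result, applied)
```

The same recommendation flows through both paths; only the mode decides whether anything changes on the system.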
The ETL function will simplify the process of generating schema for incoming data. It is about getting new data into the environment quickly and correctly. The ETL.ai function will interrogate a data source (say a CSV or JSON file) and make the judgements on the table and column attributes. In total, it intelligently automates the tasks necessary to create a physical model, and effectively integrate the use of data.
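As a rough idea of what interrogating a CSV source could involve, the sketch below samples the values in each column and proposes a column type. The function names, the sample data, and the type rules are assumptions made for illustration; they do not describe how ETL.ai is implemented.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Pick the narrowest SQL type that fits every non-empty sample value."""
    samples = [v for v in values if v != ""]
    if not samples:
        return "VARCHAR(1)"

    def all_match(parse):
        # True when every sample parses without raising ValueError.
        try:
            for v in samples:
                parse(v)
            return True
        except ValueError:
            return False

    if all_match(int):
        return "INTEGER"
    if all_match(float):
        return "FLOAT"
    if all_match(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "DATE"
    return f"VARCHAR({max(len(v) for v in samples)})"


def propose_ddl(table_name, csv_text):
    """Read a CSV with a header row and propose a CREATE TABLE statement."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        col_type = infer_type([row[i] for row in data])
        columns.append(f"    {name} {col_type}")
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(columns) + "\n);"


# Made-up weather feed, in the spirit of the delivery example.
sample = """route_id,reading_date,temp_c,condition
101,2019-01-15,-3.5,snow
102,2019-01-15,1.0,rain
"""
ddl = propose_ddl("weather_readings", sample)
print(ddl)
```

A real tool would also have to judge keys, nullability, and character sets, which this sketch leaves out.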
All systems need to collect statistics to run at the optimal level. The more information the optimizer has on the demographics of the data the better the plan will be. The stats.ai function is all about providing “fewer and better” recommendations based on query logs and other data available to the system. These recommendations will also include updating or removal of statistics as data changes over time.
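One simple way to picture “fewer and better” recommendations is to count how often each column appears in query predicates and compare that with the statistics already collected. The sketch below does just that. The log format, the `min_uses` threshold, and the function name are assumptions for illustration, not the stats.ai logic.

```python
from collections import Counter


def recommend_stats(query_log, existing_stats, min_uses=2):
    """Suggest columns to collect statistics on and stale statistics to drop.

    query_log: list of dicts, each with a 'predicate_columns' list (assumed format).
    existing_stats: set of columns that already have statistics.
    """
    usage = Counter()
    for entry in query_log:
        # Count each column once per query, however often it appears in it.
        usage.update(set(entry["predicate_columns"]))

    collect = sorted(
        col for col, uses in usage.items()
        if uses >= min_uses and col not in existing_stats
    )
    # Statistics on columns that no logged query touches are removal candidates.
    drop = sorted(col for col in existing_stats if usage[col] == 0)
    return {"collect": collect, "drop": drop}


query_log = [
    {"predicate_columns": ["shipments.route_id", "shipments.ship_date"]},
    {"predicate_columns": ["shipments.route_id"]},
    {"predicate_columns": ["weather.region", "shipments.route_id"]},
    {"predicate_columns": ["weather.region"]},
]
existing_stats = {"shipments.ship_date", "shipments.driver_id"}

recommendations = recommend_stats(query_log, existing_stats)
print(recommendations)
```

The threshold keeps one-off queries from generating recommendations, which is one way to keep the list short.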
The decision of what indexes to have on a system can become complex as more data, more tables, and more queries are added to the environment. The index.ai function again uses the workload logs and physical model relationships to recommend what type of indexes can be helpful in meeting user demand and, if desired, will automatically enact these recommendations.
Teradata has several options for data and table compression. The system already has automatic block level compression by default. However, companies can still leverage our Value List Compression (VLC) options where the data for specific columns can be further compressed within the table header.
The compression.ai function will interrogate the columns and find the most frequent values and determine which will provide the best compression rates.
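The idea of finding the most frequent values can be sketched with a simple frequency count. The thresholds and function name below are hypothetical, and the generated `COMPRESS (...)` text is only meant to show what a value list might look like; the real function would also weigh column widths and actual space savings.

```python
from collections import Counter


def propose_compress_values(values, max_values=3, min_share=0.10):
    """Return the most frequent values and the share of rows they cover."""
    total = len(values)
    if total == 0:
        return [], 0.0
    counts = Counter(v for v in values if v is not None)
    # Keep only frequent values that each cover at least min_share of the rows.
    picked = [
        v for v, n in counts.most_common(max_values)
        if n / total >= min_share
    ]
    covered = sum(counts[v] for v in picked)
    return picked, covered / total


status_column = ["delivered"] * 6 + ["in_transit"] * 3 + ["returned"]
picked, coverage = propose_compress_values(status_column, min_share=0.2)

# Illustrative value list for the column definition.
quoted = ", ".join(f"'{v}'" for v in picked)
clause = f"COMPRESS ({quoted})"
print(clause, coverage)
```

Here two values cover 90% of the rows, so listing them is worthwhile, while the rare third value is left out.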
The ultimate objective of all the above efforts, and the reason for having an analytic environment, is to serve the user communities. A user can be either an actual person submitting a query or an application that requires sub-second response times.
The SLG.ai function takes the first step down a long path. This function uses Machine Learning to make predictions of the coming week’s workload and align resources in advance to ensure the critical service level goals are met. Once the system is aligned to the predictions, the function will track queries and response times to understand variance in the predictions as well as report on those that are falling out of the service level goal.
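The predict, align, and track loop can be sketched as follows. Note that this example swaps in a plain historical average per weekday and hour as a stand-in for a real Machine Learning model, and that the data layout, the capacity figure, and the function names are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def forecast_by_slot(history):
    """Average past demand per (weekday, hour) slot.

    history: iterable of (weekday, hour, demand) observations (assumed format).
    """
    by_slot = defaultdict(list)
    for weekday, hour, demand in history:
        by_slot[(weekday, hour)].append(demand)
    return {slot: mean(vals) for slot, vals in by_slot.items()}


def slots_needing_resources(forecast, capacity):
    """Slots where predicted demand exceeds the capacity currently assigned."""
    return sorted(slot for slot, demand in forecast.items() if demand > capacity)


def prediction_variance(forecast, actuals):
    """Actual minus predicted demand for each slot that has been observed."""
    return {
        slot: actuals[slot] - forecast[slot]
        for slot in actuals if slot in forecast
    }


# Hour 16 is the 4 PM to 5 PM window; demand is in arbitrary units.
history = [
    ("Mon", 16, 80), ("Mon", 16, 90),
    ("Mon", 10, 40), ("Mon", 10, 50),
    ("Tue", 16, 70),
]
forecast = forecast_by_slot(history)
hot_slots = slots_needing_resources(forecast, capacity=75)
variance = prediction_variance(forecast, {("Mon", 16): 95})
print(hot_slots, variance)
```

The variance step matters as much as the forecast: it tells the function how far off its predictions were, so it can report the queries that are missing their service level goals.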
The Journey will continue...
As with many advancements made by Teradata, the autonomous platform is not an event but a philosophy that is integral to our development and direction. When the foundations of the AutoDBA and the various functions above are in place, we intend to incorporate other areas. There is more data in the environment, and more functionality that we can further leverage in the prediction and execution of aligning resources.
On the roadmap are items such as better space management, autonomous management of in-memory spaces based on workload needs, and use of the multiple hash map feature to optimize table and system operations. These are just a few examples of coming capabilities on the journey towards the “error free platform”.
Answers enabled with Intelligence...
What is the real goal behind the autonomous push? For many systems the desire is to remove the mundane, day-to-day tasks of the system and database administrators. For Teradata, the goal is to continually focus on the end user experience of running analytics to get answers that enable action. As companies scale out and operationalize the insights gained by data scientists, the users and operations teams can be confident that new workloads will not interfere with existing priorities.
By integrating AI and Machine Learning within Vantage, there will be a “define and forget” strategy which allows companies to set their priorities and the system will manage itself to the optimal mix. This will enable companies to rise above the complexity, cost, and inadequacy of today’s analytics landscape to deliver pervasive data intelligence.
Since 1987, Rob has contributed to virtually every aspect of the data warehouse and analytical arenas. Rob’s work has been dedicated to helping companies become data-driven and turn business insights into action. Currently, Rob works to help companies not only create the foundation but also incorporate the principles of a modern data architecture into their overall analytical processes.