SQL Server - Building Dynamic Partition in One Click
6 minute read
Partitioning is a very powerful and widely useful process in retrieving data from enormous tables. It allows us to split a very large table into smaller groups of rows which are called partitions. That splitting ensures enhanced query performance and reduced IO operations.
In this modern time when the world is surrounded by a huge amount of different kinds of data, we need to know what kind of method will be used to store them and which method will be the best later to retrieve them. Whether column store index, partitioning, indexing table or some other options will be used depends on database, tables and queries information.
There are also different types of business plans whose biggest problem is what to do with all those data and in which amount to store them. Some companies want to have all data in one place and store them in appropriate tables, while others want to track just certain data in a specific period because of lack of a memory or other. For example, they want to keep data of a certain interval and archive the other ones.
If you are facing this kind of problem, you can make it easier for your business by seeking our Data Engineering Team.
In this post, the main goal will be tracking data in a specific time interval, based on different data types. For example, some company wants just the last 6 months to be partitioned in a table and other ones to be archived to cheaper and slower storage.
Introduction into partitions
Partitions in SQL Server are created in a specific way. They can’t be created directly on tables like in some other database systems, but when created they are easier to maintain, especially if we are dealing with a huge number of tables.
I’ll start some technical intro on how to create a partition and connect it with the table.
The first step will be to create partition function on specific datatype, range and boundary value(s) and partition schema that maps partition function on filegroup(s).
After creating the partition function and scheme, the next question would be: How can we connect the created partition scheme with a new or existing table?
There are several ways for that. Which one to use depends on your business plan. Will your table be already created, will it be empty or filled with data, or you will create and fill your table after creating a partition scheme? In most situations, business flow uses all mentioned methods sooner or later.
The most important question is how to track all those changes on tables and partitions when you create new partition boundaries and delete old ones? If you do it manually it would take a lot of time to create or alter partition one by one and to connect every table with an existing partition. The best solution, in my opinion, would be creating procedures that will dynamically create and maintain partition functions, scheme and tables.
How to create different types of partitions on hundreds of tables in one click?
We at CROZ like to develop programs that are more dynamic in nature. That means faster execution, better readability of what we have done and less maintenance of program code.
As we all know, neither one of the complex dynamic solutions doesn’t come just in one click. At CROZ Data Engineering department we know how to create a program containing that solution and is executable in one click. Especially if your business demands different kinds of partitions that are based on weeks, months, years or even different types of data.
Here is a short overview of that program. To start, we need help tables to store data about tables we want to be partitioned and partitions information. Those tables help you keep track of the state of tables and partitions. Also, they are fundamentals for creating one-click dynamic procedures.
Then you need 2 procedures.
The first one will describe creating partitions with information stored in help tables and later partitioning table, if the table exists.
The second procedure is in charge of updating those created partition functions and schemes. It consists of 3 steps: adding a new boundary, transferring the oldest data in the archive and deleting the oldest boundary.
After working with and analyzing how partitioning works, I came across wonderful things they offer in SQL Server. When using partitioning your performance of querying large tables is better, queries can access partitions in parallel, faster data deletion and data load, better manageability and many others.
If you come this far enough and you are still interested in one-click partitioning, maybe you should try it on your database model and see how it fits. CROZ Data Engineering team is always there to help!
Converge all of your data sources into a polyglot platform capable of providing every insight about your business! Get in touch!
QED2023 is very special as we will gather for the 15th time!
That’s why this year’s topic will be VALUE!