Faster Primary Key creation or enabling

Today, I have seen an application running a script to enable primary key constraint for a long time. The question is how to make it faster. In Data warehousing systems, it is normal that we disable Constraints, drop indexes and re-enable them after loading data. So, making them faster is very important as the other jobs are dependent on the initial data load.

I know this is nothing new for a DBAs, but many developers tends to over look this.

Here is my test case, initially prepared a new table with 7 million records in it.

SQL> create sequence my_num;

SQL> create table my_table as select my_num.nextval, OWNER,OBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID,

OBJECT_TYPE,CREATED from dba_objects;

SQL> insert into my_table select my_num.nextval,OWNER,OBJECT_NAME,OBJECT_ID,
DATA_OBJECT_ID,OBJECT_TYPE,CREATED from my_table; 2

138236 rows created.

SQL> commit ;

SQL> select count(*) from my_table;

COUNT(*)
———-
7282377

Now my_table got 7.2 million records, good enough to show slow primary key creation 🙂

SQL> alter table my_table add constraint pk_my_Pkey primary key (NEXTVAL,OBJECT_ID);

Table altered.

Elapsed: 00:06:35.15

The normal primary key creation took 6 minutes and 35.15 seconds. To create primary key it took 6.35 minutes!, I felt that is way too high. So, how to improve the constraint creation timings. The primary key creation to got two steps

A unique index creation
Enabling the constraint

The maximum time took was to create the index not to enable the primary key. If we make the index creation faster, all set. So, I decided to break the constraint creation into 2 parts a) Index creation b) Enabling primary key. Considering the type of tables – daily data load, can be re-created easily etc we need not keep recovery is in mind – if we lose the table we will create it. Also, we have enough CPU resources while other jobs are waiting to complete the data load.

As the system resources under utilised, I will use enough parallelism
As the table can be re-crated easily, I will disable redo generation while index creation.

With the above assumtpions in hand, I had created the index and primary key as follows.

SQL> create index my_index on my_table (NEXTVAL,OBJECT_ID) NOLOGGING parallel 8;

Index created.

Elapsed: 00:02:06.75

SQL> alter table my_table enable constraint pk_my_Pkey using index my_index;

Table altered.

Elapsed: 00:00:05.02

Perfect, the constraint creation completed more efficiently with in 2.06.83 minutes, means 33% of the actual time!

Comments (15) Leave a comment

Chandra

April 15, 2014 at 7:24 am

Reply

Good topic
Cristina Abbà

March 26, 2015 at 5:26 pm

Reply

Great idea!!!
Feng

July 18, 2016 at 9:19 pm

Reply

A Primary Key constraint and a Clustered Index is not really the same in SQL Server. By default SQL Server enforces the Primary Key constraint with a Unique Clustered Index. But if you want you could instead enforce it with a Unique Non-Clustered Index. But that approach wouldn’t really make sense as a default, because you need to have specific problems to justify that approach.
- Ora600Tom
  
  July 19, 2016 at 6:26 am
  
  Reply
  
  Not sure about SQL Server as I have tested this approach in Oracle. Sorry, I should have mentioned about it.
jeff1089

August 14, 2017 at 2:24 pm

Reply

Have you tested this with dropping the index and recreating it instead of merely enabling it?
- Ora600Tom
  
  August 24, 2017 at 1:50 am
  
  Reply
  
  Yes, it took more than 6 minutes – in all those cases, it create a new index serially
Vasiliy

August 23, 2017 at 5:27 am

Reply

It is worth noting that Oracle (11.2.0.4) still reads the whole table serially when it enables a primary key constraint despite the existence of an appropriate unique index. I don’t know why it does this. So on a huge table enabling a constraint is still long enough. But I approve that the whole process is faster when you have pre-created an index in parallel though.
- Ora600Tom
  
  August 24, 2017 at 1:52 am
  
  Reply
  
  I will check this, not sure why oracle has to scan the table when unique index is existing.
  - Vasiliy
    
    August 24, 2017 at 3:01 am
    
    By the way, in your example the index is not unique, so Oracle has to scan something to detect duplicates. I bet it scans the table, but it would be cheaper to scan the index. So would it be in the case of the unique index (but still it’d be more expensive than doing nothing).
  - Vasiliy
    
    August 24, 2017 at 8:31 am
    
    I am sorry. My bad. It was a flawed schema design. I blindly believed that primary key’s target columns had been constrained as not null (but they really should have been). I am currently in the middle of this process. I’ll tell if it doesn’t help (I believe it does).
    
    In my own example when the index is not unique Oracle reads the entire index (FFS) and checks it for duplicates (SORT). As expected.
Ora600Tom

August 24, 2017 at 11:36 am

Reply

Index FFS (scan through the leaf nodes) for non- unique indexes is expected to validate the data. I agree.
venkatbollu

July 6, 2021 at 5:43 am

Reply

Approach is looking good and interesting for me, But one question here – Are we creating a PK or enabling it back? I understood it as Enabling the PK using a simple Index but heading is confusing.

Another question is Are you leaving the index after enabling the PK? Do we need it further when I have PK on this combination?
- Thomas Saviour
  
  July 20, 2021 at 12:23 pm
  
  Reply
  
  It is creating a Primary key – and a Unique index is part of the primary key and should exist.
vasista

July 13, 2021 at 12:49 pm

Reply

For daily load jobs in DW, we have the same process similiar to steps mentioned above. with the data of 900 million records,a job is able to create indexes with 10 to 15 minutes ,but constraint creation post indexes creation is taking longer time like above 70 minutes.Request you to help here to speed up the operation
- Thomas Saviour
  
  July 20, 2021 at 12:22 pm
  
  Reply
  
  There is a problem in my index creation – create a “CREATE UNIQUE INDEX…” which may fix the issue. My bad in the original post.