Recently I started work on a project to migrate a client from Dynamics CRM 4.0 on-premise to Dynamics CRM Online. Since Microsoft does not allow customers to upload their existing databases to CRM online, part of the exercise was to scope the amount of data that would need to be migrated from the Dynamics CRM 4.0 database to CRM Online.
The database had grown over the years to 145GB! For a small-sized client, that seemed like a lot of data.
Upon further investigation I found about 8 million records in the System Job table. This is not uncommon, especially for Dynamics CRM 3.0 and 4.0 deployments, where there is no bulk delete mechanism in the user interface.
Thankfully, Microsoft provides a SQL script (to be used at one’s own risk) to clean up old system jobs.
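I won't reproduce the script here; always take it from the Microsoft KB article itself. But in essence, it removes completed system jobs from the AsyncOperationBase table in batches. A heavily simplified sketch (the real script also cleans up related workflow log records, and deletes in small batches; run anything like this at your own risk, against a backed-up database):

```sql
-- Simplified illustration only, not the actual Microsoft script.
-- AsyncOperationBase is the underlying table for System Jobs.
DELETE FROM AsyncOperationBase
WHERE StateCode = 3             -- Completed
  AND StatusCode IN (30, 32);   -- Succeeded or Cancelled
```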
After running the script, the System Job table was down to 5 million records, but the database had only shrunk to about 130GB.
Upon further investigation, I found that the majority of the remaining system jobs were stuck in a “pending” state, left behind by some specialized integrations that were no longer running properly. These jobs would never change status on their own, and the Microsoft script would not delete them.
We still needed to get the database down to a reasonable size in order to easily transport it to a test environment to upgrade the customizations, as well as to improve performance and reduce backup space and time.
I had the following options:
- Remove the records using a SQL Delete statement.
- Cancel the System Jobs records from within the CRM User interface, 250 records at a time.
- Write custom code to cancel and remove the System Jobs records.
- Use a data tool.
Despite the fact that Microsoft had provided a script to get rid of completed system jobs, mucking around with the database directly carries severe risks and could leave your system damaged and unsupported. Even though it would involve simply updating the statuscode and statecode fields, I wanted to play it safe.
Sitting and cancelling the system jobs 250 at a time would take days, if not weeks, of tedious clicking.
Writing custom code was beginning to look like an option. However, I decided to take a look at available tools to see if there was an even easier way.
Years ago I used SQL Server Integration Services (SSIS) to integrate a series of other systems to feed data to CRM. I had written VB code that would call the CRM web services to update or insert data in CRM and read the data from the various systems using ODBC.
In a nutshell, SSIS can retrieve data and run through all the records and perform various data operations.
Since that time, a couple of commercial tools have come on the market that make using SSIS and CRM so much easier.
I downloaded and installed the KingswaySoft SSIS Integration Toolkit. The free developer edition suited my needs, but I would recommend getting the full version for other integration projects.
I set up a new SSIS project in Visual Studio and added a new connector. KingswaySoft provides a Dynamics CRM connection manager.
One of the great things about the Kingsway connector is that it is backwards compatible (to CRM 3.0). This was still a CRM 4.0 system, which means a lot of the modern tools no longer work. However, I was able to choose the SOAP 2007 endpoint and connect to the CRM 4.0 system.
I then set up my data flow tasks. I ended up with four tasks:
- Retrieve the stuck CRM system jobs (Get Stuck System Jobs)
- Set the statuscode and statecode fields to “Cancelled” and “Completed” (Set Cancel Status Fields)
- Update the system jobs with the new status values (Cancel Jobs)
- Dump any errors to Excel (Error Output)
The first step involved setting up a FetchXML query to retrieve the suspended system jobs (statecode = 1 and statuscode = 10).
Side note: in CRM 4.0 you cannot download the FetchXML from an Advanced Find. Some of you may remember the “Stunnware Tools” for querying CRM data and generating FetchXML and C# code. I used that tool to create the FetchXML statement for the Microsoft Dynamics CRM source task.
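For reference, the query amounts to something like the following sketch (asyncoperation is the System Job entity; the attribute list here is illustrative, and you would normally generate this with a tool rather than write it by hand):

```xml
<fetch mapping="logical">
  <entity name="asyncoperation">
    <attribute name="asyncoperationid" />
    <attribute name="name" />
    <filter type="and">
      <!-- statecode 1 = Suspended, statuscode 10 = Waiting -->
      <condition attribute="statecode" operator="eq" value="1" />
      <condition attribute="statuscode" operator="eq" value="10" />
    </filter>
  </entity>
</fetch>
```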
The next step was to set the statecode and statuscode values for each System Job record: statecode to 3 (Completed) and statuscode to 32 (Cancelled). Once the system job status was changed, the Microsoft script would then be able to delete the record.
For this I used the “Derived Column Transformation” control to set the new values:
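In the Derived Column Transformation, the new values are supplied as literal expressions against the two status columns; roughly like this (SSIS expression-language syntax, with the (DT_I4) integer cast as an assumption about the column types):

```
statecode  : (DT_I4)3    // Completed
statuscode : (DT_I4)32   // Cancelled
```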
Once the values were changed, I still needed to write them back to CRM. The final step was to take the output from the transformation step and update Dynamics CRM. This updated the values through the supported Dynamics CRM API, so I wasn’t writing directly to the SQL tables.
In the event that something went wrong, instead of abandoning the process, I had any errors go to an Excel file for later review.
Setting up this job took no time at all. I ran a few tests by limiting the number of records returned by my FetchXML statement. Once I confirmed that everything was working, I started the job and approximately 2 days later all the suspended system jobs were cancelled.
I ran the script as per the Microsoft KB article, and after a few hours all the records were deleted. I then shrank the CRM database using the DBCC SHRINKDATABASE command in SQL Server and was down to a (somewhat) more reasonable 30GB. I have since found that much of the remaining size is due to attachments; I am addressing that separately and now have the size down to about 6GB.
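For reference, the shrink command looks like this (the database name is a placeholder for your organization database, and the target free-space percentage is an example):

```sql
-- ORG_MSCRM is a placeholder organization database name.
DBCC SHRINKDATABASE (ORG_MSCRM, 10);  -- leave 10% free space
```

Note that shrinking a database causes index fragmentation, so it is worth rebuilding indexes afterwards.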
If you need to do any kind of mass update on CRM records using supported API calls, then one of the options you should consider is SQL Server Integration Services with the KingswaySoft SSIS Integration Toolkit. I know I will be looking at this tool for future integrations and database work, and potentially as an option to migrate the CRM 4.0 data to CRM Online.