Make SQL Data Easier to Merge (Static Data)
Currently, SQL Data files contain a bunch of insert statements, but the order of the insert statements can be inconsistent. For example, new records might get inserted on the top or the bottom. Also, multiple SET IDENTITYINSERT statements sometimes get created. This makes it difficuly to merge data changes between multiple branches in TFS Source Control. If the Insert statements had a consistent order and only included one SET IDENTITYINSERT diffs and merges would be easier.
Thanks for this suggestion and for the many comments and upvotes. I realize that this is a pain point.
I have a few shorter-term workarounds to summarize as well as some information on the longer roadmap in this update. I know these shorter-term workarounds aren’t perfect (I summarize the pros and cons), but I’m posting them as they may help a few folks.
Workaround 1) When data changes to static data need to be made, use a “relink the table” pattern
One can “cleanly rescript” a static data table in SQL Source Control by:
- Unlinking the static data table
- Committing
- Relinking the static data table
- Committing
Pro: This works with the GUI and requires no special knowledge or comfort with TSQL. This may help folks with just a few static data tables.
Con: This requires extra steps and results in extra commits in the history, which I realize can be annoying, particularly if the team doesn’t squash commit history when merging or do similar things.
Workaround 2) Use a post-deployment script to manage static data tables instead of linking them
Another option is to not use the “link static data tables” feature at all, and instead manage the static data table yourself with a custom script.
I’ve added a code sample called “post-deployment script to manage static data for a table with an identity column” into our reference on post-deployment scripts: https://documentation.red-gate.com/soc7/common-tasks/working-with-pre-post-deployment-scripts/static-data
Pro: This gives you complete control over the TSQL.
Con: Your post-deployment script might become quite complex and create its own issue in merge conflicts, depending on how many static data tables you have and whether you are using the post-deployment script for other things.
An alternate approach using SQL Change Automation
If customers have licenses for SQL Change Automation, the SCA Migrations approach does not have this issue and has additional features for flexibility with static data:
- Since migrations only script out the data which has changed, this scripting issue of the entire table doesn’t occur in SCA – it only ever needs to generate commands for what has been modified since the last migration
- Supports column filtered static data tables in the SCA plugin in SSMS
- Supports multiple post-deployment scripts, in case there is a preference to manage static data tables in dedicated post-deployment scripts
- Allows approaches like bulk loading larger static data tables by supporting SQLCMD variables in migration and post-deployment scripts
If there are other use cases where a migrations approach would add value to the team – such as easily being able to run data modification statements as part of deployments, or being able to customize the TSQL for schema changes – then changing from SQL Source Control to SQL Change Automation might make sense.
Roadmap
I’ve chatted with our engineering teams and Redgate does have the desire to better support this scenario longer term. This is not a “quick fix” scenario and will likely require a rethinking of the way static data is managed within the tool.
While the “rethinking” bit of this has begun, we don’t specifically have this on the SQL Source Control roadmap (https://www.red-gate.com/products/sql-development/sql-source-control/roadmap) at this point. We do have improved support for use with Git feature branches as a future candidate on the roadmap, and we do know from research with many customers that static data is quite important to folks. While I can’t make any promises, I think we are likely to be able to offer an improvement in this area.
-
Chris Crook commented
But seriously, fix this.
Even the SQL Data Compare GUI as part of SQL Source Control would be a huge step forward. Not sure what the hold up is on RedGate's end.
We can haz open source and implement ourselves?
-
Anonymous commented
this would serve as a good migration example... deleting statics my no be possible if they are in any way connected by key constraints when reverting to and older version but. update and add if missing would work i think. any column drop warnings could also then be omitted on version-controlled static data-tables when reverting back :)
MERGE #Item AS TARGET
USING #ItemSource AS SOURCE
ON ( TARGET.Id = SOURCE.Id )
WHEN MATCHED THEN
UPDATE SET TARGET.Name = SOURCE.Name ,
TARGET.Something = SOURCE.Something
WHEN NOT MATCHED THEN
INSERT ( Id, Name, Something )
VALUES ( SOURCE.Id ,
SOURCE.Name ,
SOURCE.Something
); -
CShen commented
@PaulS, I think this is long over-due enhancement RedGate need to make. I can track code change, schema change fairly easily and know who did the changes. However, when come to data, we have to compare each version in the history to get that kind of information. It's very very painful.
-
Phil Collins commented
Currently the RedGateDatabaseInfo.xml file has a node called <DataFileSet>, underneath which is a node (<DataFile>) for each table that is static data tracked.
The top node under the <DataFileSet> (before the individual <DataFile>'s) is a count of all the <DataFile> nodes below.
Can this count be done in the code?
I am forever solving merge conflict that are just that count node having the wrong value.
-
JackAce commented
This is a huge issue for our company, which uses Git and works exclusively in feature branches. Our version controlled data is a nightmare to merge and resolving this issue should be a simple enhancement on your end. Please implement this as soon as possible.
-
Philippe Dansereau commented
I'm currently achieving that feature thru some contortions: When order gets mixed up by RGSS, I use Red Gate Data compare to compare my branch and an empty db. I then ask data compare for a script. It complies and gives me a nicely ordered script that I then cut & paste back in my version control x_Data.sql file. So, since Data Compare can do it, it should be easy to add the feature in RGSS, I hope.
-
AdminPaul Stephenson (Admin, Redgate) commented
Good spot Graham. I've merged the two suggestions now.
-
Graham Clark commented
This is the same as http://redgate.uservoice.com/forums/39019-sql-source-control/suggestions/2024519-make-sql-data-easier-to-merge so should be merged. Hopefully the order of the requests won't make that difficult!
-
Graham Clark commented
This is the same as http://redgate.uservoice.com/forums/39019-sql-source-control/suggestions/3082371-diffs-when-ssc-changes-order-of-rows-in-static-tab so should be merged. Hopefully the order of the requests won't make that difficult!
-
Justin Jones commented
We are using XML data that needs to be versioned and updated through our branching process. This issue is severely impeding our development process.
-
Anonymous commented
This is very, very frustrating! We were hoping to become more efficient by switching to this tool, but this issue has resulted in the opposite.
-
Anonymous commented
This is a large problem we have as well. We find that when the insert happens it will insert them in odd orders, then if someone updates data in the table, it will then reorder the records thus moving the data around. When going to merge this can be a huge hassle!
-
CShen commented
Putting the latest change on the top is useless to me. I really care what has changed, so that order the data in a static way will be extremely useful to me. At least, please give us an option.
-
This definitely needs addressing. I've aged ten years in the last year because of this issue
-
Sea Shore commented
I'm out of votes... but this really needs sorting out... big waste of time!
-
Bernard commented
It will be better if the static data link feature in SQL Source Control is implemented in a similar way as SQL Data Compare where we can have a visual row by row comparisons and selectively commit / undo the changes.
-
David Madrian commented
Yes, we're having exactly the same problem and frustration.