Merging Active Record models
We’ve been working on a project that involves importing a massive amount of data from multiple sources. The data is somewhat complicated, so we occasionally end up with duplicate records that need to be merged. The data is also highly normalized, so there are a bunch of associations that need to be merged along with them. If you’ve ever done this by hand, you know how painful it can be.
To alleviate that pain, I introduce you to merger, a Rails plugin for merging Active Record models.
The plugin is pretty simple right now. All it does is:
- Given a set of records, picks the oldest record (the one with the lowest id) as the one to keep
- Moves any associated has_many and habtm records from the duplicates to the record that is being kept
- Deletes the duplicate records
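To make those three steps concrete, here is a minimal sketch in plain Ruby. It is not the plugin's code or API: the Author and Book structs, the merge_records method, and its keyword arguments are all hypothetical stand-ins for Active Record models, and arrays stand in for database tables.

```ruby
# Hypothetical stand-ins for Active Record models (not the plugin's classes).
Author = Struct.new(:id, :name)
Book   = Struct.new(:id, :title, :author_id) # an Author "has many" books

# Merge +records+ into the one with the lowest id.
#   children:    rows that point at the records through a foreign key
#   foreign_key: the name of that key on each child row
#   table:       the collection the records themselves live in
def merge_records(records, children:, foreign_key:, table:)
  # Step 1: keep the oldest record, i.e. the one with the lowest id.
  keeper        = records.min_by(&:id)
  duplicates    = records.reject { |record| record.equal?(keeper) }
  duplicate_ids = duplicates.map(&:id)

  # Step 2: repoint associated rows from the duplicates to the keeper.
  children.each do |child|
    child[foreign_key] = keeper.id if duplicate_ids.include?(child[foreign_key])
  end

  # Step 3: delete the duplicate records.
  table.reject! { |row| duplicate_ids.include?(row.id) }

  keeper
end

authors = [
  Author.new(1, "J. Smith"),
  Author.new(2, "John Smith"),
  Author.new(3, "Jane Doe")
]
books = [
  Book.new(10, "First", 1),
  Book.new(11, "Second", 2),
  Book.new(12, "Other", 3)
]

# Treat authors 1 and 2 as duplicates of each other.
keeper = merge_records(authors.first(2),
                       children: books,
                       foreign_key: :author_id,
                       table: authors)
```

After the call, both of the Smith books point at author 1 and author 2 is gone. A habtm association could presumably be handled the same way by repointing rows in the join table, though a real implementation would also have to decide what to do when the keeper and a duplicate are already linked to the same record.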
We intend to add a lot to it, including:
- Strategies for choosing which record to keep
- Strategies for merging the individual attributes of the records
- Recursive merging of associations based on certain attributes
- Options for what to do with the duplicate records
Check it out on GitHub and let us know what you think.
- Photo adapted from http://flickr.com/photos/xrrr/2478140383/