db:dump should dump data #196

fdr · 2015-08-04T19:52:49Z

https://github.com/interagent/pliny/blob/master/lib/pliny/tasks/db.rake#L89

When seed data has been added via migrations over time, it interacts poorly with the schema-only dump done here. One is obliged to add those rows to seeds.rb or schema.sql by hand and then write idempotent migrations forever after, or risk losing them the next time the schema is compacted. (One is free to speculate how many errors will creep in from having to update seeds.rb and complicate migrations for any seed rows; I think it's unnecessarily many.)

It would be better if a fresh database were prepared, migrated, and then dumped with data. This would also remove the need for the special-case code here that keeps track of migrations.
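To make the proposal concrete, here is a minimal sketch of the difference between the two kinds of dump. It only builds the `pg_dump` argument list; the helper name, the database URL, and the output path are hypothetical, and this is not the code in Pliny's db.rake.

```ruby
# Hypothetical helper (not Pliny's actual task): builds the argument list
# for pg_dump. With include_data: false the dump is schema-only; with
# include_data: true the table rows are dumped as well.
def pg_dump_argv(database_url, output_path, include_data:)
  argv = ["pg_dump", "--no-owner", "--no-privileges", "--file=#{output_path}"]
  argv << "--schema-only" unless include_data
  argv << database_url
  argv
end

# Example usage against a freshly migrated scratch database (URL is made up):
#   system(*pg_dump_argv("postgres://localhost/scratch", "db/schema.sql", include_data: true))
```

A dump taken with data would presumably also carry the rows of whatever table records applied migrations, which is why separate bookkeeping for them might become unnecessary.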


pedro · 2015-08-06T01:37:47Z

We inherited this model from Rails and it has worked nicely in my experience: the basic idea is that migrations should only change structure, not data. I think this is a good approach because data migrations tend to be much slower and are good candidates to run out of band. Not to mention you may want to migrate data using your models so that any default attributes and validations apply.

Are seeds not a good fit for the kind of data migration you're running?

fdr · 2015-08-06T02:14:12Z

Let me state the general problem, putting aside how schema.sql is generated:

One writes data migrations to, say, update a table that's mostly a lookup table with, say, twenty records in it (then over the years, one writes more migrations to modify, add, or remove those records systematically in some way).

When schema.sql is generated, the rows of that lookup table are lost at the same time as you remove all the migration files.

No big deal, so you put the lookup stuff in seeds. But then the migration, which you need for production, cannot be applied properly in test unless you take pains to make it idempotent.

The solution I chose in one instance was to make seeds.rb a subset of the records I wanted in production, and then let the migration take care of modifying the result of seeds.rb. The objection is that the next time schema.sql is generated and the migrations truncated, that careful arrangement will be lost ("two sources of truth").

What I see as a non-solution is this: whenever one writes a migration that modifies a hand-maintained table, one must also modify seeds.rb and absorb extra complexity in the migrations forever after, or else leave an obscure bug behind for the next re-compaction via schema.sql.
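For illustration, the "extra complexity" of an idempotent data migration might look like the sketch below. It generates an INSERT that can run against both a production database that lacks the row and a test database where seeds.rb already added it. The helper names and the `plans` table are hypothetical, and it assumes PostgreSQL 9.5 or later (for `ON CONFLICT`) with a unique constraint on the key column.

```ruby
# Hypothetical helper: quotes a Ruby value as a SQL literal.
# Only nil, numbers, and strings are handled in this sketch.
def quote_literal(value)
  case value
  when nil then "NULL"
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Hypothetical helper: builds an INSERT that is safe to run more than once.
# Assumes PostgreSQL 9.5+ and a unique constraint on key_column, so a row
# that already exists (e.g. from seeds.rb) is silently skipped.
def idempotent_insert_sql(table, row, key_column)
  columns = row.keys.map(&:to_s)
  values = row.values.map { |v| quote_literal(v) }
  "INSERT INTO #{table} (#{columns.join(', ')}) " \
    "VALUES (#{values.join(', ')}) " \
    "ON CONFLICT (#{key_column}) DO NOTHING;"
end
```

Every migration touching the lookup table would need this kind of guard, which is the ongoing cost being objected to.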

fdr · 2015-08-06T02:30:27Z

Another reasonable answer is: don't brainlessly do schema dumps and delete migration files assuming it will work, particularly if diff $pg_dump_before $pg_dump_after shows that the two dumps do not match.
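A rough sketch of that check, done in Ruby instead of with diff: it compares two dump texts after dropping comment and blank lines. It assumes "before" is a dump of a database built by running every migration and "after" is a dump of one loaded from the compacted schema.sql plus seeds; that reading, and all the helper names, are assumptions.

```ruby
# Drop pg_dump comment lines ("-- ...") and blank lines so that only the
# SQL statements themselves are compared.
def normalize_dump(sql)
  sql.each_line.map(&:rstrip).reject { |line| line.empty? || line.start_with?("--") }
end

# True when both dumps contain the same statements in the same order.
def dumps_match?(before_sql, after_sql)
  normalize_dump(before_sql) == normalize_dump(after_sql)
end

# Lines present in only one of the dumps (set-like, ignores ordering).
def dump_difference(before_sql, after_sql)
  before = normalize_dump(before_sql)
  after = normalize_dump(after_sql)
  { missing: before - after, added: after - before }
end
```

If `dumps_match?` is false, the `missing` list shows what compaction would lose, for example lookup-table rows that only ever existed in migrations.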
