Allow some migrations to run online / without downtime #4985

phiresky · 2024-08-20T09:42:29Z

Requirements

Is this a feature request? For questions or discussions use https://lemmy.ml/c/lemmy_support
Did you check to see if this issue already exists?
Is this only a feature request? Do not put multiple feature requests in one issue.
Is this a backend issue? Use the lemmy-ui repo for UI / frontend issues.
Do you agree to follow the rules in our Code of Conduct?

Is your proposal related to a problem?

We've now had multiple cases where a migration is backwards-compatible, meaning that the new version can run fine without the migration having been run, with the worst case being slightly wrong results or a minor degradation of functionality. Many of them are forward-compatible as well, so the old version of lemmy can also deal with the new schema.

The most recent example in 0.19.6: https://github.com/LemmyNet/lemmy/blob/4018049d10ef84968b31e03d385fdd6874e5782b/migrations/2024-07-01-014711_exponential_controversy/up.sql#L1-L16

This migration only updates many rows with a new value. Currently it runs during a full downtime of lemmy, leaving the instance unavailable for 30+ minutes. Afterwards the instance also needs to catch up with federation, which causes high load and potentially slowness for a while.

Describe the solution you'd like.

The example migration above could be written in a way that it does not impact site operation, by processing the data in batches. For example:

CREATE OR REPLACE PROCEDURE fix_controversy_rank_2024_07_01 (start_id bigint, end_id bigint)
    AS $$
DECLARE
    batch_size bigint := 10000;
BEGIN
    LOOP
        RAISE NOTICE 'Processing controversy rank starting from ID: %', start_id;
        UPDATE
            post_aggregates
        SET
            controversy_rank = CASE WHEN downvotes <= 0
                OR upvotes <= 0 THEN
                0
            ELSE
                (upvotes + downvotes) ^ CASE WHEN upvotes > downvotes THEN
                    downvotes::float / upvotes::float
                ELSE
                    upvotes::float / downvotes::float
                END
            END
        WHERE
            upvotes > 0
            AND downvotes > 0
            -- half-open range so that consecutive batches do not overlap
            AND id >= start_id
            AND id < start_id + batch_size;
        -- this commit is very important so that each batch happens in a separate transaction;
        -- it only works if the procedure is not called inside a transaction block
        COMMIT;
        -- go to next batch
        start_id := start_id + batch_size;
        -- exit when start_id exceeds end_id
        EXIT
        WHEN start_id > end_id;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

DO $$
DECLARE
    end_id bigint;
BEGIN
    SELECT
        COALESCE(MAX(id), 1) INTO end_id
    FROM post_aggregates;
    RAISE NOTICE 'Processing batch starting from ID 1 to ID: %', end_id;
    CALL fix_controversy_rank_2024_07_01 (1::bigint, end_id::bigint);
END
$$;

The question then becomes: how do we run it? We will have some migrations that can run online and some that need to run offline, and the diesel migration runner cannot distinguish between them. In #4673, @dullbananas is trying to add a custom migration runner, which might be useful for this.

We could put online migrations into a separate directory or add something like -- lemmy-online-migration -- to the top of the files. Those files would be run separately and after server start, instead of before it.

Describe alternatives you've considered.

We can keep it as is, resulting in longer and longer downtimes as tables grow in size.
We could remove some of those migrations and instead publish something in the changelog like "please run this query manually at your convenience".

Additional context

No response

dullbananas · 2024-08-20T13:06:24Z

Migrations should create procedures in a separate schema called "run_after_migrations", and the lemmy server should spawn a task that runs and drops the procedures. It will need to be specified that these procedures must be idempotent because some procedures like this controversy rank fixer need to commit changes before the procedure is dropped. Migrations can update or delete an old procedure if it becomes incompatible with the current schema.
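A minimal sketch of how this could look, under stated assumptions: only the schema name "run_after_migrations" comes from the comment above. The argument-less procedure signature, the discovery query, and the idea that the server issues CALL and DROP PROCEDURE as separate top-level statements are illustrative assumptions, not an existing lemmy API. The procedure body reuses the batched update from the issue description, including its use of post_aggregates.id.

```sql
-- Part 1: what a migration could create (names other than the schema are hypothetical).
CREATE SCHEMA IF NOT EXISTS run_after_migrations;

CREATE OR REPLACE PROCEDURE run_after_migrations.fix_controversy_rank_2024_07_01 ()
LANGUAGE plpgsql
AS $$
DECLARE
    batch_size CONSTANT bigint := 10000;
    start_id bigint := 1;
    end_id bigint;
BEGIN
    SELECT COALESCE(MAX(id), 0) INTO end_id FROM post_aggregates;
    WHILE start_id <= end_id LOOP
        UPDATE post_aggregates
        SET controversy_rank = (upvotes + downvotes) ^ CASE WHEN upvotes > downvotes THEN
                downvotes::float / upvotes::float
            ELSE
                upvotes::float / downvotes::float
            END
        WHERE upvotes > 0
            AND downvotes > 0
            AND id >= start_id
            AND id < start_id + batch_size;
        -- Each batch is committed on its own. Re-running the procedure after an
        -- interruption recomputes the same values, so it is idempotent.
        COMMIT;
        start_id := start_id + batch_size;
    END LOOP;
END;
$$;

-- Part 2: what the server task could run after startup.
-- Step 1: discover pending argument-less procedures in the schema.
SELECT p.oid::regprocedure::text AS signature
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname = 'run_after_migrations'
    AND p.prokind = 'p'
    AND p.pronargs = 0
ORDER BY p.proname;

-- Step 2: for each result, issue CALL and then DROP as separate top-level
-- statements, outside any transaction block, so the COMMIT inside is allowed.
CALL run_after_migrations.fix_controversy_rank_2024_07_01 ();
DROP PROCEDURE run_after_migrations.fix_controversy_rank_2024_07_01 ();
```

Issuing the CALL as its own top-level statement matters because PostgreSQL rejects transaction control in a procedure that is called from within a transaction block. If the server stops between the CALL and the DROP, the procedure simply runs again on the next start, which is why idempotency is required.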

dessalines · 2024-09-10T17:13:09Z

Question about that migration specifically: does it need to run on all history? Or would a limited set, such as only the last week of posts, work?
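For illustration only, a time-limited variant might look like the sketch below. It assumes that post_aggregates has a published timestamp column and that a one-week window is acceptable; whether older posts can keep their previous controversy_rank is exactly the open question.

```sql
-- Hypothetical variant: recompute controversy_rank only for recent posts.
-- Assumes post_aggregates.published exists; the one-week window is an example value.
UPDATE post_aggregates
SET controversy_rank = (upvotes + downvotes) ^ CASE WHEN upvotes > downvotes THEN
        downvotes::float / upvotes::float
    ELSE
        upvotes::float / downvotes::float
    END
WHERE upvotes > 0
    AND downvotes > 0
    AND published > now() - interval '1 week';
```

A window like this would touch far fewer rows, but posts outside it would keep ranks computed with the old formula.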

phiresky added the enhancement New feature or request label Aug 20, 2024

dullbananas added the area: database label Aug 20, 2024
