Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow some migrations to run online / without downtime #4985

Open
5 tasks done
phiresky opened this issue Aug 20, 2024 · 2 comments
Open
5 tasks done

Allow some migrations to run online / without downtime #4985

phiresky opened this issue Aug 20, 2024 · 2 comments
Labels
area: database enhancement New feature or request

Comments

@phiresky
Copy link
Collaborator

Requirements

  • Is this a feature request? For questions or discussions use https://lemmy.ml/c/lemmy_support
  • Did you check to see if this issue already exists?
  • Is this only a feature request? Do not put multiple feature requests in one issue.
  • Is this a backend issue? Use the lemmy-ui repo for UI / frontend issues.
  • Do you agree to follow the rules in our Code of Conduct?

Is your proposal related to a problem?

We've had a multiple cases now where there's a migration that is backwards-compatible, meaning that the new version can run fine without the migration having been run, with the worst case being that some results are slightly wrong or minor degradation of functionality. Many of them are even forward-compatible as well, so the old version of lemmy can also deal with the new schema.

The most recent example in 0.19.6: https://github.com/LemmyNet/lemmy/blob/4018049d10ef84968b31e03d385fdd6874e5782b/migrations/2024-07-01-014711_exponential_controversy/up.sql#L1-L16

This migration purely updates many rows with a new value. Currently this migration runs during a full downtime of lemmy, resulting in 30min+ where the instance is unavailable. In addition, afterwards it needs to catch up with the federation resulting in high instance load and potentially slowness for a while.

Describe the solution you'd like.

The example migration above could be written in a way that it does not impact site operation, by processing the data in batches. For example:

CREATE OR REPLACE PROCEDURE fix_controversy_rank_2024_07_01 (start_id bigint, end_id bigint)
    AS $$
DECLARE
    batch_size int := 10 ^ 4;
BEGIN
    LOOP
        RAISE NOTICE 'Processing controversy rank starting from ID: %', start_id;
        UPDATE
            post_aggregates
        SET
            controversy_rank = CASE WHEN downvotes <= 0
                OR upvotes <= 0 THEN
                0
            ELSE
                (upvotes + downvotes) ^ CASE WHEN upvotes > downvotes THEN
                    downvotes::float / upvotes::float
                ELSE
                    upvotes::float / downvotes::float
                END
            END
        WHERE
            upvotes > 0
            AND downvotes > 0
            AND id BETWEEN start_id AND start_id + batch_size;
        -- go to next batch
        start_id := start_id + batch_size;
        EXIT
        WHEN start_id > end_id;
        COMMIT; -- this commit is very important so that each batch happens in a separate transaction
        -- Exit when start_id exceeds end_id
    END LOOP;
END;
$$
LANGUAGE plpgsql;

DO $$
DECLARE
    end_id bigint;
BEGIN
    SELECT
        COALESCE(MAX(id), 1) INTO end_id
    FROM post_aggregates;
    RAISE NOTICE 'Processing batch starting from ID 1 to ID: %', end_id;
    CALL fix_controversy_rank_2024_07_01 (1::bigint, end_id::bigint);
END
$$;

Then the issue becomes, how do we run it? The problem is that we will have some migrations that can run online, but some that need to be offline. The diesel migration runner can not make a distinction between these. In #4673, @dullbananas is trying to add a custom migration runner which might be useful for this.

We could put online migration into a separate directory or add something like -- lemmy-online-migration -- to the top of the files. Those files would be run separately and after server start, instead of before it.

Describe alternatives you've considered.

  • we can keep it as is, resulting in more and more downtimes as tables increase in size
  • we could remove some of those migrations and instead publish in the changelog something like "please run this query manually at your convenience"

Additional context

No response

@phiresky phiresky added the enhancement New feature or request label Aug 20, 2024
@dullbananas
Copy link
Collaborator

Migrations should create procedures in a separate schema called "run_after_migrations", and the lemmy server should spawn a task that runs and drops the procedures. It will need to be specified that these procedures must be idempotent because some procedures like this controversy rank fixer need to commit changes before the procedure is dropped. Migrations can update or delete an old procedure if it becomes incompatible with the current schema.

@dessalines
Copy link
Member

Question about that migration specifically: Does it need to run on all history? Or would a limited set, like the last week of posts only work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: database enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants