oushudb backup restore

Overview

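OushuDB provides the oushudb backup and oushudb restore commands for backing up and restoring all databases or selected databases, schemas, and tables. The commands use pg_dump to export metadata, while the compute nodes export table data to HDFS in parallel. Incremental backup is supported, as are the ORC and Hudi table formats.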

Command usage

The oushudb backup command

oushudb backup [-h] --backup-url BACKUP_URL [--include-object [INCLUDE_OBJECT ...]] [--incremental] [--with-global] [--schema-only]
                      [--jobs JOBS] [--verbose] [--dbname DBNAME] [--host HOST] [--database DATABASE] [--port PORT] [--username USERNAME]
                      [--no-password] [--password]

optional arguments:
  -h, --help            show this help message and exit
  --backup-url BACKUP_URL
                        the URL to backup, e.g. hdfs://host:port/path
  --include-object [INCLUDE_OBJECT ...]
                        include object in backup, e.g. <db>, <db.schema>, <db.schema.table>
  --incremental         perform incremental backup
  --with-global         include global objects in backup, including roles, tablespaces
  --schema-only         only backup schema
  --jobs JOBS           number of jobs to run in parallel
  --verbose             verbose output
  --dbname DBNAME       connect using connection string
  --host HOST           database server host or socket directory
  --database DATABASE   alternative default database
  --port PORT           database server port
  --username USERNAME   connect as specified database user
  --no-password         never prompt for password
  --password            force password prompt (should happen automatically)

The oushudb restore command

oushudb restore [-h] --backup-url BACKUP_URL [--include-object [INCLUDE_OBJECT ...]] [--timestamp TIMESTAMP] [--with-global]
                       [--schema-only] [--create] [--clean] [--jobs JOBS] [--show-backup] [--on-error-continue] [--verbose] [--dbname DBNAME]
                       [--host HOST] [--database DATABASE] [--port PORT] [--username USERNAME] [--no-password] [--password]

optional arguments:
  -h, --help            show this help message and exit
  --backup-url BACKUP_URL
                        the URL to backup, e.g. hdfs://host:port/path
  --include-object [INCLUDE_OBJECT ...]
                        include object in backup, e.g. <db>, <db.schema>, <db.schema.table>
  --timestamp TIMESTAMP
                        restore from the specified timestamp, e.g. 20250101000000, you can get the timestamp by list backup directory
  --with-global         restore with global objects, including roles, tablespaces
  --schema-only         only restore schema
  --create              create the databases
  --clean               clean objects before restore
  --jobs JOBS           number of jobs to run in parallel
  --show-backup         show backup-set timestamps, if a timestamp is specified, show the backup objects of the timestamp
  --on-error-continue   continue restore on sql errors
  --verbose             verbose output
  --dbname DBNAME       connect using connection string
  --host HOST           database server host or socket directory
  --database DATABASE   alternative default database
  --port PORT           database server port
  --username USERNAME   connect as specified database user
  --no-password         never prompt for password
  --password            force password prompt (should happen automatically)

Viewing backup sets

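Each full backup made by oushudb backup creates a new directory, and each incremental backup creates a new subdirectory under the full backup's directory, giving the structure shown below. Use oushudb restore --show-backup --backup-url <url> to list the timestamps of past backups. When restoring, specify one of these timestamps to bring the target databases and tables back to that point in time.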

hdfs://nameservice/backupdir/
├── base_timestamp1/
│   ├── timestamp1/
│   ├── timestamp2/
│   └── timestamp3/
└── base_timestamp4/
    ├── timestamp4/
    ├── timestamp5/
    └── timestamp6/
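
When scripting a restore, it can help to pick the newest timestamp automatically. The following is a minimal sketch, not part of the oushudb tooling. It assumes that timestamps appear as 14-digit YYYYMMDDHHMMSS tokens somewhere in the text it reads; the exact layout of the --show-backup output is not described here, so the helper name latest_timestamp and the parsing rule are illustrative assumptions.

```shell
# Pick the newest backup timestamp from text read on stdin.
# Assumption: every timestamp is a 14-digit YYYYMMDDHHMMSS token, so a
# plain lexicographic sort is also a chronological sort.
latest_timestamp() {
    grep -Eo '[0-9]{14}' | sort | tail -n 1
}

# Possible use (requires a working oushudb installation):
#   ts=$(oushudb restore --show-backup --backup-url "$url" | latest_timestamp)
```

If the input contains no 14-digit token, the function prints nothing, so a caller should check for an empty result before passing it to --timestamp.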

Notes

  • oushudb backup/restore does not currently support magma tables.

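  • When restoring an entire cluster, you must create vc and resqueue objects with the same names in advance. The only global objects included in a backup are roles and tablespaces.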

Examples

Full backup and incremental backup

oushudb backup --backup-url hdfs://localhost:8020/backupdir
oushudb backup --incremental --backup-url hdfs://localhost:8020/backupdir
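
Full and incremental backups are often combined on a schedule. The sketch below assumes a weekly full backup on Sunday and incremental backups on the other days. That schedule and the helper name backup_command_for_day are illustrative assumptions, not OushuDB recommendations. The function only prints the command so that it can be reviewed before being run.

```shell
# Print the backup command for an ISO weekday (1 = Monday ... 7 = Sunday).
# Assumption: full backup on Sunday, incremental backup on all other days.
backup_command_for_day() {
    day=$1
    url=$2
    if [ "$day" -eq 7 ]; then
        echo "oushudb backup --backup-url $url"
    else
        echo "oushudb backup --incremental --backup-url $url"
    fi
}

# Possible use with today's weekday:
#   backup_command_for_day "$(date +%u)" hdfs://localhost:8020/backupdir
```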

View backup sets

oushudb restore --show-backup --backup-url hdfs://localhost:8020/backupdir

Restore to a specified timestamp

oushudb restore --backup-url hdfs://localhost:8020/backupdir --timestamp 20250124161224
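
A mistyped timestamp is easy to miss. As a hedged sketch, the helper below checks a timestamp and prints it in a readable form before it is passed to --timestamp. It assumes the YYYYMMDDHHMMSS layout suggested by the examples (such as 20250124161224); format_timestamp is a hypothetical helper name and is not part of oushudb.

```shell
# Convert a backup timestamp to a readable form, or fail if it is malformed.
# Assumption: timestamps are exactly 14 digits in YYYYMMDDHHMMSS order.
format_timestamp() {
    if ! printf '%s\n' "$1" | grep -Eq '^[0-9]{14}$'; then
        echo "invalid timestamp: $1" >&2
        return 1
    fi
    # Split the digits into date and time fields.
    printf '%s\n' "$1" |
        sed -E 's/^(.{4})(.{2})(.{2})(.{2})(.{2})(.{2})$/\1-\2-\3 \4:\5:\6/'
}

# format_timestamp 20250124161224   prints: 2025-01-24 16:12:24
```

The check covers only the shape of the value; it does not confirm that a backup with that timestamp exists under the backup URL.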

Back up and restore only selected objects. For example, back up schema s1 of db1 and schema s2 of db2, then restore only table t1 in s1

oushudb backup --include-object db1.s1 db2.s2 --backup-url hdfs://localhost:8020/backupdir
oushudb backup --include-object db1.s1 db2.s2 --incremental --backup-url hdfs://localhost:8020/backupdir

oushudb restore --include-object db1.s1.t1 --backup-url hdfs://localhost:8020/backupdir --timestamp 20250124161224
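
The values passed to --include-object take the form <db>, <db.schema>, or <db.schema.table>. A script that builds these values could validate them first, as in the sketch below. It assumes plain unquoted identifiers made of letters, digits, and underscores; quoted or otherwise unusual identifiers are out of scope, and is_valid_object is a hypothetical helper name.

```shell
# Succeed if the argument looks like <db>, <db.schema>, or <db.schema.table>.
# Assumption: each part is a simple identifier (letter or underscore first,
# then letters, digits, or underscores), with at most three dotted parts.
is_valid_object() {
    printf '%s\n' "$1" |
        grep -Eq '^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}$'
}

# Possible use:
#   is_valid_object db1.s1.t1 && echo "ok"
```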

Restore the entire cluster, including roles and tablespaces

oushudb restore --with-global --backup-url hdfs://localhost:8020/backupdir --timestamp 20250124161224

Automatically create missing databases during restore

oushudb restore --create --backup-url hdfs://localhost:8020/backupdir --timestamp 20250124161224

Drop existing tables during restore

oushudb restore --clean --backup-url hdfs://localhost:8020/backupdir --timestamp 20250124161224
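
Because --clean removes existing objects before restoring, a wrapper script may want an explicit confirmation step. The following sketch prints the restore command only when a confirmation variable is set. CONFIRM_CLEAN and clean_restore_command are hypothetical names invented for this example and are not part of oushudb.

```shell
# Print a restore command that uses --clean only when explicitly confirmed.
# Assumption: CONFIRM_CLEAN is a variable defined by this sketch alone.
clean_restore_command() {
    url=$1
    ts=$2
    if [ "${CONFIRM_CLEAN:-no}" != "yes" ]; then
        echo "refusing --clean: set CONFIRM_CLEAN=yes to proceed" >&2
        return 1
    fi
    echo "oushudb restore --clean --backup-url $url --timestamp $ts"
}

# Possible use:
#   CONFIRM_CLEAN=yes
#   clean_restore_command hdfs://localhost:8020/backupdir 20250124161224
```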