GitLab Compatibility on CockroachDB and YugabyteDB (Part 1): System Initialization

Test Background

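GitLab is a source code management tool that is popular worldwide. Early versions let users choose between MySQL and PostgreSQL, but starting with version 12.1.0 GitLab dropped MySQL support entirely.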

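Many features in newer GitLab versions are built on PostgreSQL-specific capabilities, which makes GitLab a flagship example of a product that uses PostgreSQL as its underlying data store.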

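Consider the following scenario: a large corporate group is split into many business units, and each unit, or even each small team, may maintain its own GitLab instance. Managing all of these repositories at the group level becomes a thorny problem, for example:

  • Version fragmentation (open-source vs. commercial editions, newer vs. older versions)
  • Fine-grained access control
  • Data backup
  • Infrastructure utilization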

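A single unified GitLab environment with good scalability and high availability would be the ideal solution. A traditional single-node PostgreSQL database cannot meet these requirements, so could GitLab run on a distributed database instead?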

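CockroachDB and YugabyteDB are two of the better-known new open-source distributed databases that implement the PostgreSQL protocol. According to their official websites: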

CockroachDB supports the PostgreSQL wire protocol and the majority of PostgreSQL syntax. This means that existing applications built on PostgreSQL can often be migrated to CockroachDB without changing application code. (Source: CockroachDB official website)

YugabyteDB is a high-performance, cloud-native distributed SQL database that aims to support all PostgreSQL features. (Source: YugabyteDB official website)

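CockroachDB claims to support the majority of PostgreSQL syntax, while YugabyteDB aims to support all PostgreSQL features. This series of evaluation articles compares how well the two databases support GitLab, which to some extent also reflects their compatibility with standard PostgreSQL.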

Test Environment

  • CockroachDB
defaultdb=# select version();
                                         version
-----------------------------------------------------------------------------------------
 CockroachDB CCL v21.2.2 (x86_64-unknown-linux-gnu, built 2021/12/01 14:35:45, go1.16.6)
(1 row)
  • YugabyteDB
postgres=# select version();
                                                  version
------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.9.1.0-b0 on x86_64-pc-linux-gnu, compiled by gcc (Homebrew gcc 5.5.0_4) 5.5.0, 64-bit
(1 row)
  • GitLab
GitLab information
Version:        12.1.0-ee
Revision:       1f2e6f3f6d8
Directory:      /home/git/gitlab
DB Adapter:     PostgreSQL

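A GitLab instance deployed on standard PostgreSQL contains the following objects in its database schema (relkind r = table, i = index, S = sequence):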

gitlab_production=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 r       |   249
 i       |   903
 S       |   231
(3 rows)

CockroachDB Startup Process

1. Database initialization

Run the GitLab setup task to generate the required database schema:

dc@dc-virtual-machine:/home/git/gitlab$ sudo -u git -H bundle exec rake gitlab:setup RAILS_ENV=production
This will create the necessary database tables and seed the database.
You will lose any previous data stored in the database.
Do you want to continue (yes/no)? yes
 
Dropped database 'gitlab'
Created database 'gitlab'
-- enable_extension("pg_trgm")
rake aborted!
ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR:  unimplemented: extension "pg_trgm" is not yet supported
HINT:  You have attempted to use a feature that is not yet implemented.
See: https://go.crdb.dev/issue-v/51137/v21.2
: CREATE EXTENSION IF NOT EXISTS "pg_trgm"
/home/git/gitlab/config/initializers/peek.rb:18:in `async_exec_params'
/home/git/gitlab/config/initializers/peek.rb:18:in `exec_params'
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:611:in `block (2 levels) in exec_no_cache'
....

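The output above shows that GitLab initialization depends on PostgreSQL's extension feature (here the pg_trgm extension), which CockroachDB unfortunately does not support yet. Setup fails at the very first step, and at this point no objects have been created in the database: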

gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
Empty set

2. Accessing GitLab

Visiting the GitLab home page returns a 502 error.


The logs show that the error occurs because a SQL statement cannot find its target table:

ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR:  relation "geo_nodes" does not exist
:               SELECT a.attname, format_type(a.atttypid, a.atttypmod),
                     pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
                     c.collname, col_description(a.attrelid, a.attnum) AS comment
                FROM pg_attribute a
                LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                LEFT JOIN pg_type t ON a.atttypid = t.oid
                LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
               WHERE a.attrelid = '"geo_nodes"'::regclass
                 AND a.attnum > 0 AND NOT a.attisdropped
               ORDER BY a.attnum

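3. Upgrading the database version

The CockroachDB version tested above is not the latest, so a newer release might already support extensions. We upgraded to latest-v22.1 to find out: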

defaultdb=# select version();
                                      version
------------------------------------------------------------------------------------
 CockroachDB CCL v22.1.0 (x86_64-pc-linux-gnu, built 2022/05/23 16:27:47, go1.17.6)
(1 row)

Running setup again to create the database produces the same error: ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR: unimplemented: extension "pg_trgm" is not yet supported. The new version therefore does not support the extension feature either.
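
When the same setup is repeated across several database versions, it can help to pull the PostgreSQL error class out of the rake output automatically rather than comparing messages by eye. The following is a minimal Python sketch and not part of GitLab; the function name extract_pg_error and the regular expression are assumptions based only on the error lines shown in this post.

```python
import re

# Matches lines such as:
#   ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR:  unimplemented: ...
# The pattern is an assumption derived from the log lines quoted in this post.
PG_ERROR = re.compile(r"PG::(?P<error_class>\w+): ERROR:\s+(?P<message>.*)")

# Excerpt of the rake output from the CockroachDB run above.
SAMPLE_OUTPUT = """\
-- enable_extension("pg_trgm")
rake aborted!
ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR:  unimplemented: extension "pg_trgm" is not yet supported
HINT:  You have attempted to use a feature that is not yet implemented.
"""


def extract_pg_error(rake_output):
    """Return (error_class, message) for the first PG error found, or None."""
    for line in rake_output.splitlines():
        match = PG_ERROR.search(line)
        if match:
            return match.group("error_class"), match.group("message").strip()
    return None


if __name__ == "__main__":
    print(extract_pg_error(SAMPLE_OUTPUT))
```

Running the helper on the output of each database version makes it easy to see whether an upgrade changed the failure or merely reproduced it.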

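YugabyteDB Startup Process

1. Database initialization

Edit the GitLab configuration file to point the database connection at YugabyteDB, then initialize a new database in the same way: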

dc@dc-virtual-machine:/home/git/gitlab$ sudo -u git -H bundle exec rake gitlab:setup RAILS_ENV=production
This will create the necessary database tables and seed the database.
You will lose any previous data stored in the database.
Do you want to continue (yes/no)? yes
 
Dropped database 'gitlab'
Created database 'gitlab'
-- enable_extension("pg_trgm")
   -> 2.5496s
-- enable_extension("plpgsql")
   -> 0.1143s
-- create_table("abuse_reports", {:id=>:serial, :force=>:cascade})
   -> 0.3709s
-- create_table("appearances", {:id=>:serial, :force=>:cascade})
   -> 0.3022s
...
...
-- create_table("issue_tracker_data", {:force=>:cascade})
   -> 3.7627s
-- create_table("issues", {:id=>:serial, :force=>:cascade})
rake aborted!
ActiveRecord::StatementInvalid: PG::InternalError: ERROR:  index method "ybgin" not supported yet
HINT:  See https://github.com/YugaByte/yugabyte-db/issues/1337. Click '+' on the description to raise its priority
: CREATE  INDEX  "index_issues_on_description_trigram" ON "issues" USING gin ("description" gin_trgm_ops)
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'

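The output above shows that setup initially ran normally and created the extensions and tables without problems. After about 20 minutes it failed while creating an index: the statement requests a gin index, which YugabyteDB maps to its own ybgin index method, and this version does not support ybgin yet.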
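
To illustrate what the failing index operates on: the gin_trgm_ops operator class comes from the pg_trgm extension, which breaks text into three-character sequences (trigrams) so that similar strings can be matched through an index. The following Python sketch approximates that decomposition for plain ASCII text, following the padding rule described in the pg_trgm documentation (two spaces before each word, one after). It is a simplified illustration, not the extension's actual implementation, and the function names trigrams and similarity are chosen here for the example.

```python
import re


def trigrams(text):
    """Approximate pg_trgm's trigram set for ASCII text.

    Each lowercase alphanumeric word is padded with two leading spaces
    and one trailing space before three-character windows are taken.
    """
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(sorted(trigrams("cat")))
    print(similarity("word", "two words"))
```

An index over such trigram sets is what lets a database answer fuzzy or substring searches on a column like issues.description without scanning every row, which is presumably why the GitLab schema creates it.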

Here are the objects that had been created in the database by this point:

gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 S       |   113
 i       |   391
 r       |   117
(3 rows)

This looks somewhat better than CockroachDB, but it is still far short of the complete schema.
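
To put a number on how far the partial schema falls short, the two psql result sets can be parsed and compared. The following Python sketch uses the counts shown in this post; the helper names parse_relkind_counts and coverage are hypothetical and exist only for this example.

```python
# Sketch: compare pg_class object counts between the baseline PostgreSQL
# schema and the partially initialized YugabyteDB v2.9 schema.
# Both helper functions are hypothetical, not part of GitLab or psql.

BASELINE = """
 relkind | count
---------+-------
 r       |   249
 i       |   903
 S       |   231
(3 rows)
"""

YUGABYTE_V29 = """
 relkind | count
---------+-------
 S       |   113
 i       |   391
 r       |   117
(3 rows)
"""


def parse_relkind_counts(psql_output):
    """Parse 'relkind | count' rows from psql's aligned output into a dict."""
    counts = {}
    for line in psql_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Data rows have exactly two cells and a numeric second cell.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


def coverage(baseline, target):
    """Percentage of baseline objects present in target, per relkind."""
    return {
        kind: round(100.0 * target.get(kind, 0) / total, 1)
        for kind, total in baseline.items()
    }


if __name__ == "__main__":
    base = parse_relkind_counts(BASELINE)
    partial = parse_relkind_counts(YUGABYTE_V29)
    for kind, pct in coverage(base, partial).items():
        print(f"{kind}: {partial.get(kind, 0)}/{base[kind]} ({pct}%)")
```

The same comparison can be reused after an upgrade to confirm that the object counts match the baseline exactly.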

2. Accessing GitLab

The GitLab home page is still inaccessible at this point. The logs show that the error is caused by a missing target table:

source=rack-timeout id=7gatOugcqB8 timeout=60000ms state=ready
Started GET "/" for 10.3.74.126 at 2022-05-27 16:05:31 +0800
Processing by RootController#index as HTML
Completed 500 Internal Server Error in 78ms (ActiveRecord: 58.8ms | Elasticsearch: 0.0ms)
   
ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR:  relation "projects" does not exist
LINE 8:                WHERE a.attrelid = '"projects"'::regclass
                                          ^
:               SELECT a.attname, format_type(a.atttypid, a.atttypmod),
                     pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
                     c.collname, col_description(a.attrelid, a.attnum) AS comment
                FROM pg_attribute a
                LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                LEFT JOIN pg_type t ON a.atttypid = t.oid
                LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
               WHERE a.attrelid = '"projects"'::regclass
                 AND a.attnum > 0 AND NOT a.attisdropped
               ORDER BY a.attnum
):

3. Upgrading the database version

Likewise, we upgraded YugabyteDB to the latest version to see whether GIN index compatibility had been completed:

postgres=# select version();
                                                                                         version
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.13.2.0-b0 on x86_64-pc-linux-gnu, compiled by clang version 12.0.1 (https://github.com/yugabyte/llvm-project.git bdb147e675d8c87cee72cc1f87c4b82855977d94), 64-bit
(1 row)

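Running the setup task again went smoothly this time: after about 30 minutes it exited normally with no errors. The objects in the database now look like this: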

gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 S       |   231
 i       |   903
 r       |   249
(3 rows)

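The counts match the standard PostgreSQL database exactly. Opening the GitLab home page in a browser redirects automatically to the sign-in page, and the logs show no errors.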


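After the user registration form is filled in and submitted, the new user is registered successfully and redirected to the GitLab home page.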

At first glance, GitLab's functionality is unaffected by the database switch. More detailed tests will be presented in the next installment.

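Test Conclusions

1. CockroachDB v21.2 does not support PostgreSQL extensions (pg_trgm in this test), so GitLab cannot initialize its database and ultimately fails to start. The problem persists after upgrading to the latest version, v22.1.

2. YugabyteDB v2.9 does not support GIN indexes (generalized inverted indexes), so setup fails after creating part of the schema and GitLab likewise cannot start. After upgrading to the latest version, v2.13, the problem is resolved: the GitLab pages are accessible and users can register.

3. YugabyteDB supports the PostgreSQL extensions that GitLab requires; CockroachDB does not.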

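Next Steps

Next, we will try to bypass GitLab's schema generation step by importing a standard GitLab database that already contains data into CockroachDB and YugabyteDB. We will then pick a set of frequently used read and write scenarios and compare the compatibility of the two databases again.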

This test is really interesting.

Feel free to join the discussion~ :grinning: