GSoC Week1

Posted on 2022-06-20 Updated on 2022-07-23

Date: June 13, 2022 → June 19, 2022

Overview This week’s tasks

Review https://github.com/casbin/casbin-rs/pull/293
Open 2 issues & 2 PRs
Explore Swagger code generators and explain why they don’t work

Task 1 Review PR

https://github.com/casbin/casbin-rs/pull/293

Original implementation

The original implementation uses Arc<RwLock<Role>>:

pub struct DefaultRoleManager {
    all_domains: HashMap<String, HashMap<String, Arc<RwLock<Role>>>>,
    //...
}
pub struct Role {
    name: String,
    roles: Vec<Arc<RwLock<Role>>>,
}

The key of the outer HashMap<String, HashMap<String, Arc<RwLock<Role>>>> is the domain name (the default value is DEFAULT), and the key of the inner HashMap<String, Arc<RwLock<Role>>> is the role name. Each entry is an Arc<RwLock<Role>> that stores the role’s information: its name and all roles it directly has.

This is quite expensive, because Arc<T> uses atomic operations for its reference counting, and the current implementation performs many read() and write() operations on RwLock<Role>:

//L185
role1.write().delete_role(role2);

//L332
fn has_direct_role(&self, name: &str) -> bool {
    self.roles.iter().any(|role| role.read().name == name)
}
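To make the cost concrete, here is a minimal sketch of the same shape using only the standard library. It is an illustration, not the casbin-rs code: the helper names new_shared and add_link are hypothetical, and std::sync::RwLock is assumed here, which is why the lock calls need unwrap() while the excerpt above does not.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the original `Role`; not the casbin-rs type.
struct Role {
    name: String,
    roles: Vec<Arc<RwLock<Role>>>,
}

impl Role {
    /// Hypothetical helper: wraps a new role in `Arc<RwLock<..>>`.
    fn new_shared(name: &str) -> Arc<RwLock<Role>> {
        Arc::new(RwLock::new(Role {
            name: name.to_string(),
            roles: Vec::new(),
        }))
    }

    /// Takes one read lock per direct role just to compare names.
    fn has_direct_role(&self, name: &str) -> bool {
        self.roles
            .iter()
            .any(|role| role.read().unwrap().name == name)
    }
}

/// Linking takes a write lock on `role1` and bumps the atomic
/// reference count of `role2` through `Arc::clone`.
fn add_link(role1: &Arc<RwLock<Role>>, role2: &Arc<RwLock<Role>>) {
    role1.write().unwrap().roles.push(Arc::clone(role2));
}
```

Even a simple name lookup goes through a lock acquisition for every direct role, and every link adds an atomic increment.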

Important functions:

add_link appends role2 to role1.roles
delete_link deletes role2 from role1.roles
has_link checks whether role name1 has role name2 (it doesn’t have to be a direct role); if domain_matching_fn is specified, it looks for roles in the matched domains
get_roles gets all direct roles that user name has in the matched domains
get_users gets all users that directly have role name in the matched domains

Novel implementation

This PR introduces 2 new crates:

https://github.com/petgraph/petgraph a graph data structure library; its StableDiGraph is used to store roles
https://github.com/petgraph/fixedbitset a simple bitset container, used in the BFS to mark visited nodes

This PR’s implementation is similar to the one in Go Casbin (https://github.com/casbin/casbin/blob/master/rbac/default-role-manager/role_manager.go):

type Role struct {
	name      string
	roles     *sync.Map
	users     *sync.Map
	matched   *sync.Map
	matchedBy *sync.Map
}

But instead of using maps that store pointers to related roles, it uses StableDiGraph to build a graph of roles, with EdgeVariant edges representing the relations:

enum EdgeVariant {
    Link,
    Match,
}

Edges with EdgeVariant::Link are the relations that Go models with the roles and users maps.
Edges with EdgeVariant::Match are the relations that Go models with the matched and matchedBy maps.

This PR gets rid of Arc<RwLock<Role>> by using StableDiGraph and NodeIndex:

pub struct DefaultRoleManager {
    all_domains: HashMap<String, StableDiGraph<String, EdgeVariant>>,
    all_domains_indices: HashMap<String, HashMap<String, NodeIndex<u32>>>,
    //...
}

StableGraph<N, E, Ty, Ix> is a graph data structure that uses an adjacency list representation. all_domains holds one StableGraph with directed edges for each domain, and all_domains_indices stores the NodeIndex identifier of every role node in each graph.

Important functions:

get_or_create_role

Gets or creates a role.

If the role is new, calls graph.add_node() to add it to the graph.

If role_matching_fn is specified, calls link_if_matches to match existing roles against the new role and vice versa; on a match, calls graph.add_edge() to create an EdgeVariant::Match edge between the roles.

add_link

Adds a link from role1 to role2 by calling graph.add_edge() to add an EdgeVariant::Link edge.

delete_link

Removes the edge from role1 to role2.

has_link

Runs a BFS over the graph to check whether role1 is connected to role2.

get_roles

Runs a BFS over the graph to get all roles the user directly has.

get_users

Finds all nodes with a Direction::Incoming edge connected to this role, that is, all users that directly have this role.
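The functions above can be sketched with plain indices. This is a std-only illustration under stated assumptions, not the PR’s code: RoleGraph is a hypothetical stand-in for one domain’s StableDiGraph plus its index map, only Link edges are modeled (no Match edges or matching functions), and a Vec<bool> plays the part of the fixedbitset visited set.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for one domain's role graph.
#[derive(Default)]
struct RoleGraph {
    names: Vec<String>,              // node weights, indexed by node id
    indices: HashMap<String, usize>, // role name -> node id
    out_edges: Vec<Vec<usize>>,      // adjacency list of Link edges
}

impl RoleGraph {
    /// Returns the node id for `name`, adding a node if it is new.
    fn get_or_create_role(&mut self, name: &str) -> usize {
        if let Some(&idx) = self.indices.get(name) {
            return idx;
        }
        let idx = self.names.len();
        self.names.push(name.to_string());
        self.out_edges.push(Vec::new());
        self.indices.insert(name.to_string(), idx);
        idx
    }

    /// Adds a directed edge from `name1` to `name2`.
    fn add_link(&mut self, name1: &str, name2: &str) {
        let a = self.get_or_create_role(name1);
        let b = self.get_or_create_role(name2);
        if !self.out_edges[a].contains(&b) {
            self.out_edges[a].push(b);
        }
    }

    /// Removes the edge from `name1` to `name2`, if both roles exist.
    fn delete_link(&mut self, name1: &str, name2: &str) {
        if let (Some(&a), Some(&b)) = (self.indices.get(name1), self.indices.get(name2)) {
            self.out_edges[a].retain(|&n| n != b);
        }
    }

    /// Breadth-first search from `name1`, looking for `name2`.
    fn has_link(&self, name1: &str, name2: &str) -> bool {
        if name1 == name2 {
            return true;
        }
        let (start, target) = match (self.indices.get(name1), self.indices.get(name2)) {
            (Some(&s), Some(&t)) => (s, t),
            _ => return false,
        };
        let mut visited = vec![false; self.names.len()];
        let mut queue = VecDeque::from([start]);
        visited[start] = true;
        while let Some(node) = queue.pop_front() {
            if node == target {
                return true;
            }
            for &next in &self.out_edges[node] {
                if !visited[next] {
                    visited[next] = true;
                    queue.push_back(next);
                }
            }
        }
        false
    }

    /// Names of all nodes with an edge pointing at `name`.
    fn get_users(&self, name: &str) -> Vec<&str> {
        let Some(&idx) = self.indices.get(name) else {
            return Vec::new();
        };
        (0..self.names.len())
            .filter(|&n| self.out_edges[n].contains(&idx))
            .map(|n| self.names[n].as_str())
            .collect()
    }
}
```

For example, after add_link("alice", "editor") and add_link("editor", "viewer"), has_link("alice", "viewer") is true and get_users("editor") yields alice. No node is ever wrapped in a lock or a reference count; nodes only refer to each other by index.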

Why did performance improve?

In StableGraph, nodes (roles) and edges (relations) are each numbered in an interval from 0 to some number m, so we can access nodes and edges by index. Creating a role just adds a node to the graph, and linking roles just adds an edge between 2 nodes. Existing nodes and edges never have to be modified, so Arc and RwLock are no longer needed.

Without atomic operations and locking/unlocking, the role manager is faster, as the benchmark results below show.

Tests

The PR adds 2 tests:

test_basic_role_matching tests a user with the wildcard *
test_basic_role_matching2 tests a role with the wildcard *

I ported these tests to Go Casbin:

func TestBasicRoleMatching(t *testing.T) {
	rm := NewRoleManager(10)
	rm.AddMatchingFunc("keyMatch", util.KeyMatch)

	_ = rm.AddLink("bob", "book_group")
	_ = rm.AddLink("*", "book_group")
	_ = rm.AddLink("*", "pen_group")
	_ = rm.AddLink("eve", "pen_group")

	testRole(t, rm, "alice", "book_group", true)
	testRole(t, rm, "eve", "book_group", true)
	testRole(t, rm, "bob", "book_group", true)
	testPrintRoles(t, rm, "alice", []string{"book_group", "pen_group"})
}

func TestBasicRoleMatching2(t *testing.T) {
	rm := NewRoleManager(10)
	rm.AddMatchingFunc("keyMatch", util.KeyMatch)

	_ = rm.AddLink("alice", "book_group")
	_ = rm.AddLink("alice", "*")
	_ = rm.AddLink("bob", "pen_group")

	testRole(t, rm, "alice", "book_group", true)
	testRole(t, rm, "alice", "pen_group", true)
	testRole(t, rm, "bob", "pen_group", true)
	testRole(t, rm, "bob", "book_group", false)
	testPrintRoles(t, rm, "alice", []string{"*", "alice", "bob", "book_group", "pen_group"})
	testPrintUsers(t, rm, "*", []string{"alice"})
}

TestBasicRoleMatching passed, but TestBasicRoleMatching2 failed with:

role_manager_test.go:371: alice: [* book_group alice book_group bob pen_group], supposed to be [* alice bob book_group pen_group]

Note that book_group appears twice. This is because of (*Role).rangeRoles:

func (r *Role) rangeRoles(fn func(key, value interface{}) bool) {
	r.roles.Range(fn)
	r.roles.Range(func(key, value interface{}) bool {
		role := value.(*Role)
		role.matched.Range(fn)
		return true
	})
	r.matchedBy.Range(func(key, value interface{}) bool {
		role := value.(*Role)
		role.roles.Range(fn)
		return true
	})
}

All roles, matched roles, and matchedBy roles are appended to the result, which is a list, so a role reachable in more than one way is added more than once. casbin-rs uses a HashSet instead, so there are no duplicates in casbin-rs.
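The difference comes down to the collection type. In the sketch below, the function names and the two input slices are hypothetical; they only mimic a role that is reachable both directly and through a match.

```rust
use std::collections::HashSet;

/// Appends every visited role to a list, so a role reachable both
/// directly and through a match is reported twice.
fn collect_into_list(direct: &[&str], via_match: &[&str]) -> Vec<String> {
    direct
        .iter()
        .chain(via_match.iter())
        .map(|r| r.to_string())
        .collect()
}

/// Collects the same roles into a set, which drops the duplicates.
fn collect_into_set(direct: &[&str], via_match: &[&str]) -> HashSet<String> {
    direct
        .iter()
        .chain(via_match.iter())
        .map(|r| r.to_string())
        .collect()
}
```

With direct = ["*", "book_group"] and via_match = ["book_group", "pen_group"], the list has 4 entries while the set has 3.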

Benchmark

Benchmark changes from GitHub workflow:

https://github.com/casbin/casbin-rs/runs/6409294114

group                                 changes                                master
-----                                 -------                                ------
b_benchmark_rbac_model_large          1.00     15.7±0.59ms        ? ?/sec    1.26     19.7±0.85ms        ? ?/sec
benchmark priority model              1.02      9.3±0.49µs        ? ?/sec    1.00      9.1±0.34µs        ? ?/sec
benchmark_abac_model                  1.01      5.4±0.32µs        ? ?/sec    1.00      5.3±0.55µs        ? ?/sec
benchmark_basic_model                 1.00      8.4±0.45µs        ? ?/sec    1.00      8.4±0.49µs        ? ?/sec
benchmark_key_match                   1.00     30.6±2.74µs        ? ?/sec    1.02     31.2±1.97µs        ? ?/sec
benchmark_raw                         1.01      5.0±0.18ns        ? ?/sec    1.00      5.0±0.16ns        ? ?/sec
benchmark_rbac_model                  1.00     12.5±0.65µs        ? ?/sec    1.00     12.4±0.95µs        ? ?/sec
benchmark_rbac_model_medium           1.00  1380.1±59.72µs        ? ?/sec    1.12  1551.8±91.11µs        ? ?/sec
benchmark_rbac_model_with_domains     1.00     13.0±0.54µs        ? ?/sec    1.02     13.2±0.85µs        ? ?/sec
benchmark_rbac_with_deny              1.00     17.0±1.32µs        ? ?/sec    1.00     17.0±1.17µs        ? ?/sec
benchmark_rbac_with_resource_roles    1.00      9.9±1.04µs        ? ?/sec    1.01     10.0±0.66µs        ? ?/sec
benchmark_role_manager_large          1.04      8.3±0.26ms        ? ?/sec    1.00      8.0±0.83ms        ? ?/sec
benchmark_role_manager_medium         1.00   487.8±22.79µs        ? ?/sec    1.31   640.4±41.89µs        ? ?/sec
benchmark_role_manager_small          1.00    142.5±7.47µs        ? ?/sec    1.09    155.4±9.60µs        ? ?/sec
┌─────────┬──────────────────────────────────────┬──────────────────────┬──────────────────┬────────────┐
│ (index) │                 name                 │   changesDuration    │  masterDuration  │ difference │
├─────────┼──────────────────────────────────────┼──────────────────────┼──────────────────┼────────────┤
│    0    │    'b_benchmark_rbac_model_large'    │  '**15.7±0.59ms**'   │  '19.7±0.85ms'   │   '-21'    │
│    1    │      'benchmark priority model'      │     '9.3±0.49µs'     │ '**9.1±0.34µs**' │   '+2.0'   │
│    2    │        'benchmark_abac_model'        │     '5.4±0.32µs'     │ '**5.3±0.55µs**' │   '+1.0'   │
│    3    │       'benchmark_basic_model'        │     '8.4±0.45µs'     │   '8.4±0.49µs'   │   '0.0'    │
│    4    │        'benchmark_key_match'         │  '**30.6±2.74µs**'   │  '31.2±1.97µs'   │   '-2.0'   │
│    5    │           'benchmark_raw'            │     '5.0±0.18ns'     │ '**5.0±0.16ns**' │   '+1.0'   │
│    6    │        'benchmark_rbac_model'        │    '12.5±0.65µs'     │  '12.4±0.95µs'   │   '0.0'    │
│    7    │    'benchmark_rbac_model_medium'     │ '**1380.1±59.72µs**' │ '1551.8±91.11µs' │   '-11'    │
│    8    │ 'benchmark_rbac_model_with_domains'  │  '**13.0±0.54µs**'   │  '13.2±0.85µs'   │   '-2.0'   │
│    9    │      'benchmark_rbac_with_deny'      │    '17.0±1.32µs'     │  '17.0±1.17µs'   │   '0.0'    │
│   10    │ 'benchmark_rbac_with_resource_roles' │   '**9.9±1.04µs**'   │  '10.0±0.66µs'   │  '-0.99'   │
│   11    │    'benchmark_role_manager_large'    │     '8.3±0.26ms'     │ '**8.0±0.83ms**' │   '+4.0'   │
│   12    │   'benchmark_role_manager_medium'    │ '**487.8±22.79µs**'  │ '640.4±41.89µs'  │   '-24'    │
│   13    │    'benchmark_role_manager_small'    │  '**142.5±7.47µs**'  │  '155.4±9.60µs'  │   '-8.3'   │
│   14    │                  ''                  │      undefined       │    undefined     │   '+NaN'   │
└─────────┴──────────────────────────────────────┴──────────────────────┴──────────────────┴────────────┘

Improved by more than 5%:

b_benchmark_rbac_model_large -21%
benchmark_rbac_model_medium -11%
benchmark_role_manager_medium -24%
benchmark_role_manager_small -8.3%

Regressed by more than 2%:

benchmark_role_manager_large +4.0%
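The difference column is a relative change against master. A minimal sketch of that arithmetic follows; percent_change is a hypothetical helper, not part of the workflow. The workflow presumably computes from unrounded measurements, so values recomputed from the rounded means in the table can differ slightly from the reported ones.

```rust
/// Relative change of `changes` against `master`, in percent.
/// Negative values mean the PR branch is faster than master.
fn percent_change(changes: f64, master: f64) -> f64 {
    (changes - master) / master * 100.0
}
```

For benchmark_role_manager_medium, percent_change(487.8, 640.4) is about -23.8, which rounds to the -24 reported above.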

I also found that the benchmark workflow doesn’t work properly:

https://github.com/casbin/casbin-rs/runs/6409294114#step:5:459

[Screenshot: workflow log]

It fails to post the benchmark result as a comment on the current PR, with status code 403.

Task 2 Open 2 issues & 2 PRs

Issue: the benchmark workflow is unable to post the benchmark result as a PR comment

https://github.com/casbin/casbin-rs/issues/294

Issue: as mentioned above, while reviewing the PR I found a bug in the Go version of Casbin. In default-role-manager/role_manager.go, func (r *Role) getRoles() returns a result with duplicate items

https://github.com/casbin/casbin/issues/1033

PR: running cargo clippy -- -D warnings reported 3 errors, which I fixed

https://github.com/casbin/casbin-rs/pull/296

PR: there was a lot of duplicate code in default_role_manager, which made the code hard to understand and maintain, so I refactored it

https://github.com/casbin/casbin-rs/pull/295

Task 3 Swagger code generator

There are currently 105 endpoints in Casdoor, so developing the Casdoor Rust SDK by hand means writing a lot of similar, repetitive code. To avoid this, I first looked for a code generator.

As Casdoor provides a well-annotated Swagger spec (https://github.com/casdoor/casdoor/blob/master/swagger/swagger.yml), I can generate code with a Swagger code generator.

First, I found swagger-codegen.

https://github.com/swagger-api/swagger-codegen

It can generate Rust code with 2 different implementations (rust and rust-server).

I used the Docker image to generate the code.

I tried the first implementation:

docker run -u 1000:1000 --rm -v ${PWD}:/local swaggerapi/swagger-codegen-cli generate \
    -i /local/swagger.yml \
    -l rust \
    -o /local/out/rust

The code is generated very quickly. The output looks like this:

[Screenshot: generator output]

However, when running cargo build, I got many warnings like:

[Screenshot: cargo build warnings]

After searching the web for a while, I found a workaround (https://stackoverflow.com/a/57641467): add #![allow(warnings)] as the first line of src/lib.rs.

Now cargo build no longer prints warnings.

[Screenshot: cargo build output without warnings]

However, this doesn’t help much, because the warnings are only hidden: the code still contains a lot of deprecated code and many badly named variables.
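To show what the attribute does, the sketch below applies it to an inline module instead of a whole crate, so that it fits in one file. The module and function here are hypothetical and only imitate the naming style that triggers such warnings; they are not taken from the generated code.

```rust
mod generated {
    // Same attribute as in the workaround; as an inner attribute of this
    // module it silences every lint warning inside the module only.
    #![allow(warnings)]

    // Without the attribute, rustc would warn about the non-snake-case
    // names and the unused variable below.
    pub fn GetUserName(userId: u32) -> String {
        let unusedValue = 0;
        format!("user-{}", userId)
    }
}
```

The code compiles cleanly, but nothing about it has been fixed: the names and the dead variable are still there.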

So I tried the second implementation:

docker run -u 1000:1000 --rm -v ${PWD}:/local swaggerapi/swagger-codegen-cli generate \
    -i /local/swagger.yml \
    -l rust-server \
    -o /local/out/rust-server

Now cargo build fails with:

[Screenshot: cargo build error]

After searching the rust-openssl issues for a while, I found one similar to mine:

https://github.com/sfackler/rust-openssl/issues/1436

It says that rust-openssl v0.9.24 is too old and doesn’t support OpenSSL 1.1.1.

My OpenSSL version is 1.1.1:

[Screenshot: openssl version output]

It appears that swagger-codegen has not been maintained for a while, so we should find another code generator.

Then I found openapi-generator:

https://github.com/OpenAPITools/openapi-generator

The usage is similar to swagger-codegen:

docker run -u 1000:1000 --rm -v "${PWD}:/local" openapitools/openapi-generator-cli generate \
    -i /local/swagger.yml \
    -g rust \
    -o /local/out/openapi-rust

The code generator fails with:

[Screenshot: openapi-generator error]

It says that spec validation fails because some endpoints in swagger.yml don’t define a response. For example, the GetResources endpoint is not well annotated (no params, no response):

[Screenshot: GetResources endpoint]

Then I found paperclip:

https://github.com/paperclip-rs/paperclip

It’s a work-in-progress OpenAPI toolset written in Rust that can also generate Rust code.

paperclip --api v2 -o out/paper swagger.yml

Again, because several endpoints are not well annotated, it fails with an error:

[Screenshot: paperclip error]

Also, the generated code can’t offer an API similar to that of the Go version of the Casdoor SDK:

https://github.com/casdoor/casdoor-go-sdk

After exploring these code generators, I gave up and decided to write the code on my own.

Summary:

generated code is difficult to maintain
generated code is outdated and uses deprecated syntax
generated code is badly written and produces many warnings
generated code can’t offer an API similar to that of the Go version of the Casdoor SDK
swagger.yml is incomplete (some endpoints have no response), so some code generators report errors

Next week’s plan

Next week, I will:

start writing code for casdoor-rust-sdk
learn how middleware works and contribute to poem-casbin
maintain projects of casbin-rs
