
Contract Auditing and Formal Verification: CertiK and the Move Prover

CertiK (https://www.certik.com/) is a leading smart-contract auditing company in Web3, offering a suite of tools to secure the industry at scale. CertiK combines expert review, AI, and formal verification to secure contracts.

CertiK's formal verification of a smart contract proceeds roughly as follows:

  1. Specify the contract's requirements and desired properties in a formal language.
  2. Translate the contract code into a formal representation, such as a mathematical model or logic.
  3. Use an automated theorem prover or model checker to verify that the specification and properties hold.
  4. Repeat the process to find and fix any errors or deviations from the desired properties.

CertiK uses SMT solvers and Coq as its formal verification tools.

In most cases CertiK uses Coq. Coq is an interactive formal verification tool: it provides a formal language for writing mathematical definitions, executable algorithms, and theorems, together with an environment for the semi-interactive development of machine-checked proofs.

Coq proof assistant: https://yuque.antfin-inc.com/vl7kr4/st175p/lb1fnk

Move is a domain-specific programming language for writing smart contracts. It is used by several recently launched projects, including the Aptos, 0L, and Starcoin blockchains. The Move development environment includes a checker called the Move Prover.

The Move language has a specification sublanguage that lets programmers state desired properties of Move programs; in particular, the Move Prover follows the design-by-contract paradigm. Internally, the Move Prover translates a Move program and its specification into a mathematical model, which is then formally verified by the deductive verifier Boogie. Boogie uses SMT solvers such as Z3 and CVC4 as decision procedures. Once Boogie finishes the formal verification, the results are mapped back to the Move source level and shown to the programmer.

An SMT (satisfiability modulo theories) solver finds a satisfying assignment for a set of constraints, e.g. x - y = 3 and x + y = 5 give x = 4, y = 1.
Contract/software verification pipeline: source code -> model -> constraints -> solve for a counterexample to the constraints; if a solution exists the constraints are violated, if none exists they hold.
For a constraint x = y, verification solves for x != y; so the result of SMT-based verification is a counterexample together with the constraint it violates.
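The solve-for-a-counterexample idea can be illustrated with a brute-force sketch in Python (enumeration over a finite domain; a real SMT solver such as Z3 reasons symbolically instead):

```python
def find_counterexample(prop, domain):
    """Search a finite domain for an assignment that violates `prop`.

    Returns a violating (x, y) pair, or None if the property holds
    everywhere -- mirroring how an SMT-based verifier either reports a
    counterexample or declares the property satisfied.
    """
    for x in domain:
        for y in domain:
            if not prop(x, y):
                return (x, y)
    return None

# "x - y = 3 and x + y = 5 implies x = 4 and y = 1" has no counterexample:
ok = find_counterexample(
    lambda x, y: not (x - y == 3 and x + y == 5) or (x == 4 and y == 1),
    range(-10, 11),
)

# The constraint "x = y" is violated, and we get a concrete counterexample:
bad = find_counterexample(lambda x, y: x == y, range(-10, 11))
```

Here `ok` is None (the property holds over the searched domain) while `bad` is a concrete pair with x != y, the counterexample the verifier would report.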

The difference between Coq and the Move Prover

A small example illustrates the difference between Coq and the Move Prover.

Consider an ERC20-style transfer function, which moves tokens from one address to another.

uint sender;
uint receiver;
uint totalSupply;

// Transfer money from User 1 to User 2.
function transfer(uint amount) {
    sender = sender - amount;
    receiver = receiver + amount;
}

The contract implicitly requires sender >= amount, but the code never checks it. Below we use Coq and the Move Prover to catch this "bug".
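To see the problem concretely, here is a small Python sketch (illustration only, not audit tooling) that models the unchecked subtraction with truncation at zero, the same semantics natural-number subtraction has in the Coq model; the conservation invariant then fails exactly when amount > sender:

```python
def transfer(sender, receiver, amount):
    """Unchecked transfer, with subtraction truncated at zero."""
    new_sender = max(sender - amount, 0)  # truncated ("nat"-style) subtraction
    new_receiver = receiver + amount
    return new_sender, new_receiver

def conserved(sender, receiver, amount):
    """Invariant: the total balance is unchanged by a transfer."""
    s2, r2 = transfer(sender, receiver, amount)
    return s2 + r2 == sender + receiver
```

conserved(10, 5, 3) holds, while conserved(2, 5, 10) does not: tokens appear out of thin air once the sender's balance underflows.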

Coq

Formal verification means proving that a model of the program satisfies the stated properties. Here we can pick the total token supply as the invariant and verify that it is preserved across a transfer: the balances before (sender, receiver) and after (sender', receiver') must satisfy sender + receiver = sender' + receiver'.

When CertiK audits a contract with Coq, it translates the contract implementation into Coq functions. We translate this function into a Coq Definition.

Definition transfer (sender receiver amount : nat) : nat :=
  let sender' := sender - amount in
  let receiver' := receiver + amount in
  sender' + receiver'.

Coq is a functional language with no side effects, so we cannot mutate the input sender and receiver. We could return the post-transfer sender' and receiver'; here we simply return sender' + receiver'.

Next, the property the model must satisfy is translated into a Coq lemma.

Lemma transfer_eq : forall (a b c : nat),
  a + b = transfer a b c.

This says: for all natural numbers a b c, a + b = transfer(a, b, c).

Before starting the proof, we import part of Coq's standard library to help:

Require Import Arith.

The proof starts with Proof.

Proof.
  intros a b c.
  unfold transfer.
  rewrite (Nat.add_comm b c).
  rewrite Nat.add_assoc.
  rewrite Nat.sub_add.
  reflexivity.
Abort.

Coq is an interactive theorem prover: after each proof step the goal changes. The goal is decomposed into subgoals, which are then discharged one by one using hypotheses, axioms, known properties, and previously proven lemmas.

At the start, the goal is:

  a + b = transfer a b c

intros introduces a, b, and c as hypotheses, and unfold transfer expands the function:

  a + b = a - c + (b + c)

Next, Nat.add_comm and Nat.add_assoc from the standard library rewrite the expression on the right:

  a + b = a - c + (c + b)
  a + b = a - c + c + b

Looking at the definition of Nat.sub_add:

  Nat.sub_add : forall n m : nat, n <= m -> m - n + n = m

i.e. for all naturals n and m with n <= m, m - n + n = m.
This matches the a - c + c on the right-hand side of the goal. Rewriting with Nat.sub_add gives two goals:

  a + b = a + b
  c <= a

The new goal c <= a appears because it is the precondition for rewriting with Nat.sub_add: reasoning backwards, if c <= a holds and a + b = a + b holds, then the Nat.sub_add lemma proves a + b = a - c + c + b.
Reflexivity discharges the first goal, leaving only:

  c <= a

At this point there is no way to prove c <= a, so we can only Abort the proof: the contract model does not satisfy the property.

Looking back at what c and a denote in the contract, amount and the sender's balance, the side condition requires amount <= sender, which exposes the contract "bug".

Full code:

Require Import Arith.

Definition transfer (sender receiver amount : nat) : nat :=
  let sender' := sender - amount in
  let receiver' := receiver + amount in
  sender' + receiver'.


Lemma transfer_eq : forall (a b c : nat),
  a + b = transfer a b c.

Proof.
  intros a b c.
  unfold transfer.
  rewrite (Nat.add_comm b c).
  rewrite Nat.add_assoc.
  rewrite Nat.sub_add.
  reflexivity.
Abort.

Try it online: https://coq.vercel.app/scratchpad.html

Move Prover

module 0x1234::test {
    fun transfer(sender: u64, receiver: u64, value: u64): u64 {
        let _sender: u64 = sender - value;
        let _receiver: u64 = receiver + value;
        _sender + _receiver
    }

    spec transfer {
        pragma aborts_if_is_strict;
        ensures result == sender + receiver;
    }
}

Run move prove:
[screenshot: move prove output]

During verification, a Boogie IR file xxx.bpl is also generated in the project directory:

// fun test::transfer [verification] at ./sources/main.move:2:5+183
procedure {:timeLimit 40} $1234_test_transfer$verify(_$t0: int, _$t1: int, _$t2: int) returns ($ret0: int)
{
// declare local variables
var $t3: int;
var $t4: int;
var $t5: int;
var $t6: int;
var $t7: int;
var $t8: int;
var $t0: int;
var $t1: int;
var $t2: int;
var $temp_0'u64': int;
$t0 := _$t0;
$t1 := _$t1;
$t2 := _$t2;

// verification entrypoint assumptions
call $InitVerification();

// bytecode translation starts here
// assume WellFormed($t0) at ./sources/main.move:2:5+1
assume {:print "$at(2,26,27)"} true;
assume $IsValid'u64'($t0);

// assume WellFormed($t1) at ./sources/main.move:2:5+1
assume $IsValid'u64'($t1);

// assume WellFormed($t2) at ./sources/main.move:2:5+1
assume $IsValid'u64'($t2);

// trace_local[sender]($t0) at ./sources/main.move:2:5+1
assume {:print "$track_local(0,0,0):", $t0} $t0 == $t0;

// trace_local[receiver]($t1) at ./sources/main.move:2:5+1
assume {:print "$track_local(0,0,1):", $t1} $t1 == $t1;

// trace_local[value]($t2) at ./sources/main.move:2:5+1
assume {:print "$track_local(0,0,2):", $t2} $t2 == $t2;

// $t5 := -($t0, $t2) on_abort goto L2 with $t6 at ./sources/main.move:3:35+1
assume {:print "$at(2,120,121)"} true;
call $t5 := $Sub($t0, $t2);
if ($abort_flag) {
assume {:print "$at(2,120,121)"} true;
$t6 := $abort_code;
assume {:print "$track_abort(0,0):", $t6} $t6 == $t6;
goto L2;
}

// trace_local[_sender]($t5) at ./sources/main.move:3:13+7
assume {:print "$track_local(0,0,4):", $t5} $t5 == $t5;

// $t7 := +($t1, $t2) on_abort goto L2 with $t6 at ./sources/main.move:4:39+1
assume {:print "$at(2,167,168)"} true;
call $t7 := $AddU64($t1, $t2);
if ($abort_flag) {
assume {:print "$at(2,167,168)"} true;
$t6 := $abort_code;
assume {:print "$track_abort(0,0):", $t6} $t6 == $t6;
goto L2;
}

// trace_local[_receiver]($t7) at ./sources/main.move:4:13+9
assume {:print "$track_local(0,0,3):", $t7} $t7 == $t7;

// $t8 := +($t5, $t7) on_abort goto L2 with $t6 at ./sources/main.move:5:17+1
assume {:print "$at(2,192,193)"} true;
call $t8 := $AddU64($t5, $t7);
if ($abort_flag) {
assume {:print "$at(2,192,193)"} true;
$t6 := $abort_code;
assume {:print "$track_abort(0,0):", $t6} $t6 == $t6;
goto L2;
}

// trace_return[0]($t8) at ./sources/main.move:5:9+19
assume {:print "$track_return(0,0,0):", $t8} $t8 == $t8;

// label L1 at ./sources/main.move:6:5+1
assume {:print "$at(2,208,209)"} true;
L1:

// assert Not(false) at ./sources/main.move:6:5+1
assume {:print "$at(2,208,209)"} true;
assert {:msg "assert_failed(2,208,209): function does not abort under this condition"}
!false;

// assert Eq<u64>($t8, Add($t0, $t1)) at ./sources/main.move:10:9+36
assume {:print "$at(2,275,311)"} true;
assert {:msg "assert_failed(2,275,311): post-condition does not hold"}
$IsEqual'u64'($t8, ($t0 + $t1));

// return $t8 at ./sources/main.move:10:9+36
$ret0 := $t8;
return;

// label L2 at ./sources/main.move:6:5+1
assume {:print "$at(2,208,209)"} true;
L2:

// assert false at ./sources/main.move:8:5+102
assume {:print "$at(2,215,317)"} true;
assert {:msg "assert_failed(2,215,317): abort not covered by any of the `aborts_if` clauses"}
false;

// abort($t6) at ./sources/main.move:8:5+102
$abort_code := $t6;
$abort_flag := true;
return;

}

  • print: prints debug information; does not affect execution
  • $at is a Boogie built-in that records the source location of a statement or expression
  • L1, L2: labels, used for control flow
  • assume: introduces an assumption. For example, assume WellFormed($t0) assumes $t0 is a well-formed Move value, and assume $IsValid'u64'($t0) assumes $t0 is a valid unsigned 64-bit integer. These assumptions are used to derive later assertions.
  • $abort_flag: whether an abort has occurred
  • assert: checks that a condition holds at that point. If it holds, verification continues; if not, verification fails, e.g. assert {:msg "assert_failed(2,275,311): post-condition does not hold"}

Comparison

Coq
Pros:

  • Mathematical proof avoids the corner cases that testing can miss
  • A formal specification avoids the vagueness and ambiguity of natural language
  • Unlike SMT, theorem proving can analyze infinite-state systems. In the example above this shows up in the inputs: Coq can verify over the whole domain of naturals (the nat type), whereas SMT checks concrete, finite inputs

Cons

  • Manual translation is required: the contract implementation must be rewritten in Coq, and the translation can change semantics, e.g. transfer's uint type becoming nat
  • The properties a contract must satisfy are not given in advance; a wrong constraint can carry a bug from the code into the specification, and an ill-chosen constraint can let errors go unchecked. This applies not only to Coq: the Move Prover also needs hand-written specs
  • Coq proofs are manual and costly, and require auditors who know both Coq and smart contracts. The proofs are also complex: when verifying a TEE, CertiK used thousands of lines of specification and over 17,300 lines of proof code
  • Highly customized: it works as a case-by-case service but is hard to consolidate into a general-purpose solution

Move Prover
Pros

  • SMT solving is automatic; no hand-constructed proofs as in Coq
  • More general: once a language front end targets the Boogie IR, essentially any language can reuse the pipeline

Cons

Addendum

Symbolic execution
[figure: symbolic execution]

https://gitee.com/HW-PLLab/pllab_slides/blob/dev/WechatOfficialAccSlides/20230223-%E7%AC%A6%E5%8F%B7%E5%8C%96%E6%89%A7%E8%A1%8C%E5%9C%A8%E8%BD%AF%E4%BB%B6%E9%A2%86%E5%9F%9F%E7%9A%84%E5%BA%94%E7%94%A8%E5%8F%8A%E6%8C%91%E6%88%98-by-%E6%B2%88%E8%AF%97%E7%90%A6.pdf

Comparison of verification techniques across contract languages
[table: verification methods by contract language]
The Solana Certora Prover and Solidity's SMTChecker are likewise SMT-based approaches.
DeepSEA can use Coq as its verification tool most likely because the language is implemented in OCaml: Coq is itself implemented in OCaml, and Coq code can be extracted to OCaml, so a degree of convertibility exists between the two.

Todo

Solana Certora Prover (SCP)

Solidity SMTChecker

Ref:

The Execution Flow of Lints

The previous two posts covered the implementation of the Lint, LintPass, and CombinedLintPass structures, and used them to sketch a pseudocode implementation of lint checking.

impl ast_visit::Visitor for Linter {
    fn visit_crate(a: ast::Crate){
        combinedlintpass.check_crate(a);
        walk_crate(a);
    }
    fn visit_stmt(a: ast::Stmt){
        combinedlintpass.check_stmt(a);
        walk_stmt(a);
    }
    ...
}
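The same dispatch can be sketched in a few lines of Python (the class and pass names here are invented for illustration; they are not Rustc APIs): one walk over the tree fires every registered pass's check hook at each node.

```python
class CombinedPass:
    """Bundles many passes; one tree walk triggers every pass's hooks."""
    def __init__(self, passes):
        self.passes = passes

    def check_stmt(self, stmt):
        for p in self.passes:
            p.check_stmt(stmt)

class CountAssigns:
    """Toy pass: counts assignment statements."""
    def __init__(self):
        self.count = 0
    def check_stmt(self, stmt):
        if stmt.startswith("assign"):
            self.count += 1

class CountLoops:
    """Toy pass: counts loop statements."""
    def __init__(self):
        self.count = 0
    def check_stmt(self, stmt):
        if stmt.startswith("loop"):
            self.count += 1

def walk(stmts, combined):
    # the "visitor": visit each node once, dispatching to all passes
    for s in stmts:
        combined.check_stmt(s)

assigns, loops = CountAssigns(), CountLoops()
walk(["assign x", "loop {}", "assign y"], CombinedPass([assigns, loops]))
```

After one walk, both toy passes have run their checks: assigns.count is 2 and loops.count is 1, without traversing the tree once per pass.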

When lints run in Rustc

Rustc's design largely follows that of a classic compiler, with lexing, parsing, semantic analysis, IR generation, IR optimization, and code generation; for Rust-specific language features it adds extra passes such as borrow checking. Correspondingly, the intermediate representations of the code during compilation are somewhat extended:

  • Token stream: the lexer turns the source character stream into a stream of tokens, which feed the next step, parsing.
  • Abstract Syntax Tree (AST): the parser turns the token stream into an AST, which describes essentially everything written in the source. On the AST, Rustc also performs macro expansion, early lints, and so on.
  • High-level IR (HIR): a desugared AST. It is still close to the source but makes some implicit things explicit, such as elided lifetimes. This IR is suited to type checking. Late lints run after type checking.
  • Typed HIR (THIR): like the HIR but carrying type information and further desugared (e.g. function calls and implicit dereferences become fully explicit).
  • Middle-level IR (MIR): essentially a control-flow graph (CFG). A CFG abstracts program execution, representing all the paths execution may traverse as a graph of basic blocks and the possible flow between them. Besides the usual CFG-based static analysis and IR optimization, Rustc performs Rust's ownership borrow checking on the MIR.
  • LLVM IR: Rustc's backend is LLVM, so the MIR is lowered further to LLVM IR and handed to LLVM for further optimization and code generation.

The IR lowering sequence above mirrors the overall compilation flow, summarized in one diagram:
[figure: compilation flow]
rustc_driver::lib.rs in Rustc drives the stages of compilation:

fn run_compiler(...) -> interface::Result<()> {
    ...
    interface::run_compiler(config, |compiler| {
        let linker = compiler.enter(|queries| {
            ...
            queries.parse()?; // lexing and parsing
            ...
            queries.expansion()?; // resolver
            ...
            queries.prepare_outputs()?;
            ...
            queries.global_ctxt()?; // ast -> hir
            ...
            queries.ongoing_codegen()?;
            ...
        }
    }

As noted earlier, Rustc lints come in early and late flavors, which run during the AST -> HIR and HIR -> THIR stages respectively. Let's again follow the WhileTrue example through the complete flow of definition, registration, and execution. WhileTrue is one of the builtin early lints and is included in BuiltinCombinedEarlyLintPass.

Definition

First, the lint and corresponding lintpass for WhileTrue are defined in rustc_lint/src/builtin.rs:

declare_lint! {
    /// The `while_true` lint detects `while true { }`.
    ///
    /// ### Example
    ///
    /// ```rust,no_run
    /// while true {
    ///
    /// }
    /// ```
    ///
    /// {{produces}}
    ///
    /// ### Explanation
    ///
    /// `while true` should be replaced with `loop`. A `loop` expression is
    /// the preferred way to write an infinite loop because it more directly
    /// expresses the intent of the loop.
    WHILE_TRUE,
    Warn,
    "suggest using `loop { }` instead of `while true { }`"
}

declare_lint_pass!(WhileTrue => [WHILE_TRUE]);

impl EarlyLintPass for WhileTrue {
    fn check_expr(&mut self, cx: &EarlyContext<'_>, e: &ast::Expr) {
        ...
    }
}

As introduced earlier:

  1. The declare_lint macro declares a lint: WHILE_TRUE
  2. The declare_lint_pass macro declares a lintpass: WhileTrue
  3. WhileTrue implements the relevant check method of EarlyLintPass; since this lintpass only inspects Expr nodes, implementing check_expr() is enough.

Registration

Registration is the step during compilation that adds a lint to the LintStore. WhileTrue is not registered or executed individually: its check method is expanded by macros into BuiltinCombinedEarlyLintPass, whose registration and execution both happen inside queries.expansion().

pub fn expansion(
    &self,
) -> Result<&Query<(Rc<ast::Crate>, Rc<RefCell<BoxedResolver>>, Lrc<LintStore>)>> {
    tracing::trace!("expansion");
    self.expansion.compute(|| {
        let crate_name = self.crate_name()?.peek().clone();
        // registration
        let (krate, lint_store) = self.register_plugins()?.take();
        let _timer = self.session().timer("configure_and_expand");
        let sess = self.session();
        let mut resolver = passes::create_resolver(
            sess.clone(),
            self.codegen_backend().metadata_loader(),
            &krate,
            &crate_name,
        );
        let krate = resolver.access(|resolver| {
            // execution
            passes::configure_and_expand(sess, &lint_store, krate, &crate_name, resolver)
        })?;
        Ok((Rc::new(krate), Rc::new(RefCell::new(resolver)), lint_store))
    })
}

Registration constructs the defined lints and adds them to the LintStore. Lints fall into four kinds: pre-expansion, early, late, and late-module. Although the corresponding LintPasses run at different compilation stages, all registration happens at the same stage.
The call chain for lint registration is:

  • rustc_driver::lib::run_compiler()
  • rustc_interface::queries::Queries.expansion()
  • rustc_interface::queries::Queries.register_plugins()
  • rustc_lint::lib::new_lint_store()
  • rustc_lint::lib::register_builtins()

Here the default compilation takes the else {} branch, in which BuiltinCombinedEarlyLintPass::get_lints() constructs WHILE_TRUE and adds it to the LintStore.

if no_interleave_lints {
    pre_expansion_lint_passes!(register_passes, register_pre_expansion_pass);
    early_lint_passes!(register_passes, register_early_pass);
    late_lint_passes!(register_passes, register_late_pass);
    late_lint_mod_passes!(register_passes, register_late_mod_pass);
} else {
    store.register_lints(&BuiltinCombinedPreExpansionLintPass::get_lints());
    store.register_lints(&BuiltinCombinedEarlyLintPass::get_lints());
    store.register_lints(&BuiltinCombinedModuleLateLintPass::get_lints());
    store.register_lints(&BuiltinCombinedLateLintPass::get_lints());
}

Execution

Different LintPasses run at different stages of compilation. For BuiltinCombinedEarlyLintPass, the call chain of execution is:

  • rustc_driver::lib::run_compiler()
  • rustc_interface::queries::Queries.expansion()
  • rustc_interface::passes::configure_and_expand()
  • rustc_lint::early::check_ast_node()
  • rustc_lint::early::early_lint_node()

First, configure_and_expand() runs the pre-expansion and early lintpasses. Registration used BuiltinCombinedEarlyLintPass::get_lints() to produce the lints; here BuiltinCombinedEarlyLintPass::new() produces the lintpass.

pub fn configure_and_expand(
    sess: &Session,
    lint_store: &LintStore,
    mut krate: ast::Crate,
    crate_name: &str,
    resolver: &mut Resolver<'_>,
) -> Result<ast::Crate> {
    pre_expansion_lint(sess, lint_store, resolver.registered_tools(), &krate, crate_name);
    ...
    sess.time("early_lint_checks", || {
        let lint_buffer = Some(std::mem::take(resolver.lint_buffer()));
        rustc_lint::check_ast_node(
            sess,
            false,
            lint_store,
            resolver.registered_tools(),
            lint_buffer,
            rustc_lint::BuiltinCombinedEarlyLintPass::new(),
            &krate,
        )
    });
}

Lint execution ultimately happens in rustc_lint::early::early_lint_node(). Comparing early_lint_node() with the pseudocode at the end of the CombinedLintPass post:

[figure: early_lint_node vs. the CombinedLintPass pseudocode]

They correspond as follows:

  • The pass argument is the BuiltinCombinedEarlyLintPass created in configure_and_expand(); it corresponds to combinedlintpass.
  • EarlyContextAndPass combines the pass with context information and implements the visitor; it corresponds to Linter.
  • check_node.check(cx) calls cx.pass.check_crate() to run the lint checks; by the definition of BuiltinCombinedEarlyLintPass, this calls every builtin early lint's check_crate() and then runs ast_visit::walk_crate() to traverse child nodes; it corresponds to visit_crate().

no_interleave_lints

Although Rustc combines LintPasses into a CombinedLintPass for performance, it also provides compiler flags to configure lints. Both lint registration and execution consult the no_interleave_lints flag. It defaults to false and controls whether each lint runs separately; flipping it at compile time registers each lint and runs each lintpass individually, a design that buys flexibility and customization (e.g. benchmarking each lint on its own).

if no_interleave_lints {
    pre_expansion_lint_passes!(register_passes, register_pre_expansion_pass);
    early_lint_passes!(register_passes, register_early_pass);
    late_lint_passes!(register_passes, register_late_pass);
    late_lint_mod_passes!(register_passes, register_late_mod_pass);
} else {
    store.register_lints(&BuiltinCombinedPreExpansionLintPass::get_lints());
    store.register_lints(&BuiltinCombinedEarlyLintPass::get_lints());
    store.register_lints(&BuiltinCombinedModuleLateLintPass::get_lints());
    store.register_lints(&BuiltinCombinedLateLintPass::get_lints());
}
pub fn check_ast_node<'a>(...) {
    if sess.opts.debugging_opts.no_interleave_lints {
        for (i, pass) in passes.iter_mut().enumerate() {
            buffered =
                sess.prof.extra_verbose_generic_activity("run_lint", pass.name()).run(|| {
                    early_lint_node(
                        sess,
                        !pre_expansion && i == 0,
                        lint_store,
                        registered_tools,
                        buffered,
                        EarlyLintPassObjects { lints: slice::from_mut(pass) },
                        check_node,
                    )
                });
        }
    } else {
        buffered = early_lint_node(
            sess,
            !pre_expansion,
            lint_store,
            registered_tools,
            buffered,
            builtin_lints,
            check_node,
        );
        ...
    }
}

Summary

We have now traced a Rustc lint through its definition, the corresponding check (LintPass), registration, and final execution. The same macros can also be used to define new lints and lintpasses (Clippy works in much the same way).

Definition

http://people.csail.mit.edu/silvio/Selected%20Scientific%20Papers/Proof%20Systems/The_Knowledge_Complexity_Of_Interactive_Proof_Systems.pdf

A zero-knowledge protocol is a method by which one party (the prover) can convince another party (the verifier) that a statement is true, without revealing any information beyond the validity of the statement itself.

A zero-knowledge-style proof was used in Renaissance Italy in the 16th century, when two mathematicians competed for credit as the discoverer of the formula for the roots of the cubic equation. Both Tartaglia and Fior claimed to possess the formula. To prove they were not lying without publishing the formula itself (mathematical formulas were arguably trade secrets at the time), they held a contest: each posed 30 cubic equations for the other to solve; whoever solved them all clearly possessed the formula. Tartaglia solved all 30 of Fior's equations, while Fior solved none. People therefore accepted Tartaglia as the true discoverer of the cubic root formula, even though at the time nobody but Tartaglia knew what the formula actually looked like.

Use cases for zero-knowledge proofs

https://ethereum.org/zh/zero-knowledge-proofs/#use-cases-for-zero-knowledge-proofs

  • Anonymous payments
  • Identity protection
  • Authentication
  • Verifiable computation
  • Reducing bribery and collusion in on-chain voting

How do you prove who initiated a transaction?
Every user has a key pair for managing their wallet: the public key is like a bank account's username, the private key like the account password, and the address is a string derived from the public key by an irreversible algorithm. Public keys and addresses can be shared; the private key must be kept safe. If Xiao Ming tried to prove a transaction really came from him by packaging his private key into the transaction, the whole network would learn his private key, as if his bank PIN had leaked, and his assets would be utterly unsafe. So the private key must not leak; plenty of early Bitcoin users lost their private keys, and coins worth little back then are a fortune now, much to their regret.
So the question becomes: how can Xiao Ming prove the transaction was initiated by him without revealing the private key, so that the network reaches consensus and records the transaction on chain, after which Xiao Ming gets his coffee and the café gets paid? This is a classic zero-knowledge proof problem.

How zero-knowledge proofs work


Proof

Prove that x = 3 is a solution of x^3 + x + 5 = 35.

Convert it into a computable function:

def qeval(x):
    y = x**3
    return x + y + 5

Flattening

Rewrite the function as statements of two forms:

  x = y
  x = y op z

where op can be + or * (and presumably - and /).
Each such statement can be seen as a logic gate in a circuit, and each statement is called a constraint.

sym_1 = x * x
y = sym_1 * x
sym_2 = y + x
~out = sym_2 + 5
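The flattened statements are directly executable; a small Python sketch runs the gates one at a time:

```python
def run_gates(x):
    """Execute the flattened circuit, one constraint (gate) per statement."""
    sym_1 = x * x        # sym_1 = x * x
    y = sym_1 * x        # y = sym_1 * x
    sym_2 = y + x        # sym_2 = y + x
    out = sym_2 + 5      # ~out = sym_2 + 5
    return {"x": x, "sym_1": sym_1, "y": y, "sym_2": sym_2, "~out": out}
```

run_gates(3) yields the intermediate values 9, 27, 30 and ~out = 35, which become the witness in the next step.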

R1CS

Convert the flattened statements into an R1CS (rank-1 constraint system) circuit.
An R1CS is a sequence of groups of three vectors (a, b, c), and its solution vector s satisfies

  s·a * s·b - s·c = 0

The solution vector s is usually called the witness.

The vectors are constructed from the coefficients of the variables in the flattened statements.
First, lay out all the variables, with ~one denoting the constant 1:
'~one', 'x', '~out', 'sym_1', 'y', 'sym_2'

The first row:

sym_1 = x * x

=>

(1 * x) * (1 * x) - (1 * sym_1) = 0

so
a = [0, 1, 0, 0, 0, 0]
b = [0, 1, 0, 0, 0, 0]
c = [0, 0, 0, 1, 0, 0]

Similarly, the vectors for the remaining three rows are:

y = sym_1 * x
=>
a = [0, 0, 0, 1, 0, 0]
b = [0, 1, 0, 0, 0, 0]
c = [0, 0, 0, 0, 1, 0]

sym_2 = y + x
=>
a = [0, 1, 0, 0, 1, 0]
b = [1, 0, 0, 0, 0, 0]
c = [0, 0, 0, 0, 0, 1]



~out = sym_2 + 5
=>
a = [5, 0, 0, 0, 0, 1]
b = [1, 0, 0, 0, 0, 0]
c = [0, 0, 1, 0, 0, 0]

For a '*' gate, the coefficients of the two operands go into a and b respectively.
For a '+' gate, the summed terms' coefficients all go into a, and b is just the constant ~one with coefficient 1.

Let's check: for x = 3, the solution vector is s = [1, 3, 35, 9, 27, 30].
The first row:

[1, 3, 35, 9, 27, 30] · [0, 1, 0, 0, 0, 0] * 
[1, 3, 35, 9, 27, 30] · [0, 1, 0, 0, 0, 0] -
[1, 3, 35, 9, 27, 30] · [0, 0, 0, 1, 0, 0]
= 3 * 3 - 9 = 0

The second row
[1, 3, 35, 9, 27, 30] · [0, 0, 0, 1, 0, 0] * 
[1, 3, 35, 9, 27, 30] · [0, 1, 0, 0, 0, 0] -
[1, 3, 35, 9, 27, 30] · [0, 0, 0, 0, 1, 0]
= 9 * 3 - 27 = 0

The third row

[1, 3, 35, 9, 27, 30] · [0, 1, 0, 0, 1, 0] * 
[1, 3, 35, 9, 27, 30] · [1, 0, 0, 0, 0, 0] -
[1, 3, 35, 9, 27, 30] · [0, 0, 0, 0, 0, 1]
= (3 + 27) * 1 - 30 = 0

The fourth row

[1, 3, 35, 9, 27, 30] · [5, 0, 0, 0, 0, 1] * 
[1, 3, 35, 9, 27, 30] · [1, 0, 0, 0, 0, 0] -
[1, 3, 35, 9, 27, 30] · [0, 0, 1, 0, 0, 0]
= (5 + 30)* 1 - 35 = 0

The complete R1CS:

A
[0, 1, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0]
[0, 1, 0, 0, 1, 0]
[5, 0, 0, 0, 0, 1]
B
[0, 1, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0]
C
[0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 1, 0]
[0, 0, 0, 0, 0, 1]
[0, 0, 1, 0, 0, 0]
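Checking s·a * s·b - s·c = 0 row by row can be sketched in Python over the matrices above:

```python
# The R1CS above: one (a, b, c) row per gate, columns ordered
# ['~one', 'x', '~out', 'sym_1', 'y', 'sym_2'].
A = [[0, 1, 0, 0, 0, 0],
     [0, 0, 0, 1, 0, 0],
     [0, 1, 0, 0, 1, 0],
     [5, 0, 0, 0, 0, 1]]
B = [[0, 1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0]]
C = [[0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0, 0]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def satisfies(s):
    """True iff s.a * s.b - s.c == 0 holds for every constraint row."""
    return all(dot(s, a) * dot(s, b) - dot(s, c) == 0
               for a, b, c in zip(A, B, C))
```

satisfies([1, 3, 35, 9, 27, 30]) is True, while tampering with any entry (say, claiming ~out = 36) makes it False.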

R1CS -> QAP

An R1CS suffices for the proof, but verifying a witness takes one check per gate. Converting the R1CS into a QAP (Quadratic Arithmetic Program) turns the four groups of three length-6 vectors into six groups of three degree-3 polynomials, replacing the dot-product computation over vectors with polynomial evaluation:

We go from four groups of three vectors of length six to six groups of three degree-3 polynomials, where evaluating the polynomials at each x coordinate represents one of the constraints.

Each vector position in the R1CS becomes a polynomial passing through n points (n = 4, the number of vectors):

A
[0, 1, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0]
[0, 1, 0, 0, 1, 0]
[5, 0, 0, 0, 0, 1]

Take each column and build coordinates, with the row number as x and the value as y.
The first column of A is [0, 0, 0, 5],
giving the points (1,0), (2,0), (3,0), (4,5); we need a polynomial through these four points.

Lagrange interpolation
The degree-(n-1) polynomial through n points (x1,y1), (x2,y2), ..., (xn,yn) is

  L(x) = sum_{i=1..n} yi * prod_{j != i} (x - xj) / (xi - xj)

Substituting (1,0), (2,0), (3,0), (4,5) gives the polynomial 0.833 x^3 - 5x^2 + 9.166x - 5
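The interpolation can be reproduced exactly in Python with rational arithmetic (a from-scratch sketch of the Lagrange construction; 0.833 and 9.166 above are the truncated decimals of 5/6 and 55/6):

```python
from fractions import Fraction

def lagrange(points):
    """Coefficients (lowest degree first) of the unique polynomial
    through the given points, via the Lagrange interpolation formula."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        # basis polynomial l_i(x) = prod_{j != i} (x - xj) / (xi - xj)
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            # multiply basis by (x - xj)
            basis = [Fraction(0)] + basis
            for k in range(len(basis) - 1):
                basis[k] -= xj * basis[k + 1]
            denom *= xi - xj
        for k in range(len(basis)):
            coeffs[k] += yi * basis[k] / denom
    return coeffs

# First column of A: values 0, 0, 0, 5 at x = 1, 2, 3, 4
poly = lagrange([(1, 0), (2, 0), (3, 0), (4, 5)])
```

`poly` comes out as [-5, 55/6, -5, 5/6], i.e. 0.833x^3 - 5x^2 + 9.166x - 5 with coefficients listed from the constant term up.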

Computing all of the polynomial coefficients in the same way:

A polynomials
[-5.0, 9.166, -5.0, 0.833]
[8.0, -11.333, 5.0, -0.666]
[0.0, 0.0, 0.0, 0.0]
[-6.0, 9.5, -4.0, 0.5]
[4.0, -7.0, 3.5, -0.5]
[-1.0, 1.833, -1.0, 0.166]

B polynomials
[3.0, -5.166, 2.5, -0.333]
[-2.0, 5.166, -2.5, 0.333]
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]

C polynomials
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]
[-1.0, 1.833, -1.0, 0.166]
[4.0, -4.333, 1.5, -0.166]
[-6.0, 9.5, -4.0, 0.5]
[4.0, -7.0, 3.5, -0.5]

Now evaluating the 18 polynomials at x = 1 gives
[0, 1, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], exactly the first rows of A, B, and C in the R1CS.

A
[0, 1, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0]
[0, 1, 0, 0, 1, 0]
[5, 0, 0, 0, 0, 1]
B
[0, 1, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0]
C
[0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 1, 0]
[0, 0, 0, 0, 0, 1]
[0, 0, 1, 0, 0, 0]

Write A(x) = s · (A1, A2, A3, A4, A5, A6), and similarly B(x) and C(x); then

  A(x) * B(x) - C(x) = s·a * s·b - s·c

at the constraint points. For x = 1, 2, 3, 4, corresponding to the four R1CS constraints, A(x) * B(x) - C(x) = s·a * s·b - s·c = 0.
So A(x) * B(x) - C(x) = 0 has at least the four roots 1, 2, 3, 4,
so A(x) * B(x) - C(x) can be written as (x - 1)(x - 2)(x - 3)(x - 4) * H(x),
and therefore if A(x) * B(x) - C(x) is divisible by (x - 1)(x - 2)(x - 3)(x - 4), all the constraints are satisfied.

Check: substituting the witness s = (1, 3, 35, 9, 27, 30) gives

A(x) = s · (A1,A2,A3,A4,A5,A6)
= (1,3,35,9,27,30) ·(
[-5.0, 9.166, -5.0, 0.833]
[8.0, -11.333, 5.0, -0.666]
[0.0, 0.0, 0.0, 0.0]
[-6.0, 9.5, -4.0, 0.5]
[4.0, -7.0, 3.5, -0.5]
[-1.0, 1.833, -1.0, 0.166])
= [43.0, -73.333, 38.5, -5.166]
(the first coefficient: 43 = 1 * (-5) + 3 * 8 + 9 * (-6) + 27 * 4 + 30 * (-1))


Similarly,
s . B  = [-3.0, 10.333, -5.0, 0.666]
s . C = [-41.0, 71.666, -24.5, 2.833]

(Note: the coefficient lists above run from the lowest-degree term up, so 43 is the constant term of A(x).)
A(x) * B(x) - C(x) =
(43 - 73.333 * x + 38.5 * x^2 - 5.166 * x^3) *
(-3 + 10.333 * x - 5 * x^2 + 0.666 * x^3) -
(-41 + 71.666 * x - 24.5 * x^2 + 2.833 * x^3)
= -88 + 592.666 * x - 1063.777 * x^2 + 805.833 * x^3
- 294.777 * x^4 + 51.5 * x^5 - 3.444 * x^6


(x - 1)(x - 2)(x - 3)(x - 4)
= 24 - 50 * x + 35 * x^2 - 10 * x^3 + x^4

(A(x) * B(x) - C(x)) / ((x - 1)(x - 2)(x - 3)(x - 4))
= -3.666 + 17.055 * x - 3.444 * x^2

The division is exact, so the witness s is considered to satisfy all the constraints.
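The whole pipeline, interpolating the QAP polynomials, forming A(x)·B(x) - C(x), and dividing by Z(x) = (x-1)(x-2)(x-3)(x-4), can be sketched in Python with exact fractions so the divisibility test is meaningful (helper functions written from scratch for illustration; coefficients are kept lowest-degree first):

```python
from fractions import Fraction as F

def interp(ys):
    """Polynomial (coeffs lowest degree first) through (1,y1)..(4,y4)."""
    pts = list(enumerate(ys, start=1))
    out = [F(0)] * len(ys)
    for i, (xi, yi) in enumerate(pts):
        basis, denom = [F(1)], F(1)
        for j, (xj, _) in enumerate(pts):
            if j != i:
                basis = [F(0)] + basis          # multiply basis by (x - xj)
                for k in range(len(basis) - 1):
                    basis[k] -= xj * basis[k + 1]
                denom *= xi - xj
        for k, b in enumerate(basis):
            out[k] += yi * b / denom
    return out

def pmul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def psub(p, q):
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p)); q = q + [F(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def pdivmod(p, d):
    """Polynomial long division, coeffs lowest degree first."""
    p = p[:]; q = [F(0)] * (len(p) - len(d) + 1)
    for i in range(len(q) - 1, -1, -1):
        c = p[i + len(d) - 1] / d[-1]
        q[i] = c
        for j, dj in enumerate(d):
            p[i + j] -= c * dj
    return q, p

A = [[0,1,0,0,0,0],[0,0,0,1,0,0],[0,1,0,0,1,0],[5,0,0,0,0,1]]
B = [[0,1,0,0,0,0],[0,1,0,0,0,0],[1,0,0,0,0,0],[1,0,0,0,0,0]]
C = [[0,0,0,1,0,0],[0,0,0,0,1,0],[0,0,0,0,0,1],[0,0,1,0,0,0]]
s = [1, 3, 35, 9, 27, 30]

def dot_poly(M):
    """s . (M1..M6): interpolate each variable's column, weight by s."""
    polys = [interp([row[v] for row in M]) for v in range(6)]
    acc = [F(0)] * 4
    for sv, p in zip(s, polys):
        acc = [a + sv * c for a, c in zip(acc, p)]
    return acc

t = psub(pmul(dot_poly(A), dot_poly(B)), dot_poly(C))
Z = [F(24), F(-50), F(35), F(-10), F(1)]   # (x-1)(x-2)(x-3)(x-4)
H, rem = pdivmod(t, Z)
```

dot_poly(A) is exactly [43, -220/3, 77/2, -31/6] (the truncated decimals printed above), the remainder is zero, and H matches -3.666 + 17.055x - 3.444x^2.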

Compared with plain R1CS, this step has derived the polynomial coefficients from the R1CS:

A polynomials
[-5.0, 9.166, -5.0, 0.833]
[8.0, -11.333, 5.0, -0.666]
[0.0, 0.0, 0.0, 0.0]
[-6.0, 9.5, -4.0, 0.5]
[4.0, -7.0, 3.5, -0.5]
[-1.0, 1.833, -1.0, 0.166]

B polynomials
[3.0, -5.166, 2.5, -0.333]
[-2.0, 5.166, -2.5, 0.333]
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]

C polynomials
[0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0]
[-1.0, 1.833, -1.0, 0.166]
[4.0, -4.333, 1.5, -0.166]
[-6.0, 9.5, -4.0, 0.5]
[4.0, -7.0, 3.5, -0.5]

For any claimed witness s, check whether s · (A1,A2,A3,A4,A5,A6) * s · (B1,B2,B3,B4,B5,B6) - s · (C1,C2,C3,C4,C5,C6) is divisible by (x - 1)(x - 2)(x - 3)(x - 4).

Elliptic curves

Definition

An elliptic curve is the set of points satisfying y^2 = x^3 + ax + b, with 4a^3 + 27b^2 != 0;
the inequality guarantees the curve has no singular points (every point on the curve has a tangent).
An elliptic curve is symmetric about the x axis.

Operations on elliptic curves

Addition: A + B

A and B are points on the curve. A + B is defined as the reflection across the x axis of the third intersection of the line AB with the curve.

Doubling: 2A = A + A

A is a point on the curve. 2A (= A + A) is defined as the reflection across the x axis of the intersection of the tangent at A with the curve.
3A = A + 2A

Properties

The discrete logarithm problem:
for two points P and Q on the curve and an integer k,

  Q = kP

The encryption principle: P is the base point, k the private key, Q the public key.
Given P and k, computing Q is easy,
but given P and Q, finding k is very hard.
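The asymmetry can be made concrete with a toy curve in Python: y^2 = x^3 + 2x + 2 over F_17 with base point P = (5, 1) is a common textbook example (real systems use primes of roughly 256 bits). Computing kP takes O(log k) curve operations via double-and-add, while recovering k from kP has no comparably fast method:

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 (textbook-sized parameters,
# chosen for illustration only).
P_MOD, A_COEF = 17, 2
O = None  # point at infinity (group identity)

def ec_add(p, q):
    if p is O: return q
    if q is O: return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O                      # p + (-p) = O
    if p == q:                        # doubling: tangent slope
        lam = (3 * x1 * x1 + A_COEF) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:                             # addition: chord slope
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mul(k, p):
    """Compute kP by double-and-add: fast even for huge k."""
    acc = O
    while k:
        if k & 1:
            acc = ec_add(acc, p)
        p = ec_add(p, p)
        k >>= 1
    return acc

P = (5, 1)   # base point; it has order 19 on this curve
```

scalar_mul(2, P) gives (6, 3), scalar_mul(3, P) gives (10, 6), and scalar_mul(19, P) returns the identity; going the other way, from a point back to k, would require searching the group.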

Knowledge-of-exponent assumption

On an elliptic curve, given a pair of points (P, Q) with Pk = Q, and then given a point C, it is impossible to produce the point R = Ck unless you know how C was "derived" from P (e.g. if you know the n in C = nP, then Ck = nPk = nQ).

Pk = Q  
C = nP
Ck = R

The security of zk-SNARKs rests on this assumption; although it has not been proven equivalent to other hard problems (such as the discrete logarithm problem), most cryptographers consider it sufficiently robust.

zk-SNARK proof

Pk = Q  
C = nP
Ck = R

With Pk = Q, take k = f(t), the polynomial evaluated at a secret point, e.g. k = 43 - 73.333t + 38.5t^2 - 5.166t^3.
Given P and Q, the prover cannot compute k.

With C = nP where n = s,
  Ck = nPk = nQ = sQ = R
Given C and R, the verifier, who knows k, can check that Ck = R,
but from C = nP it is infeasible to compute n, i.e. the witness s.

Pick an arbitrary point G on the curve, along with k_a, k_b, k_c, and t, and publish:

G * A_1(t), G * A_1(t) * k_a
G * A_2(t), G * A_2(t) * k_a

G * B_1(t), G * B_1(t) * k_b
G * B_2(t), G * B_2(t) * k_b

G * C_1(t), G * C_1(t) * k_c
G * C_2(t), G * C_2(t) * k_c

By the knowledge-of-exponent assumption, only someone who knows the witness s can produce

π_a = G * A_1(t) * s, π_a' = G * A_1(t) * k_a * s
G * A_2(t) * s, G * A_2(t) * k_a * s

G * B_1(t) * s, G * B_1(t) * k_b * s
G * B_2(t) * s, G * B_2(t) * k_b * s

G * C_1(t) * s, G * C_1(t) * k_c * s
G * C_2(t) * s, G * C_2(t) * k_c * s

Ref

Quadratic Arithmetic Programs: from Zero to Hero: https://medium.com/@VitalikButerin/quadratic-arithmetic-programs-from-zero-to-hero-f6d558cea649?ref=blog.anoma.net
ETH ZKP: https://ethereum.org/zh/zero-knowledge-proofs/
VampIR:https://blog.anoma.net/a-vamp-irs-guide-to-arithmetic-circuits-and-perfectly-boiled-eggs/
How ECC encryption works: elliptic-curve cryptography explained (in Chinese): https://www.bilibili.com/video/BV1v44y1b7Fd/?spm_id_from=333.337.search-card.all.click&vd_source=3f6c0363de570541eace117866128565
The principles of ECC elliptic-curve public-key encryption (in Chinese): https://www.bilibili.com/video/BV1BY411M74G/?spm_id_from=333.337.search-card.all.click&vd_source=3f6c0363de570541eace117866128565

Requirements:

  • Git
  • OpenAI API key
  • Python 3

Fine-tuning

Installation

Install the OpenAI CLI:

pip install --upgrade openai

Set OPENAI_API_KEY in your environment:

export OPENAI_API_KEY="<OPENAI_API_KEY>"

Prepare training data

The training data is in JSONL format: each line is a JSON object containing a prompt and its completion, i.e. a question and the corresponding answer.

Note that a base model may need many varied prompts (few-shot learning), but for fine-tuning the training data is typically one input paired with many outputs.

To train a KCL assistant, first teach it to answer how to install KCL.

The following training data was generated by ChatGPT:

{"prompt": "如何安装KCL###", "completion": " 请访问 https://kcl-lang.io/docs/next/user_docs/getting-started/install 以获取KCL安装指南。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 您可以从 https://kcl-lang.io/docs/next/user_docs/getting-started/install 下载KCL,并遵循安装指南来安装它。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 您可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 下载KCL,其中包含有关如何安装和使用KCL的文档。\nEND"}
{"prompt": "如何安装KCL###", "completion": " KCL的安装指南可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 找到。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 有关如何安装和配置KCL的信息可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 中找到。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 您可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 下载KCL并查看安装指南。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 有关如何安装KCL的详细信息可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 找到。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 您可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 下载KCL,并遵循安装指南来安装它。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 请查看 https://kcl-lang.io/docs/next/user_docs/getting-started/install 以获取有关如何安装和使用KCL的指南。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 您可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 找到有关如何安装和配置KCL的文档。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 请访问 https://kcl-lang.io/docs/next/user_docs/getting-started/install 并下载KCL,然后遵循安装指南来安装它。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 您可以从 https://kcl-lang.io/docs/next/user_docs/getting-started/install 下载KCL,并遵循安装指南来安装它。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 有关如何安装KCL###的信息可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 中找到。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 请访问 https://kcl-lang.io/docs/next/user_docs/getting-started/install 并按照指南安装KCL。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 请访问 https://kcl-lang.io/docs/next/user_docs/getting-started/install 并下载KCL,然后按照安装指南来安装它。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 请查看 https://kcl-lang.io/docs/next/user_docs/getting-started/install 并下载KCL,然后按照安装指南来安装它。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 请查看 https://kcl-lang.io/docs/next/user_docs/getting-started/install 并下载KCL,然后遵循安装指南来安装它。\nEND"}
{"prompt": "如何安装KCL###", "completion": " 您可以在 https://kcl-lang.io/docs/next/user_docs/getting-started/install 下载KCL,并遵循安装指南来安装它。\nEND"}
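Before uploading, the JSONL file can be sanity-checked with a few lines of Python (a sketch: the '###' check simply mirrors the prompt-separator convention used in the data above; OpenAI's `openai tools fine_tunes.prepare_data` CLI performs richer validation):

```python
import json

def validate_jsonl(lines):
    """Parse each line and require exactly `prompt` and `completion` keys."""
    records = []
    for i, line in enumerate(lines, start=1):
        rec = json.loads(line)
        if set(rec) != {"prompt", "completion"}:
            raise ValueError(f"line {i}: unexpected keys {sorted(rec)}")
        if not rec["prompt"].endswith("###"):
            raise ValueError(f"line {i}: prompt missing the '###' separator")
        records.append(rec)
    return records

sample = ['{"prompt": "如何安装KCL###", "completion": " 请访问 https://kcl-lang.io 。\\nEND"}']
```

validate_jsonl(sample) returns the parsed records, or raises with the offending line number if a record is malformed.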

Create the fine-tuned model

openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

Here TRAIN_FILE_ID_OR_PATH is the path to the training data and BASE_MODEL is the base model: one of ada, babbage, curie, or davinci, each with different capabilities and pricing.

Running the command submits a training job; it usually queues, and you simply wait for OpenAI to finish training.

Use the model

List your fine-tuned models:

openai api fine_tunes.list

Query the model through the OpenAI CLI:

openai api completions.create -m <FINE_TUNED_MODEL> -p <YOUR_PROMPT>

For other ways to call it (cURL, Python, Node.js), see https://platform.openai.com/docs/guides/fine-tuning/use-a-fine-tuned-model

(Pseudo-)deploy to a website

Querying through the OpenAI CLI is clunky, so let's call the model from a web page, using free GitHub Pages hosting.

Create a GitHub repository

Create a new repository.

Write a front end

Create index.html with the following code and push it to GitHub.

The code below was generated by ChatGPT:

<!DOCTYPE html>
<html>
  <head>
    <title>OpenAI API demo</title>
  </head>
  <body>
    <h1>OpenAI API demo</h1>
    <label for="api_key">API Key:</label>
    <input type="text" id="api_key" />

    <label for="model">Model:</label>
    <input type="text" id="model" />

    <label for="input">Enter text:</label>
    <input type="text" id="input" />
    <button onclick="submit()">Submit</button>
    <div id="output"></div>

    <script>
      async function submit() {
        const apiKey = document.getElementById("api_key").value;
        const model = document.getElementById("model").value;
        const input = document.getElementById("input").value;

        const response = await fetch(
          "https://api.openai.com/v1/completions",
          {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
              Authorization: "Bearer " + apiKey,
            },
            body: JSON.stringify({
              prompt: input,
              max_tokens: 200,
              model: model,
            }),
          }
        );

        const json = await response.json();
        const output = json.choices[0].text.trim();
        document.getElementById("output").innerHTML = output;
      }
    </script>
  </body>
</html>

Notes:

  1. The key must not be hard-coded in the page: OpenAI will treat it as leaked and revoke it.
  2. The model name could be hard-coded, but since models iterate, during testing it is easier to type the model name directly into the page.

Set up GitHub Pages

Github repo -> Settings -> Pages -> Build and deployment -> Source -> Deploy from branch

Test

https://platform.openai.com/docs/api-reference/fine-tunes

Introduction

ChatGPT is a natural language processing model developed by OpenAI that can generate human-like responses to text input. It has been used in various applications, such as chatbots, question-answering systems, and language translation tools.
Importance of programming languages and programmers: Programming languages are the backbone of software development, and programmers are the ones who use these languages to create software applications that power many aspects of our lives. As technology continues to advance and become more integrated into our daily routines, the demand for programming skills and expertise is increasing rapidly.
This section sets the stage for the discussion of how ChatGPT might impact programming languages and programmers. It establishes the relevance of these topics and highlights the significance of ChatGPT as a potential game-changer in the field of computer science.

Impact on programming languages

ChatGPT’s natural language processing abilities

ChatGPT is a state-of-the-art natural language processing model developed by OpenAI. It is a machine learning system that uses deep learning techniques to analyze and understand human language. ChatGPT is based on a transformer architecture, which enables it to process and generate natural language text at a large scale. It has been trained on vast amounts of text data and has learned to predict the most likely next word or sentence given a specific prompt.

One of the most impressive features of ChatGPT is its ability to generate human-like responses to text input. This means that it can hold a conversation with a user and respond in a way that is both understandable and natural-sounding. ChatGPT can also perform a variety of other natural language processing tasks, such as language translation, text summarization, and sentiment analysis.

ChatGPT’s natural language processing abilities have significant implications for the field of computer science. It has the potential to revolutionize the way we interact with computers and other devices, making it easier and more intuitive for users to communicate with machines. ChatGPT could also be used to power chatbots and other conversational interfaces, enabling businesses to provide better customer service and support. Ultimately, ChatGPT’s natural language processing abilities are likely to play a significant role in shaping the future of computing and artificial intelligence.

Potential for easier programming

Rust is known for its complex features like lifetime and ownership, which can be difficult to understand for many programmers, especially beginners. However, with the help of ChatGPT, developers can ask questions about these complex features in natural language and get easy-to-understand explanations.

For example, a developer who is struggling to understand Rust’s lifetime and ownership rules can ask the ChatGPT Rust chatbot, “What are Rust’s lifetime and ownership rules?” The chatbot can respond with a natural language explanation, such as “In Rust, every value has an owner and a lifetime. The owner is responsible for deallocating the value when it’s no longer needed, and the lifetime determines how long the value can be used.”

The Rust chatbot can also give more specific explanations of Rust’s lifetime and ownership rules based on the developer’s specific questions. For instance, a developer might ask, “How do I move a value from one owner to another?” The chatbot can respond with a natural language explanation, such as “In Rust, assigning a non-Copy value to another variable, or passing it to a function, moves it: ownership transfers to the new owner, and the old owner can no longer access the value. The ‘move’ keyword additionally forces a closure to take ownership of the values it captures.”
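
A minimal Rust sketch of the move semantics being described (variable names are illustrative); note that a plain assignment is enough to move a non-Copy value:

```rust
fn main() {
    let s = String::from("hello"); // `s` owns the heap-allocated string
    let t = s;                     // ownership moves from `s` to `t`
    // println!("{}", s);          // compile error E0382: `s` was moved
    println!("{}", t);             // `t` is now the sole owner

    // Types that implement `Copy` (e.g. integers) are copied, not moved:
    let a = 5;
    let b = a;
    println!("{} {}", a, b);       // both `a` and `b` remain usable
}
```

Uncommenting the line that uses `s` after the move makes the compiler reject the program with error E0382 (use of moved value), which is exactly the rule the chatbot explanation paraphrases.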

With the help of ChatGPT, developers can get easy-to-understand explanations of Rust’s complex features, regardless of their level of experience. This can help to make Rust programming more accessible to a wider range of developers, and can also help to improve communication and collaboration between developers who have different levels of experience with the language.

In conclusion, ChatGPT has the potential to make programming easier by allowing developers to ask questions and receive explanations in natural language, rather than having to learn the specific syntax and structure of a programming language. With the help of ChatGPT, developers can more easily understand complex features like Rust’s lifetime and ownership rules, which can help to make Rust programming more accessible and collaborative for developers of all levels.

Possibility of creating new programming languages

One of the exciting possibilities that arises with the development of language models like GPT is the potential for creating new programming languages. Language models have the ability to learn and generate natural language, which means they could be programmed to understand and generate code in new programming languages.

This opens up a world of possibilities for creating programming languages that are optimized for specific tasks or domains. For example, a programming language could be designed specifically for machine learning or data analysis, with syntax and features tailored to those tasks. Alternatively, a new programming language could be developed for use in areas like robotics, finance, or healthcare, with features optimized for those domains.

The potential benefits of creating new programming languages are many. A language tailored for a specific domain could make it easier for programmers to develop applications in that field. It could also make it easier to optimize code for specific hardware or software platforms. Additionally, a new programming language could provide a more intuitive or efficient way to program, leading to faster development and better performance.

Of course, creating a new programming language is a complex task that involves more than just generating code. It requires careful consideration of the syntax and structure of the language, as well as its features and capabilities. It also involves testing and refinement to ensure the language is reliable and efficient.

However, with the help of language models like GPT, the process of creating a new programming language could be accelerated. Developers could use GPT to experiment with different syntaxes and features, and generate code for testing and refinement.

In conclusion, the possibility of creating new programming languages with the help of language models like GPT is an exciting development in the world of programming. By tailoring programming languages to specific domains or tasks, developers have the potential to create more efficient, intuitive, and powerful programming languages that improve the speed and quality of software development.

Impact on programmers

Potential for more efficient communication and collaboration

The development of language models like GPT also has the potential to improve communication and collaboration across various industries and domains. Natural language processing and generation technology can help to bridge language barriers and facilitate more efficient communication among people who speak different languages.

One area where this technology could be particularly useful is in the field of international business. With the help of language models, employees could more easily communicate with their international colleagues and clients, translating messages and documents in real-time. This could lead to more productive and efficient collaborations, as well as new business opportunities.

Moreover, language models could also facilitate collaboration among teams working on complex projects. For example, a team of software developers could use a language model to generate code snippets or provide suggestions for optimizations, reducing the time and resources required for manual coding and debugging. Additionally, language models could help team members communicate and collaborate more effectively by automatically generating summaries of team meetings or flagging areas where further clarification is needed.

Furthermore, language models could also enhance communication and collaboration in scientific research. In fields such as medicine and biology, researchers often work with large amounts of data and complex terminology. With the help of language models, researchers could more easily share and analyze data, generating new insights and discoveries.

In conclusion, the potential for more efficient communication and collaboration through the development of language models like GPT is significant. By bridging language barriers and enhancing communication among teams, these technologies could improve productivity, reduce costs, and open up new opportunities for collaboration across various industries and domains.

Potential for automating certain programming tasks

The development of language models like GPT also has the potential to automate certain programming tasks, which could greatly improve the efficiency and productivity of software developers. Natural language processing and generation technology can help programmers to write code more quickly, accurately, and with fewer errors than traditional manual coding methods.

One area where this technology could be particularly useful is in the field of software development. With the help of language models, programmers could more easily generate code snippets, create templates, and automate repetitive coding tasks. For example, a language model could be trained on a large dataset of existing code and used to automatically generate new code based on specific inputs. This could help programmers to create new software programs more quickly and efficiently, reducing the time and resources required for manual coding and debugging.

Moreover, language models could also help to address the problem of code quality and consistency. By providing standardized code templates and automating certain coding tasks, language models could help to ensure that code is written in a consistent and error-free manner, reducing the risk of bugs and other issues in the final software product.

Furthermore, language models could also help to democratize programming by making it more accessible to non-technical users. With the help of language models, individuals without a background in programming could more easily generate code and develop their own software applications, opening up new opportunities for innovation and entrepreneurship.

In conclusion, the potential for automating certain programming tasks through the development of language models like GPT is significant. By improving the speed, accuracy, and consistency of code generation, these technologies could greatly enhance the efficiency and productivity of software developers, as well as open up new opportunities for innovation and entrepreneurship in the field of software development.

Potential for changing the nature of programming jobs

The development of language models like GPT has the potential to change the nature of programming jobs in several ways. One of the main impacts is likely to be on the skills required for programming jobs, as well as the tasks that programmers are responsible for. The introduction of language models could lead to the automation of certain aspects of programming, such as code generation and debugging. This could mean that programmers will need to develop new skills, such as the ability to work with language models and to understand the output that they generate.

Moreover, the use of language models could also lead to changes in the roles and responsibilities of programmers. With certain programming tasks becoming automated, programmers may be able to focus more on higher-level tasks such as designing software architecture, analyzing data, and developing algorithms. This could require programmers to develop new skills in areas such as machine learning and data analysis, in order to remain competitive in the field.

The use of language models could also lead to changes in the types of programming jobs that are available. Certain tasks, such as code generation and debugging, may be more easily automated than others, such as software design and project management. This could lead to a shift in the types of jobs that are available in the programming industry, with more emphasis on higher-level tasks and less on manual coding and debugging.

Finally, the use of language models could also have implications for the way that programming work is organized and distributed. With certain programming tasks becoming automated, it may be possible for programming work to be outsourced more easily, or for non-technical users to generate their own code using language models. This could lead to changes in the way that programming teams are structured, and in the way that programming work is managed and distributed.

In conclusion, the development of language models like GPT has the potential to change the nature of programming jobs in several ways, including the skills required, the tasks programmers are responsible for, the types of jobs available, and the way that programming work is organized and distributed. While the full extent of these changes is not yet clear, it is likely that language models will have a significant impact on the programming industry in the coming years.

Challenges and concerns

Limitations of ChatGPT

While ChatGPT is a promising tool for generating human-like responses to text-based queries, it has several limitations that must be taken into account. Some of the most significant limitations of ChatGPT are:

  • Lack of context: ChatGPT relies heavily on the text that is inputted into it, and may not take into account the broader context of the conversation or the user’s background. This can lead to responses that are not relevant or appear to miss the point of the conversation.

  • Inability to handle complex queries: While ChatGPT can generate responses to a wide range of queries, it may struggle with more complex or nuanced questions that require a deeper understanding of the topic. This can lead to responses that are incomplete, inaccurate, or not helpful.

  • Repetitive responses: ChatGPT may generate repetitive responses, especially when asked similar questions or when responding to queries that are related to a particular topic. This can lead to a lack of variety in the responses, which can be frustrating for users.

  • Limited language support: ChatGPT is primarily trained on English language text, which means that it may struggle with other languages or with users who use non-standard language or phrasing.

  • Lack of emotional intelligence: While ChatGPT can generate responses that are contextually relevant, it lacks emotional intelligence and may struggle to understand the tone or sentiment of a conversation. This can lead to responses that are insensitive or inappropriate.

In conclusion, while ChatGPT is a promising tool for generating text-based responses, it has several limitations that must be taken into account. These limitations can impact the accuracy and relevance of the responses, and can impact the user experience for those using the tool. As with any AI tool, it is important to use ChatGPT in conjunction with other resources and to be aware of its limitations.

Possible impact on job market

The development of artificial intelligence (AI) and chatbots like ChatGPT has raised concerns about their potential impact on the job market. While some argue that AI and chatbots will create new job opportunities and enhance productivity, others fear that they will lead to significant job losses and displace workers in certain industries.

  • Automation of jobs: AI and chatbots can perform repetitive and routine tasks much faster and more accurately than humans, which means that certain jobs may become automated. For example, customer service roles may be replaced by chatbots that can provide instant responses to customer queries. This could lead to job losses in these industries as businesses look to save costs by replacing human workers with chatbots.

  • Creation of new jobs: On the other hand, the development of AI and chatbots may also create new job opportunities in areas such as software development, data analysis, and machine learning. These new jobs may require a different set of skills and expertise than traditional jobs, which could lead to a shift in the job market and a need for retraining and reskilling of workers.

  • Changes in skill requirements: As AI and chatbots become more prevalent in the workplace, the skills required for certain jobs may change. Workers may need to develop skills in areas such as data analysis and programming to remain competitive in the job market. This could lead to a skills gap if workers are unable or unwilling to develop these new skills.

  • Impact on low-wage jobs: AI and chatbots may have a greater impact on low-wage jobs, as these roles are often more routine in nature and therefore more susceptible to automation. This could lead to income inequality as workers in low-wage jobs are displaced and struggle to find new employment opportunities.

  • Impact on industries: Certain industries may be more vulnerable to job losses as a result of AI and chatbots. For example, the transportation and logistics industries may be impacted by the development of autonomous vehicles, which could replace human drivers. Similarly, the manufacturing industry may see increased automation as a result of advances in AI and robotics.

Concerns about accuracy and bias in programming

The use of artificial intelligence (AI) and chatbots like ChatGPT has raised concerns about the accuracy and potential bias in programming. These concerns arise due to the fact that AI and chatbots are programmed by humans, who may unconsciously introduce their own biases and prejudices into the technology. This can have a significant impact on the accuracy and fairness of AI and chatbots, particularly in areas such as decision-making, where the consequences of bias can be severe.

  • Bias in programming: One of the main concerns about AI and chatbots is that they may perpetuate and amplify biases that are present in society. This can happen in a number of ways, such as when programmers use biased datasets to train algorithms, or when they unconsciously introduce their own biases into the programming. For example, an AI algorithm used in hiring may be trained using data that is biased against certain groups, resulting in discrimination against those groups in the hiring process.

  • Lack of transparency: Another concern is the lack of transparency in AI and chatbot programming. In some cases, it may be difficult to understand how an AI or chatbot arrived at a particular decision or response, making it difficult to identify and correct biases. This can lead to mistrust and skepticism of AI and chatbots, particularly in areas where the consequences of bias can be severe, such as criminal justice and healthcare.

  • Quality of data: The accuracy of AI and chatbots is also dependent on the quality of data used to train the algorithms. If the data is inaccurate or incomplete, the algorithms may produce inaccurate or biased results. This can be particularly problematic in areas where data is scarce or difficult to collect, such as in developing countries or in marginalized communities.

  • Ethical considerations: The use of AI and chatbots also raises ethical considerations, particularly in areas such as privacy and surveillance. For example, if chatbots are used to collect personal data from users, there is a risk that this data could be misused or shared without the user’s consent. Similarly, algorithms used in decision-making may have a disproportionate impact on certain groups, raising questions about fairness and justice.

In conclusion, while the use of AI and chatbots like ChatGPT has the potential to revolutionize many aspects of our lives, it also raises important concerns about accuracy and bias in programming. Businesses and policymakers will need to take steps to ensure that AI and chatbots are programmed in a way that is fair, transparent, and ethical, and that they do not perpetuate or amplify biases in society. This will require ongoing monitoring and evaluation of the technology, as well as a commitment to diversity and inclusion in programming teams.

Conclusion

In conclusion, ChatGPT has the potential to significantly impact programming languages and programmers. The natural language processing abilities of ChatGPT could make programming easier and more accessible for those without extensive programming knowledge. Additionally, it could potentially lead to the creation of new programming languages. For programmers, ChatGPT could make communication and collaboration more efficient, as well as automate certain programming tasks. However, there are also concerns about the limitations of ChatGPT, its impact on the job market, and accuracy and bias in programming. It is important to continue exploring and monitoring the impact of ChatGPT on programming, while also addressing these concerns through ethical programming practices. Ultimately, the successful integration of ChatGPT into programming will require a balanced approach that considers both the potential benefits and drawbacks.

Ethereum Testnet Successfully Simulates Staked-ETH Withdrawals for the First Time

On February 7, the Ethereum Zhejiang testnet activated the Shanghai upgrade at epoch 1350, at 15:00 UTC (23:00 Beijing time), successfully simulating withdrawals of staked ETH for the first time. On the Zhejiang testnet, partial and full withdrawals as well as BLS credential changes are included in the execution payload. The next testnet upgrade will take place on Sepolia sometime next week, followed by the Goerli testnet.

Proof of Stake and Staking

The ETH Merge

The Merge refers to the original Ethereum execution layer (the Mainnet that has existed since genesis) joining its new proof-of-stake consensus layer, the Beacon Chain. The Merge did away with energy-intensive mining, securing the network instead through staked ETH.

The Beacon Chain was created on December 1, 2020 and ran alongside Mainnet as a separate blockchain. The Merge was executed on September 15, 2022, completing Ethereum's transition to proof-of-stake consensus: Ethereum formally deprecated proof of work and cut its energy consumption by roughly 99.95%.

What is proof of stake?

Proof of stake is a class of algorithms that secures a blockchain by ensuring that attackers who behave dishonestly stand to lose assets of value. Proof-of-stake systems require a set of validators to put up an asset that can be destroyed if the validator engages in provably dishonest behavior. Ethereum uses a proof-of-stake mechanism to secure its blockchain.

What is staking?

Staking is the act of depositing 32 ETH to activate validator software. As a validator, you are responsible for storing data, processing transactions, and adding new blocks to the blockchain. This keeps Ethereum secure for everyone and earns you new ETH in the process.

Original article: https://ethereum.org/zh/zero-knowledge-proofs/

What Are Zero-Knowledge Proofs (ZKP)?

A zero-knowledge proof is a way of proving the validity of a statement without revealing the statement itself. The "prover" is the party trying to prove a claim, while the "verifier" is responsible for validating the claim.

Zero-knowledge proofs first appeared in a 1985 paper, "The Knowledge Complexity of Interactive Proof Systems", which provides the definition of a zero-knowledge proof in wide use today:

A zero-knowledge protocol is a method by which one party (the prover) can prove to another party (the verifier) that something is true, without revealing any information apart from the fact that this specific statement is true.

Zero-knowledge proofs have improved over the years, and they are now being used in several real-world applications.

Why do we need zero-knowledge proofs?

Zero-knowledge proofs represent a breakthrough in applied cryptography, as they promise to improve the security of personal information. Consider how you might prove a claim (e.g., "I am a citizen of country X") to another party, such as a service provider. You would need to provide "evidence" to back up your claim, such as a national passport or driver's license.

But this approach has problems, chiefly the lack of privacy. Personally Identifiable Information (PII) shared with third-party services is stored in central databases, which are vulnerable to hacks. With identity theft becoming a critical issue, there are calls for more privacy-protecting means of sharing sensitive information.

Zero-knowledge proofs solve this problem by eliminating the need to reveal information to prove the validity of a claim. The zero-knowledge protocol takes the statement (called a "witness") as input and generates a succinct proof of its validity. This proof provides strong guarantees that a statement is true without exposing the information used in creating it.

Returning to our earlier example, the only evidence you need to prove your citizenship claim is a zero-knowledge proof. The verifier only checks whether certain properties of the proof hold true to be convinced that the underlying statement holds true as well.

How do zero-knowledge proofs work?

A zero-knowledge proof lets you prove the truth of a statement without sharing the statement's contents or revealing how you discovered the truth. To make this possible, zero-knowledge protocols rely on algorithms that take some data as input and return "true" or "false" as output.

A zero-knowledge protocol must satisfy the following criteria:

  1. Completeness: If the input is valid, the zero-knowledge protocol always returns "true". Hence, if the underlying statement is true and the prover and verifier act honestly, the proof can be accepted.

  2. Soundness: If the input is invalid, it is theoretically impossible to fool the zero-knowledge protocol into returning "true". Hence, a lying prover cannot trick an honest verifier into believing an invalid statement is valid (except with a tiny margin of probability).

  3. Zero-knowledge: The verifier learns nothing about a statement beyond its validity or falsity (they have "zero knowledge" of the statement). This requirement also prevents the verifier from deriving the original input (the statement's contents) from the proof.

In basic form, a zero-knowledge proof is made up of three elements: the witness, the challenge, and the response.

  • Witness: With a zero-knowledge proof, the prover wants to prove knowledge of some hidden information. This secret information is the "witness" to the proof, and the prover's assumed knowledge of the witness establishes a set of questions that can only be answered by a party with knowledge of the information. Thus, the prover starts the proving process by randomly choosing a question, calculating the answer, and sending it to the verifier.

  • Challenge: The verifier randomly picks another question from the set and asks the prover to answer it.

  • Response: The prover accepts the question, calculates the answer, and returns it to the verifier. The prover's response allows the verifier to check if the former really has knowledge of the witness. To ensure the prover isn't guessing blindly and getting the correct answers by chance, the verifier picks more questions to ask. By repeating this interaction many times, the possibility of the prover faking knowledge of the witness drops significantly until the verifier is satisfied.

The above describes the structure of an "interactive zero-knowledge proof". Early zero-knowledge protocols used interactive proving, where verifying the validity of a statement required back-and-forth communication between the prover and verifier.

A good example illustrating how interactive proofs work is Jean-Jacques Quisquater's famous Ali Baba cave story. In the story, Peggy (the prover) wants to prove to Victor (the verifier) that she knows the secret phrase to open a magic door, without revealing the phrase itself.

Non-interactive zero-knowledge proofs

While revolutionary, interactive proving had limited usefulness since it required the two parties to be available and to interact repeatedly. Even if a verifier was convinced of a prover's honesty, the proof would be unavailable for independent verification (computing a new proof required a new set of messages between the prover and verifier).

To solve this problem, Manuel Blum, Paul Feldman, and Silvio Micali proposed the first non-interactive zero-knowledge proofs, where the prover and verifier have a shared key. This allows the prover to demonstrate their knowledge of some information (i.e., the witness) without providing the information itself.

Unlike interactive proofs, non-interactive proofs require only one round of communication between the participants (the prover and verifier). The prover passes the secret information to a special algorithm to compute a zero-knowledge proof. This proof is sent to the verifier, who checks that the prover knows the secret information using another algorithm.

Non-interactive proving reduces communication between the prover and verifier, making zero-knowledge proofs more efficient. Moreover, once a proof is generated, it is available for anyone else (with access to the shared key and verification algorithm) to verify.
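
The commitment, challenge, and response rounds described above, and the hash-derived challenge that removes the interaction, can be sketched with a toy Schnorr-style protocol. All parameters here are illustrative and far too small to be secure; a real implementation would use a cryptographic group and hash function:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy parameters, chosen only for illustration.
const P: u64 = 1_000_003; // small prime modulus
const Q: u64 = P - 1;     // exponents work mod p-1 (Fermat's little theorem)
const G: u64 = 5;         // public base

// Modular exponentiation by squaring.
fn mod_pow(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut acc = 1u64;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % m;
        }
        base = base * base % m;
        exp >>= 1;
    }
    acc
}

// Non-interactive variant: derive the challenge by hashing the transcript
// (the Fiat-Shamir idea) instead of asking a live verifier for it.
fn fiat_shamir_challenge(t: u64, y: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (t, y).hash(&mut h);
    h.finish() % Q
}

fn main() {
    // Witness: the prover's secret x; public statement: y = g^x mod p.
    let x: u64 = 123_456;
    let y = mod_pow(G, x, P);

    // 1. Commitment: prover picks a nonce r and sends t = g^r.
    let r: u64 = 987_654; // would be freshly random in a real run
    let t = mod_pow(G, r, P);

    // 2. Challenge: interactively, the verifier would pick c at random;
    //    non-interactively, the prover hashes the transcript to get c.
    let c = fiat_shamir_challenge(t, y);

    // 3. Response: s = r + c*x (mod p-1); x itself is never sent.
    let s = (r + c * x) % Q;

    // Verification: g^s == t * y^c (mod p) holds exactly when the
    // response was computed with knowledge of x.
    assert_eq!(mod_pow(G, s, P), t * mod_pow(y, c, P) % P);
    println!("proof verified without revealing x");
}
```

The check works because g^s = g^(r + c·x) = g^r · (g^x)^c = t · y^c (mod p), so the verifier learns that the prover could form s consistently with the commitment and challenge, without ever seeing x.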

Non-interactive proofs represented a breakthrough for zero-knowledge technology and spurred the development of the proving systems used today. We discuss these proof types below:

Types of zero-knowledge proofs

ZK-SNARKs

ZK-SNARK is an acronym for Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. The ZK-SNARK protocol has the following qualities:

  • Zero-Knowledge: A verifier can validate the integrity of a statement without knowing anything else about the statement. The only knowledge the verifier has of the statement is whether it is true or false.

  • Succinct: The zero-knowledge proof is smaller than the witness and can be verified quickly.

  • Non-Interactive: The proof is "non-interactive" because the prover and verifier only interact once, unlike interactive proofs that require multiple rounds of communication.

  • Argument: The proof satisfies the "soundness" requirement, so cheating is extremely unlikely.

  • (Of) Knowledge: The zero-knowledge proof cannot be constructed without access to the secret information (the witness). It is difficult, if not impossible, for a prover who doesn't have the witness to compute a valid zero-knowledge proof.

The "shared key" mentioned earlier refers to public parameters that the prover and verifier agree to use in generating and verifying proofs. Generating the public parameters (collectively known as the Common Reference String (CRS)) is a sensitive operation because of its importance in the protocol's security. If the entropy (randomness) used in generating the CRS gets into the hands of a dishonest prover, they can compute false proofs.

Multi-party computation (MPC) is a way of reducing the risks in generating public parameters. Multiple parties participate in a trusted setup ceremony, where each person contributes some random values to generate the CRS. As long as one honest party destroys their portion of the entropy, the ZK-SNARK protocol retains computational soundness.

Trusted setups require users to trust the participants in parameter generation. However, the development of ZK-STARKs has enabled proving protocols that work with a non-trusted setup.

ZK-STARKs

ZK-STARK is an acronym for Zero-Knowledge Scalable Transparent Argument of Knowledge. ZK-STARKs are similar to ZK-SNARKs, except that they are:

  • Scalable: ZK-STARK is faster than ZK-SNARK at generating and verifying proofs when the size of the witness is larger. With STARK proofs, prover and verifier times only increase slightly as the witness grows (SNARK prover and verifier times increase linearly with witness size).

  • Transparent: ZK-STARK relies on publicly verifiable randomness to generate public parameters for proving and verification instead of a trusted setup. Thus, they are more transparent compared to ZK-SNARKs.

ZK-STARKs produce larger proofs than ZK-SNARKs, meaning they generally have higher verification overheads. However, there are cases (such as proving large datasets) where ZK-STARKs may be more cost-effective than ZK-SNARKs.

Use cases for zero-knowledge proofs

Anonymous payments

Credit card payments are often visible to multiple parties, including the payments provider, banks, and other interested parties (e.g., government authorities). While financial surveillance has benefits for identifying illegal activity, it also undermines the privacy of ordinary citizens.

Cryptocurrencies were intended to provide a means for users to conduct private, peer-to-peer transactions. But most cryptocurrency transactions are openly visible on public blockchains. User identities are often pseudonymous and either deliberately linked to real-world identities (e.g., by including an ETH address in a Twitter or GitHub profile) or can be associated with real-world identities using basic on-chain and off-chain data analysis.

There are specific "privacy coins" designed for completely anonymous transactions. Privacy-focused blockchains such as Zcash and Monero shield transaction details, including sender/receiver addresses, asset type, quantity, and the transaction timeline.

By baking zero-knowledge technology into the protocol, privacy-focused blockchain networks allow nodes to validate transactions without needing to access transaction data.

Zero-knowledge proofs are also being applied to anonymizing transactions on public blockchains. An example is Tornado Cash, a decentralized, non-custodial service that allows users to conduct private transactions on Ethereum. Tornado Cash uses zero-knowledge proofs to obfuscate transaction details and guarantee financial privacy. Unfortunately, because these are "opt-in" privacy tools, they have become associated with illicit activity. To overcome this, privacy would ultimately have to become the default on public blockchains.

Identity protection

Current identity management systems put personal information at risk. Zero-knowledge proofs can help individuals validate their identity while protecting sensitive details.

Zero-knowledge proofs are particularly useful in the context of decentralized identity. Decentralized identity (also called "self-sovereign identity") gives individuals the ability to control access to their personal identifiers. Proving your citizenship without revealing your tax ID or passport details is a good example of how zero-knowledge technology enables decentralized identity.

Authentication

Using online services requires proving your identity and your right to access those platforms. This often requires providing personal information, such as names, email addresses, birth dates, and so on. You may also need to memorize long passwords or risk losing access.

Zero-knowledge proofs, however, can simplify authentication for both platforms and users. Once a zero-knowledge proof has been generated using public inputs (e.g., data attesting to the user's membership of the platform) and private inputs (e.g., the user's details), the user can simply present it to authenticate their identity whenever they need to access the service. This improves the experience for users and frees organizations from needing to store huge amounts of user information.

Verifiable computation

Verifiable computation is another application of zero-knowledge technology for improving blockchain designs. Verifiable computation allows us to outsource computation to another entity while maintaining verifiable results. The entity submits the result along with a proof verifying that the program was executed correctly.

Verifiable computation is critical to improving processing speeds on blockchains without reducing security. Understanding this requires knowing the differences between the proposed solutions for scaling Ethereum.

On-chain scaling solutions, such as sharding, require heavy modification of the blockchain's base layer. However, this approach is highly complex, and errors in implementation can undermine Ethereum's security model.

Off-chain scaling solutions don't require redesigning the core Ethereum protocol. Instead, they rely on an outsourced computation model to improve throughput on Ethereum's base layer.

This is how that works in practice:

  • Instead of processing every transaction, Ethereum offloads execution to a separate chain.
  • After processing transactions, the other chain returns the results to be applied to Ethereum's state.

The benefit here is that Ethereum doesn't have to do any execution and only needs to apply the results of the outsourced computation to its state. This reduces network congestion and also improves transaction speeds (off-chain protocols optimize for faster execution).

The chain needs a way to validate off-chain transactions without re-executing them, or else the value of off-chain execution is lost.

This is where verifiable computation comes into play. When a node executes a transaction outside of Ethereum, it submits a zero-knowledge proof to prove the correctness of the off-chain execution. This proof (called a validity proof) guarantees that a transaction is valid, allowing Ethereum to apply the result to its state without waiting for anyone to dispute it.

Zero-knowledge rollups and validiums are two off-chain scaling solutions that use validity proofs to provide secure scalability. These protocols execute thousands of transactions off-chain and submit proofs for verification on Ethereum. Once the proof is verified, those results can be applied immediately, allowing Ethereum to process more transactions without increasing computation on the base layer.
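
The offload-execute-verify loop can be caricatured in a few lines of Rust. Everything here is hypothetical scaffolding: the `tag` field is a stand-in checksum, whereas a real rollup would attach a ZK-SNARK/STARK validity proof and verify it with an on-chain verifier contract:

```rust
// Hypothetical sketch of the off-chain execution flow described above.
struct Proof {
    claimed_state: u64,
    tag: u64, // stand-in for a real validity proof
}

// Off-chain executor: applies a batch of transactions and emits the
// resulting state together with a "proof" of correct execution.
fn execute_offchain(state: u64, batch: &[u64]) -> Proof {
    let new_state = batch.iter().fold(state, |s, tx| s.wrapping_add(*tx));
    Proof { claimed_state: new_state, tag: new_state ^ 0xDEAD_BEEF }
}

// On-chain side: runs a cheap verification check instead of re-executing
// the whole batch, and applies the result only if the check passes.
fn apply_on_chain(state: &mut u64, p: &Proof) -> bool {
    if p.tag == p.claimed_state ^ 0xDEAD_BEEF {
        *state = p.claimed_state;
        true
    } else {
        false
    }
}

fn main() {
    let mut l1_state = 100u64;
    let proof = execute_offchain(l1_state, &[5, 7, 9]);
    assert!(apply_on_chain(&mut l1_state, &proof));
    assert_eq!(l1_state, 121); // 100 + 5 + 7 + 9
    println!("state updated without re-executing the batch");
}
```

The point of the sketch is the asymmetry: execution is arbitrarily expensive off-chain, while the on-chain step is a fixed-cost check plus a state write, which is what lets the base layer process more transactions.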

Drawbacks of using zero-knowledge proofs

Hardware costs

Generating zero-knowledge proofs involves very complex calculations best performed on specialized machines. As these machines are expensive, they are often out of the reach of regular individuals. Additionally, applications that want to use zero-knowledge technology must factor in hardware costs, which may increase costs for end users.

Proof verification costs

Verifying proofs also requires complex computation and increases the cost of implementing zero-knowledge technology in applications. This cost is particularly relevant in the context of proving computation. For example, ZK-rollups pay roughly 500,000 gas to verify a single ZK-SNARK proof on Ethereum, with ZK-STARKs requiring even higher fees.

Trust assumptions

In ZK-SNARK, the Common Reference String (public parameters) is generated once and available for reuse by parties who wish to participate in the zero-knowledge protocol. Public parameters are created via a trusted setup ceremony, where participants are assumed to be honest.

But there is really no way for users to assess the honesty of participants; users have to take developers at their word. ZK-STARKs are free from trust assumptions since the randomness used in generating the string is publicly verifiable. In the meantime, researchers are working on non-trusted setups for ZK-SNARKs to increase the security of the proving mechanism.

Quantum computing threats

ZK-SNARK uses elliptic curve cryptography (ECDSA) for encryption. While the ECDSA algorithm is secure for now, the development of quantum computers could break its security model in the future.

ZK-STARK is considered immune to the threat of quantum computing, as it uses collision-resistant hashes for encryption. Unlike the public-private key pairings used in elliptic curve cryptography, collision-resistant hashing is more difficult for quantum computing algorithms to break.

Further reading

Computer Scientist Explains One Concept in 5 Levels of Difficulty - Wired YouTube channel

SNARKs vs. STARKS vs. Recursive SNARKs — Alchemy Overviews

A Zero-Knowledge Proof: Improving Privacy on a Blockchain — Dmitry Lavrenov

zk-SNARKs — A Realistic Zero-Knowledge Example and Deep Dive — Adam Luciano

ZK-STARKs — Create Verifiable Trust, even against Quantum Computers — Adam Luciano

An approximate introduction to how zk-SNARKs are possible — Vitalik Buterin

What is Zero-Knowledge Proof and Its Role in Blockchain? — LeewayHertz

Every year brings novel experiences, and every year of life is colorful. 2022 was another amazing year.

2022 started with my thesis. On January 1, New Year's Day, I finished the last chapter of my master's thesis: the acknowledgments. I genuinely treated that chapter as the most important part of the entire thesis, thanking my parents, teachers, classmates, friends, colleagues, and everyone who had helped me over the past two years. I graduated in April, spent a month locked down at home, and got out of the quarantine site on my birthday. In May I moved to Hangzhou and formally started my job, completing the transition from student to working adult. From May to December it was work, go out and have fun, work, go out and have fun, and on the last day the year ended with a 150 km drive to watch a movie at Qiandao Lake.

I went to a lot of places this year. In May I officially moved to Hangzhou, which should become the third city I live in long-term. I went to Zhoushan, Anji, Shanghai, and Qiandao Lake, and still spent plenty of time in Nanjing; I've lived in that city for eight years now, and compared with Hangzhou I still rather like Nanjing.

I met many new friends this year: colleagues at the company, a bunch of wizard-level senior (high-P) experts, and top-notch peers of my own age at new-hire training. I met a rather interesting girl while getting my driver's license. No new gaming friends, though; I suppose life has been so exciting that games have lost their appeal for me.

I had plenty of novel experiences this year: beachcombing in Zhoushan, skiing in Anji, cycling at Qiandao Lake, visiting exhibitions in Shanghai, and watching old men fly kites by West Lake. In Nanjing I went to stand-up comedy shows, art exhibitions, cat cafés to pet cats and dogs, and tried all kinds of wonderful Vietnamese and Indian food; hanging out with Qiqi was really fun. I had heard the Nanjing Library was great ever since my freshman year, and this year I finally visited. The only regret is failing to get a reservation for the Nanjing Museum over the National Day holiday.

I built two Lego sets this year: Friends and the Ducati.

At work this year I wrote a long technical-sharing article that was very well received: nearly 200 new stars on GitHub, several thousand reads on OSChina, and I'm even planning to use it as the basis for an e-book. That was the most satisfying accomplishment of the year. The paper I wrote last year finally got accepted; it's only a minor venue, but it's my first accepted paper.

This was also a year of slacking off: brave at starting projects, never finishing them. The Arduboy handheld project with yyj stalled because we were too busy; I read half of Compilers (the Dragon Book), Code Complete, and Agile Software Development and lost interest, and the only book I finished was The Mythical Man-Month; the e-book mentioned above has stalled; the thing I was writing for Qiqi keeps getting pushed back (I'll try not to abandon this one 😐); I even dropped several games halfway through. Life, after all, is a search for the people and things you love, even if you only love them briefly. Solemn vows can't bind me; in a hundred years there are so many people (and things) I want to love.

I picked up a lot of new skills this year: a little Rust, a little Kubernetes, a bit of understanding of blockchain and Web3.0, a little LLVM, and even a tiny bit of the Selected Works of Mao. I bought a domain and set up a blog (👆 the one you're reading), even though it's all shallow posts and my own ramblings. At work I spent a little time running the project's WeChat official account. I made 385 LeetCode submissions this year and solved roughly 200 problems, though in a sense that fizzled out too. At the end of the year I finally took motorcycle lessons and got my license; in the new year I hope to ride more, and to ride with friends.

I still spent a lot of time gaming this year. Early in the year I became a Genshin player and played for a few months before graduation. Still playing LOL: after ten years I finally hit Diamond for the first time, in TFT, though my ranked games keep getting lazier. Still bad at Apex, still love it. I started playing DiRT Rally and bought a wheel, pedals, and a handbrake. I borrowed my roommate's Switch and PS5, but with a controller I play like I'm handicapped; keyboard and mouse it is. I played lots of small games: Door Kickers (CQB), Brotato, Dead Cells, Gunfire Reborn (roguelite). Gaming aside, life itself is still my primary source of joy.

This was another year of being single.

I didn't originally want to write much about work. As my boss put it, "An Ant employee's ceiling shouldn't only be a 375", and "a programmer shouldn't live only for work and money". That doesn't contradict my wish to be an excellent programmer (I think I still count as a programmer, or at least a programming enthusiast?).

Enjoy my fucking life!

https://github.com/zhengda/The-Mythical-Man-Month-zh

Contents
Chapter 1: The Tar Pit

  • 1.1 A programming systems product costs about nine times as much effort to develop as a component program written for its author's private use. I estimate that productizing a program imposes a 3x cost, and the design, integration, and testing required to combine components into a complete system imposes another 3x; these cost factors are essentially independent of each other.

  • 1.2 The craft of programming "gratifies creative longings built deep within us and delights sensibilities we have in common with all men", providing five kinds of joys:

    • The sheer joy of making things
    • The pleasure of making things that are useful to other people
    • The fascination of fashioning puzzle-like objects of interlocking, moving parts
    • The joy of always learning, which springs from the nonrepeating nature of the task
    • The delight of working in such a tractable medium: pure thought-stuff, which exists, moves, and works in ways unlike physical objects
  • 1.3 Likewise, the craft has some inherent woes:

    • Adjusting to the requirement of perfection is the hardest part of learning to program
    • Others set the objectives, and one must depend on things (especially programs) one cannot control; the authority is not equal to the responsibility
    • In practice, the situation is better than it sounds: real authority comes from accomplishing each task
    • Any creative activity comes with dreary, hard labor, and programming is no exception
    • People usually expect a project to converge faster (in bugs and in hours) as it nears the end, yet software projects converge more slowly the closer they get to completion
    • The product always faces the threat of obsolescence just as it nears completion

CombinedLintPass

Background

The previous article introduced some concepts and implementation details of Lint and LintPass. Based on those structures, it presented a simple way to implement Lint checks. This article mainly introduces the implementation of the CombinedLintPass structure, and further optimizes the Lint implementation based on CombinedLintPass.
