How do I fetch a git object over HTTP using the smart protocol (raw)?

the*_*hai 13 git

I'm trying to use the git smart protocol over HTTP to fetch the annotation of the tag "v2.4.2" from github.com/git/git.

// Fetch the refs

curl -H "User-Agent: git/1.8.1" -v "https://github.com/git/git/info/refs?service=git-upload-pack"

This returns the refs:

.....
003e2be062dfcfd1fd4aca132ec02a40b56f63776202 refs/tags/v2.4.1
0041aaa7e0d7f8f003c0c8ab34f959083f6d191d44ca refs/tags/v2.4.1^{}
003e29932f3915935d773dc8d52c292cadd81c81071d refs/tags/v2.4.2
00419eabf5b536662000f79978c4d1b6e4eff5c8d785 refs/tags/v2.4.2^{}

// Send the upload-pack request

printf "0031want 00419eabf5b536662000f79978c4d1b6e4eff5c8d785\n0024have 003e2be062dfcfd1fd4aca132ec02a40b56f63776202\n0000" | curl -H "User-Agent: git/1.8.1" -v  -d @- https://github.com/git/git/git-upload-pack -H "Content-Type: application/x-git-upload-pack-request" --trace-ascii /dev/stdout

This returns nothing. I'd like to know what is wrong with the request (i.e. did I miscalculate the hex?)

Warning: --trace-ascii overrides an earlier trace/verbose option
== Info: Hostname was NOT found in DNS cache
== Info:   Trying 192.30.252.130...
== Info: Connected to github.com (192.30.252.130) port 443 (#0)
== Info: TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
== Info: Server certificate: github.com
== Info: Server certificate: DigiCert SHA2 Extended Validation Server CA
== Info: Server certificate: DigiCert High Assurance EV Root CA
=> Send header, 170 bytes (0xaa)
0000: POST /git/git/git-upload-pack HTTP/1.1
0028: Host: github.com
003a: Accept: */*
0047: User-Agent: git/1.8.1
005e: Content-Type: application/x-git-upload-pack-request
0093: Content-Length: 110
00a8: 
=> Send data, 110 bytes (0x6e)
0000: 0031want 00419eabf5b536662000f79978c4d1b6e4eff5c8d7850024have 00
0040: 3e2be062dfcfd1fd4aca132ec02a40b56f637762020000
== Info: upload completely sent off: 110 out of 110 bytes
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
== Info: Server GitHub Babel 2.0 is not blacklisted
<= Recv header, 26 bytes (0x1a)
0000: Server: GitHub Babel 2.0
<= Recv header, 52 bytes (0x34)
0000: Content-Type: application/x-git-upload-pack-result
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 40 bytes (0x28)
0000: Expires: Fri, 01 Jan 1980 00:00:00 GMT
<= Recv header, 18 bytes (0x12)
0000: Pragma: no-cache
<= Recv header, 53 bytes (0x35)
0000: Cache-Control: no-cache, max-age=0, must-revalidate
<= Recv header, 23 bytes (0x17)
0000: Vary: Accept-Encoding
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 5 bytes (0x5)
0000: 0
0003: 
== Info: Connection #0 to host github.com left intact

Why am I doing this?

  • I don't have write access to the file system
  • To avoid fetching unnecessary data (i.e. commits)
  • Standard API/protocol

lar*_*sks 9

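Commit hex

You didn't miscalculate the hex, but you aren't passing the right value. Remember that every line in the smart protocol is prefixed with a length count: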

<length><data>

So for a line that looks like this:

00419eabf5b536662000f79978c4d1b6e4eff5c8d785 refs/tags/v2.4.2^{}

You need to discard the first four characters, which leaves the actual commit hex:

9eabf5b536662000f79978c4d1b6e4eff5c8d785
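
As a rough illustration of the step above, here is a minimal Python sketch that splits an advertised line into its length prefix, object id and ref name. The function name parse_ref_line is made up for this example, and it assumes a plain line with no capability list (the first ref line of a real advertisement carries extra data after a NUL byte).

```python
def parse_ref_line(line):
    """Split one advertised pkt-line into (length, object id, ref name).

    Hypothetical helper for illustration; assumes no capability list.
    """
    length = int(line[:4], 16)        # first four characters: hex length prefix
    payload = line[4:].rstrip("\n")   # everything after the prefix
    object_id, ref_name = payload.split(" ", 1)
    return length, object_id, ref_name


length, object_id, ref_name = parse_ref_line(
    "00419eabf5b536662000f79978c4d1b6e4eff5c8d785 refs/tags/v2.4.2^{}\n"
)
print(length, object_id, ref_name)
```

The ^{} suffix marks the peeled entry, i.e. the object the annotated tag points at; the tag object itself is the id advertised on the plain refs/tags/v2.4.2 line.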

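Request format

When POSTing a request, the have and want lines should be separated by newlines, but if you look at the output from curl you can see that there are no newlines: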

=> Send data, 110 bytes (0x6e)
0000: 0031want 00419eabf5b536662000f79978c4d1b6e4eff5c8d7850024have 00
0040: 3e2be062dfcfd1fd4aca132ec02a40b56f637762020000

You need to use --data-binary instead of --data, because --data strips newlines when it reads the body from a file or stdin:

--data-binary @-

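You need to prefix each of these lines with a length count, and you need to end with a line containing 0000. The count is four hex digits and covers the four length characters themselves plus the payload, including the trailing newline (0x32 = 50 = 4 + 46):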

0032want 9eabf5b536662000f79978c4d1b6e4eff5c8d785
0032have 2be062dfcfd1fd4aca132ec02a40b56f63776202
0000
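
To make the length arithmetic concrete, here is a small Python sketch of a length-prefix encoder. The name pkt_line is just a label chosen for this example; the sketch assumes ASCII payloads and does no size checking.

```python
def pkt_line(payload):
    """Prefix payload with its total length as four lowercase hex digits.

    The count includes the four length characters themselves.
    """
    data = payload.encode("ascii")
    return f"{len(data) + 4:04x}".encode("ascii") + data


# The trailing newline is part of the payload and is counted in the length.
want = pkt_line("want 9eabf5b536662000f79978c4d1b6e4eff5c8d785\n")
have = pkt_line("have 2be062dfcfd1fd4aca132ec02a40b56f63776202\n")
print(want.decode("ascii"), have.decode("ascii"), sep="")
```

Both lines come out with the 0032 prefix used in the listing above.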

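Debugging tips

If you want a lot of debugging information from git, you can set GIT_TRACE_PACKET=1 in your environment to see exactly what it sends back and forth.

That's all she wrote

Even with the information above I wasn't able to get a response myself, but I figured this would help.

Update

So, this is interesting.

I set up a git server locally (using git http-backend and thttpd), then ran tcpdump to capture the traffic generated by a git remote update operation. It turns out that you need to separate the want and have directives with a null command (a flush packet), which is 0000. It has no trailing newline; the newline after a regular line is part of that line's payload and is counted in its length. That is: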

<length>want <commitid><newline>
0000<length>have <commitid><newline>
<length>done<newline>

For example:

0032want 9eabf5b536662000f79978c4d1b6e4eff5c8d785
00000032have 2be062dfcfd1fd4aca132ec02a40b56f63776202
0009done

This gave me:

0000: POST /git/git/git-upload-pack HTTP/1.1
0028: Host: github.com
003a: Accept: */*
0047: Content-type: application/x-git-upload-pack-request
007c: User-agent: git/1.8
0091: Content-Length: 113
00a6: 
=> Send data, 113 bytes (0x71)
0000: 0032want 9eabf5b536662000f79978c4d1b6e4eff5c8d785.00000032have 2
0040: be062dfcfd1fd4aca132ec02a40b56f63776202.0009done.
== Info: upload completely sent off: 113 out of 113 bytes
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
<= Recv header, 26 bytes (0x1a)
0000: Server: GitHub Babel 2.0
<= Recv header, 52 bytes (0x34)
0000: Content-Type: application/x-git-upload-pack-result
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 40 bytes (0x28)
0000: Expires: Fri, 01 Jan 1980 00:00:00 GMT
<= Recv header, 18 bytes (0x12)
0000: Pragma: no-cache
<= Recv header, 53 bytes (0x35)
0000: Cache-Control: no-cache, max-age=0, must-revalidate
<= Recv header, 23 bytes (0x17)
0000: Vary: Accept-Encoding
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 4 bytes (0x4)
0000: 31
<= Recv data, 51 bytes (0x33)
0000: 0031ACK 2be062dfcfd1fd4aca132ec02a40b56f63776202.
<= Recv data, 6 bytes (0x6)
0000: 1fff
<= Recv data, 1370 bytes (0x55a)
0000: PACK.......[..x...An.0...z.?`..d.*..@..z..(.tu......>~B.....]..8
0040: 2...j).OQ}..#.....'......[..8K..t..,%.S..u..@l..XT...o......'...
[....]
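
The request body that produced this trace can be assembled along the following lines. This is a sketch under assumptions, not the author's method: build_request and post_upload_pack are hypothetical helper names, the headers simply mirror the ones visible in the trace, and post_upload_pack is defined but not called here because it needs network access.

```python
import urllib.request


def pkt_line(payload):
    """Prefix payload with its total length (4 hex digits, prefix included)."""
    data = payload.encode("ascii")
    return f"{len(data) + 4:04x}".encode("ascii") + data


def build_request(want, have):
    """want line, 0000 separator, have line, then done (hypothetical helper)."""
    return (
        pkt_line(f"want {want}\n")
        + b"0000"                      # null command, no trailing newline
        + pkt_line(f"have {have}\n")
        + pkt_line("done\n")
    )


def post_upload_pack(repo_url, body):
    """POST the body to <repo_url>/git-upload-pack and return the raw reply."""
    request = urllib.request.Request(
        repo_url + "/git-upload-pack",
        data=body,  # supplying data makes urllib send a POST
        headers={
            "Content-Type": "application/x-git-upload-pack-request",
            "User-Agent": "git/1.8",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


body = build_request(
    "9eabf5b536662000f79978c4d1b6e4eff5c8d785",
    "2be062dfcfd1fd4aca132ec02a40b56f63776202",
)
print(len(body))
```

The body is 113 bytes long, the same as the Content-Length in the trace. Calling post_upload_pack("https://github.com/git/git", body) would be the equivalent of the curl invocation, assuming the server still accepts this form of request.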

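Double bonus update

You can extract the packfile using the git unpack-objects command. As you can see in the trace above, you first get a length-encoded response (0031ACK 2be062dfcfd1fd4aca132ec02a40b56f63776202), followed by the pack data, so you need to discard the first line: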

$ git init tmprepo
$ cd tmprepo
$ tail -n +2 output_from_curl | git unpack-objects
Unpacking objects: 100% (91/91), done.
$ find .git/objects -type f | head -3
$ git cat-file -p dc940e63c453199dd9a7285533fbf2355bab03d1
/*
 * GIT - The information manager from hell
 *
 * Copyright (C) Linus Torvalds, 2005
 *
 * This handles basic git sha1 object files - packing, unpacking,
 * creation etc.
 */
[...]
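
Dropping the first line with tail works here because the ACK line ends in a newline. A length-based alternative is sketched below in Python. It assumes the HTTP chunked encoding has already been decoded and that no side-band capability was negotiated, so the reply is a few length-prefixed lines followed directly by the raw pack; split_response and pack_header are names invented for this example.

```python
import struct


def split_response(raw):
    """Return (status lines, pack bytes) from an upload-pack reply.

    Assumes length-prefixed lines followed directly by raw PACK data.
    """
    lines = []
    pos = 0
    while pos + 4 <= len(raw) and raw[pos:pos + 4] != b"PACK":
        size = int(raw[pos:pos + 4], 16)
        if size == 0:                  # 0000 carries no payload
            pos += 4
            continue
        lines.append(raw[pos + 4:pos + size])
        pos += size
    return lines, raw[pos:]


def pack_header(pack):
    """Decode the 12-byte pack header: signature, version, object count."""
    return struct.unpack(">4sII", pack[:12])


# Sample built to resemble the start of the reply in the trace above.
sample = (
    b"0031ACK 2be062dfcfd1fd4aca132ec02a40b56f63776202\n"
    + b"PACK"
    + struct.pack(">II", 2, 91)
)
lines, pack = split_response(sample)
print(lines, pack_header(pack))
```

The bytes returned as the pack could then be written to a file and piped into git unpack-objects in place of the tail step.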