Skip to content
Avatar
  • Dapper Labs
  • Austin, TX

Organizations

@dapperlabs @onflow
Block or Report

Block or report fxamacker

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
fxamacker/README.md

My first open source project is fxamacker/cbor. It's a CBOR codec used by Arm Ltd., Berlin Institute of Health at Charité, Chainlink, ConsenSys, Dapper Labs, Duo Labs (cisco), EdgeX Foundry, Fortanix, National Cybersecurity Agency of France (govt), Oasis Labs, Netherlands (govt), Taurus SA, Tailscale, and others. Microsoft Corporation had NCC Group produce a security assessment (PDF) which includes portions of fxamacker/cbor in its scope.

Most of my source code is closed source (in many languages but mostly multithreaded C++). I'm currently working on open source Go projects like fxamacker/cbor, fxamacker/circlehash, onflow/atree, onflow/cadence, and onflow/flow-go.

image

Design & Implementation

onflow/atree: I designed and implemented a novel hash collision handling method as part of Atree (onflow/atree). I tried to balance speed, security, and storage size. It uses a fast noncryptographic 64-bit hash and if there is a hash collision, it uses deferred and segmented 256-bit cryptographic digest (in 64-bit segments). By default, it uses CircleHash64 and BLAKE3.

This hash collision handling method is different from published methods such as Cuckoo Hashing, Double Hashing, 2-Choice Hashing, etc.

Atree is used by Cadence in the Flow Blockchain. Atree wouldn't exist without Dieter Shirley setting goals and inspiring us, Ramtin M. Seraj leading the R&D to make it possible, and Bastian Müller improving Atree while leading the integration into Cadence. Special thanks to Supun Setunga for leading the very complex data migration work and more.

Optimizations

My favorite optimization improved speed, allocs/op, alloc/op, and file size without adding concurrency or negative tradeoffs.

onflow/flow-go: Found optimizations by reading unfamiliar source code and proposed them to resolve issue #1750. Very grateful for Ramtin M. Seraj for opening a batch of issues and letting me tackle this one.

PR #1944 (Optimize MTrie Checkpoint for speed, memory, and file size):

  • SPEED: 171x speedup (11.4 hours to 4 minutes) in MTrie traversing/flattening/writing phase (without adding concurrency) which led to a 47x speedup in checkpointing (11.7 hours to 15 mins).
  • MEMORY: -431 GB alloc/op (-54.35%) and -7.6 billion allocs/op (-63.67%)
  • STORAGE: -6.9 GB file size (without using compression yet)

After PR #1944 reduced Mtrie flattening and serialization phase to under 5 minutes (which sometimes took 17 hours on mainnet16), creating a separate MTrie state used most of the remaining duration and memory.

Additional optimizations (add concurrency, add compression, etc.) were moved to separate issue/PR and I switched my focus to related issues like #1747.

UPDATE: About six months later, file size grew from 53GB to 126GB and checkpointing frequency increased to every few hours (instead of about once daily) due to increased transactions and data size. Without PR #1944, checkpointing would be taking over 20-30 hours each time, require more operational RAM, and slowdown the system with increased gc pressure. More info: issue #2286 and PR #2792.

Evaluations and Improvements

fxamacker/circlehash: I created CircleHash64 on weekends after evaluating state-of-the-art fast hashes for work. At the time, I needed a fast hash for short input sizes typically <128 bytes but didn't like existing hashes. I didn't want to reinvent the wheel so I based it on Google Abseil C++ internal hash. CircleHash64 is well-rounded: it balances speed, digest quality, and maintainability.

CircleHash64 has good results in Strict Avalanche Criterion (SAC).

CircleHash64 Abseil C++ SipHash-2-4 xxh64
SAC worst-bit
0-128 byte inputs
(lower % is better)
0.791% 🥇
w/ 99 bytes
0.862%
w/ 67 bytes
0.802%
w/ 75 & 117 bytes
0.817%
w/ 84 bytes

☝️ Using demerphq/smhasher updated to test all input sizes 0-128 bytes (SAC test will take hours longer to run).

CircleHash64 is fast at hashing short inputs with a 64-bit seed

CircleHash64
(seeded)
XXH3
(seeded)
XXH64
(w/o seed)
SipHash
(seeded)
4 bytes 1.34 GB/s 1.21 GB/s 0.877 GB/s 0.361 GB/s
8 bytes 2.70 GB/s 2.41 GB/s 1.68 GB/s 0.642 GB/s
16 bytes 5.48 GB/s 5.21 GB/s 2.94 GB/s 1.03 GB/s
32 bytes 8.01 GB/s 7.08 GB/s 3.33 GB/s 1.46 GB/s
64 bytes 10.3 GB/s 9.33 GB/s 5.47 GB/s 1.83 GB/s
128 bytes 12.8 GB/s 11.6 GB/s 8.22 GB/s 2.09 GB/s
192 bytes 14.2 GB/s 9.86 GB/s 9.71 GB/s 2.17 GB/s
256 bytes 15.0 GB/s 8.19 GB/s 10.2 GB/s 2.22 GB/s
  • Using Go 1.17.7, darwin_amd64, i7-1068NG7 CPU
  • Results from go test -bench=. -count=20 and benchstat
  • Fastest XXH64 in Go+Assembly doesn't support seed

CircleHash64 doesn't have big GB/s drops in throughput as input size gets larger. Other CircleHash variants are faster for larger input sizes and a bit slower for short inputs (not yet published).

Implement IETF Internet Standards (RFC 8949 & RFC 7049)

fxamacker/cbor: I designed and implemented a secure CBOR codec after reading RFC 7049. During implementation, I helped review the draft leading to RFC 8949. The CBOR codec rejects malformed CBOR data and has an option to detect duplicate map keys. It doesn't crash when decoding bad CBOR data.

Decoding 9 or 10 bytes of malformed CBOR data shouldn't exhaust memory. For example,
[]byte{0x9B, 0x00, 0x00, 0x42, 0xFA, 0x42, 0xFA, 0x42, 0xFA, 0x42}

Decode bad 10 bytes to interface{} Decode bad 10 bytes to []byte
fxamacker/cbor
1.0-2.3
49.44 ns/op, 24 B/op, 2 allocs/op* 51.93 ns/op, 32 B/op, 2 allocs/op*
ugorji/go 1.2.6 ⚠️ 45021 ns/op, 262852 B/op, 7 allocs/op 💥 runtime: out of memory: cannot allocate
ugorji/go 1.1-1.1.7 💥 runtime: out of memory: cannot allocate 💥 runtime: out of memory: cannot allocate

*Speed and memory are for latest codec version listed in the row (compiled with Go 1.17.5).

fxamacker/cbor CBOR safety settings include: MaxNestedLevels, MaxArrayElements, MaxMapPairs, and IndefLength.

Professional Background

I try to balance competing factors such as speed, security, usability, and maintainability based on each project's priorities.

Most recently, I accepted an offer I received on April 13, 2021 from Dapper Labs. I had been working for them as an independent contractor for about two weeks to help optimize Cadence storage layer and to create a streaming mode branch of fxamacker/cbor. On my first day as a contractor, I created issue 738 and the Cadence team was very welcoming and productive to work with. I subsequently opened 100+ issues and 100+ PRs at work in 2021.

My prior experience before Dapper Labs includes co-founding & bootstrapping enterprise software company, and working as an IT consultant.

  • My smallest consulting client - a startup. I assisted with prototyping which helped secure their next round of funding.
  • My largest consulting client - an S&P 500 company with almost 50,000 employees. I evaluated (as part of a large team) various technologies to be selected for their new global stack for deployment to over 100 countries.
  • My largest software licensing+subscription+support client - a company with over 3,000 employees in the US that deployed my data security system to all their US-based offices and factories. The tamper-resistant system used 4 types of servers to distribute and enforce security policies across multiple timezones for various client software. The system was designed to repair and update itself with bugfixes without introducing downtime. I was only one of two people to ever have access to the source code: just two of us conceived, designed, implemented, tested, and maintained the system. Our system beat enterprise solutions from well-funded large competitors year after year during customer evaluations which included testing employee-attempted data theft. It was not approved for export or use outside the US.

Developing commercial software provided the advantage of choosing the most appropriate language and framework for each part of the system because the customers didn't know what programming languages, tools, or frameworks were used.

Pinned

  1. cbor Public

    CBOR codec (RFC 8949) with CBOR tags, Go struct tags (toarray, keyasint, omitempty), float64/32/16, big.Int, and fuzz tested billions of execs.

    Go 462 41

  2. circlehash Public

    CircleHash is a family of fast hashes -- CircleHash64f is ideal for short inputs, reaching 10GB/s starting at <64 bytes and 15GB/s at 256 bytes (i7-1068NG7)

    Go 16 2

  3. The draft leading up to RFC8949

    Makefile 5 12

  4. Cadence, the resource-oriented smart contract programming language 🏃‍♂️

    Go 399 105

  5. float16 provides IEEE 754 half-precision format (binary16) with correct conversions to/from float32

    Go 28 4

  6. Atree provides scalable arrays and scalable ordered maps.

    Go 31 5

2,054 contributions in the last year

Jul Aug Sep Oct Nov Dec Jan Feb Mar Apr May Jun Jul Mon Wed Fri
Activity overview
Contributed to onflow/atree, onflow/flow-go, fxamacker/circlehash and 16 other repositories

Contribution activity

July 2022

Created a pull request in onflow/flow-go that received 1 comment

[Execution Node] [WIP] Reuse ledger state in checkpoints for -152GB RAM and -24 minutes

⚠️ WIP (not ready for code review), this PR is an alternative approach to PR #2770. ⚠️ Avoid creating separate ledger state during checkpointing. U…

+771 −591 1 comment
2 contributions in private repositories Jul 7 – Jul 13

Seeing something unexpected? Take a look at the GitHub profile guide.