In this part of the tutorial series, we will pick up where we left off in part 1 and cover some additional CKB programming concepts. After this article, you will have a full understanding of how scripts work, how to communicate with scripts beyond passing arguments to them, the various ways you can manage transaction dependencies, and even how to add different types of timestamps to cells to control when they become available for use. The information from this post and its predecessor is sufficient to enable you to understand and design highly complex transactions on CKB.

Script Execution Model

In the last post, I explained two important aspects of the script execution model. Those were:

  • Execution Context: That scripts are attached to cells, but have access to the entire transaction in which they are executing. I.e., they are included at the cell level, but execute at the transaction level.
  • References and Dependencies: Scripts are referenced by cells within their type and lock fields, and references can use a script's type hash or data hash. Further, every script referenced by a cell within a transaction must be included in the transaction's cell dependencies.

Both of these properties of the Script Execution Model have practical consequences. By knowing about a script's execution context, you know that the constraints you can define in your script can enforce rules beyond the cell to which the script is attached. By knowing how references and dependencies work, you know how to look at a cell's structure and determine which dependencies you'll need to include in a transaction that includes that cell.

From this perspective, let's take a look at a simple transaction transferring native tokens (CKBytes) from one address to another.

Example CKByte-transfer transaction

Note that I am not including cell fields in the diagram that are irrelevant to this discussion. In reality, all cells have type, lock, capacity and data fields, though I am excluding the type field because it is null in this example and excluding the data field on cells that have empty data...

In the above transaction, there are two cells as input and one cell as output. The two input cells each have a capacity of 500 CKBytes and are owned by the user whose public key hash is equal to 0xabc. The output cell has a capacity of 1000 CKBytes and is owned by the user whose public key hash is equal to 0xdef. All three cells depend on the same lock script, which is included as a cell dep. We can see that all three cells use the same lock script because their Lock fields reference the same code_hash of 0x123 with a hash type of type.

When this transaction executes, the CKB-VM will look within the transaction's cell_deps to locate a cell for which the hash of its type field will equal 0x123. Since Lock Scripts only execute on inputs and not on outputs, an execution will not occur for the output. If the output had a Type Script as well, however, it would execute because Type Scripts execute for both inputs and outputs.
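The lookup described above can be sketched in a few lines of Python. This is only an illustrative model, not how CKB-VM is implemented: the Script and DepCell classes and the find_code_cell function are hypothetical names, and the hashes are assumed to be precomputed strings rather than derived from real serialized data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Script:
    code_hash: str
    hash_type: str  # "data" or "type"
    args: str


@dataclass(frozen=True)
class DepCell:
    # Both hashes are assumed to be precomputed for this sketch.
    data_hash: str            # hash of the cell's data field
    type_hash: Optional[str]  # hash of the cell's type script, if it has one


def find_code_cell(script: Script, cell_deps: List[DepCell]) -> Optional[DepCell]:
    """Return the first resolved cell dep that the script's code_hash refers to."""
    for dep in cell_deps:
        # hash_type decides which hash of the dep cell is compared.
        candidate = dep.type_hash if script.hash_type == "type" else dep.data_hash
        if candidate is not None and candidate == script.code_hash:
            return dep
    return None
```

With the example transaction, a lock of Script("0x123", "type", "0xabc") matches a dep cell whose type hash is 0x123, while the same code_hash with a hash type of data would match nothing unless some dep cell's data hashed to 0x123.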

The question becomes: since both cells on the inputs reference the same script in their Lock fields, will the script execute once or twice?

More specifically, will the CKB-VM re-execute the lock for each cell that references this script?

Not exactly.

What actually happens is that prior to execution, cells are organized by script group. A script group is a group of cells that:

  • Reference the same dependency and
  • Have the same args

A script will execute once for each script group.

In the above example, there is only one script group: the two inputs (the output will not be included in a script group since it only has a lock script and lock scripts don't execute for outputs). Since the two inputs share the same arg, the script will only execute once. It's up to the script code itself to check all of the cells in the script group.
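To make the grouping rule concrete, here is a minimal Python sketch of how cells might be sorted into script groups. The Script and Cell types and the group_scripts function are hypothetical names made up for this illustration. The sketch assumes that lock scripts are grouped over inputs only, while type scripts are grouped over both inputs and outputs, as described above.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class Script(NamedTuple):
    code_hash: str
    hash_type: str
    args: str


class Cell(NamedTuple):
    capacity: int
    lock: Script
    type_script: Optional[Script] = None


def group_scripts(inputs: List[Cell], outputs: List[Cell]) -> Dict[Tuple[str, Script], dict]:
    """Map each (kind, script) pair to the input/output indices in its group."""
    groups = defaultdict(lambda: {"inputs": [], "outputs": []})
    for i, cell in enumerate(inputs):
        # Lock scripts only run for inputs.
        groups[("lock", cell.lock)]["inputs"].append(i)
        if cell.type_script is not None:
            groups[("type", cell.type_script)]["inputs"].append(i)
    for i, cell in enumerate(outputs):
        # Output locks never execute, so only type scripts are grouped here.
        if cell.type_script is not None:
            groups[("type", cell.type_script)]["outputs"].append(i)
    return dict(groups)
```

Running this on the example transaction yields a single group containing input indices 0 and 1, so the lock script would execute once and be responsible for checking both inputs.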

As a CKB developer, this means that when you're writing new scripts, you will have to make sure to explicitly check every cell in the script group using the right system calls. Later on in this post we will run an experiment within our developer environment to see this in action.

The concept of script groups is the final piece of the Script Execution Model. Combined with the concepts of the Script Execution Context and References & Dependencies, you have all the foundational knowledge to begin reasoning about scripts. Before we move on to actually writing scripts and learning about the available system calls our scripts can use, it's worthwhile to dive into more details on the structure of transactions in CKB.

Transaction Structures In-Depth

Up until this point, I've represented transactions as having three parts: inputs, outputs, and dependencies. Further, I've said that cells have four fields: data, capacity, lock, and type. I've also only talked about transaction dependencies as if they were normal references to pre-existing cells, identical to inputs but placed in a different field within the transaction.

But transactions aren't so simple.

In this section, we'll build upon the models so far by considering the additional properties of certain data structures:

  • Additional properties of a transaction: witnesses
  • Additional properties of dependencies: dep_type, header_deps, cell_deps
  • Additional properties of cells: since

Witnesses

Transactions need to be signed before they're submitted to CKB. Transaction signatures exist in the witnesses field. For each input in the transaction, there needs to be one witness within that transaction. So, if I have 3 inputs, then there should be 3 witnesses as well.

The witnesses field is simply an array of byte arrays, except for the first witness, which is a map with the following structure:

{
 lock: bytes,
 input_type: bytes,
 output_type: bytes
}

As you know, cells have both type and lock fields, which are key-value structures like the following:

{
 code_hash: bytes,
 hash_type: "data" | "type",
 args: [...]
}

The args field here shouldn't really change - its purpose is to provide the executable code that the script structure references with data that it will probably need on every execution. For example, the default lock script for cells references a cell dependency that contains an implementation of Secp256k1 for signature verification. The arg passed to this script is the public key hash of the owner. The only time you would change this is if you were to transfer capacity to a new address (such as the simple transaction example in the beginning of this post).

The limitation of the args field contained within a cell's lock or type script structures is that you would have to consume the cell and create a new one as an output in order to change the args. So, what if you don't want to change the args passed to a script, but you want to provide it with additional information per transaction? That is the purpose of the first witness - called the witnessArgs - in a transaction. It enables you to pass transient data to scripts that only matters for that specific transaction.

To illustrate this, let's look at a simple example for the logic of a hash-locked script.

The function of a hash-lock is to lock a cell with a given "secret hash", such that only someone who can provide the preimage to the secret hash can unlock the cell.

A hash-lock therefore needs two pieces of information:

  1. The secret hash
  2. The submitted preimage

The difference between these two pieces of information is that the secret hash is set by the creator of the cell, and the lock script needs access to it on every execution. The preimage, on the other hand, is submitted by anyone trying to claim the cell, and is only meaningful within a single transaction.

Therefore, the script structure would look like this:

// Lock field of a cell

lock: {
 code_hash: <hash_of_cell_containing_hashlock_code>,
 hash_type: "data",
 args: [secret_hash]
}

To claim this cell, a user would need to create a transaction and include this cell in the transaction's inputs. The user would also need to somehow indicate to the hash-lock that they know the preimage that, when hashed, will generate the secret hash stored in the lock's args. To do this, the user would include the preimage within the witnessArgs' lock field:

witnessArgs:
{
 lock: <preimage>,
 input_type: null,
 output_type: null
}

The hash-lock could then load in the lock field of the transaction's witnessArgs, hash the preimage, and check the resulting value against the secret hash. If they match, the hash-lock will succeed; otherwise, it will fail, causing the transaction to fail.
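The core check of such a hash-lock can be sketched in Python as follows. This is a model of the logic only, not an on-chain script. The function names are hypothetical, and the choice of hash function is an assumption, since this post does not prescribe one: I use Blake2b-256 with the "ckb-default-hash" personalization, which is the default hash on CKB, but any hash agreed on by the cell's creator would work the same way.

```python
import hashlib


def ckb_hash(data: bytes) -> bytes:
    """Blake2b-256 with the 'ckb-default-hash' personalization (assumed hash choice)."""
    return hashlib.blake2b(data, digest_size=32, person=b"ckb-default-hash").digest()


def hash_lock_succeeds(secret_hash: bytes, witness_lock: bytes) -> bool:
    """Model of the hash-lock check.

    secret_hash  -- taken from the lock script's args (set by the cell's creator)
    witness_lock -- the preimage taken from the lock field of the witnessArgs
    """
    return ckb_hash(witness_lock) == secret_hash
```

A real script would read secret_hash and witness_lock through system calls instead of receiving them as function parameters, and would signal failure by returning a non-zero exit code.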

The other fields in the witnessArgs exist to enable similar functionality - per-transaction, transient arguments - for other scripts in a transaction. input_type provides arguments for the type scripts attached to inputs, while output_type provides arguments for the type scripts attached to outputs.

Transaction signing works by taking each witness in the witnesses array and signing it with the private key that corresponds to the public key hash of the associated input's lock. So, the third witness will be signed with the private key that corresponds to the third input's lock script, at least in the case where the lock script is Secp256k1. There are nuances here. For example, some lock scripts may require that all inputs are signed by the same key, while other lock scripts that enable partially signed transactions will only check the witness at the index equal to the index of the input.

Dependencies

The next structure we will look at in more detail is the transaction's dependencies.

I mentioned before that dependencies are like inputs, the difference being that they are not consumed when they're included in a transaction.

A cell must already exist on-chain prior to being used as a dependency. You can't, for example, create a transaction with an output while also including that same output within the same transaction's dependencies.

Cell output used as cell dep is not allowed

The above diagram depicts Cell 3 in the outputs referencing a dependency by the hash of Cell 2's code. It is not possible for the same cell to be both an output and a dependency within the same transaction. A keen reader may notice that this is not the only way the diagram could be interpreted. It is plausible that Cell 2 and Cell Dep 1 are completely different and just happen to contain the same data. In this case, the transaction would work.

This ambiguity is due to the fact that I've been depicting dependencies as resolved dependencies. When a transaction is submitted to CKB, dependencies are actually references to cells. One of the first steps in the process of executing a transaction is dependency resolution, in which dependencies, just like inputs, are loaded into the VM.

This is what a transaction dependency looks like prior to dependency resolution:

{
 dep_type: "code" | "dep_group",
 out_point: {
  tx_hash: "0x...",
  index: "0x..."
 }
}

Don't worry about the dep_type field yet. I'll explain that portion in the very next section. For now, notice that a dependency, just like an input, contains an outpoint, which is a reference to a specific output at a given index of a previously confirmed transaction.

Dep Types

Cell dependencies can be of two different types: code or dep_group.

When I described the process of dependency resolution earlier in this post, that process was more specifically about cell dependencies that have a dep_type of code. Code dependency types are the simpler of the two options, as these dependencies are just cells with code in their data fields.

The use of dep_group allows you to "bundle" various dependencies together. For example, if you have three separate scripts that tend to be used together, but should still be separated in general, then you can use a dep_group to include all three of them with a single dependency.

The data field in a dependency of type dep_group should be a serialized list of outpoints. Remember that outpoints contain a transaction hash and an index. Before any code is executed, during dependency resolution, the cells referenced in the dep_group are resolved as individual cells within the collection of transaction dependencies. In this way, the various type and lock scripts on your cells do not need to reference the dependencies any differently than they would code dependencies.
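Dependency resolution for both dep types can be modeled roughly like this. It is a simplified Python sketch under several assumptions: the LiveCell and CellDep types and the resolve_cell_deps function are hypothetical, the chain's live cells are represented as a plain dictionary keyed by outpoint, and a group cell's data is an ordinary Python list standing in for the serialized list of outpoints.

```python
from typing import Dict, List, NamedTuple, Tuple

OutPoint = Tuple[str, int]  # (tx_hash, index)


class LiveCell(NamedTuple):
    # bytes for a code cell; a list of OutPoints for a group cell
    # (a stand-in for the serialized list stored on-chain).
    data: object


class CellDep(NamedTuple):
    dep_type: str  # "code" or "dep_group"
    out_point: OutPoint


def resolve_cell_deps(cell_deps: List[CellDep],
                      live_cells: Dict[OutPoint, LiveCell]) -> List[LiveCell]:
    """Expand every cell dep into the individual cells scripts can reference."""
    resolved = []
    for dep in cell_deps:
        cell = live_cells[dep.out_point]  # KeyError if the cell is not live
        if dep.dep_type == "code":
            resolved.append(cell)
        elif dep.dep_type == "dep_group":
            # The group cell itself is not added; only its members are.
            for member in cell.data:
                resolved.append(live_cells[member])
        else:
            raise ValueError(f"unknown dep_type: {dep.dep_type}")
    return resolved
```

After resolution, the members of a group are indistinguishable from cells that were included one by one, which is why scripts can reference them in exactly the same way.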

Header Deps

In the above section, I described the structure of a dependency. But transactions actually don't have a dependencies field. Rather, they have a cell_deps field and a header_deps field. Everything I've said about dependencies so far has been in reference to cell dependencies, or cell_deps.

Header dependencies enable scripts to access block headers. A transaction's header_deps field is a simple array of 32-byte block header hashes. Please refer to this document to view the various fields stored in a block header.

Header dependencies can be used in one of two ways within a script by using the load_header system call (don't worry about which system calls are available to scripts or how to invoke them, as that will be described in part 3).

First, a script can load a header by index. Invoking the system call in this way will cause the VM to look inside the header_deps and retrieve the header hash at the provided index. It will then load the corresponding block header and write it to memory for the script to access.

The second way to load a header is to load the header of a specific input. Invoking the system call in this way will cause the VM to search the headers referenced by header_deps and, if one of the header_deps refers to the block in which the input was created, it will write that header to memory.
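The two lookup modes can be modeled in Python as shown below. This is not the load_header system call itself, only a sketch of the behavior described above. The function names are hypothetical, and the sketch assumes that headers are available in a dictionary keyed by block hash and that the hash of the block that created each input is already known.

```python
from typing import Dict, List


def load_header_by_index(index: int, header_deps: List[str],
                         headers_by_hash: Dict[str, dict]) -> dict:
    """Look up the header whose hash sits at header_deps[index]."""
    if index < 0 or index >= len(header_deps):
        raise IndexError("no header dep at this index")
    return headers_by_hash[header_deps[index]]


def load_header_by_input(input_block_hash: str, header_deps: List[str],
                         headers_by_hash: Dict[str, dict]) -> dict:
    """Look up the header of the block that created an input.

    The header is only available if its hash was listed in header_deps.
    """
    if input_block_hash not in header_deps:
        raise LookupError("the input's block is not listed in header_deps")
    return headers_by_hash[input_block_hash]
```

In both modes a script can only see headers that the transaction builder chose to list in header_deps.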

Time Locked Cells

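Cells have another associated field, called since, which can contain various types of timestamps to delay the ability to spend the cell until the timestamp has been reached. Strictly speaking, since is set on the transaction input that consumes the cell rather than stored in the cell itself.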

I won't go through all of the formatting details and other nuances, so if you want to learn more about the since field after this section, please refer to this RFC.

The since field can contain time locks of three different types. It can contain a block number to indicate that a cell cannot be spent until after a certain block height. It can also contain a Unix timestamp or an epoch number. These values can be treated as absolute or relative timestamps, and how they are treated depends on a flag included in the since value (see the RFC for more details on this).
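As a rough illustration, here is how a since value could be decoded in Python. The bit layout reflects my reading of the since RFC and should be checked against it: the top bit of the 64-bit value is the relative flag, the next two bits select the metric, and the low 56 bits carry the value. The decode_since function and the metric names are hypothetical, and the sketch does not interpret the internal structure of an epoch value.

```python
RELATIVE_FLAG = 1 << 63      # set: relative lock, clear: absolute lock
METRIC_SHIFT = 61            # the two bits below the relative flag
METRIC_MASK = 0b11
VALUE_MASK = (1 << 56) - 1   # the low 56 bits hold the value

# Metric names are labels chosen for this sketch.
METRICS = {0b00: "block_number", 0b01: "epoch", 0b10: "timestamp"}


def decode_since(since: int) -> dict:
    """Split a 64-bit since value into its flag, metric, and raw value."""
    if not 0 <= since < (1 << 64):
        raise ValueError("since must fit in an unsigned 64-bit integer")
    metric_bits = (since >> METRIC_SHIFT) & METRIC_MASK
    if metric_bits not in METRICS:
        raise ValueError("unknown metric flag")
    return {
        "relative": bool(since & RELATIVE_FLAG),
        "metric": METRICS[metric_bits],
        "value": since & VALUE_MASK,
    }
```

For example, a since value of 100 with no flags set decodes as an absolute lock on block number 100.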

Conclusion & Next Steps

The concepts introduced within this post have built on the foundations from part 1, and have prepared us to analyze and design sophisticated CKB systems. In the next post, I'll first walk you through the functionality and inner workings of the Nervos DAO - a script that makes use of many of these more advanced concepts. After that, I'll introduce a workflow for going from dapp idea to dapp MVP, as well as introduce the tools you'll need to prototype CKB dapps. You'll learn how to design, write, and test CKB on-chain scripts, as well as write generator logic and integrate with a frontend UI.

If you're interested in studying more in the meantime, please take a look at the following repos and articles for further reading (some of these I've referenced in the above paragraphs, but I will list them here as well for convenience):

  1. CKB Data Structures (and Block Header)
  2. CKB Since Field
  3. CKB Crypto-economics
  4. Nervos DAO RFC