The purpose of a light client is to allow a user to quickly verify on-chain data without running a full node. Example tasks for a light client include:
- Verify that a transaction is included in a block (e.g., a recipient could verify an incoming transaction from a sender)
- Verify part of the latest ledger state, such as an account’s balance
- Verify an event happened at a specific height
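The first task, checking that a transaction is included in a block, is commonly done with a Merkle inclusion proof against a root hash stored in a trusted header. The following is a minimal sketch, assuming a simple binary SHA-256 Merkle tree in which an odd node is paired with itself; real chains use their own tree layouts and hash functions, and the names `merkle_root`, `merkle_proof`, and `verify_inclusion` are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    """Hash adjacent pairs; the caller guarantees an even number of nodes."""
    return [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a toy binary Merkle tree (assumed layout, not a real chain's)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pair an odd node with itself
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag is True if the sibling is on the left.

    This part would run on a full node, which holds all transactions.
    """
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(tx: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Light-client side: recompute the root from the tx and the sibling path."""
    node = sha256(tx)
    for sibling, sibling_is_left in proof:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root


txs = [b"tx-a", b"tx-b", b"tx-c"]
root = merkle_root(txs)            # in practice, taken from a trusted header
proof = merkle_proof(txs, 2)       # in practice, supplied by a full node
print(verify_inclusion(b"tx-c", proof, root))
```

The light client only needs the root and a proof whose length grows logarithmically with the number of transactions, not the full block.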
A light client starts from a trusted state, such as the genesis state of a blockchain, and then downloads only the data (light-client data) needed to verify a later state. Once the new state has been verified with the light-client data, it becomes the new trusted state. A well-designed light client should have the following properties:
- The size of light client data should be as small as possible;
- The light client data should be self-contained, i.e., a light client should be able to run by synchronizing solely with another light client, without extra data from a full node.
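The trust-updating loop described above (start from a trusted state, verify the next piece of light-client data against it, then adopt it as the new trusted state) can be sketched as follows. This is only an illustration: the `Header` fields, the `sync` function, and the toy `links_to_trusted` rule are assumptions made for the example, and a real client would plug in a PoW or BFT check as the `verify` step.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Header:
    # Hypothetical minimal header; real headers carry many more fields.
    height: int
    parent_hash: str
    block_hash: str


def links_to_trusted(trusted: Header, candidate: Header) -> bool:
    """Toy rule: the candidate must directly extend the trusted header."""
    return (candidate.height == trusted.height + 1
            and candidate.parent_hash == trusted.block_hash)


def sync(trusted: Header,
         updates: Iterable[Header],
         verify: Callable[[Header, Header], bool] = links_to_trusted) -> Header:
    """Advance the trusted state one verified header at a time."""
    for candidate in updates:
        if not verify(trusted, candidate):
            raise ValueError(f"header at height {candidate.height} failed verification")
        trusted = candidate  # the verified state becomes the new trusted state
    return trusted


genesis = Header(height=0, parent_hash="", block_hash="h0")
latest = sync(genesis, [Header(1, "h0", "h1"), Header(2, "h1", "h2")])
print(latest.height)
```

Because `sync` consumes only headers, the updates could in principle come from another light client rather than a full node, which is the self-contained property listed above.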
In general, most light clients download only the headers of the blockchain:
- For a PoW chain, the header contains difficulty information and the Merkle hashes of the transactions and the ledger state. If there is a fork, this allows a light client to determine which fork is the canonical chain, and a PoW light client can run by synchronizing only the headers from another client;
- For a BFT chain (e.g., PoS BFT), the header comes with the validators' signatures, and as long as the signers make up more than 2/3 of all validators (voting power may also be taken into account), the header is considered valid and finalized (not reversible).
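The BFT acceptance rule can be illustrated with a small voting-power check. This sketch assumes the signatures themselves have already been verified by some other routine (not shown), that `validators` maps a hypothetical validator address to its voting power, and that the threshold is strictly more than 2/3 of the total voting power, as in Tendermint-style protocols.

```python
def has_quorum(validators: dict[str, int], signers: set[str]) -> bool:
    """Return True if the signers hold strictly more than 2/3 of the voting power.

    `signers` is the set of addresses whose signatures were already verified;
    addresses that are not in the validator set are ignored.
    """
    total = sum(validators.values())
    signed = sum(power for addr, power in validators.items() if addr in signers)
    # Integer comparison avoids floating-point rounding at the 2/3 boundary.
    return 3 * signed > 2 * total


validators = {"val-a": 10, "val-b": 10, "val-c": 10, "val-d": 10}
print(has_quorum(validators, {"val-a", "val-b", "val-c"}))  # 30 of 40 signed
print(has_quorum(validators, {"val-a", "val-b"}))           # 20 of 40 signed
```

Note that the check needs the validator set and its voting power as input, so a header alone is not enough to evaluate it.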
Taking Tendermint as an example, each valid header is backed by signatures from more than 2/3 of the voting power. However, headers alone are not sufficient as light-client data, because the header does not contain the list of validators (and their voting power), mainly due to size concerns: with 100 validators, the list would add about 3.6 KB to every header (100 * (20 bytes validator addr + 16 bytes voting power) = 3600 bytes). As a result, to run a light client, the client still needs to synchronize with a full node to obtain the validators whenever the validator set changes.
A further improvement is to use an epoch-based BFT, where the validator set can change only once every N blocks, where N is the size of the epoch. The epoch can be a day, a week, or even a month, and the list of validators is updated at the start of a new epoch. With a block interval of 10s and a one-day epoch, the cost of publishing the validators drops to 1 / (24 * 3600 / 10) = 1/8640 ≈ 0.012% of the cost of publishing them in every block.
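The arithmetic behind the epoch-based saving can be reproduced with a few lines. The sketch below reuses the figures from the text (100 validators, 20-byte address plus 16-byte voting power, 10s block interval) and assumes a one-day epoch; the function names are hypothetical.

```python
def blocks_per_epoch(epoch_seconds: int, block_interval_seconds: int) -> int:
    """Number of blocks produced in one epoch."""
    return epoch_seconds // block_interval_seconds


def validator_list_bytes(num_validators: int,
                         addr_bytes: int = 20,
                         power_bytes: int = 16) -> int:
    """Approximate size of the validator list, ignoring encoding overhead."""
    return num_validators * (addr_bytes + power_bytes)


n = blocks_per_epoch(24 * 3600, 10)   # 8640 blocks in a one-day epoch
size = validator_list_bytes(100)      # 3600 bytes for 100 validators

print(f"blocks per epoch: {n}")
print(f"share of blocks carrying the list: {1 / n:.4%}")
print(f"amortized bytes per block: {size / n:.2f}")
```

With these assumptions the list is published in roughly one block out of 8640, so its amortized cost is well under one byte per block; longer epochs reduce it further.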
Another interesting topic is that some blockchains, such as Facebook Libra, do not have headers - they contain only the block data directly. How to run a light client in such a setup, in the absence of headers, needs further investigation.