Transaction Graph Dataset for the Ethereum Blockchain
Can Özturan, Alper Şen, Baran Kılıç - Boğaziçi University (BOUN)
Infinitech partner Boğaziçi University is involved in Pilot 9, which concerns the analysis of blockchain transaction graphs for fraudulent activities, and in Task 4.4 of Work Package 4, which concerns tokenization and smart contracts in finance and insurance services. Since the start of the Infinitech project, Boğaziçi University has been preparing transaction datasets from public blockchains. The transaction graph dataset for the Ethereum blockchain has now been uploaded to Zenodo (https://zenodo.org/).
The Ethereum blockchain supports the execution of programs called smart contracts. Smart contracts can be used to implement tokens, which can represent real-world assets such as company shares, property shares, bonds, ounces of gold and fiat currencies such as the dollar and the euro. Finance companies are showing growing interest in issuing stable coins, which are tokens that represent fiat currencies in a 1-to-1 manner (for example, 1 token equaling 1 EUR or 1 USD). Gemini USD (GUSD), Tether USD (USDT), Tether Gold (XAUT), STASIS EURO (EURS) and Turkish BiLira (TRYB) are examples of tokens currently available on the Ethereum blockchain. These tokens are implemented in smart contracts that adhere to the ERC20 token standard. The standard transfer functions of ERC20 token contracts are used to transfer ownership of the tokens from one address to another.
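The ERC20 standard also defines a Transfer event that token contracts emit when tokens change hands, and these event logs are a common way to recover token transfers from chain data. The Python sketch below decodes one such log. It is an illustration under stated assumptions, not the program described in this post: the dictionary layout of the log (address, topics and data as hex strings) mirrors what Ethereum JSON-RPC clients typically return, and the function name decode_transfer_log and the sample addresses are hypothetical.

```python
# Topic hash of the ERC20 event signature Transfer(address,address,uint256).
TRANSFER_TOPIC = (
    "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
)


def decode_transfer_log(log):
    """Return (token, sender, receiver, value) for an ERC20 Transfer log.

    Returns None when the log is not a standard ERC20 Transfer event.
    """
    topics = log["topics"]
    # A standard ERC20 Transfer has the signature topic plus two indexed
    # address arguments (from, to); the amount is carried in the data field.
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None
    # Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
    sender = "0x" + topics[1][-40:].lower()
    receiver = "0x" + topics[2][-40:].lower()
    value = int(log["data"], 16)  # amount in the token's smallest unit
    return (log["address"].lower(), sender, receiver, value)


# Hypothetical log: made-up token contract and account addresses.
sample_log = {
    "address": "0x" + "aa" * 20,
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + format(1_500_000, "064x"),
}

decoded = decode_transfer_log(sample_log)
print(decoded)
```

The decoded value is expressed in the token's smallest unit. If the hypothetical token above used 6 decimals, 1500000 would correspond to 1.5 tokens.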
As shown in the figure above, stable coin contracts can also be seen as an interface between the public Ethereum blockchain ecosystem and the traditional finance ecosystem. Valuable blockchain assets such as cryptocurrencies can be exchanged for stable coins, and stable coins can in turn be redeemed for fiat money in the traditional finance ecosystem. Assets that are obtained fraudulently can therefore pass through various transfers on the blockchain and end up as stable coins in different jurisdictions. As a result, a company that accepts stable coins may be paid with coins that can be traced back to addresses involved in fraudulent activities. Holding stable coins that originated from fraudulent or sanctioned addresses can be risky for the company. If this company is a customer of a bank, the bank's decisions regarding the company can be affected even if the bank is not involved in the blockchain or token business at all. For example, because of the risky tokens the company owns, the bank may have to act more cautiously when issuing credit to it.
Scenarios like these require the construction of a transaction graph that contains the ether and token transfers among addresses. The program we have developed extracts, from the raw blockchain data, ether transfers as well as ERC20 token transfers from 35 popular token contracts, including stable coins such as GUSD, USDT, XAUT, PAX, TUSD, EURS, TRYB, USDC and QCAD. The compiled dataset will later be used in the Pilot 9 Sandbox.
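To make the idea of a transaction graph concrete, the following Python sketch builds a directed graph from a handful of transfers and finds every address that can be reached from a flagged address. It is a minimal illustration and not the extraction program or the dataset format: the tuple layout (asset, sender, receiver, amount), the function names build_graph and reachable_from, and the short address labels are all assumptions made for this example.

```python
from collections import defaultdict, deque


def build_graph(transfers):
    """Build an adjacency list: sender -> [(receiver, asset, amount), ...]."""
    graph = defaultdict(list)
    for asset, sender, receiver, amount in transfers:
        graph[sender].append((receiver, asset, amount))
    return graph


def reachable_from(graph, sources):
    """Breadth-first search over outgoing transfers.

    Returns every address that received funds, directly or through
    intermediaries, from any of the source addresses.
    """
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for receiver, _asset, _amount in graph.get(node, ()):
            if receiver not in seen:
                seen.add(receiver)
                queue.append(receiver)
    return seen


# Hypothetical transfers; labels stand in for 20-byte Ethereum addresses.
transfers = [
    ("ETH", "fraud", "relay", 10),
    ("ETH", "relay", "exchange", 9),
    ("USDT", "exchange", "company", 1000),
    ("USDT", "customer", "company", 50),
    ("EURS", "customer", "shop", 20),
]

graph = build_graph(transfers)
exposed = reachable_from(graph, ["fraud"])
print(sorted(exposed))
```

In this toy example the company is reachable from the flagged address through a chain of ether and USDT transfers, while the shop is not. Plain reachability ignores the time order and the amounts of the transfers, so it overestimates exposure; a practical analysis would need to take both into account.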