Turning Up the Dial: the Evolution of a Cybercrime Market Through Set-up, Stable, and COVID-19 Eras
We investigate the evolution of a cybercrime market over the two years from June 2018 to June 2020, including its shift through the COVID-19 pandemic, by looking at the contractual transactions made on HackForums, the largest and longest-running cybercrime forum. We divide the period into three eras based on two external events that significantly affected the marketplace (the contract system being made compulsory, and the declaration of the global pandemic): the Set-up, Stable, and COVID-19 eras.
Our dataset contains nearly 200,000 contractual transactions created on the HackForums marketplace over the two years from June 2018 to June 2020, providing valuable insights into the economic activity linked to the forum. Contracts can be public or private; the details of private contracts are hidden. Public contracts are visible only to users who pay a small fee to upgrade their account. While private contracts contain only minimal information such as maker, taker, contract type, and creation date, each public contract also includes the goods and services being exchanged, obligations, agreement terms, and the ratings of the parties involved. All contracts belong to one of five types: Sale (the largest proportion), followed by Exchange, Purchase, Trade and Vouch Copy. Some contracts are linked to advertising threads and discussion posts, which provide additional context. Thus, the dataset also includes information on thousands of posts, threads, and members. Our data is made available to academia through a sharing agreement so that other researchers can fully reproduce our experiments and build further analyses on our results. To request the data, please contact the director of the Cambridge Cybercrime Center and follow the instructions.
Taxonomy of the collected contracts
Huge shifts between eras. While there are significant changes between the three eras, the numbers of new contracts and new members tend to fluctuate together, except during the Set-up era, when the number of new contracts gradually grew but the number of new members moderately decreased. This indicates that, on average, users were making more contracts each month during this era. We see a surge in both new members and new contracts in response to the new policy adopted at the beginning of the Stable era. The last four months, in the COVID-19 era, show a sharp but fairly short-lived peak in both new contracts and new members; the number of new contracts even surpasses the previous peak, indicating a stimulus in trading activity. While the number of new members also increases, it does not exceed its past peak, indicating that established members, not only new ones, were contributing more at that time. After the peak in April 2020, we see a drop in both the number of users and the number of contracts, showing a decrease in trading activity on the marketplace. It appears that lockdown intensified the market only for a short period after the pandemic was declared.
Contract completion time. We observe that contracts are completed faster over time. The maximum occurs during Set-up, with contracts completed in around 70 hours. During the Stable era, the completion time drops to a steady level of around 30 hours. The minimum completion times for all contract types occur during COVID-19, with contracts taking less than 10 hours to complete in June 2020. This increase in speed is presumably due to users becoming more familiar with the contract system, or to increased trading demand. We also observe some suspicious short-lived peaks in Trade; on inspection, these correspond to a few very time-consuming transactions. As the number of Trade transactions is small, these peaks are likely noise rather than a real characteristic of the marketplace. For this analysis we only consider contracts with a recorded completion time; as these account for around 70% of all completed contracts, we believe the result is representative of the marketplace.
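As a rough illustration of how a monthly completion time could be derived from contract timestamps, here is a minimal Python sketch. The record layout, the timestamp format, and the use of the median as the summary statistic are all assumptions made for this example; the sample records are invented and are not taken from the dataset.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format

# Invented records: (contract type, created, completed or None).
contracts = [
    ("Sale", "2018-07-01 10:00", "2018-07-04 08:00"),
    ("Sale", "2018-07-10 09:00", "2018-07-13 09:00"),
    ("Exchange", "2019-06-02 12:00", "2019-06-03 18:00"),
    ("Exchange", "2019-06-15 00:00", None),  # no completion time: skipped
    ("Sale", "2020-06-05 08:00", "2020-06-05 16:00"),
    ("Exchange", "2020-06-20 20:00", "2020-06-21 06:00"),
]

def median_completion_hours(records):
    """Return {YYYY-MM of creation: median hours from creation to completion}."""
    by_month = defaultdict(list)
    for _ctype, created, completed in records:
        if completed is None:
            continue  # only contracts with a recorded completion time count
        start = datetime.strptime(created, FMT)
        end = datetime.strptime(completed, FMT)
        hours = (end - start).total_seconds() / 3600
        by_month[start.strftime("%Y-%m")].append(hours)
    return {month: median(hours) for month, hours in sorted(by_month.items())}

monthly = median_completion_hours(contracts)
for month, hours in monthly.items():
    print(month, f"{hours:.1f} h")
```

Grouping by contract type as well as by month would only require adding the type to the dictionary key.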
Market centralisation. We find that the market is highly centralised, with a small number of threads and members covering the majority of transactions: only around the top 5% of users are responsible for over 70% of contracts, and around 70% of contracts associated with a thread are linked to the top 30% of threads. This suggests that a few key members and threads play a prominent role in the marketplace. By looking at the underlying social graph formed by the members, we also find that the forum is highly centralised around power-users in terms of contractual connectivity. We define key members and key threads as the top 5% most influential members and threads, that is, those contributing the most contracts each month. The accompanying figure shows the proportion of contracts made by key members and key threads each month. Except for the share of created contracts linked to key threads, these proportions increased from the Set-up to the Stable era. We also observe rapid growth in all of them at the beginning of the COVID-19 era, indicating that the market became more centralised in response to the pandemic.
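The concentration figures above are shares of contracts held by the top-ranked users or threads. A minimal sketch of that calculation follows; the function name `top_share` and the contract counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def top_share(counts, top_percent):
    """Fraction of all contracts held by the top `top_percent` percent of entities."""
    ranked = sorted(counts.values(), reverse=True)
    # Integer arithmetic avoids floating-point surprises; keep at least one entity.
    k = max(1, len(ranked) * top_percent // 100)
    return sum(ranked[:k]) / sum(ranked)

# Made-up contract counts for 20 hypothetical users (not taken from the dataset).
contracts_per_user = {"user_0": 72}
contracts_per_user.update({f"user_{i}": 2 for i in range(1, 10)})
contracts_per_user.update({f"user_{i}": 1 for i in range(10, 20)})

share_top5 = top_share(contracts_per_user, 5)
share_top30 = top_share(contracts_per_user, 30)
print(f"top 5%: {share_top5:.0%}, top 30%: {share_top30:.0%}")
```

The same function applies to threads by passing contract counts per thread instead of per user, and to monthly figures by computing the counts one month at a time.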
Trading activities. We observe that currency exchange and payments account for the largest proportion of contracts and users involved, followed by giftcards and accounts/licenses. Other popular products include automated bots, hacking tutorials, remote access tools (RATs), and eWhoring packs. We find that most members are involved in only a few one-off transactions, with 49% making only one transaction and 46% accepting only one transaction. The accompanying figure shows the evolution of the top five products over the three eras. Despite the increased number of contracts at the beginning of Stable, the number of contracts in the top five categories does not grow rapidly during Set-up and Stable. This is likely due to the decline in the number of public contracts, with users moving to private contracts whose content we cannot see. In fact, the proportion of public contracts decreases from 50% at the beginning of the Set-up era to around 10% at its end. During the COVID-19 era, we see a stimulus across all top products. At the end of this era, the hackforums-related category takes the highest position, despite placing last at the beginning of the Set-up era. This category refers to virtual Hack Forums products, such as bytes (a type of internal currency used within the forum) and vouch copies. It suggests a high demand for establishing reputation in the community. Overall, we see clear evidence that the platform is being used as a cash-out market.
Preferred payment methods. In terms of both the number of contracts completed (upper figure) and trading value (lower figure), Bitcoin and PayPal dominate the others at all times. Other cryptocurrencies, including Ethereum, Bitcoin Cash, Litecoin, and Monero, account for trivial proportions, indicating that despite its limitations, Bitcoin is still the popular cryptocurrency on the underground marketplace. Amazon Giftcards rank third, and the most sought-after fiat currency is USD. There is a gradual downtrend in Stable, which we again believe is because many users chose private transactions. We see a peak at the beginning of the Stable era for most of these payment methods and a stimulus among them during the COVID-19 era, particularly for Bitcoin and PayPal. However, the ranking between them remains largely unchanged, except for a considerable increase in Cashapp in the last two months, when it outpaces PayPal and Amazon Giftcards to take second place, its highest ever ranking.
Trading value. We estimate the trading values for completed contracts, ignoring Vouch Copy (as these are proofs of reputation rather than economic trades). To extract the trading values more precisely, we manually check high-value transactions exceeding 1,000 USD, which are mostly related to Bitcoin and PayPal (or Cashapp) exchanges. We verify these by reading the contract information, and checking the Bitcoin address if provided, to identify the actual values. We estimate the total value of public transactions to be nearly 1 million US dollars. The actual trading values are likely to be much larger, as there are over five times as many completed private contracts as public ones. To estimate the value of both private and public contracts, we assume private contracts are at least as valuable on average as public ones. We note that more valuable transactions carry a higher risk of incrimination, and therefore may be more likely to be private. We thus extrapolate by contract type to obtain a lower-bound estimate of around 5 million US dollars for private contracts and, therefore, more than 6 million US dollars for all contracts. Again, we observe that the marketplace is highly centralised around power-users, with the top 10% of users party to over 70% of the total value. The figure shows the evolution of monthly value by contract type, where Exchange generally accounts for the highest value, followed by Sale and Purchase. We see a downtrend during Set-up and Stable, presumably due to the decreasing proportion of public contracts during Set-up (so that we cannot observe the true value here). We see an uplift at the beginning of the Stable era, when the new policy is adopted. In COVID-19, the biggest increases come from Sale and Exchange, where we observe a stimulus in trading value. We also observe a short-lived surge in Sale, which briefly outpaces Exchange to become the highest-value type in March and April 2020; however, Exchange quickly resumes first place afterwards.
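The extrapolation described above can be sketched as follows: for each contract type, the average value of a public contract is multiplied by the number of completed private contracts of that type. All numbers below are invented for illustration and are not the figures behind the estimates in the text; the function and variable names are likewise hypothetical.

```python
def extrapolate_private_value(public_value, public_count, private_count):
    """Per-type lower bound, assuming a private contract is worth at least
    the average public contract of the same type."""
    estimate = {}
    for ctype, total in public_value.items():
        mean_public = total / public_count[ctype]
        estimate[ctype] = mean_public * private_count[ctype]
    return estimate

# Invented inputs (USD and contract counts), not taken from the dataset.
public_value = {"Exchange": 500_000, "Sale": 300_000, "Purchase": 100_000}
public_count = {"Exchange": 5_000, "Sale": 10_000, "Purchase": 4_000}
private_count = {"Exchange": 25_000, "Sale": 50_000, "Purchase": 20_000}

private_estimate = extrapolate_private_value(public_value, public_count, private_count)
private_total = sum(private_estimate.values())
overall_total = sum(public_value.values()) + private_total
print(f"private lower bound: {private_total:,.0f} USD, all contracts: {overall_total:,.0f} USD")
```

Extrapolating per type rather than with one global average matters because the mix of contract types may differ between public and private contracts.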
The ‘cold start’ problem. This problem refers to the challenge new members face in getting started on the market: establishing a reputation and building up a customer base without any existing reputation. While this problem is commonly known in recommender systems, we believe we are the first to consider it in relation to underground markets. To address it, we first use clustering to single out the users who successfully overcame the ‘cold start’ problem, then investigate the trading activities in which they are involved. Our choice of predictor variables is informed by the literature on trust in underground markets, and includes users’ positive and negative ratings, number of disputed transactions, and length of participation since first active post. We use k-means clustering to examine groups within the subset of members who accepted their first contract in Stable. We limit this analysis to Stable only, as during Set-up many actors had a presence in the marketplace before the contract system began. We find two clusters. The first contains the majority (97.7%) of members. These users have a median of one accepted contract and seven posts, reflecting the generally low-volume activity of most members. The second cluster is significantly smaller, containing 2.3% of members, with a median of 49 accepted contracts and nearly 300 posts. This cluster is characterised by a greater amount of market activity, and it is these outliers we are interested in, as they capture the users who successfully overcame the ‘cold start’ problem. We then look at the products and services these users are involved in, finding that the transaction type plays an important role in the context of the ‘cold start’ problem. The majority of these members build their reputation by participating in low-level currency Exchange. A proportion of these users only offer items on an exchange basis. A small proportion of users do not participate in Exchange, instead establishing themselves by offering products and services.
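To make the clustering step concrete, the following is a minimal pure-Python sketch of k-means (Lloyd's algorithm) on the four predictor variables named above. It is not the authors' implementation: the z-score standardisation, the deterministic farthest-point initialisation, and the ten user records are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev

def standardise(rows):
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((x - m) / s for x, m, s in zip(row, mus, sds)) for row in rows]

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(mean(col) for col in zip(*members)) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Invented users: (positive ratings, negative ratings, disputes, days active).
users = [
    (1, 0, 0, 30), (0, 0, 0, 10), (2, 0, 0, 45), (1, 1, 0, 20),
    (0, 0, 0, 5), (3, 0, 1, 60), (1, 0, 0, 15), (2, 0, 0, 25),
    (48, 2, 3, 400), (55, 1, 2, 380),  # two high-activity outliers
]

labels, centroids = kmeans(standardise(users), k=2)
print(labels)
```

In this toy data the two high-activity users end up in their own small cluster, which mirrors the kind of outlier group the analysis is interested in.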
The evolution of latent groups. To generate insights into the kinds of users and patterns of behaviour in the market, we use Latent Transition Modelling (LTM) to identify latent ‘classes’ within the data. LTM is a longitudinal application of Latent Class Analysis, a statistical modelling technique that uses clustering to find latent groups in data which share similar characteristics, and to assign group membership to the items in our dataset. We exclude Trade and Vouch Copy contracts, as there are few of them. This analysis is crucial to understanding how co-operation evolves over time in this marketplace: the behaviours of each class, how users move between classes over time, and how the classes change across the lifetime of the market. The main findings are mostly discussed in the Summary section below.
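Fitting a latent transition model is beyond a short example, but once each user has a class assignment per period, movement between classes can be summarised as a transition table. The sketch below shows only that final tabulation step, not LTM itself; the class names and user assignments are invented for illustration.

```python
from collections import Counter, defaultdict

def transition_matrix(before, after):
    """Row-normalised proportions of users moving from each class to each
    class, restricted to users present in both periods."""
    counts = defaultdict(Counter)
    for user, src in before.items():
        if user in after:
            counts[src][after[user]] += 1
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()
    }

# Invented class assignments for two consecutive periods.
period_1 = {"a": "small-scale", "b": "small-scale", "c": "small-scale",
            "d": "small-scale", "e": "power-user", "f": "power-user"}
period_2 = {"a": "small-scale", "b": "small-scale", "c": "small-scale",
            "d": "power-user", "e": "power-user", "g": "small-scale"}

matrix = transition_matrix(period_1, period_2)
for src, row in matrix.items():
    print(src, row)
```

Users who appear in only one period (here "f" and "g") are left out of the table; how to treat such arrivals and departures is a modelling decision this sketch does not address.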
Summary: A big picture
Set-up Era: The market forms in the first era, with users gradually shifting to the new platform. Initially, Exchange contracts are split between large numbers of small-scale users (who make single currency exchanges) and power-users. We see an increase in Exchange, largely driven by power-users, who then dominate this contract type. Purchase is dominated by ‘small-fry’: the growth in transactions is almost entirely driven by single-transaction users, with takers split between small-fry and a small number of power-users. Sale is dominated by small-scale users in this era. Although this period begins with an even mix of public and private transactions, it shifts in favour of private transactions towards its end, when contracts become compulsory. We do not see a ‘concentrated’ market grow in this era; instead, small-scale users deal with other small-scale users, and power-users with other power-users. Thus, as the market slowly grows, it is not turning into a ‘business to customer’ market. This has important implications for trust mechanisms across Set-up, as the trust function of recording transactions is likely to play only a small role where individuals make or accept only a single transaction. We therefore conclude that, in Set-up, the market largely facilitates the growth of relationships between power-users, rather than the establishment of trusted traders used by large numbers of small-scale users.
Stable Era: There is a considerable shift in the composition and scale of the market when contracts become compulsory. While far more transactions are being made, the majority (around 88%) are now private, meaning other users can only see limited feedback and the transaction type. The market sustains a diverse range of behaviours and products over this period. We observe the growth of ‘business to customer’ patterns of trade, with individual power-users beginning to cultivate large numbers of small-scale customers, rather than trading with one another. The influx of customers appears to accelerate competition (and hence, conflict), hastening progression to the norming phase, our Stable era. Power-users who established themselves over the Set-up period capitalise on the reputation and trust they have built up and, alongside newer would-be power-users, are well-positioned to benefit from the influx of small-scale custom. Hence, the trust relationships facilitated by the market infrastructure shift between these two phases, from a forming, orienting function to one which more closely resembles trust relationships within a traditional market, with clear producers and consumers.
Limitations
1. No ground truth verification. While contractual details can be observed, we generally have no way to verify whether transactions actually go through with the exact values described. Where traders specify Bitcoin/Ethereum addresses or transaction hashes, the actual values on the blockchain can be confirmed; otherwise, the dataset lacks ground truth verification. Moreover, even when a transaction hash is provided, we have no way to verify its integrity, as a dishonest party could find a suitable transaction on the blockchain and put it into the contract details to gain reputation.
2. Large number of private contracts. Throughout the entire period, the proportion of private contracts in our dataset is considerable (around 88% at the end), which means some details are hidden from us.
Frequently Asked Questions
1. Can private individual information be leaked? We strictly follow our institution’s ethical review procedure and carefully designed our experiments to operate ethically and collectively, without aiming to identify individuals. Thus, in our paper, the identities of all contracts, posts, threads, and users involved are entirely hidden to ensure that no private sensitive information can be revealed. As our work only studies collective rather than individual behaviour, and the data is publicly available, it is justified in accordance with the British Society of Criminology’s Statement on Ethics.
2. Why is Tuckman’s stages of group development used? We consider this marketplace to be an example of a disparate group of actors coming together (with their own diverse motivations) to work on a joint endeavour. This theory therefore fits the development stages of the marketplace, and we use it to make sense of the changes we observe in our data, which correspond to the external events across the three eras. Note that the events we use to define these eras play a deductive rather than inductive role in our analysis: they are imposed by us in order to analyse the effects of external factors, rather than arising from the data.
3. Can contracts be completed outside the platform so as not to leave any (potentially incriminatory) trace? We do not know whether people choose to avoid using contracts, but yes, trades can be carried out outside the contract system, for example via direct messages. However, as the system allows private contracts that do not reveal the goods being exchanged, we believe users are incentivised to use it: it enables them to gain reputation and provides a certain level of protection (e.g., opening disputes) if anything goes wrong with a transaction.
4. Can I get access to the dataset? Yes. One of our key aims is to make the data available to other researchers so that they can apply their own skills to address cybercrime issues. We therefore provide the dataset to academic researchers upon signing a simple agreement (to prevent misuse). If you are interested, please email email@example.com and we will help you with the next steps.
Jack Hughes · Cambridge Cybercrime Center, University of Cambridge
Ildiko Pete · Cambridge Cybercrime Center, University of Cambridge
Ben Collier · Cambridge Cybercrime Center, University of Cambridge
Yi Ting Chua · Cambridge Cybercrime Center, University of Cambridge
Ilia Shumailov · Cambridge Cybercrime Center, University of Cambridge
Alice Hutchings · Cambridge Cybercrime Center, University of Cambridge
Our work is supported by the UK government under Engineering and Physical Sciences Research Council (EPSRC) grant numbers EP/M020320/1 and EP/V026178/1.