Volume Obfuscation Game: The Lead Data Brokers Out To Waste Your Time

Introduction

Group-IB has observed a growing number of data advertisements targeting organizations worldwide across multiple industries, circulating within Chinese-language cybercrime ecosystems on dark web forums and Telegram.

These sources typically advertise large volumes of data within short time frames. However, Group-IB’s past analyses revealed that most of these claims consist of data that was compiled from prior breaches or generated, and that shows no indications of an actual data breach.

Rapid, high-volume messaging and frequent low-credibility claims, combined with a lack of wider understanding and analysis of these sources and their data, contribute to a misunderstanding of their nature, operations and credibility.

The research Group-IB conducted for this blog highlights five prominent lead data sources operating exclusively in Chinese-language environments, provides examples of analytical assessments, sample data validation, plus key characteristics and patterns organizations can watch for.

Key Discoveries

Data brokers active in Chinese-speaking dark web forums and Telegram channels are advertising large volumes of purportedly stolen data from organizations worldwide.
Past Group-IB analyses show that these datasets are derived from prior breaches, include generated data and exhibit multiple inconsistencies – characteristic of “lead data”.
Examples of analytic assessments are provided to trace upstream data sources and highlight the datasets’ composition and inconsistencies.
Observed brokers display recurring, identifiable messaging patterns, including specific posting structures and keywords.
These claims are assessed to divert analytical resources from legitimate threats and create significant time sinks for less-informed organizations.
Organizations should leverage this research to identify and assess similar claims using a structured analytical approach.

Group-IB Threat Intelligence Portal:

Group-IB customers can access our Threat Intelligence portal for detailed information about Aiqianjin, Yiqun Data, Phoenix Overseas Resources and other lead data sources.

Overview of Lead Data Sources

This section highlights five of the most active lead data sources identified by Group-IB during the research period, including the dark web forums Exchange Market (交易市场, also known as Deepmix) and Chang’An Sleepless Night (长安不夜城), in addition to Telegram-based data brokers such as Aiqianjin (爱钱进), Yiqun Data (义群数据) and Phoenix Overseas Resources (凤凰海外资源).

1. Exchange Market

Exchange Market (交易市场), also known as Deepmix, is a Chinese-language dark web forum which has been operating since 2013.

The forum consists of nine sections including:

Data and Information (数据与信息)
Routine Services and Operations (常规服务与操作)
Technical Support (技术服务[非教程])
Curiosity Genre Movie Resources (影视猎奇资源)
Virtual Items (虚拟物品)
Other Categories (其它类别)
Tutorials and Documentation (教程与文档)
Private Custom Content (私人专拍)
Physical Goods Shipped (实体发货商品)

Figure 1. Exchange Market dark web forum.

Unlike other mainstream dark web forums, Exchange Market provides no publicly viewable, customizable user profiles, and uses randomly generated numeric identifiers, assigned at registration, to track seller accounts (卖家账号).

Figure 2. Exchange Market thread metadata.

Furthermore, the forum does not provide a functional private messaging feature, with communication between buyers and sellers requiring referrals to the forum.

Figure 3. Exchange Market thread reply feature.

2. Chang’An Sleepless Night

Chang’An Sleepless Night (长安不夜城) is a Chinese-language dark web marketplace which is similar to Exchange Market.

Figure 4. Chang’An Sleepless Night dark web marketplace.

The forum also consists of nine sections including:

Data resources (数据资源)
Film, TV and audio-visual products (影视音像)
Technical skills (技术技能)
Card materials and CVV (卡料CVV)
Physical items (实体物品)
Services (服务业务)
Private auctions (私人专拍)
Virtual items (虚拟物品)
Other categories (其他类别)

As with Exchange Market, buyers and sellers on Chang’An Sleepless Night rely on a reply-based, seemingly non-functional communication feature, and users are simply identified by redacted alphanumeric names, such as “B*****7”.

Figure 5a. Chang’An Sleepless Night thread metadata.

Figure 5b. Chang’An Sleepless Night thread metadata (translated).

3. Aiqianjin

Aiqianjin (爱钱进) was first seen on Telegram on the 8th of April, 2023. Following two years and three months of activity, the channel ceased operations in July 2025.

Aiqianjin engaged in advertising lead data alongside customized data requests, and at the time of detection was one of the largest Telegram-based vendors – reaching approximately 5,000 subscribers in July 2025.

Figure 6. Aiqianjin’s Telegram channel.

Aiqianjin’s administrator has been active in underground Telegram communities focused on data sharing, scamming, cryptocurrency and logs since at least January 2022.

Figure 7a. Aiqianjin’s administrator profiles.

4. Yiqun Data

Yiqun Data (义群数据) emerged on Telegram on the 1st of March, 2025 and at the time of writing this research had 431 subscribers.

Initially, the channel marketed itself as a provider of website penetration testing, privilege escalation, domestic (presumably referring to mainland China) and international database dumps, software development, remote device control and system development and deployment.

Figure 8. Yiqun Data’s Telegram channel.

The channel’s administrator is also involved in other underground Telegram communities.

Figure 9. Yiqun Data’s administrator profile.

5. Phoenix Overseas Resources

Phoenix Overseas Resources (凤凰海外资源) was first seen on Telegram on the 31st of December, 2024 and had over 400 subscribers at the time of detection.

Figure 10. Phoenix Overseas Resources’ Telegram channel.

Phoenix Overseas Resources’ administrator initially started their activity in July 2024, participating in various data collection, exchange and scraping groups on Telegram.

Figure 11. Phoenix Overseas Resources’ administrator profile.

Data Examples and Analysis

Example 01 – Exchange Market

In June 2025, a user posted a thread on Exchange Market advertising credit card data from a prominent bank in the Gulf.

The full data allegedly contained over 600,000 records, and the following image (Figure 12 below) was provided as a sample.

Figure 12. Image of sample data posted on Exchange Market.

Prior to validating the sample data, there are a handful of key points to understand and analyze.

Left to right, the columns in this image include:

Bank name
First name
Last name
Gender
Phone number
Credit card type
Password
Card status
Timestamp

What immediately stands out from the sample data, even before performing any validation, is its inconsistent nature and values. The first and last name fields (2nd and 3rd columns, respectively) contain a mixture of names in both English and Arabic, which is uncommon for a production database without distinct fields for each language.

The 6th field, credit card type in Arabic, appears to be translated and can immediately be discerned as inaccurate by native speakers. Values such as “بطاقة الاتمان التوقيع” literally translate to “The credit card the signature”, which is incoherent in both English and Arabic.

The 8th field, credit card status in Arabic, is also translated and uses atypical wording. In this sample, all credit card statuses are listed as “الحالة النشطة”, which literally translates to “The active status” – an unusual value for a field that would typically be expected to read “Active”.

Even without further validation, the three indicators above strongly suggest that the data is fabricated or lacks credibility.

Validation gives a clearer picture of how this data was aggregated.
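The mixed-language indicator described above can also be checked mechanically. The following is a minimal Python sketch under stated assumptions, not a description of Group-IB’s tooling: the helper names scripts_in and column_mixes_scripts are hypothetical, and the values in the usage note are invented placeholders rather than records from the sample.

```python
import unicodedata


def scripts_in(text: str) -> set[str]:
    """Return the coarse scripts used by the letters in text."""
    found = set()
    for ch in text:
        if not ch.isalpha():
            continue  # ignore digits, spaces and punctuation
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            found.add("ARABIC")
        elif name.startswith("LATIN"):
            found.add("LATIN")
        else:
            found.add("OTHER")
    return found


def column_mixes_scripts(values: list[str]) -> bool:
    """True when one column holds values written in more than one script."""
    seen = set()
    for value in values:
        seen |= scripts_in(value)
    return len(seen) > 1
```

For instance, column_mixes_scripts(["Ahmed", "محمد"]) returns True, while a column holding only Latin-script names returns False. A positive result is only a prompt for closer review, since some legitimate systems do store names in more than one script.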

Taking the first two records from the sample image (Figure 12 above) and searching for the phone numbers and password hashes reveals the following:

Sample data – record #1

The full name and phone number have been sourced from the Facebook 2021 leak, which exposed the data of 553 million users.

Figure 13. Facebook 2021 leak record showing matching phone number and name.

The password hash has been sourced from an October 2020 leak targeting Eatigo. Notably, the record from this leak shows a different name and phone number.

Figure 14. Eatigo 2020 leak record showing matching password hash with inconsistent data.

Sample data – record #2

The second record from the sample data has also been compiled in a similar fashion.

A record from the Facebook 2021 leak was used as a source for the individual’s phone number and name.

Figure 15. Facebook 2021 leak record showing matching phone number and name.

The password hash has been sourced from the Eatigo 2020 leak; however, the record shows a different name: Alex.

Figure 16. Eatigo 2020 leak record showing matching password hash with inconsistent data.
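The validation steps used in this example, looking up each value from a sample record in prior leaks and then checking whether the rest of the matching leak record agrees, can be sketched in Python as follows. This is an illustration under assumptions: the FieldMatch class, the trace_record function, the field names and the in-memory leak collections are all hypothetical, and the demo values are placeholders rather than real records.

```python
from dataclasses import dataclass


@dataclass
class FieldMatch:
    field: str       # field of the advertised sample whose value was found
    leak: str        # name of the prior leak containing that value
    conflicts: list  # other shared fields whose values differ


def trace_record(sample: dict, leaks: dict) -> list:
    """Find each sample value in prior leaks and note conflicting fields."""
    matches = []
    for field, value in sample.items():
        if not value:
            continue  # skip empty values, they would match missing fields
        for leak_name, records in leaks.items():
            for rec in records:
                if rec.get(field) != value:
                    continue
                # Compare every other field present in both records.
                conflicts = sorted(
                    k for k in sample.keys() & rec.keys()
                    if k != field and sample[k] != rec[k]
                )
                matches.append(FieldMatch(field, leak_name, conflicts))
    return matches


# Placeholder data only: one advertised record and two hypothetical leaks.
sample = {"name": "Person A", "phone": "+000000001", "password_hash": "hash-1"}
leaks = {
    "leak_one": [{"name": "Person A", "phone": "+000000001"}],
    "leak_two": [{"name": "Person B", "phone": "+000000002",
                  "password_hash": "hash-1"}],
}
for match in trace_record(sample, leaks):
    print(match.field, match.leak, match.conflicts)
```

In this placeholder case the name and phone number trace to one leak with no conflicts, while the password hash traces to a second leak whose name and phone number differ. That is the same cross-compilation pattern seen in the two records above.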

Example 02 – Yiqun Data

In January 2026, Yiqun Data posted a message on their Telegram channel advertising 700,000 records of data related to bonds, mutual funds and forex stemming from a bank in the Gulf.

The following image (Figure 17) was provided as a sample:

Figure 17. Image of sample data posted by Yiqun Data

Left to right, the data includes the fields:

First name
Last name
Phone number
Gender
Bank
International Securities Identification Number (ISIN)
Investment type
Funds
Password
Date of birth
Update timestamp

Similar to the previous example from Exchange Market, the individuals’ names and phone numbers in the first two records are sourced from the Facebook 2021 leak, while their passwords are sourced from the October 2020 Eatigo leak.

Sample data – record #1

Figure 18. Facebook 2021 leak record showing matching phone number and name.

Figure 19. Eatigo 2020 leak record showing matching password hash with inconsistent data.

Sample data – record #2

Figure 20. Facebook 2021 leak record showing matching phone number and name.

Figure 21. Eatigo 2020 leak record showing matching password hash with inconsistent data.

Example 03 – Phoenix Overseas Resources

In January 2026, Phoenix Overseas Resources advertised 760,000 records from an investment service in the Gulf.

The following image (Figure 22) was provided as a sample:

Figure 22. Image of sample data posted by Phoenix Overseas Resources.

Left to right, the data includes the fields:

Email address
First name
Last name
Phone number
Organization’s name
Gender
Date of birth
Service/account type
Investment instrument
Timestamp

Sample data – record #1

In this example, the first record’s email address has been sourced from a Truecaller leak in April 2022.

Note that the record shows a different phone number than that in the sample data image.

Figure 23. Truecaller 2022 leak record showing matching email address and name with inconsistent data.

Sample data – record #2

Similarly, the second record’s email address was sourced from the 2022 Truecaller leak, with an inconsistent phone number.

Figure 24. Truecaller 2022 leak record showing matching email address and name with inconsistent data.

The individual’s phone number, meanwhile, was sourced from the Facebook 2021 leak, where the record also shows an inconsistent name.

Example 04 – Aiqianjin

In June 2025, Aiqianjin claimed to be selling over 600,000 records of “bank savings card account data” from a bank in the Gulf.

The following image (Figure 26) was provided as a sample:

Figure 26. Image of sample data posted by Aiqianjin.

Left to right, the data includes the fields:

First name
Last name
Phone number
Bank
Account type
IP address
Gender
Date of birth
A timestamp

Similar to our initial example from Exchange Market, several inconsistencies can be spotted in the sample data without performing any validation.

The column names appear to be translated into Arabic, and would be phrased differently in a legitimate, production database. Additionally, the use of first and last names in different languages is atypical.

Furthermore, Aiqianjin has translated the name of the 6th column, IP address, from English to Arabic. However, the vendor appears to have translated “IP” directly, and translation engines interpreted it as “intellectual property” (الملكية الفكرية).

Figure 27. Google Translate – English to Arabic translation of “IP”.

Figure 28. Google Translate – Arabic to English translation of “الملكية الفكرية”.

Regardless of the indications that the data lacks any credibility, validation reveals that most of the individuals’ names and phone numbers were part of the Facebook 2021 leak.

Sample data – record #1

Figure 29. Facebook 2021 leak record showing matching phone number and name.

Sample data – record #2

Figure 30. Facebook 2021 leak record showing matching phone number and name.

Upstream Sources

Although the examples provided in this research cover limited datasets relative to the volume of data advertised on these platforms and by these brokers, a clear pattern can be seen, which also appears in prior analyses of lead data.

Data brokers operating in Chinese-language environments show a clear preference for specific types of leaks, most notably the Facebook 2021 dataset, and it is probable that high-volume leaks containing personally identifiable information such as names, phone numbers and email addresses will continue being the dominant source for compilation of lead data.

Figure 31. Upstream sources from the examples’ sample datasets.

Identifying Lead Data

Chinese-language data sources exhibit distinct characteristics, and the following visual and behavioral markers can be used to discern both these types of sources and data.

Firstly, these sources and advertisements are predominantly posted in Chinese, with limited instances of advertisements or claims in English.

A major indicator is the sheer number of messages posted by these sources in a very short period of time.

The example sources in this research often post 500 to 1,000 or more messages a month, which would constitute an unprecedented number of breaches if true.

Figure 32. Monthly message volume between April 2023 – January 2026.
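The volume indicator can be approximated by counting a source’s messages per calendar month and flagging months above a cut-off. The sketch below is a minimal Python example under assumptions: the function names are hypothetical, and the default threshold of 500 is taken from the lower bound mentioned above rather than from any fixed rule.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(timestamps: list[datetime]) -> Counter:
    """Count messages per (year, month) pair."""
    return Counter((ts.year, ts.month) for ts in timestamps)


def high_volume_months(timestamps: list[datetime], threshold: int = 500) -> list:
    """Return the (year, month) pairs with at least `threshold` messages."""
    counts = monthly_counts(timestamps)
    return sorted(month for month, n in counts.items() if n >= threshold)
```

The input is assumed to be the posting timestamps of a single channel or seller account. Mixing several sources into one list would inflate the counts.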

Across more than 17,000 messages that Group-IB collected from the three Telegram sources in this research, the following keywords consistently appear in messages advertising lead data.

Keyword	Translation	Occurrence rate
<number>万	Ten thousand	27.50% (4,870)
<number>月	Month	20.98% (3,716)
<number>w	Shorthand for “10,000”	5.91% (1,047)
<number>万条	10,000 items	5.61% (994)
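Occurrence rates of this kind can be reproduced on any message collection with a few regular expressions. The following Python sketch is an assumption about how such keywords might be matched, not the method used to produce the table: the patterns, the KEYWORD_PATTERNS name and the keyword_rates function are all hypothetical. Note that a message containing “万条” also matches the broader “万” pattern, so the categories overlap.

```python
import re

# One pattern per keyword; <number> is taken to be an integer or a decimal.
KEYWORD_PATTERNS = {
    "<number>万": re.compile(r"\d+(?:\.\d+)?\s*万"),
    "<number>月": re.compile(r"\d{1,2}\s*月"),
    # "w" must not be followed by another Latin letter (avoids e.g. "5ways").
    "<number>w": re.compile(r"\d+(?:\.\d+)?\s*[wW](?![A-Za-z])"),
    "<number>万条": re.compile(r"\d+(?:\.\d+)?\s*万条"),
}


def keyword_rates(messages: list[str]) -> dict:
    """Map each keyword to (messages containing it, share of all messages)."""
    total = len(messages)
    rates = {}
    for label, pattern in KEYWORD_PATTERNS.items():
        hits = sum(1 for text in messages if pattern.search(text))
        rates[label] = (hits, hits / total if total else 0.0)
    return rates
```

Each message is counted at most once per keyword, regardless of how many times the keyword appears in it.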

Lead data advertisements often have consistent message structures: samples are provided as an image of a spreadsheet (see label 1), the total number of records for sale is given using one of the aforementioned keywords (see label 2), the alleged victim organization or the data’s geographic origin is typically listed in English (see label 3), and the month when the data was obtained is given using the aforementioned keywords (see label 4). Examples of two such posts on different platforms are shown below.

Figure 33. Data advertisement on Telegram from Phoenix Overseas Resources.

Figure 34. Exchange Market thread.
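Because the claimed record count and month follow such a regular structure, they can be extracted from an advertisement’s text for triage. The Python sketch below is illustrative only: the parse_advert function, its regular expressions and its output keys are hypothetical, and the example strings in the usage note are invented rather than quoted from the posts above.

```python
import re

# A number followed by 万条, 万 or the "w" shorthand, each meaning x10,000.
COUNT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:万条|万|[wW](?![A-Za-z]))")
# A one- or two-digit number followed by 月 (month).
MONTH_RE = re.compile(r"(\d{1,2})\s*月")


def parse_advert(text: str) -> dict:
    """Extract the claimed record count and month from an advertisement."""
    count_match = COUNT_RE.search(text)
    month_match = MONTH_RE.search(text)
    records = None
    if count_match:
        records = round(float(count_match.group(1)) * 10_000)
    month = int(month_match.group(1)) if month_match else None
    if month is not None and not 1 <= month <= 12:
        month = None  # discard values that cannot be a calendar month
    return {"claimed_records": records, "month": month}
```

For the invented text "6月 Example Bank 60万条", this returns a claimed count of 600,000 and month 6. For "出售 70w 数据" it returns 700,000 and no month.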

Finally, where labelled sample data is provided, it often contains a combination of the following fields:

First name
Last name
Date of birth
Gender
Email address
Phone number
Country or nationality
User ID
Hashed password
Organization name
Product, customer, or account type or category
A timestamp
Other arbitrary fields

Conclusion

Lead data, often advertised within Chinese-speaking cybercrime communities on dark web forums and Telegram channels, consists of low-credibility data that Group-IB researchers have observed to be compiled from prior leaks or generated, and that contains no indications of an actual breach.

Although lead data bears extremely low credibility, it may contain legitimate identifiers of real individuals, as seen in the examples above. However, these identifiers alone are often found to be inconsistent with other data, and do not represent real customers of the targeted organizations.

Individuals and security teams are advised to use the examples, patterns, keywords and assessments in this research to develop an informed, analytical approach for responding to similar claims from such sources if and when they occur.

Group-IB’s Threat Intelligence team has extensively analyzed numerous similar incidents, and the primary takeaway is that they divert analytical resources from legitimate threats and incidents, and can create a significant time sink for uninformed organizations.

Recommendations

If your organization is mentioned in these sources:

Verify that the records are relevant to your organization – if the claim is legitimate, field names, data types, structures and record counts should match your organization’s internal records
Verify that the identifiers are consistent with your internal data – lead data will often contain identifiers such as names, email addresses or phone numbers that do not exist in the allegedly targeted organization’s database(s)
Analyze records as a whole – if a legitimate identifier (such as an email address) matches your internal data, check that the other identifiers in the record are consistent. The presence of real identifiers does not validate the claim if other details are incorrect, as is often seen with cross-compiled data
Leverage Threat Intelligence services for the latest information, updated threat feeds and deeper validation of data
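
The second and third checks above can be sketched in Python as a comparison between one advertised record and the internal record that shares its identifier. This is a minimal illustration under assumptions: the assess_record function, the field names and the dictionary-based internal lookup are hypothetical, and real data would first need normalization (letter case, phone number formats) that is omitted here.

```python
def assess_record(advertised: dict, internal_by_key: dict, key: str = "email") -> dict:
    """Compare an advertised record with the internal record sharing `key`."""
    internal = internal_by_key.get(advertised.get(key))
    if internal is None:
        # The identifier does not exist in internal data at all.
        return {"identifier_known": False, "matching": [], "conflicting": []}
    # Only fields present in both records can be compared.
    shared = (advertised.keys() & internal.keys()) - {key}
    matching = sorted(f for f in shared if advertised[f] == internal[f])
    conflicting = sorted(f for f in shared if advertised[f] != internal[f])
    return {"identifier_known": True, "matching": matching,
            "conflicting": conflicting}
```

A record whose identifier is known but whose other fields conflict is consistent with cross-compiled data rather than with a breach of the organization’s own systems.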

Frequently Asked Questions

1. What data is typically sold in Chinese-speaking cybercrime communities?

Personally identifiable information such as names, email addresses, and phone numbers, alongside password hashes, financial figures, timestamps and product/customer account categories are typically seen in advertised datasets.

Data listings typically claim thousands to millions of records for sale – often listed in multiples of ten thousand records, abbreviated with “万”, “w”, or “万条”.

2. Is this data credible?

No. Past analyses have consistently shown that the majority of data listings originating from Chinese-speaking sources consist of data that is fabricated or cross-compiled from multiple prior breaches, and that is unrelated to the listed organization – characteristic of “lead data”.

3. What are some common sources of this type of data?

Dark web marketplaces and forums such as:

Exchange Market / Deepmix (xxxxxxxxxs6qbnahsbvxbghsnqh4rj6whbyblqtnmetf7vell2fmxmad[.]onion)
Chang’An Sleepless Night (cabyceogpsji73sske5nvo45mdrkbz4m3qd3iommf3zaaa6izg3j2cqd[.]onion)

Telegram-based data brokers such as:

Yiqun Data
Phoenix Overseas Resources
Aiqianjin

DISCLAIMER: All technical information, including malware analysis, indicators of compromise and infrastructure details provided in this publication, is shared solely for defensive cybersecurity and research purposes. Group-IB does not endorse or permit any unauthorized or offensive use of the information contained herein. The data and conclusions represent Group-IB’s analytical assessment based on available evidence and are intended to help organizations detect, prevent, and respond to cyber threats.

Group-IB expressly disclaims liability for any misuse of the information provided. Organizations and readers are encouraged to apply this intelligence responsibly and in compliance with all applicable laws and regulations.

This blog may reference legitimate third-party services such as Telegram and others, solely to illustrate cases where threat actors have abused or misused these platforms.

This material is provided for informational purposes, prepared by Group-IB as part of its own analytical investigation, and reflects recently identified threat activity.

All trademarks referenced herein are the property of their respective owners and are used solely for informational purposes, without any implication of affiliation or sponsorship.