CN107544979A

CN107544979A - Credibility analysis method and system for user data

Info

Publication number: CN107544979A
Application number: CN201610474402.4A
Authority: CN
Inventors: 于秋林; 陈尧
Original assignee: OneConnect Financial Technology Co Ltd Shanghai
Current assignee: OneConnect Smart Technology Co Ltd
Priority date: 2016-06-24
Filing date: 2016-06-24
Publication date: 2018-01-05

Abstract

The present invention discloses a method and system for analyzing the credibility of user data. The method includes: S10: a server acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data; S11: each field data item of the user data from a user data source to be analyzed is matched one by one against the corresponding field data item of the user sample data, and the matching rate between each field of the user data source to be analyzed and the user sample data is calculated; and S12: based on the calculated matching rate of each field of the user data source to be analyzed, and according to preset analysis rules, it is determined whether the user data source to be analyzed is a credible user data source. The invention can improve the accuracy and efficiency of user data analysis.

Description

User data credibility analysis method and system

Technical field

The present invention relates to the technical field of user data processing, and in particular to a method and system for analyzing the credibility of user data.

Background

At present, big data analysis and application of user data require obtaining multiple types of user data from multiple user data sources (for example, users' banking data from company X1, life insurance data from company X2, auto insurance data from company X3, consumption data from company X4, communication service data from company X5, and so on). However, the current assessment of the authenticity of user data from a specific user data source relies mainly on expert experience and spot checks, which cannot identify untrustworthy user data. For example, some user names in the user data of user data source A are filled in as "so-and-so", and some user names in the user data of user data source B are filled in as "Mr."; existing approaches cannot identify whether the user names in user data source A and user data source B are credible. How to accurately analyze the credibility of user data from a specific user data source has therefore become an urgent technical problem.

Summary of the invention

The present invention provides a method and system for analyzing the credibility of user data, to solve the problem that the credibility of user data cannot currently be analyzed accurately.

In a first aspect, the present invention provides a method for analyzing the credibility of user data, including:

S10: a server acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data;

S11: each field data item of the user data from the user data source to be analyzed is matched one by one against the corresponding field data item of the user sample data, and the matching rate between each field of the user data source to be analyzed and the user sample data is calculated; and

S12: based on the calculated matching rate of each field of the user data source to be analyzed, and according to preset analysis rules, it is determined whether the user data source to be analyzed is a credible user data source.

In a second aspect, the present invention provides a system for analyzing the credibility of user data, including:

an acquisition module, configured to acquire a preset number of user sample data records and obtain multiple field data items corresponding to the user sample data;

an analysis module, configured to match each field data item of the user data from the user data source to be analyzed one by one against the corresponding field data item of the user sample data, and to calculate the matching rate between each field of the user data source to be analyzed and the user sample data; and

a determination module, configured to determine, based on the calculated matching rate of each field of the user data source to be analyzed and according to preset analysis rules, whether the user data source to be analyzed is a credible user data source.

The present invention provides a method and system for analyzing the credibility of user data. The method includes: S10: a server acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data; S11: each field data item of the user data from the user data source to be analyzed is matched one by one against the corresponding field data item of the user sample data, and the matching rate between each field of the user data source to be analyzed and the user sample data is calculated; and S12: based on the calculated matching rate of each field of the user data source to be analyzed, and according to preset analysis rules, it is determined whether the user data source to be analyzed is a credible user data source. The technical solution of the embodiments of the present invention matches the user data source to be analyzed against the user sample data field by field, so that the credibility of the user data source to be analyzed is analyzed automatically and accurately, thereby improving the accuracy and efficiency of data analysis.

Description of the drawings

FIG. 1 is a schematic flowchart of a user data credibility analysis method provided by Embodiment 1 of the present invention;

FIG. 2 is a schematic flowchart of a user data credibility analysis method provided by Embodiment 2 of the present invention;

FIG. 3 is a schematic flowchart of a user data credibility analysis method provided by Embodiment 3 of the present invention;

FIG. 4 is a schematic structural diagram of a user data credibility analysis system provided by Embodiment 4 of the present invention.

Detailed description

The technical solutions of the present invention are further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.

Embodiment 1

FIG. 1 is a schematic flowchart of a user data credibility analysis method provided by Embodiment 1 of the present invention. The method can be executed by a user data credibility analysis system, which can be implemented in software and/or hardware and can generally be integrated in a server.

Referring to FIG. 1, the method of this embodiment includes the following steps:

S10: a server acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data.

Specifically, the server may be connected to multiple databases and may acquire user data from them. Each database can be regarded as one data source.

The preset number can be set according to actual conditions, for example 100,000. The multiple field data items include any one or a combination of: name, ID number, age, address, occupation, income, office address, deposit amount, and so on.

S11: each field data item of the user data from the user data source to be analyzed is matched one by one against the corresponding field data item of the user sample data, and the matching rate between each field of the user data source to be analyzed and the user sample data is calculated.

Specifically, each field data item of the user data to be analyzed is matched one by one against the corresponding field data item of the user sample data. If a field data item in the user data is identical to the corresponding field data item in the sample data, that field is considered matched, and the matching rate of each field of the user data source to be analyzed against the sample data is calculated. For example, 100 user sample records correspond to 100 name field values; if 99 of those 100 name field values find a match in user data source Z1, the matching rate of the name field of the 100 user sample records in user data source Z1 is 99%.
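The per-field matching rate of step S11 can be illustrated with a minimal Python sketch. The patent does not say how a sample record is paired with a record in the data source, so the sketch assumes both are keyed by a shared user identifier; the function name `field_match_rates` and the dictionary layout are hypothetical and are not part of the disclosure.

```python
from typing import Dict, List

Record = Dict[str, str]


def field_match_rates(
    samples: Dict[str, Record],
    source: Dict[str, Record],
    fields: List[str],
) -> Dict[str, float]:
    """Return, for each field, the share of sample records whose value
    is identical to the value held by the data source under analysis.

    Assumption: `samples` and `source` map a shared user identifier to
    a record; the patent leaves the pairing of records unspecified.
    """
    rates: Dict[str, float] = {}
    for field in fields:
        matched = 0
        for user_id, sample in samples.items():
            candidate = source.get(user_id)
            # A field matches only when both sides hold the same value.
            if (
                candidate is not None
                and field in sample
                and candidate.get(field) == sample[field]
            ):
                matched += 1
        rates[field] = matched / len(samples) if samples else 0.0
    return rates


# Mirror the example in the text: 100 sample names, 99 of which match.
samples = {str(i): {"name": f"user{i}"} for i in range(100)}
z1_source = {uid: dict(rec) for uid, rec in samples.items()}
z1_source["0"]["name"] = "so-and-so"
name_rates = field_match_rates(samples, z1_source, ["name"])
```

Exact string equality is used here because the text defines a match as the two field values being identical; any normalization (whitespace, case) would be an additional design decision.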

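S12: based on the calculated matching rate of each field of the user data source to be analyzed, and according to preset analysis rules, it is determined whether the user data source to be analyzed is a credible user data source.

Specifically, once the matching rate of each field of the user data source to be analyzed against the sample data has been calculated, whether the user data source to be analyzed is credible is determined according to the preset analysis rules.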

In the technical solution of this embodiment, a server acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data; each field data item of the user data from the user data source to be analyzed is matched one by one against the corresponding field data item of the user sample data, and the matching rate between each field of the user data source to be analyzed and the user sample data is calculated; and, based on the calculated matching rates and according to preset analysis rules, it is determined whether the user data source to be analyzed is a credible user data source. The technical solution matches the user data source to be analyzed against the user sample data field by field, so that the credibility of the user data source to be analyzed is analyzed automatically and accurately, thereby improving the accuracy and efficiency of data analysis.

Embodiment 2

FIG. 2 is a schematic flowchart of a user data credibility analysis method provided by Embodiment 2 of the present invention. Building on Embodiment 1, this embodiment further refines the preset analysis rules to improve the efficiency of user data credibility analysis.

S20: a server acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data.

The preset number can be set according to actual conditions, for example 100,000. The multiple field data items include: name, ID number, age, address, occupation, income, office address, deposit amount, and so on.

S21: each field data item of the user data from the user data source to be analyzed is matched one by one against the corresponding field data item of the user sample data, and the matching rate between each field of the user data source to be analyzed and the user sample data is calculated.

S22: the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate are identified and counted; if the counted number of fields is greater than a preset number, the user data source to be analyzed is determined to be a credible user data source and a credible identifier is added.

Specifically, in this embodiment a single uniform value, for example 99%, can be preset as the matching rate for every field; alternatively, different values can be preset for different fields, for example 99% for the "name" field and 98% for the "ID number" field.

After the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate have been counted, if the counted number of fields is greater than the preset number (for example, 10), the user data source to be analyzed is determined to be a credible user data source.

Further, in this embodiment a credible identifier can be added to the credible user data source; likewise, an untrustworthy identifier can be added to user data sources that are not credible.
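The counting rule of step S22 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: the function names, the `default_rate` and `per_field_rates` parameters, and the label strings "credible" and "untrustworthy" are all hypothetical. The strict "greater than" comparisons follow the wording of the text.

```python
from typing import Dict, List, Optional


def fields_above_threshold(
    match_rates: Dict[str, float],
    default_rate: float = 0.99,
    per_field_rates: Optional[Dict[str, float]] = None,
) -> List[str]:
    """List the fields whose matching rate is strictly greater than
    their preset matching rate.

    `default_rate` plays the role of the uniform preset value;
    `per_field_rates` optionally overrides it for individual fields.
    """
    overrides = per_field_rates or {}
    return [
        field
        for field, rate in match_rates.items()
        if rate > overrides.get(field, default_rate)
    ]


def label_by_field_count(
    match_rates: Dict[str, float],
    preset_count: int,
    default_rate: float = 0.99,
    per_field_rates: Optional[Dict[str, float]] = None,
) -> str:
    """Label a data source as credible when the number of passing
    fields is strictly greater than `preset_count`."""
    passing = fields_above_threshold(match_rates, default_rate, per_field_rates)
    return "credible" if len(passing) > preset_count else "untrustworthy"


# Hypothetical matching rates for one data source.
rates = {"name": 0.995, "id_number": 0.985, "age": 0.5}
label = label_by_field_count(
    rates, preset_count=1, per_field_rates={"id_number": 0.98}
)
```

Because the comparison is strict, a field whose matching rate equals its preset value exactly (for example 99% against a 99% threshold) does not count as passing in this sketch.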

Embodiment 3

FIG. 3 is a schematic flowchart of a user data credibility analysis method provided by Embodiment 3 of the present invention. Building on Embodiment 1, this embodiment further refines the preset analysis rules to improve the efficiency of user data credibility analysis.

S30: a server acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data.

S31: each field data item of the user data from the user data source to be analyzed is matched one by one against the corresponding field data item of the user sample data, and the matching rate between each field of the user data source to be analyzed and the user sample data is calculated.

S32: the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate are identified; it is analyzed whether the identified fields include all of the predetermined key fields; if they do, the user data source to be analyzed is determined to be a credible user data source and a credible identifier is added.

After the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate have been identified, if those fields include all of the predetermined key fields, the user data source to be analyzed is determined to be a credible user data source. The key fields may be, for example, name and/or address.
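The key-field rule of step S32 amounts to a subset check, which the following Python sketch illustrates. The function name `is_credible_by_key_fields`, its parameters, and the example rates are hypothetical; the guard against an empty key-field set is an added assumption, since an empty set would otherwise make every source pass.

```python
from typing import Dict, Iterable


def is_credible_by_key_fields(
    match_rates: Dict[str, float],
    preset_rate: float,
    key_fields: Iterable[str],
) -> bool:
    """Return True when every predetermined key field has a matching
    rate strictly greater than `preset_rate`."""
    required = set(key_fields)
    if not required:
        # Assumption: at least one key field must be configured.
        raise ValueError("at least one key field is required")
    passing = {field for field, rate in match_rates.items() if rate > preset_rate}
    # The source is credible only if all key fields are among the passing fields.
    return required <= passing


# Hypothetical matching rates for one data source.
rates = {"name": 0.995, "address": 0.992, "income": 0.4}
credible = is_credible_by_key_fields(rates, 0.99, ["name", "address"])
```

Unlike the counting rule of Embodiment 2, this rule ignores how many non-key fields pass, so a source with poor matching on optional fields such as income can still be judged credible.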

Embodiment 4

FIG. 4 is a schematic structural diagram of a user data credibility analysis system provided by Embodiment 4 of the present invention. The user data credibility analysis system is applied in a server to analyze the credibility of user data.

The system of this embodiment specifically includes an acquisition module 40, an analysis module 41, and a determination module 42.

The acquisition module 40 is configured to acquire a preset number of user sample data records and obtain multiple field data items corresponding to the user sample data.

The analysis module 41 is configured to match each field data item of the user data from the user data source to be analyzed one by one against the corresponding field data item of the user sample data, and to calculate the matching rate between each field of the user data source to be analyzed and the user sample data.

The determination module 42 is configured to determine, based on the calculated matching rate of each field of the user data source to be analyzed and according to preset analysis rules, whether the user data source to be analyzed is a credible user data source.

Further, the determination module 42 is specifically configured to:

identify and count the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate; and, if the counted number of fields is greater than a preset number, determine that the user data source to be analyzed is a credible user data source and add a credible identifier; or

identify the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate; analyze whether the identified fields include all of the predetermined key fields; and, if they do, determine that the user data source to be analyzed is a credible user data source and add a credible identifier.

Further, the determination module 42 is also configured to add an untrustworthy identifier to user data sources that are not credible.

In the user data credibility analysis system provided by the technical solution of this embodiment, the acquisition module 40 acquires a preset number of user sample data records and obtains multiple field data items corresponding to the user sample data. The analysis module 41 matches each field data item of the user data from the user data source to be analyzed one by one against the corresponding field data item of the user sample data, and calculates the matching rate between each field of the user data source to be analyzed and the user sample data. The determination module 42 determines, based on the calculated matching rates and according to preset analysis rules, whether the user data source to be analyzed is a credible user data source. The technical solution matches the user data source to be analyzed against the user sample data field by field, so that the credibility of the user data source to be analyzed is analyzed automatically and accurately, thereby improving the accuracy and efficiency of data analysis.
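One possible way to arrange the three modules is sketched below in Python. The class names echo modules 40, 41 and 42, but the method names, the record layout keyed by a shared user identifier, the choice of taking the first `preset_size` records as the sample, and the pluggable `rule` callback are all assumptions made for illustration; the patent does not prescribe them.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Dataset = Dict[str, Record]  # hypothetical layout: user id -> record


class AcquisitionModule:
    """Sketch of module 40: takes a preset number of sample records
    and keeps only the chosen fields."""

    def __init__(self, preset_size: int, fields: List[str]) -> None:
        self.preset_size = preset_size
        self.fields = fields

    def acquire(self, pool: Dataset) -> Dataset:
        # Assumption: the first `preset_size` records serve as the sample.
        selected = list(pool.items())[: self.preset_size]
        return {
            uid: {f: rec[f] for f in self.fields if f in rec}
            for uid, rec in selected
        }


class AnalysisModule:
    """Sketch of module 41: computes the per-field matching rate."""

    def match_rates(
        self, samples: Dataset, source: Dataset, fields: List[str]
    ) -> Dict[str, float]:
        rates: Dict[str, float] = {}
        for field in fields:
            matched = sum(
                1
                for uid, sample in samples.items()
                if field in sample
                and source.get(uid, {}).get(field) == sample[field]
            )
            rates[field] = matched / len(samples) if samples else 0.0
        return rates


class DeterminationModule:
    """Sketch of module 42: applies a preset analysis rule and labels
    the data source."""

    def __init__(self, rule: Callable[[Dict[str, float]], bool]) -> None:
        self.rule = rule

    def determine(self, rates: Dict[str, float]) -> str:
        return "credible" if self.rule(rates) else "untrustworthy"


def analyze_source(
    pool: Dataset,
    source: Dataset,
    fields: List[str],
    preset_size: int,
    rule: Callable[[Dict[str, float]], bool],
) -> str:
    """Chain the three modules: acquire, analyze, determine."""
    samples = AcquisitionModule(preset_size, fields).acquire(pool)
    rates = AnalysisModule().match_rates(samples, source, fields)
    return DeterminationModule(rule).determine(rates)
```

Passing the rule in as a callback keeps the determination step independent of which preset analysis rule is chosen, so either the field-count rule or the key-field rule described above could be supplied.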

The above product can execute the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described exhaustively in this embodiment, refer to the method provided by any embodiment of the present invention.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them; it may include further equivalent embodiments without departing from the inventive concept, and its scope is determined by the appended claims.

Claims

1. A credibility analysis method for user data, comprising:

S10: The server obtains a preset number of user sample data, and obtains multiple field data corresponding to the user sample data;

S11: Match each field data of the user data of the user data source to be analyzed with the corresponding field data of the user sample data one by one, and calculate the matching rate of each field data of the user data source to be analyzed and the user sample data; and

S12: Determine whether the user data source to be analyzed is a credible user data source according to the statistical matching rate of each field data of the user data source to be analyzed and according to preset analysis rules.

2. The method according to claim 1, wherein step S12 specifically comprises:

determining and counting the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate; and, if the counted number of fields is greater than a preset number, determining that the user data source to be analyzed is a credible user data source and adding a credible identifier; or

determining the fields of the user data source to be analyzed whose matching rate is greater than the preset matching rate; analyzing whether the determined fields include all of the predetermined key fields; and, if they do, determining that the user data source to be analyzed is a credible user data source and adding a credible identifier.

3. The method according to claim 1, wherein the field data includes any one or a combination of name, ID number, age, address, occupation, income, office address, deposit amount.

4. The method according to claim 2, wherein the preset matching rate is 99%.

5. The method according to claim 1, further comprising the steps of:

Add an untrusted flag for untrusted user data sources.

6. A credibility analysis system for user data, configured in a server, comprising:

An acquisition module, configured to acquire a preset number of user sample data, and acquire a plurality of field data corresponding to the user sample data;

An analysis module, configured to match each field data of the user data of the user data source to be analyzed with the corresponding field data of the user sample data one by one, and calculate the matching rate of each field data of the user data source to be analyzed and the user sample data; and

The determining module is configured to determine whether the user data source to be analyzed is a credible user data source according to the statistical matching rate of each field data of the user data source to be analyzed and according to preset analysis rules.

7. The system according to claim 6, wherein the determining module is specifically used for:

8. The system according to claim 6, wherein the field data includes any one or a combination of name, ID number, age, address, occupation, income, office address, deposit amount.

9. The system according to claim 7, wherein the preset matching rate is 99%.

10. The system according to claim 6, wherein the determining module is further configured to add an untrustworthy identification to untrustworthy user data sources.