Combat ‘fake news’ in social & legacy information dissemination networks leveraging Trustchain & AI

Roy Saurabh
10 min read · Apr 25, 2020

The flood of misinformation, disinformation, propaganda, deepfakes, and post-truth content (referred to as ‘fake news’ in the rest of the document [1]) raises unprecedented questions about the role of legacy and digital/social media in modern societies. Because it spreads rapidly and widely, digitally amplified ‘fake news’ not only carries an individual and societal cost but can also lead to significant economic losses, political ramifications, or risks to national security.

“‘Fake information’ has emerged as a global topic of concern and there is a risk that efforts to counter it could lead to censorship, the suppression of critical thinking and other approaches contrary to human rights law.”

Mr. David Kaye, Office of the UN High Commissioner for Human Rights (OHCHR)[2]

As information is shared across virtual spaces, it is often stripped of its original context, opening the door to misinterpretation, misrepresentation, and false context as it propagates. As users consume more information on social platforms and in messaging apps, they are left without the tools to distinguish true from fake. Because manual verification is complex, laborious, and time-consuming, we need blockchain and AI solutions to help counter fake news. ‘Fake news’ is a complex and growing problem, accelerated by new technologies and exacerbated by nefarious players who aim to deliberately mislead.

Blockchain to capture the chain of custody of information

Because a blockchain is traceable and transparent, it becomes possible to verify the authenticity of information or its sources and to build trust in the information displayed on the Internet. A blockchain allows content to be produced and distributed over the Internet in an immutable and secure way.

Fig. 1: Trustchain’s key capabilities to combat ‘fake news’ leveraging AI & Blockchain (Public/Private)

We propose “Trustchain”, which leverages blockchain because its data structure can help maintain a transparent and immutable record of a piece of information’s origins: when, where, and by whom the information was created, who published it, and how it has been used across an information dissemination network (social or legacy). Blockchains are well suited to establishing integrity and authenticity because they are, essentially, an immutable database technology with inbuilt trust mechanisms. The source of a piece of information can be traced through the timestamp recorded in each block and the hash links that connect one block to the next. We therefore propose a blockchain-based approach to decentralized, distributed storage for tracing the origin of a piece of information.
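As a rough illustration of how timestamps and block links can record a chain of custody, the sketch below builds a tiny hash-linked list in Python. It is not the Trustchain implementation: the `Block` fields, the `append_record` and `chain_is_intact` helpers, and the all-zero genesis hash are assumptions made for this example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List

GENESIS_HASH = "0" * 64  # assumed placeholder link for the first block


@dataclass(frozen=True)
class Block:
    index: int
    timestamp: float   # when the record was created
    creator: str       # who created or published the information
    content_hash: str  # SHA-256 digest of the information itself
    prev_hash: str     # digest of the previous block (the chain link)

    def digest(self) -> str:
        """Hash every field, so changing any of them changes the digest."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_record(chain: List[Block], creator: str, content: str,
                  timestamp: float) -> List[Block]:
    """Return a new chain with one more provenance record at the end."""
    prev_hash = chain[-1].digest() if chain else GENESIS_HASH
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    block = Block(len(chain), timestamp, creator, content_hash, prev_hash)
    return chain + [block]


def chain_is_intact(chain: List[Block]) -> bool:
    """Check that each block points at the digest of the block before it."""
    expected = GENESIS_HASH
    for block in chain:
        if block.prev_hash != expected:
            return False
        expected = block.digest()
    return True
```

Editing any block that already has a successor breaks the link stored in that successor, which is what makes the history tamper-evident. The newest block has no successor yet, so in a real network its digest would have to be agreed on by the participants through consensus.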

Fig.2: ‘Trustchain’ — Feature map of the proposed blockchain solution

Blockchains also have the ability to execute smart contracts, which are verifiable scripts that automate a system’s ruleset. In essence, then, blockchains are a trusted ledger capable of running application logic. Furthermore, they cannot be controlled by any single entity. Those mechanisms mean we can use a blockchain to record data about our information resources and any entity that views those records will be satisfied that the information conveyed is authentic.

Technical Architecture

We propose a modular, extensible, and interoperable digital architecture, with an emphasis on highly secure solutions, a token-agnostic approach, and a rich, easy-to-use Application Programming Interface (API).

The Trustchain architecture would have the following modular components:

  • Consensus Layer [3]— Responsible for generating an agreement on the order and confirming the correctness of the set of transactions that constitute a block. (Ref: Hyperledger Architecture Consensus Paper)
  • Smart Contract Layer — Responsible for processing transaction requests and determining if transactions are valid by executing trust logic.
  • Communication Layer — Responsible for peer-to-peer message transport between the nodes that participate in a shared ledger instance.
  • Data Store Abstraction — Allows different data-stores to be used by other modules.
  • Crypto Abstraction — Allows different crypto algorithms or modules to be swapped out without affecting other modules.
  • Identity Services — Enables the establishment of a root of trust during the setup of a blockchain instance, the enrollment and registration of identities or system entities during network operation, and the management of changes like drops, adds, and revocations. Also, it provides authentication and authorization.
  • Policy Services — Responsible for policy management of various policies specified in the system, such as the endorsement policy, consensus policy, or group management policy. It interfaces and depends on other modules to enforce the various policies.
  • APIs — Enables clients and applications to interface to blockchains.
  • Interoperation — Supports the interoperation between different blockchain instances
Fig. 3: The proposed architecture of the blockchain-based information verification framework has three components: (1) Broadcast Protocol; (2) Smart Contract for Information; and (3) The Info Trustchain.

Broadcast Protocol

  1. Information Broadcasting: The information creation smart contract is used to publish information on the network. It can be invoked by the account(s) of verified users willing to publish any information by giving their public key and digitally signed information. The smart contract will store the relevant information such as verified user’s name, status, public key, time stamp, and information string in a structure and will broadcast the information instance to the P2P network.
  2. Information Integrity: Ensuring the integrity of the information posted by two different information broadcasting entities about a specific topic is another challenge. A simple hash-based approach is not suitable for ensuring information credibility due to its lack of robustness.[4] We propose the use of the semantic similarity of the information posted by two or more different information outlets. The semantic similarity can be used by the system to gauge the integrity of an informative article by checking for information on the blockchain (i.e., whether it has been posted by a verified information source or not). This semantic similarity index can be measured by a tool such as word embeddings or other advanced NLP methods through which contextual similarity can be computed between words and across documents.
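To make the semantic-similarity idea concrete, here is a minimal sketch that compares two articles by the cosine similarity of their word-count vectors. Bag-of-words counts are used only as a simple stand-in for the word embeddings mentioned above, and the function names (`term_vector`, `cosine_similarity`, `best_verified_match`) are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Iterable


def term_vector(text: str) -> Counter:
    """Lower-case the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between two word-count vectors, in [0, 1]."""
    va, vb = term_vector(a), term_vector(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no tokens
    return dot / (norm_a * norm_b)


def best_verified_match(article: str, verified: Iterable[str]) -> float:
    """Highest similarity between an article and any verified article."""
    return max((cosine_similarity(article, v) for v in verified), default=0.0)
```

Unlike a cryptographic hash, which changes completely when a single word is reordered, this score stays high for reworded versions of the same story. A threshold on `best_verified_match` could then flag articles with no close counterpart from a verified source; choosing that threshold is outside the scope of this sketch.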

Smart Contract

In general, the smart contract layer works very closely with the consensus layer. Specifically, the smart contract layer receives a proposal from the consensus layer. This proposal specifies which contract to execute, the details of the transaction including the identity and credentials of the entity asking to execute the contract, and any transaction dependencies. The smart contract layer uses the current state of the ledger and input from the consensus layer to validate the transaction.

While processing the transaction, the smart contract layer uses the identity services layer to authenticate and authorize the entity asking to execute the smart contract. This ensures two things: that the entity is known on the blockchain network, and that the entity has the appropriate access to execute the smart contract. Identity can be provided through several methods: simple key-based identities, identities and credentials managed through the ledger, anonymous credentials, or managed identity services from an external certificate authority.

Fig. 4: Smart contracts’ interaction with other architectural layers (Ref: Hyperledger Architecture Consensus Paper)

Registration Smart Contract

The system has an existing mapping of public keys that are in use by various information media organizations, which can be used to verify their real-world identities. If there is no such key for a specific information entity, the system can search the web through third-party APIs. Each time an information entity wants to register, the system verifies its identity by asking it to sign a message with the private key that corresponds to its existing public key. If the verification succeeds, the information entity is given a verified status; otherwise, it can still publish, but only as a non-verified information publisher. In either case, the system assigns the entity being enrolled a private/public key pair that will be used for the digital signature scheme.
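The registration flow described above can be sketched as a challenge-response check. This is only an illustration under stated assumptions: `RegistrationContract`, `Publisher`, and `system_key_id` are hypothetical names, the signature check is injected as a callable rather than tied to a specific scheme, and a random identifier stands in for the key pair the system would issue.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict

# Assumed signature check: (public_key, message, signature) -> bool.
# A real deployment would plug in an actual scheme such as ECDSA or Ed25519.
SignatureVerifier = Callable[[str, bytes, bytes], bool]


@dataclass
class Publisher:
    name: str
    status: str         # "verified" or "non-verified"
    system_key_id: str  # placeholder for the key pair issued at enrolment


class RegistrationContract:
    def __init__(self, known_keys: Dict[str, str], verify: SignatureVerifier):
        self.known_keys = dict(known_keys)  # outlet name -> known public key
        self.verify = verify
        self.publishers: Dict[str, Publisher] = {}

    def new_challenge(self) -> bytes:
        """Random message the applicant must sign with its private key."""
        return secrets.token_bytes(32)

    def register(self, name: str, challenge: bytes, signature: bytes) -> Publisher:
        public_key = self.known_keys.get(name)
        proven = public_key is not None and self.verify(public_key, challenge, signature)
        # Applicants that cannot prove their identity may still publish,
        # but only with non-verified status.
        status = "verified" if proven else "non-verified"
        publisher = Publisher(name, status, secrets.token_hex(16))
        self.publishers[name] = publisher
        return publisher
```

Using a fresh random challenge for every registration attempt means that a signature captured from an earlier attempt cannot simply be replayed.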

Update Identity Smart Contract

Any registered information publisher can update its identity and is allowed to obtain multiple identities (for example, an information channel may want to publish sports coverage under a different handle). To get another identity in the form of a public/private key pair, a registered information publisher must first prove its previous identity (i.e., the public key it previously registered on the system). The update identity smart contract handles such requests.

Revoke Identity Smart Contract

The revoke identity smart contract terminates the identity of an existing information publisher, either at the publisher’s own request or when the system has identified that the publisher has behaved anomalously over a specified period of time. [5] Anomalous behavior is detected by computing a reputation score that quantifies the credibility of a publisher.

Evolvable Reputation Set

Our system maintains a reputation set, which contains authentic and credible information sources. To make this set evolvable, we assign each non-verified user an initial reputation score based on its publicly available profile data and allow this score to evolve over time as the entity shares true information or information posted by verified users. If a non-verified information outlet reaches a specified reputation score within a given time period, it is given the status of a verified source; if it instead spreads fake news and false information, its identity is revoked at the end of that period. Another way to maintain an evolvable set is to collect feedback from the consumers of the information, but this introduces subjectivity and bias and also exposes the system to malicious actors.
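One possible reading of this scoring scheme is sketched below. The reward, penalty, and threshold values are arbitrary assumptions chosen for illustration, as are the names `ReputationEntry`, `update_score`, and `close_period`; the article does not specify how scores are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReputationEntry:
    score: float
    status: str = "non-verified"  # "non-verified", "verified", or "revoked"


def update_score(entry: ReputationEntry, shared_true_information: bool,
                 reward: float = 1.0, penalty: float = 2.0) -> ReputationEntry:
    """Raise the score for true information, lower it for false information."""
    delta = reward if shared_true_information else -penalty
    return ReputationEntry(entry.score + delta, entry.status)


def close_period(entry: ReputationEntry, promote_at: float = 10.0,
                 revoke_below: float = 0.0) -> ReputationEntry:
    """At the end of the evaluation period, promote or revoke the outlet."""
    if entry.status != "non-verified":
        return entry  # only non-verified outlets are re-evaluated here
    if entry.score >= promote_at:
        return ReputationEntry(entry.score, "verified")
    if entry.score < revoke_below:
        return ReputationEntry(entry.score, "revoked")
    return entry
```

For example, an outlet that starts at 8.0 and shares two true items reaches 10.0 and is promoted when the period closes, while an outlet that starts at 1.0 and shares one false item drops to -1.0 and is revoked.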

Info Trustchain Consensus Protocol

Consensus in Trustchain, which is based on Hyperledger, is broken into three phases: Endorsement, Ordering, and Validation.

  • Endorsement is driven by a policy that specifies which participants must endorse a transaction.
  • The ordering phase accepts the endorsed transactions and agrees on the order in which they are committed to the information ledger.
  • Validation takes a block of ordered transactions and validates the correctness of the results, including checking endorsement policy and double-spending.
Fig. 5: Possible transaction flow (common-case path) in permissioned blockchain (courtesy: Hyperledger)
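The three phases can be illustrated with a toy, single-process model. This is not Hyperledger Fabric's API: the `Transaction` fields, the "at least `required` authorised endorsers" policy, and the version check used to catch conflicting updates (a stand-in for the double-spending check) are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    key: str                   # ledger key the transaction updates
    read_version: int          # version of the key seen at endorsement time
    new_value: str
    endorsers: FrozenSet[str]  # peers that endorsed the transaction


def satisfies_policy(tx: Transaction, authorised: FrozenSet[str],
                     required: int) -> bool:
    """Endorsement policy: at least `required` authorised endorsers."""
    return len(tx.endorsers & authorised) >= required


def order(transactions: List[Transaction]) -> List[Transaction]:
    """Ordering: fix one sequence for everyone (here, arrival order)."""
    return list(transactions)


def validate_and_commit(block: List[Transaction], ledger: Dict[str, str],
                        versions: Dict[str, int],
                        authorised: FrozenSet[str], required: int) -> List[str]:
    """Validation: re-check the policy and reject stale (conflicting) updates."""
    committed = []
    for tx in block:
        if not satisfies_policy(tx, authorised, required):
            continue  # not enough valid endorsements
        if versions.get(tx.key, 0) != tx.read_version:
            continue  # key changed since endorsement: conflicting update
        ledger[tx.key] = tx.new_value
        versions[tx.key] = tx.read_version + 1
        committed.append(tx.tx_id)
    return committed
```

If two transactions update the same key from the same starting version, only the one ordered first is committed; the second is rejected at validation because the version it read is no longer current.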

Trustchain supports a pluggable consensus service for all three phases. Applications may plug in different endorsement, ordering, and validation models depending on their requirements. In particular, the ordering service API allows plugging in Byzantine fault tolerance (BFT)-based agreement algorithms.[6]

The ordering service API consists of two basic operations: broadcast and deliver.[7]

  • broadcast: a client calls this to broadcast an arbitrary message blob for dissemination over the channel. This is also called request in the BFT context when sending a request to a service.
  • deliver: the ordering service calls this on the peer to deliver the message blob with the specified non-negative integer sequence number and hash of the most recently delivered blob.
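The two operations can be mimicked with a small in-memory class. This is a sketch only: `ToyOrderingService`, `ToyPeer`, and `register_peer` are hypothetical names, everything runs in one process with no fault tolerance, and the all-zero hash passed with the first blob is an assumption.

```python
import hashlib
from typing import Callable, List, Tuple

GENESIS_HASH = "0" * 64  # assumed "previous hash" for the very first blob

# deliver(sequence_number, hash_of_previous_blob, blob)
DeliverFn = Callable[[int, str, bytes], None]


class ToyOrderingService:
    """Assigns every broadcast blob a sequence number and delivers it to all peers."""

    def __init__(self) -> None:
        self._peers: List[DeliverFn] = []
        self._seq_no = 0
        self._prev_hash = GENESIS_HASH

    def register_peer(self, deliver: DeliverFn) -> None:
        self._peers.append(deliver)

    def broadcast(self, blob: bytes) -> None:
        """Called by a client; the blob is opaque to the ordering service."""
        for deliver in self._peers:
            deliver(self._seq_no, self._prev_hash, blob)
        self._prev_hash = hashlib.sha256(blob).hexdigest()
        self._seq_no += 1


class ToyPeer:
    def __init__(self) -> None:
        self.log: List[Tuple[int, str, bytes]] = []

    def deliver(self, seq_no: int, prev_hash: str, blob: bytes) -> None:
        """Called by the ordering service, once per blob, in sequence order."""
        self.log.append((seq_no, prev_hash, blob))
```

Because every peer receives the same blobs with the same sequence numbers and previous-blob hashes, all peers end up with identical logs. A BFT-based ordering service aims to preserve that property even when some ordering nodes misbehave.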

Conclusion

Progress in artificial intelligence (AI) techniques for customizing, disseminating, and generating content, together with the increasingly digital habitat of human lives, has created situations in which sensationalized fake content and misinformation take on a life of their own, while establishing the authenticity of the truth and differentiating reality from fakes is becoming increasingly daunting. This is already creating a number of imposing technical, legal, and ethical challenges at a rate for which we are not prepared. Blockchain, being a decentralized ledger technology, promises to bring transparency and trust to this new “post-truth” world by enabling features such as smart contracts, decentralized consensus, reputation management, and tamperproof authentication. In this article, we have introduced the modern phenomenon of realistic-looking fake content and focused specifically on fake news. To address the ‘fake news’ problem, we have proposed a Trustchain-based framework for the detection and mitigation of fake news and have described a high-level architecture of our solution.

References

[1] First Draft. (n.d.) Information Disorder: The Definitional Toolbox https://firstdraftnews.org/latest/infodisorder-definitional-toolbox/

[2] United Nations. (n.d.) Freedom of Expression Monitors Issue Joint Declaration on ‘Fake News’, Disinformation and Propaganda https://www.ohchr.org/en/NewsEvents/Pages/DisplayNews.aspx?NewsID=21287&LangID=E

[3] Hyperledger. Hyperledger Architecture Positioning Paper Volume 1: Introduction to Hyperledger Business Blockchain Design Philosophy and Consensus (covers permissioned blockchain consensus and the various approaches within Hyperledger).

[4] Steve Huckle and Martin White. Fake news: a technological approach to proving the origins of content, using blockchains. Big Data, 5(4):356–371, 2017.

[5] A. Qayyum, J. Qadir, M. Janjua and F. Sher, “Using Blockchain to Rein in the New Post-Truth World and Check the Spread of Fake News” in IT Professional, vol. 21, no. 04, pp. 16–24, 2019. doi: 10.1109/MITP.2019.2910503

[6] J. Sousa, A. Bessani and M. Vukolic, “A Byzantine Fault-Tolerant Ordering Service for the Hyperledger Fabric Blockchain Platform,” 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Luxembourg City, 2018, pp. 51–58.

[7] Hyperledger Fabric. (n.d.) Platform for distributed ledger solutions https://hyperledger-fabric.readthedocs.io/en/release/arch-deep-dive.html

Roy Saurabh
Lead data Scientist@ CRI Paris| Affective Computing in Stress Management| Views are my own (except the ones generated using LSTM RNNs :)).