Watermarking Source Code for Copyright and Intellectual Protection

June 25, 2020

By: Shivaramakrishnan Iyer, Technology Partner, Canvas by LTIMindtree. GTO

In the past, the ease of downloading or cloning open-source code and then modifying it encouraged software development without proper authorization. At most, developers and organizations acknowledged the copyright of the open-source project. The obligation to comply with licenses such as the GNU GPL, Creative Commons, FreeBSD, and MIT is often an afterthought, which exposes organizations to penalties when they use or modify the software for commercial purposes.

Organizations today want to protect the copyright and intellectual property in their source code, because it is labor-intensive to produce and something their developers take pride in.

Watermarking code
Ownership of source code by a developer or a company can be established through a technique called source code watermarking. Let’s understand how to leverage it to protect copyright and intellectual property rights.

How it works
Source code watermarking consists of embedding a unique identifier, known as a watermark, within the source code to prove the author’s original ownership and to deter copyright violation. The core idea, for both implementation and business relevance, is to hide essential information that uniquely identifies the author or owner in such a way that it cannot be detected easily.
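
As an illustration of what such a unique identifier could look like, here is a minimal sketch in Python. It assumes the owner holds a secret key and derives the watermark as a truncated HMAC-SHA256 of an owner string. The function names, the owner string, and the choice of HMAC are assumptions made for this example, not a prescribed method.

```python
import hashlib
import hmac


def make_watermark(owner: str, secret_key: bytes, length: int = 16) -> str:
    """Derive a short hex identifier that only the key holder can reproduce."""
    digest = hmac.new(secret_key, owner.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def verify_watermark(candidate: str, owner: str, secret_key: bytes) -> bool:
    """Check whether a recovered identifier matches the claimed owner."""
    if not candidate:
        return False
    expected = make_watermark(owner, secret_key, len(candidate))
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))


# Hypothetical owner string and key, for illustration only.
watermark = make_watermark("Example Corp, 2020", b"keep-this-key-private")
print(verify_watermark(watermark, "Example Corp, 2020", b"keep-this-key-private"))
```

Because the identifier looks like an arbitrary hex string, it does not reveal the owner by itself; ownership is demonstrated by reproducing it with the secret key.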

Characteristics of Watermarking
Source code watermarking has four principal characteristics. Not all of them can be achieved together; which ones to prioritize depends on the specific application, domain, and use case:

Fidelity
Source code may be publicly hosted (on GitHub, for example), shared under specific authorizations and permissions, or, more importantly, used without authorization. The watermark should therefore persist in compiled code and bytecode, and should survive reverse engineering, whether the code is altered intentionally or unintentionally.
Confidentiality
The author’s information or copyright watermark must remain hidden from unauthorized detection. The output of the watermarking process, often referred to as the ‘payload’, should remain hidden as well.
Unobtrusive
The watermark should be unobtrusive: it should not be noticeable when the program’s outcome is compared with that of the unmarked code, and it should not noticeably change that outcome.
Capacity
This is characterized by the number of bits that can be encoded in a given time period. As a rule of thumb, the higher the better.

Techniques of watermarking
Static watermarking – the watermark is stored in the source itself, typically as ‘dead code’ or as comments in which developers specify the author, date, ownership, and references to licenses. Static watermarks can exist in C code as well.
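
A static watermark of the dead-code kind can be sketched as follows. Python is used here for brevity; the same pattern should carry over to C with an unused function or string constant. The marker prefix, the function names, and the sample identifier are all hypothetical, and a real scheme would disguise the marker rather than use a recognizable prefix.

```python
import re
from typing import Optional

WATERMARK_PREFIX = "wm:"  # hypothetical marker, chosen for this example


def embed_static_watermark(source: str, watermark: str) -> str:
    """Append a never-called function whose body carries the watermark."""
    dead_code = (
        "\n\ndef _unused_helper():\n"
        f'    return "{WATERMARK_PREFIX}{watermark}"\n'
    )
    return source.rstrip("\n") + dead_code


def extract_static_watermark(source: str) -> Optional[str]:
    """Scan source text for the marker and return the hex identifier, if any."""
    match = re.search(re.escape(WATERMARK_PREFIX) + r"([0-9a-f]+)", source)
    return match.group(1) if match else None


original = "def add(a, b):\n    return a + b\n"
marked = embed_static_watermark(original, "3fa9c1d2")
print(extract_static_watermark(marked))
```

The marked program behaves exactly like the original because the added function is never called. The trade-off is that anyone who reads the source can find and delete it, which is why static watermarks are the easiest kind to remove.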

Dynamic watermarking – the watermark is hidden within the source code and can be extracted only at runtime. Depending on the complexity required, it can be layered within the user experience layer, the application layer, or the data/persistence layer.
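
The runtime-only extraction described above can be illustrated with a deliberately simplified Python sketch. It assumes the watermark is shipped in masked form and is reconstructed only when the program is given a secret trigger input. The function names, the trigger, the XOR masking with a SHAKE-256 key stream, and the 4-byte checksum are all assumptions made for this example; production-grade dynamic watermarks are considerably more elaborate.

```python
import hashlib
from typing import Optional

TAG_LEN = 4  # checksum bytes appended so that a wrong trigger can be rejected


def _key_stream(trigger: str, n: int) -> bytes:
    """Derive n pseudo-random bytes from the secret trigger."""
    return hashlib.shake_256(trigger.encode("utf-8")).digest(n)


def mask_watermark(watermark: str, trigger: str) -> bytes:
    """Build-time step: produce the masked bytes that ship with the program."""
    data = watermark.encode("utf-8")
    payload = data + hashlib.sha256(data).digest()[:TAG_LEN]
    stream = _key_stream(trigger, len(payload))
    return bytes(a ^ b for a, b in zip(payload, stream))


def reveal_watermark(masked: bytes, trigger: str) -> Optional[str]:
    """Runtime step: recover the watermark only if the trigger is correct."""
    if len(masked) <= TAG_LEN:
        return None
    stream = _key_stream(trigger, len(masked))
    payload = bytes(a ^ b for a, b in zip(masked, stream))
    data, tag = payload[:-TAG_LEN], payload[-TAG_LEN:]
    if hashlib.sha256(data).digest()[:TAG_LEN] != tag:
        return None  # wrong trigger: the unmasked bytes fail the checksum
    return data.decode("utf-8")


# Hypothetical watermark and trigger, for illustration only.
masked = mask_watermark("Example Corp 3fa9c1d2", "secret trigger input")
print(reveal_watermark(masked, "secret trigger input"))
print(reveal_watermark(masked, "ordinary input"))
```

Nothing in the shipped bytes reads as the owner's name; the watermark appears only when the program runs with the trigger, which is what separates this approach from a static comment or dead-code marker.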

Benefitting developers and organizations alike
Source code watermarking is thus useful for protecting the author’s copyright and for tamper-proofing, which maintains the integrity of the code. However, very few organizations and developers currently adopt it, because incorporating the watermark and then passing the full test suite adds overhead. It is nevertheless needed in today’s competitive world, where product innovation and disruption, along with open-source collaboration, form the pivot of an organization’s strategy.

For developers, the sense of ownership makes it easy to adopt. For organizations, preventing intellectual property and copyright violations at the cost of marginal overhead in supplementary controls makes it a no-brainer.

Shivaramakrishnan Iyer

Technology Partner, Canvas by LTIMindtree. GTO

Shiva had 26+ years of cross-domain global industry experience across banking and financial services, media and entertainment, defense, education, e-governance, and healthcare. For LTIMindtree, Shiva led several technology and architecture advisory and enablement engagements for marquee clients. He has co-authored LTIMindtree’s enterprise architecture framework, technology and architecture roadmap for strategic programs, and technology and architecture assessment frameworks for CTOs.
