Creativity, copyright law and AI - a denialist takes stock

editorial, May 2025

In one form or another, for the longest time, it has been important for me to create – any sustained expenditure of time must culminate in something tangible, something demonstrable, and in its absence, the expenditure of time is wasteful. To be sure, most of what I’ve created is not the very embodiment of novelty or an expression of any exemplary ingenuity – but I know a little about what it’s like to go from a blank slate to something more.

When I was a seven-year-old kid who enjoyed playing pretend teacher, it meant I had to have a notebook that comprised (imaginary, of course) student profiles, lesson notes and exam schedules. When I was a teenager who fell in love with cricket, I had to have an online repository that memorialized the best matches. The first couple of months I spent reading poetry, I handwrote a hundred pages of notes, and in the next few years, I created more than 90 videos. In fact, if someone took inventory of just everything I’ve written, printed, designed, bound and stowed away, they’d be able to make a reasonable estimate of everything I’ve read, studied, enjoyed and valued in this lifetime.

And this may speak to my (vain?) belief that the fragments of my mind merit being brought alive, but as I’ve grown up, I’ve come to recognize that this self-imposed diktat stems from a deep fascination for the creative “process”, if it can be called that: the path from a single rogue thought of unknown origin, to a nebulous visualization of the end product, to the conjured end product itself. Now, this process is not undemanding. While I might be able to identify point A (trigger), B (idea) and C (expression), there’s no assurance that I’ll hit all of them, or that I’ll do it without entirely losing my mind or resolve. Robert Frost’s After Apple-Picking said it best (the apple-picking serves as a metaphor for the creative process) –

For I have had too much
Of apple-picking: I am overtired
Of the great harvest I myself desired.
There were ten thousand thousand fruit to touch,
Cherish in hand, lift down, and not let fall.

One can see what will trouble
This sleep of mine, whatever sleep it is.

[Robert Frost, After Apple-Picking]

It is a weary and humbling feeling, to be imprisoned by the machinations of an overzealous mind. And yet, that struggle is entirely worth it, because there is also the deep love – for the art, for the language, for the sport, and for the subject – that you feel, drown in and cherish, every single time.

Naturally, then, I adopted – and have maintained – a most critical stance against generative artificial intelligence. As stingy as I am with liking posts on social media, I have recorded one for nearly every anti-AI post that’s crossed my timeline. (“let my digital legacy speak to my hate!”) My opposition does not come from a place of ill-informed resistance to change. As someone reasonably cognizant of both mankind’s potential to seismically change the way we live our lives, and its indefatigable urge to find an easier, simpler, shorter way out of everything, I don’t nurture any doubts that artificial intelligence – and all of its advanced mutants to come – will change the way the world works. I’m still the person who wrote this, after all:

[…] . What perhaps helped generative AI capture the imagination of millions was that it seemed to have passed the famed Turing Test. The test is a simple enough proposition – if a machine could deceive a human into thinking it was another human, it was intelligent. And nothing can be more deceptive than the ability to generate content where there was previously none. It is also this ability that makes the technology undisputedly horizontal in its application.
[Me!]

A denialist is defined in Merriam-Webster as a “person who denies the existence, truth, or validity of something despite proof or strong evidence that it is real, true, or valid.” And so, if it is seen that there are multiple “truths” to generative AI as it is today, it is only two that I oppose: first, the commodification of creativity that generative AI models are being perceived to enable, and second, the irresponsible (and illegal?) trawling of creative output it takes to build these models. This essay will address both of these: in the first part, I’ll primarily use the cited entry in the Stanford Encyclopedia of Philosophy to discuss why AI-enabled output can never constitute a product of the creative process (the entry also undertakes the very exercise, but I can attest to the honesty of my attempt – whatever that attestation may be worth in an era in which de minimis deception is becoming accepted), and in the second, I’ll appraise extant copyright law.

—————————

At the outset, it would be pertinent to refresh understanding of a generative AI model. It has been trained on data sets that are bigger than the extent of man’s imagination, to look for, replicate and extrapolate patterns. Through this, it is able to achieve human-like output.

“The apparatus of a large language model really is remarkable. It takes in billions of pages of writing and figures out the configuration of words that will delight me just enough to feed it another prompt.”
(Malesic, 2025)

Merriam-Webster keeps it simple in its definition – creativity is the ability to create. The American Psychological Association defines it as the ability to produce or develop original work, theories, techniques, or thoughts. An article written for the British Psychological Society puts forward what it calls the standard definition – creativity is the production of ideas which are both novel and useful. Each of these definitions builds on the one before it – it is not sufficient to create, what you create must be original, and it is not sufficient to be original, it must also be useful. (That last limb I don’t agree with – if you asked me, we spend enough of our lives being evaluated against an “output” benchmark without it also constraining the joy of creating.) At surface level, it would not be a reach for someone to contend that generative AI easily passes muster for these three parameters – ability, originality (if you’re willing to look at it differently than regurgitation, which I am not), and utility (yes, even rage bait images count). But even the standard definition is a reductive distillation, and ignores other pre-requisites – and presuppositions we tend to make when perceiving creative output.

With or without the value condition, some theorists argue that a product must satisfy one or more further conditions, beyond being new, in order to count as creative. The four most prominent proposals are that the product must be (i) surprising, (ii) original (i.e., not copied), (iii) spontaneous, and/or (iv) agential. Each of these is a condition on the process of creativity.
(Stanford University, 2024)

I discuss these in detail:

Margaret Boden contends that a creative product must be new, surprising and valuable. She argues that while creativity may come about differently – sometimes by melding what already exists (combinatorial), sometimes within boundaries (exploratory – this is of particular interest for this essay, and I dive deeper – reluctantly so, I must add – at a later stage), and sometimes by redefining what is possible (transformative) – all of these forms elicit surprise upon their culmination. If one were to go by the dictionary definition of “surprising”, this simply requires that the result of a generative AI model not be what is expected. Yet prompt engineering is essentially antithetical to any form of surprise – you acquire the skill of manipulative semantics well enough, and spend enough time, and the LLM will write you the story exactly as you like.

The second prerequisite, of course, is “psychological novelty”, or originality – the content cannot be copied. In its current state of sophistication – and opacity as to its training data – this is easily achieved by generative AI. It consumes pre-existing creative output, not merely to duplicate, but in an earnest (?) attempt to replicate the path demonstrated by billions of human endeavours to go from Point A (trigger) to Point C (expression).

The third is spontaneity – you didn’t time that trigger to come to you while you were on the seventieth pavement tile back home, or in the ninth minute of your shower, or in the fourth fever dream of a restless night. It just did.

“An idea occurs spontaneously to the extent that it is produced without foresight or intentional control.”

“If you are going to act creatively, Gaut argues, you cannot set out to follow an “exact plan”—a mechanical procedure, routine, or algorithmic rule—which would give you advance knowledge of exactly what the outcome will be and exactly the means you'll take to achieve it.”
(Stanford University, 2024)

It is undisputed that none of generative AI’s “ideas”, or resultantly, expressions are organic, or spur-of-the-moment; they are merely engineered responses to (engineered) prompts – but there is still the question of that trigger. After all, at the end of the day, whether I sit down and labour through an essay, or leverage an LLM for it, what got me to do it is still spontaneous.

The fourth, and final, they say, is agency. And intent.

“Creative” is a term of praise, and we do not extend praise (or blame) for things that are not done by an agent, or for things that an agent doesn’t do in some sense intentionally.
(Stanford University, 2024)

Agency is defined by the Cambridge Dictionary to mean the ability to take or choose action – and maybe the limits of a fragmented, scattered human mind can result in quite the struggle in exercising said agency – but it is precisely the struggle against that lack of control that imbues the ultimate expression with spirit. While there is no dispute that an LLM does not have agency of its own, one question remains – is there struggle in merely prompting, and even then, is that level of intervention sufficient to qualify as “agency”?

As you can probably tell at this point, I’ve gotten myself into quite the circular mess – generative AI is original enough, but also not at all; its results can be unexpected, but not if you’re a good enough prompt engineer; its output is entirely prompted (and programmed), but the trigger that got you to open the chat is still psychologically unexplained and well… unprompted. Perhaps all of these are truths, and perhaps they can sit together – but when Socrates said:

“when poets produce truly great poetry, they do it not through knowledge or mastery, but rather by being divinely “inspired” by the Muses, in a state of possession that exhibits a kind of madness”
(Stanford University, 2024)

… maybe he had reason for it. As I see it, there are broadly two ways of looking at creative process. One, of course, is what I’ve already done – as an exercise requiring an acquirable skill that must check certain boxes to qualify as creative. This is also, in some sense, the premise of what Margaret Boden calls “exploratory creativity.” Paraphrasing: exploratory creativity emerges from a “conceptual space” (roughly a system comprising a set of basic elements (e.g., basic ideas or representations)), and within rules or “constraints.”

Boden argues that the elements as well as the operating rules of a conceptual space can be, and in some cases have been, captured in computer programs. She has used this point not only to argue that computers can be creative (…), but also to …

(Stanford University, 2024)

A dispute as to the existence or quality of “creativity” of an LLM against a person who holds this view would be entirely pointless – and/or messy, as I’ve (ably?) demonstrated. And so, I turn to the second, less facile and less schematic perspective – that creativity cannot be learned:

[Edward Young’s] idea is that originality emerges naturally from something implanted in us by nature, and it can only be hindered by learning. Young seems to think of learning as proceeding either through imitation or through the following of rules, and both, he thinks, are detrimental to originality.

(Stanford University, 2024)

The argument here, of course, is that for output to be beyond, and thereby creative, it cannot build on consciously installed infrastructure, whatever shape or form that might take. And maybe that is a theist acceptance of “blessings” – the idea that some people are born with it, and some aren’t – and in response to that I quote –

“You cannot reconcile creativeness with technical achievement. You may be perfect in playing the piano, and not be creative; you may play the piano most brilliantly, and not be a musician. You may be able to handle color, to put paint on canvas most cleverly, and not be a creative painter. You may create a face, an image out of a stone, because you have learned the technique, and not be a master creator. Creation comes first, not technique, and that is why we are miserable all our lives. […] You may be a good engineer, you may be a pianist, you may write in a good style in English or Marathi or whatever your language is, but creativeness is not found through technique. If you have something to say, you create your own style; […] When the joy is there, the technique can be built up from nothing; you will invent your own technique, you won’t have to study elocution or style. When you have, you see, and the very seeing of beauty is an art.”

(Krishnamurti, 1995)

And so, maybe, just maybe, what actually makes something creative is not the fact that it’s original or valuable or spontaneous or surprising or that the creator was a “genius” or talented, but the fact that it houses a tiny little piece of the creator himself, that he has so willingly given for the world to indulge in. Every story, poem, painting, song, if you search long enough, will reveal a sliver of a living, thinking being – and fundamentally, it is the fact that no two beings are alike that makes such a product “creative.”

And so, I submit, dear reader, that an LLM cannot be creative, because it cannot … well, be.

—————————

We’d be well served by starting this section with this extract from Britannica’s entry on copyright law:

“the ideal of artistic genius […] provides much of the moral force of modern copyright law.”

Law – this conscientiously curated body of doctrines that society has come to recognize as representative of the ideals of “consent, democracy and fair play” (Ryu, 2024) over the years – sees both the paroxysm of creative energy and the ensuing toil as meritorious and deserving of protection. (And maybe today that protection translating to monetization raises some question as to whether “true” creativity should result from extrinsic motivation, but that’s a discussion for another occasion.) And yet, law doesn’t – and potentially cannot – evolve as quickly as society responds to technology, and that’s left a nice few gaps for the LLM corporations to exploit.

Jurisdiction, they say – your law cannot extend to us, because we use your data outside of your borders, where our servers reside. Transformation, they say – we aren’t reproducing your content, we’re only learning from it. Content does not compete in the same market, they say – this I understand, because the person who’s using an LLM to write a book report is probably not also the person who’s reading through the entire book.

Copyright law is at a crucial juncture; courts and governments are approaching the issue with great trepidation – press too hard and LLM corporations will flock towards the countries that don’t, leaving your country behind in this famed race to robotopia, or do nothing and risk decimation of human intelligence as we know it. And to be sure, this isn’t some hyperbolic take: this extract is from the US Copyright Office’s report on Generative AI Training:

The stakes are high, and the consequences are often described in existential terms. Some warn that requiring AI companies to license copyrighted works would throttle a transformative technology, because it is not practically possible to obtain licenses for the volume and diversity of content necessary to power cutting-edge systems. Others fear that unlicensed training will corrode the creative ecosystem, with artists’ entire bodies of works used against their will to produce content that competes with them in the marketplace.

(US Copyright Office, May 2025)

In this section of this essay, I appraise extant Indian copyright law, in light of the ANI Media vs. OpenAI case sub judice at the Delhi High Court, with help from the US Copyright Office Report (supra).

Section 14 of the Copyright Act, 1957 (hereinafter “Act”) defines copyright as the exclusive right to do or authorise the doing of any of a specified list of acts in respect of a work or any substantial part thereof, which includes, inter alia,

to reproduce the work in any material form including the storing of it in any medium by electronic means;
to make any adaptation of the work.

Section 51 of the Act deems copyright to be infringed when any person, without a licence granted by the owner of the copyright does anything, the exclusive right to do which is by this Act conferred upon the owner of the copyright.

Section 52 makes several exceptions to infringement, which include, inter alia,

a fair dealing with any work, for the purposes of private / personal use, criticism / review, and reporting of current events and current affairs.
the transient or incidental storage of a work or performance purely in the technical process of electronic transmission or communication.

Prima facie, the storage of training data would appear to constitute a copyright infringement under Section 51 read with Section 14 of the Act. That LLMs are infringing reproduction rights appears to be a reasonably settled position:

The steps required to produce a training dataset containing copyrighted works clearly implicate the right of reproduction. Developers make multiple copies of works by downloading them; transferring them across storage mediums; converting them to different formats; and creating modified versions or including them in filtered subsets. In many cases, the first step is downloading data from publicly available locations, but whatever the source, copies are made—often repeatedly.

(US Copyright Office, May 2025)

OpenAI’s defence rests, not on non-infringement, but on the fair use principle in Section 52. However, unlike American law, which codifies the principles for determination of fairness of use, by listing these four factors:

(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

(2) the nature of the copyrighted work;

(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

(4) the effect of the use upon the potential market for or value of the copyrighted work.

[Section 107 of the 1976 Copyright Act.]

… Indian law is significantly narrower, permitting only personal research, review / criticism, and reportage as fair uses. Courts have, however, taken the liberty to place reliance on other legal literature – including American law – to adjudicate on fair use. Of specific interest in this case, as I gather from my reading, is the principle of transformativeness – OpenAI’s primary claim is that it is not using the data as is, and that therefore, it qualifies for a fair use exemption.

Delhi High Court in the case of University of Oxford vs. Narendra Publishing House held:

It, therefore, must receive a liberal construction in harmony with the objectives of copyright law. Section 52 of the Act only details the broad heads, use under which would not amount to infringement. Resort, must, therefore be made to the principles enunciated by the courts to identify fair use.

The purpose and manner of use of the questions found in the plaintiff's textbooks, by the defendants is thus different; additionally, in their books, missing in the plaintiff's works are the steps or process of problem solving. Thus, the defendants' works can be said to be ‘transformative’ […]

Delhi High Court in the case of University of Cambridge vs. B D Bhandari held:

If the guide book is different in character and not a mere substitute of the original work/textbook, it would be treated as transformative. However, the character must be substantially different and it is not sufficient that superficial changes are made with basic character of the textbook creeping in the guide book.

The fundamental premise for the “transformativeness” principle can be found in the next principle – that the derivative / impugned work does not compete with the original body of work in the market. Transformativeness goes back, also, at some level to the philosophical / psychological parameter of originality. And as I said, the sophistication of today’s LLMs and the vastness of their sources mean that it would be entirely implausible for the output of an LLM to be traced directly back to any constituent news report, editorial, author style, or book, singular or plural.

Unsurprisingly, this is also the US Copyright Office’s view:

In the Office’s view, training a generative AI foundation model on a large and diverse dataset will often be transformative. The process converts a massive collection of training examples into a statistical model that can generate a wide range of outputs across a diverse array of new situations.

(US Copyright Office, May 2025)

An LLM is not trying to sneak your painstakingly crafted, beautiful sentences into its own responses. It’s only trying to identify what makes the sentence beautiful. In that sense, it isn’t merely using and reproducing the sentence, it is dissecting the sentence in its pursuit to identify the “essence of linguistic expression.” (US Copyright Office, May 2025) Looked at that way, it would not be difficult for the Delhi HC to follow its own judgements – and yet, the stakes are significantly different.

Even for a Court that offered leeway for commercialised guidebooks, the commercial implications of scraping (i.e., the fourth principle) are too pressing to be dismissed. This is also where the burden of proof may prove cumbersome for ANI: demonstrating “clear harm which includes financial loss, for example - disruptions to its paywall revenue model, reduced subscriptions, or reputational damage” might prove a tall ask. But it is indisputable that a compassionate view of data laundering can lead us down a path where the creative ecosystem sees no economic incentive to exist.

Copyright underpins the success of our creative industries because it guarantees their economic and moral rights.
(Intellectual Property Office, UK, 17 December, 2024)

The US Copyright Office also recognizes this risk: a model that can produce substantially similar content can lead to lost sales, and even material that is only stylistically similar can dilute the market. If an LLM has been trained on a sufficiently wide spectrum of novels, nobody would go out and buy one anymore. Well, not nobody. Those of us raised on a rich diet of seeking out and appreciating the vagaries of the human brain still would.

Another argument observed in copyright litigations hinges on the idea-expression dichotomy. In the seminal R.G. Anand v Delux Films judgement, the Apex Court established several principles, including:

There can be no copyright in an idea, subject matter, themes, plots or historical or legendary facts and violation of the copyright in such cases is confined to the form, manner and arrangement and expression of the idea by the author of the copyright work.

Where the same idea is being developed in a different manner, it is manifest that the source being common, similarities are bound to occur. If the defendant’s work is nothing but a literal imitation of the copyrighted work with some variations here and there it would amount to violation of the copyright.

Where the theme is the same but is presented and treated differently so that the subsequent work becomes a completely new work, no question of violation of copyright arises.

[R.G Anand vs M/S. Delux Films & Ors on 18 August, 1978, 1978 AIR 1613, Supreme Court.] (Singh, 2020) (Panwar, 2025)

The crux, of course, is that copyright rests only in expressions, and not ideas or facts, and thereby, non-expressive use cannot constitute copyright infringement. This is how an LLM corporation makes its case for non-expressive use:

For example, Anthropic asserted that “[t]o the extent copyrighted works are used in training data, it is for analysis (of statistical relationships between words and concepts) unrelated to any expressive purpose of the work.”

Google stated that because training is a process for “deconstructing existing works for the purposes of modeling mathematically how language works,” it serves a different purpose than the “communicative, expressive purpose for which these works were created.”

(US Copyright Office, May 2025)

If you asked me, this is a flimsy argument buttressed by technical language – because how can an LLM ever be characterized as non-expressive? Sure, maybe the immediate purpose is statistical and non-expressive, but no circumvention of law is evaluated on the basis of its intermediate steps. You look at the end goal – and the end goal, clearly, here, is to express. And so, for one last time, I invoke the US Copyright Office:

We view this argument as mistaken. Language models are trained on examples that are hundreds of thousands of tokens in length, absorbing not just the meaning and parts of speech of words, but how they are selected and arranged at the sentence, paragraph, and document level—the essence of linguistic expression. Image models are trained on curated datasets of aesthetic images because those images lead to aesthetic outputs. Where the resulting model is used to generate expressive content, or potentially reproduce copyrighted expression, the training use cannot be fairly characterized as “non-expressive.”

(US Copyright Office, May 2025)

The Indian government has recently set up a committee to revisit and refresh intellectual property law. Any new legislation must be able to walk the fine line between encouraging investment in technology and not invoking the wrath of the creatives. Take the case of the UK government – it proposed an opt-out licensing model (an LLM would scrape by default, and if you didn’t like it, the onus was on you, the creative, to opt out), only to be stopped in its tracks (ish) by Sir Paul McCartney himself. And hey, when a Beatle isn’t happy ...

What also merits acknowledgement is how much has already happened – data sets have already been scraped, stored, learnt from, and employed. The eventual closure of these cases, along with new legislation, could either de-fog the path forward for this technology, or encourage disgruntlement to translate to litigation. Either way, you should get ring-side tickets.

—————————

I started this essay by quoting from After Apple-Picking by Robert Frost, a poem I hold particularly close. I end this by quoting ChatGPT’s “modern reimagining” of it:

I had a ladder leaning
since late September—
angled into sky like a half-finished question.
That old tree gave more than it should’ve.
Hands aching, the basket heavy,
boots muddy with hours.

I kept picking.
Even when the good ones blurred
with the half-bruise, the overripe,
I kept at it—like habit, like hunger.
You get used to the rhythm:
reach, twist, drop, sigh.

This was one of three interactions (when I hit a roadblock with tax law, to show to my mum the extent of indolence it encourages, and now) I’ve had with ChatGPT. And if I can find a way through the annoyance, I can probably admit to it being half-decent – it gets the point of the poem, it has just enough stylistic merit – and that’s what truly, truly scares me, a painfully honest creative. That one day, someone will read through my writing – my writing full of em-dashes that the LLMs have today hijacked for impersonation – and believe it to be AI-generated.

Because if Robert Frost doesn’t stand a chance, what chance do I have?

I’m neither a student of psychology, nor intellectual property law. There’s undoubtedly a lot more literature out there that dives deeper, wider into this premise than I did. But I’ve spent time thinking about this – sometimes angrily, sometimes sadly – and like I said, any sustained expenditure of time must culminate in something tangible, something demonstrable, and in its absence, the expenditure of time is wasteful.

And lastly, seeing as I’m only human: E & OE.

Bibliography

Stanford University. (2024, Spring). Creativity. Retrieved from Stanford Encyclopedia of Philosophy: https://plato.stanford.edu/entries/creativity/

Malesic, J. (2025, May 21). ChatGPT Is a Gimmick. Retrieved from The Hedgehog Review: https://hedgehogreview.com/web-features/thr/posts/chatgpt-is-a-gimmick

Krishnamurti, J. (1995). The Book of Life: Daily Meditations with Krishnamurti. HarperOne.

US Copyright Office. (May 2025). Copyright and Artificial Intelligence, Part 3: Generative AI Training (Pre-Publication Version).

Intellectual Property Office, UK. (17 December, 2024). Copyright and Artificial Intelligence. gov.uk.

Ryu, A. (2024). How Reasons Make Law. Oxford Journal of Legal Studies, Volume 44, Issue 1, Spring 2024, 133-155.

Singh, J. P. (2020). Evolution of Copyright Law: The Indian Journey. Indian Journal of Law and Technology.

Panwar, A. (2025, January 8). Generative AI and Copyright Issues Globally: ANI Media v OpenAI. Retrieved from Tech Policy Press: https://www.techpolicy.press/generative-ai-and-copyright-issues-globally-ani-media-v-openai/
