Copyright and Machine Learning IP


chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Copyright and Machine Learning IP

Post by chrisw »

Copyright.

1. Only Human Authors can have copyright. Copyright must involve creative acts.

As the famous "Monkey Copyright Case" shows, only humans are entitled to copyright. Copyright law recognizes the right of an author based on whether the work actually is an original creation, rather than based on whether it is unique; two authors may own copyright on two substantially identical works, if it is determined that the duplication was coincidental, and neither was copied from the other. Copyright is dependent on creative human activity.

2. Machine created works where the machine supports the creative process of the author, but is not, in itself, "creative".

Traditionally, the ownership of copyright in computer-generated works was not in question because the program was merely a tool that supported the creative process, very much like a pen and paper. Creative works qualify for copyright protection if they are original, with most definitions of originality requiring a human author. Most jurisdictions state that only works created by a human can be protected by copyright.

3. Machine generated works where both the machine and the author provide the "creativity".

AI systems can also generate new works protectable by copyright, such as new artwork or music. However, most copyright statutes do not yet clearly define who owns machine-generated works. For some such works it is currently contested whether the work was generated by a machine, and what role humans played in its creation.

4. Machine learning by neural nets.

But with the latest types of artificial intelligence, the computer program is no longer a tool; it actually makes many of the decisions involved in the creative process without human intervention. Artificial intelligence is already being used to generate works in music, journalism and gaming. These works could in theory be deemed free of copyright because they are not created by a human author.
A computer program developed for machine learning purposes has a built-in algorithm that allows it to learn from data input, and to evolve and make future decisions that may be either directed or independent. When applied to art, music and literary works, machine learning algorithms are actually learning from input provided by programmers. They learn from these data to generate a new piece of work, making independent decisions throughout the process to determine what the new work looks like. An important feature for this type of artificial intelligence is that while programmers can set parameters, the work is actually generated by the computer program itself – referred to as a neural network – in a process akin to the thought processes of humans.

5. Neural net learning, options for allocating copyright.

There are two ways in which copyright law can deal with works where human interaction is minimal or non-existent. It can either deny copyright protection for works that have been generated by a computer, or it can attribute authorship of such works to the creator of the program. Both options (denying copyright to work produced by a machine, and giving copyright protection to the author who made the arrangements necessary for the machine to produce the work) have been upheld by courts in different jurisdictions.
This leaves open the question of who the law would consider to be the person making the arrangements for the work to be generated. Should the law recognize the contribution of the programmer or the user of that program? In the analogue world, this is like asking whether copyright should be conferred on the maker of a pen or the writer. The copyright lies with the user, i.e. the author who used the program to create his or her work.
But when it comes to artificial intelligence algorithms that are capable of generating a work, the user’s contribution to the creative process may simply be to press a button so the machine can do its thing. The second option above, giving authorship to the programmer of the machine, creates an exception to all human authorship requirements by recognizing the work that goes into creating a program capable of generating works, even if the creative spark is undertaken by the machine.

6. Conclusion.

There are two copyright options possible for the creative output (the work) of a neural network engine system where the user has provided minimal or no creative input. One, no copyright, because it is a machine. Two, copyright exists with the machine authors. In no case does copyright exist with the user.
Specific case concerning "DeusX", a chess-playing entity. We first define the entity "LCZero" as a chess engine comprising a neural net chess-playing program, LC0.EXE (Leela Chess, copyright Leela Authors), and a set of "Weights" which affect the connections between the neural net's neurons and thus the numerical value(s) output by the neural net. These Weights were produced by a machine learning neural net algorithm (Leela NN Trainer, copyright Leela Authors) which was "trained" on several million self-play chess games. Once trained, the Weights could be said to encapsulate high-level chess knowledge.
It is known that "DeusX" is made up of LC0.EXE (Leela Chess, copyright Leela Authors) and a set of neural network Weights. These Weights were produced by exactly the same process (Leela Trainer copyright Leela Authors) acting on another large set of human chess games. The selection of these games is functional rather than creative.
We thus have two legal possibilities for the IP of the "DeusX" Weights. Nobody owns the IP. Or, Leela Authors own the IP.
And therefore, one legal possibility for the IP of "DeusX": Leela Authors own the IP.

References:
https://www.financierworldwide.com/arti ... derations/
http://www.wipo.int/wipo_magazine/en/20 ... _0003.html
https://techpolicyinstitute.org/2018/03 ... -creation/
https://en.wikipedia.org/wiki/Copyright
https://en.wikipedia.org/wiki/Monkey_se ... ht_dispute
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: Copyright and Machine Learning IP

Post by noobpwnftw »

Given that, do the contributors who produce self-play games for this particular purpose share the IP or the authorship of the NNs, or does only whoever builds the training software? One thing I know is that people may not copyright the games they played (which are what is used in the DeusX case), so if machine-generated games are also not copyrightable, then anyone is free to train their NNs with that data; just stay away from using the exact same training code.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

Isn't it a bit of a moot point whether there is copyright on the produced NN? If the human interaction was indeed minimal, everyone else could repeat the process and generate the same result. As stated above, uniqueness is not a criterion: they are perfectly allowed to make exactly the same thing by processes other than copying (i.e. independently of the original instance). If person A was allowed to use software written by X to create (and distribute) product P without effort, a person B could do the same, no matter who holds the copyright on P. De facto, P would always be unprotected against identical products flooding the market.

Only when the use of the software is restricted in some way could it make a difference who owns the copyright. But it seems to me that this could be satisfactorily solved by the licence agreement for use of X. This could stipulate that anything produced with the aid of X should be considered a 'derivative work' of X (comparable to translations of a book into a different language). Whether A should be considered co-owner of the copyrights would depend on the amount of effort he had to put in.

So in practice: if I use LC0 to produce a NN by feeding it 'millionbase', the NN would be public domain, as the authors of LC0 did not put any conditions on its use, and I did not put in any effort. If I make a much stronger NN by feeding it carefully selected (by me) games, I would own the copyright of the resulting NN.
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: Copyright and Machine Learning IP

Post by noobpwnftw »

hgm wrote: Sat Aug 11, 2018 2:54 pm If I make a much stronger NN by feeding it carefully selected (by me) games, I would own the copyright of the resulting NN.
Leela folks would probably argue that the effort they put into making the GPLv3 supervised NN training code (currently broken) is much more than your effort in "careful selection", so they would in turn own your resulting NN (as I understand from the OP, either nobody owns the NN or they do).
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

I would not contest that at all, but just argue that it is irrelevant. I would just have used it as a tool, just like I would use the gcc compiler as a tool to create the executable for a simplistic mini-Shogi engine that I would sell in the app store. I am sure the effort in creating gcc was many orders of magnitude more than that of creating the source code for the mini-Shogi app. Yet no one would contest that I am the sole owner of the copyrights on the executable.

The only relevant question is whether I would have put in enough creative work to make the result copyrightable. Not how complex the tools were that I used. Even an electric typewriter would take many times more effort to develop than it would take to write the average book on it.
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Copyright and Machine Learning IP

Post by chrisw »

If you used LC0 to make a NewNN, NewNN would be in your possession. Assuming Legal Case 1, that Leela Authors don't claim or have copyright, if you "sold" it or gave away a copy, the recipient could give it away 10000 times if he wanted, because you don't own the IP.

If Legal Case 2, and Leela Authors decided to assert copyright, then it would be up to them.

If you "selected" games, and the selection criteria were "creative" rather than functional (ELO>2400 wouldn't count), then in Case 2 you and Leela Authors would share copyright, presumably according to the creative effort involved. Subjective, but one would imagine theirs was going to be far greater than any chess game selection criteria.

Isn't it fascinating? I assumed the weights creator owned the weights but it doesn't seem to be so, if he used someone else's tube with the tube being responsible for the creative element.
Last edited by chrisw on Sat Aug 11, 2018 3:18 pm, edited 1 time in total.
AdminX
Posts: 6339
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: Copyright and Machine Learning IP

Post by AdminX »

noobpwnftw wrote: Sat Aug 11, 2018 3:00 pm
hgm wrote: Sat Aug 11, 2018 2:54 pm If I make a much stronger NN by feeding it carefully selected (by me) games, I would own the copyright of the resulting NN.
Leela folks would probably argue that the effort they put into making the supervised NN training code(currently broken) is much more than your effort in "careful selection", so they would in turn own your resulting NN.
Is all this really about the NN, or is it about the Engine? At the core I would think it should be about the Engine; maybe I am missing something. Is DeusX using a different executable (lc0.exe) as well? As it stands now, I view the NN as I would an opening book with different moves. Only in this case, different weights.
Last edited by AdminX on Sat Aug 11, 2018 3:16 pm, edited 1 time in total.
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Copyright and Machine Learning IP

Post by chrisw »

hgm wrote: Sat Aug 11, 2018 3:08 pm I would not contest that at all, but just argue that it is irrelevant. I would just have used it as a tool, just like I would use the gcc compiler as a tool to create the executable for a simplistic mini-Shogi engine that I would sell in the app store. I am sure the effort in creating gcc was many orders of magnitude more than that of creating the source code for the mini-Shogi app. Yet no one would contest that I am the sole owner of the copyrights on the executable.

The only relevant question is whether I would have put in enough creative work to make the result copyrightable. Not how complex the tools were that I used. Even an electric typewriter would take many times the effort to develop than it would take to write the average book on it.
Read my post on machines used to make the "work". It is not how much development went into the typewriter. Obviously you type, you own the resulting text IP.

It's whether or not the machine has creative algorithms that make the work, e.g. the programmer programs the machine and the machine then creates the work unaided by the human.
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: Copyright and Machine Learning IP

Post by noobpwnftw »

hgm wrote: Sat Aug 11, 2018 3:08 pm I would not contest that at all, but just argue that it is irrelevant. I would just have used it as a tool, just like I would use the gcc compiler as a tool to create the executable for a simplistic mini-Shogi engine that I would sell in the app store. I am sure the effort in creating gcc was many orders of magnitude more than that of creating the source code for the mini-Shogi app. Yet no one would contest that I am the sole owner of the copyrights on the executable.

The only relevant question is whether I would have put in enough creative work to make the result copyrightable. Not how complex the tools were that I used. Even an electric typewriter would take many times the effort to develop than it would take to write the average book on it.
In that case whether your creativity is "enough" becomes very subjective.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

Well, LC0 supervised learning doesn't do anything creative. It just 'mechanically' applies the mathematical expression for back-propagation, in response to the games you feed it. The algorithm for that was not even invented by the LC0 programmers. IMO it is not more creative than compiling a source code with a clever optimizer.
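To illustrate what 'mechanically' means here, below is a minimal sketch, not LC0's actual training code. It assumes a toy one-neuron linear model and plain gradient descent, which is the single-layer special case of back-propagation; the names `train`, `samples`, `epochs` and `lr` are hypothetical. The point it shows is that with a fixed starting point and the same input data, the update rule involves no choices at all, so two independent runs end up with identical weights.

```python
def train(samples, epochs=500, lr=0.05):
    """Fit y ~ w*x + b by gradient descent on squared error (toy model, hypothetical)."""
    w, b = 0.0, 0.0  # fixed starting point: no randomness anywhere
    for _ in range(epochs):
        for x, y in samples:
            err = (w * x + b) - y   # prediction error for this sample
            # Gradient of 0.5 * err**2 with respect to w and b,
            # applied as a purely arithmetic update.
            w -= lr * err * x
            b -= lr * err
    return w, b


# Toy "training games": points lying exactly on y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

weights_a = train(data)        # person A runs the tool
weights_b = train(list(data))  # person B repeats the process independently

print(weights_a == weights_b)  # same data and same procedure give the same weights
```

Real NN training usually adds random initialisation and shuffling, so bit-identical results would additionally require fixing the random seed; that is an assumption outside this sketch.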

There also is the practical difficulty that no one can force me to divulge what tools I used for generating the NN, or even what games I fed it.