Copyright and Machine Learning IP


hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

syzygy wrote: Sun Aug 12, 2018 8:32 pm
hgm wrote: Sun Aug 12, 2018 4:32 pm
syzygy wrote: Sun Aug 12, 2018 3:30 pm A random ordering of words is not protected. Nor is an alphabetical ordering or any other type of functional ordering, however ingenious.
It seems to me this is an untenable distinction. Ultimately 'creativity' is also just the product of an (admittedly complex) functional process.
Well, it's the law. For example the Endstra tapes were ultimately found not to be copyrighted (even though I think that outcome was wrong).
I don't see what you want to prove with this example. It seems to me the Endstra speech simply fell well short of the complexity mark where functional can be called 'creative'. I am merely objecting to your claim "functional, no matter how ingenious". IMO creativity is nothing but very ingenious functionality. The Endstra speech was obviously far from ingenious.
I doubt that you can outperform the network trained on the unimaginative collection of all games between top players selected from some pre-existing collection of games. But even if you do outperform it, that alone does not make the resulting NN copyrightable, precisely because it is an objective criterion.
Not necessarily; you can also outperform something in a subjective way, e.g. because people like the playing style better. I don't see where this criterion suddenly comes from. If I make something that is obviously different (by objective measurement or whatever) from what you get when you do the obvious, unimaginative thing, I must have done something very special to achieve it, and must have been creative to figure out what special thing I had to do to get such an anomalous result.

Whether this can actually be done in the case of LC is immaterial; this is just a thought experiment, and LC was taken as an arbitrary example.
We can tell the difference, or at least lawyers pretend that we can tell the difference between random drops of paint and compositions of paint.
We can also tell the difference between good chess and a patzer, or pretend we do.
Err towards one side seems to imply that there could be room for doubt here.
Well, that was what you were supposing, right? A case where from the result you could not see whether the 'teacher' had made a creative effort selecting or generating training examples, or had just dumped the first huge game collection he could download on the NN. That would leave doubt as to whether the teacher had been creative. If I understood you correctly, you would deny him copyrights just because the possibility exists that he might not have been creative, which would be an error if he had been creative (but apparently without much success).
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Copyright and Machine Learning IP

Post by Milos »

hgm wrote: Sun Aug 12, 2018 11:01 pm We can also tell the difference between good chess and a patzer, or pretend we do.
There is absolutely no way anyone can tell from the way an engine plays with a certain NN how creative that NN is, or how much creative effort has been put into the creation of that NN, if the NN was basically generated by the "compiler", i.e. the training program.
So judging creativity of NN is a totally moot point.
If the training program provides deterministic output for a given input (which is not the case), then the only copyright you can hold is on the collection of training games.
But there you have to prove the creativity and purpose of your collection. Not every random collection of something is worth a copyright.
One way or the other, a copyright on NN is a moot point.
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Copyright and Machine Learning IP

Post by chrisw »

syzygy wrote: Sun Aug 12, 2018 9:25 pm
chrisw wrote: Sun Aug 12, 2018 4:51 pm An MP3 file consists of a 1-D structured set of numbers. But it represents music. That's copyrightable because of what it represents. Helpfully it's a one-to-one mapping, and decodable by MP3 player. And the music is tangible to the ear.
This is a good example.

If you feed an mp3 encoder copyrighted music, the resulting mp3 file will be covered by the copyright on the music (because the mp3 file easily preserves enough of the human-perceivable features of the music). It will not be covered by the copyright on the mp3 encoder because the mp3 encoder determines only the functional structure of the mp3 file.

Instead of an mp3 encoder taking as input music and producing as output an mp3 file, here we have an NN trainer program taking as input a collection of games and producing as output a set of weights. For the set of weights to be copyrighted we at least need a (somewhat) original collection of games and to somehow be able to convince a judge that enough of that originality is perceivable from the set of weights. That's going to be tough.

At the moment a case is pending before the CJEU in which the Court will have to decide whether a taste can be a copyrighted work. The advocate general recently issued his opinion advising the court to decide that the notion of "work" encompasses only subject matter that can be perceived through sight or hearing and "with precision, stability and objectivity".

A neural network as a set of numbers can be perceived by the human eye but is then just meaningless. One will not be able to say that a particular network infringes on the copyright of another network by just looking at the similarity between the weights. (The same holds true for an mp3 file: as a set of numbers it is meaningless to the human eye or ear. You have to run it through an mp3 player to perceive it meaningfully.)

So a neural network will have to be run through the Lc0 client, but can we then perceive it "with precision, stability and objectivity"? I'm not convinced that "playing ability/style" is something that can be the object of copyright. Of course playing strength can be measured to some extent, but that is just a functional criterion (like how aerodynamically a car is shaped).

It is possible that the criteria formulated by the CJEU will be more relaxed than those proposed by the advocate general.
The NN represents chess knowledge. It's not any form of one-to-one mapping, it's intangible, and it's decodable only by LC0.EXE reading it into its own internal format, performing weird transformations on it and then outputting chess moves, AND THEN by a smart, chess-knowledgeable person decoding what she thinks the system "knows". OR by a possibly less smart person comparing the numeric move-value outputs.
..... the NN as a structured set of weights. There is no copyright on that structure.
Because the LC0 (Trainer) carries out a repetitive, functional series of steps on the data. Not creative. Therefore no copyright. That, I think, is your argument.
My argument is that the Lc0 code does not determine the values of weights but probably only the number of weights. If Lc0 requires that the NN has 5x5 weights, then you will agree that the copyright on Lc0 does not extend to all collections of 5x5 numbers. Even if it determines much more structure than something like "5x5", it will not be enough because the structure is just what is necessary to make it work with the Lc0 client. Just like the structure imposed by an mp3 encoder is just what is necessary to allow the mp3 file to be decoded by an mp3 player. (There is probably a bit more to it in the case of mp3 encoders since they do not all produce the same quality output, but any variation between mp3 encoders is not an expression of creative freedom but of an attempt to reach a high quality or a good quality/time trade-off -- technical criteria. And the perceivable differences between two encodings of the same music or between an mp3 file and the original music will anyway be extremely unlikely to establish a new/additional copyright.)
On Advocate Generals ... well, yes, but. Humans are used to copyright applying to tangible, maximum 3-D things. We are entering a world of intangibles and many dimensions. Maybe super-smart stuff that we can't understand just isn't amenable to the ancient law of copyright.

On weight numbers ...
I think LC0.EXE is able to use weights from past periods of training when different net sizes were used. Probably there's some code to detect net type-structure and LC0.EXE adjusts its software accordingly.

On your "perception" points ...
Because the net is modelled to some extent on (human) visual system, it might be possible to detect visual patterns forming in the net layer neurons as the "understanding" becomes deeper, so to speak. I would not like to be the poor guy given the task of hunting for these across different game positions inputs, comparing them, and trying to generate some visual material to convince your advocate general, though.

Nevertheless, weights do apparently show distinct patterning themselves, viewed as layered heat maps, so perhaps detecting if one net is a retrained derivative of an earlier net might also be possible. Again, I would not want to be the person given the task of hunting. In all cases we would be suffering from the "how close is close" problem, and from the general question of how to test how similar nets trained on more, fewer or different games are anyway.
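
To make the "how close is close" problem concrete, here is a minimal sketch, assuming (purely for illustration) that a retrained net is just the earlier net plus small noise and that an unrelated net is an independent random draw. Real nets are messier: hidden units can be permuted without changing what the net computes, so two functionally identical nets can look unrelated under a naive comparison like this one. All names and numbers below are made up.

```python
import math
import random

def cosine_similarity(a, b):
    """Cosine of the angle between two flattened weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 10_000               # hypothetical number of weights

# Hypothetical "earlier" net: weights drawn from a unit Gaussian.
base = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Assumed model of retraining: every weight is nudged a little.
retrained = [w + rng.gauss(0.0, 0.1) for w in base]
# A net that never saw the earlier one: an independent draw.
unrelated = [rng.gauss(0.0, 1.0) for _ in range(n)]

sim_retrained = cosine_similarity(base, retrained)
sim_unrelated = cosine_similarity(base, unrelated)
print(f"retrained vs base: {sim_retrained:.3f}")
print(f"unrelated vs base: {sim_unrelated:.3f}")
```

Under these assumptions the retrained net scores close to 1 and the unrelated one close to 0, but where to draw the line in between is exactly the open question.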

Probably the best way to "protect" weights (commercially) is to protect the reading code, since that's a vital component of using the weights.

Okay, I am outnumbered by good arguments. I drop the claim that machine-generated weights can be copyrighted. Does that apply to PSTs too? Runs and hides ....
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

hgm wrote: Sun Aug 12, 2018 11:01 pm
syzygy wrote: Sun Aug 12, 2018 8:32 pm
hgm wrote: Sun Aug 12, 2018 4:32 pm
syzygy wrote: Sun Aug 12, 2018 3:30 pm A random ordering of words is not protected. Nor is an alphabetical ordering or any other type of functional ordering, however ingenious.
It seems to me this is an untenable distinction. Ultimately 'creativity' is also just the product of an (admittedly complex) functional process.
Well, it's the law. For example the Endstra tapes were ultimately found not to be copyrighted (even though I think that outcome was wrong).
I don't see what you want to prove with this example. It seems to me the Endstra speech simply fell well short of the complexity mark where functional can be called 'creative'. I am merely objecting to your claim "functional, no matter how ingenious".
OK, so you agree that a random ordering is not protected.

That functional features are not protected is also well established. See e.g. here:
4.2 (...) The standard thus applied by the court of appeal is correct. Elements of the work that merely serve a technical effect, or that are too much the result of a choice restricted by technical constraints, are excluded from copyright protection (cf. HR 22 February 2013, ECLI:NL:HR:2013:BY1529, NJ 2013/501, para. 3.4 under (c), and the CJEU's BSA judgment cited there and invoked by the ground of appeal).
The mechanics of the Rubik's cube were not protected by copyright. The specific color scheme in combination with the 3x3 grid on each of the six faces of the cube was protected.
Err towards one side seems to imply that there could be room for doubt here.
Well, that was what you were supposing, right? A case where from the result you could not see whether the 'teacher' had made a creative effort selecting or generating training examples, or had just dumped the first huge game collection he could download on the NN. That would leave doubt as to whether the teacher had been creative. If I understood you correctly, you would deny him copyrights just because the possibility exists that he might not have been creative, which would be an error if he had been creative (but apparently without much success).
If it cannot be seen from the end result, then copyright in that end result must be denied. No room for doubt. It does not matter if something creative happened in the production of something entirely uncreative. I gave the example of writing a poem to seed a random generator and then using that random generator to generate random text. The random text is not protected even though creativity was used in the production process. (On the other hand, if nothing creative happened in the production process, then the end result obviously cannot be creative.)
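
A minimal sketch of the poem example, assuming SHA-256 as the way to turn the poem into a seed and Python's standard generator as the random generator (both are choices made for this illustration only; the poem is a placeholder):

```python
import hashlib
import random
import string

def random_text_from_poem(poem: str, length: int = 40) -> str:
    """Derive a seed from the poem and emit pseudo-random lowercase text."""
    # Hash the poem so that every character of it influences the seed.
    digest = hashlib.sha256(poem.encode("utf-8")).digest()
    seed = int.from_bytes(digest, "big")
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + " "
    return "".join(rng.choice(alphabet) for _ in range(length))

# Hypothetical stand-in for a creative work.
poem = "Roses are red, violets are blue"
text = random_text_from_poem(poem)
print(text)
```

The output is fully determined by the poem, yet nothing of the poem's wording or structure can be perceived in it, which is the point of the example.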

But even in the rare case where something creative in the end result can somewhat plausibly be seen, I suspect there is no room for doubt: "playing style" probably does not legally qualify as something that can be protected. But here I am speculating.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

chrisw wrote: Mon Aug 13, 2018 1:14 am On Advocate Generals ... well, yes, but. Humans are used to copyright applying to tangible, maximum 3-D things. We are entering a world of intangibles and many dimensions. Maybe super-smart stuff that we can't understand just isn't amenable to the ancient law of copyright.
Limiting copyright to stuff that can be seen or heard might indeed turn out to be too limiting. Let's see what the CJEU will decide.
On your "perception" points ...
Because the net is modelled to some extent on (human) visual system, it might be possible to detect visual patterns forming in the net layer neurons as the "understanding" becomes deeper, so to speak. I would not like to be the poor guy given the task of hunting for these across different game positions inputs, comparing them, and trying to generate some visual material to convince your advocate general, though.
Even then, I have trouble seeing those visual patterns as the intellectual creation of the guy selecting the training games. It wasn't his aim to create particular numbers or visual patterns.

There is still the possibility of the numbers being protected by a database right (in the EU). I would have to take a very long look first to say anything sensible about that.
Okay, I am outnumbered by good arguments. I drop the claim that machine-generated weights can be copyrighted. Does that apply to PSTs too? Runs and hides ....
PST values more directly influence playing style, but I doubt that is enough to make them protected.
pilgrimdan
Posts: 405
Joined: Sat Jul 02, 2011 10:49 pm

Re: Copyright and Machine Learning IP

Post by pilgrimdan »

if we take it down to its basic level … zeros and ones … when does one group of zeros and ones become original … and another group of zeros and ones does not …
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

syzygy wrote: Mon Aug 13, 2018 1:17 am OK, so you agree that a random ordering is not protected.
Of course, that was never an issue. Also, the decimal representation of the value of pi would be unprotected (list of facts).
That functional features are not protected is also well established. See e.g. here:
4.2 (...) The standard thus applied by the court of appeal is correct. Elements of the work that merely serve a technical effect, or that are too much the result of a choice restricted by technical constraints, are excluded from copyright protection (cf. HR 22 February 2013, ECLI:NL:HR:2013:BY1529, NJ 2013/501, para. 3.4 under (c), and the CJEU's BSA judgment cited there and invoked by the ground of appeal).
IMO "to serve solely a technical effect" here would apply to the requirement that it plays legal Chess, not that it plays good chess. As Chess is yet unsolved, move selection takes place by heuristics, which require creativity to design.

But even in the rare case where something creative in the end result can somewhat plausibly be seen, I suspect there is no room for doubt: "playing style" probably does not legally qualify as something that can be protected. But here I am speculating.
Playing style would merely be the output of the NN, though. I don't want to protect the style, but the NN that generated it.

So it is more comparable to the case where I write a pseudo-random number generator. We agree that the output of such a PRNG is not copyrightable. But would the code for the PRNG be copyrightable? It might be far better than any existing PRNG, in terms of the metrics used to judge PRNGs (e.g. periodicity, long-range correlations).
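
Periodicity, at least, is an objectively measurable metric. A toy sketch, assuming a linear congruential generator with tiny made-up parameters (an LCG is only used here because its period is easy to count; the function name is hypothetical):

```python
def cycle_length(a: int, c: int, m: int, seed: int = 0) -> int:
    """Length of the cycle that state -> (a*state + c) % m eventually enters."""
    seen = {}          # state -> step at which it was first produced
    state = seed
    step = 0
    while state not in seen:
        seen[state] = step
        state = (a * state + c) % m
        step += 1
    # The sequence has returned to a state first seen at seen[state].
    return step - seen[state]

# Two hand-picked parameter sets modulo 16 (illustrative values only).
good_period = cycle_length(a=5, c=1, m=16)   # visits all 16 states
weak_period = cycle_length(a=3, c=1, m=16)   # cycles through only 8 states
print(good_period, weak_period)
```

By this metric the first generator is objectively better than the second, regardless of whether its parameters were found by theory or by trial and error.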

It is not at all inconceivable that I could obtain code for a PRNG by randomly generating C programs from the C syntax, with some restriction (claimed by me to be creatively designed) on the syntax (e.g. so that the resulting function always terminates and returns an integer). Would that disqualify me from getting copyrights on any PRNG code I carefully designed by hand, with the aid of deep mathematical theories?
Fulvio
Posts: 395
Joined: Fri Aug 12, 2016 8:43 pm

Re: Copyright and Machine Learning IP

Post by Fulvio »

hgm wrote: Mon Aug 13, 2018 8:49 am So it is more comparable to the case where I write a pseudo-random number generator. We agree that the output of such a PRNG is not copyrightable. But would the code for the PRNG be copyrightable? It might be far better than any existing PRNG, in terms of the metrics used to judge PRNGs (e.g. periodicity, long-range correlations).
In my opinion both the code and the output of the PRNG will have copyrights.
For example it will be possible to sell a high quality list of pseudo-random numbers and customers will not have the rights to distribute that list to others.

In the NN case there is a big difference: its creator didn't write anything.
The creation of a NN seems more a manufacturing process than an intellectual work, and it is questionable whether it is protected by copyright law.
For example, copyright applies to iOS but not to the iPhone (which is protected by patents).
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

Fulvio wrote: Mon Aug 13, 2018 10:33 am The creation of a NN seems more a manufacturing process than an intellectual work, ...
That could also be said of a painting, e.g. a portrait. You just apply paint to a canvas, in a pattern specified by a photograph (handed in by the subject). Apparently selecting the colors is considered creative enough to make the painting copyrightable. Feeding a specific learning example to the NN is not fundamentally different from selecting a color and making a brush stroke with it.
Fulvio
Posts: 395
Joined: Fri Aug 12, 2016 8:43 pm

Re: Copyright and Machine Learning IP

Post by Fulvio »

hgm wrote: Mon Aug 13, 2018 10:52 am That could also be said of a painting, e.g. a portrait. You just apply paint to a canvas, in a pattern specified by a photograph (handed in by the subject). Apparently selecting the colors is considered creative enough to make the painting copyrightable. Feeding a specific learning example to the NN is not fundamentally different from selecting a color and making a brush stroke with it.
It is completely different.
Even if designers select the colors for manufactured Tesla cars, that doesn't make them copyrightable.
On the contrary even Pollock's paintings are clearly intellectual works.

Let's consider the games as source code, the training software as a compiler and megadatabase as a collection of source code files.
If you remove some files from that collection and compile the result, are you entitled to any copyright?
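
A toy sketch of that analogy, assuming a deliberately trivial and deterministic stand-in for the trainer (it only tallies first moves; the games and the `train` function are invented for illustration and have nothing to do with how Lc0 actually trains):

```python
from collections import Counter

def train(games):
    """Toy deterministic 'compiler': relative frequency of each first move."""
    counts = Counter(game.split()[0] for game in games)
    total = sum(counts.values())
    return {move: counts[move] / total for move in sorted(counts)}

# Hypothetical four-game "database"; each string is one game's opening moves.
collection = ["e4 e5 Nf3", "d4 d5 c4", "e4 c5 Nf3", "c4 e5 Nc3"]

full = train(collection)
# "Remove some files from that collection": drop every game starting with c4.
pruned = train([g for g in collection if not g.startswith("c4")])

print(full)
print(pruned)
```

The only human input that distinguishes the two outputs is the decision about which games to leave out, so the question comes down to whether that selection is enough.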